Category Archives: Software

Limitations to using a Raspberry Pi as your Super Fast Broadband router

I’ve seen a few recommendations about using a Raspberry Pi as an Internet Router. Last night I did some experimentation and found that unfortunately it’s not quite up to the task on my super-fast link (50Mbit/s, “soon” to be upgraded to 100Mbit/s).

I’m going to experiment with some other options and see if I can find a configuration that works – if I do, I’ll post an update. At this stage, though, I’m pretty doubtful it’ll work for me.

Testing Environment

  • Raspberry Pi revision 2 board.
  • Standard current Raspbian release (2013-02-09-wheezy). I kept the device in standard configuration (no adjustments to ethernet ‘Turbo’ mode, or to memory allocation).
  • Edimax EU-4207 USB ethernet adapter.
  • Iperf – the default version on all boxes.
  • Two very capable boxes – able to sustain full gig-ethernet iperf performance between them over a crossover cable. Both ran Ubuntu – one Quantal, and one Raring.
  • Direct CAT 5E ethernet links from the capable boxes into the two ethernets on the Raspberry Pi. All links (both directions) showed 100Mbit/s full duplex connections on both sets of interfaces.
  • Static IP addresses all round, with the two interfaces on the Raspberry Pi acting as a gateway.
  • IP forwarding on the Raspberry Pi enabled, but with no firewalls or NAT configured, for maximum performance.

Machine A to Raspberry Pi

This test comprised running iperf from machine A to the Raspberry Pi using the Pi’s built-in ethernet port. Iperf performance seemed fine – topping out at about 94Mbit/s. Unfortunately there was pretty much 0% idle CPU time left, with almost everything being spent in system time. The CPU was mostly keeping up, but there was very little wiggle room. I was pretty impressed with that, overall.

Machine B to Raspberry Pi

This used the EU-4207 ethernet adapter. Performance was marginally up from the native port. Iperf performance seemed fine again, topping out at about 96Mbit/s. Again, the CPU was pretty much 100% consumed by system time, servicing the packets coming in.

Machine A to Machine B

This is the real test, and unfortunately the Pi fails pretty badly here for my usage. Performance varied from about 35Mbit/s to peaks at around 55Mbit/s, and was quite variable.

The vmstat output is more telling – what’s interesting here is that although the test ran for 10 seconds, ‘vmstat 1’ only shows 4 lines of output (there should be 10). The cpu is so badly lagged that vmstat goes on the back burner.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo    in  cs us  sy  id wa
 0  0      0 259932  18004 132188    0    0    35    21  1068  99  3   8  89  1
 7  0      0 258716  18004 132188    0    0     0     0  5348  29  0  33  67  0 -- performance test starts here
 7  0      0 256892  18004 132188    0    0     0     0 24359  17  0 100   0  0
 7  0      0 255548  18004 132188    0    0     0     0 17461   8  0 100   0  0
 3  0      0 256412  18004 132188    0    0     0     0 88444  77  0 100   0  0 -- note that vmstat outputs are missing
 0  0      0 259420  18004 132188    0    0     0     0  1346  46  0   4  96  0
 0  0      0 259420  18004 132188    0    0     0     0   348  60  0   0 100  0
 0  0      0 259420  18004 132188    0    0     0     0   348  65  0   0 100  0
 1  0      0 259420  18004 132188    0    0     0     0   348  61  0   1  99  0
 0  0      0 259420  18004 132188    0    0     0     0   355  70  0   3  97  0
 0  0      0 259420  18004 132188    0    0     0     0   350  65  0   0 100  0
 0  0      0 259420  18004 132188    0    0     0     0   346  65  0   0 100  0
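If you want to check a capture like this without counting rows by eye, here is a rough Python sketch. The function names and the "idle is 0" definition of a saturated sample are my own assumptions for illustration, not part of vmstat. It parses captured ‘vmstat 1’ text and counts how many samples were taken while the CPU had no idle time at all.

```python
# Column names as printed in the second header line of 'vmstat 1'.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat(text):
    """Return one dict per sample row, skipping headers and blank lines."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < len(COLUMNS):
            continue  # blank line or the 'procs ---memory---' banner
        try:
            values = [int(f) for f in fields[:len(COLUMNS)]]
        except ValueError:
            continue  # the 'r b swpd ...' header row
        # Anything after the 16th field (hand-written notes) is ignored.
        samples.append(dict(zip(COLUMNS, values)))
    return samples


def saturated(samples):
    """Samples where the CPU reported 0% idle (an assumed threshold)."""
    return [s for s in samples if s["id"] == 0]


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 259932 18004 132188 0 0 35 21 1068 99 3 8 89 1
7 0 0 256892 18004 132188 0 0 0 0 24359 17 0 100 0 0
3 0 0 256412 18004 132188 0 0 0 0 88444 77 0 100 0 0 -- note that vmstat outputs are missing
0 0 0 259420 18004 132188 0 0 0 0 1346 46 0 4 96 0
"""

if __name__ == "__main__":
    rows = parse_vmstat(SAMPLE)
    print(len(rows), "samples,", len(saturated(rows)), "with no idle CPU")
```

Comparing the number of rows captured with the number of seconds the test ran gives a quick measure of how many one-second samples vmstat failed to print.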

Many links are indeed slower than this speed, and the Pi is probably fine for those purposes (just don’t run any other apps on the Pi, otherwise you’ll end up with random slowdowns). The problem is that as soon as you start adding the overhead of NAT, PPPoE, and stateful packet filtering into the mix, performance is only going to get worse. And that’ll lead to packet loss and reduced browsing speed.

Remember also that this is a unidirectional iperf test. So if you’re uploading and downloading at the same time (hint, you are!) things will get even worse.

I’d say that if your link is up to 20Mbit/s and you have a simple home-type network with limited NAT requirements, you’re probably fine with the Pi as a router. Above that, I’d start to be wary.

Optimising things

There’s always some room for optimising Linux’s networking, and I’ll have a look around to see if there’s some set of magic sysctls that’ll help here. I’ll post an update if I find anything. If you have any suggestions you’d like me to test, tweet me at @oskarpearson

Perl Crypt::PBKDF2 library with minimal dependencies

I’ve published a partially-compatible Crypt::PBKDF2 library with minimal dependencies on GitHub.

Review would be both welcomed and appreciated.

Note that you should probably use the original library instead, unless you are running a very old Linux distribution. The official Crypt::PBKDF2 library requires Perl Moose, which isn’t available on old operating systems, and that is why I’ve had to put this together. Sometimes you can’t just upgrade everything to the latest OS – it’s more complicated than that.

Been working on a facelift for the London Hackspace website

Before

London Hackspace Website – Before

After

London Hackspace Website – After

Background

As you may know, I’m a member of the London Hackspace, a community-run Hacker Space in Hoxton. As a member, you get to play with pretty much everything: programming, Arduinos, electronics, and even Liquid Nitrogen (assuming the pledge goes well!).

I’ve taken some time to re-organise the homepage, based on an original design concept first discussed in 2011, using 960.gs and html5. I hope you like it! More importantly, I hope it helps build momentum at the space, and that they might get new members out of it.

I’m using the following historical image as a guideline: https://dl.dropbox.com/u/1909920/lhs-website-sketch-2011-01-08.png That sketch is something Russ’ friend put together, and Russ posted it to the infrastructure group on 24th Jan 2011 (!). The original emails are at https://groups.google.com/forum/#!topic/london-hack-space-infrastructure/nUlZC7l3PmE

Here’s the github link for the updates, assuming you’re that interested.

Thought of the day from Linus Torvalds back in 1999

This quote by Linus Torvalds has stuck with me since I first read it back in 1999:

I generally have a problem with your patches: a lot of them are really
good. But at the same time a lot of them are changing things because
you’re in “random walk” mode, and you just walk over the code without
really thinking about them. In the end, you then have a huge patch, and
nobody really knows which part of your patch is really the worthwhile
part. Which is then why it takes so long for some of your _real_ fixes to
get into a stable kernel..

It’s not the only occurrence of him talking about software developers using the ‘random walk’ methodology for code improvement – and he is occasionally less polite about it.

As a developer, if you notice something else is broken or needs improvement, fix each problem in a separate and logically internally-consistent change that makes sense to the reader and the end user. Doing things this way means less cognitive load for everyone involved, as they can weigh up each part of your change with a consistent end-goal in mind, and build a clear mental model of your change without irrelevant detours.

This method also nicely matches one of the more general rules of debugging: change one thing at a time. If you change 10 different and unrelated things at once and the system breaks, it’s often much more difficult to determine why the problem occurred.

Following this method also means that it’s much easier to revert a single change that’s gone wrong, without the side effect of bug-free changes disappearing, simply because they were included in the same patch set.

So: fix one thing at a time, and fix it right.

Punishing poor security by copyrighting common unsalted SHA-3 passwords

Geeky fun idea of the day – I want to pre-calculate and copyright all un-salted hashes of common words the moment the SHA-3 algorithm is approved.

As some of you may know, there are plans afoot for creating a new SHA-3 one-way-hash algorithm.

One-way hashes are often used in cryptography/security – the idea is that you can take some data (like a password, or an entire book/dvd/cd) and run it through an algorithm which returns a single number that, for practical purposes, uniquely identifies the input data. However, given the number alone, you can’t get back to the data (hence it’s “one way”). The returned number is huge, but it is easily represented as a simple text string of a consistent length.

Many people use hash algorithms for storing people’s passwords in databases, so that they can avoid having a massive list of users’ passwords in easily-stealable form. The idea is that instead of storing the password in the database directly, you store an encrypted (hashed) version of it. When someone tries to authenticate themselves, the code takes the password the user supplied and re-hashes it. The code then compares the new hash against the database hash – if they are the same, the person entered the correct password.

This is all fine and dandy, except when the supplied password is a common word. If you use the word “fred” as your password, the hash algorithm will always return 570a90bfbf8c7eab5dc5d4e26832d5b1. So, if I find an encrypted password entry in a database which is 570a90bfbf8c7eab5dc5d4e26832d5b1, then I know the input value (their password) is “fred”.

The OED has, according to Wikipedia, 500,000 words. So it’s entirely possible to take every English word and hash it. Voilà – you have now figured out a huge number of passwords in the database.
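The pre-calculation described above is only a few lines of code. Here is a minimal sketch in Python, assuming SHA3-256 as the hash (via the standard hashlib module, Python 3.6 or later) and a tiny stand-in word list. The function name build_lookup and the word list are made up for illustration.

```python
import hashlib


def build_lookup(words, algorithm="sha3_256"):
    """Map each unsalted hex digest back to the word that produced it."""
    table = {}
    for word in words:
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        table[digest] = word
    return table


if __name__ == "__main__":
    # Stand-in for a real dictionary of several hundred thousand words.
    words = ["barney", "betty", "fred", "wilma"]
    table = build_lookup(words)

    # Pretend this value was found in a stolen database column.
    stolen = hashlib.sha3_256(b"fred").hexdigest()

    # One dictionary lookup recovers the password; no brute force needed.
    print(table.get(stolen))
```

The expensive part (hashing every word) happens once, up front. After that, every unsalted hash in every stolen database is a single lookup.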

So what I want to do is pre-calculate (and copyright) all the hashed words in SHA-3. If someone stores them in a database, they are effectively using my copyrighted information.

The only problem is that it’s not possible to copyright a number – or is it? (oh, and the fact that it’s only theoretically possible to copyright longer texts…)

There’s a technique called salting, which helps avoid the problem described above. In short, a random string is added to the supplied password – so now instead of having to calculate the hash for “fred”, an attacker has to calculate it for every possible combination of “fred” and all possible random strings. This is effectively equivalent to pre-calculating AAAAfred, AAABfred, AAACfred, etc, all the way to ZZZZfred. By choosing enough “salt” characters, a site can make pre-storing all possible password/salt combinations infeasible.
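As a rough illustration of salting, here is a Python sketch that stores a random salt alongside each hash. It uses PBKDF2 from the standard library purely so that it runs without extra packages. That choice, the 16-byte salt, the iteration count, and the function names are all my assumptions for the example.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)  # 16 random bytes, stored next to the digest
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-hash the supplied password with the stored salt and compare."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison


if __name__ == "__main__":
    salt_a, digest_a = hash_password("fred")
    salt_b, digest_b = hash_password("fred")

    # Same password, different salts, so different stored values.
    print(digest_a != digest_b)
    print(verify_password("fred", salt_a, digest_a))
    print(verify_password("wilma", salt_a, digest_a))
```

Because two users with the password “fred” now have different stored values, a single pre-calculated table no longer matches either of them.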

Oh, if you’re interested – check out this page for more info on storing passwords in your database. In short, use bcrypt – more info here

Alphanumeric passwords with enforced numbers – more “Security Theatre”

Reading through TechCrunch’s Depressing Analysis Of RockYou Hacked Passwords:

According to a study by Imperva, [the most common password is] “123456,” followed by “12345,” “123456789” and “Password,” in that order. “iloveyou” came in at no. 5.

I generate my passwords with APG, which generates passwords like this:

  • Irikyak6
  • RaypHiam6
  • radsErn2
  • reebrIjLi

As you can tell, these are for all intents and purposes, secure. However, some sites out there insist that the last one on the list is insecure. Why? It doesn’t have a number in it, so it must be terrible. 123456? That’s cool with them, though.

So riddle me this:

I’ve got a one character password. The password has to contain a digit.

What are the chances of guessing my password? Why, 1 in 10. I can’t use A-z, given the rules, so the password has to be a single digit, 0-9.

However:

I’ve got a 1 character alphanumeric password. There are no rules about how many numbers the password must contain.

What are the chances of guessing my password now? Why, A-Z (26 possibilities), a-z (another 26), and 0-9 (10 more): so your chances are 1 in 62.

The same thing applies in longer passwords: the more information you give an attacker about the rules associated with a password, the less work they need to do to crack the password.
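The same arithmetic for longer passwords can be sketched in Python. I’m assuming a 62-character alphanumeric alphabet and a rule of the form “must contain at least one digit”; the function name is made up for illustration.

```python
ALPHANUMERIC = 62  # 26 upper case + 26 lower case + 10 digits
LETTERS_ONLY = 52  # what is left once digits are excluded


def candidates(length, require_digit=False):
    """Number of passwords of exactly `length` characters an attacker must try."""
    total = ALPHANUMERIC ** length
    if require_digit:
        # Every all-letter password is now known to be invalid,
        # so the attacker can skip all of them.
        return total - LETTERS_ONLY ** length
    return total


if __name__ == "__main__":
    for length in (1, 4, 8):
        free = candidates(length)
        ruled = candidates(length, require_digit=True)
        print(length, free, ruled, f"{ruled / free:.1%} of the space remains")
```

For a single character this reproduces the 1 in 10 versus 1 in 62 figures above. For longer passwords the rule removes a smaller share of the space, but it never makes the attacker’s job harder.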

A better solution? Integrate a library like cracklib and set some reasonable rules about which passwords are, and aren’t, allowed. It’ll stop 123456 in its tracks.
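To give a feel for what such rules look like, here is a deliberately tiny Python sketch. It is not cracklib and does not use its API; the deny list (taken from the passwords quoted above), the minimum length, and the function name are assumptions for illustration only.

```python
# A handful of the most common passwords quoted above; a real list is far longer.
COMMON_PASSWORDS = {"123456", "12345", "123456789", "password", "iloveyou"}


def is_acceptable(password, min_length=8, deny_list=COMMON_PASSWORDS):
    """Reject short, all-digit, or well-known passwords; accept everything else."""
    if len(password) < min_length:
        return False
    if password.isdigit():
        return False
    if password.lower() in deny_list:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("123456", "123456789", "Password", "reebrIjLi"):
        print(candidate, is_acceptable(candidate))
```

Note that the rule never asks for a digit: “reebrIjLi” passes, while the all-digit passwords from the list do not.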

Oh, and for goodness sake, hash and salt your passwords when you store them in the database.

Shared Vocabulary, Problem Solving, and Domain Driven Design

In The Science of Screwing Up, Wired Magazine discusses Kevin Dunbar: “a researcher who studies how scientists study things — how they fail and succeed”.

When Dunbar reviewed the transcripts of [a meeting involving people from numerous disciplines], he found that the intellectual mix generated a distinct type of interaction in which the scientists were forced to rely on metaphors and analogies to express themselves. (That’s because, unlike [his comparison group of specialists,] the E. coli group, the second lab lacked a specialized language that everyone could understand.) These abstractions proved essential for problem-solving, as they encouraged the scientists to reconsider their assumptions. Having to explain the problem to someone else forced them to think, if only for a moment, like an intellectual on the margins, filled with self-skepticism.

Interestingly, the Domain Driven Design book makes a similar plea:

When domain experts use this LANGUAGE in discussions with developers or among themselves, they quickly discover areas where the model is inadequate for their needs or seems wrong to them. The domain experts (with the help of the developers) will also find areas where the precision of the model-based language exposes contradictions or vagueness in their thinking.

However, I find the essence of the two discussions to be slightly different:

  • The Domain Driven Design book encourages developers and architects to move towards one language, that can be shared between the stakeholders.
  • Dunbar found that the process of “working around” the differences in the languages and concepts was what produced the real results: a common understanding.

I think the correct solution lies somewhere between these two extremes.

    Thoughts on “Art of the Start”

    Given my previous post on blogging, I may have given the impression that “The Art of the Start” by Guy Kawasaki is the best thing since sliced bread.

    Unfortunately – it’s not. It is good – and it adds a reasonable amount of value, but it’s not “all that”.

    I skim-read the book over coffee in the bookshop when I first saw it. And unfortunately for Guy, I have to say that I got the most value out of the book in that initial skim reading.

    Perhaps it’s because I haven’t any use for information on: how to deal with potential venture capitalists; the differences in venture-capital meeting dress-style between the east and west coast; and how most financial projections in business plans are completely useless.

    Or perhaps it’s because he and I have a similar way of thinking about these sorts of things, so there isn’t that much that seems new. Your mileage may vary.

    One of the main problems I have with the book is that it’s marketed as “for anyone starting anything.” However, I suspect there was a conversation like this between two of the publishers just before releasing the book:

    1. Publisher 1 – “So we’re publishing a book for people starting software businesses that need venture capital. Now tell me, how many people do that each year? And how many do we have to sell to make a profit? Who the hell signed this deal?”
    2. Publisher 2 – “Well… the introduction is fairly general – you could even use it as a way of structuring a blog, for example. He also does use some other examples besides software and tech business stuff in parts of the book – there’s one sentence on page 47, for example. How about we subtitle it ‘Your guide to starting anything?'”
    3. Publisher 1 – “Isn’t that a bit bland? It needs a ‘little something’, I think”
    4. Publisher 2 – “How about ‘The Time-Tested, Battle-Hardened Guide for Anyone Starting Anything’? Now everyone can buy our book.”
    5. Publisher 1 – “Publisher 2, you’re a bloody genius.”

    That said – I don’t regret buying the book. The first few chapters especially have some important and useful ideas. But it makes some major assumptions about the business you’re going to build, where it’s going to be located, and how you are going to build it. (It always goes something along these lines: think of the idea; start building it while you try to find venture capital; list it on the stock market or sell it to a competitor.)

    I personally think of starting a business in a different way – but I’ll have more about that in another post.

    Structure and Interpretation of Computer Programs

    I’ve recently started reading Structure and Interpretation of Computer Programs – one of the textbooks for MIT’s electrical engineering and computer science degrees.

    One thing that’s immediately interesting to me is how challenging the first year introductory text is. The second thing is how well the book is thought out. I know how difficult that is, having written a reasonable amount of the Squid User’s Guide myself.

    The third thing I’ve noticed is how years of coding predominantly in “the {} languages” (Perl, and occasional bits of C, Java, etc) make it difficult for my brain to parse Scheme/Lisp syntax like this:

    (define (abs x)
      (cond ((> x 0) x)
            ((= x 0) 0)
            ((< x 0) (- x))))

    My brain also encounters similar pain parsing Ruby (though it does use {braces}):

    5.times { print "Odelay!" }

    and Smalltalk:

    a := [:x | x + 1]

    If you've ever learned another spoken language (and I'm not referring to chatting with a programmer-friend in Perl with statements like "oops - s/oskar/fred/g in that last statement"), you'll probably remember the first time that you started "thinking in the new language".

    Right now, I'm not "thinking in Ruby" or "thinking in Smalltalk" - I go through a mental process that's very similar to reading a new language:

    1. Change each word to its English equivalent (in the case of a programming language, convert the tokens and keywords to those of a language I already know)
    2. Rearrange the words so that they make sense in English - if the direct translation is "couch sat on by man", I'd rearrange to "the man sat on the couch". For more complicated statements, I'd have to try out multiple rearrangements, until I find something that makes sense in the sentence's context.
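To show what that translate-and-rearrange step produces, here is my own rough rendering of the three snippets above in Python. The function names are made up, and the translations are approximations rather than exact equivalents (the Ruby version prints, whereas this one returns the string).

```python
def scheme_abs(x):
    # (cond ((> x 0) x) ((= x 0) 0) ((< x 0) (- x)))
    if x > 0:
        return x
    elif x == 0:
        return 0
    else:
        return -x


def odelay(times=5):
    # 5.times { print "Odelay!" }
    # Ruby's print adds no newline, so the text is simply repeated.
    return "".join("Odelay!" for _ in range(times))


# a := [:x | x + 1]
increment = lambda x: x + 1


if __name__ == "__main__":
    print(scheme_abs(-3))
    print(odelay())
    print(increment(1))
```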

    The process I have to follow with new programming languages is pretty similar to the process of converting from one spoken language to another.

    With time, no doubt, I'll be "thinking in" these programming languages. But the way to get there isn't by reading the language - it's by writing it.

    Luckily, in this process I don't have to upset people with my poor pronunciation and grammar.