Introduction


In the United States, the most common keyboard layout for computer keyboards remains the one designed by Christopher Sholes for the original Remington "Type Writer" in 1876. This layout is commonly called QWERTY after the order of the first few letters on its top row.

Other layouts for a standard 3-row keyboard exist (as well as some interesting nonstandard arrangements). I have been using the Dvorak keyboard layout for about a year now. I like it a lot for my daily work, which involves a lot of typing. I used to feel numbness in the backs of my hands after a long day with QWERTY, but I don't with Dvorak. And quantified measurements bear out its efficiency relative to QWERTY. (I acknowledge the argument that learning Dvorak also got me to type with the right fingers on the right keys, but I don't think that's the whole story.)

But Dvorak designed his layout in the 1930s without the aid of computers. It contains a couple of annoying features that lead to common errors in my typing (namely, the placement of Y and B). Could a modern evolutionary algorithm and a huge input sample discover a better arrangement? I had to give it a try. The results surprised me!

This note summarizes my experiments thus far.

The first experiment

I limited this experiment to the 30 keys on the three main rows under the four fingers of the two hands. They include the 26 letters of the English alphabet and four punctuation symbols (comma, period, quote, and semicolon). (A QWERTY layout typically has the slash in this region instead of quote.) Other punctuation was ignored.

First, I needed a quantifiable metric by which one keyboard layout could be compared to another. I constructed a complicated function that measures the amount of "work" needed to touch-type a given text with a given layout. This function estimates total finger travel, with some extra penalties in some situations and some bonuses in others.  Specifically, I simulate the typing of a (single) word with these rules:

My work estimation function simulates the typing of words, not simply digraphs and longer sequences. So it captures the effects of fingers being left on whatever keys they are on after they type the letters.
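As a rough illustration of why simulating whole words matters, here is a minimal Python sketch of a finger-travel estimate. It is not the original C program and it leaves out the penalties and bonuses entirely. The finger-to-column assignment, the home positions, and the names `word_travel`, `FINGER_OF_COLUMN`, and `HOME_COLUMN` are all assumptions made for this example.

```python
import math

# Layouts are written as three 10-character rows (top, home, bottom).
DVORAK = ["',.pyfgcrl", "aoeuidhtns", ";qjkxbmwvz"]
QWERTY = ["qwertyuiop", "asdfghjkl;", "zxcvbnm,.'"]

# Assumed finger for each of the 10 columns; the two index fingers
# (fingers 3 and 4) each cover two columns.
FINGER_OF_COLUMN = [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]
# Assumed resting column on the home row for each of the 8 fingers.
HOME_COLUMN = [0, 1, 2, 3, 6, 7, 8, 9]


def key_positions(layout):
    """Map each character to its (row, column) position."""
    return {ch: (r, c) for r, row in enumerate(layout) for c, ch in enumerate(row)}


def word_travel(word, layout):
    """Total distance the fingers move to type one word.

    Every finger starts on the home row and stays on the last key it
    pressed, so a later letter on the same finger is charged from there.
    """
    pos = key_positions(layout)
    fingers = [(1, col) for col in HOME_COLUMN]  # row 1 is the home row
    total = 0.0
    for ch in word:
        row, col = pos[ch]
        f = FINGER_OF_COLUMN[col]
        total += math.dist(fingers[f], (row, col))
        fingers[f] = (row, col)
    return total


print(word_travel("the", DVORAK))  # 0.0: all three letters are home keys
print(word_travel("ed", QWERTY))   # 2.0: up to E, then back down to D
```

A plain digraph table would charge "ed" the same no matter what came before it. Tracking where each finger was left is what lets the cost depend on the rest of the word.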

I also needed a corpus of sample text. I collected about 20 megabytes of English from Project Gutenberg. I used the King James Version of the Bible, the complete works of William Shakespeare and (naturally!) Charles Darwin, the first volume of Gibbon's Decline and Fall, Boswell's Life of Johnson, Melville's Moby Dick, the Education of Henry Adams, and other works. To that I added a decade of sent e-mail (bodies only) and about 100,000 lines of C code. This corpus therefore reflects my own typing to some degree, by intent, and other people would surely observe different results.

Many people have commented that the presence of Elizabethan English in this corpus skews the results.  I don't think so, because the only oddball word that shows up with great frequency is "thou", and that's just a pair of common digraphs that I'm sure I type thousands of times a week.

The total input size is 4293746 words in 3307922 bytes. There are 88732 distinct words that appear more than once, where a word is defined as a contiguous sequence of characters that can be typed with the thirty central keys. The most frequent words are:

240036 the
140994 of
140641 and
92363 to
69993 in
69255 a
51265 i
50069 that
37358 :
37291 is
36819 for
32300 with
30704 he
30079 his
29071 be
28690 as
27130 it
25650 not
22020 this
21603 have
21467 by
21402 but
21241 my
21001 on
20928 was
18867 you
18067 if
17949 from
17483 ;
17330 are
17058 which
16973 all
16958 they
16587 or
14853 will
14809 at
14768 we
14362 shall
13701 their
12595 ,
12416 so
11985 .
11518 had
11454 an
11307 thou
11037 your
10377 sv      (The alphabetic part of the code-name of my project at work)
10296 him
10257 when
10155 one
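The word definition above (a contiguous run of characters typeable on the thirty central keys) can be sketched with a regular expression. This is only an illustration in Python. Folding to lower case before counting is an assumption, suggested by the lower-case "i" in the list, and `count_words` is a hypothetical name.

```python
import re
from collections import Counter

# A "word" is a maximal run of characters found on the 30 central keys:
# the 26 letters plus comma, period, quote, and semicolon.
WORD = re.compile(r"[a-z,.';]+")


def count_words(text):
    """Count words after folding to lower case (an assumption of this sketch)."""
    return Counter(WORD.findall(text.lower()))


sample = "The cat, the dog; and THE end."
for word, n in count_words(sample).most_common(3):
    print(n, word)
```

Under this definition, punctuation that touches a word becomes part of it, so "cat," and "cat" are counted as different words. Anything off the thirty keys, such as a space, digit, or hyphen, ends the word.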

Next, I constructed an evolutionary framework. After trying several, I ended up with a scheme in which a pool of 4096 keyboard layouts compete with each other. The layouts in the initial pool are entirely random. In each generation, they all race to "type" a word list, and their per-word times are multiplied by the word frequencies in the input sample. After the race, the fastest half are kept. The pool is then repopulated by generating a single mutation for each survivor. The mutations are made by permuting keys in the layout, with a 50% chance of swapping two keys, a 25% chance of swapping three, a 12.5% chance of four, and so on.
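The mutation and selection steps described above might look like the following Python sketch. It assumes a layout is stored as a 30-character string, and that "swapping three" keys means rotating them cyclically (the text does not say which permutation is used). The `cost` callback and the function names are placeholders for this example, not the original code.

```python
import random


def mutate(layout, rng=random):
    """Return a copy of `layout` with k of its keys permuted.

    k is 2 with probability 1/2, 3 with probability 1/4, 4 with 1/8,
    and so on (capped at the number of keys).
    """
    k = 2
    while k < len(layout) and rng.random() < 0.5:
        k += 1
    picks = rng.sample(range(len(layout)), k)
    child = list(layout)
    # Rotate the chosen keys one place: each picked slot takes the key
    # from the next picked slot, so all k keys move.
    for i, slot in enumerate(picks):
        child[slot] = layout[picks[(i + 1) % k]]
    return "".join(child)


def weighted_score(layout, word_counts, cost):
    """Per-word cost multiplied by word frequency; lower is better."""
    return sum(n * cost(word, layout) for word, n in word_counts.items())


def next_generation(pool, word_counts, cost, rng=random):
    """Keep the fastest half, then add one mutant per survivor."""
    ranked = sorted(pool, key=lambda lay: weighted_score(lay, word_counts, cost))
    survivors = ranked[: len(ranked) // 2]
    return survivors + [mutate(s, rng) for s in survivors]
```

With k = 2 the rotation is an ordinary swap. Larger values of k let a single mutation make a bigger jump, which is one way for a layout to escape a local optimum.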

The evolutionary framework itself had to evolve. It was challenging to find a scheme with sufficient mutation possibilities that would allow a medium-quality layout enough time to improve itself with multiple mutations before getting eliminated. I also learned that it was important to track only distinct layouts, for otherwise a single good one would rapidly fill the pool with identical copies of itself.

And yes, strictly speaking this is not a genetic algorithm, since the genotypes are never combined.  I just can't think of a way in which that could be done, since a keyboard layout is a permutation of a list rather than a selection from a multiset.

When no new best layout has risen to the top of the pool in some number of generations, the round stops. The best layouts are stored away and the pool repopulated with random keyboards. This allows a fresh start after one layout has populated the pool with itself and its mutations.
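A single round with this stopping rule could be sketched in Python as below. The pool is kept as a set so that only distinct layouts are tracked. The parameter `patience` (how many generations may pass without a new best), the toy `swap_two` mutation, and the small default pool size are all assumptions for illustration; the real program used a pool of 4096.

```python
import random


def random_layout(keys, rng):
    chars = list(keys)
    rng.shuffle(chars)
    return "".join(chars)


def swap_two(layout, rng):
    """Toy mutation for the demo: exchange two keys."""
    i, j = rng.sample(range(len(layout)), 2)
    chars = list(layout)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def run_round(keys, score, mutate, pool_size=64, patience=20, rng=None):
    """Evolve one pool from random layouts until the best stops improving.

    `score` (lower is better) and `mutate` are supplied by the caller.
    Assumes the keys have far more permutations than `pool_size`.
    """
    rng = rng or random.Random()
    pool = set()
    while len(pool) < pool_size:
        pool.add(random_layout(keys, rng))
    best, best_score, stale = None, float("inf"), 0
    while stale < patience:
        # Break ties on the layout string so runs are repeatable.
        ranked = sorted(pool, key=lambda lay: (score(lay), lay))
        top_score = score(ranked[0])
        if top_score < best_score:
            best, best_score, stale = ranked[0], top_score, 0
        else:
            stale += 1
        survivors = ranked[: pool_size // 2]
        pool = set(survivors)  # a set keeps only distinct layouts
        attempts = 0
        # One mutant per survivor in turn; extra passes replace duplicates.
        while len(pool) < pool_size and attempts < 10 * pool_size:
            pool.add(mutate(survivors[attempts % len(survivors)], rng))
            attempts += 1
    return best, best_score


# Toy demo: the score counts keys that are out of alphabetical order.
target = "abcdefgh"
score = lambda lay: sum(a != b for a, b in zip(lay, target))
print(run_round(target, score, swap_two, pool_size=16, patience=5,
                rng=random.Random(0)))
```

Calling `run_round` repeatedly and collecting the winners gives the fresh starts described above.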

Last, there is an "all-star" round in which the best survivors from all the rounds compete. The Dvorak and QWERTY layouts get seeded into this round too for fun.

Enough detail on the experimental framework! Once I was happy with the evaluation function and evolutionary framework, I was fascinated to watch it run in real time and see the intermediate results. I kept a running display of the top five layouts in each generation. Usually, layouts with different home row orders would battle it out until one had proven itself superior. It only took a couple of generations for a round to produce something better than QWERTY. And it quickly became clear that putting the vowels on the home row of the left hand, which is a cornerstone concept of the Dvorak layout, was not seen by the algorithm as optimal.

What came out of the exercise, after running it overnight? Here is the winner, which I used to type the first edition of this note, as well as the Dvorak and QWERTY layouts with their scores:

' , . p y  f g c r l  Dvorak layout
a o e u i  d h t n s  12189785
; q j k x  b m w v z

q w e r t  y u i o p  Sholes' layout, with quote replacing /
a s d f g  h j k l ;  25390660
z x c v b  n m , . '

. u y p q  k l d c g  Best evolved layout
e a i n w  r h t s o  9640479
' , ; f z  j m v b x

The second experiment

The next step was to actually try using the layout.  I spent a couple of days with it, and learned that my layout evaluation function was just too smart for its own good.  Too many words required complicated patterns using the fingers of the right hand. The word "bottom" convinced me that Dvorak was on to something when he designed a keyboard that maximized alternation between the hands.

(The insight is that hand alternation increases parallelism.  When the fingers of one hand are hitting keys, the fingers on the other are getting into position atop the next keys.  This should have been obvious, but it wasn't until I started the third experiment and saw some empirical timing data that I realized how much faster things are with high rates of hand alternation.)
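To make the hand-alternation point concrete, here is a small Python sketch that measures the longest run of consecutive keystrokes on one hand. The two layouts are copied from the first experiment. The function names are invented for this example, and the left hand is assumed to cover the first five columns of each row.

```python
DVORAK = ["',.pyfgcrl", "aoeuidhtns", ";qjkxbmwvz"]
EVOLVED_1 = [".uypqkldcg", "eainwrhtso", "',;fzjmvbx"]  # first experiment's winner


def hand_of(layout):
    """Map each character to 'L' or 'R' (columns 0-4 are the left hand)."""
    return {ch: "L" if col < 5 else "R"
            for row in layout for col, ch in enumerate(row)}


def longest_same_hand_run(word, layout):
    """Length of the longest stretch of letters typed by one hand."""
    hands = hand_of(layout)
    longest = run = 0
    previous = None
    for ch in word:
        hand = hands[ch]
        run = run + 1 if hand == previous else 1
        longest = max(longest, run)
        previous = hand
    return longest


print(longest_same_hand_run("bottom", EVOLVED_1))  # 6: every letter is on the right hand
print(longest_same_hand_run("bottom", DVORAK))     # 2: only the double "t"
```

On the first evolved layout, "bottom" never gives the left hand anything to do, so no finger can get into position while the other hand is typing.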

So I updated -- simplified, really -- my evaluation function.  Now I charge points when too many keys are hit in succession by fingers of the same hand, with some credits for hitting adjacent keys.  Specifically, the new simplified rules are:

Much simpler!  So I ran the experiment again.  What did I see?

' , . p y  f g c r l   Dvorak layout
a o e u i  d h t n s   32129548
; q j k x  b m w v z

q w e r t  y u i o p   Sholes' layout, with quote replacing /
a s d f g  h j k l ;   59514344
z x c v b  n m , . '


k , u y p  w l m f c   Best evolved layout
o a e i d  r n t h s   28281895
q . ' ; z  x v g b j


That looks way more usable to me.  But (perhaps not surprisingly) it sure looks a lot like Dvorak, too.  Its score is not all that much better, and the advantage is probably smaller than the level of error in my work estimation function.  Note that the simple goal of hand alternation did bring the vowels all over to the hand opposite the one with the T (which my program automatically places under the right hand).  But it pulled U out of the home row so that R could live there.
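For a sense of scale, the reported scores can be turned into percentages relative to Dvorak. The short Python sketch below does only that arithmetic. The numbers are copied from the two sets of results, and the dictionary and function names are made up for the example.

```python
# Scores reported for each experiment (lower is better).
EXPERIMENT_1 = {"dvorak": 12189785, "qwerty": 25390660, "evolved": 9640479}
EXPERIMENT_2 = {"dvorak": 32129548, "qwerty": 59514344, "evolved": 28281895}


def reduction_vs(scores, baseline="dvorak"):
    """Percent reduction in score relative to the baseline layout."""
    base = scores[baseline]
    return {name: 100.0 * (base - s) / base for name, s in scores.items()}


for label, scores in (("first", EXPERIMENT_1), ("second", EXPERIMENT_2)):
    print(label, {k: round(v, 1) for k, v in reduction_vs(scores).items()})
```

The evolved layout's margin over Dvorak comes out at roughly 21% in the first experiment and roughly 12% in the second. The two experiments use different scoring functions, so their raw scores should not be compared with each other.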

Other differences from Dvorak are not that profound, and seem to correlate pretty well with a simple letter frequency analysis.  I note that H was put where Dvorak has N, perhaps so that SH would be seen as using adjacent fingers.  And P and Y swapped places.

Now I'm going to try this second layout for a day or two and see whether it's sufficiently (subjectively) superior to Dvorak to be worth the hassle of switching to it...

(At this point, this work was the subject of a front-page story on Slashdot and I was deluged with mail.  Thanks!)

The third experiment


Well, it's a usable layout.  But I'm still not comfortable with the way that I assigned cost estimates to the various keyboard positions.

So I've switched temporarily back to Dvorak (oh, it feels so good) and have written a program to monitor my keystrokes.  I plan to collect a couple of weeks' worth of data.  I'm tracking each single letter, digraph, and trigraph with timing data and error rates.  This will let me construct a map of the keyboard that has a real empirical cost for movement between most keys. Of course, it may well be biased by my use of the Dvorak layout; I have found some QWERTY users willing to collect similar tracking data. The application to the evolutionary algorithm is obvious.
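Per-sequence timing of this kind can be sketched in a few lines of Python. The event format used here, a list of (key, timestamp in milliseconds) pairs, is a guess for the sake of illustration and not the actual format of the monitoring program. Error rates are left out.

```python
from collections import defaultdict


def ngram_times(events, n):
    """Mean time in ms to complete each n-key sequence.

    `events` is a list of (key, timestamp_ms) pairs in typing order;
    the time for an n-gram is measured from its first key to its last.
    """
    totals = defaultdict(lambda: [0, 0])  # ngram -> [sum of ms, count]
    for i in range(len(events) - n + 1):
        window = events[i : i + n]
        gram = "".join(key for key, _ in window)
        entry = totals[gram]
        entry[0] += window[-1][1] - window[0][1]
        entry[1] += 1
    return {gram: total / count for gram, (total, count) in totals.items()}


events = [("t", 0), ("h", 120), ("e", 250), (" ", 400), ("t", 900), ("h", 1000)]
print(ngram_times(events, 2)["th"])   # 110.0: mean of 120 ms and 100 ms
print(ngram_times(events, 3)["the"])  # 250.0
```

A real collector would probably also discard long pauses, since a gap of several seconds says more about thinking than about finger movement.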

Interesting patterns have already arisen.  Suppose that you group the 30 keys on the three main rows into six groups (top, home and bottom rows for each hand) as follows:

    left  right
    00000 11111  top row
    22222 33333  home row
    44444 55555  bottom row

Here are the average transition times in milliseconds for my own typing between a keystroke in one group and a keystroke in another:

mean ms (row = from group, column = to group):
 ->    0      1      2      3      4      5
0    403    198    311    351    547    236
1    271    193    157    237    614    320
2    287    119    159    134    409    123
3    263    415    140    180    418    152
4    499    524    334    495    629    173
5    159    245    168    307    268    318


I see that the fastest transitions are from the left hand home row (group 2) to anything under the right hand (groups 1, 3, and 5) and vice versa.  Note that the Dvorak layout that I use has all the vowels on the left hand's home row (group 2).
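That observation can be checked mechanically. The Python sketch below copies the table into a nested list and sorts the transitions by time. `group_of` shows one way to number the six groups from a key's row and column; both function names are invented for the example.

```python
# MEAN_MS[src][dst]: mean transition time in ms, copied from the table.
MEAN_MS = [
    [403, 198, 311, 351, 547, 236],
    [271, 193, 157, 237, 614, 320],
    [287, 119, 159, 134, 409, 123],
    [263, 415, 140, 180, 418, 152],
    [499, 524, 334, 495, 629, 173],
    [159, 245, 168, 307, 268, 318],
]


def group_of(row, col):
    """Group number for a key: rows 0-2 (top, home, bottom), columns 0-9."""
    return 2 * row + (0 if col < 5 else 1)


def fastest_transitions(table, count):
    """The `count` quickest (ms, from_group, to_group) entries."""
    pairs = [(ms, src, dst)
             for src, row in enumerate(table)
             for dst, ms in enumerate(row)]
    return sorted(pairs)[:count]


for ms, src, dst in fastest_transitions(MEAN_MS, 6):
    print(f"{src} -> {dst}: {ms} ms")
```

The three quickest entries all start in group 2 and end on the right hand, and two of the next three are the reverse trips from groups 3 and 1 back to group 2.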

(News flash: Running the keyboard layout evolution program using the inter-key timing data collected from my own fingers has produced a result so surprising that I don't really trust it!  Namely, the evolutionary algorithm couldn't produce a layout that could beat the Dvorak layout that gets seeded into the final all-star round.  I've enlisted a bunch of QWERTY typists to collect their own keystroke data and see what the experiment produces for them.  Look for more results in a couple of weeks.)

(Update: A couple of weeks have now passed, but I haven't received any QWERTY keystroke timing data yet that I can plug into the evolutionary model.  Volunteers, if your $HOME/.kbwatch file is larger than a few hundred Kbytes, please "make kbstats_data.c" and mail it in to me.  Thanks!)

Links


I used xkbcomp to remap my keyboard; here is the file if you are crazy enough to want to try it yourself.

In all, this little program required about 1000 lines of C.

A better picture of the second layout is here in a PDF file.  Note that I have also swapped CAPS LOCK with the left CONTROL key and exchanged the parentheses with the angled brackets (less than and greater than).