Reforming the GUI monoculture
This article is a continuation of my treatise on the past, the status quo and an unlikely future of computer user interfaces, begun in “Smashing the monopoly” and to some degree in “Vanhassa vara parempi”.
While “Smashing the monopoly” discussed some ways to improve the terminal and shell and to close the gap between the text terminal and graphical user interfaces, this story takes the opposite approach. It concentrates on closing the gap from the other direction, and on other reforms of the WIMP GUI paradigm that would make it more keyboard-operable and, as a consequence, less cluttered for mouse operation as well.
Perhaps the biggest problem with the WIMP paradigm is that it tends to
produce what I'd call, for lack of a better word, highly nonlinear designs.
To demonstrate this, take almost any dialog that has more than two widgets
in almost any WIMP program. Locate the active widget, supposing it has
one and the program supports even a limited form of keyboard navigation.
Now pick what you consider the “next” widget after it. Hit Tab. Did
the widget you guessed become activated? Unless you were familiar with the
peculiarities of that particular dialog, I'd wager it didn't. Simple
one-dimensional “linear” keyboard navigation simply isn't good enough for a
two-dimensional layout. To make keyboard navigation predictable and thus
usable, you either have to linearise the layout by making it one-dimensional,
or you have to add an extra dimension to keyboard navigation.
However, navigation keys for all four primary directions aren't enough if the layout doesn't provide a relatively unambiguous decision on which widget, or group of widgets, should be considered to be to the left or right of, or above or below, another. That is, they aren't enough if the layout doesn't consist of linear slices in both coordinates, such as columns of vertically stacked widgets. The primary example of a highly nonlinear and cluttered layout is, of course, the conventional window management scheme or the desktop paradigm. Ion, on the other hand, enables managing windows in a more linear layout. It does allow for what would seem very unpredictable layouts as well, but in Ion such layouts are entirely the user's decision, not the programs'.
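As a minimal sketch of what "linear slices in both coordinates" buys you, consider a dialog modelled as columns of vertically stacked widgets. The `ColumnLayout` class and the widget names below are hypothetical, made up purely for illustration and not taken from any toolkit. The point is only that every directional key has exactly one obvious target.

```python
from dataclasses import dataclass


@dataclass
class ColumnLayout:
    """A dialog laid out as columns of vertically stacked widgets (hypothetical model)."""
    columns: list[list[str]]
    col: int = 0
    row: int = 0

    def focused(self) -> str:
        return self.columns[self.col][self.row]

    def move(self, direction: str) -> str:
        """Move focus one step and return the newly focused widget."""
        if direction == "up":
            self.row = max(self.row - 1, 0)
        elif direction == "down":
            self.row = min(self.row + 1, len(self.columns[self.col]) - 1)
        elif direction in ("left", "right"):
            step = -1 if direction == "left" else 1
            self.col = min(max(self.col + step, 0), len(self.columns) - 1)
            # Columns may differ in height, so clamp the row to the new column.
            self.row = min(self.row, len(self.columns[self.col]) - 1)
        else:
            raise ValueError(f"unknown direction: {direction}")
        return self.focused()


layout = ColumnLayout([["name", "email", "phone"], ["ok", "cancel"]])
layout.move("down")   # focus goes to "email"
layout.move("right")  # focus goes to "cancel", the widget at the same height
```

Moving at an edge simply keeps the focus where it is; wrapping around would be an equally predictable choice.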
In some cases, however, such a stacked-columns approach in two dimensions does not quite work. Sometimes you have to do the column-stacking in the third dimension. Such is the case with the menu bar. To begin listing the problems of the menu bar widget: the menu bar is typically navigated with the same keys as the sub-menus, but there's no way to specifically return to the main menu bar. Instead, the menu bar can only be navigated when a menu entry that does not open a sub-menu is selected. This, however, is not directly related to the unsuitability of the stacked-columns-in-2D approach, and could be fixed within the confines of the menu bar approach. I think, however, that there should not be two types of menus, the menu bar and drop-down menus. It suffices to drop the menu bar, stack the menus in the Z dimension, and simply provide a key or button to display the main menu. This is what many programs did in the DOS era, and what games still often do, although the layout is not always so linear anymore. Ion also uses this approach, although recently menus were experimentally replaced with queries (textual input of the same old menu entries as commands). In fact, a combination of menus and queries would be an interesting and likely fruitful experiment: a sort of deep type-ahead find for menus.
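To make the idea of a deep type-ahead find a little more concrete, here is a rough sketch. It assumes menus are plain nested dictionaries whose leaves are command names; the `MENU` contents and the `flatten` and `find` helpers are invented for the example and are not how Ion implements its menus or queries.

```python
# Hypothetical nested menu: dict values are sub-menus, strings are commands.
MENU = {
    "File": {
        "Open": "open",
        "Save": "save",
        "Export": {"PDF": "export_pdf", "PNG": "export_png"},
    },
    "Edit": {"Paste": "paste", "Preferences": "prefs"},
}


def flatten(menu, path=()):
    """Yield (path, command) for every leaf entry, however deeply nested."""
    for label, entry in menu.items():
        if isinstance(entry, dict):
            yield from flatten(entry, path + (label,))
        else:
            yield path + (label,), entry


def find(menu, typed):
    """Return the full paths of entries whose label starts with the typed text."""
    needle = typed.lower()
    return [
        " > ".join(path)
        for path, _command in flatten(menu)
        if path[-1].lower().startswith(needle)
    ]


print(find(MENU, "p"))   # every entry starting with "p", at any depth
print(find(MENU, "pd"))  # narrows to ['File > Export > PDF']
```

Each keystroke narrows the candidates across all levels at once, so the user never has to know in which sub-menu an entry lives.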
The menu bar is not so usable with the mouse either, because the menu items tend to be small, and aiming at small widgets is one of the worst parts of mouse use. The purportedly Fitts's law based approach of MacOS of locating the focused window's menu bar at the top of the screen isn't really any better, since horizontally the menu entries are still small (or take a lot of screen space). If, on the other hand, some corners of the screen simply contained a menu-activation button or area – as is the case in SymphonyOS/Mezzo – it would be quite easy to hit, and the actual menu entries could be much bigger.
Let us now return to the mention of textual input of commands, as there's a lot to be said about that. You see, I don't think keyboard usability is just about having keyboard shortcuts to everything. Even the term “shortcut” implies that there's something fundamentally wrong about the primary way the UI was intended to be used. The linearisation of the layout that was already discussed is, of course, part of a good keyboard-oriented design. But simply being able to predictably navigate deeply nested dialogs and menus is not always enough. Neither are shortcuts, as they can be difficult to remember, and there can be a lot of them. I think command names can often be the easiest to remember – easier than the location of a widget or a key binding – for things that are used relatively often, but not all the time. After all, we often think in terms of verbs and commands. A command may also be easier to type than a complex escape-meta-alt-control-shift key combination, or than looking up a widget for the action, by mouse or by keyboard. That's one of the reasons why I prefer LaTeX over word processors. (The others are that I don't like WYSIWYG, instead preferring a more what-you-see-is-what-you-mean approach; and all the same reasons that apply to almost any WIMP program.)
Indeed, although it uses the WYSIWYG paradigm, TeXmacs is
quite interesting in that it supports LaTeX-like commands in the input
stream. Simply type '\' (backslash) followed by the name of the command
and enter, and it will query for the parameter for the command.
(Unfortunately in a WYSIWYG manner. I'd prefer to type the whole command
before it was rendered, and space instead of enter.) For example,
typing \section followed by enter will start a new section, and what
is next typed will be the name of that section. I'd like such an approach
to be used much more.
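A rough sketch of the variant I'd prefer, in which the whole command is typed with a space before its parameter and only then interpreted, might look like the following. The `COMMANDS` table, the `run_command` function and the output markup are all hypothetical; this is not how TeXmacs works internally.

```python
# Hypothetical command table: command name -> function of its single parameter.
COMMANDS = {
    "section": lambda arg: f"== {arg} ==",
    "emph": lambda arg: f"*{arg}*",
}


def run_command(line: str) -> str:
    """Interpret a line such as '\\section Introduction'; pass plain text through."""
    if not line.startswith("\\"):
        return line
    # Everything up to the first space is the command name, the rest its parameter.
    name, _, arg = line[1:].partition(" ")
    handler = COMMANDS.get(name)
    if handler is None:
        raise KeyError(f"unknown command: {name}")
    return handler(arg.strip())


print(run_command("\\section Introduction"))  # == Introduction ==
print(run_command("just some text"))          # just some text
```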
Regarding key bindings for often-used actions, there's also the fact
that the bindings of most WIMP GUIs tend to be quite limited and do not
make good use of the easily accessible Control+Key combinations, and
instead require one to use the arrow, Home, End and other keys far away
from the typing position of the hands. In fact, WIMP GUIs tend to waste
perfectly useful key combinations on something used very rarely, such as
Control+P for printing the document (instead of doing the same as the
up arrow), and often do not even provide convenient ways to configure
anything.
Finally, to make keyboard use more attractive, programs should advertise their key bindings more, so that people can easily learn them, or look them up if they haven't. I don't mean a manual or anything of the kind. It should be possible to keep the currently relevant bindings always visible. Needless to say, good old DOS programs tended to do this more than programs today. This is not only a problem of WIMP GUI apps, but also of more traditional Unix terminal programs. When I started using Linux back in the mid-90s, the only powerful editor that I wasn't overwhelmed by was joe, and the number one reason for this was that it has a very convenient help screen and the binding that toggles it on is advertised on the status bar.
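Keeping the currently relevant bindings visible need not be complicated. As a sketch, assume the program keeps a table of bindings per mode; the modes, keys and the `status_line` helper below are invented for illustration and are not taken from joe or any other program.

```python
# Hypothetical bindings, grouped by the mode in which they are relevant.
BINDINGS = {
    "edit": [("C-s", "save"), ("C-f", "find"), ("F1", "help")],
    "find": [("Enter", "next match"), ("Esc", "cancel")],
}


def status_line(mode: str, width: int = 60) -> str:
    """Build a one-line hint of the bindings for `mode` that fits in `width` columns."""
    parts = []
    used = 0
    for key, action in BINDINGS[mode]:
        hint = f"{key}:{action}"
        extra = len(hint) + (2 if parts else 0)  # two spaces between hints
        if used + extra > width:
            break  # drop whole hints rather than cutting one in half
        parts.append(hint)
        used += extra
    return "  ".join(parts)


print(status_line("find"))            # Enter:next match  Esc:cancel
print(status_line("edit", width=12))  # C-s:save
```

Listing the most important binding first (for instance the one that opens the full help screen) ensures it survives even on a narrow status bar.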
Of course, one can ask why such extensive keyboard support is needed at all. Isn't the keyboard an outdated method of input, with mice and, say, voice recognition being the future? As for voice recognition, well, you can't use it everywhere, so clearly other methods of input would still be needed. I also don't think it is applicable to everything. Neither the keyboard nor the mouse is applicable to every application, either. You can't conveniently type this story with a mouse. Likewise, some applications simply need some kind of pointing device: programs that are used to create graphics in particular. However, even in such cases a clever combination of the keyboard and the pointing device can often be the best (known) approach, as evidenced by e.g. Blender and the prevailing input systems of first-person shooter games.
I think the two most important points speaking for keyboard-oriented design are:
In my personal experience, extensive mouse usage tends to be more wearing on the wrists than keyboard usage.
Switching the input device is inconvenient when working with textual data (such as this story or computer code), whereas the keyboard can give efficient access to almost all functionality.
Some people may define efficiency here in terms of productivity. Fast keyboard access to all the functionality they need enables them to get more work done. But I'm not one to endorse such a definition. Instead, I define an interface as efficient if it minimises the time I have to spend interacting with it – and in the long term, no less. I think keyboard-orientation as outlined above best provides such efficiency in most applications.