Smashing the monopoly
This article continues on the path begun in “Vanhassa vara parempi”, but instead of simply praising old things, we will look into the past for inspiration for the future. More specifically, we will look at ways to fuse the good old command line with graphical user interfaces, on which the defective WIMP paradigm currently has a monopoly, although it need not be so.
So, consider a shell, such as bash, running in a terminal emulator, such
as xterm. All you see is text and perhaps some special characters. But
wouldn't it be nice to display graphics for certain things? What if
uptime or a similar command could, in addition to the load averages,
display a graph of the CPU usage for the past few minutes? If dd
could display a graph or bar depicting disk usage? If it were possible to
display caution sign images for error conditions? Or, more generally,
cat an image to view it? And all this conveniently within the “terminal
stream”, instead of popping up new windows.
This is not a new idea, I should note. Some variants of the w3m and Links web browsers in fact have kludges to display images over the terminal, but since this isn't coordinated with the terminal emulator, it is not difficult to guess that the result isn't very good. XMLterm can also potentially support all kinds of graphics within the terminal. I do think it takes the wrong approach, however: a bloated and over-complicated approach that isn't even general enough. The terminal software shouldn't support complex XML languages or even HTML.
Instead, I propose that in the case of X11, the terminal emulator should simply enable displaying arbitrary XIDs (X windows and bitmaps/pixmaps) within the terminal stream, in two modes. These would be the inline and display modes, to borrow TeX math mode terminology. In inline mode, the XID would take the height of a single line and a variable number of columns (but obviously not wrap to another line). XIDs in display mode would take multiple lines.
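To make the two modes concrete, here is a minimal sketch of how a terminal might work out how many character cells an insert occupies. Everything in it is an assumption made for illustration: the names Cell and insert_extent, the clamping to the available columns, and the rounding up to whole cells are not part of any existing terminal emulator.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Size of one character cell in pixels (hypothetical helper type)."""
    width: int
    height: int


def insert_extent(mode, pixel_width, pixel_height, cell, columns_available):
    """Return (columns, lines) that an insert would occupy.

    mode is "inline" or "display". columns_available is the number of
    columns left on the current line (inline) or the terminal width
    (display). All of this is an illustrative assumption.
    """
    if mode not in ("inline", "display"):
        raise ValueError("mode must be 'inline' or 'display'")
    # Round up to whole cells, and never let the insert exceed the
    # columns it has been given, so it cannot wrap.
    columns = min(max(1, math.ceil(pixel_width / cell.width)), columns_available)
    if mode == "inline":
        # Inline inserts are exactly one text line tall.
        return (columns, 1)
    # Display inserts take as many whole lines as their height needs.
    lines = max(1, math.ceil(pixel_height / cell.height))
    return (columns, lines)
```

With an 8x16 pixel cell, a 100x40 pixel window would take 13 columns on one line in inline mode (being scaled or clipped to the line height) and 13 columns by 3 lines in display mode.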
The benefit of this approach is, obviously, that the terminal emulator can be kept quite simple and still support all the old escape code functionality without much trouble, yet it can display arbitrary content that isn't limited by the capabilities of any particular data format. Even interactive content is easily possible, because the actual drawing is handled by external programs, and the terminal simply keeps track of these “inserts”. The protocol for doing this can itself be quite simple too. Just a few new terminal escape codes are needed: one code to query terminal font size and colour information, et cetera, and another to request inserting a given XID. The rest can simply be the relevant parts (resize/move, close) of the ICCCM.
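As a rough illustration of how small the "insert this XID" request could be, here is a sketch that encodes and decodes such a request. It borrows the general shape of existing OSC sequences (ESC ] ... terminated by ESC \), but the code number 7777, the field layout, and the function names are all invented for this example; no terminal emulator is known to implement them.

```python
import re

ESC = "\x1b"
ST = ESC + "\\"          # String Terminator, as used by existing OSC sequences
HYPOTHETICAL_OSC = 7777  # made-up code number, purely for illustration


def insert_request(xid, mode):
    """Build the (hypothetical) escape sequence asking for an insert."""
    if mode not in ("inline", "display"):
        raise ValueError("mode must be 'inline' or 'display'")
    # The XID is sent in hexadecimal, the way X tools usually print it.
    return f"{ESC}]{HYPOTHETICAL_OSC};insert;{mode};{xid:#x}{ST}"


_REQUEST = re.compile(
    re.escape(f"{ESC}]{HYPOTHETICAL_OSC};insert;")
    + r"(inline|display);(0x[0-9a-f]+)"
    + re.escape(ST)
)


def parse_insert_request(data):
    """Terminal side: return (mode, xid), or None if data is something else."""
    match = _REQUEST.fullmatch(data)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)
```

A client program would create its X window, write insert_request(window_id, "display") to standard output, and then leave it to the terminal to reparent, move, and resize the window as the text scrolls.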
Things get even more exciting if we combine the above ideas with a shell and tools with typed pipes (and arguments). Something like MSH, although I would do all the details completely differently. Immutable ADTs (belonging to various type classes), not objects, are more suitable for piping: objects have mutable state and identity, whereas ADTs as such don't. The .NET API is horrible bloat; I want to stick to something more unixy (i.e. simple and compact) yet also more functional for the basics. Shell piping is, after all, in many ways similar to lazy functional programming and Haskell's IO monad as well. H4sh is indeed a nice hack bringing FP tools to the *nix shell. And, yeah, speaking of Haskell, I want strong static typing with (parametric) polymorphism for the shell as well.
In any case, assuming such a shell, upon receiving typed data from a command, the shell should check whether the data type implements a graphical display routine (i.e. belongs to a suitable type class), and execute the routine (possibly another command on the file system) for displaying it graphically in the terminal. Otherwise it should check if the type implements a textual display routine, and so on.
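The dispatch order described above can be sketched roughly as follows. This is only an approximation: it uses Python protocols checked at run time as a stand-in for statically checked type classes, and every name in it (GraphicalDisplay, TextualDisplay, present, the example types, and the XID value) is hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class GraphicalDisplay(Protocol):
    """Stand-in for a 'can be shown graphically' type class."""
    def display_graphical(self) -> int: ...  # returns an XID to insert


@runtime_checkable
class TextualDisplay(Protocol):
    """Stand-in for a 'can be shown as text' type class."""
    def display_text(self) -> str: ...


@dataclass(frozen=True)  # immutable, in the spirit of ADTs rather than objects
class LoadAverages:
    one: float
    five: float
    fifteen: float

    def display_text(self) -> str:
        return f"load average: {self.one:.2f}, {self.five:.2f}, {self.fifteen:.2f}"


@dataclass(frozen=True)
class Picture:
    xid: int  # pretend a helper program has already drawn into this window

    def display_graphical(self) -> int:
        return self.xid


def present(value) -> str:
    """Choose a display routine: graphical first, then textual, then raw."""
    if isinstance(value, GraphicalDisplay):
        return f"insert XID {value.display_graphical():#x}"
    if isinstance(value, TextualDisplay):
        return value.display_text()
    return repr(value)
```

Here present(Picture(0x3a00007)) picks the graphical routine, present(LoadAverages(0.5, 0.25, 0.1)) falls back to text, and a plain value such as 42 is shown as is. In a real typed shell the choice would be resolved from the pipe's static type rather than by inspecting values at run time.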
I'm not done yet. There's one more thing I want to discuss. In
“Vanhassa vara parempi” I mentioned that there's a lot to be learned
from old DOS software. One of those things, related to all of the above,
is this: in DOS, executing an application, even a graphical one, from the
command prompt would take over the command prompt, instead of popping up a
new window somewhere around the screen. So do many text-based curses
and other similar programs in *nix terminal emulators. Additionally, these
curses programs allow switching back to the shell by putting them in the
background with ^Z or another binding. I like this behaviour very much.
It's why I prefer using text editors that run in the terminal emulator to
X text editors that might otherwise have better clipboard support and so
on. I'd like applications in X to behave in this manner: to take over the
terminal emulator, but allow switching back to it with a simple binding.
I'd like all such interactive applications (as opposed to the “terminal
stream” applications discussed above) to think they were the only program
running on the screen like those old DOS programs did. Only now the screen
would be the terminal emulator, and it could be resized on the fly and so
on.
Of course, Ion to a degree emulates this behaviour by putting new
windows in the currently active frame, but it isn't perfect. It doesn't
really know which terminal (if any) a window originated from, so it
can't switch back to it or always put the window in the right frame. The
FDO startup notification specification may be a small remedy in the
future. But there's another big problem as well. Too many programs use
multiple windows (per “document”) instead of the document-window-as-screen
approach. Additionally, shell integration would be useful. For example,
the shell's fg command could switch to a window of the program to be
put in the foreground.
As we can see, there are many ways in which the distinction between graphical user interfaces and the command line can be removed, and therefore both improved. WIMP does not have to have a monopoly on graphics.