Professional GEM - Part VIII - User interfaces
Professional GEM 57
PART VII
USER INTERFACES
AND NOW FOR SOMETHING COMPLETELY DIFFERENT
In response to a number of requests, this installment of ST
PRO GEM will be devoted to examining a few of the principles of
computer/human interface design, or "religion" as some would
have it. I'm going to start with basic ergonomic laws, and try
to draw some conclusions which are fairly specific to designing
for the ST. If this article meets with general approval,
further "homilies" may appear at irregular intervals as part of
the ST PRO GEM series.
For those who did NOT ask for this topic, it seems fair to
explain why your diet of hard-core technical information has
been interrupted by a sermon! As a motivater, we might consider
why some programs are said by reviewers to have a "hot" feel
(and hence sell well!) while others are "confusing" or
"boring".
Alan Kay has said that "user interface is theatre". I
think we may be able to take it further, and suggest that a
successful program works a bit of magic, persuading the user to
suspend his disbelief and enter an imaginary world behind the
screen, whether it is the mathematical world of a spreadsheet,
or the land of Pacman pursued by ghosts.
A reader of a novel or science fiction story also suspends
disbelief to participate in the work. Bad grammar and clumsy
plotting by the author are jarring, and break down the
illusion. Similarly, a programmer who fails to pay attention to
making his interface fast and consistent will annoy the user,
and distract him from whatever care has been lavished on the
functional core of the program.
CREDIT WHERE IT'S DUE ?
Before launching into the discussion of user interface, I
should mention that the general treatment and many of the
specific research results are drawn from Card, Newell, and
Moran's landmark book on the topic, which is cited at the end of
the article. Any errors in interpretation and application to
GEM and the ST are entirely my own, however.
Professional GEM Part VIII 58
FINGERTIPS
We'll start right at the user's fingers with the basic
equation governing positioning of the mouse, Fitt's Law, which
is given as
T = I * LOG2( D / S + .5)
where T is the amount of time to move to a target, D is the
distance of the target from the current position, and S is the
size of the target, stated in equivalent units. LOG2 is the
base 2 (binary) logarithm function, and I is a proportionality
constant, about 100 milliseconds per bit, which corresponds to
the human's "clock rate" for making incremental movements.
We can squeeze an amazing amount of information out of this
formula when attempting to speed up an interface. Since motion
time goes up with distance, we should arrange the screen with
the usual working area near the center, so the mouse will have
to move a smaller distance on average from a selected object to
a menu or panel. Likewise, any items which are usually used
together should be placed together.
The most common operations will have the greater impact on
speed, so they should be closest to the working area and perhaps
larger than other icons or menu entries. If you want to have
all other operations take about the same time, then the targets
farthest from the working area should be larger, and those
closer may be proportionately smaller.
Consider also the implications for dialogs. Small check
boxes are out. Large buttons which are easy to hit are in.
There should be ample space between selectable items to allow
for positioning error. Dangerous options should be widely
separated from common selections.
MUSCLES
Anyone who has used the ST Desktop for any period of time
has probably noticed that his fingers now know where to find the
File menu. This phenomenon is sometimes called "muscle memory",
and its rate of onset is given by the Power Law of Practice:
T(n) = T(1) * n ** (-a)
where T(n) is the time on the nth trial, T(1) is the time on the
first trial, and a is approximately 0.4. (I have appropriated
** from Fortran as an exponentiation operator, since C lacks
one.)
Professional GEM Part VIII 59
This first thing to note about the Power Law is that it
only works if a target stays in the same place! This should be
a potent argument against rearranging icons, menus, or dialogs
without some explicit request by the user. The time to hit a
target which moves around arbitrarily will always be T(1)!
In many cases, the Power Law will also work for sequences
of operations to even greater effect. If you are a touch
typist, you can observe this effect by comparing how fast you
can enter "the" in comparison to three random letters. We'll
come back shortly to consider what we can do to encourage this
phenomenon.
EYES
Just as fingers are the way the user sends data to the
computer, so the eyes are his channel from the machine. The
rate at which information may be passed to the user is
determined by the "cycle time" of his visual processor.
Experimental results show that this time ranges between 50 and
200 milliseconds.
Events separated by 50 milliseconds or less are always
perceived as a single event. Those separated by more than 200
milliseconds are always seen as separate. We can use these
facts in optimizing user of the computer's power when driving
the interface.
Suppose your application's interface contains an icon which
should be inverted when the mouse passes over it. We now know
that flipping it within one twentieth of a second is necessary
and sufficient. Therefore, if a "first cut" at the program
achieves this performance, there is no need for further
optimization, unless you want to interleave other operations.
If it falls short, it will be necessary to do some assembly
coding to achieve a smooth feel.
On the other hand, two actions which you want to appear
distinct or convey two different pieces of information must be
separated by an absolute minimum of a fifth of a second, even
assuming that they occur in an identical location on which the
user's attention is already focused.
We are able to influence the visual processing rate within
the 50 to 200 millisecond range by changing the intensity of the
stimulus presented. This can be done with color, by flashing a
target, or by more subtle enhancements such as bold face type.
For instance, most people using GEM soon become accustomed to
the "paper white" background of most windows and dialogs. A
dialog which uses a reverse color scheme, white letters on
Professional GEM Part VIII 60
black, is visually shocking in its starkness, and will
immediately draw the user's eyes.
It should be quickly added that stimulus enhancement will
only work when it unambiguously draws attention to the target.
Three or four blinking objects scattered around the screen are
confusing, and worse than no enhancement at all!
SHORT-TERM MEMORY
Both the information gathered by the eyes and movement
commands on their way to the hand pass through short-term memory
(also called working memory). The amount of information which
can be held in short-term memory at any one time is limited.
You can demonstrate this limit on yourself by attempting to type
a sheet of random numbers by looking back and forth from the
numbers to the screen. If you are like most people, you will be
able to remember between five and nine numbers at a time. So
universal is this finding that it is sometimes called "the magic
number seven, plus or minus two".
This short-term capacity sets a limit on the number of
choices which the user can be expected to grasp at once. It
suggests that the number of independent choices in a menu, for
instance, should be around seven, and never exceed nine. If
this limit is violated, then the user will have to take several
glances, with pauses to think, in order to make a choice.
CHUCKING
The effective capacity of short-term memory can be
increased when several related items are mentally grouped as a
"chunk". Humans automatically adopt this strategy to save
themselves time. For instance, random numbers had to be used
instead of text in the example above, because people do not type
their native language as individual characters. Instead, they
combine the letters into words and remember these chunks
instead. Put another way, the characters are no longer
considered as individual choices.
A well designed interface should promote the use of
chunking as a strategy by the user. One easy way is to gather
together related options in a single place. This is one reason
that like commands are grouped into a single menu which is
hidden except for its title. If all of the menu options were
"in the open", the user would be overwhelmed with dozens of
alternatives at once. Instead, a "Show Info" command, for
instance, becomes two chunks: pick File menu, then pick Show.
Professional GEM Part VIII 61
Sometimes the interface can accomplish the chunking for the
user. Consider the difference between a slider bar in a GEM
program, and a three digit entry field in a text mode
application. Obviously, the GEM user has fewer decisions to
make in order to set the associated variable.
THINK !
While we are puttering around trying to speed up the
keyboard, the mouse, and the screen, the user is actually trying
to get some work done. We need to back off now, and look at the
ways of thinking, or cognitive processes, that go into
accomplishing the job.
The user's goal may be to enter and edit a letter, to
retrieve information from a database, or simply draw a picture,
but it probably has very little to do with programming. In
fact, the Problem Space Principle says that the task can be
described as a set of states of knowledge, a set of operators
and associated constraints for changing the states, and the
knowledge to choose the appropriate operator, which resides in
the user's head.
Those with a background in systems theory can consider this
as a somewhat abstract, but straightforward, statement in terms
of state variables and operators. A programmer might compare
the knowledge states to the values of variables, the operators
to arithmetic and logic operations, the constraints to the rules
of syntax, and the user's knowledge to the algorithm embodied by
a program.
ARE WE NOT MEN ?
A rational person will try to attain his goals (get the job
done) by changing the state of his problem space from its
initial state to the goal state. The initial state, for
instance, might be a blank word processor screen. The desired
final state is to have a completed business letter on the
screen.
The Rationality Principle says that the user's behavior in
typing, mousing, and so on, can be explained by considering the
tasks required to achieve the goal, the operators available to
carry out the tasks, and the limitations on the user's
knowledge, observations, and processing capacity. This sounds
like the typical user of a computer program must spend a good
deal of time scratching his head and wondering what to do next.
In fact, one of Card and Moran's key results is that this is NOT
what takes place.
Professional GEM Part VIII 62
What happens, in fact, is that the trained user strikes a
sort of "modus vivendi" with his tool and adopts a set of
repetitive, trained behavior patterns as the best way to get the
job done. He may go so far as to ignore some functions of the
program in order to set up a reliable pattern. What we are
looking for is a way of measuring and predicting the "quality"
of this trained behavior. Since using computers is a human
endeavor, we should consider not only the speed with which the
task is completed, but the degree of annoyance or pleasure
associated with the process.
Card and Moran constructed a series of behavioral models
which they called GOMS models, for
Goals-Operators-Methods-Selection. These models suggested that
in the training process the user learned to combine the basic
operators in sequences (chunks!) which then became methods for
reaching the goals. Then these first level methods might be
combined again into second level methods, and so forth, as the
learning progressed.
The GOMS models were tested in a lengthy series of trials
at Xerox PARC using a variety of word processing software.
(Among the subjects of these experiments were the inventors of
the windowing methods used in GEM!) The results were again
surprising: the level of detail in the models was really
unimportant!
It turned out to be sufficient to merely count up the
number of keystrokes, mouse movements, and thought intervals
required by each task. After summing up all of the tasks, any
extra time for the computer to respond, or the user to move his
hands from keyboard to mouse, or eyes from screen to printed
page is added in. This simplified version is called the
Keystroke-Level Model.
As an example of the Keystroke Model, consider the task of
changing a mistyped letter on the screen of a GEM word
processor. This might be broken down as follows: 1) find the
letter on the screen; 2) move hand to mouse; 3) point to letter;
4) click mouse button; 5) move hand to keyboard; 6) strike
"Delete" key; 7) strike key for new character.
The sufficiency of the Keystroke Model is great news for
our attempt to design faster interfaces. It says we can
concentrate our efforts on minimizing the number of total
actions to be taken, and making sure that each action is as fast
as possible. We have already discussed some ways to speed up
the mouse and keyboard actions, so let's now consider how to
speed up the thought intervals, and cut the number of actions.
Professional GEM Part VIII 63
One way to cut down "think time" is to make sure that the
capacity of short-term memory is not exceeded during the course
of a task. For example, the fix-a-letter task described above
required the user to remember 1) his place in the overall job of
typing the document; 2) the task he is about to perform; 3)
where the bad character appeared, and 4) what the new character
was. When this total of items creeps toward seven, the user
often loses his place and commits errors.
You can appreciate the ubiquity of this problem by
considering how many times you have made mistakes nesting
parentheses, or had to go back to count them, because too many
things happened while typing the line to remember the nesting
levels. The moral is that operations with long strings of
operands should be avoided when designing an interface.
The single most important factor in making an interface
comfortable to use is increasing its predictability, and
decreasing the amount of indecision present at each step during
a task. There is (inevitably) an Uncertainty Principle which
relates the number of choices at each step to the associated
time for thought:
T = I * LOG2 ( N + 1)
where LOG2 is the binary logarithm function, N is the number of
equally probable choices, and I is a constant of approximately
140 msec/bit. When the alternates are not equally probable, the
function is more complex:
T = I * SUM-FOR-i-FROM-1-TO-N (P(i) * LOG2( 1 / P(i) + 1) )
where the P(i) are the probabilities of each of the choices
(which must sum to one). (SUM-FOR-i... is the best I can do for
a sigma operator on-line!) Those of you with some information
theory background will recognize this formula as the entropy of
the decision; we'll come back to that later.
So what can we learn from this hash? It turns out, as we
might expect, that we can decrease the decision time by making
some of the user's choices more probable than others. We do
that by means of feedback cues from the interface.
The important of reliable, continuous meaningful feedback
cannot be emphasized enough. It helps the beginner learn the
system, and its predictability makes the program comfortable for
the expert. Programs with no feedback, or unreliable cues,
produce confusion, dissonance, and frustration in the user.
This principle is so important that I going to give several
examples from common GEM practice. The Desktop provides several
Professional GEM Part VIII 64
instances. When an object is selected and a menu drops down,
only those choices which are legal for the object are in black.
The others are dimmed to grey, and are therefore removed from
the decision. When a pick is made from the menu, the bar entry
remains black until the operation is complete, reassuring the
user that the correct choice was made. In both the Desktop and
the RCS, items which are double-clicked open up with a "zoom
box" from the object, again showing that the right object was
picked.
Other techniques are useful when operator icons are exposed
on the screen. When an object is picked, the legal operations
might be outlined, or the bad choices might be dimmed. If the
screen flashing produced by this is objectionable, the legal
icons can be made mouse sensitive, so they will "light up" when
the cursor passes over - again showing the user which choices
are legal.
The desire for feedback is so strong that it should be
provided even while the computer is doing an operation on its
own. The hour glass mouse form is a primitive example of this.
More sophisticated are "progress indicators" such as animated
thermometer bars, clocks, or text displays of the processing
steps. The ST Desktop provides examples in the Format and Disk
Copy functions. The purpose of all of these is to reassure the
user that the operation is progressing normally. Their lack can
lead to amusing spectacles such as secretaries leaning over to
hear if their disk drives are working!
Another commonly overlooked feature is error prevention and
correction. Card and Moran's results showed that in order to go
faster, people will tolerate error rates of up to 30% in their
work. Any program which does not give a fast way to fix
mistakes will be frustrating indeed!
The best way to cope with an error is to "make it didn't
happen", to quote a common child's phrase. The same feedback
methods discussed above are also effective in preventing the
user from picking inappropriate combinations of objects and
operations. Replacement of numeric type-ins with sliders or
other visual controls eliminates the common "Range Error". The
use of radio buttons prevents the user from picking incompatible
options. When such techniques are used consistently, the
beginner also gains confidence that he may explore the program
without blundering into errors.
Once an error has occured, the best solution is to have an
"inverse operation" immediately available. For instance, the
way to fix a bad character is to hit the backspace key. If a
line is inadvertantly deleted, there should be a way to restore
it.
Professional GEM Part VIII 65
Sometimes the mechanics of providing true inverses are
impractical, or end up cluttering the interface themselves. In
these cases, a global "Undo" command should be provided to
reverse the effect of the last operation, no matter what it
was.
OF MODES AND BANDW
Now I am going to depart from the Card, Newell and Moran
thread of discussion to consider how we can minimize the number
of operations in a task by altering the modes of the interface.
Although "no modes" has been a watchword of Macintosh
developers, the term may need definition for Atarians.
Simply stated, a mode exists any time you cannot get to all
of the capabilities of the program without taking some
intermediate step. Familiar examples are old-style
"menu-driven" programs, in which user must make selections from
a number of nested menus in order to perform any operation. The
options of any one menu are unavailable from the others.
Recall that the user is trying to accomplish work in his
own problem space, by altering its states. A mode in the
program adds additional states to the problem space, which he is
forced to consider in order to get the job done. We might call
an interface which is completely modeless "transparent", because
it adds no states between the user and his work. One of the
best examples of a transparent program is the 15-puzzle in the
Macintosh desk accessory set. The problem space of rearranging
the tiles is identical between the program and a physical
puzzle.
Unfortunately, most programmers find themselves forced to
put modes of some sort into their programs. These often arise
due to technological limitations, such as memory space, screen
"real estate", or performance limitations of peripherals. The
question is how the modes can be made least offensive.
I will make the general claim that the frustration which a
mode produces is directly proportional to the amount of the
user's bandwidth which it consumes. In other words, we need to
consider how many keystrokes, mouse clicks, eye movements, and
so on, are going into manipulating the true problem states, and
how many are being absorbed by the modes of the program. If the
interface is wasting a large amount of the user's effort, it
will be perceived as slow and annoying.
Here we can consider again the hierarchy of goals and
methods which the user employs. When the mode is low in the
hierarchy, and close to the user's "fingertips", it is
Professional GEM Part VIII 66
encountered the most frequently. For instance, consider how
frustrating it would be to have to hit a function key before
typing in each character!
The "menu-driven" style of programs mentioned above are
almost as bad, since usually only one piece of information is
collected at each menu. Such a program becomes a labyrinth of
states better suited to an adventure game!
The least offensive modes are found at the higher, goal
related levels of the hierarchy. The better they align with
changes in the state of the original problem, the more they are
tolerated. For example, a word processing program might have
one screen layout for program editing, another for writing
letters, and yet another while printing the documents. A
multi-function business package might have one set of menus for
the spreadsheet, another for a graphing module, and a third for
a database.
In some cases the problem solved by the program has
convenient "fracture lines" which can be used to define the
modes. An example in my own past is the RCS, where the editing
of each type of resource tree forms its own mode, with each of
the modes nested within the overall mode and problem of
composing the entire resource tree.
TO DO IS TO BE !
Any narrative description of user interface is bound to be
lacking. There is no way text can convey the vibrancy and
tactile pleasure of a good interface, or the sullen boredom of a
bad one. Therefore, I encourage you to experiment. Get out
your favorite arcade game and see if you can spot some of the
elements I have described. Dig into your slush pile for the
most annoying program you have ever seen, run it and see if you
can see mistakes. How would you fix them? Then... go do it to
your own program!
AMEN...
This concludes the sermon. I'd like some Feedback as to
whether you found this Boring Beyond Belief or Really Hot
Stuff. If enough people are interested, homily number two will
appear a few episodes from now. The very next installment of
ST PRO GEM will go back to basics to explore VDI drawing
primitives. In the meantime, you might investigate some of the
Good Books on interface design referenced below.
Back to Professional_GEM