A well-written program should post very few error
messages indeed; instead, absolutely whenever possible, the program should cope with the
problem gracefully and continue without bothering the customer. By this yardstick, of course, most programs are poorly written.
For the purposes of this discussion, there are two
classes of poorly written programs. First, there is the program that can't remedy things
on its own, or that needs so much hand-holding that it bothers its customers
unnecessarily. Second, and the focus of this discussion, is the kind of program that
encounters some real problem, but confuses or offends the customer by providing an
inadequate error message.
Of course, the best error message is no error message at
all. In the case where something has gone awry, a program should do everything within its
power to remedy the situation at hand. For example, a program should never post a dialog
saying that a file cannot be found unless the program has actually bothered to look for
it. At a minimum, a program (that is to say, a programmer) should search all local hard
drives for the missing file. If the program finds the file in an inappropriate place, the
program should either update its own records to point to the file, or make a copy of the
file in an appropriate place. There should be no need to disturb the customer in either
case.
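As a rough illustration of "bothering to look", here is a minimal Python sketch. The function name locate_file and the idea of passing in a list of fallback directories are assumptions made for this example; a real program would choose its own search roots and would probably limit how long the search may run.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def locate_file(name: str, expected_dir: Path,
                search_roots: Iterable[Path]) -> Optional[Path]:
    """Return the path to `name`, looking beyond the expected location.

    Hypothetical helper: checks the expected directory first, then walks
    each fallback root recursively. Returns None only after a real search.
    """
    expected = expected_dir / name
    if expected.is_file():
        return expected
    for root in search_roots:
        # os.walk yields nothing for a root that does not exist,
        # so a missing search root is silently skipped.
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                return Path(dirpath) / name
    return None
```

If the file turns up somewhere unexpected, the caller can update its own records or copy the file into place, and only when the function returns None is an error message justified.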
If your program has to post an error message, don't
waste the customer's time either before or after the error condition is detected. For
example, an installation program should not begin copying files unless it is certain that
the files will fit onto the destination disk. A simple set of calculations can determine
whether there is adequate disk space, but most programs don't even bother with this
basic check. Just as bad, installation programs frequently refuse to proceed, even when
already-existing files are going to be overwritten.
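The "simple set of calculations" might look something like the following Python sketch. The function names are invented for this example. It assumes that a file already present at the destination is replaced, so its current size is credited back, and it ignores file system overhead such as block rounding.

```python
import shutil
from pathlib import Path
from typing import Iterable


def bytes_needed(sources: Iterable[Path], dest_dir: Path) -> int:
    """Net extra bytes required to copy `sources` into `dest_dir`.

    Assumption: a file that already exists at the destination is replaced,
    so its current size is credited back. A cautious installer that writes
    the new copy before deleting the old one should skip this credit.
    """
    needed = 0
    for src in sources:
        needed += src.stat().st_size
        existing = dest_dir / src.name
        if existing.is_file():
            needed -= existing.stat().st_size
    return max(needed, 0)


def fits_on_destination(sources: Iterable[Path], dest_dir: Path) -> bool:
    """Check free space before any file is copied."""
    free = shutil.disk_usage(dest_dir).free
    return bytes_needed(sources, dest_dir) <= free
```

An installer would call fits_on_destination once, before copying anything, and tell the customer how much space to free if the answer is no.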
Don't depend on the operating system to handle things
properly. Amazingly, after almost twenty years in the field, the DOS COPY and XCOPY
commands don't bother to check for disk space before the copy starts; instead, they
begin copying blindly and hope that the destination disk doesn't fill up before the
operation is complete. Windows is no better; like DOS, it stupidly fails to check for
sufficient disk space before performing a file copy. Worse, if you are copying a set of
files, Windows will stop the process on the first error, will refuse to continue, and will
forget your selection.
When you write code, anticipate the error conditions and
code around them. Try to fulfill the user's goal to the greatest degree possible, and
don't view error conditions as catastrophic unless they are. Remember the program's state
at the time that the error occurred, and permit the user to restore that state easily.
Always write functions that return status codes, and return a unique error code for each
error condition. At the point the status code is returned, there is typically quite a bit
of information available that you can relay to people who are going to need to identify
and fix the problem. On the other hand, remember that your program's internal errors are
not the customer's concern, so don't overload or intimidate the customer. Make it clear
that some information is for the customer to act upon, but that other information is there
only to help the person that is helping her.
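Here is one way this advice might be sketched in Python. The status names and numbers are purely illustrative, and the copy operation is just a convenient stand-in for any task that can fail in several ways. Each failure condition gets its own code, and one bad file does not stop the rest of the batch.

```python
import shutil
from enum import Enum
from pathlib import Path
from typing import Dict, Iterable


class CopyStatus(Enum):
    """One distinct code per outcome (names and numbers are illustrative)."""
    OK = 0
    SOURCE_MISSING = 101
    DEST_MISSING = 102
    PERMISSION_DENIED = 103
    OTHER_IO_ERROR = 199


def copy_one(src: Path, dest_dir: Path) -> CopyStatus:
    """Copy a single file and report what happened as a status code."""
    if not src.is_file():
        return CopyStatus.SOURCE_MISSING
    if not dest_dir.is_dir():
        return CopyStatus.DEST_MISSING
    try:
        shutil.copy2(src, dest_dir / src.name)
    except PermissionError:
        return CopyStatus.PERMISSION_DENIED
    except OSError:
        return CopyStatus.OTHER_IO_ERROR
    return CopyStatus.OK


def copy_all(sources: Iterable[Path], dest_dir: Path) -> Dict[Path, CopyStatus]:
    """Copy every file that can be copied; never stop at the first failure."""
    return {src: copy_one(src, dest_dir) for src in sources}
```

The caller ends up with a complete record of what succeeded and what failed, which is enough both to finish as much of the job as possible and to report the failures precisely.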
What Does a Good Error Message Look Like?
A well-constructed error message
- should identify the program that is posting the error
message
- should alert the customer to the specific problem
- should provide some specific indication as to how
the problem may be solved
- should suggest where the customer may obtain further help
- should provide extra information to the person who is
helping the customer
- should not suggest an action that will fail to solve the
problem and thus waste the customer's time
- should not contain information that is unhelpful,
redundant, incomplete, or inaccurate
- should provide an identifying code to distinguish it from
other, similar messages
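One way to keep this checklist in front of the programmer is to make each item a field that has to be filled in. The following Python sketch is only an illustration; the class name, field names, and layout are assumptions, not a standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorMessage:
    """Fields mirror the checklist above; all names are illustrative."""
    program: str              # which program is posting the message
    code: str                 # identifying code, unique to this message
    problem: str              # the specific problem, in the customer's terms
    steps: List[str] = field(default_factory=list)  # what the customer can try
    help_contact: str = ""    # where to get further help
    support_detail: str = ""  # extra detail for whoever helps the customer

    def render(self) -> str:
        """Lay the message out as plain text, one section per field."""
        lines = [f"{self.program} - error {self.code}", "", self.problem]
        if self.steps:
            lines += ["", "What you can try:"]
            lines += [f"  {n}. {step}" for n, step in enumerate(self.steps, 1)]
        if self.help_contact:
            lines += ["", f"For more help: {self.help_contact}"]
        if self.support_detail:
            lines += ["", f"Information for technical support: {self.support_detail}"]
        return "\n".join(lines)
```

Because the program name, the code, and the problem description are required arguments, a message that omits them cannot be constructed at all.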
A Good Example
One of the best error messages I have ever seen went
something like this:
This was an error message from an applicant tracking
system (called "Applicant Tracking System") that was designed for a personnel
agency by an independent consultant in 1988. The message looked almost, but not quite
exactly, as I've rendered it above. A significant difference is that the original
message did not have a Windows look and feel, because this message came from a DOS
program. I mention this because the author provided this detailed message even in the days
of the 640K memory limit. The customers of this system were not experienced with
computers, but even if they had been experts, the message would have been helpful.
Let's look at this error message and compare it with
the list of requirements above:
- This error message clearly identifies the program from
which it is coming. The title bar gets extra points because it identifies the type of
error.
- The message says that the program has lost communication
with the printer. The message does not say that the program "is unable to
print", nor does it say "LPT1: Error", nor some equally vague text relayed
from the operating system. Most operating systems provide notoriously terse and
usually poor error messages. This message is in terms that the customer can
understand.
- The message scores top marks for giving the customer
constructive steps that are within his power to perform, regardless of his skill level or
experience. The program does not offer a vague guess as to what the problem might be. The
steps are ordered from simplest to most complicated, and they're also ordered in terms
of probability. Part of this is due to luck; the most common problems are not always the
easiest to solve.
- The program does not offer a foolish suggestion to the
customer that is likely to waste his time. ("Try restarting the application", or
worse, "Try re-installing the application".)
- The error message is carefully worded. Each item in the
message is worth checking. Nothing is restated pointlessly. There is no attempt to blame
another application for the problem. The message is accurate and helpful.
- Best of all, there is specific tech support information
right in the message, for the customer, the technician, and the developer. If there is a
defect in the code, the error message suggests clearly to the programmer where in the
program the error can be found, and the type of error involved. As an added plus,
there's the name of, and an invitation from, a real person. Apart from the pleasant
feeling that the customer gets from dealing with a person, rather than a corporation, the
programmer's name suggests pride in the work.
Ten Rotten Error Messages
Now, by contrast, here are some examples of the very
worst kinds of error messages. You'll see that my examples are all from Microsoft
software. Microsoft is not the only company that releases software with lousy user
interfaces, but it certainly seems to have perfected the art of the irritating error
dialog.
Duh. This message states something that is entirely
obvious, and fails to state anything at all that is helpful. There is nothing here to
remedy the customer's problem or to help him through it. There is no information that
would help even an imaginative tech support person to work through some possible solutions
with the customer. The developer responsible for maintaining this code--typically not the
person who wrote the original program--is not offered even a hint of what the problem is,
or the error code returned by the called function. If more than one error condition posts
this dialog, there's no way to tell which one caused the problem.
I have no comment on this message.
I have no comment on this message either. Although
somehow this looks a little less severe than the last one.
You know more than you're saying, don't you? And by
the way--restarting Outlook will help how, exactly?
Which applications? How will it be incompatible? Why
didn't you fix the problem? Thank God it doesn't seem to be incompatible with
nonexistent applications.
"May" again. Is a component busy or missing, or
is it neither? If a component is involved, which component? Is it busy? Or is it missing?
And what is a component anyway? A file? If so, could we have the file name please?
Really? Really? Which action? Which action? What should I
do to fix the problem? What should I do to fix the problem?
Nope, I don't. I want you to find it.
Still won't look for it, eh? In fact, I've forgotten
the context in which I got this message, and so I've forgotten which application is
involved. However, I do remember that it was unclear to me even at the time which
application needed to be reinstalled.
Why Are Error Messages So Poor, and How Can They Be Improved?
Our systems for teaching programming almost never discuss
error messages, or even error handling. How many programming books emphasize the
importance of checking return codes from operating system or library functions, and
handling errors gracefully? How many source code examples show even minimal error checking
or commenting? How many programming books discuss even the most basic user interface
issues, such as how to construct a useful error message?
Let's start with what is displayed to the world outside
your program. Error messages are often less than helpful or useful because they're
written by people who have an intimate knowledge of the program. Those people often fail
to recognize that the program will be run by other people, who don't have that
knowledge. Thus it is important that you consider the customer's plight carefully when
writing error-handling routines; that you involve someone other than yourself with the
design and testing of the program; and that you provide each and every error message to
someone else for review. The reviewer should not be an expert in the program. Your
messages should be detailed and polite. They should not offend or frustrate the customer.
Write and test your program so that it will have
to display as few error messages as possible.
If your programming language provides debug-build
validity checking like the C assert macro, use it; if you have to hand-roll validity
checking yourself, do it. Walk through code in the debugger. Include features in the
release version of the program, such as log files or verbose modes, to help with
troubleshooting. Each condition in the program that has a chance of failure should return
a distinct error code, and should display this code as part of the error message.
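A minimal Python sketch of these habits follows. The function, the logger name, and the code numbers are all invented for illustration. The assert statement plays the role of a debug-build check (Python removes assert statements when it is run with the -O option), the logging calls stay in the release version, and every failure condition has its own code that appears in the text shown to the customer.

```python
import logging

log = logging.getLogger("example_app")  # hypothetical application name

E_EMPTY_INPUT = 201   # illustrative codes, one per failure condition
E_NOT_A_NUMBER = 202


def parse_quantity(text: str):
    """Return (status_code, value); status 0 means success."""
    # Debug-time validity check: a non-string here is a programmer error.
    assert isinstance(text, str), "parse_quantity expects a string"

    stripped = text.strip()
    if not stripped:
        log.warning("E%d: empty quantity field", E_EMPTY_INPUT)
        return E_EMPTY_INPUT, None
    try:
        return 0, int(stripped)
    except ValueError:
        log.warning("E%d: quantity %r is not a whole number", E_NOT_A_NUMBER, text)
        return E_NOT_A_NUMBER, None


def describe(status: int) -> str:
    """Customer-facing text that always includes the distinct code."""
    texts = {
        E_EMPTY_INPUT: "Please enter a quantity.",
        E_NOT_A_NUMBER: "The quantity must be a whole number, such as 12.",
    }
    return f"{texts[status]} (Error {status})"
```

A verbose mode could be as simple as calling logging.basicConfig with a log file name and a lower logging level when the customer or a support technician asks for more detail.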
The error code will not only help to narrow a problem
down, but is also good internationalization strategy; error codes will form a useful
cross-check when the program is translated. Comment each status code as thoroughly as you
can to make life easier for the maintenance programmer and for documenters, and use the
header to help define a table of error status codes for technical support. Make sure that
there is a mechanism to identify missing files, registry entries, and the like. Create
error handling classes and functions to supply consistent, well-formatted error
messages--and reuse them consistently. Use code review and walkthroughs with other
developers and quality assurance to make sure that your program is readable, consistent,
maintainable, and free of defects. Provide testers with tools or a test program that will
allow them to view all of the error messages displayed by your program.
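As a sketch of how a single table of codes can serve programmers, testers, and technical support at once, consider the following Python fragment. The status names, numbers, and message texts are made up for the example.

```python
from enum import Enum, unique


@unique  # raises ValueError when the class is defined if two names share a code
class Status(Enum):
    OK = 0
    CONFIG_FILE_MISSING = 301
    CONFIG_FILE_UNREADABLE = 302
    SETTING_MISSING = 303


# One message template per failure code; {detail} is filled in at run time.
MESSAGES = {
    Status.CONFIG_FILE_MISSING: "The settings file {detail} could not be found.",
    Status.CONFIG_FILE_UNREADABLE: "The settings file {detail} could not be read.",
    Status.SETTING_MISSING: "The setting {detail} is missing from the settings file.",
}


def format_error(status: Status, detail: str) -> str:
    """Single place where every error message is assembled."""
    return f"[{status.value}] " + MESSAGES[status].format(detail=detail)


def all_messages(sample_detail: str = "<detail>") -> list:
    """Tester's view: every message the program can display, in code order."""
    return [format_error(s, sample_detail) for s in Status if s is not Status.OK]


def codes_without_message() -> set:
    """Cross-check: failure codes that have no message text yet."""
    return {s for s in Status if s is not Status.OK and s not in MESSAGES}
```

Because the message texts live in one table keyed by code, a translated version of the program can swap the table and use codes_without_message to confirm that nothing was left out.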
Façade programming is a useful construction
strategy. As the program is being constructed, write skeletons of each function. Until you
have the internals of the function coded, simply have the function do nothing and return a
positive return code. Define the return codes as symbols, either constants or enumerated
values. Later, when you begin to flesh out the function (and as you check return values at
each stage), define distinct symbolic codes for each type of error.
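The following Python sketch shows the idea at two stages of completion. All of the names are hypothetical. load_rows is still a skeleton that does nothing and reports success, while write_report has been fleshed out and has gained distinct symbolic codes for its failure conditions.

```python
from enum import Enum


class Result(Enum):
    SUCCESS = 0
    # Added later, as the real implementation takes shape:
    NO_ROWS = 1
    OUTPUT_NOT_WRITABLE = 2


def load_rows(source: str):
    """Skeleton: not implemented yet, so do nothing and report success."""
    return Result.SUCCESS, []


def write_report(rows, target: str) -> Result:
    """Fleshed out enough to distinguish its failure conditions."""
    if not rows:
        return Result.NO_ROWS
    try:
        with open(target, "w", encoding="utf-8") as out:
            out.writelines(f"{row}\n" for row in rows)
    except OSError:
        return Result.OUTPUT_NOT_WRITABLE
    return Result.SUCCESS


def run(source: str, target: str) -> Result:
    """Callers check every return value from the start, even for stubs."""
    status, rows = load_rows(source)
    if status is not Result.SUCCESS:
        return status
    return write_report(rows, target)
```

Since run already checks the status returned by the stub, nothing in the calling code has to change when load_rows is eventually implemented and starts returning real error codes.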
Programming is, of course, more complicated than ever.
There are more technologies, more languages, and more different disciplines to master this
week than there were last week. Developers are pressured to design too little, and to code
too quickly. Each step of the development process is squeezed so that products can be
released as quickly as possible. However, neither programmers nor managers should kid
themselves; other parts of the company are not likely to take responsibility for a program
that is sent to testing (or worse, to customers) laden with obvious defects and opaque
error messages. Developers and development managers must therefore learn to include design
and debugging time in planning estimates, and must argue effectively for more time and
more help, especially in areas that don't require coding, such as user interaction design.
It's rational to assume that help won't arrive
immediately, so walk a mile in the customer's shoes and program defensively. When you're
constructing an error message, the important thing to remember is that your message must
convey useful information. Useful information saves time. Remember that the message will
be read not only by the customer. The message must also be interpreted by the tech support
person who handles the call, the quality assurance analyst who helps to track down the
problem, and the maintenance programmer who is charged with fixing the problem in the
code. Each person in this process represents a cost to your company. What's more, while
the error-handling routine need be written only once, the support path is typically
followed many times--tens, or hundreds, or thousands of times. Form alliances with
technical support, testing, and documentation; ask questions, do the math, and put dollar
amounts on what it costs to solve (or sandbag) a problem after the product has been
released. Don't forget future lost sales in your calculations. If senior management at
your company wants to rush the product to market without leaving you time to code proper
error handling, remind management politely of the cost of such a policy.
Further Reading
If you'd like to read more on this subject, have a look
at Alan Cooper's books About Face: The Essentials of User Interface Design and
The Inmates are Running the Asylum: Why High Tech Products Drive Us Crazy
and How To Restore The Sanity. Mr. Cooper's primary
thesis is that software confuses customers and makes them feel stupid, and that as
software professionals, we are obliged to serve them better than we do. This is quite
true. I respectfully submit that Mr. Cooper makes several errors in his
books--particularly when he argues that the hierarchical file structure mirrors the
computer's view of the file system, which is simply untrue--but there is much of value
in his writing. Even when he's wrong, his views are interesting and worth considering,
and several of the ideas in this essay are inspired and informed by his work.
Steve McConnell's definitive book on good software
construction practices, Code Complete: A Practical Handbook of Software Construction, has some
excellent material on defensive programming and error handling. Few other
books--particularly introductory programming texts--pay the topic any more than lip
service, which is a disgrace. New programmers should read this book in parallel with an
introductory text on their language of choice.
About Michael Bolton
Copyright ©2003 Michael Bolton. Visit his website at: http://www.developsense.com