We all know that the main function of libraries is to make information accessible in ways that satisfy user needs. Following Ranganathan’s Fourth Law of Library Science, library instructions guiding users to information must be clear and simple in order to save the user’s time. This is why library signage avoids exotic fonts, splashy decorations, and any embellishments that can muddle the intended message. Library service that wastes the user’s time is bad service.
So I am baffled by how lenient our profession is when it comes to muddled and unclear presentations of quantitative information in the form of data visualizations. We have yet to realize that the sorts of visualizations that are popular nowadays actually waste the user’s time—big time! As appealing as these visualizations may be, from an informational standpoint they violate Ranganathan’s Fourth Law.
Consider the data visualization shown below from the American Library Association’s (ALA) Digital Inclusion Study:
ALA Digital Inclusion Study national-level dashboard. Click to access original dashboard.
This visualization was designed to keep state data coordinators (staff at U.S. state libraries) informed. The coordinators were called upon to encourage local public libraries to participate in a survey conducted last fall for this study. The graphic appears on the project website as a tool for monitoring progress of the survey state by state.
Notice that the visualization is labeled a dashboard, a data display format popularized by the Balanced Scorecard movement. The idea is a graphic containing multiple statistical charts, each one indicating the status of an important dimension of organizational performance. As Stephen Few observed in his 2006 book, Information Dashboard Design, many dashboard software tools are created by computer programmers who know little to nothing about the effective presentation of quantitative information. Letting programmers decide how to display quantitative data is like letting me tailor your coat. The results will tend towards the Frankensteinian. Few’s book provides several scary examples.
Before examining the Digital Inclusion Study dashboard, I’d like to show you a different example, the graphic appearing below designed by the programmers at Zoomerang and posted on The Center for What Works website. It gives you some idea of the substandard designs that programmers can dream up:1
Zoomerang chart posted on http://www.whatworks.org. Click to see larger version.
The problems with this chart are:
- There are no axis labels explaining what data are being displayed. The data seem to be survey respondents’ self-assessment of areas for improvement based on a pre-defined list in a questionnaire.
- There is no chart axis indicating scaling. There are no gridlines to assist readers in evaluating bar lengths.
- Long textual descriptions interlaced between the blue bars interfere with visually evaluating bar lengths.
- 3D shading on the blue bars produces a visual effect close to moiré, visual “noise” that makes the eye work harder to separate the visual cues in the chart. The gray troughs to the right of the bars are extra cues the eye must decipher.
- The quantities at the far right are too far away from the blue bars, requiring extra reader effort. The quantities are located where the maximum chart axis value typically appears. This unorthodox use of the implied chart axis is confusing.
- The questionnaire items are not sorted in a meaningful order, making comparisons more work.
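The last point is the easiest to fix before a chart is ever drawn. A minimal sketch of sorting response items by frequency prior to plotting (the labels and counts below are invented for illustration, not the actual Zoomerang data):

```python
def sort_for_plotting(items):
    """Return (label, count) pairs ordered largest-first, so that
    bar lengths can be compared at a glance."""
    return sorted(items, key=lambda pair: pair[1], reverse=True)

# Hypothetical questionnaire items and response counts.
responses = [
    ("Outcome measurement", 24),
    ("Strategic planning", 41),
    ("Board development", 17),
]

ordered = sort_for_plotting(responses)
print(ordered[0])  # the most frequently chosen item comes first
```

Whatever charting tool then consumes `ordered`, the reader gets a meaningful top-to-bottom sequence instead of the questionnaire's arbitrary one.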
We should approach data visualizations the way we approach library signage. The visualizations should make the reader’s task quick and easy—something the Zoomerang chart fails at. Here’s a better design:2
Revision of original (blue) Zoomerang chart posted above. Click to see larger version.
WARNING: Beware of statistical, graphical, and online survey software. Nine times out of ten the companies that create this software are uninformed about best practices in graphical data presentation. (This applies to a range of vendors, from Microsoft and Adobe to upstart vendors that hawk visualization software for mobile devices.) Indiscriminate use of these software packages can cause you to waste the user’s time.
The Digital Inclusion Study dashboard appearing at the beginning of this post wastes the user’s time. Let’s see how. Note that the dashboard contains three charts—a gauge, line chart, and map of the U.S. The titles for these are imprecise, but probably okay for the study’s purposes (assuming the state data coordinators were trained in the use of the screen). Still, for people unfamiliar with the project or users returning to this display a year later, the titles could be worded more clearly. (Is a goal different from a target? How about a survey submission versus a completion?)
Understandability is a definite problem with the map’s color-coding scheme. The significance of the scheme is likely to escape the average user. It uses the red-amber-green traffic signal metaphor seen in the map legend (bottom left). With this metaphor green usually represents acceptable/successful performance, yellow/amber, borderline/questionable performance, and red, unacceptable performance.
Based on the traffic signal metaphor, when a state’s performance is close to, at, or exceeds 100%, the state should appear in some shade of green on the map. But you can see that this is not the case. Instead, the continental U.S. is colored in a palette ranging from light reddish to bright yellow. Although Oregon, Washington, Nevada, Michigan, and other states approach or exceed 100% they are coded orangeish-yellow.3 And states like Colorado, North Carolina, and Pennsylvania, which reported 1.5 to 2 times the target rate, appear in bright yellow.
This is all due to the statistical software reserving green for the highest value in the data, Hawaii’s 357% rate. Generally speaking, color in a statistical chart is supposed to contain (encode) information. If the encoding imparts the wrong message, then it detracts from the informativeness of the chart. In other words, it wastes user time—specifically, time spent wondering what the heck the coding means!
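The mistake is easy to reproduce in a few lines. A minimal sketch contrasting a fixed traffic-signal scale with the software's relative scaling (the 90 and 70 cutoffs are my own illustrative assumptions, not ALA's):

```python
def traffic_signal(rate):
    """Fixed thresholds: code performance against the 100% target.
    The cutoffs (90 and 70) are illustrative assumptions."""
    if rate >= 90:
        return "green"
    if rate >= 70:
        return "amber"
    return "red"

def relative_color(rate, max_rate):
    """What the dashboard software effectively did: scale the hue to
    the data's maximum, so only the top value earns green."""
    ratio = rate / max_rate
    if ratio >= 0.9:
        return "green"
    if ratio >= 0.5:
        return "amber"
    return "red"

# Hawaii's 357% stretches the scale; a state exactly at target
# ends up coded as if its performance were unacceptable.
print(traffic_signal(100))       # green: at target
print(relative_color(100, 357))  # red: misleading
```

Anchoring the color scale to the performance target, rather than to the data's maximum, is what preserves the traffic-signal metaphor's meaning.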
Besides misleading color-coding, the shading in the Digital Inclusion Study dashboard map is too subtle to interpret reliably. (The dull haze covering the entire map doesn’t help.) Illinois’ shading seems to match Alabama’s, Michigan’s, and Mississippi’s, but these three differ from Illinois by 13–22 points. At the same time, darker-shaded California is only 5 points lower than Illinois.
The Digital Inclusion map’s interactive feature also wastes time. To compare data for two or more states the user must hover her device pointer over each state, one at a time. And then remember each percentage as it is displayed and then disappears.
Below is a well-designed data visualization that clarifies information rather than making it inaccessible. Note that the legend explains the color-coding so that readers can determine which category each state belongs to. And the colors have enough contrast to allow readers to visually assemble the groupings quickly—dark blue, light blue, white, beige, and gold. Listing the state abbreviations and data values on the map makes state-to-state comparisons easy.
A well-designed data visualization. Source: U.S. Bureau of Economic Analysis. Click to see larger version.
This map is definitely a time saver!
Now let’s turn to an…er…engaging feature of the ALA dashboard above—the dial/gauge. To the dismay of Stephen Few and others, dials/gauges are ubiquitous in information dashboards despite the fact that they are poor channels for the transmission of information. Almost always these virtual gadgets obscure information rather than reveal it.4 Meaning, again, that they are time wasters.
The gauge in the dashboard above presents a single piece of data—the number 88. It is astonishing that designers of this virtual gadget have put so many hurdles in the way of users trying to comprehend this single number. I hope this bad design comes from ignorance rather than malice. Anyway, here are the hurdles:
- The dial’s scaling is all but invisible. The dial is labeled, but only at the beginning (zero) and end (100) of the scale, and in a tiny font. To determine values for the rest of the scale the user must ignore the prominent white lines in favor of the obscured black lines (both types of lines are unlabeled). Then she has to study the spacing to determine that the black lines mark the 25, 50, and 75 points on the dial. The white lines turn out to be superfluous.
- The needle is impossible to read. The green portion of the banding causes the red tick-marks to be nearly invisible. The only way to tell exactly where the needle is pointing is by referring to the ‘88’ printed on the dial, a requirement that renders the needle useless.
- The uninitiated user cannot tell what is being measured. The text at the center of the image is masked at both edges because it has been squeezed into too small a space. And the gauge’s title is too vague to tell us much. I am guessing that the dial measures completed survey questionnaires as a percentage of some target quantity set for the U.S. public libraries that were polled. (And, honestly, I find it irritating that the 88 is not followed by a percent symbol.)
- The time period for the data depicted by the gauge is unspecified. It doesn’t help that the line chart at the right contains no scale values on the horizontal axis. Or, technically, the axis has one scale value—the entirety of 2013. (Who ever heard of a measurement scale with one point on it?) The dial and line chart probably report questionnaires submitted to date. So it would have been especially informative for the programmers to include the date on the display.
- Although the red-amber-green banding seems to be harmless decoration, it actually can lead the reader to false conclusions. Early on in the Digital Inclusion Study survey period, submissions at a rate of, say, 30%, would be coded ‘unacceptable’ even though the rate might be quite acceptable. The same misclassification can occur in the amber region of the dial. Perhaps users should have been advised to ignore the color-coding until the conclusion of the survey period. (See also the discussion of this scheme earlier in this post.)
The graphic below reveals a serious problem with these particular gauges. The graphic is from a second dashboard visible on the Digital Inclusion Study website, one that appears when the user selects any given U.S. state (say, Alaska) from the dashboard shown earlier:
ALA Digital Inclusion Study state-level dashboard. Click to see larger version.
Notice that this dashboard contains five dials—one for the total submission rate for Alaska (overall) and one for each of four location categories (city, suburban, town, and rural). While the scaling in all five dials spans from 0% to 100%, two of the dials—city and town—depict quantities far in excess of 100%. I’ll skip the questions of how and why the survey submission rate could be so high, as I am uninformed about the logistics of the survey. But you can see that, regardless of the actual data, the needles in these two gauges extend only a smidgen beyond the 100% mark.
Turns out these imitation gauges don’t bother to display values outside the range of the set scaling, which, if you think about it, is tantamount to withholding information.5 Users hastily scanning just the needle positions (real-life instrument dials are designed for quick glances) will get a completely false impression of the data. Obviously, the gauges are unsatisfactory for the job of displaying this dataset correctly.
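The clipping behavior can be sketched in a few lines. The 0–100 scale comes from the dashboard itself; the function is my illustration of the effect, not the vendor's actual code:

```python
def needle_position(value, lo=0.0, hi=100.0):
    """Fraction of the dial arc the needle sweeps. Values beyond the
    set scale are silently clamped, which is exactly the withholding
    of information the imitation gauges exhibit."""
    clamped = max(lo, min(value, hi))
    return (clamped - lo) / (hi - lo)

# Wildly different submission rates produce identical needles:
print(needle_position(137))  # 1.0
print(needle_position(357))  # 1.0
print(needle_position(88))   # 0.88
```

A reader glancing at the needles alone has no way to distinguish a rate that barely exceeds the target from one that more than triples it.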
So now the question becomes, why use these gauges at all? Why not just present the data in a single-row table? This is all the dials are doing anyway, albeit with assorted visual aberrations. Besides, there are other graphical formats capable of displaying these data intelligently. (I won’t trouble you with the details of these alternatives.)
One point about the line chart in the Alaska (state-level) dashboard. Well, two points, actually. First, the weekly survey submission counts should be listed near the blue plotted line—again, to save the user’s time. Second, the horizontal axis is mislabeled. Or, technically, untitled. The tiny blue square and label are actually the chart legend, which has been mislocated. As it is, its location suggests that both chart axes measure survey completions, which makes no sense. The legend pertains only to the vertical axis, not to the horizontal. The horizontal axis represents the survey period measured in weeks. So perhaps the label “Weeks” would work there.
In charts depicting a single type of data (i.e., a single plotted line) there is no need for a color-coded legend at all. This is the sort of detail that software programmers tend to know nothing about.
Finally, a brief word about key information the dashboard doesn’t show—the performance thresholds (targets) that states had to meet to earn an acceptable rating. Wouldn’t it be nice to know what these are? They might provide some insight into the wide variation in states’ overall submission rates, which ranged from 12% to 357%. And the curiously high levels seen among the location categories. Plus, including these targets would have required the dashboard designers to select a more effective visualization format instead of the whimsical gauges.
Bottom line, the Digital Inclusion Study dashboard requires a lot of user time to obtain a little information, some of which is just plain incorrect. Maybe this is no big deal to project participants who have adjusted to the visualization’s defects in order to extract what they need. Or maybe they just ignore it. (I’m still confused about the purpose of the U.S. map.)
But this is a big deal in another way. It’s not a good thing when nationally visible library projects model such unsatisfactory methods for presenting information. Use of canned visualizations from these software packages is causing our profession to set the bar too low. And libraries mimicking these methods in their own local projects will be unaware of the methods’ shortcomings. They might even assume that Ranganathan would wholeheartedly approve!
1 Convoluted designs by computer programmers are not limited to data visualizations. Alan Cooper, the inventor of Visual Basic, describes how widespread this problem is in his book, The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity.
2 Any chart with closely spaced bars can be subject to moiré, especially when bold colors are used. Pastel shades, like the tan in this chart, help minimize this.
3 Delaware also falls into this category and illustrates the distortion intrinsic to maps used to display non-spatial measures. (Shale deposit areas by state is a spatial measure; prevalence of obesity by state is a non-spatial measure.) Large states will be visually over-emphasized while tiny states like Delaware and Rhode Island struggle to be seen at all.
4 My favorite example, viewable in Stephen Few’s blog, is how graphic artists add extra realism as swatches of glare on the dials’ transparent covers. These artists don’t think twice about hiding information for the sake of a more believable image.
5 This is extremely bad form—probably misfeasance—on the part of the software companies. More responsible software companies, like SAS and Tableau Software, are careful to warn chart designers when data extend beyond the scaling that chart designers define.