Non-Exponential Potential

A new OCLC membership report, Perceptions of Libraries, 2010: Context and Community, is hot off the…er…PDF-Maker! The report is formatted more like a magazine than a study, with key findings summarized in a myriad of graphical illustrations. So, I must confess that I have rather neglected the narrative so far. But from browsing mostly through the pictures, I have come up with a few suggestions that might enhance the report’s message, quantitatively speaking.

First, it would be better if the OCLC market researchers avoided citing large percentages, like 1,544% growth in e-Book sales and 1,050% growth in smart phone ownership.1 As Derrick Niederman and David Boyum explain in their book, percentages like these tend to be overstatements due to the baseline figures used. And the percentages just aren’t that informative. Measuring the periodic rate of growth of a new technological product from some time after its earliest phases of adoption would be more convincing. For comparison purposes, past growth rates for similar new products over an equivalent time period could also be included.

A corollary to this idea is that relatively large increases in otherwise established trends (i.e., non-startup situations) are newsworthy. A good example is the report’s citing a 358% increase in home foreclosures in the U.S. from 2005 to 2009.2 On the other hand, the enormous percentage in this excerpt from the report is of the startup variety I’m talking about:

In 2007, a YouTube search found 25,700 videos that included ‘library,’ ‘libraries’ or ‘librarians.’  In January 2011, that number has rocketed to 1,010,000 videos, a 3,830% increase.3

The percentage, offered as evidence of increased online visibility of libraries, also illustrates another problem. As reference librarians would caution us, these results are unmediated. Raw counts of search hits are only crude indicators of relevant content.

At any rate, it is quite possible that the rocketing growth in YouTube videos of all types explains the 3,830% figure. So, to see if libraries are more visible online in 2011 than 2007, we would need to measure how prevalent relevant library-related YouTube videos were among all posted 2007 YouTube videos compared with the same statistic for 2011.
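To make the distinction between raw growth and relative prevalence concrete, here is a minimal Python sketch. The two search-hit counts come from the report excerpt above. The totals for all YouTube videos are hypothetical placeholders, chosen only to show how a soaring raw count can coexist with a flat or shrinking share.

```python
def percent_increase(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


def share(matching, total):
    """Fraction of all videos that match the library search terms."""
    return matching / total


# Search-hit counts quoted in the OCLC report.
hits_2007 = 25_700
hits_2011 = 1_010_000
print(f"Raw growth in hits: {percent_increase(hits_2007, hits_2011):,.0f}%")

# HYPOTHETICAL totals for all YouTube videos (assumed, not real figures).
total_2007 = 10_000_000
total_2011 = 400_000_000

share_2007 = share(hits_2007, total_2007)
share_2011 = share(hits_2011, total_2011)
print(f"Share of all videos, 2007: {share_2007:.4%}")
print(f"Share of all videos, 2011: {share_2011:.4%}")
```

With these assumed totals, the raw count grows by 3,830% while the library share of all videos slips slightly. Real totals could tell a different story, which is exactly why they would need to be measured.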

Speaking of sensational growth, data trends that don’t jibe with sensationalist adjectives used in the OCLC report—like rocketing and soaring—come off as a bit anticlimactic.4 Here’s an example:

The percentage of teens ages 12 to 17 who own a cell phone exploded from 45% in 2005 to 75% in 2010.5

The 2010 percentage is about 2/3 higher than in 2005, a healthy but not colossal increase. Neither are the trends shown in the chart below exponential, regardless of the chart’s title. Exponential growth has a specific mathematical meaning. In graphs depicting real exponential growth the trends surge upward as they approach the right edge of the graph, a pattern absent from the chart shown here. Other adjectives like substantial, remarkable, impressive, and so on would be fine for describing the chart’s data.

Source: Perceptions of Libraries, 2010, OCLC, Inc.
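One rough way to check whether a trend is exponential is to look at the ratio of each value to the one before it: exponential growth multiplies by about the same factor every period, while steady growth adds about the same amount. The sketch below illustrates that test. The two series are made up for illustration and are not read from the OCLC chart, and the 5% tolerance is an arbitrary assumption.

```python
def period_ratios(values):
    """Ratio of each value to the value in the previous period."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


def looks_exponential(values, tolerance=0.05):
    """True when successive ratios exceed 1 on average and are roughly equal."""
    ratios = period_ratios(values)
    mean = sum(ratios) / len(ratios)
    return mean > 1 and all(abs(r - mean) / mean <= tolerance for r in ratios)


# Made-up series for illustration only (not data from the OCLC chart).
doubling = [10, 20, 40, 80, 160]   # multiplies by 2 each period
steady = [10, 20, 30, 40, 50]      # adds 10 each period

print(looks_exponential(doubling))  # True
print(looks_exponential(steady))    # False

# The teen cell phone figures quoted earlier: 45% in 2005, 75% in 2010.
print(round((75 - 45) / 45 * 100, 1))  # 66.7, i.e. about 2/3 higher
```

Two data points, as in the teen cell phone example, cannot establish an exponential pattern at all. Several periods are needed before the ratios say anything.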

As intriguing as the colorful ribbons in this chart are (they remind me of fettuccine), the chart is too difficult to read. In his book, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Stephen Few uses this style of chart as a prime example of how 3-D effects make graphs less informative and more confusing.

In the chart it is impossible to tell exactly which quantities the ribbons are supposed to represent. Take a look at the intersection in 2009 of the orange ribbon (count of monthly Facebook visitors) and the blue ribbon (count of monthly YouTube visitors). What quantity does this intersection depict? It could be either 122 million, judging from the front edge of the intersection, or about 135 million, judging from the rear edge. That’s a 13 million difference. The same ambiguity can be seen where the orange and blue ribbons intersect in 2005, where the difference is 20 million.

Re-drawing the above chart as a simple line chart would eliminate these problems completely. As data visualization pioneers Edward Tufte, William S. Cleveland, and Howard Wainer have advised over the years, clarity is important, decoration is not.

3-D decorative effects can also distort data. Surely, OCLC researchers would not intentionally suggest to readers that 37% (the percent of economically impacted respondents who reported using libraries more frequently) exceeds 49% (the percent of these respondents who reported unchanged or decreased use). But that is the message in this 3-D image:

Source: Perceptions of Libraries, 2010, OCLC, Inc.

In the graphic below I’ve added vertical lines dividing the image into equal lengths. Notice that the 37% segment takes up nearly half of the image’s length, more than the 9% and 40% segments combined:

The segments of this 2-D bar represent the quantities accurately:

But the ideal visual comparison of these data is a plain monochrome bar chart like this:

Visually gauging the comparative sizes of the bars is easy with this chart. Its clarity leaves no room for suspicion of graphical trickery.

On the topic of visual comparisons, the OCLC report often renders numerals in different font sizes to emphasize magnitude differences. But these fonts give an inaccurate impression since they don’t correspond with the real differences in the data. Here are two numbers from p. 39 of the report which I aligned side-by-side:


The 75% is 1/3 taller than the 69%, although, arithmetically, 75 is not quite 1/10th larger than 69.
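The mismatch can be put into a single number by dividing the relative size difference shown in the graphic by the relative difference in the data, a ratio Edward Tufte calls the lie factor. The sketch below applies it to the two figures above. The 1/3 height difference is the measurement reported here, and the function name is just an illustrative choice.

```python
def relative_change(smaller, larger):
    """How much larger the second value is, as a fraction of the first."""
    return (larger - smaller) / smaller


# Difference in the data: 75 versus 69.
data_effect = relative_change(69, 75)

# Difference in the graphic: the 75% numeral is 1/3 taller than the 69% numeral.
graphic_effect = 1 / 3

lie_factor = graphic_effect / data_effect
print(f"Data difference:    {data_effect:.1%}")
print(f"Graphic difference: {graphic_effect:.1%}")
print(f"Lie factor:         {lie_factor:.1f}")
```

A faithful graphic would score close to 1. Here the fonts exaggerate the difference by a factor of nearly four.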

Enhanced fonts are also used in the circular graphic below to contrast larger and smaller numbers. At first view this arresting image, which occupies a full page in the OCLC report, seems to be bursting with information. But a closer look reveals that it’s just a table of numbers re-shaped into a wheel. Its complete informational content appears in the two-row table below (without the library card registration figures).

Circular Graphic Comparing Respondent Group Library Use
Source: Perceptions of Libraries, 2010, OCLC, Inc.

Content of OCLC Circular Graphic Re-presented in Tabular Format.

Compared to the table, the only extra information the graphic provides is its encoding of the outer row with a larger font and orange concentric band.6 The code is intended to indicate that the outer row numbers exceed the inner row ones. Except the rule doesn’t apply to two of the nine spokes in the wheel. (Can you spot these?)

As hypnotic as the OCLC graphic is, it doesn’t teach us much about the data. Readers can usually tell smaller numbers from larger ones. The point of a graphic depicting different-sized numbers should be to convey something specific about the differences. Say we would like to know which item has the largest gap between the two respondent groups. Or how the different gaps compare with each other. Or whether all of the economically impacted respondent numbers really do exceed the other group’s.

With a table of numbers at hand, we would answer these questions by inspecting each pair of numbers, doing the subtraction, and then comparing the results. Readers who enjoy the challenge of deciphering numerals upside down and at oblique angles can use OCLC’s circular graphic for this exercise. For others, the two-row table above will work.
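The pair-by-pair subtraction described above is simple enough to sketch in a few lines of Python. The item labels and percentages below are placeholders invented for illustration. They are not the figures from the OCLC graphic, which would need to be keyed in from the table.

```python
# PLACEHOLDER values for illustration only; not the report's figures.
# Each entry: (economically impacted %, other respondents %).
usage = {
    "Item A": (40, 30),
    "Item B": (25, 28),
    "Item C": (60, 45),
}

# Gap in percentage points between the two respondent groups.
gaps = {item: impacted - other for item, (impacted, other) in usage.items()}

# Which item has the largest gap?
largest = max(gaps, key=gaps.get)

# Where does the "impacted group is higher" rule fail?
exceptions = [item for item, gap in gaps.items() if gap < 0]

for item, gap in sorted(gaps.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{item}: {gap:+d} points")
print("Largest gap:", largest)
print("Exceptions to the rule:", exceptions)
```

Sorting the gaps answers all three questions at once: the largest gap comes first, the rest follow in order, and any negative values mark the items where the rule does not hold.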

Or a humble bar chart can do this work for us, like the one below, which reveals so much more about these numbers. Without bragging, I think I can honestly say this little chart runs circles around the competition! If the OCLC researchers want to help readers understand the numbers, a chart like this one would be a great beginning.


Content of OCLC Circular Graphic Re-presented in Bar Chart Format.

Well, that’s about it for my suggestions. Nothing earth-shattering. Just a few minor refinements to strengthen the case the OCLC market researchers have put forward. Not exponentially, of course. But maybe enough to make some difference.

  
—————————

1  Online Computer Library Center (2010). Perceptions of libraries, pp. 11 & 15.
2  Online Computer Library Center (2010). p. 19.
3  Online Computer Library Center (2010). p. 15.
4  Online Computer Library Center (2010). pp. 15 & 27.
5  Online Computer Library Center (2010). p. 13.
6  The brighter orange segments between the wheel’s spokes distract the readers’ eyes away from the data. This and other stylish effects in the OCLC graphic would, I regret to say, be labeled chartjunk by Edward Tufte.
