Strength in Numbers

I want to tell you about a group of U.S. public libraries that are powerhouses when it comes to providing services to the American public. You might suppose that I’m referring to the nation’s large urban and county systems that serve the densest populations with large collections and budgets. These are the libraries you’d expect to dominate national library statistics. However, there’s a group of libraries with modest means serving moderate-size communities that are the unsung heroes in public library service provision. These are libraries with operating expenditures ranging from $1 million to $4.9 million.1 Due to their critical mass combined with their numbers (there are 1,424 of them), these unassuming libraries pack a wallop in the service delivery arena.

Their statistical story is an interesting one. Let me introduce it to you by means of the patchwork graphic below containing 6 charts known as treemaps.


Click to view larger graphic.

From a data visualization standpoint treemaps (and pie charts also) have certain drawbacks that were identified in my prior post.2 Still, treemaps do have their place when used judiciously. And their novelty and color are refreshing. So, let’s go with them!

At first glance, treemaps are not that easy to decipher. Let me offer a hint to begin with and then follow with a fuller explanation. The hint: Notice how prominent the gold rectangles are among the 6 treemaps shown above. As the graph legend indicates, gold represents the $1 million to $4.9 million expenditure group that this post is focused on. (I purposely color-coded them gold!) And the story is about the appearance and meaning of these gold rectangles. Or more exactly, the rectangles representing this expenditure group—however the rectangles are colored. (Below you’ll see I also use monochrome-shaded treemaps to tell the story.)

Now let’s see how treemaps work. Treemaps are like rectangular pie charts in that they use geometrical segments to depict parts-of-a-whole relationships. In other words, treemaps present a categorical breakdown of quantitative data (not the statistician). A single treemap represents 100% of the data and the categories are represented by inset rectangles rather than pie wedges. The sizes of treemap segments reflect the data quantities. In some cases treemaps also use color to represent data quantities, as this green treemap does:


Number of Libraries by Expenditure Group  
Click to view larger interactive chart.

Before getting to the quantitative aspects of this green chart, let me explain that the ballooned text is an interactive feature of Tableau Public, the statistical software used to generate the chart. If you would, click the treemap now to view the interactive version. At the top right of that chart is a legend indicating how color-shading works. Also, below the treemap is a bar chart displaying percentages data—the same figures visible on the treemap balloons.

In monochrome treemaps like the green one above, the largest and darkest rectangle represents the highest number in the data, and the smallest and lightest represents the lowest number. The largest rectangle is always located at the top left of the treemap and the smallest at the bottom right. All treemaps, including the 6 charts above, follow this top-left-to-bottom-right arrangement. But only monochrome treemaps use color-shading in addition to rectangle sizing to portray quantities. With multi-color treemaps, color is used instead to encode the data categories, in our case, library expenditure ranges.

In all of the treemaps included in this post each inset rectangle represents one of the total operating expenditure ranges listed in the first column of this table:


Data Source: Public Libraries in the U.S. Survey, 2011, Institute of Museum & Library Services.3

Although these expenditure categories pertain to quantities (dollar ranges), remember that categories are always qualitative, that is, non-numerical.4  To emphasize the fact that the categories are non-numerical, in the table their labels have letter prefixes.

In the green treemap the rectangles are arranged according to the number of public libraries falling within each expenditure range, left to right as described above. However, this order does not apply to the bar chart appearing below the interactive version of the treemap. That bar chart is instead sorted by the category ranges low to high.

The largest rectangle in the treemap is the $50K or less category. Hovering the pointer over the $50K or less rectangle in the interactive chart or looking at the corresponding bar in the bar chart shows this category’s percentage as 21.1%. Similarly, 16.1% of public libraries fall under the next-largest rectangle which represents the $400K – $999.9K category. And the categories with the third and fourth largest rectangles, $1.0M – $4.9M and $100K – $299.9K, account for 15.4% and 14.7% of the total number of libraries. (The complete data appear in the table above.)
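To make the size-and-shade encoding concrete, here is a minimal Python sketch that ranks the four categories quoted above and assigns each a shade intensity. The percentages come from this post; the function name `rank_and_shade` and the linear shade scale are my own assumptions for illustration, not a description of how Tableau Public actually computes its colors or layout.

```python
# Percent of U.S. public libraries per expenditure group, as quoted in the post.
library_shares = {
    "$50K or less": 21.1,
    "$400K - $999.9K": 16.1,
    "$1.0M - $4.9M": 15.4,
    "$100K - $299.9K": 14.7,
}


def rank_and_shade(shares):
    """Order categories largest-first and attach a 0-to-1 shade intensity.

    Scaling each value linearly against the largest one is an assumption
    made for illustration; it is not taken from Tableau's documentation.
    """
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    largest = ordered[0][1]
    return [(label, pct, round(pct / largest, 2)) for label, pct in ordered]


# Share left over for the categories the post does not quote individually.
unquoted_share = round(100.0 - sum(library_shares.values()), 1)

for label, pct, shade in rank_and_shade(library_shares):
    print(f"{label:<16} {pct:5.1f}% of libraries, shade {shade:.2f}")
print(f"Other categories combined: {unquoted_share:.1f}%")
```

The first row printed corresponds to the top-left rectangle, and each later row to a smaller, lighter one. Since a rectangle's area is proportional to its percentage, the four quoted categories together cover roughly two-thirds of the green treemap.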

This next treemap (below in blue) depicts population data for each expenditure category. To see detailed figures click on the chart. (Again, a bar chart appears below the treemap there.) Hover the pointer over the top left rectangle (the $1.0 million to $4.9 million category) of the interactive treemap. Notice that this category serves the largest population, 88.4 million or nearly 29% of the total U.S. population served by public libraries. Next is the $10.0 million to $49.9 million group, the rectangle just below. Libraries in this category served 83 million or 27% of the population in 2011.


Population Served by Expenditure Group
Click to view larger interactive chart.

Now the question is, how do these two measures—U.S. public library counts and population served—relate to each other? The next bar chart offers an answer:


Click to view larger interactive chart.

Again, click on the bar chart to see the interactive version. Then, click on the legend (Libraries / Population Served) to highlight one set of bars at a time. You’ll see that from left to right the libraries percentages (green bars) drop, remain fairly level, and then drop again. The population served percentages (blue bars) swoop up from left to right to the $1.0 million to $4.9 million category, step down, up, and then down again.

These trends are not surprising since we know that the smallest libraries serve the smallest communities on the smallest budgets. And that these communities are very numerous. Likewise, the largest libraries serve the largest communities, which are few in number. But it is surprising that the $1.0 million to $4.9 million category serves as large a swath of the population as it does. And that the next highest libraries expenditure-wise, the $5.0 million to $9.9 million group, do not keep up with this first group. (I’m curious about why this would be the case.)
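A quick back-of-the-envelope calculation shows how outsized this group's reach is. The sketch below uses only figures quoted in this post (1,424 libraries, 88.4 million people served, 15.4% of libraries, nearly 29% of population). The variable names are mine, and because the 29% figure is rounded, the results should be read as approximations.

```python
# Figures for the $1.0 million to $4.9 million group, as quoted in the post.
LIBRARIES_IN_GROUP = 1_424      # number of libraries in the group
POPULATION_SERVED = 88.4e6      # people served by the group in 2011
SHARE_OF_LIBRARIES = 15.4       # percent of all U.S. public libraries
SHARE_OF_POPULATION = 29.0      # "nearly 29%", so treat as approximate

# Average community size served by one library in the group.
avg_population_per_library = POPULATION_SERVED / LIBRARIES_IN_GROUP

# How many times larger the group's population share is than its library share.
representation_ratio = SHARE_OF_POPULATION / SHARE_OF_LIBRARIES

print(f"Average population per library: {avg_population_per_library:,.0f}")
print(f"Population share / library share: {representation_ratio:.2f}")
```

In other words, each library in the group serves a community of roughly 62,000 people on average, and the group's share of population served is a bit under twice its share of libraries.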

In a moment I’ll get to other key measures for this spectacular $1.0 million to $4.9 million category of libraries. But first, as a reminder of how library service populations are distributed, I thought I would lay out the population data differently, independent of library expenditures. The chart below shows the distribution of public libraries among 11 population categories labeled on the horizontal axis. Adding up the percentages for the left 5 bars, you see that 77% of public libraries serve communities with populations under 25,000. Note in the bar chart (and treemap in the interactive version) that the 10K to 24.99K population group contains the most libraries.5


Distribution of Population Served
Click to view larger interactive chart.

Okay. Now let’s look at total operating expenditures by expenditure category in the purple chart here:


Total Expenditures by Expenditure Category
Click to view larger interactive chart.

In this chart the two left-most rectangles look identical in size, don’t they? Click on the interactive version and you can see that the $10 million to $49.9 million and the $1.0 million to $4.9 million groups each account for more than $3 billion in annual public library operating expenditures. And their expenditure levels are nearly equal. The $10 million to $49.9 million group outspends the $1.0 million to $4.9 million group only by 1.2% ($38 million).

Where service provision is concerned, however, the $1.0 million to $4.9 million libraries shine. First, as seen in the interactive version of the chart below, their 2011 total visits surpassed those of the $10 million to $49.9 million group by 2% or 18 million. Granted, if we were to combine all libraries with expenditures exceeding $10 million into a single category, that category would win out. But the point here is that the 1,424 members of the $1.0 million to $4.9 million group are able to generate library services at nearly the same level as the largest urban libraries in the country. Without a doubt, the productivity of these moderate-size libraries is substantial.


Total Visits by Expenditure Category
Click to view larger interactive chart.

On circulation, however, the $10 million to $49.9 million libraries outperform the $1.0 million to $4.9 million group. The former group's share of total U.S. public library circulation is 4 percentage points higher than the latter's. These larger libraries account for 31.6% of all circulation nationwide, compared to the $1.0 million to $4.9 million group, which accounts for 27.6%. (Click on the gray chart below to view these and other figures.)


Total Circulation by Expenditure Category
Click to view larger interactive chart.
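The two circulation shares quoted above, 31.6% and 27.6%, are 4 points apart as slices of the national total, which is not the same thing as one group circulating 4% more than the other. The short sketch below works out both readings. The helper name `share_gap` is hypothetical, and the only inputs are the two shares given in this post.

```python
def share_gap(larger_share, smaller_share):
    """Return (percentage-point gap, relative gap in percent) between two shares."""
    point_gap = larger_share - smaller_share
    relative_gap = point_gap / smaller_share * 100
    return round(point_gap, 1), round(relative_gap, 1)


# Shares of national circulation for the two expenditure groups, per the post.
points, relative = share_gap(31.6, 27.6)

print(f"Gap in share of national circulation: {points} percentage points")
print(f"Larger group's circulation relative to the smaller: {relative}% more")
```

So the larger libraries' slice of national circulation is 4 points bigger, while their circulation volume is about 14.5% higher than that of the $1.0 million to $4.9 million group.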

Yet circulation is the only major output measure where the $1.0 million to $4.9 million libraries play second fiddle to libraries from the other expenditure categories. Besides total visits, our 1,424 libraries excel in total program attendance and public Internet computer users. The next (olive) treemap shows the 7% margin (6.5 million) for total program attendance this group holds over the second-place group.


Total Program Attendance by Expenditure Category
Click to view larger interactive chart.

The final treemap below gives data on public Internet computer users. Again, these middling libraries exceed the $10 million to $49.9 million libraries by 2.5% or about 8.3 million computer users. It is rather startling that this group of libraries would outpace the large and well-equipped libraries of the nation in the delivery of technology services to communities.


Total Public Computer Users by Expenditure Category
Click to view larger interactive chart.

To recap the data presented here, let’s revisit the 6 multi-color treemaps introduced at the beginning of this post. We can see the gold rectangle is the largest among all expenditure groups for population, visits, program attendance, and public Internet computer users. And it is second highest in operating expenditures and circulation.

As I mentioned, the standing of the largest expenditure categories could be enhanced by merging the $10 million to $49.9 million and the $50 million or more categories into a single category. (Of course, any boundary within this wide range of expenditures would be arbitrary.) Even so, the $1.0 million to $4.9 million group would still show a strong presence, leaving its next largest peers, the $5.0 million to $9.9 million category, in the dust. No matter how you slice the data, the $1.0 million to $4.9 million group is a major player in national library statistics. Now we need to think of some appropriate recognition for them…


1   Based on the Public Libraries in the United States Survey, 2011, Institute of Museum and Library Services.
2   It was Willard Brinton who identified the problem in his 1914 book. In my prior post scroll down to the sepia graphic of squares arranged laterally. There you see Brinton’s words, “The eye cannot fit one square into another on an area basis so as to get the correct ratio.” Bingo. With treemaps this is even more problematic since a single quantity in the data can be represented by different-shaped but equivalent rectangles—stubby ones or more elongated ones. You’ll see in the examples that it is impossible to visually determine which of two similarly-sized rectangles is larger. This difficulty also applies to pie wedges.
3   For purposes of this post I used only libraries reporting to the Institute of Museum and Library Services in 2011 that were located in the continental U.S., Alaska, and Hawaii.
4   The expenditure groups are examples of categorical data. Other examples are geographical regions of the U.S. and library expenditure types (collection, staffing, technology, capital, and so forth). Categorical data are also called nominal data or data on a nominal scale.
5   For detailed information about the statistical and geographic distributions of small libraries see the new report, The State of Small and Rural Libraries in the United States, IMLS Research Brief No. 5, Sept. 2013.

