That’s the Wrong Question

In recent years the Cuyahoga County Public Library, in suburban Cleveland, Ohio, embarked on an ambitious building program that ended up alienating some community members. In a public forum last year, one citizen asked how the building campaign could be justified when the library’s own “statistically valid” survey indicated that 90% of patrons were satisfied with the library facilities as they were.1  The library director began her response by saying that “some of it is the way the [survey] questions were asked.” She then explained that a range of other relevant information, beyond patron satisfaction percentages, was considered in the library board’s decision, including systematically gathered community input on building designs.

I cannot say whether the library’s decisions were in the best interests of the local community or not. However, I can comment on the data aspect of this story. So, let me restate that part of the director’s argument more forcefully:

A statistically valid survey finding applied to the wrong research question is logically invalid. The community’s level of satisfaction with current library facilities is not a reliable indicator of its feelings about the sufficiency of those facilities over the longer term. Nor does it indicate whether the community believes it is better to incur large costs to maintain older facilities or to invest in new ones that let the library adapt to changing community needs. In other words, that’s the wrong question.

Contrary to popular belief, on their own, data don’t necessarily mean anything. Their meaning comes from how the data are interpreted and what questions they are used to address. Interpreting data with respect to pressing questions is the crux of data analysis. This is why Johns Hopkins biostatistician Jeff Leek begins his new book, The Elements of Data Analytic Style, with a chapter on how the type of research question predetermines what analysis is needed.  [Read more]


1   The citizen with the question used the phrase “statistically valid.”

Posted in Advocacy, Measurement, Numeracy, Reporting Evaluation/Assessment Results, Research, Statistics | Leave a comment

Do No Quantitative Harm

Every measurement and every real-world number is a little bit fuzzy, a little bit uncertain. It is an imperfect reflection of reality. A number is always impure: it is an admixture of truth, error, and uncertainty.
Charles Seife, Proofiness: How You Are Being Fooled by the Numbers

Seife explains that even the most well-conceived measures and carefully collected data are imperfect. In real-life situations, where measurement designs are far from ideal and data collection is messy, the numbers are even more imperfect. The challenge for library assessment and research professionals is making sure our study designs and measures don’t make things any worse than they already are. To the best of our abilities, we should strive to do no harm to the data.

Sharpening our skills in quantitative reasoning/numeracy will help make sure our measures aren’t exacerbating the situation. Here I’m continuing a quantitative exercise begun in my Nov. 9 post about a library return-on-investment (ROI) white paper connected with the LibValue project. In the post I explained that the ROI formula substantially exaggerated library benefits, a quantitative side-effect I suspect the researchers weren’t aware of.

A caveat before proceeding: Quantitative reasoning is not for people expecting quick and simple takeaways, or for those seeking confirmation of their preconceived notions. Quantitative reasoning is about thoroughness. It involves systematic thinking, and lots of it! (That’s why this post is so long.)  [Read more]

Posted in Library assessment, Measurement, Numeracy, Research | Leave a comment

Quantitative Thinking Improves Mental Alertness

It’s been a while since I’ve posted here. Writer’s block, I guess. I was hoping to come up with some new angle on library statistics. But to be honest, I haven’t been able to shake the quantitative literacy kick I’ve been on. I believe that quantitative literacy/numeracy is important in this era of data-driven, evidence-based, value-demonstrated librarianship. Especially when much of the data-driving, evidence-basing, and value-demonstrating has been undermined by what I’ll call quantitative deficit disorder. Not only has this disorder gone largely undiagnosed among library advocacy researchers and marketing aficionados, it has also found its way to their audiences. You may even have a colleague nearby who suffers from the disorder.

The most common symptoms among library audiences are these: When presented with research reports, survey findings, or statistical tables or graphs, subjects become listless and unable to concentrate. Within seconds their vision begins to blur. The primary marker of the disorder is an unusually compliant demeanor. Common subject behavior includes visible head-nodding in agreement with all bullet points in data presentations or executive summaries. In severe cases, subjects require isolation from all data-related visual or auditory stimuli before normal cognitive processes will resume.

The only known therapeutic intervention for quantitative deficit disorder is regular exercise consisting of deliberate and repetitive quantitative thinking. Thankfully, this intervention has been proven to be 100% effective! Therefore, I have an exercise to offer to those interested in staving off this disorder.  [Read more]

Posted in Measurement, Numeracy, Research | Tagged , , | Leave a comment

If You Plug Them In They Will Come

In their book What the Numbers Say, Derrick Niederman and David Boyum say that the way to good quantitative thinking is practice, practice, practice! In that spirit I offer this post as another exercise for sharpening the reader’s numeracy skills.

A couple of months back I presented a series of statistical charts about large U.S. public library systems. Sticking with the theme of large public libraries, I thought I’d focus on one in particular, The Free Library of Philadelphia. This is because the PEW Charitable Trusts Philadelphia Research Initiative did an up-close analysis of The Free Library in 2012. So this post is a retrospective on that PEW report. Well, actually, on just this graph from the report:

PEW Philadelphia Report Bar Chart

Source: The Library in the City, PEW Charitable Trusts Philadelphia Research Initiative.

The PEW researchers interpreted the chart this way:

Over the last six years, a period in which library visits and circulation grew modestly, the number of computer sessions rose by 80 percent…These numbers only begin to tell the story of how the public’s demands on libraries are changing.1

The implication is that because demand for technology outgrew demand for traditional services by a factor of 8-to-1, The Free Library should get ready to plug in even more technological devices! This plan may have merit, but the evidence in the chart does not justify it. Those data tell quite a different story when you study them closely. So, let’s do that.  [Read more]
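To see why a ratio of growth rates can mislead, here is a minimal sketch using hypothetical counts (these are not the PEW figures): a service starting from a small base can post a large percentage gain while a mature service still adds more activity in absolute terms.

```python
# Illustrative sketch with hypothetical numbers, NOT the PEW data:
# percentage growth depends on the starting base, so an 8-to-1 ratio
# of growth rates can coexist with the opposite ranking in absolute change.

def pct_growth(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100

# Hypothetical six-year counts
computer_sessions = {"start": 500_000, "end": 900_000}      # 80% growth
circulation       = {"start": 6_000_000, "end": 6_600_000}  # 10% growth

sessions_pct = pct_growth(computer_sessions["start"], computer_sessions["end"])
circ_pct = pct_growth(circulation["start"], circulation["end"])

print(f"Computer sessions grew {sessions_pct:.0f}%")  # 80%
print(f"Circulation grew {circ_pct:.0f}%")            # 10%

# The ranking reverses when we look at absolute increases:
print(computer_sessions["end"] - computer_sessions["start"])  # 400,000 added sessions
print(circulation["end"] - circulation["start"])              # 600,000 added circulations
```

With these made-up numbers, circulation growth looks "modest" in percentage terms yet adds half again as much activity as the technology service did.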


1  PEW Charitable Trusts Philadelphia Research Initiative, The Library in the City: Changing Demands and a Challenging Future, 2012, p. 10.

Posted in Uncategorized | Leave a comment

Averages Gone Wrong

In this post I’ll be telling a tale of averages gone wrong. I tell it not just to describe the circumstances but also as a mini-exercise in quantitative literacy (numeracy), which is as much about critical thinking as it is about numbers. So if you’re game for some quantitative calisthenics, I believe you’ll find this tale invigorating. Also, you’ll see examples of how simple, unadorned statistical graphs are indispensable in data sleuthing!

Let me begin, though, with a complaint. I think we’ve all been trained to trust averages too much. Early in our school years we accepted the idea that an average of test scores was the fairest reflection of our performance. Later, in college statistics courses, we learned about a host of theories and formulas that depend on the sacrosanct statistical mean/average. All of this has convinced us that averages are part of the natural order of things.

But the truth is that the idea of averageness is a statistical invention, or more accurately, a sociopolitical convention.1 There is no such thing as an average student, average musician, average automobile, average university, average library, average book, or an average anything. The residents of Lake Wobegon realized this a long time ago!

Occasionally our high comfort level with averages allows them to be conduits for wrong information. Such was the case for the average that went wrong found in this table from a Public Library Funding and Technology Access Study (PLFTAS) report:


Source: Hoffman, J. et al. 2012, Libraries Connect Communities: Public Library Funding & Technology Study 2011-2012, 11.

The highlighted percentage for 2009-2010 is wrong. It is impossible for public libraries nationwide to have, on average, lost 42% of their funding in a single year.   [Read more…]


1   Desrosières, A. 1998. The Politics of Large Numbers: A History of Statistical Reasoning. Cambridge MA: Harvard University Press. See chapters 2 & 3.

Posted in Advocacy, Library statistics, Measurement, Numeracy, Statistics | Leave a comment

I Think That I Shall Never See…

This post is about a much discussed question: How did the Great Recession affect U.S. public libraries? I’m not really going to answer the question, as that would amount to a lengthy journal article or two. But I am going to suggest a way to approach the question using data from the Institute of Museum and Library Services (IMLS) Public Libraries in the United States Survey. Plus I’ll be demonstrating a handy data visualization tool known as a trellis chart that you might want to consider for your own data analysis tasks. (Here are two example trellis charts in case you’re curious. They are explained further on.)

As for the recession question, in the library world most of the discussion has centered on pronouncements made by advocacy campaigns: Dramatic cuts in funding. Unprecedented increases in demand for services. Libraries between a rock and hard place. Doing more with less. And so forth.

Two things about these pronouncements make them great as soundbites but problematic as actual information. First, the pronouncements are based on the presumption that looking at the forest—or at the big picture, to mix metaphors—tells us what we need to know about the trees. But it does not…   [Read more…]

Posted in Advocacy, Data visualization, Library statistics | Tagged , , , , , | Leave a comment

Roughly Wrong

I decided to move right on to my first 2014 post without delay. The reason is the knot in my stomach that developed while viewing the WebJunction webinar on the University of Washington iSchool Impact Survey. The webinar, held last fall, presented a new survey tool designed for gathering data about how public library patrons make use of library technology and what benefits this use provides them.

Near the end of the webinar a participant asked whether the Impact Survey uses random sampling and whether its results can be considered statistically representative. The presenter explained that the survey method is not statistically representative since it uses convenience sampling (a topic covered in my recent post). She confirmed that the data represent only the respondents themselves, and that libraries have no way of knowing whether the data accurately describe their patrons or community.

Then she announced that this uncertainty and the whole topic of sampling were non-issues, saying, “It really doesn’t matter.” She urged attendees to set aside any worries they had about using data from unrepresentative samples…   [Read more…]
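Why sampling does matter can be shown with a toy simulation. All of the parameters below are invented for illustration: a community where 30% of people are heavy technology users, and a self-selected survey that heavy users are five times more likely to answer.

```python
# Toy simulation with hypothetical parameters: a convenience sample of
# self-selected respondents can badly misstate a community-wide proportion.
import random

random.seed(42)  # reproducible run

# Hypothetical community: 30% heavy technology users, 70% light users.
community = ["heavy"] * 3000 + ["light"] * 7000

def responds(person):
    """Self-selection: heavy users answer the survey 5x more often."""
    return random.random() < (0.25 if person == "heavy" else 0.05)

convenience_sample = [p for p in community if responds(p)]
share_heavy = convenience_sample.count("heavy") / len(convenience_sample)

print(f"True share of heavy users:   {3000 / 10000:.0%}")  # 30%
print(f"Share in convenience sample: {share_heavy:.0%}")   # roughly double the true share
```

With these assumed response rates, the sample suggests heavy users are a large majority when they are actually less than a third of the community, and nothing in the sample itself reveals the distortion.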

Posted in Advocacy, Probability, Research, Statistics | Leave a comment