Fake Data

The political discourse in the U.S. is affecting my psychological health. I’m not kidding! This became clear a few weeks back after my colleague Keith Curry Lance brought a library statistics report to my attention. This was the 2016 Annual Report issued by Project Outcome, the Public Library Association’s (PLA) national standardized outcome measurement initiative funded by the Bill & Melinda Gates Foundation. I began having dreams about how Project Outcome promotes wrong information about its survey findings.

In one dream I found myself in a society that had a central social medium, used by all citizens, known as Blatther. Somehow I fell in lockstep with the millions of Blattherers who found it entertaining to broadcast rude 140-character messages called blurts. I ended up blurting these messages:

Project Outcome is at it again. Distributing F-A-K-E data! Pitiful! F-A-K-E = Factually Askew or Known to be Erroneous

IMLS & Gates Foundation need to audit C-R-A-P in grantee reports. No quality control! C-R-A-P = Claims Resting on Absurd Propositions

When will library advocates stop spreading L-I-E-S? Disgraceful! L-I-E-S = Literally Inadequate Evidentiary Statements

In my waking life I avoid social media mostly for reasons outlined in Neil Postman’s pre-Twitter era book, Amusing Ourselves to Death: Public Discourse in the Age of Show Business (originally published in 1985). So, I’m not proud that I jumped onto the Blattherer bandwagon in my dream.

Still, my blurts are essentially true despite their coarseness. And I feel like I should try to substantiate them. First, I need to say that Project Outcome is providing a valuable service to U.S. and Canadian public libraries. Getting libraries to collect and use data in their planning and operations is a good thing, as long as libraries have the basic levels of knowledge and skills needed to use data responsibly. I hope that the ultimate result of Project Outcome is to enable libraries to use data to improve library quality and effectiveness and to report honestly to constituents and stakeholders.

Right now PLA’s focus is not on data for improving library quality or accounting honestly to stakeholders. Instead, PLA’s primary goal is to prove that libraries provide substantial benefits to their communities. The premise is that library services and programs are automatically beneficial. In PLA’s view the only problem is the lack of metrics to confirm these benefits. This viewpoint is clear in these excerpts from the annual report:

People who work in public libraries know that library services open new opportunities for anyone who enters—putting people on the path to literacy, technological know-how, or a better job. We see evidence of this every day—what libraries have long been missing is the data to support it.1  [emphasis added]

Project Outcome was designed to help public libraries understand and share the true impact of their services and programs…2  [emphasis added]

A similar message can be heard in PLA’s March 9, 2017 podcast:

[Public libraries] know they make a difference in patrons’ lives and the reasons why, but they need to prove it.3   [emphasis added]

PLA probably doesn’t realize that this statement contains a second proposition: besides producing benefits automatically, public libraries know exactly how their efforts produce these benefits. That is, libraries have a complete understanding of what the field of program evaluation calls program theory, the how and why of designing programs and services to attain desired outcomes. I’d love to know the body of research this assertion is based on, as it surely wasn’t covered in my library school training!

Here’s another proposition, the idea that benefits radiate beyond individual recipients of library services:

…the impact of public libraries extends beyond the individuals who use them–it strengthens and empowers the community around them.4

While this and the other propositions are central to the ideology of public librarianship, they are beliefs rather than facts. Which brings us to PLA’s mistaken ideas about data and evidence:

Data collected from the first year of Project Outcome tell us unequivocally that library programs and services improve the lives of their patrons.5  [emphasis added]

This statement is C-R-A-P. Unequivocal means leaving no room for doubt. It is impossible for surveys to produce findings that leave no room for doubt. This truism also applies to all data and evidence. Neither of these can ever be 100% valid or accurate. Their validity and accuracy are matters of degree since there’s always some measurement uncertainty (error) involved. Sometimes the error is so large that findings are completely wrong, as the world witnessed in U.S. political polls last year.
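To put a number on that doubt, here is a minimal sketch in Python. It is my own illustration with invented figures, not anything from PLA: the textbook 95% margin of error for a survey proportion, which applies even to a flawlessly run random-sample survey.

```python
import math

def moe_95(p_hat: float, n: int) -> float:
    """Approximate 95% margin of error for a survey proportion,
    using the normal approximation and assuming a simple random sample."""
    return 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

# Invented numbers for illustration: an "80% agree" finding from 1,000 respondents.
print(f"80% plus or minus {moe_95(0.80, 1000):.1%}")  # prints: 80% plus or minus 2.5%
```

And that interval accounts only for random sampling error under ideal conditions. The systematic distortions that plague real surveys are another matter entirely.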

There are several hurdles in the way of gathering accurate survey data. In the field of survey research these hurdles are understood as sources of bias. I’ve enumerated these sources elsewhere and repeat them here:

Selection bias: How complete a range of subjects is polled
Nonresponse bias: The extent to which subjects who are polled cooperate by completing surveys
Question bias: Whether the measurement instrument affects responses, for example, leading questions
Administration bias: Whether the survey administration affects responses, for example, hints encouraging desirable responses
Response bias: Whether subject responses are accurate and truthful
Item bias: Whether the construction of the questionnaire as a whole yields generally biased results, e.g., cultural, gender, or age bias

Even though these biases can never be completely remedied, Project Outcome staff assure libraries that their survey data are trustworthy evidence. The sad thing is that unsuspecting libraries are encouraged to spread L-I-E-S to constituents and stakeholders. Granted, introducing novice libraries—especially the smallest libraries with the smallest staffs and budgets—to responsible use of data must begin with small steps, such as deploying simple measures of patron perceptions. But eventually PLA is going to need to evolve beyond L-I-E-S and F-A-K-E data.

Unfortunately, it looks like this evolution is going to be slow. Project Outcome is too mired in C-R-A-P embedded in its narrative, such as the idea that its surveys reveal the “true impact” of library services and programs (see the annual report excerpt quoted earlier). And this assertion from the March 2017 PLA podcast:

We looked at 17,000 patron survey responses from over 700 different types of programs. And what we found overall is we have really high percentages when it comes to patrons reporting that they learned something, that they gained confidence, that they plan a change of behavior, or are more aware of library resources. On average it’s an 80% rate that people say—yeh—we agree or strongly agree that this has happened that we have learned something…et cetera.6

Here the implication is that the more the surveys are repeated, the more trustworthy Project Outcome aggregate findings become. Who would not believe 17,000 survey respondents? Well, again, in last year’s U.S. presidential campaigns repeated surveys of thousands and thousands of respondents still led to wrong answers. Conducting more and more surveys doesn’t lead to more trustworthy findings. All that repeating faulty surveys does is increase the quantity of F-A-K-E data collected. Oblivious to how survey biases work, PLA continues to confidently publish data such as these aggregate percentages:

[Image: Exaggerated data from the Project Outcome annual report.7]

These high percentages are L-I-E-S due to the compromised accuracy and validity of the data. Accuracy is the quantitative correctness of a measurement. Validity is the extent to which the measurement reflects the characteristic or dimension being measured. A poorly calibrated household scale will still measure weight, although incorrectly. But even when calibrated correctly the scale cannot measure height.
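To see how just one of the biases listed earlier can manufacture percentages like those above, here is a minimal simulation in Python. All of the numbers are invented for illustration: suppose 60% of program attendees genuinely benefited, but satisfied patrons are three times as likely to return a survey form as unsatisfied ones.

```python
import random

random.seed(42)

TRUE_RATE = 0.60           # hypothetical: 60% of attendees genuinely benefited
P_RESPOND_IF_HAPPY = 0.60  # hypothetical response rates: satisfied patrons
P_RESPOND_IF_NOT = 0.20    # return forms three times as often as unsatisfied ones

def run_survey(n_attendees: int) -> float:
    """Simulate one program survey; return the observed 'benefited' rate."""
    responses = []
    for _ in range(n_attendees):
        benefited = random.random() < TRUE_RATE
        p_respond = P_RESPOND_IF_HAPPY if benefited else P_RESPOND_IF_NOT
        if random.random() < p_respond:
            responses.append(benefited)
    return sum(responses) / len(responses)

# Piling up more surveys shrinks the noise but not the distortion.
rates = [run_survey(500) for _ in range(100)]
print(f"true rate {TRUE_RATE:.0%}, observed {sum(rates) / len(rates):.0%}")
# Prints roughly: true rate 60%, observed 82%
```

With these made-up response rates the observed figure settles near 82% no matter how many surveys are collected. Seventeen thousand responses simply measure the same distortion more precisely.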

But don’t expect to find these terms—accuracy and validity—used in PLA presentations, webinars, reports, or promotional materials. Instead, Project Outcome portrays issues related to distortion in their data as challenges:

[Image: Project Outcome computerized training slides addressing potential biases in the data.8]

In survey research, not getting enough responses is usually an indication of selection (sampling) bias and/or nonresponse bias. Overly positive responses can be due to question bias, administration bias, response bias, and/or item bias. Project Outcome’s reaction to these possible data distortions is announcing that the survey data are “evidence of a community confirmation of change.” Notice that this phrase doesn’t tell readers whether the evidence is weak, moderate, or strong.

In any case, let’s examine this phrase more closely. The phrase refers to the fact that the surveys gather patron self-reports about changes in the project’s three measurement dimensions—knowledge, confidence, and behavioral intentions. The slide contrasts self-reports with “proof of rigorous statistical determination of change,” which would be objective assessments of some sort. An example of self-reported measures is student opinions about how much they learned in a social studies class. An example of objective assessments is validated tests of social studies student learning.

However, contrasting these two measurement methods is beside the point (of how to address distortion in the data). Continuing the social studies example, both student self-reports and assessments (tests) can have too few responses. Consider a classroom where one half of the class is out sick with the flu. Whether the teacher solicits self-reports or gives a test, under-reporting is still an issue. With 50% of students absent, both measurements can be inaccurate reflections of the class’s social studies learning. Similarly, too positive responses can occur with either measurement approach. Surveys of student opinions can be biased in a way that elicits positive responses. And tests can be so easy that almost all students earn an A.

Maybe Project Outcome brings up the self-reports versus assessment measurement approaches just to acknowledge that the surveys are not scientifically conclusive and therefore not completely trustworthy. But then how trustworthy are they? Hardly? Slightly? Moderately? Sufficiently? Project Outcome’s answer is that the survey data are definitely trustworthy. Enough to “reinforce a powerful story” and be “a strong indicator of public perception.” And that libraries should confidently use the data to “reinforce the importance of the library in grant applications” and to “build partnerships.”

With similar optimism Project Outcome webinars suggest that highly positive results are simply the lot of libraries. Libraries are so effective, so beneficial, and so beloved that patrons consistently give them glowing reviews. At the same time, the website describes highly positive responses as something called the ceiling effect:

[Image: Project Outcome computerized training slide about the ceiling effect.9]

In formal research a ceiling effect is when a measurement scale has an arbitrary upper limit that some individual measurements would surpass if it weren’t for this limit. Most of the time this effect indicates a flaw in the measurement instrument, rather than some characteristic of the phenomenon being measured.10
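A toy demonstration, again in Python with invented numbers: give patrons a latent “benefit” level that varies widely, then record it on a 1-to-5 agreement scale that tops out at “strongly agree.”

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical latent "benefit" levels; real variation extends well past
# the top of a 1-5 agreement scale.
latent = [random.gauss(4.8, 0.6) for _ in range(1000)]

# The instrument records nothing above "5 = strongly agree", so all
# variation at or beyond the ceiling collapses into a single category.
recorded = [max(1, min(5, round(x))) for x in latent]

print(sorted(Counter(recorded).items()))
# Most responses pile up at 5 -- a property of the scale,
# not proof of uniform impact.
```

The pile-up at the top of the scale appears even though the underlying benefit varies considerably.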

In an earlier article I used mixed metaphors to describe this problem, writing that the questionnaires set a low bar for success by casting a wide net. This wide net is obvious in the questionnaire wording:

You are more aware of some issues in your community.
You learned something new that is helpful.
You feel more knowledgeable about the job search process.

With such lenient questions many more respondents are bound to respond positively than neutrally or negatively. Add to this the wishes of some respondents to be seen as good citizens or good learners and the effect is more pronounced. Then add bias from questions like these and the effect is even stronger:

You intend to apply what you just learned.
What could the library do to help you continue to learn more?

These items are biased because they presume that patron learning occurs for all respondents, and that patrons are satisfied with learning anything at all.

But the problem isn’t so much the leading questions as the over-generality of the measures. This was a conscious decision by PLA in its quest for standardized measures that could be aggregated nationally, a quest whose logic still eludes me. Combine this with the need to confirm the pre-conceived ideas that libraries naturally and consistently deliver benefits and the whole approach produces the ceiling effect.

Either PLA is unfamiliar with recognized testing and measurement methods or it chose to ignore them. Measurement instruments need to obtain results that are distributed over some reasonable range rather than clustered all together. This is why aptitude and achievement tests contain numerous specific questions rather than a few general ones. The solution to a ceiling effect is to improve the measurement instrument so that it captures the full range of variation in whatever is being measured.11 For example, if nearly everyone in a social studies class gets 100% on a test, the test needs to be made more difficult.
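The same toy setup can sketch the fix. Below (all thresholds invented for illustration), one lenient catch-all item puts nearly everyone in the “agree” column, while four items of graded specificity spread the same respondents across the full range:

```python
import random

random.seed(7)

# Hypothetical latent "amount learned" on a 0-100 scale.
latent = [random.gauss(60, 20) for _ in range(1000)]

# One lenient catch-all item ("you learned something new"): nearly
# everyone clears the bar, so the item can't discriminate.
lenient_agree = sum(x > 15 for x in latent) / len(latent)

# Four items of graded difficulty: each respondent scores 0-4,
# and the scores spread across the whole range.
thresholds = [30, 50, 70, 90]
scores = [sum(x > t for t in thresholds) for x in latent]

print(f"lenient item agree rate: {lenient_agree:.0%}")  # about 99%
print({s: scores.count(s) for s in range(5)})           # spread over 0-4
```

This is exactly what test developers mean by building instruments that discriminate.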

In the surveys’ 6-item questionnaires, 3 items measure outcomes per se. A 4th item measures awareness of library services. (Knowing more about the library is a means to an end, not an outcome. Stakeholders are unlikely to buy the argument that one of the key benefits libraries provide to communities is a better understanding of the library.) The 5th and 6th items are open-ended questions meant to elicit patron suggestions.

From a program planning and evaluation perspective the data from the 3 outcome-related items are low in informational value (and even less informative when aggregated as alleged national averages). Hearing that participants in job search skills programs “are more knowledgeable about the job search process” may lead library stakeholders to ask questions like: What did they learn? Did they learn the basic skills they need? What range of the skills job seekers want or need was covered? Were they taught how to successfully apply what they learned? These are the same questions that libraries should be asking. On the other hand, Project Outcome wants libraries to be happy with findings saying that recipients learned anything at all. Library stakeholders may not be so happy with these sorts of findings.

Project Outcome does a disservice to libraries by saying that they are conducting outcome evaluation when they are not. Global impressions of benefits delivered (the questionnaire’s 3 outcome items) tell neither the library nor its stakeholders whether the programs and services succeeded at accomplishing the specific outcomes that were intended. Nor do these impressions tell libraries which aspects of their program and service delivery need improvement. Addressing these questions is a core requirement for conducting outcome evaluation.

With Project Outcome the only source of information potentially relevant to improving programs and services comes from the surveys’ open-ended items, and to a lesser extent from the item about awareness of library resources. This is why the project staff’s delight at hearing about libraries using Project Outcome findings to improve services is so ironic. If PLA were really interested in gathering data useful for quality improvement, they would not rely mostly on hit-or-miss open-ended questions. Put another way, they wouldn’t waste 3 of 6 questionnaire items gathering unusable data. They’d look into measurements more along the lines of the Association of Research Libraries’ LibQUAL+ Lite, a validated measurement instrument that gathers specific user perceptions about service expectations and experiences. In the meantime, Project Outcome portrays its casual efforts at collecting data relevant to program improvement as a major benefit of the surveys, and as justification for employing the surveys at all.

Again, none of the slantedness of this project should surprise anyone. All along PLA has openly admitted that Project Outcome surveys are a way to find somebody besides libraries and their advocates to promote libraries. Patrons are alternate messengers recruited to deliver the same messages. Pseudo-survey research is the medium. The only problem is the conflict between the profession’s principles of information accuracy, balance, and trustworthiness and the indifference that library advocates and marketeers have towards these principles.

To be fair I should say that Project Outcome does address certain methodological limitations of the survey data on its website under a topic entitled, Framing Survey Results. Here I’ll limit my comments on this material to one suggestion. This pertains to the problems of selection and nonresponse biases addressed, though not named, under the topic’s sub-heading, Survey Respondents. As “solutions” to these biases Project Outcome recommends that libraries “identify results as ‘based on survey respondents.’” And “include number of survey respondents and response rate.” To which I would add:

Advise library stakeholders that the data may have little resemblance to the true figures for the entire group of recipients of library programs or services, including past or future recipients.

Advise library stakeholders that “we have fairly strong evidence that the survey data are positively slanted. In order to understand library benefits among our patrons as a whole, when reading our reports please subtract some points—you decide how many—from the published figures.”

Something like this:

[Image: Subtracting points from Project Outcome aggregate percentages.]

Oh, yes. I confess to telling a lie earlier. In my dream I actually blurted 4 messages. The 4th was in reaction to this visual from the Project Outcome website:

[Image: Project Outcome computerized training slide showing a “chord diagram.”12]

My blurt was:

Now PO expects libraries to unravel so-called “chord diagrams”? Study chord thickness? WTF? Voodoo statistics! This wheel needs re-invented!

—————————

1   Project Outcome. 2017a. 2016 Annual Report: Project Outcome Year in Review. Chicago: Public Library Association, p. 3.
2   Project Outcome. 2017a, p. 5.
3   Public Library Association. March 9, 2017. FYI Podcast 18 – Project Outcome. Public Libraries Online. Retrieved from http://publiclibrariesonline.org/2017/03/fyi-podcast-18-project-outcome.
4   Project Outcome. 2017a, p. 4.
5   Project Outcome. 2017a, p. 11.
6   Public Library Association, March 9, 2017.
7   Project Outcome. 2017a, p. 4.
8   Adapted from: Project Outcome. 2017b. Project Outcome Survey Results: Maximizing their Meaning. Retrieved from https://www.projectoutcome.org/surveys-resources/survey-results-maximizing-their-meaning-3440f9c0-2fcb-4d9a-ad69-de0c18a99b27. I excerpted 2 charts and arranged them horizontally.
9   Project Outcome. 2017b.
10 A ceiling effect isn’t restricted to multiple choice tests as the slide might suggest. Nor does the term apply to qualitative data like comments to open-ended questions. For questions eliciting qualitative responses the term bias works fine.
11  You can read more here about developing and testing questionnaires to avoid uniform responses, that is, responses that cluster together on a measurement scale.
12  Project Outcome. 2017b.
