In the book John Adams, author David McCullough writes about Adams’ legal defense of British soldiers on trial for murder in 1770. In his argument to the Massachusetts jury, Adams said:
Facts are stubborn things. And whatever our wishes, our inclinations, or the dictums of our passions, they cannot alter the state of facts and evidence.1
Indisputable facts are difficult to ignore, indeed. Yet facts are not always clear and unambiguous. Getting to the plain facts and drawing valid conclusions from them can be stubborn matters in their own right. To quote science teacher and YouTube lecturer wonderingmind42, “Interpreting evidence well requires skill, training, and experience.”2
In a review of David Brooks’ new book, The Social Animal, cognitive psychologist Christopher Chabris discusses the importance of drawing legitimate conclusions from evidence. He says that some of Brooks’ claims either misreport current research from the fields of neuroscience and psychology, or are conjecture:
For example, there is no way to assess the truth of [Brooks’] claim that “ninety percent of emotional communication is nonverbal” since it is impossible even to measure something as nebulous as “emotional communication.” Repeating such factoids…is tantamount to spreading urban legends.3
Reliable conclusions can only be based on careful interpretation of factual evidence. The evidence has to directly support the conclusions drawn. Chabris notes that Brooks sometimes falls short on this count:
Mr. Brooks describes research showing that, during the dot-com bubble, the average investor…lost money by trading in and out too much, when the investor could have made money by just sitting tight. Such people acted “self-destructively because of their excessive faith in their intelligence,” he says. Surely intelligent people can do stupid things. But where is the evidence that intelligence or intellectual arrogance was correlated with bad decisions, let alone caused them? The mutual-fund study [cited by Brooks] said nothing about whether more intelligent investors were more overconfident, or traded more, than less intelligent investors.4
Without sufficient evidence, Brooks’ assertions are just “the dictums of his passions.” His evidence that intelligence inhibits decision-making is not merely insufficient; it is missing altogether. But what about arguments where corroborating information is provided? How do we judge whether the information is sufficient evidence?
Well, that depends. In certain fields like law and financial auditing there are specific rules of evidence, including standards the data must pass. But in the behavioral and social sciences the trustworthiness of evidence is a matter of degree. Depending on their scope and quality, data supporting a given assertion will have more or less weight.
Let’s take a closer look at how this works. In his book Policy & Evidence in a Partisan Age: The Great Disconnect, Paul Wyckoff provides an outline for a hierarchy of evidence in this diagram:
Source: Policy & Evidence in a Partisan Age.5 Reprinted with permission.
Towards the bottom of the diagram are the more common but least reliable sources for evidence. Sources towards the top provide more solid evidence.6
Here are my brief explanations of the individual levels:
Case studies and anecdotes. Includes patron testimonials, staff and board perceptions of services, focus groups, and those emotionally appealing docu-stories and photo shoots that frequently appear in public library annual reports. The information is unconvincing because it describes isolated accounts that typically do not reflect the larger population of interest.
“Raw” figures and percentages. Standard operational data such as library statistics (visit counts) and rates (library spending per capita or per student). While percentages and rates add some context to these data, neither type of data communicates much about the performance of the organization or its programs.
Multivariate statistical techniques and natural experiments.7 Includes descriptive studies (surveys, analyses of existing data, outcome studies) and correlational studies. The latter type explores relationships between factors, for instance, between community demographics and library use. Generally, these studies provide reasonable descriptive evidence, but they are insufficient for confirming a causal connection between any two factors, for instance, between program activities and client outcomes.
Small scale experiments. Individual studies, both experimental and quasi-experimental, that produce information about program impact (effectiveness). These experiments are typically limited to specific contexts. Findings from well-designed experiments and, to a lesser degree, from quasi-experiments, are strong evidence that one factor (program interventions) caused another (desired program effects).
Meta-analyses and literature reviews. Thorough reviews summarizing findings from a set of studies on a particular topic.8 Numerous studies demonstrating consistent results amount to stronger support for an assertion.
Large scale experimental studies. National or international studies of large social, economic, educational, health, and other programs. Studies may be experimental or quasi-experimental. Libraries are unlikely to be the focus of these studies.
Given how evidence works, I’d say librarianship should take this approach: First, forget the anecdotes completely. As endearing as they are, they are way too short-sighted. Second, we should view standard library statistics with some skepticism since we don’t really know what they mean (except as they compare to themselves historically). Third, our focus has to be on multivariate studies and perhaps some quasi-experimental ones. And finally, meta-analysis would be awfully nice if we can see our way to tackling it.
So, there you have a quick tour of the evidence terrain. One postscript, though, about not jumping to conclusions based on a single study. Chabris writes:
The literature in the social and medical sciences is full of results and claims that either don’t replicate or haven’t been tested [further]… The first study on a topic is rarely the last word.9
The importance of replicating research findings is a topic for some later date. (It has to do with the scientific method.) For now suffice it to say that the practice, so common in the library world, of pronouncing findings from a single study as simultaneously revolutionary and absolutely true is not the way to go. I’m all for the stubbornness of factual evidence. But the emphasis has to be on the factual part, not the stubborn part.
1 In McCullough, D., (2001). John Adams, Simon & Schuster, p. 68. Red emphasis added.
2 Quote appears in the video at the 5:18 time mark. Also watch the segment from 2:40 to 4:20 about facts versus the interpretation of facts.
3 Chabris, C. F. (2011, March 5-6). The mind readers: In search of success, do we overvalue intelligence and undervalue emotion, intuition and social cues? The Wall Street Journal, p. C5.
4 Chabris, C. F. (2011). p. C9.
5 Wyckoff, P.G., (2009). Policy & evidence in a partisan age, Washington, DC: Urban Institute Press, p. 18.
6 The wider rectangles of the lowest categories imply that these categories could be a foundation for the higher categories. But they are not. The lowest categories represent the least substantial forms of evidence, the higher categories, the most substantial. This chart from evidence-based medicine uses a design similar to Wyckoff’s (a 3-D pyramid in this case), with higher levels representing more trustworthy evidence.
7 The term natural experiment is a bit difficult to pin down since it is defined differently in different fields. Since natural experiments are sometimes considered to include quasi-experiments (which some experts have also labeled as observational studies), this sub-category spills over into the next category, small-scale experiments. Because of these complications I decided just to omit natural experiments from the discussion.
8 In evidence-based practice meta-analyses are called systematic reviews.
9 Chabris, C. F. (2011). p. C9.