Lib(rary) Performance Blog

Entries categorized as ‘Library assessment’

Navigating with Fragmentary Information

February 16, 2010

I have implied this in other entries in this blog, but I might as well say it outright: The library and information science profession needs to come to terms with the issue of standards for (i.e., rules of) evidence for performance, statistical, and advocacy research data. There, now I’ve said it.

I recently read the short and enjoyable book Graphic Discovery: A Trout in the Milk and Other Visual Adventures by statistician Howard Wainer (Princeton, NJ: Princeton University Press, 2005). The subtitle of the book comes from something Henry David Thoreau wrote. During a dairy strike in 1850 in New England people began to suspect that dairy owners were watering down the milk supply. This led Thoreau to write in his journal, “Sometimes circumstantial evidence can be quite convincing; like when you find a trout in the milk” (quoted in Wainer, p. 81).

Wainer’s main point, one certainly made also by others like John Tukey and Edward Tufte, is that well-designed graphical representations are invaluable for exploring and understanding data. Graphical data presentation can lead to revelations about data, and the underlying phenomena they describe, that would otherwise be missed.

But, alas, Wainer and the others warn that the design of graphs can serve to mislead readers. (Statistics can lie. So, you have to figure that statistical graphs might fail a few polygraph tests, too.) Now re-sensitized to this possibility, I am, at least in the short term, looking more closely at graphs I encounter. A graph appearing in an article in the Nov. 9, 2009 issue of Business Week was easy prey for my renewed vigilance. Unfortunately, the electronic versions of this article available from EBSCO, LexisNexis, and other databases omit graphics altogether, an aggravating defect of digitization, indeed. To see the graph, click on the image below.

Click on this image to see the Nov. 9 Business Week graph.

In the Business Week article, “The GDP Mirage,” author Michael Mandel argues that the economic index, Gross Domestic Product (GDP), is incomplete because it does not measure “intangible investments” corporations make. By overlooking these investments, Mandel claims, the U.S. is “navigating…with fragmentary information” (p. 36). The reader can get the gist of Mandel’s ideas from the article itself. For now, I just want to point out that the aim of the graphic is to illustrate the author’s argument.

Notice that the Business Week graphic consists of three charts. Rather than having an individual title for each chart, a caption at the top forms three surrogate titles: “Reported GDP jumps ahead of jobs [left graph]…but the GDP stats don’t count R&D cuts [center graph]…or lost jobs for knowledge workers [right graph].”  The implication is that if the GDP were to include statistics reflecting cuts to research and development and lost jobs, it would be a more valid measure of economic output. (The article doesn’t actually recommend that job loss statistics be included in revised GDP calculations, but we can ignore this inconsistency for our present purposes.)

Its own title notwithstanding, this graphic has a “numbers problem” quite distinct from the GDP measurement challenge that concerns Mandel. The problem with the graphic is this: Two of the three charts report (let’s call these) “actual data” while the third does not. The left and right charts present data obtained from the U.S. Bureau of Labor Statistics, which we can presume were collected using accepted sampling methods. However, the center chart is—depending on how you look at it—either a convenience sample or merely a collection of anecdotes.

The center chart’s heading, “Selected Companies that Have Cut R&D Spending Over the Last Year,” suggests that the selection is some type of non-probability sample. As seen in the chart, cuts for these companies range from roughly 12% to 36%. Nowhere, though, does the chart or the article tell us how the companies were selected or to what extent the percentages pertain to the larger set of U.S. corporations of interest.

What we have is anecdotal information masquerading as data! Even though the chart title is clear,* placing that chart in the middle of two other charts that contain actual data is deceptive. Due to its location and bar-graph style, this chart appears to be on a par with the other charts when it really is not. The center chart is mostly conjecture; the other two have firmer grounding.

Since the units of measure in that chart are percentages, the population parameters (in this case, percentages of decrease in R&D spending among all U.S. corporations of interest) are likely to be within some reasonable range, probably not ridiculously far from the range seen here.

But this is not the point. The author does not have any conclusive evidence about what this range actually is and he, or the creator of the charts, ought to say so. This is a case of pretending to have data that you don’t, in fact, have. Or, in Mandel’s words, navigating with fragmentary information. Wainer would not be so forgiving; he would call the center chart “nondata” since that is what it is (p. 57). On that same page Wainer also makes this wonderfully apropos pronouncement:

“The plural of anecdote is not data.”

Sure, for particular purposes, quick-and-dirty selections and pseudo-samples can be justified. But, they do not deserve to be graphed. So, if you will permit me, I want to experiment with a possible contribution to the set of standards for evaluating evidence that the library and information science profession might someday establish:

Standard XV.1.c.    Since anecdotal information represents only itself, it shall not be portrayed, nor presented graphically, in a way that implies that it describes any phenomena in the aggregate.

Okay, so I can’t think of very good wording. Thankfully, there’s plenty of time for re-working that sentence…

—————————————————
* I don’t mean to say that the chart is clearly titled, but that, once you are able to find it, the title (or is it a subtitle?) has an unambiguous meaning.

Compared to standards of good graphing practice that Howard Wainer, William Cleveland (The Elements of Graphing Data, Murray Hill, NJ: AT&T Laboratories, 1994) and others promote, the Business Week graphs are pretty damned bad! The axis labels are too difficult to find, first, because the charts are overpowered by thick, all-black bars and bold-fonted category labels (company names and occupation categories). And, second, due to small fonts, crowding, and misplacement.

In the left chart, the label “Percent” has the wrong orientation since it applies to the vertical axis. The chart’s horizontal axis has no label. Thanks to the chart designer’s use of Roman numerals we can guess that the units must be quarters on an annual economic calendar. Squeezing the legend into the data portion of this chart violates a cardinal graphing design principle: Don’t let clutter make the data more difficult to see. Though less important, the word “Forecast,” a note for the single GDP data point at quarter III of 2009, appears in a larger font than both axis labels and tickmark values. Not good.

In the center and right charts, the labels, “Percent Change in R&D Spending” and “Percent Change in Employment,” are misplaced. Both should appear on the lower horizontal axes near the appropriate grid marks. Both labels include asterisked notes that imply the labels are meant to serve a dual purpose as titles (or subtitles). This confusion could be alleviated by creating descriptive chart titles that include the notes information (no need to separate it), and then inserting fully descriptive labels adjacent to the axes.

Grade this graphic earns:  D-

Categories: Advocacy · Library assessment · Measurement · Research

Sawing with a Dull Saw

January 25, 2010

In spite of their evolution over the last few decades, accelerated most recently due to the Googlization of information, public libraries have been amazingly impervious to change in the arena of performance measurement. I found the following observations about  library measures in the early history of American libraries:

There is no branch of library economy more important, or so little understood by a librarian as helps to himself, as the daily statistics which he can preserve of the growth, loss, and use (both in extent and character) of the collection under his care. The librarian who watches these things closely, and records them, always understands what he is about, and what he accomplishes or fails to accomplish. The patrons to whom he presents these statistics will comprehend better the machinery of the library, and be more indulgent toward its defects.     Public Libraries in the United States of America, Warren, S.R. and Clark, S.N., Eds., Washington, DC: U.S. Bureau of Education, 1876, p. 714.

Interesting that use of library statistics for advocacy purposes was recognized in 1876.

Early in the twentieth century our current ideas about performance measurement were already well understood, long before the practices of program evaluation, evidence-based management, and performance scorecarding were formalized. Arthur Bostwick, late director of St. Louis Public Library and Librarian of Brooklyn Public Library, wrote this in his book published in 1917:

No business can be properly carried on without a system of accounts. These may involve only money received and expended, but they may and should extend much further. The collection and tabulation of such [financial and performance] data have come to be regarded as indispensable by shrewd businessmen; and large corporations do not hesitate to spend considerable sums in employing a force of experts and clerks especially to gather data of this kind and to tell what they mean…    The American Public Library, Bostwick, A.E., New York: D. Appleton & Co., 1917, p. 253.

Bostwick also dealt with the ideas of accountability, continuous improvement, careful analysis of library statistics, and a tiered approach to evaluation data-collection:

Information of this [financial and performance] kind is gathered with either or both of two different purposes in view—to satisfy the legitimate curiosity of the person managing the business, or of some one who has a right to know how it is going on, whether it is succeeding or failing and just what it is accomplishing; and, secondly, to furnish a basis for improvements or changes, to indicate weak points and points of strength, so that the business may be reënforced along the former and extended along the latter.

…If the latter [purpose is intended], a more detailed and analytical study is made of the data, which are compared and tested in all possible ways to reveal unsuspected facts. When something is thus brought to light that seems to call for further investigation, additional data are collected. (p. 254)

He also preached about the imprecision of library statistics, a topic conveniently overlooked in our profession nowadays (as is the importance of assuring the validity of data presented in library advocacy reports):

It should not be forgotten, either by those who collect and report these statistics, or by those who read them or use them, that they are of various degrees of exactness…In any kind of scientific measurement the limits of probable error are always mentioned to give an idea of the degree of accuracy. The less the probable error, the greater the accuracy. It is never stated that there can be no error and that the accuracy is exact; this would be simply ridiculous. The same holds good in library statistics. In the average report nothing at all is said of accuracy; the reader is left to conclude that all the data are exact, or at least that there is no difference in their degree of exactness. (p. 262)

Finally, the level of interest in this topic among Bostwick’s 20th century peers strikes a familiar chord:

But how much intelligent study of library statistics goes on in librarians’ offices, and how much modification or improvement in library methods and material results from such study, is something that we shall never know. It appears to be certain, however, that large numbers of librarians…look upon their statistics in the light of a necessary evil. They must be collected, because something of the kind is expected in the annual report, but they should be minimized, and, once in print, they should be dismissed from the mind. This attitude reminds one of the rural workman who used a dull saw because the amount of work before him gave him no time to stop and sharpen it… (p. 255)

Categories: Library assessment · Measurement

Thoroughly Modern Museums and Libraries

August 31, 2009

I think I get it now.  I had thought the term assessment meant a systematic and appropriately rigorous measurement of a construct or phenomenon of interest, like program outcomes, community needs, service quality, and so on.  Only now have I come to understand that a self-assessment is a different animal altogether. Who would have thought that the purpose of a self-assessment is not really to assess anything?  The purpose, I now realize, is to inform and educate. All this time I have been applying research methodology standards to tools that are intended to advocate and indoctrinate. No wonder my observations have been so off-base!

When I disapproved of WebJunction’s online competencies assessment questionnaire (see my April 22, 2009 entry), the WebJunction staff explained to me that the true objective for their surveys was to increase awareness of these competencies. I immediately wondered, “Well, how then will WebJunction measure awareness?”  But that is quite an irrelevant question when these questionnaires are actually teaching tools, not measurement instruments. Since the instruments don’t really have to measure anything, we don’t have to obsess about how reliable or valid they are. They can be evaluated (I guess) according to how well they apply proven methods for facilitating adult learning.

The irony of using a research instrument like a survey questionnaire this way will probably escape the majority of librarians (i.e., those who disliked library school research methods class). But here’s the story: One of the giant problems in designing behavioral science measures is making sure the measures don’t alter the thing you’re trying to measure. Measures are supposed to be unobtrusive. You would never trust a thermometer if you found that, while measuring the temperature of water, the thermometer also happened to heat the water! The same goes for questionnaires and tests in behavioral science and education.

Worries like this are old hat nowadays. Forget the antiseptic, hands-off approach. So easy and cheap to post online, the new questionnaires are designed to induce change by informing, educating, and motivating respondents. I ran across another one of these in connection with a new initiative on “21st century skills” launched last week by the Institute of Museum and Library Services (IMLS). This campaign presents a thoroughly modern take on the mission of libraries and museums. You can read the details and access the “self-assessment tool” here.

Still stuck in my 20th century research methodology paradigm, I found the IMLS questionnaire technically interesting. It is what I call a “Goldilocks instrument” since it uses a 3-point ordinal scale that amounts to a little, a medium amount, and a lot. The response options are something like this:

  1. The institution rarely practices such-and-such 21st century skills enhancement task or technique
  2. The institution practices the task or technique fairly often, or
  3. The institution almost always practices the task or technique.

In several questions in the survey, this tripartite scale appears as less than 25% of the time, 25% to 75% of the time, and over 75% of the time. But you get the idea—small, medium, large.

Specific questionnaire items address a series of general institutional dimensions like accountability, leadership, partnerships, and so on.  (See the self-assessment tool matrix.)  Then, in each area, the institution is rated as being in one of three developmental stages:  Early, Transitional, or 21st Century. An institution’s Goldilocks responses fall conveniently into these stages (surprise!!).  If you perform a 21st century skill enhancement task less than 25% of the time, you are in the Early (Neolithic?) stage on that one.  If you perform it more than 75% of the time, you are thoroughly modern!
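
The scoring logic described here is simple enough to sketch in a few lines. What follows is only an illustration, not the IMLS tool's actual code: the function name `stage_for` is hypothetical, and the handling of responses at exactly 25% and 75% is an assumption, since the wording quoted above does not settle it.

```python
def stage_for(percent_of_time: float) -> str:
    """Map how often a practice is performed to a developmental stage.

    Thresholds follow the post's description (under 25%, 25% to 75%,
    over 75%). Treating exactly 25% and 75% as "Transitional" is an
    assumption made for this sketch.
    """
    if not 0 <= percent_of_time <= 100:
        raise ValueError("percent_of_time must be between 0 and 100")
    if percent_of_time < 25:
        return "Early"
    if percent_of_time > 75:
        return "21st Century"
    return "Transitional"


for pct in (10, 50, 90):
    print(pct, stage_for(pct))
```

The only thing that determines the output is the response itself. The function re-labels an answer; it does not compare the institution to anything outside the questionnaire.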

At the completion of the questionnaire, the self-assessment tool simply parrots back an institution’s responses in graphical form. There are “Recommendations” buttons users can click on, but the advice offered is pretty much the same, regardless of an institution’s rating: Use the results “to initiate a dialogue with your institution’s leaders, board, colleagues, and other stakeholders” so you can improve your rating. In Goldilocks measurement terms, having the most 21st century skills possible is always “just right!”

Obviously, the survey is a teaching tool, not an assessment. That’s why there is no need for the instrument to gauge how libraries and museums compare to any independently derived standards, like some “minimum recommended daily allowance” of a particular 21st century practice. This makes things much simpler for IMLS because the idea of library or museum standards, itself, is notoriously tricky.  Several of the approaches endorsed in their model don’t apply to many institutions.  (How can a small rural library or a historic police museum be collaborating with community partners on its new educational programs “over 75% of the time”?)

Fortunately, these types of measurement issues are immaterial.  Remember, this is not assessment.  It is education and proselytizing.  In fact, the IMLS self-assessment tool demonstrates one 21st century skill enhancement technique first-hand. As described in the project report, the tool is clearly interactive audience involvement! Rather than posting the questionnaire merely to measure something, IMLS is modeling the behavior they are seeking from museums and libraries.  I think it’s called “showing by doing.”

Categories: Library assessment · Measurement · Research

Cha-Ching!

August 14, 2009

I noticed that yet another library value calculator has appeared on the scene. This one is offered by the National Network of Libraries of Medicine (NNLM) with the very best of intentions, I am sure. But, let me say that I am convinced that these calculators are a bad idea. Their underlying assumptions are weak and their designs are not well thought out. Eventually, library funders and stakeholders are going to realize that the calculations are superficial and…well…sloppy.

For one thing, sound cost-benefit analysis requires an examination of the full extent of relevant costs and benefits of a given project, program, or service. These quick-and-easy library calculators, however, use average retail prices as proxies for benefits. This oversimplification ignores important sources of library value like contributions to student and life-long learning, scientific and academic research, and public discourse, as well as roles libraries play in imparting cultural and humanitarian values and traditions, promoting literary appreciation and aesthetic values, facilitating community cohesion, and so forth.

But say that, for practical purposes, we accept the idea that value-boils-down-to-price as reasonable. Even so, the retail pricing approach these calculators use has definite problems. The calculators view retail prices as estimates of costs that patrons would incur if the library’s items and services were, hypothetically, unavailable to the community or institution. The library comes up with a retail price for each type of material and service it offers, and then these prices are translated directly into the value patrons receive from utilizing these materials or services.

In many cases, however, the alternative to obtaining an item or service from the library is not an outright purchase at retail prices. A student might purchase a textbook for $125 and then later re-sell it on Amazon.com for $50. Or perhaps she buys the item at a used price or borrows it from a friend for free. Clearly, a variety of alternative patron scenarios are possible, meaning that there is a range of alternative costs (approximate values) associated with each item or service use. The average of these ranges will typically be less than an item’s retail price. Besides, an item borrowed from a library does not include the breadth of rights and conveniences that item ownership does. So, it is a stretch to say that a patron always enjoys the same benefit from a borrowed item as from a purchased one.
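
A rough sketch of this point, using the textbook figures above. The used-copy price and the equal weighting of the four scenarios are made-up assumptions for illustration only; real proportions would have to come from data.

```python
RETAIL_PRICE = 125.00   # from the textbook example above
RESALE_PRICE = 50.00    # from the textbook example above
USED_PRICE = 70.00      # hypothetical; no used price is given in the post

# Net cost to the patron under each alternative to borrowing from the library.
alternative_costs = {
    "buy new, keep": RETAIL_PRICE,
    "buy new, resell": RETAIL_PRICE - RESALE_PRICE,
    "buy used": USED_PRICE,
    "borrow from a friend": 0.00,
}

# Equal weighting is an assumption; the real mix of scenarios is unknown.
average_cost = sum(alternative_costs.values()) / len(alternative_costs)

print(f"Retail price:             ${RETAIL_PRICE:.2f}")
print(f"Average alternative cost: ${average_cost:.2f}")
```

Under these assumptions the average alternative cost comes to $67.50, well below the $125 retail price a calculator would credit.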

Other problems with the calculators make their output suspect. For example, each time a patron renews an item or re-uses it in-house or online, the item’s retail price gets credited—again—to the library’s value totals. (Cha-ching!) On the other hand, when our Amazon.com shopper purchases a book at $75, that book’s value does not increase to $150, then to $225 and beyond each time the owner opens the book, or with each 3-week library loan period that passes.
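
The renewal problem can be shown with a few lines of arithmetic. This is a sketch of the tallying behavior as described in this post, not the code of any particular calculator; both function names are hypothetical.

```python
def calculator_value(retail_price: float, checkouts: int, renewals: int) -> float:
    """Value as the calculators tally it: every checkout and every
    renewal is credited with the full retail price."""
    return retail_price * (checkouts + renewals)


def purchase_equivalent_value(retail_price: float, checkouts: int) -> float:
    """Value if each initial checkout stands in for one purchase and
    renewals add nothing, as with a book the patron owns."""
    return retail_price * checkouts


price = 75.00  # the book price from the example above

# One loan, renewed twice.
print(calculator_value(price, checkouts=1, renewals=2))
print(purchase_equivalent_value(price, checkouts=1))
```

The first figure is $225 and the second is $75, which is the same inflation the Amazon.com comparison describes.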

Because the calculators tally only certain types of transactions, they end up painting a rather rosy picture of library performance. Consider the case of a patron who needs an item or service that is (really) not available from the library, and whose information need ultimately goes unmet. And the case of a service delivered that fails to meet a patron’s need, such as an unproductive reference consultation. The first case won’t be tallied at all by these calculators, and the second case will be tallied but will be significantly over-valued. (It will be considered a complete success.) Yet, the actual value of both of these patron transactions is negative and should be entered into these calculators this way. Unfortunately, the calculators’ designs do not accommodate this.

Given these problems and oversights, it is fairly obvious that these calculators produce exaggerated estimates of the benefits which libraries provide. Perhaps this exaggeration is only moderate or perhaps it is substantial—we cannot really tell for sure.

The calculators also underestimate the cost side of the equations, causing their benefit/cost ratios to be even more over-stated. They ignore several key costs incurred in delivering library materials and services, including expenses for information technology, equipment, building maintenance, utilities, and administrative overhead. These calculators also disregard the incidental costs that patrons may bear, like travel and parking costs, time lost due to item unavailability or poor service, usability difficulties encountered, and so on. In fact, NNLM’s calculator errs in the opposite direction: Assuming that libraries are always convenient, the calculator builds a patron time-savings factor into its formula. (I suppose you could enter in negative numbers to register patron lost time and inconvenience.)

When the calculators do recognize costs, they end up settling for data that are the grossest of estimates. For instance, users can enter estimated percent of total library staff time spent supporting access to materials or services. Creators of the calculators seem unaware that accurate benefit/cost ratios require meticulous collection of operational data, not just convenient guess-timates.

You will be hard-pressed to learn about these shortcomings from the materials that accompany library value calculators. Mostly, libraries receive general guidelines for entering data and encouragement to use the calculators without reservation. The library just keys in its data and—voilà!—receives an exact return-on-investment percentage or benefit/cost ratio right on the spot! Given the casual assumptions the calculations entail and the inexactness of the library’s input data, you’d think the final answer would at least include some type of margin-of-error disclaimer. Maybe something like this:

Your library’s benefit-to-cost ratio = $8.20 per $1.00 cost*

*    Based on our calculations, we are 95% confident that your library’s benefit/cost ratio is between $4.50 and $12.50 (per $1.00 cost). If your data are especially inaccurate, this range will be larger. Note that our single $8.20 estimate may be high due to assumptions our model uses.

Needless to say, this kind of small print doesn’t appear in the instructions that come with library value calculators. As they are, the calculators generate figures that are precise to the penny, with no other explanations to speak of. Libraries confidently report the figures to stakeholders as accurate, authoritative, and nearly approaching Scientific Truth. Of course, the figures are nothing of the sort.
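
Even a crude range would be more honest than a single figure. The sketch below is simple worst-case/best-case arithmetic on hypothetical inputs, not a statistical confidence interval like the one imagined in the disclaimer above; the function name and all dollar amounts are invented for illustration.

```python
def ratio_bounds(benefit_low, benefit_high, cost_low, cost_high):
    """Return the (lowest, highest) benefit/cost ratio consistent with
    the given ranges. The lowest ratio pairs the smallest benefit with
    the largest cost; the highest does the reverse."""
    if cost_low <= 0 or cost_low > cost_high or benefit_low > benefit_high:
        raise ValueError("ranges must be ordered and costs must be positive")
    return benefit_low / cost_high, benefit_high / cost_low


# Hypothetical inputs: a library that is unsure of both totals.
low, high = ratio_bounds(
    benefit_low=750_000, benefit_high=1_250_000,
    cost_low=100_000, cost_high=150_000,
)
print(f"Benefit/cost ratio: between ${low:.2f} and ${high:.2f} per $1.00 cost")
```

With these made-up inputs the ratio could be anywhere from $5.00 to $12.50, which shows how quickly modest uncertainty in the inputs widens the answer.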

Clones of these library calculators have sprouted up on dozens of library websites, where patrons are invited to enter their custom data to receive their own monthly “value of library services.” Costs are typically not mentioned, so that final value calculations are simple multiplications of counts times arbitrary and often fanciful retail price estimates. Of course, the nifty and optimistic totals will delight library patrons. The totals might even please the population of nonusers who are happy to subsidize library use by others as an overall benefit to the community or institution.

On a few public library websites the calculations are made even more tantalizing by informing patrons about their “individual return-on-investment”—how much value they gain for every tax dollar they contribute. (Don’t you just love democracy!) Unfortunately, this approach casts the wrong light on the public value of libraries. First, the figures are further exaggerations because they use per capita revenue data. Not every public library user pays taxes, a fact that makes the individually quoted return rates artificially high. (Instead of library tax revenue per capita, the calculations should use tax revenue per tax-paying household.)
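
The effect of the per capita shortcut is easy to see with invented numbers. Everything in this sketch is hypothetical, including the function name and every figure; it only illustrates how the choice of denominator changes the quoted rate.

```python
def individual_roi(personal_value: float, tax_revenue: float, payers: int) -> float:
    """Value received per tax dollar, where the patron's 'investment' is
    the library's tax revenue divided evenly among `payers`."""
    return personal_value / (tax_revenue / payers)


# All figures below are hypothetical.
tax_revenue = 2_000_000        # annual library tax revenue
population = 50_000            # everyone in the service area
taxpaying_households = 20_000  # those who actually pay the tax
personal_value = 400.00        # one patron's calculator total for the year

per_capita = individual_roi(personal_value, tax_revenue, population)
per_household = individual_roi(personal_value, tax_revenue, taxpaying_households)

print(f"Per capita basis:    ${per_capita:.2f} per tax dollar")
print(f"Per household basis: ${per_household:.2f} per tax dollar")
```

Here the same patron's return falls from $10.00 to $4.00 per tax dollar once the revenue is spread over tax-paying households rather than the whole population.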

Second, these seemingly benign “value” calculations actually hide information. The websites fail to provide overall return-on-investment rates for all tax-paying households or for all tuition-paying students. As I have already noted, an individual patron’s rate of return is being subsidized by nonusers of library services. For every patron elated with his own personally-calculated rate, there will be several households or students whose return rates are less than $0, meaning they lose money on their library tax or tuition “investments.” (This mix of return rates also applies to using the vanilla versions of the calculators that don’t bother to factor costs in.) Omitting this larger picture makes these presentations slanted and misleading, something that libraries should not be involved in.

From a public or institutional value perspective, these Library 2.0-inspired patron calculators completely sidestep the rightful purpose of library evaluation. This purpose is to assess the extent to which the library provides value to the institution or community as a whole, not how each individual fares. This assessment must also confirm that products and services are equitably distributed, that is, equally available and accessible to all who wish or need to use them (see Creating Public Value by Mark H. Moore).

In actuality, economic valuation is not so simple as it appears. It involves complicated (and frustrating) concepts like exchange value, use value, contingent value, and others. Even business corporations have misgivings about standard return-on-investment analysis because of how difficult it is to obtain reliable data to input into the formulas.

If we want to use purely monetary estimates of the value of our services, we need more rigorous methods than these makeshift library calculators. This exact advice was offered to us a couple of years ago by Donald Elliott and Glen and Leslie Holt in their book Measuring Your Library’s Value. Their work provides important guidance that we should be heeding. Like the fact that benefit/cost valuations are unique to the communities and institutions from whence they come. The figures are really not comparable across communities or for different libraries. This is something that most of us would not have thought about. The central message from their book, though, should already be obvious to us: We can’t just make these benefit/cost numbers up, the way these calculators do. There have to be sound theoretical and empirical bases for our findings.

Sure, quick-and-dirty estimations might be helpful in certain situations, as long as they are recognized for what they are. But the numbers gushing from these library calculators are nonsensical and disingenuous, in many cases. The whole idea has become an impediment to the real work of assessing library value. When the batteries in these little pocket library calculators wear out, I recommend that we just not replace them.

Categories: Library assessment · Measurement · Research

Shorter

July 20, 2009

You may not want to spend time reading this blog post.  It’s rather long and drawn out and is likely to be dull.  And it gets kind of complicated. Besides, the graphics are sparse and uninteresting. Plus there’s no video.

Instead, you might appreciate some other informational experience better, one that happens also to be thoroughly cool and engaging. Like Facebook walls or those omnitemporal slice-of-life Twitter tweets.

This post definitely is not slice-of-life. Hardly. It is conceptual, meaning that it is mostly tedious and definitely time-consuming.  It entails plodding through the text to see if any of the ideas make any sense. And even if they do, you have to figure out whether they are at all relevant. Worse, the topic could be one of those god-awfully amorphous ones that have no clear, calculable bottom lines, like conundrums or Zen Buddhist koans.

Well, since you’re reading this paragraph, you must have free time on your hands.  So, I’ll tell you that the title of this post is from a National Public Radio essay by commentator Mark Allen.  Allen recounts how his boss insists that Allen send him only brief, concise email messages.  The boss apparently realizes that life is too short to get bogged down in details.  Or, God forbid, in the subtleties of precision, meaning, and context. Too much information. Shorter. Allen says that for people like his boss who subscribe to the Utopian vision of Life 2.0, “speed and brevity are obviously more important than facts, words, or information.”

Nowadays it is a social faux pas to communicate in long sentences with colleagues, friends, and family.  It’s self-indulgent, counter-productive, and so 20th century!  (Actually, I like to think of it as so 18th century since that’s when expository writing actually sprang up.)

Every so often, though, brevity and simple-minded factoids end up being extremely dangerous. I am thinking of the 2003 Columbia space shuttle accident that killed seven astronauts and crippled the NASA shuttle program.  The (I apologize) details about the role that sound-bite-like thinking played in this tragedy can be seen in the thoughtful work of data-presentation expert Edward Tufte.

Bottom line—the format in which information is presented has a gigantic effect on the information itself.  Marshall McLuhan’s famous quote ‘The medium is the message’ said essentially this.  Bottom lines filter out lots of information and it is never clear what crucial data have gotten omitted.  (Listen to the NPR audio to hear how the print version strips out information that is otherwise embedded in the single spoken word “shorter.”) In the case of textual information, simplified formats lead to simplified information.  Complicated ones enable the presentation of more complex and richer information.

Thankfully, the engineering details of space shuttle systems can be fairly well specified, as Tufte points out.  The task just requires ample formats for text, formulas, performance data, and diagrams that permit the exploration of the information, including its obvious and latent interrelationships.  And, of course, a commitment to studying and analyzing the information systematically.

Putting too many time and space restrictions on information distorts the information. But, as Allen notes, managers on a mission want bottom line answers.  They inhabit the world of perpetual motion and decisive action, not contemplation and analysis.  When a manager is seeking a tree, the forest can only be an aggravation.

Tufte has an almost scriptural response to the temptation to oversimplify important phenomena:  “It’s more complicated than that.” Which I will supplement with this verse: “Woe to the manager who under-contemplates a really important decision.”

All too often Tufte’s adage applies to informational practices in businesses and in public institutions, including libraries.  Short, over-simplified answers typically misrepresent the real situation. And they tend to justify the conduct of business as usual.  Responsible and effective public management (that is, stewardship of the public’s resources), however, requires a commitment to analyzing and digesting operational, performance, and environmental data, recognizing where informational gaps exist, identifying possible connections, looking for underlying logic, structure, and trends, and determining what relevant conclusions or generalizations can justifiably be drawn from these details. 

But all of this is a big hassle when there is more pressing work to be done.  Work like hunkering down to absorb library budget cuts, re-allocate staff, pare down materials costs, pay utility bills, deal with unions, and so on.  When we have more time, we’ll study our data to inform our decisions, and maybe even refine what we collect.  But right now we’re in a time crunch!

Categories: Library assessment · Measurement

Library Assessment 101

May 18, 2009 · Leave a Comment

I want to communicate what I believe is the single most useful message about library assessment. This is not an announcement of a new data analytic technique or some all-purpose library value calculator. Nor is it advice on the importance of aligning work and measurement with vision and strategy, recognizing the political pitfalls of evaluation, or solidifying an annual assessment plan.

All of these are secondary to one fundamental step. But this step is a giant one: Libraries must become “self-evaluating organizations.” The importance of this dawned on me (again) when I heard a librarian describing how her library used customer surveys to rethink their service approach. I realized it was not their survey questionnaire nor the planned service changes that mattered. It was their whole mindset that made the difference. They had a willingness to be inquisitive and exploratory, to be logical and systematic, to question comfortable assumptions, to look for unexpected answers, and to act on what they learned.

The term “self-evaluating organization” comes from a classic article by the late political scientist and pioneer in the field of public administration, Aaron Wildavsky (citation below). There are other more current and stylish terms for ideas in this same vein. Libraries must develop a culture of assessment, become learning organizations, pursue performance excellence, and practice evidence-based decision-making.

But these are all variations on a single basic theme: To do effective evaluation, libraries have to want to improve. They must seek out unbiased information about what needs to be done, how best to do it, how they are doing it, and what actual results their efforts produce. They must objectively examine their operations and accomplishments as well as any unintended consequences that might result from these. Libraries need to view their successes and failures, including fortunate circumstances and missed opportunities, impartially and non-defensively. Most of all, they must be willing to confront the errors of their ways and be prepared to change their operations based on evaluation results. (Sorry for the dramatic phrasing, but that is the crux of a learning organization. Detecting shortcomings and fixing them!)

To improve their performance libraries need fair and balanced assessments of that performance. Wildavsky took this even further:

“The ideal organization…would continuously monitor its own activities so as to determine whether it was meeting its goals or even whether these goals should continue to prevail. [It] would have no vested interest in continuation of current activities.” Wildavsky, A. (1972). The Self-Evaluating Organization. Public Administration Review, 32(5), p. 509. I added red-highlighting for emphasis.

He understood that commitment to cherished beliefs is an impediment to evaluation. A thoughtful attendee at a recent PLA conference helped me understand this idea more clearly. She explained to me that librarians feel passionate about their ideas and pursuits. Pounding her fist on her chest, she said, “They speak from the heart out of commitment.” To take a dispassionate view of things would be out of character for this profession. Librarians want to be in the middle of the fray, getting things done and making a difference. (This explains some of the field’s exuberance for new technologies. It is oh-so-easy for a person with a hammer to see everything as a nail.)

How ironic it is that dedication might actually be an obstacle to effectiveness! Yet, it is true that subjectivity can make us victims of tunnel vision and hide our misconceptions from us. Stephen Jay Gould expressed this in his book, Full House: The Spread of Excellence from Plato to Darwin:

The most erroneous stories are those we think we know best—and therefore never scrutinize or question.  (p. 56)

This is why Amos Lakos and Shelley Phipps preach that effective assessment requires significant change in a library’s organizational culture. Libraries need to be willing to critique everything they do. Without a zeal for self-evaluation, we are not ready for assessment. Better to not waste time dabbling in something that we are not able to take seriously.

Categories: Library assessment

One Size Doesn’t Fit All

May 7, 2009 · Leave a Comment

A basic tenet of public librarianship is the idea that each library and its communities are unique.  While libraries share certain characteristics, their products, services, and operations are (in theory) highly customized to fit local conditions. I didn’t realize how strong a tenet this was until I heard this declaration at an Ohio Library Council conference:  “All library excellence is local.”  Wow, pretty unequivocal!  Granted, public libraries do acknowledge that they have certain things in common with other libraries, but it sure sounds like unique characteristics trump everything else.

This contrast between things standard and things tailored (or customized) turns out to be a theme central to evaluation research also.  The idea has been noted, for instance, by Mark Lipsey, co-author of the leading textbook on program evaluation:

“One of the difficulties in evaluating a specific program is that [there is] little basis for knowing which aspects of the program work in relatively predictable ways and which are very distinctive to that particular program situation. A given intervention…may be known to have positive effects when used with some client populations but [not for others].  Similarly, one variation of a service may be effective, but that may not be true of another variation, especially when applied in a different program situation.” Lipsey, M.W. (2000). Meta-Analysis and the Learning Curve in Evaluation Practice. American Journal of Evaluation, 21(2), p. 209.

In Lipsey’s quote, just replace “relatively predictable” with “standard” and replace “distinctive” with “custom” or “tailored.”

Here’s the same idea from the Kellogg Foundation’s evaluation handbook:

“All too often, conventional approaches to evaluation focus on examining only the outcomes or the impact of a project without examining the environment in which it operates or the processes involved in the project’s development. Although we agree that assessing short- and long-term outcomes is important and necessary, such an exclusive focus on impacts leads us to overlook equally important aspects of evaluation–including more sophisticated understandings of how and why programs and services work, for whom they work, and in what circumstances.” W.K. Kellogg Foundation Evaluation Handbook, p. 20.

Suppose that our profession produces a rigorously conducted outcome evaluation of, say, summer reading programs and the study affirms the effectiveness of these programs.  Then, what claims can be made about library summer reading programs nationwide?  Can we boast that this effectiveness applies to any and every public library summer reading program and attendee group?  Experts from the field of program evaluation tell us otherwise.

Only to the extent that a library’s summer reading program matches the content and delivery approach of the programs in the outcome study, and the library’s clientele also matches those in the study–only to these extents can a public library point to the outcome study as evidence of its local program’s effectiveness.

Public libraries view their attunement to the nature and needs of unique communities as the foundation for their excellence and effectiveness. This puts the onus on libraries to demonstrate how well their custom practices work for their local clientele. Pretty tall order.

Categories: Library assessment · Research