Date: Thu 03 Mar 2019
This week we're reading a piece on the ebook market, by Joseph Sanchez (2015), and one on the academic journal market, by Bosch, Albee, and Henderson (2018), and how each impacts library budgets.
In order to understand Sanchez's article, we need to address some copyright topics, and so in the next part of this lecture, I'll talk about copyright and the first sale doctrine and how digital works have disrupted some basic ways that libraries function. The article by Bosch, Albee, and Henderson discusses a similar case among academic libraries, but with academic or scholarly journals as the focal point, and here I'll also address some of the citation metrics they discuss.
Copyright law grants a monopoly to the owner of an intellectual property, whether that owner is a person or an organization. That is, the copyright owner has exclusive rights over the material that they own. Section 106 of the U.S. copyright law grants copyright owners the following rights:
(1) to reproduce the copyrighted work in copies or phonorecords;
(2) to prepare derivative works based upon the copyrighted work;
(3) to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;
(4) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;
(5) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and
(6) in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission.
These exclusive rights are all-encompassing and are intentionally designed to give copyright owners a monopoly over their property. In principle and under some constraints, this is a good thing. However, there are some implications that we should consider.
In short, if those exclusive rights were applied without any limitations, then paying a copyright holder for a physical book, a physical DVD, etc. would not entail a transfer of ownership of that physical copy. That is, the buyer of the physical item would have no right to distribute it after that first exchange. Under such a scenario, libraries would be able to buy physical books but would not be able to lend them. Sounds ludicrous?
The first sale doctrine helps avoid the problem created by the full-blown list of exclusive rights that come with copyright ownership. Because of the first sale doctrine, established as precedent in the early 20th century and then codified into law in 1976, you, I, or a library may buy a physical copy of a work, such as a book, a DVD, or a painting, and literally own that specific copy. The first sale doctrine does not grant us reproduction rights, as they are listed in Section 106 of the copyright law, but it does allow us to distribute the single physical embodiment of the work that we have purchased. This is why libraries were able to thrive throughout the 20th century, lend material, and preserve it. More mundanely, it's also why I can buy a book at a bookstore and, after I've finished it, give it away or sell it to someone, such as a used bookstore.
The digital medium makes things messier, though. There are two big reasons for this. First, digital works are not subject to the same distribution constraints that physical works are subject to, and the first sale doctrine is all about distribution rights, not reproduction rights. If I have a physical copy of some book and give you my copy of that book, then I no longer have that copy. However, if I have a copy of a digital file, then as we all know, it's relatively trivial for me to share that file with you without losing access to my own copy. Sharing a file this way creates a new copy, which is reproduction rather than distribution. Since digital works can be copied and distributed without anyone losing access to their copies or even to the original, the first sale doctrine does not apply. In the digital space, there are far fewer natural limitations on supply, including on lending.
Second, many digital works are like software, or at the very least, they are fully intertwined with the software needed to display them. This is true for all kinds of documents, such as HTML pages in the browser or audio files played through a media player. But let's consider ebooks as an example. Ebooks come in all shapes and sizes. Project Gutenberg distributes ebooks that are in the public domain in various file formats: plain text documents that have no presentation markup like bold or italics, HTML documents with markup, XML-based formats like EPUB, and also PDFs and others. Why so many file formats? Text is text, right? In the print space, a book is simply printed on a page, even if it's sometimes printed on different size pages or with different typesetting. But these various formats exist because they each offer technological or presentational advantages and are often tied to specific pieces of software.
This is especially true for proprietary file formats, like the ones that Amazon created for use only on Kindles, or the popular MP3 file format for audio recordings, which only recently became patent free. Depending on how we define software, file formats like these may be better described as data structures than as software, but it is certainly true that file formats and the specific software applications that display them can be completely intertwined. If you are old enough, you may remember the headaches caused by .doc files created in some early version of Microsoft Word that failed to display properly in a later version of Word or in some other word processing software.
In short, these complexities introduce obstacles to the first sale doctrine and raise other copyright issues because of the connection to software, which is also often copyrighted. The result is that copyright holders and publishers have little financial interest in selling actual digital copies of works, since they cannot prevent future sales without limiting supply. Instead, they are more motivated to license material and sometimes to tie that material explicitly to specific pieces of software and hardware, such as the Kindle, which would also have to be bought.
What does this mean for libraries in the digital age? It means that libraries buy less and rent or license more, and renting means that they continually pay for something for as long as they want access to it. As Sanchez (2015) puts it, "At its simplest, this takes the form of paying x dollars per year per title during the length of the contract." When the total supply of works increases, e.g., when the total number of published books increases, as it does each year, it means renting more and more without ever completely acquiring anything. When budgets are cut or are stagnant, this ultimately means a decline in the collection a library has to offer, or if not a decline in the collection, then cuts in some other areas of a library, such as the number of librarians or other staff. This is the unique conundrum that Sanchez raises in his article.
If that alone were the issue, maybe librarians could figure out other sustainable ways to proceed, but Sanchez raises additional issues and questions. What if publishers raise the prices for digital content at a faster annual rate than they already raise prices for print content (a reasonable assumption)? If so, does that mean that librarians will be able to afford fewer titles, digital or print, unless their budgets increase? And, as they weed, how would that impact the physical space of the library? See figure 2.3, specifically, from Sanchez's article. The plot shows just how much could be lost and how little gained if the forecasts Sanchez discusses pan out.
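To make the budget squeeze concrete, here is a minimal sketch of the "x dollars per year per title" model under a flat budget. The budget, starting fee, and rate of increase are hypothetical numbers chosen for illustration; they are not taken from Sanchez's article, and the function name is my own.

```python
def titles_affordable(budget, fee_per_title, annual_increase, years):
    """Return how many titles a flat annual budget can license in each year.

    All inputs are hypothetical; none come from Sanchez's data.
    """
    counts = []
    fee = fee_per_title
    for _ in range(years):
        # Whole titles only: a library cannot license a fraction of a title.
        counts.append(int(budget // fee))
        # The per-title license fee rises by the same percentage every year.
        fee *= 1 + annual_increase
    return counts


# Hypothetical scenario: $100,000 flat budget, $20 per title per year,
# and a 5% annual price increase over five years.
for year, count in enumerate(titles_affordable(100_000, 20.0, 0.05, 5), start=1):
    print(f"Year {year}: {count} titles")
```

Even with a modest rate of increase, the number of licensed titles shrinks every year, and because nothing is owned, titles dropped from the license are simply gone from the collection.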
Keep all of this in mind as you process Sanchez's article. You can even connect it to some discussions you've already had about accessing digital content. Specifically, there are many ways to constrain the supply of an item in the digital landscape, whereas there are fewer ways to limit supply in the physical space. That is, it's relatively easy for publishers and others to restrict the supply of physical works. They simply have to limit how many of those physical works are manufactured (e.g., the number of print runs). But given the nature of digital content, restricting supply is driven by the technologies available to do so, and since there are so many publishers and distribution points, each one of these points will often create its own unique type of constraint on supply. The result is a number of confusing methods for limiting supply, even if these limitations are marketed as selling points. In practice, this may mean that only a limited number of people may "check out" a work from a library at one time, or access a database at one time, and so forth. Thus the budget issue has an impact on access and usability.
Although ebooks likely represent the biggest impact on public library budgets, academic libraries are largely concerned with scholarly journals. Like Sanchez (2015), Bosch, Albee, & Henderson (2018) show that the major issue here is that academic library budgets are declining or holding flat, even though prices for journal titles continue to increase and the number of published articles increases too. This points to an interesting phenomenon: although researchers are among the most hurt by the lack of access to research, researchers are also part of the cause of the increase simply because they continue to publish more and more. Ironically, the result of that rate of increase is less access for all or most.
The authors also note that part of the drive to publish includes a drive to publish in so-called prestigious journal titles, where prestigious is determined by how well cited the title is. The authors refer to a few citation-based metrics that the research community uses to determine prestige. These include the long-established Impact Factor, which can be examined in the Journal Citation Reports (JCR) provided by Clarivate Analytics, as well as newer ones, such as the Eigenfactor and the Article Influence Score, which can also be examined in JCR (the eigenfactor.org site is not well updated, at the time of this writing).
One motivation for using some sort of citation metric as the basis for evaluating journal titles is that citation metrics indicate, at some level, the use of the title. That is, a citation to an article in a journal title means, ideally, that the authors citing that article have read the article. Historically, when Eugene Garfield invented the Impact Factor, it was partly intended as a tool for librarians to use in collection management because he recognized this use-based theory of citations.
However, citation metrics should never be the sole or even primary tool for such purposes. While they may provide good information, there are many caveats. First, there are different fields of research, and some fields cite at different rates and at different volumes than other fields, and also for different reasons. This is why, in Table 5 of the Bosch, Albee, and Henderson (2018) article, the cost per cite for journals in the Philosophy & Religion category is so much higher than the cost per cite of titles in other categories. Authors in Philosophy & Religion simply have different citation and publishing behaviors than authors in other categories. Second, citations do not capture all uses of a journal. For example, there are many journal titles that I might use in my courses but may not use in my research, and this is true for many other faculty, yet citation metrics won't reflect that kind of use. The authors refer to altmetrics, which were invented to help capture additional non-citing uses of scholarly products, but altmetrics are still in their infancy and are largely dependent on data sources and scholarly behavior that are problematic themselves. Third, there are various issues with the metrics themselves. The Impact Factor is based on a calculation that is outdated and not a very appropriate statistical measure. The other calculations were created to address that but may have other problems. Fourth, the use of the metrics, regardless of which one, tends to drive publishing behavior: journal titles with higher metrics tend to attract more submissions and more attention, thus driving more citations to them. Thus, citation-based metrics are comparable to a kind of capitalist economic system where, as the sociologist of science Robert Merton noted, the rich get richer and the poor get poorer. The issue then is that prestige, defined in this way, does not necessarily indicate quality, just use.
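As a rough illustration of the third caveat, the two-year Impact Factor is essentially an average: citations received this year to what a journal published in the previous two years, divided by the number of items published in those two years. The sketch below uses made-up citation counts and a simplified version of the calculation (the real one in JCR distinguishes "citable items" in the denominator). I am assuming here that the skew of citation counts is one of the statistical concerns; there are others.

```python
from statistics import median

# Hypothetical citation counts received this year by ten articles that a
# journal published in the previous two years. These numbers are made up.
citations = [0, 0, 1, 1, 1, 2, 2, 3, 4, 86]


def two_year_impact_factor(citation_counts):
    """Simplified Impact Factor: total citations divided by number of items."""
    return sum(citation_counts) / len(citation_counts)


impact_factor = two_year_impact_factor(citations)
typical_article = median(citations)

print(f"Impact Factor (mean): {impact_factor}")
print(f"Median citations per article: {typical_article}")
```

In this made-up case a single highly cited article pulls the average far above what a typical article in the journal receives, which is one reason an average can be a poor summary of citation data.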
The authors also discuss some of the issues with Gold Open Access and the idea that Gold OA may compound the cost problem. This is where authors pay a publication fee, or an article processing charge (APC), once a manuscript has been submitted and accepted by a journal (there are other types of Gold OA cost models). We can do a quick, rough, off-the-cuff calculation to see why this might compound the problem. As an example, PLOS ONE is one of the largest Gold OA journals and charges an APC of $1,595 USD. In 2018, 32 papers were published in PLOS ONE that included at least one author from the University of Kentucky (UK), totaling $51,040 in APCs across the 50 total institutions that were associated with these papers. This amounts to about $1,020 per institution, paid for by the authors and not libraries. For UK authors, this also amounts to over $32,640 spent on APCs (32 * $1,020). This is about $27K more than the average price of the most expensive category, Chemistry, as reported in Table 1 of the reading. I'll leave it at that.
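The rough calculation above can be reproduced in a few lines. This sketch only restates the figures given in the text; the variable names are my own, and the even split of APCs across institutions is the same simplifying assumption the calculation makes.

```python
APC_USD = 1595       # PLOS ONE article processing charge, as stated above
PAPERS = 32          # 2018 PLOS ONE papers with at least one UK author
INSTITUTIONS = 50    # institutions associated with those papers

# Total APCs paid for all 32 papers.
total_apcs = APC_USD * PAPERS

# Assumption: APCs are split evenly across institutions, rounded down
# to whole dollars to match the "about $1,020" figure.
per_institution = total_apcs // INSTITUTIONS

# Rough estimate attributed to UK authors: 32 * $1,020.
uk_estimate = PAPERS * per_institution

print(f"Total APCs: ${total_apcs:,}")
print(f"Per institution: ${per_institution:,}")
print(f"UK estimate: ${uk_estimate:,}")
```

This is a back-of-the-envelope figure, not an audited one: in practice APCs are not split evenly, and some authors receive waivers or discounts.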
In a follow-up video, I'll demonstrate some of the tools used to look at the discussed metrics.
Bosch, S., Albee, B., & Henderson, K. (2018). Death by 1,000 Cuts. Library Journal, 143(7), 28–33.
Sanchez, J. (2015). Chapter 2. Forecasting Public Library E-content Costs. Library Technology Reports, 51(8), 9–15. Retrieved from https://journals.ala.org/index.php/ltr/article/view/5833