Electronic Access

Introduction

Access is the paramount principle of librarianship, and all other issues, from censorship to information retrieval or to usability, are on some level derived from or framed by that principle of Access.

This week we devote ourselves to a discussion of electronic access. Let's begin with Samples and Healy (2014), who provide a nice framework for thinking about managing electronic access. Their framework includes two broad categories: proactive troubleshooting and reactive troubleshooting of access.

  • proactive troubleshooting of access: "defined as troubleshooting access problems before they are identified by a patron". Some examples include:
    • "letting public-facing library staff know about planned database downtime"
    • "doing a complete inventory to make sure that every database paid for is in fact 'turned on'"
  • reactive troubleshooting of access: "defined as troubleshooting access issues as problems are identified and reported by a patron". Some examples include:
    • "fixing broken links"
    • "fixing incorrect coverage date ranges in the catalog"
    • "patron education about accessing full text"

The goal here, as suggested by Samples and Healy (2014), is to maximize proactive troubleshooting and to minimize reactive troubleshooting. The Samples and Healy (2014) report is a great example of systematic study. The authors identified a problem that had grown "organically," collected and analyzed data, and then generalized from it by outlining a "detailed workflow" to "improve the timeliness and accuracy of electronic resource work." Practically, studies like this promise to improve productivity and workflows and to foster job and patron satisfaction. Such studies also help librarians identify the kinds of software solutions that align with their own workflows and patron information behaviors. If interested, I suggest reading Lowe et al. (2021) about the impact of COVID-19 on electronic resource management. Six authors individually describe access issues at their respective institutions and show how issues of pricing, acquisitions, training, user expectations, and budgets affect electronic access. I suggest reading articles like this in light of the framework provided by Samples and Healy (2014) because stories like these, about the impact of the pandemic on electronic access, can help guide us in developing proactive troubleshooting procedures that minimize future issues, pandemic or otherwise, at our own institutions.

Samples and Healy (2014) say something important against a common assumption about electronic resources, particularly those provided by vendors:

The impression that once a resource is acquired, it is then just 'accessible' belies the actual, shifting nature of electronic resources, where continual changes in URLs, domain names, or incompatible metadata causes articles and ebooks to be available one day, but not the next (The Complexity of ERM section, para. 6).

Hence, unlike a printed work from the long-ago print-only era that, once cataloged, may be shelved for decades or longer without major problems of access, electronic resources require constant and active attention to remain accessible. Ebooks, for example, can create metadata problems. What is often important about scholarly ebooks, in particular, is the chapters they include, and hence metadata describing ebook components is important, along with links to those chapters in discovery systems. This difference between item-level cataloging and title-level cataloging, as Samples and Healy describe, can lead to confusing and problematic results when considering different genres and what those genres contain.

Or, note that they discuss how a series of links is involved, starting from the source of discovery, e.g., an OPAC or a discovery layer, and ending at the retrieved item, and how difficult it can be to determine which of these links and which of those services is broken when access becomes problematic.

Let me highlight a few key findings from their report:

  • Workflows: why does this keep coming up? It's because workflows help automate a process, that is, they simplify and smooth out what needs to be done, and because this is only possible when things are standardized.
  • Staffing: we'll discuss this more in another section, but part of the problem here is that ERM has had a major impact on organizational structure, and different libraries have responded differently. This lack of organizational standardization has its benefits regarding overall management practices and cultures, but it also has a huge drawback: it is difficult to establish effective, generalized workflows that include key participants and that minimize dependencies on any one person.
  • Tracking: if there's no tracking, there's no method to systematically identify patterns in problems. And if that's not possible, then there's no method to solve those problems proactively. It becomes all reactive troubleshooting, and reactive troubleshooting, as Samples and Healy indicate, results in poor patron experiences. We'll discuss tracking during the week on Evaluation and Statistics.

We commonly get the line that discovery systems are a great solution to all the disparate resources that librarians subscribe to. Or, if we do think about problems with such systems, we are often presented with a basic information retrieval problem, such that the larger the collection to search, the more likely a relevant item will get lost in the mix. Carter and Traill (2017) point out that these discovery systems also tend to reveal access problems as they are used. The authors provide a checklist to help track issues and improve existing workflows.

Buhler and Cataldo (2016) provide an important reminder that the mission of the electronic resource librarian is to serve the patron. This should remind us that the internet and the web have flattened genres. By that I mean they have made it difficult to distinguish among works like magazine articles, news articles, journal articles, encyclopedia articles, ebooks, etc. Though the Buhler and Cataldo (2016) reading is student-focused, other studies have hinted at the same issue they describe across other populations. It's important, as ERM librarians, to recognize these issues and to work to resolve them in whatever ways you can.

Myself, I grew up learning about the differences between encyclopedia articles, journal articles, magazine articles, newspaper articles, book chapters, handbooks, indexes, and dictionaries because I grew up with the print versions, which, by definition, were tangible things that looked different from each other. Today, a traditional first-year college student was born around the year 2004 and grew up reading sometime in the last decade. The problem this raises is that although electronic resources are electronic or digital, they are still based on genres that originated in the print age, yet they lack the physical characteristics that distinguished one from the other. E.g., what's the difference between a longer NY Times article (traditionally a newspaper article) and an article in the New Yorker (traditionally a magazine article) today in their online forms? Aside from some aesthetic differences between the two, they are both presented on web pages, and it's not altogether obvious, based on a cursory examination, that regular users can tell that they belong to entirely different genres. However, there are important informational differences between the two (how they were written, how they were edited, how long they are, and who wrote them) that might still lead us to consider them as different genres. Even Wikipedia articles pose this problem. Citing an article from a general encyclopedia was never an accepted practice, but this was not true for all encyclopedias. It was generally okay to cite articles from special encyclopedias because they focused on limited subject matters like art, music, science, and culture, and were usually more in-depth in their coverage. Examples include the Encyclopedia of GIS, the Encyclopedia of Evolution, The Kentucky African American Encyclopedia, The Encyclopedia of Virtual Art Carving Toraja--Indonesia, and so forth. There are studies that show that Wikipedia provides the same kind of in-depth coverage as some special encyclopedias, thus helping to flatten the encyclopedia genre, too.

The flattening holds true for things like Google. The best print analogy for Google is that of an index, which was used to locate keywords that would refer to source material. The main difference between these indexes and Google is that the indexes were produced to cover specific publications, like a newspaper, or specific areas, like the Social Sciences Citation Index or the Science Citation Index, both of which are actual, documented, historical precursors to Google and to Google Scholar. But today, these search engines are erroneously considered source material (e.g., "I found it on Google"). Few, I think, would have considered a print index as source material, but rather as a reference item, since it referred users to sources. Nowadays, it's all mixed up, but who can blame anyone?


Access and Authentication

Much of what exists in a library's electronic collections is paywalled; therefore, librarians use software that authenticates users before they gain access. This is generally required in agreements with content vendors.

There are two main technologies used to authenticate users. The first is through an IP / proxy server. EZproxy (OCLC) is the main product in this arena, and in fact we use EZproxy at UK. When we access any paywalled work, like a journal article, we may notice the ezproxy.uky.edu string of text in the URL. For example, the following is an EZproxy URL:

https://www-sciencedirect-com.ezproxy.uky.edu/science/article/pii/S030645730500004X

The interesting thing about this URL is that it has a uky.edu address even though the article is in a journal that's hosted in Elsevier's ScienceDirect database. The www-sciencedirect-com part of the address is a simple subdomain of ezproxy.uky.edu (you can tell because the components are separated by dashes instead of periods). As a subdomain, it is no different than the www in www.google.com or the maps in maps.google.com. The original URL is in fact:

https://www.sciencedirect.com/science/article/pii/S030645730500004X

As opposed to the first URL, the interesting thing about the original URL is that it is in fact a sciencedirect.com address. Even though "sciencedirect" appears in the uky.edu URL, it is not a "sciencedirect.com" server. They are two different servers, from two different organizations, and are as different as uky.edu and google.com.
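
The relationship between the two URLs can be sketched in a few lines of Python. This is only an illustration of the pattern visible in the example above, where the periods in the original hostname become dashes and the result is placed in front of the proxy's domain. The function names are hypothetical, and the sketch assumes the original hostname contains no dashes of its own. It is not a description of how EZproxy itself is implemented.

```python
from urllib.parse import urlsplit, urlunsplit

# Proxy domain taken from the example URL in the text.
PROXY_DOMAIN = "ezproxy.uky.edu"


def to_proxied_url(original_url: str, proxy_domain: str = PROXY_DOMAIN) -> str:
    """Rewrite an original URL into the proxied form shown in the text."""
    parts = urlsplit(original_url)
    # Periods in the original hostname become dashes, giving one subdomain label.
    label = parts.hostname.replace(".", "-")
    proxied_host = f"{label}.{proxy_domain}"
    return urlunsplit((parts.scheme, proxied_host, parts.path, parts.query, parts.fragment))


def to_original_url(proxied_url: str, proxy_domain: str = PROXY_DOMAIN) -> str:
    """Recover the original URL from a proxied one (assumes no dashes in the original host)."""
    parts = urlsplit(proxied_url)
    suffix = "." + proxy_domain
    if not parts.hostname.endswith(suffix):
        raise ValueError("not a proxied URL for this proxy domain")
    # Strip the proxy domain, then turn the dashes back into periods.
    label = parts.hostname[: -len(suffix)]
    original_host = label.replace("-", ".")
    return urlunsplit((parts.scheme, original_host, parts.path, parts.query, parts.fragment))
```

A conversion like this can be handy when troubleshooting, since it lets us test whether a problem lies with the vendor's site (the original URL) or with the proxy (the proxied URL).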

We read an article or some other paywalled content at a uky.edu address and not at, e.g., a sciencedirect.com address because of the way proxy servers work. In essence, when we make a request for a resource, like a journal article or a bibliographic database, that's provided by a library, our browser makes the request to the proxy server and not to the original server. The proxy server then makes the resource request to the original server, which relays that content back to the proxy server (EZproxy), which then sends the content to our browser. This means that when we request an article in a journal at sciencedirect.com or jstor.org, our browser never actually makes a connection to those servers. Instead, the proxy server acts as a go-between. See Day (2017) for a more technical and yet accessible description of the process.

Proxy servers provide access either through a login server or based on the user's IP address. If we're on campus, then our authentication is IP based, since all devices attached to the university's network are assigned an IP from a pre-defined range of IP addresses. This makes access to paywalled content fairly seamless, when on campus.
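
The on-campus check can be illustrated with Python's standard ipaddress module. This is a minimal sketch of the idea of IP-based authentication, not how EZproxy or any vendor implements it. The address ranges below are hypothetical: they are networks reserved for documentation, not any university's real address space, and the function names are my own.

```python
import ipaddress

# Hypothetical campus ranges (reserved documentation networks, not real campus space).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_on_campus(client_ip: str) -> bool:
    """Return True if the client's IP address falls inside a known campus range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CAMPUS_NETWORKS)


def access_decision(client_ip: str) -> str:
    """On-campus addresses pass straight through; all others are sent to a login page."""
    return "grant access" if is_on_campus(client_ip) else "redirect to login"
```

When on-campus access fails for a single resource, one thing worth checking is whether the ranges registered with that vendor still match the ranges the campus network actually uses.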

If we are off-campus, access is authenticated via a login method to the proxy server. When we attempt to access paywalled content from off-campus, we will see an EZproxy login URL. For the ScienceDirect database, it looks something like this:

http://ezproxy.uky.edu/login?url=https://www.sciencedirect.com
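
The login URL above is simply a fixed prefix followed by the address of the resource. As a small sketch, assuming only the form shown in this one example (the function names are hypothetical), the following Python builds such a URL and pulls the target back out of it:

```python
# Login prefix taken from the example URL in the text.
LOGIN_PREFIX = "http://ezproxy.uky.edu/login?url="


def make_login_url(target_url: str) -> str:
    """Prepend the proxy login prefix to the URL of a paywalled resource."""
    # The target is appended as-is, matching the form of the example above.
    return LOGIN_PREFIX + target_url


def extract_target(login_url: str) -> str:
    """Return the resource URL embedded in a proxy login URL."""
    if not login_url.startswith(LOGIN_PREFIX):
        raise ValueError("URL does not start with the expected login prefix")
    # Everything after the prefix is treated as the target URL.
    return login_url[len(LOGIN_PREFIX):]
```

A check like extract_target can help when reviewing links in a catalog or research guide, because a link that lacks the prefix will work on campus but fail for off-campus patrons.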

Aside from ScienceDirect, you can see a list of other subscribed content that requires EZproxy authentication here:

https://login.ezproxy.uky.edu/menu

The second main technology used to authenticate and provide access is based on what is called SAML authentication. The main product that provides SAML authentication for libraries is OpenAthens.

Unlike a proxy / IP authentication process, SAML is a type of identity verification system. Under this method, libraries offer a single sign-on process, and once authenticated, patrons have access to all SAML-ready content or service providers. The process is similar to the Duo Single Sign-On service our university uses for authentication. In the OpenAthens case, users are authenticated via an identity provider, which would be the library. The library provides identification by connecting to its organization's identity management system, such as Active Directory Federation Services (AD FS). Once a patron has been authenticated, a confirmation is sent to the content provider, which then provides access to the content to the patron. For more details, see "What is SAML?" and this detailed OpenAthens software demo.
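
To make the roles in this exchange concrete, here is a toy model in Python. It is a heavily simplified sketch and not real SAML: actual SAML exchanges XML documents signed with certificates, whereas this toy uses a shared secret key and an HMAC signature purely to illustrate that the content provider trusts a confirmation issued by the identity provider rather than checking the patron's password itself. All class names, usernames, passwords, and keys are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    """The confirmation passed from the identity provider to the content provider."""
    subject: str    # who was authenticated
    signature: str  # evidence that the identity provider issued this confirmation


def _sign(subject: str, key: bytes) -> str:
    return hmac.new(key, subject.encode(), hashlib.sha256).hexdigest()


class IdentityProvider:
    """Stands in for the library, which checks patrons against a user directory."""

    def __init__(self, directory: dict, key: bytes):
        self._directory = directory  # toy directory: username -> password
        self._key = key

    def authenticate(self, username: str, password: str) -> Optional[Assertion]:
        if self._directory.get(username) != password:
            return None  # login failed, so no confirmation is issued
        return Assertion(username, _sign(username, self._key))


class ServiceProvider:
    """Stands in for the content vendor, which never sees the patron's password."""

    def __init__(self, key: bytes):
        self._key = key

    def grant_access(self, assertion: Optional[Assertion]) -> bool:
        if assertion is None:
            return False
        expected = _sign(assertion.subject, self._key)
        # Access is granted only if the confirmation really came from the identity provider.
        return hmac.compare_digest(expected, assertion.signature)
```

In this toy, a patron logs in once with the identity provider, and the resulting confirmation could be shown to any content provider that trusts the same identity provider, which is the single sign-on idea described above.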

One of the benefits of this method is that URLs are not proxied, which means that content is not delivered to the patron from a proxy server like EZproxy. Instead, patrons access the original source directly. From a patron's perspective, this makes sharing URLs nicer. As far as I can tell, one of the downsides might be privacy related. With a proxy server, users don't access the original source. Instead, the source is delivered through the proxy server, which, by definition, masks the patron's IP address and browser information. This wouldn't be true under the SAML method.

Conclusion

The Samples & Healy (2014) and the Carter & Traill (2017) articles address troubleshooting strategies with electronic resources. One additional thing to note about these readings is how the organizational structure influences workflows and how the continued transition from a print-era model of library processes to an electronic one remains problematic. Even once that transition is complete, both readings make the case that strategy and preparation are needed to deal with these issues. The Buhler & Cataldo (2016) article shows how confusing e-resources are to patrons and how the move to digital has complicated all genres, or "containers", as the authors name them. Such "ambiguity" has implications not only for how users find and identify electronic resources but also for how librarians manage access to them.

I added the EZproxy and OpenAthens content in order to complete the technical discussions we have had in recent weeks on integrated library systems, electronic resource management systems, link resolvers, and standards. These authentication and access technologies complete these discussions, which, altogether, cover the major technologies that electronic resource librarians work with to provide access to paywalled content in library collections. Both technologies aim to provide seamless access to paywalled content, nearly as seamless as accessing content via a search engine or other source. Although neither will ever be able to offer completely seamless access as long as there are paywalled sources in library collections, the job of an electronic resource librarian is often to make sure they work as well as possible. This will often mean working with vendors and colleagues.

References

Samples, J., & Healy, C. (2014). Making it look easy: Maintaining the magic of access. Serials Review, 40, 105–117. https://doi.org/10.1080/00987913.2014.929483

Carter, S., & Traill, S. (2017). Essential skills and knowledge for troubleshooting e-resources access issues in a web-scale discovery environment. Journal of Electronic Resources Librarianship, 29(1), 1–15. https://doi.org/10.1080/1941126X.2017.1270096

Buhler, A., & Cataldo, T. (2016). Identifying e-resources: An exploratory study of university students. Library Resources & Technical Services, 60, 22–37. https://doi.org/10.5860/lrts.60n1.23

Additional Readings / References

Breeding, M. (2008). OCLC Acquires EZproxy. Smart Libraries Newsletter, 28(03), 1–2. https://librarytechnology.org/document/13149

OCLC. (2017, September 22). EZproxy. OCLC Support. https://help.oclc.org/Library_Management/EZproxy

OpenAthens transforms user access to library resources, replacing EZproxy and IP address authentication. (2021, June 2). About UBC Library. https://about.library.ubc.ca/2021/06/02/openathens-transforms-user-access-to-library-resources-replacing-ezproxy-and-ip-address-authentication/

Botyriute, K. (2018). Access to online resources. Springer International Publishing. https://doi.org/10.1007/978-3-319-73990-8

Day, J. M. (2017, April 25). Proxy servers: Basics and resources. Library Technology Launchpad. https://libtechlaunchpad.com/2017/04/25/proxy-servers-basics-and-resources/

Lowe, R. A., Chirombo, F., Coogan, J. F., Dodd, A., Hutchinson, C., & Nagata, J. (2021). Electronic Resources Management in the Time of COVID-19: Challenges and Opportunities Experienced by Six Academic Libraries. Journal of Electronic Resources Librarianship, 33(3), 215–223. https://doi.org/10.1080/1941126X.2021.1949162