CNI 2018 Fall Membership Meeting

December 11, 2018

tags: data management, digitization, google, libraries, open access, open source, privacy, publishing

CNI was information dense as always. Below are the notes I recorded. Mistakes and leaps in logic are mine, not the speakers’.

It has been clear for a number of years that CNI is growing in popularity; at the opening session Cliff Lynch noted about a half dozen new members who have joined just since the spring membership meeting. For many of us who have been CNI regulars for years, this is not surprising. Cliff, Joan, and the fantastic CNI staff organize stellar membership meetings and in general run a very solid and communicative organization.

Monday

From Talking to Action: Fostering Deep Collaboration Between University Libraries, Museums, and IT

Susan Gibbons, Dale Hendrickson, Louis King, Michael Appleby – Yale U

Gibbons stressed that if these large and established entities can collaborate, it can be done at other places. Her role expanded in 2016 to include the title of Deputy Provost for Collections. That shift set meetings in motion, including all of the museum directors sitting down together. That hadn’t necessarily happened before, but once it did, common issues came to light. The addition of a new CIO also helped push forward the case for collaboration. He wanted IT ‘pillars’ and created one called the Cultural Heritage IT Pillar. It had some money behind it, so if the directors worked together, they could access this money. So, four drivers:

regular meetings of cultural heritage directors
regular meetings of cultural heritage collection and IT reps
financial incentives to collaborate with IT
urgency for museum solution for digital asset management and preservation

King outlined how critical a better solution for that last item–digital asset management–had become. They built something in 2009 on the basis of a commercial platform. What they wanted was a modular system, so if a module was EOL, they could switch out a component without a massive system upheaval. They discovered that as they went across campus, they were working with units of various sizes, so their architecture standardized “on the aggregate,” but allows flexibility at the unit level based on capacity.

They worked out five ways to fund this:

annual subscription – each partner pays to sustain shared services
in-kind support – commit FTEs
commodity charge back – addresses scale differences; use more, pay more
value add investment – subsets of users or groups exploring new opportunities
transformative investment – university commitments or grant funding that really shift the scene

Appleby spoke about discovery across collections. They are now working on metadata integration. He had a nice graphic showing how varied their landscape is, which demonstrated how little convergence there is on standards for museums and libraries. Hendrickson went deeper into creating common discovery tools. Part of this work involves developing personas and doing user experience work and testing. Indexing is also a task; the legacy tool was a vendor product that no longer meets their needs, so they are looking at tools such as Blacklight and others. Beyond that, they will have to develop an interface and also put effort into creating APIs that query the index and allow it to be represented in ways they cannot yet envision.

Lever Press Project Update

Marta Brunner – Skidmore College; Mike Edington – Amherst College; Peggy Seiden – Swarthmore College; Charles Watkinson – U of Michigan

What is Lever Press? A digitally native, stakeholder governed, library funded scholarly press. Also peer reviewed: three single-blind reviewers. Also platinum open access; no charge to author or reader.

60 titles over a five-year pilot. 50 traditional and 10 innovative. Average price per title is $16,700, which is much lower than the average of $23,000 for a monograph.

Why did they do this?

costs for scholarly materials are out of control (books rising at 2x inflation rate)
important segment of higher education just isn’t present (i.e. liberal arts colleges)
libraries are the most important revenue source for university presses, but have no say in publishing outcomes
open access funding that focuses on authors at an institution (and only them) is not equitable

Lever publishes on the Fulcrum platform, which is built on a Samvera/Fedora base and leverages Michigan’s existing stack.

Seiden asked the question: who would want to publish with this press? It’s a risky venture in terms of reputation and careers. Seiden noted that the Ithaka data shows that open access really isn’t a driver of how faculty think about journal publishing, and assumed that this might apply to book publishing as well. So while open access is a good cause, it may not make Lever immediately appealing.

What is appealing? The Fulcrum platform: it allows embedding media. Other advantages:

length of work no longer matters
liberal arts focus
undergraduate focus – collaborative faculty/student research
interdisciplinary approach
peer review
time to publication

Offered a number of thoughts on how Lever supports the liberal arts environment and mission. Also, it makes collections and work from liberal arts colleges more visible and present in the scholarly ecosystem.

Brunner offered the library director perspective. Many reasons to support it, but the administrative needs are real. One challenge is managing the logistics between the editorial board (faculty) and the oversight committee (mostly library directors). Also, it takes ramp up time. After putting money into such a venture, it takes time for books to emerge. They are going to run a membership campaign in the near future.

Towards Interoperable and Equitable Scholarly Communications Ecosystems: Values-based Questions to Ask Infrastructure Providers

Allegra Swift, David Minor – U of California, San Diego

Swift described the general commodification of the scholarly communications landscape. Introduced various elements of the ecosystem in their context that are pushing back against this emerging hegemony.

Minor pointed out that our campuses are contacted by various vendors and that even a given vendor may be sending multiple people to campus. There’s a lot of noise in this space; they’re big companies. He made the simple and powerful point that we tend to see the pieces of the research ecosystem as separate but the vendors do not. They see it holistically and are developing/acquiring tools to address each aspect. Elsevier’s Chi is very clear and unambiguous about this: “All this said, let me make it clear that we are not ever going to take our hands off the content curation. Having the content in a structured and curated way is very important to the analytics business.”

Organizations need to engage in self-reflection, asking the why questions, to expose and articulate their values. Then it’s possible to approach vendors with questions based on these values. They’ve developed checklists and are asking for input on them.

What lessons have they learned:

lack of concrete holistic academic-owned/open source/scholar-led options
commercial dollars vs. academic
dispersion of energy, lack of funding, and lack of ongoing communication after events where we get ourselves excited

Blockchain: What’s Not to Like?

David S. H. Rosenthal

I arrived a bit late, but as I walked in he was pointing out that immutable objects actually run afoul of the European GDPR. He offered a very quick and somewhat hard-to-follow narrative, but in general he was highlighting many of the shortcomings of the technology, among them its environmental impact.

Speaking more frankly after delivering the paper, he noted that in the 2017 “pump and dump” that drove Bitcoin prices through the roof, early adopters took $30 billion in “real money” out of the system by dumping some of their Bitcoin holdings. As he described this, it was clear that he holds such individuals in low esteem, noting that they are fans of the Austrian school of economics and are techno-libertarians. Such people have a strong incentive to hype these technologies to “continue to extract money from suckers.” The takeaway message for me is to pay closer attention to hype cycles and perhaps always be skeptical about them. He pointed out that since some of these people fund institutions, there may not be as many critical or reliable voices on these matters as we might actually want to have. Mentioned David Gerard’s blog as a good source of critical information, as well as his own blog.

Tuesday

Addressing the 20th Century Gap: Controlled Digital Lending by Libraries

Chris Freeland – Internet Archive; Kyle Courtney – Harvard U; Terry Ehling – MIT Press

Freeland started by noting that book scanning has a 20th century problem, meaning that books published after the 1920s (1923 specifically) just aren’t present on accessible platforms, for obvious reasons related to copyright regimes. He then showed a graph of how many new animal names have been published since the 1920s: a considerable number, and an upward curve. This runs dramatically counter to the digitization curve for available works.

Courtney spoke to the legal situation. Noted that in the US Copyright Act, an exception is built in that allows lending (distribution right), aka the first sale doctrine aka exhaustion. One can legally sell items under copyright (or gift them to others): used bookstores, eBay, used record stores, etc. It has also supported lending.

Fair use statutes in the US also support controlled digital lending (CDL), because the nature of the use is for lending (section 109 – first sale/exhaustion). It’s also a non-commercial, temporary usage, so mimics economic transactions that are already allowed and have been for a long time.

He walked through a fairly straightforward scenario that demonstrates how CDL stays within the law, both letter and spirit. He underscored that there is no such thing as no risk; this is a low risk venture, but that’s the best one can hope for in the real world.

Freeland noted that IA actually has a physical collection of 1.5 million volumes of which 800,000 have been digitized, as well as myriad relationships with publishers to allow them to digitize materials. He showed screenshots of their system, highlighting how books show as checked out and are unavailable if the one copy available is in use. It’s not random distribution. One can see the system in action at openlibraries.online.
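The checked-out behavior in those screenshots can be pictured as a simple owned-to-loaned rule: a title never has more simultaneous digital loans than the number of copies held. The following is a minimal sketch of that rule only, not the Internet Archive's implementation; the class and method names (`LendingDesk`, `checkout`, `checkin`) and the sample title are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LendingDesk:
    """Toy controlled-digital-lending ledger (all names are hypothetical)."""

    # title -> number of copies the library owns
    owned: dict[str, int]
    # title -> patrons currently holding a digital loan of that title
    loans: dict[str, set[str]] = field(default_factory=dict)

    def available(self, title: str) -> int:
        """Owned copies minus copies currently on loan."""
        return self.owned.get(title, 0) - len(self.loans.get(title, set()))

    def checkout(self, title: str, patron: str) -> bool:
        """Lend one copy if a copy is free; refuse otherwise."""
        holders = self.loans.setdefault(title, set())
        if patron in holders or self.available(title) <= 0:
            return False
        holders.add(patron)
        return True

    def checkin(self, title: str, patron: str) -> None:
        """Return a loan, freeing the copy for the next reader."""
        self.loans.get(title, set()).discard(patron)


desk = LendingDesk(owned={"Example Title": 1})
first = desk.checkout("Example Title", "patron-a")   # the single copy goes out
second = desk.checkout("Example Title", "patron-b")  # refused: no copy is free
desk.checkin("Example Title", "patron-a")
third = desk.checkout("Example Title", "patron-b")   # succeeds after the return
```

With one owned copy, the second reader is turned away until the first loan is returned, which mirrors the "checked out and unavailable" state described above.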

Ehling spoke to the publisher’s perspective. MIT Press has about 800 titles available via the platform, but they are awaiting data from the system in order to do analysis on how this impacts their operations and sales. For example, they want to offer POD (print on demand) versions of materials for readers who would prefer a physical copy. They want to know if Open Libraries is a positive–driving sales–or a negative for their press. The verdict is not yet clear, so she characterized their participation as an experiment. Digital is the “third rail” for publishers. They are curious about dwell time, how/if users travel between books, etc. With regard to orphaned works, she did say to Freeland “go for it.”

Using Digital Tools & Platforms to Increase Engagement with Articles Published in Scholarly Journals

Seth Denbo – American Historical Assn.; Stephen Robertson – George Mason U

Pointed out early in the talk that this isn’t necessarily about open access. Augmenting articles does not mean that the article itself needs to be out from behind a paywall. It’s also not a tool-building project; they want to leverage existing platforms and tools to do more of this work in the public sphere. They are interested in learning what it would take to help journal editors do this.

Their work happens in coordination with the Scholarly Communication Institute, a Mellon-funded exercise that takes place in the Research Triangle in North Carolina. Teams can propose projects and if successful are invited to come and work together for four days in residence. The idea is also to have projects working side by side so that they can share across project boundaries. Much of what they presented came together during this four-day stint with a project team of six individuals.

Their goal is to map what’s out there and present it to journal editors in a way that helps them imagine how they might move their journals’ work into the public eye. They developed seven categories of practice:

providing additional sources
providing teaching resources
providing information on method
interviewing authors
creating conversations between authors & readers
engaging contemporary issues
creating content for non-scholarly audiences

This can be done via a number of platforms: websites, apps, annotation, podcasts, video, and social media. They suggest specific platforms for each category.

Before they went to North Carolina, they surveyed about 150 editors of history journals. There was clear interest in doing more to mobilize scholarship, but also indications that no one had any time to do this. Their work acknowledges and attempts to address this latter issue. They created a decision tree that walks an editor through simple questions about whom they want to reach and what they want to communicate, leading to a set of recommended practices and platforms.

One of the things they attempt to address is that many institutions, through their libraries and other units, are already offering resources to support and train journal editors to do this work, but that awareness of these resources is uneven.

The two presenters have different ideas about how to build on these resources. Stephen pitched a “virtual intern” project. He acknowledged that unpaid internships raise ethical issues. He attempted to parry those issues, but I didn’t quite follow how that worked out.

Seth went into the direction of consortial or collaborative projects between journals and publishers. Rather than expecting journals to have a podcasting expert, the consortium could provide this expertise, for example.

Protecting Privacy on the Web: A Study of HTTPS and Google Analytics Implementation in Academic Library Websites

Kenning Arlitsch, Scott Young – Montana State U

The talk was the result of an audit of numerous library websites to assess the degree to which they implement principles that could protect user privacy. I’m going to guess that the outcomes of this audit are not terribly encouraging.

Scott covered third-party tracking as a concept, essentially noting that web analytics data is routinely gathered and sent to third parties, either without consent or without explicit consent from visitors to the site. Cookies have become more sophisticated, and they talk to each other across contexts, creating an image of a user which can then be sold and trafficked as a commodity.

What can we do? Consider moving away from Google Analytics; two alternatives are Matomo and Open Web Analytics. If using Google Analytics, turn on IP anonymization. Also consider using opt-out mechanisms. Last: use HTTPS.

Google Analytics is a strong tool and it’s free. The downsides are that user data is passed to Google and that there are inaccuracies and some dubious insights that come from this data. As a community we use Google Analytics, but we don’t necessarily understand or accept the ramifications. Various statements from NISO, IFLA, ALA and others state that we value user privacy, yet we are not always meeting that standard when using tools such as GA.

Privacy has been a longstanding concern for us, but third-party tracking opens up a world that undermines privacy. Our values are in tension with showing our value, which requires analytics.

Research questions:

Do libraries implement HTTPS with proper redirect practices?
Do libraries using GA implement the available privacy protection measures?

Further to the use of HTTPS, it’s critical to have sites configured properly so that all traffic passes via HTTPS. For GA, the questions are whether libraries are using a secure connection to the service and if IP anonymization is activated.
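To make the redirect question concrete, here is a minimal sketch of how one might test a single site. It is not the presenters' audit tooling; the function names are hypothetical, and the live check relies only on Python's standard `urllib`, which follows redirects by default.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def classify_redirect(requested_url: str, final_url: str) -> str:
    """Label the outcome of requesting a plain-HTTP URL.

    Kept as a pure helper so the logic can be checked without network access.
    """
    if urlsplit(requested_url).scheme != "http":
        raise ValueError("start from an http:// URL to test the redirect")
    if urlsplit(final_url).scheme == "https":
        return "redirects to HTTPS"
    return "stays on HTTP"


def check_site(host: str, timeout: float = 10.0) -> str:
    """Request http://<host>/ and report where the redirects end up."""
    requested = f"http://{host}/"
    # urlopen follows redirects; geturl() reports the final URL reached.
    with urlopen(requested, timeout=timeout) as response:
        return classify_redirect(requested, response.geturl())
```

Calling `check_site("library.example.edu")` (a placeholder host) would return one of the two labels. A site that serves HTTPS but answers "stays on HTTP" here is in the group that implemented HTTPS without forcing it.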

They ran this across 279 libraries from 16 countries. The pool consisted of members of one of three organizations: ARL, DLF, or the OCLC Research Partnership. About 62% (173 libraries) had HTTPS implemented, but of those 173, how many redirected non-secured traffic to HTTPS? Only 32% had done so. 88% use Google Analytics, but of those, 85% do not use any of the privacy protection features. 14% use IP anonymization, but only 1% use the library-to-Google HTTPS feature, with 0% using both.
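As a quick sanity check on these figures, the arithmetic can be worked through directly. The derived counts below are my own rounding from the reported percentages, not numbers given in the talk.

```python
total_libraries = 279  # reported sample size
with_https = 173       # reported count with HTTPS implemented

# Share of the sample with HTTPS; this should agree with the reported 62%.
https_share = with_https / total_libraries

# 32% of the HTTPS sites redirect plain HTTP traffic (estimated count).
redirecting = round(0.32 * with_https)
redirect_share_overall = redirecting / total_libraries

# 88% of the whole sample use Google Analytics (estimated count).
ga_users = round(0.88 * total_libraries)

print(f"{https_share:.0%} have HTTPS; about {redirecting} redirect "
      f"({redirect_share_overall:.0%} of all libraries)")
print(f"about {ga_users} use Google Analytics")
```

Read this way, only about a fifth of the full sample both offers HTTPS and forces traffic onto it.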

They make five recommendations:

Implement HTTPS – correctly and thoroughly
Implement IP anonymization in Google Analytics – it’s a one-line addition to the configuration snippet
Educate users about online privacy (staff too)
Obtain informed consent from users (cookie warnings); not typically used by libraries
Conduct risk/benefit analysis when using third-party services
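
On the second recommendation: the setting in question is `anonymizeIp` in the analytics.js snippet (`ga('set', 'anonymizeIp', true)`) or `anonymize_ip` in the gtag.js configuration. A rough sketch of how a page's source might be scanned for it follows. The function name and regular expressions are my own assumptions, not the presenters' audit code, and simple pattern matching will miss configurations injected through a tag manager.

```python
import re

# Signs that a page loads Google Analytics via analytics.js or gtag.js.
GA_LOADER = re.compile(
    r"google-analytics\.com/analytics\.js|googletagmanager\.com/gtag/js"
)
# analytics.js spells the option `anonymizeIp`; gtag.js spells it `anonymize_ip`.
ANONYMIZE = re.compile(r"anonymize(?:Ip|_ip)['\"]?\s*[,:]\s*true")


def ga_privacy_status(html: str) -> str:
    """Classify a page's HTML by whether GA is present and IP-anonymized."""
    if not GA_LOADER.search(html):
        return "no Google Analytics found"
    if ANONYMIZE.search(html):
        return "Google Analytics with IP anonymization"
    return "Google Analytics without IP anonymization"
```

Run against the saved HTML of a library home page, this gives a first-pass answer to the second research question for that one site.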
