My editorial comments are in italics.
Improving the Odds of Preservation
David S. H. Rosenthal, LOCKSS
Various studies have shown that large portions of the digital world are not archived. Over 50% of the journals we hold are preserved, most content linked from e-theses is no longer available, etc. He refers to this as the ‘half-empty archive’ and notes that the bad news is that this is overly optimistic. It’s actually worse. We tend to prefer archiving information that’s easy to access and presents no technical hurdles, e.g., archiving Elsevier’s output isn’t doing anything terribly useful since it’s well-situated content. We do not skew our activities toward risk, in other words. Put simply, large, obvious, and well-linked collections of information are more likely to be preserved, while all of the smaller yet critical portions go unpreserved.
More issues: we look backwards, not forwards. In other words, we prefer books and journals as preservation objects to more modern forms of information such as social media output and Web content in general. Dynamic and ephemeral content has little chance of being preserved.
Under the Hood with OpenStack
Steve Marks, Amaz Taufique, ScholarsPortal
Showed the specs for the hardware being used for the Ontario Library Research Cloud. McMaster is part of this project, so I’ve seen these specs before and wasn’t taking close notes during this stretch. It’s a ton of hardware, suffice to say, with the goal being a 1.25PB array.
The software layer is based on OpenStack. Showed a graphic that explained how this works. Key to the design is no central database; also the whole setup is hardware agnostic. When an object is written, it needs to be written to two nodes for a successful write, but in testing it was common for all three nodes to write immediately, even with large and complex transfers. If only one node writes, an error is returned. After being written, objects are replicated across the other nodes. Amaz also showed what happens when a node becomes unavailable, which is that objects are written to handoff nodes until it recovers.
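The write rule described above (three copies, at least two of which must succeed, with handoff nodes standing in for unavailable ones) can be illustrated with a toy model. This is only a sketch under my own assumptions, not OpenStack code: the names Node, write_object, QuorumError, REPLICAS, and WRITE_QUORUM are hypothetical, and the later step of copying objects back from a handoff node once the original recovers is left out.

```python
from dataclasses import dataclass, field

REPLICAS = 3      # assumed number of copies per object ("all three nodes")
WRITE_QUORUM = 2  # copies that must succeed before the write counts


class QuorumError(Exception):
    """Raised when fewer than WRITE_QUORUM copies could be written."""


@dataclass
class Node:
    # Hypothetical stand-in for a storage node; not an OpenStack class.
    name: str
    up: bool = True
    objects: dict = field(default_factory=dict)

    def put(self, key, data):
        """Store the object and report success; a node that is down refuses."""
        if not self.up:
            return False
        self.objects[key] = data
        return True


def write_object(key, data, primaries, handoffs):
    """Write to each primary node, using a handoff node for any that is down.

    Returns the names of the nodes that accepted the object, or raises
    QuorumError if fewer than WRITE_QUORUM of them did.
    """
    spare = iter(handoffs)
    written = []
    for node in primaries[:REPLICAS]:
        if node.put(key, data):
            written.append(node.name)
            continue
        # Primary unavailable: try the remaining handoff nodes in order.
        for standby in spare:
            if standby.put(key, data):
                written.append(standby.name)
                break
    if len(written) < WRITE_QUORUM:
        raise QuorumError(f"only {len(written)} of {REPLICAS} copies written")
    return written


if __name__ == "__main__":
    a, b, c, h = Node("a"), Node("b"), Node("c", up=False), Node("h")
    print(write_object("thesis.pdf", b"...", [a, b, c], [h]))
```

In this model, a single failed primary is invisible to the client because the handoff node takes the third copy. The error case only appears when two primaries are down and no handoff node is available.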
The initial pilot was done using GTAnet, which encompasses the three universities that participated in testing: UofT, Ryerson, and York. This testing was necessary to see what kind of traffic is generated during both routine and stress scenarios. Ultimately, there were four nodes, one each at Ryerson and York and two at UofT. The fourth was necessary to observe the aforementioned handoff (i.e., what occurs when a primary node is unavailable).
As always, it’s a pleasure to be at Access, reconnecting with colleagues and learning about exciting new developments. This year’s version also has the distinction of being the first, and likely only, library conference I will ever attend that’s taking place within a stone’s throw of a 400m speedskating oval. The missed opportunity–my Viking klaps are at home buried in the basement–will sting for a while, but I suppose I could have looked before I got on the plane, since I knew quite well that Calgary has an oval.
As always with my notes, editorial comments are in italics to distinguish them from the speaker’s points.
Public Digital Humanities Center
Kim Martin, Western U
Was interrupted by a critical IT issue at work, so had to jump into this talk a bit late and my notes are correspondingly vague.
Showed how the DHMakerBus has been a way to work with a wide range of organizations and entities, which is a manifestation of her mantra “network by doing.” This stands in contrast to asking them “how can you help us” and replaces it instead with “how can we work together.” She ran through a number of sample events with some of these groups, noting how varied and successful they were.
This was a rare talk at an academic conference where children figured fairly large in the narrative. It occurred to me that this is a welcome departure, and if we’re talking about engaging people in the humanities via the digital humanities, we’ve missed the boat if we think we can achieve this with students who have already arrived at university. It needs to start much younger.
With regard to making DH public, she asked how we in libraries can make the artifacts of DH work permanent and accessible, using Minecraft worlds as one example.
Late last year, the review editor for the German library publication BuB Forum Bibliothek und Information asked me to review the book Catalogue 2.0: The Future of the Library Catalogue (n.b., I serve on BuB’s editorial board as the lone non-European). Typically, when I write for German publications, I write directly in German, which kind editors then slightly polish to remove some of my infidelities. This time, it seemed to make more sense to review a book written in English in English, my thought being that I could then post the English original for anyone interested. It took me a while to remember this, but I’ve finally managed to do that. Feel free to download and distribute this review (CC-BY, as noted on the PDF).
An interesting sidenote: this work was published by ALA Neal-Schuman. I wrote to them to request a review copy, noting that the review had been solicited by BuB, which is Germany’s largest circulation library publication. Not only did they not provide a review copy, they didn’t even deign to respond to the request. Despite that–and the book’s $90 price tag, which seems excessive–I’ve linked to their store above because it’s a really excellent volume that should be widely read.
This was a great last day to what has been a rich week of learning and discovery. It was certainly worth the long flight to attend DH2014. The organizers did a great job with the logistics and communication. The professional portraits they offered for free were a great idea; I’ve never had such a nice photo.
Session 6, Friday
The Dog That Didn’t Bark: A Longitudinal Study of Reading in Physical and Digital Environments
Claire Warwick, UCL
Set out to study reading and to address some of the dire statements that are out there around reading on the Web, such as ‘reading is dead’ and so on. In particular, they want to research how behaviour changes in digital environments. It was a five-year study (2009-2013) that was repeated each year, which is unprecedented in terms of studying online reading.
They studied master’s students in UCL Department of Information Studies programs. They used a diary study method, where students kept track of everything they read, where they were, the medium, how long, time of day, and made comments about their reading. In all, there were 1261 episodes of reading (628 digital and 633 analogue), with 5.13/student on average (5.11 digital/5.15 analogue). Of the 216 total students, 171 were female, with both sexes having the same response rate (56%). Women had more reading episodes on average, however. Their results also showed that reading habits don’t necessarily follow age assumptions. In other words, there was no trend that showed that older students read more than younger students.
Today featured the panels I’ve noted below, as well as two excellent poster sessions. Saw many things that gave me ideas and/or inspiration, and took numerous pictures that I immediately emailed to colleagues who aren’t here.
Session 4, Thursday
MicroPasts: Co-creation and Participatory Public Archaeology
Daniel Pett, British Museum
All open: open data, open source. All built on a variety of open-source platforms: WordPress, Pybossa, neighbor.ly, and Discourse. About 15 GitHub repos, which host various parts of MicroPasts.