Access 2013 YYT notes
Below are my notes from this year’s Access conference held in St. John’s, Newfoundland from September 23-26. Access is always worthwhile, but this year’s was particularly enriching for me, and I greatly enjoyed getting to know a new place, as well.
Please bear in mind that these are spontaneous notes, so any errors, omissions, or weird ranting is likely my own fault. I’ve tried to put my editorializing in italics to distinguish it from the speaker’s thoughts.
“What a Long, Strange Trip It’s Been”
Started by sketching a brief history of Access, and noted its influence on other conferences, e.g.- Code4Lib (single track, hackfest, etc.). There are several people working on recording the history of Access, and he noted that that’s not typical for conferences, so worth noting and applauding.
His talk was literally a stroll through his experience of Access, starting with 1994 in St. John’s. Reviewed titles and topics from older conferences. Z39.50 was a big deal early on, as were WAIS, Gopher, telnet, etc. Mosaic pretty much set the tone for what was to come, of course.
His review of the early years of Access makes one realize how much it was mostly men; certainly those presenting were men. He remarked on this. Showed a video from the hackfest in 2003, and while I couldn’t see the screen so well, it was still very male-heavy, but the ratio seemed to be shifting. It would be interesting, and likely difficult, to gather registration demographic information from the past and graph the shift of gender (and also geography). For each iteration, he gave sample talk titles and names, and by 2006 only one woman’s name had appeared. It got a bit better after that, but it reinforced for me that we need to maintain our focus on being inclusive in all areas of library work.
Interesting to see who has appeared over the years. Many are no surprise, as they are pretty well known open source advocates. Others, though, are not part of that crowd yet made appearances. One wishes that we would see more mixing of ‘type’ at events like this, where there is a much more intense level of conversation and exchange than at larger conferences where they typically appear.
We all had a good laugh about Z39.50’s persistence on the agenda at this conference, but one wonders what we’re doing now that will seem naive or foolish in the future. It’s not really a bad thing to have things rise and fall, but as he noted, linked data is coming into the program, but will it be that next big thing (like OpenURL, which was spoken of in early days at Access) or the butt of future jokes? Hard to say, and it would be interesting to know what role we play (if any) in the success or failure of some of these technologies. Perhaps we just don’t like to talk about workhorses, i.e.- things that work but aren’t very exciting.
He noted that the conference information from previous years disappears quickly; 2009 has already mostly been vapourized. Clearly, we’re not posting documents in repositories, or at least not in a way that allows us to reconstruct a conference program. For my part, I know I tossed my paper from 2008 into the IR at my then institution, but who could ever find that?
Interoperability with Blacklight, EDS, and Fedora
Showed some work he’s doing in Ruby with discovery tools for U of Alberta. Pulling in records from various sources, reformatting them, and putting them out there via Blacklight. Did his presentation showing live interaction with code.
Gettin Sh*t Done in the Digital Archives
Nick Ruest & Anna St. Onge
Started with the question–given how buried most archives/libraries are in analogue collections–of why we should care about Web content. That’s a hard question, and as she noted, it’s not a new problem, citing a 1972 lecture by Hugh Taylor. He advocated getting involved and learning to do “that computer thing” (her words).
Showed quote from Cecelia Muir (LAC, 2012) about archivists losing their judgment if they become activists. Perhaps what she was doing was equating activism with engagement, which are, in fact, two different things. Others see it differently, including Brien Brothman, who advocates for archivists to be culturally aware and critical.
Showed a funny picture of a guy in an archive climbing over files to get at something, but pointed out that that’s more or less what’s happening with Web content. It’s a jumble of content, and much of it rises and falls without substantive documentation or capture.
They’re working at York to preserve the back issues of the campus newspaper, content that was at great risk for disappearing. Want to demonstrate to the campus that they can be a trusted partner and care about their stuff.
One thing he wants to avoid is being a “digital packrat.” It’s important to apply some system and controls. The WARC is an ISO standard for Web archives. All requests are aggregated to a file, with provenance data and a checksum.
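To make the WARC idea concrete, here is a quick sketch of my own (not from the talk): each capture carries its provenance data and a checksum so it can be verified later. The record structure below is illustrative only, not the actual ISO WARC layout; real code would use a library like warcio.

```python
import hashlib
from datetime import datetime, timezone

def make_capture_record(uri, payload):
    """Bundle a captured payload with provenance data and a checksum,
    echoing in spirit what a WARC response record carries."""
    return {
        "uri": uri,
        "capture_date": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
        "sha1": hashlib.sha1(payload).hexdigest(),
    }

record = make_capture_record("http://example.org/", b"<html>hello</html>")
print(record["sha1"])
```

In a real WARC, many such records are aggregated into a single file, which is what keeps the approach manageable at scale.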
Web Archiving Life Cycle – comes from the Internet Archive.
Nick has created an Islandora solution pack around all of this, i.e.- to grab WARCs and ingest them into the repo. Allows the creation of derivatives as well as retrieval.
A way to make the work we have to do more interesting is to learn by doing something that one cares about. Nick responded to a request from a fellow librarian for assistance with documenting an ongoing Web phenomenon.
Anna led Nick to an awareness that archivists can and should engage critically in what they do, a view articulated by others in the profession. They work together with the Toronto chapter of the Progressive Librarians Guild, which engages on social justice and equality issues.
A challenge with working with current information, at least for archivists, is that they have no distance from the information. For one, the topic is not fixed, and the scene is changing and evolving. She put this under the header of “describing the present” and noted that it created anxiety for her as an archivist.
Reinventing Discovery: An Analysis of Big Data
His slide design included a product name at the bottom of every slide. It can be a bit dicey when vendors speak at events like this, as they negotiate the line between participating in the community and making a sales pitch. Putting a product name on every slide may shift that balance toward sales pitch. This particular vendor wasn’t even an event sponsor, which would have bought them eyeball time. If a vendor wants to brand their slides, they should use the company name rather than the product, perhaps.
Had the good fortune to have a graduate student in Germany do an in-depth analysis of their interface. Very useful information, it seems (which they had to have translated!).
Given the resources Serials Solutions has to do these studies, one can see why the product is so expensive. But doesn’t the key question remain, i.e.- even if our interface is absolutely optimized, are users inclined to start their research with us? In other words, can we become their default place to start? Not even librarians behave this way, it seems. The answer may be that some will, but if that’s the case, then what investment is it worth to have such a system? I just can’t get excited about fancy library search interfaces anymore.
Top two search terms, from their research: jstor and pubmed. Does that mean people are using an expensive interface to find the door to the interface they want to use? That’s something worth exploring and considering.
Building crowdsourced photographic collections with lentil and Instagram
Their project emphasizes the collaborative, rather than central, role of the library on campus. They mined Twitter hashtags to see how their campus and libraries are talked about by users (think I got this right).
Their system, lentil, allows group participation, even for users who don’t know about their system. As he put it, it’s a peer user of Instagram. He showed the ingest process, but the slide went by too fast to capture. They do have a content moderation piece, although he mentioned that they don’t have to remove much. They do advertise their tool, to some degree, encouraging users to take images of Hunt Library and tag them appropriately.
They have an eye-popping 20-foot curved display wall where the lentil images are displayed in their library. They get some really interesting images, too, not all positive (showed one of broken outlets).
Looped back to the notion of how students relate to their university. Noted that alumni tend to develop more interest in the historical record, but it’s not well understood how current students view this. When they asked students about how they feel about having their images added to an archive, they got really excited about it, or at least didn’t find it a big deal if their Instagram shots end up in an archive (one tagged it #nobigdeal).
Clearly, they are hoping to establish closer relationships with students, which may pay dividends in the future, of course, both in tangible and intangible ways.
First Decade: Done and Dusted
He reviewed some of their lessons learned with Alberta’s digitization program. One was that they used Peel numbers from a bibliography, and that turned out to have limitations.
They outsource nearly all of their scanning (until recently they had little capacity in house), collecting it on DVDs stored in cabinets, vaults, etc., amounting to around a TB. They changed vendors along the way and got metadata in newer formats (they use vendor METS).
With all of those DVDs, they needed to figure out digital preservation. Went in with Sun and The Alberta Library and became a Centre of Excellence (with two Honeycombs). As he pointed out, they became a CoE before they had done anything! Honeycombs were elite storage arrays at the time, with solid integrity checking and repair mechanisms. But they also had drawbacks: one could not delete anything, and it was impossible to back up via any normal method. They ended up using another stack to divert files to the Honeycomb and/or to tape.
At a certain point, they had a bit of scope creep. One was Internet Archive scanning (in coop with UofT). They needed to keep a local copy of that work, and as he put it “it was a lot of stuff.”
At one point, he made a scripting error that wiped out several thousand dollars of scanning (although it was recoverable from the vendor, it turned out). Conclusion: follow standards and routine.
When the Honeycomb died, they discovered that 1700 files existed only in the Honeycomb, and were unretrievable without extensive forensic work on the source USB keys. With some heroic work by a sysadmin, they managed to tease the files out of the Honeycomb, which would only respond to a narrow set of queries.
Now: 6.3 million files, 22.8 TB. Mostly pdf, xml, tif, jp2, some zip and gz, but that’s by file count. By size, the zips take up nearly half of the storage (zipped files from IA).
As with the talk from NCSU, it’s hard not to be envious of their IT resources. They simply have more people, and roles we can only dream about, such as a storage specialist. Where we have one valiant person doing something, they have a team.
The SFU Library Open Data API
Todd Holbrook and Calvin Mah
To be usable, open data needs to be machine readable. One example he gave was municipal trash pickup schedules.
Libraries are champions of open data, but he posed the question about whether we actually expose any of our data. He noted that getting data out of some of our systems is hard and we create fake APIs with screen scraping, etc.
They were looking to get reserves information out of the ILS and expose it to their campus LMS. They built an API for this purpose, and then began looking for ways to extend it. They now expose covers, equipment, hours, patrons, reserves, serial costs, and workshops via an API. Patron data isn’t publicly available, of course.
Showed a screenshot of their reserves data in the campus LMS. Pretty simple, but awesome to see. They also have done really great things with their hours information, making a much better display and showing it in other places. All hours information is pulled from the same source, and pages just query the API for current information.
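The hours pattern is worth spelling out: pages don’t store hours, they just ask the API at render time. A tiny sketch of my own of what a consumer might do, where the payload shape and field names are invented (SFU’s actual endpoint will differ):

```python
import json

# Hypothetical API payload; SFU's actual field names will differ.
sample_response = json.dumps({
    "location": "Bennett Library",
    "date": "2013-09-26",
    "open": "08:00",
    "close": "22:00",
})

def render_hours(raw):
    """Turn an hours-API payload into the string a page would display."""
    data = json.loads(raw)
    return "{}: {}-{}".format(data["location"], data["open"], data["close"])

print(render_hours(sample_response))
```

The payoff is exactly what they described: every page that shows hours pulls from one source, so a correction propagates everywhere at once.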
Currently they are all read-only APIs, so one thing to consider is allowing writing in some instances.
Many positives with doing this: simplifies programming, consistent developer interface, one place for documentation and access, etc. Downsides: single point of failure, may be losing powerful native features.
I’ve lost count over the years of the number of times I’ve heard developers talk about having to screen scrape data from a certain closed ILS, as came up in this talk. Amazing that this problem persists, which means the vendor doesn’t care, and libraries don’t care enough to dump the product and move to a more open platform, or at least that this isn’t a consideration that drives product choice.
Bookfinder: find your books fast!
Steve Marsden and Fangmin Wang
Their philosophy is that innovation should start with students, and they have a number of projects that started with students, including this one (Steve was a student, and is now staff). They looked at the existing tools for what they wanted to do (locate physical books), and decided that there were elements missing.
They have no Web interface for this tool; it’s embedded in their catalogue. For mobile, they have an app. Also works on their kiosk in their library (32″ touchscreen).
I saw this a while back at Code4Lib North, and remain impressed with this tool. One of their better ideas: number your shelves. So simple, but effective. Big signs, clear font. See the code on GitHub.
From the Cloud to the Ground: Cataloguing, Linked Data and RDA Search Strategies
Heather Pretty and Catelynne Sahadath
Provided a primer on cataloging practice past and present, highlighting the differences between the AACR2/MARC practice and RDA. RDA MARC fields are “text strings typed in by humans.” Given that RDA is intended for machine processing in a way that AACR wasn’t, that’s a bit ironic. She hit this point later, noting that our tools should be able to pull in authority data and link to it via URI.
Fair question: does RDA take us closer to Linked Data? She feels yes, at least with regard to authority records. The URI is key here, but the links need to be in our records. Also, she pointed out that while RDA is now being used for description, our encoding (MARC) requires “shoehorning,” and our ILS certainly doesn’t know what to do with RDA in terms of display and linking. Also asked whether we can expect our ILS tools to deal with this, or if we should be developing our own tools. Particularly for LOD tools and systems, she was clearly advocating for doing it ourselves and not expecting vendors to do it. In this context, she mentioned the eXtensible Catalog as experimenting with many of these concepts. Also mentioned RIMMF (RDA in Many Metadata Formats).
Issued a number of challenges, including adding URIs to MARC $0, also thinks we should be doing local LOD projects, and should be talking to each other, vendors, administrators.
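The $0 challenge is simple enough to sketch. This is my own library-free illustration, not anything shown in the talk: a MARC field is represented as a dict with an ordered list of (code, value) subfields, and the URI below is a placeholder, not a real authority record. Real code would use pymarc or similar.

```python
def add_authority_uri(field, uri):
    """Append a $0 subfield carrying a Linked Data URI to a MARC field,
    represented here as a dict with ordered (code, value) subfields."""
    field["subfields"].append(("0", uri))
    return field

# Placeholder URI; a real record would point at an actual LC authority.
author = {"tag": "100", "subfields": [("a", "Taylor, Hugh A.")]}
add_authority_uri(author, "http://id.loc.gov/authorities/names/n0000000")
print(author["subfields"])
```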
Catelynne likened cataloging behind a closed door to making a Thanksgiving dinner and not inviting anyone over. Her concern is how to bring RDA into the library, i.e.- explaining it to non-metadata staff. The danger of information overload about RDA is acute, since there is so much out there, hence the need for focused training.
Pointed out the clear fact that for some time, RDA and AACR are going to coexist, as with the scroll and codex (but not as long as that!). So we will have to communicate to staff that records will look different and must be interpreted differently. Gave the concrete example of searching for variants and parts of the Bible, which is a bit of a cryptic mess in AACR.
Pointed out the RDA training resource rdacake.
Building a Better Book (in the browser): Using HTML5 to Transform and Unlock Book Content
Jason Clark and Scott Young
They are interested in the book as form as well as the browser as a container. The codex has continued on in Web form, and there seem to be other possibilities, which they are exploring.
They called the Web browser the 21st century printing press. Nice way to capture it, although it makes an implicit truth explicit.
Showed their development stack: MySQL, Treesaver.js, HTML5+CSS3, structured data with schema.org.
They showed a model for a book used in a class at MSU, and it was so nice to see columned text on a screen. Reading margin to margin is one of the worst things about online reading. It makes the eyes ache.
One key element of what they’re doing is making their books machine readable by applying schema.org and RDF. One benefit is that the search results look a lot better in a search engine, since the machine knows what the object is and how it is structured.
This also allows exposing the book as an API. All of these cool things can happen behind the scenes, too. The human reader just sees an attractive and readable book.
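To show what structured book markup can look like, here is my own minimal JSON-LD sketch using schema.org vocabulary. The title, author, and chapter structure are invented, and the property set is simplified; it is not a faithful reproduction of the MSU example.

```python
import json

# Invented metadata; only the shape of the markup matters here.
book = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "An Example Open Textbook",
    "author": {"@type": "Person", "name": "A. Author"},
    "numberOfPages": 120,
    "hasPart": [
        {"@type": "CreativeWork", "name": "Chapter 1", "position": 1},
        {"@type": "CreativeWork", "name": "Chapter 2", "position": 2},
    ],
}
print(json.dumps(book, indent=2))
```

Once a crawler can see that this object is a Book with parts, richer search results and an API over the structure both become possible.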
Why do this work? Bring the conversation about ebooks into libraries (and out of the publisher realm). Also want to drive the move to LOD and the Semantic Web. Using the API to make the book a platform.
On the Road: Adventures in Mobile Hackery
Kim Martin & Sarah Simpkin
Like all good projects, this project (a DHMakerBus) began with an offhand remark: let’s buy a bus. Apparently one should be careful around Kim with such ideas.
Sarah made the point that part of this is about making opportunities available in libraries that they don’t have at home. The bus is a means to move it around to communities where libraries don’t have the means or ability to dabble in emerging technologies.
They used IndieGoGo, but made some mistakes. Lessons learned:
- make a professional video
- no campaigns longer than a month
- don’t run a campaign without the time to manage it
- don’t offer perks you can’t provide (keep your promises)
They learned that having the word “hacking” on their Website wasn’t such a good idea when applying for grants.
Integrating Course Reserves as a Service in the LMS at UBC
His story of reserves and the LMS involves some of the usual pitfalls, such as having to create kludges to link legacy local systems to vendor products. Pulling out of Access Copyright changed the scene, as did other changes in their environment.
They left behind their old tools and picked up new ones, such as Ares for electronic course reserves. Around the same time, the university dropped its previous LMS and moved to Blackboard from 2011-2014 (yes, it’s ongoing). To hook it up to their student data, they had to write yet another connector (CTC3). Course reserves were hooked up to the same information through some lobbying effort by the library. This required more arm twisting to get their central IT shop and their centre for teaching to include all of the student data in the LMS, something they had otherwise not planned to do.
Their tools enable the provision of very specific support for reserves. Instead of generic help, students can be directed to specific staff who deal with their course materials. He described these under the header of “contextual library services.”
Can they scale? Library discovery tools and new forms of scholarly communications
Started his talk by looking back to the turn of the century and the ARL Scholars Portal, as well as Google Scholar’s launch in 2004. Pointed out that a team of three people created GS, and that currently it reportedly has one half-time person dedicated to it. Meanwhile, ARL Scholars Portal flamed out. Amazingly, USC still uses the Scholars Portal branding, although it points to a Serials Solution product.
Pointed out the simple logic, often repeated, that the discovery layer is a necessity of sorts since we invest so much money in collections. It’s not necessarily the best reason for doing it, he implied. He also noted that using these layers imposes a paradigm on searching that doesn’t work for all users. Cited Kuhlthau’s 1991 article on the bibliographic paradigm in this context.
Noted that 3,000,000 searches in Summon is a lot of use, but I would ask compared to what? Compared to our previous numbers in A&I databases? Catalogue searches? We need to have a comparison or it’s just a big number, like most blunt Web metrics.
He made the point that I and others have made elsewhere that as the amount of freely available academic content increases, and slowly overwhelms the amount of pay-walled content, that the reasons for having a discovery layer or any library tool we have typically used no longer apply (as much).
Noted, almost as an aside, that the OpenURL can point to appropriate copy, but can’t address questions of open access. In other words, we cannot point to OA items selectively based on metadata in the OpenURL.
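His OpenURL point is easy to see if you look at what a KEV query string actually carries. A sketch of my own, where the resolver base URL is hypothetical: the keys describe the cited item, but nothing in them asserts “an open access copy exists,” which is exactly the gap.

```python
from urllib.parse import urlencode

def make_openurl(base, **kev):
    """Build a bare-bones OpenURL 1.0 KEV query string."""
    params = {"url_ver": "Z39.88-2004"}
    params.update(kev)
    return base + "?" + urlencode(params)

# Hypothetical resolver and citation values.
link = make_openurl(
    "http://resolver.example.edu/openurl",
    **{"rft.genre": "article", "rft.jtitle": "Example Journal",
       "rft.volume": "12", "rft.spage": "34"},
)
print(link)
```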
One of his main points, as I heard it, is that publishers such as Elsevier are less about content these days than services layered on top of them. In other words, they are adapting to the world of open access well, and are finding ways to maintain their revenue and standing. We are behind in our thinking, it seems.
He returned near the end of his talk to the point that libraries collectively provide a massive amount of content via the IR and other means.
Can One Story Change the World? (David Binkley Lecture)
He demonstrated a project on Historypin where he worked with his father to map some photos from his grandfather’s experiences in WWI. One of Historypin’s social aims is, in fact, intergenerational understanding. As he put it, archiving is not enough; one needs a conversation, the story. This reduces senior isolation and also gets younger people involved in storytelling and increases their civic engagement.
Showed a brief clip from the Imperial War Museum’s project around the lives of people impacted by WWI. What’s good to see is that it’s broadly based, i.e.- involves people beyond soldiers on the battlefield. They’ve used collaborative means to gather information about the photographs they have collected. He showed a picture of a soldier where little details were filled in by people with domain knowledge.
As he put it, digital mystery solving starts with physical events, where people gather and work together. Very social, offering benefits beyond the actual project.
Something he mentioned in passing were “teen historians.” He pointed out they aren’t historians in the strict sense, but were the people gathering information and stories. We need to come up with a better name for this role than ‘hobby’ historian or citizen historian. It’s important work and needs a dignified name. Field historian?
He showed a couple of slides that highlight the notion that we are going from tables to graphs, i.e.- information organized in tables to mapped on a network graph. Each piece of data can relate to any other. To create a culture of linked open data, we need to follow simple rules and protocols.
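A toy contrast of my own to illustrate the tables-to-graphs idea (all names invented): the same fact as a flat row versus as triples, where any node can link onward to any other.

```python
# The same fact, first as a flat table row...
table_row = {"photo": "photo_42", "place": "Vimy", "year": 1917}

# ...then as triples, where "Vimy" itself can carry further links,
# a hop the flat row has no place for.
triples = [
    ("photo_42", "depicts", "Vimy"),
    ("photo_42", "takenIn", 1917),
    ("Vimy", "locatedIn", "France"),
]

# Traverse outward from one node:
related = [obj for subj, pred, obj in triples if subj == "photo_42"]
print(related)
```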
He made a strong statement that building a linked data environment for libraries, museums, and archives is not something vendors will do; we must do it ourselves, for ourselves. Further, he showed that we have legal tools that should enable this work, first and foremost CC0, but also other public domain licenses. Many people are doing this globally. As he said, “the tide is turning.”
Also noted that it’s not just for technologists, i.e.- LODLAM isn’t just a technical issue. Many people can be involved, and need to be involved. Also mentioned OpenGLAM.
Héritage: A Mega-Microfilm Project
Mega = ~40,000 rolls of film, representing about 60 million pages, mostly handwritten. Clearly, OCR isn’t going to work. Also, no demarcation between objects on the film; they just run together. Very little metadata.
Goal is to offer longterm open access to all objects and metadata, but he did note that it has to be self-funded. Goal for first three years is to build the infrastructure, digitize all the objects, develop standards, and create metadata. Basic access at that stage based on this metadata. The second part will take ten years, which is metadata at all levels, down to document level. Combination of professional cataloging, crowdsourcing, paid transcription, and/or collaboration with interested parties.
CRKN has offered $1.8 million in startup funds, which will get them infrastructure and the first block of content. Further funding will come from premium subscriptions and other means. Those who kicked in for the original kickstarter will get perpetual premium access. Content will be made available at a minimum rate of 10% per year over the ten years. If funding allows, it will be more, and sooner.
Create a discovery layer in minutes with CouchDB, Elasticsearch and jQuery (almost no programming required)
Yet another talk that explicitly mentioned a GitHub repo and documentation: hooray!
What to do with a giant MARC file? He pulled together some simple tools–Python, CouchDB, ElasticSearch, jQuery/Bootstrap–to do something with it. Python code takes the MARC dump and parses it into JSON and sends it to CouchDB. ElasticSearch indexes it using Lucene, and jQuery/Bootstrap send it to the browser and style it. Ta da!
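The parsing step is the only part that needs real code, and even that is short. A stdlib-only sketch of my own: turn a (much simplified) MARC record into the JSON document CouchDB would store. His actual code uses Python MARC tooling on the binary dump; the tags and values below are illustrative.

```python
import json

def marc_to_json(record):
    """Flatten a list of (tag, subfields) pairs into a JSON document
    of the kind CouchDB accepts over HTTP."""
    doc = {}
    for tag, subfields in record:
        doc.setdefault(tag, []).append(dict(subfields))
    return json.dumps(doc)

# Simplified stand-in for one record from the MARC dump.
record = [
    ("245", [("a", "Access 2013 proceedings")]),
    ("260", [("b", "Memorial University"), ("c", "2013")]),
]
print(marc_to_json(record))
```

From there it really is plumbing: PUT the JSON to CouchDB, let Elasticsearch index it, and query from the browser.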
Great demo project. Exactly the kind of prototype that points to how we can do these things with free tools and quickly.
It’s dangerous to go alone! How about we do this!?
Steve Marks, Nick Ruest, Graham Stewart & Amaz Taufique
Panel to talk about the issues facing us with digital preservation.
Nick started with some questions about what our needs are. Clearly, a mix of models and needs are in play.
Graham covered three trends that are impacting this work. By technology, he means open source (OS) technology for Web environments, since it’s well suited to this task. First trend is not new: virtualization. One particular advantage is that fast provisioning has changed the mindset of sysadmins. Gone are the days where we treated servers with such delicacy. It’s easy to move things around, bring things up, etc. We can run many servers with one sysadmin.
Configuration management is the next trend. Major Websites have shown the way, e.g.- Etsy. Puppet and Chef are the two common OS tools for automation. He said it: automate everything.
Third trend is hardware commoditization and open source hardware. Lots of servers, but nothing much distinguishes them from each other. Procedures are so well defined that one vendor’s solution is as good as the next. You need hardware, but it doesn’t really matter which hardware, unlike in past days.
Beyond that are the companies doing work with open source hardware, e.g.- Facebook (Open Compute Project) and Google. As he pointed out, this is fairly new, so exciting things are ahead. Software can now be used to replicate costly vendor systems. Big topic: software defined storage.
Where Angel Investors Fear to Tread: Library Technology Innovation in a Time of Austerity
Outlined up front his concerns with the library software development environment. Lots of broken things, vendor convergence, etc.
Decided to make something using some interesting tools, and the result was Ladder. His talk isn’t about Ladder, however, it’s more about sustaining projects. As he pointed out, open source is not a business model. Friends told him he needs to bootstrap, i.e.- create a startup, a company.
Interestingly, Toyota has some of the best current ideas for startups, all based around their concept of kaizen. The idea is to innovate “under conditions of uncertainty,” not to create profit in the short term.
Toronto has MaRS, funded by the provincial government, and Ryerson has a Digital Media Zone, which also acts as a tech incubator.
Library startups are different from tech startups, since the goal isn’t immediate business-to-consumer contact. It’s a different market, an order of magnitude too small to attract VC. That said, in Canada there are examples of library startups: artefactual, Discovery Garden, etc.
Spoke about business models, and recommended that libraries use this approach, even for existing practices. We rarely apply that level of scrutiny to our practice and services.
Showed the build-measure-learn cycle and noted that it’s important to do this quickly, i.e.- move between the nodes quickly rather than trying to take giant steps.
Last thought: aim high, take some risks.
Linux Containers and Docker: A New Sort-Of Virtualization Framework That Will Leave You Confused, Yet Excited For The Future Of Virtualization Technologies, If That’s The Sort Of Thing You Usually Get Confused and/or Excited By
Gave a brief skim of typical virtualization, noting that most are machine based, i.e. “hardware abstracted.” They are great for many things, e.g.- multiuser environments, need to do many things at once, etc.
Asked: does anyone like Java? Only got one taker, and he was just being contrary, it turned out. Noted that Java had once excited him, but now he hates it, since JVMs are awful.
At about this time in his talk, I was reminded (yet again) that John’s technical knowledge is deep and detailed. Gave up on trying to capture it and just listened.
Docker is much lighter weight, one can run many instances, and their boot time is “like a second.”
Built on Go, Google’s C-esque language, Linux >3.8, AUFS (file system – only stores diffs in second repo). Docker built on shipping metaphor: put stuff in containers and ship them off. “Fast to set up, fast to tear down.” Can run inside a Vagrant VM despite being 64-bit native.
“This potentially means an end to every neckbeard holy war ever.” Best line of entire conference.
Did a live demo and installed a bunch of WordPress instances in a couple of minutes. Anything command line can be Dockerized.
Building a Local Dropbox Alternative to Facilitate Data Deposit
Two assumptions: people like Dropbox, and they don’t always deposit data into repos for the sake of it. SFU’s CIO encourages researchers not to store their data in the cloud (BC has stringent privacy statutes). They did a survey, however, and found that people were using Dropbox and other cloud services, so it’s clearly something people want and perhaps need to do.
The technology for doing cloud file management is old and exists, e.g.- Unison from Penn.
On the server side, he uses Ajaxplorer as a Web-based window into the file system. They’ve now contracted with DiscoveryGarden to develop a plugin that allows deposit into Islandora.
Added versioning to Ajaxplorer via a git plugin.
Why not OwnCloud? Haven’t heard from any users, but it’s also a heavy application, with functionality they don’t need (mail, calendar, etc.). Also might be hard to bolt this to Islandora/Archivematica.
“Why is this link dead? Aren’t government publications all online?” Preserving digital federal content with the Canadian Government Information Private LOCKSS Network.
Mark Jordan & Amanda Wakaruk
Learning about Canada’s policy framework for public documents. It’s a bit different than in the U.S., and a major hole in my knowledge as someone who works here. In essence, there are gaps in the depository services program, leading to holes in the record, particularly digital information.
A 2008 common look and feel standard led to the removal of massive amounts of content from federal sites. LAC ceased its Web archiving program. Crown Copyright Licensing Program permitted non-commercial reuse (2010). End of print parliamentary committee documents (also 2010), and no plan for preserving the digital.
Budget cuts over the past few years have impacted this. DSP has been modified, no more paper (2012). LAC lost major funding in 2012 and 450 positions were affected and 215 cut.
Her suggestion in 2012 was to establish a LOCKSS network for government information. There really is nothing else going on in this regard (she asked at a meeting of experts and got silence).
In the meantime, the Web Renewal Action Plan has led to the removal of much content (the CLA GIN blog documents this, in part). Clearly there is a need to preserve this content. Basically, the federal government wants nothing older than 2-3 years on their sites, but there’s no plan to preserve what’s removed via the DSP or LAC.
Current state is that the Web Renewal Action Plan is trickling down. Also been confirmed that the Virtual Library will not be a repository, only a current info site.
Not surprisingly, there’s a LOCKSS program in the U.S. preserving federal documents, which was a model it seems.
Project goes by CGI-PLN: Canadian Government Information – Private LOCKSS Network. Mark covered the technical details, which are pretty typical for a LOCKSS network.
Community, understanding, courage and honesty
Thinks the discussion around metadata licensing is over and done. Apply CC0 and move on.
Our (special) collections are data, so how we put them online matters a lot. Used the simple example of how early newspaper digitization was obsessed with preserving the form.
Nice closing keynote. More something to listen to rather than taking notes. Clearly stirred a lot of passion, and a great way to end the event.