Access 2014 Calgary Thursday notes
Under the Hood with OpenStack
Steve Marks, Amaz Taufique, ScholarsPortal
Showed the specs for the hardware being used for the Ontario Library Research Cloud. McMaster is part of this project, so I’ve seen these specs before and wasn’t taking close notes during this stretch. It’s a ton of hardware, suffice to say, with the goal being a 1.25PB array.
The software layer is based on OpenStack. Showed a graphic that explained how this works. Key to the design is that there is no central database; the whole setup is also hardware agnostic. When an object is written, it must be written to two nodes for a successful write, though in testing it was common for all three nodes to write immediately, even with large and complex transfers. If only one node writes, an error is returned. Once written, objects are replicated across the other nodes. Amaz also showed what happens when a node becomes unavailable: objects are written to handoff nodes until it recovers.
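The quorum-write and handoff behaviour described above can be sketched roughly as follows. This is a toy illustration under my own assumptions, not OpenStack Swift's actual code; all names and structure here are invented.

```python
# Toy sketch of a quorum write with handoff nodes (NOT Swift's real code).
# A write succeeds only if at least `quorum` nodes acknowledge it; when a
# primary node is unavailable, the write lands on a handoff node instead,
# and background replication would later copy it back to the primary.

class Node:
    """Toy storage node; up=False simulates an unavailable node."""
    def __init__(self, name, up=True):
        self.name, self.up, self.store = name, up, []

    def put(self, data):
        if self.up:
            self.store.append(data)
        return self.up  # acknowledge only if the node is up

def write_object(primaries, handoffs, data, quorum=2):
    """Write `data` to each primary, falling back to handoffs as needed."""
    acks = []
    spare = iter(handoffs)
    for node in primaries:
        target = node
        while target is not None and not target.put(data):
            # This node is down: try the next available handoff node.
            target = next(spare, None)
        if target is not None:
            acks.append(target)
    if len(acks) < quorum:
        raise IOError("write failed: only %d of %d required acks"
                      % (len(acks), quorum))
    return acks
```

For example, with one of three primaries down and one handoff available, the write still succeeds with three acknowledgements, one of them from the handoff node, which mirrors the behaviour Amaz demonstrated.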
The initial pilot was done using GTAnet, which encompasses the three universities that participated in testing: UofT, Ryerson, and York. This testing was necessary to see what kind of traffic is generated during both routine and stress scenarios. Ultimately, there were four nodes, one each at Ryerson and York and two at UofT. The fourth was necessary to observe the aforementioned handoff (i.e., what occurs when a primary node is unavailable).
For testing, they had a 1Gbps channel set up to handle traffic, and it was shown that loading large amounts of data can quickly saturate this channel. Background chatter when not much is happening is very low, around 5-8Mbps, and other events, such as adding a node, cause fairly significant spikes in traffic, but still within a 1Gbps maximum.
Amaz reviewed the criteria for node selection, which revolve around available bandwidth, costs (both OTO and ongoing), the presence of an ORION POP, and some other technical details such as whether VLANs can be extended across the network. Showed the network diagram that ORION created along with GTAnet and Scholars Portal, which in a nutshell lays out how ORION will create a VPLS (virtual private LAN service) encompassing all of the nodes, with dedicated switches at each site to which the storage arrays are attached.
Showed a graph about how long it would take to sync 200TB across five zones. With a 1Gbps channel, it exceeds 250 hours to do an entire zone. Replacing a drive barely registers in terms of time (about two hours). A drawer (24TB) is about 55 hours, while a RAID card (48TB) is over 200. During these time windows, say the 11-day window for a full zone, there would be only two copies, which was a concern. This led to asking the sites whether they could, if necessary, offer a 5Gbps connection to reduce these write times. This drove the site selection process.
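As a back-of-envelope check on those figures (my own line-rate arithmetic, not from the talk, and ignoring protocol and replication overhead, which the talk's real-world numbers presumably include):

```python
# Raw transfer time for a given volume over a given link, ignoring all
# overhead: bits to move divided by link rate. Decimal TB assumed.

def transfer_hours(terabytes, gbps):
    bits = terabytes * 1e12 * 8       # volume in bits
    seconds = bits / (gbps * 1e9)     # link rate in bits per second
    return seconds / 3600

print(transfer_hours(24, 1))   # a 24TB drawer at 1Gbps: ~53 hours,
                               # close to the ~55 hours quoted
print(transfer_hours(200, 1))  # 200TB at 1Gbps: hundreds of hours
print(transfer_hours(200, 5))  # the same volume at 5Gbps, which is
                               # why a 5Gbps option was requested
```

The 24TB figure lands near the ~55 hours quoted in the talk; larger volumes at 1Gbps stretch into hundreds of hours, which makes the appeal of a 5Gbps link obvious.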
Steve then showed a GUI browser for Swift that Scholars Portal is developing. It’s a simple interface, but it offers “basic functionality.” Spoke about the hackfest that the project held back in June, and showed some examples of work that was done there. One was a simple SWORD server for depositing objects, while others used Cloudfuse and other tools. All of the work is documented on GitHub. In this context, he also mentioned ownCloud, which would layer a fairly user friendly interface on top of the storage. Behind the scenes, they’ve also worked with Artefactual on Archivematica integration.
Next steps for the project:
- deploy all the hardware
- beta phase
- develop end-user tools
- repository integration
- Hadoop cluster and text mining
Steve noted that hardware deployment has surfaced some interesting details about how things are done, and what they cost, at the various campuses. There is little consistency across institutions.
Panel: Have you Tried Turning it off and Back On Again? Rebooting Library Technology Conversations
Gillian Byrne, Ryerson; Andrew McAlorum, Waterloo; Tim Ribaric, Brock; Graham Stewart, U of Toronto; Steve Marks, Scholars Portal
Each panel member had five minutes to toss out some ideas. Graham started with some thoughts on why perceptions of IT departments can be so bad, contrasting traditional IT with what he calls “Web scale IT,” a term he applies to born-digital groups and projects that have a very different culture and way of working than a traditional IT department. Traditional departments are often focused on work such as payroll, time tracking, accounting, and, in a library context, the ILS. As he said, these services complement existing structures, but aren’t about planning for the future. They are ‘neutral’ services and are easily outsourced and/or moved to the cloud. In this environment, stability and security are paramount, and ROI becomes something of a dogma that trumps anything innovative. It’s a culture of no, with negative stereotypes: geeks, misfits, and BOFHs (bastard operators from hell).
Web scale IT, in his characterization, is about entirely new services and business models. He noted that most libraries fall into this category, since we are often concerned with the new and developing rather than core infrastructure. In this environment, software is the means by which the organization interacts with its users and customers, which puts IT in a central, perhaps even the most critical, position. At the very least, IT drives decisions and defines services, and is more or less all that the organization does. He pulled this into the library context, noting that when these themes appear in library strategic plans, we are defining ourselves as one of these new organizations. The things we say we need in these documents–new interfaces, new forms of user engagement–require this new model.
What does a Web scale IT team look like? In his description, it’s made up of staff from across the organization, and in his words, it must be humane. Also, post-mortems should be blameless. He offered some further thoughts on humane systems, which in a nutshell should work well for both the authors and users of the system (he was quoting another source that I missed).
Gillian opted to use the decision making process around LibGuides as her leitmotif. She provided the caveat that this is her experience; as always, your mileage may vary.
She’s not a fan of LibGuides, nor of research guides in their current iteration, but they opted to go with LG rather than persist with an outdated, locally developed model. The decision was driven by liaison librarians, and was a content decision rather than a technology decision. In hindsight, she realized that there wasn’t enough interaction between those who wanted it and the IT staff. They thought it was a turnkey system that didn’t require much local work or input.
She spoke a bit about the pack behaviour that’s involved with LibGuides, in other words, everyone else has it, so we need it. Also, she noted that failure and experimentation don’t look good, and in that context, it’s safer and easier to make a decision to go with a vendor solution that will at the very least be moderately successful.
Tim framed his thoughts on values around two anecdotes. The first anecdote was that everyone has a price. He posited that things only have value when they have a price tag on them. Open source software doesn’t have a price, so suffers due to this notion of value=price. It is inherently undervalued.
His second point was that LibGuides is a four-letter word. He asked: what is a LibGuide? Answer: it’s a Web page with stuff on it. Given that, why do we need a platform like this? Do we truly lack the internal ability to do something like this? Is it just easier to cut a check, or is it indicative of something else?
His second anecdote revolved around sharing, or the idea that the library leviathan will swallow us and we are powerless to stop it. Spoke about their use of EDS, and a plugin that exists to allow one to integrate it into an LMS. Such as it is, it works fine and does some neat little things. Then EBSCO grabbed it and made it a hosted solution. In other words, a locally developed (free) plugin became a vendor solution with an annual fee.
His point is that you can innovate and create something, but then the Leviathan comes along, gobbles it up, and sells it back to those people who either can’t do this work or won’t do this work locally. One elegant point he made is that any effort we put into vendor solutions disappears the moment that the contract ends.
Steve spoke about collaboration, and the need for more conversations where we make ourselves understood and also strive to understand others’ viewpoints. How do we achieve that? One: share information. Two: collaborate, which means sharing resources and establishing shared stakes.
Asked how we facilitate communication other than by establishing yet another mailing list. Hard question, and his answers made it clear that we don’t communicate very well on multiple levels, even within our organizations, let alone between institutions. Pointed out that standing meetings have their place. Yes, we all dump on meetings, but they play a role in sharing and open communication. Keep them short, keep them focused, don’t ask too much from them in terms of decision making, which doesn’t happen in large groups.
Pointed out that the LLC article by Quinn Dombrowski offering a post-mortem of Project Bamboo is a rare example of an analysis that speaks clearly about how projects can fail to achieve all of their goals.
Andrew framed his remarks around projects for better communication. In contrast to Steve, he said we need fewer meetings and working groups. Also made a call for more agile development practices, quoting bits from the Manifesto for Agile Software Development. At Waterloo they use Scrum. Scrum has a lot of appeal, but it’s difficult in an environment where people have dual roles, such as being both a developer and a quasi-sysadmin.
Also put in a good word for ITIL. He pulled out a couple of ITIL points he feels are critical. One is to have an issue tracking system. Another is to have a service catalogue. He suggested a change advisory board that manages risk by including people from across the organization. Also mentioned service level agreements and continual service improvement.
As one would expect, this panel touched a sensitive nerve. The messages clearly resonate with this audience, but questions came up about how we connect with the rest of our profession, and how we get leadership to understand IT when they largely do not come from IT backgrounds. Some noted that this is an annual discussion at Access that often doesn’t survive the trip back home, so to speak.
Productivity and Collaboration in the Age of Digital Distraction
Showed some funny numbers from various surveys about how addicted people are to their phones: where they consult them, what they would give up rather than their phone, etc. Pretty funny, but a bit scary as well. In short, the phone has become an omnipresent part of our lives, and is just an extension of the Internet and its habit of letting a small task spill over into serious time wasting (check the weather, then find yourself researching monkeys, for example). He argues, though, that there is more to this trend than its time-wasting aspects in terms of how we think about work and productivity. He referred to this as “rethinking work.” He also claimed that by the end of his talk we would solve the problem of work/life balance!
One fallacy: that things were all great before we had these screens in front of us. His point is that we are experts at procrastination; long before smartphones we had effective strategies for wasting time. Research has actually shown that Web surfing can be helpful in terms of productivity. Predictably, perhaps, he also threw a jab in the direction of meetings in this context, but it was in service of his notion that we romanticize the pre-smartphone work environment.
He also questioned if work and life were actually ever in balance. Now the notion is integration, where we make decisions about what we are doing when. Made some interesting points about telecommuting in terms of how much time it saves when there’s no commute, but also how it allows an organization to hire from a wider talent pool, not just the one that is geographically close at hand.
Spoke a bit about hacking work and showed the apps that one can download or buy (irony: one is called Freedom) that will help one avoid distractions online. At some point, however, this kind of tracking can become an obsession of its own, and as he put it, we can waste a lot of time trying not to waste time. Offered some thoughts on gamification as well, noting that while it has benefits, it can become rather hollow, and that we need to “protect our brains” from what is basically overstimulation.