Paying Google
Within days of Google’s 2004 launch of their ambitious book scanning plan, cynical librarians (myself included) wondered how long it would take for Google to market their new toy to libraries.
As it turns out, a long time. It’s now 2011, and we’re still waiting for Google to come knocking offering content for money. Is this strategic, or perhaps related to their epic struggles with rightsholders? Likely the latter.
This occurred to me today when I recalled, for the 300th time, that Google offers no API for Scholar. While noodling around for information on the current state of that issue, I found a comment on Jonathan Rochkind’s blog that made something go bing in my head. Commenter Marty pointed out that Google relies on publisher largesse to get at the article metadata, and that many of these publishers have an interest in driving use toward their own tools (e.g., SciVerse). It would follow that Google faces many publisher-imposed restrictions on what they can do with Scholar, which would explain the lack of development and an API.
That leads to a question. Money makes everything move, so would we be willing to pay for this API? Google could make the publishers happy by paying them for access to their data, and we could pay Google for use of their API. It would be like licensing a database, except I don’t want a crappy interface and a sales call, I want an API so that my library can access the data from our own interface and manipulate it to fit our needs. In a nutshell, we’d be paying Google to aggregate publisher data, something we currently do in a variety of semi-satisfactory ways.
Is this crazy talk?
Yes.
OK, I’ll bite: why is it crazy?
I don’t think it’s crazy talk, but I think that it is more complex than just offering money. Google will serve as essentially the middleman, which they are anyway. With the Google Books plan, they seem to be willing to be the middleman. But this is slightly different.
So I think most of the issues will be administrative. On the library side, what funds will pay for this (kind of like the MARC records discussion…). And will libraries pay individually or what? How will it be priced? On the Google side, you’ll need a pretty savvy librarian (and I do mean that) to understand what we want and also what publishers want and an efficient way to bundle payments.
Are other APIs only available for pay? This is interesting and seems like a new territory for libraries to get in, but not insurmountable.
I also think it’s worth a try! I’d rather pay for this than a bunch of other things we pay for.
I think you’re spot on on several points, Jen. Yes, it’s more complex than just paying Google, since they are, as you say, just in the middle here. I see potential, however, for having a fairly solid framework in place where we pay Google and they pay the publishers.
And I hadn’t considered the internal issue of who pays, but you’re right: at the very least, it’s messy. We need to revise our practices in general to allow more flexibility in how we pay whom for what. As it is, we still struggle with the financial practice ramifications of the transition from analog to digital collections, as you allude to here.
Off the top of my head, I can’t think of any APIs for pay, but I’m sure they exist and I’m just missing something. Amazon used to (and perhaps still does) have restrictions on how many API calls a developer could make without meeting their conditions (of advertising and whatnot), which isn’t paying per se but certainly hangs financial considerations on the API. Frankly, if I had my druthers (I don’t) and a bunch of expert programmers who can work with APIs (ditto), I’d tell database and ejournal vendors, thanks but no thanks, we don’t need your crappy interfaces, just sell me an API to your data and we’ll take care of the rest, thank you very much. That’s my way of saying what you say more eloquently: I’d rather pay for an API than a bunch of the other stuff for which we shell out.
I wonder with more of these open source library environments if your idea may come to fruition sooner rather than later. I would like to see libraries focusing more on the search experience, as I think some of them really are (NCState comes to mind as well as other libraries that use some of those newer catalog projects/software). It seems to me JSTOR would be a good bet…of course, JSTOR has a pretty decent interface. But you could start somewhere, hopefully with someone who considers libraries a “partner”. Or OCLC or something. I worry this idea deepens the divide between techie libraries and non-techie libraries, but as consortia grow in influence, there are ways around this as well.
I admit it seems like the administrative issues should not be a major issue but you and I both know that is never the case!
Funny you mention NC State, since McMaster uses the same overlay on the catalog (Endeca). We are currently reviewing how we want to move forward in that area. I tend to agree that the search experience is something that is generally neglected, or if not neglected, then over-optimized for the expert (i.e., librarian) searcher, with highly complex interfaces and verbose results.
Interesting that you mention JSTOR, too, since as you likely know, they do offer (or did last time I looked) something akin to an API, which was the ability of customers to download their metadata and integrate it into their own local search tools. I hoped at the time that that was the beginning of a wave of such offerings, but it quickly became the province of vendors who federate data on our behalf (SerSol, CSA, et al.). Those products aren’t necessarily bad, but they are overpriced and inflexible, so my interest in them is minimal. That statement goes against the giddy rush of many libraries toward Primo, Summon, etc., but it’s not hard to predict that those products will lose their lustre before long when people realize how kludgy they are. In other words, we’ll soon see an “our discovery layer sucks” meme as we have with the OPAC.
At the end of the day, I have to note that there’s a personal aspect to this. I have found that I do well over half of my personal literature searching in Google Scholar. Why? Because I routinely unearth articles/texts there that fail to come up in proprietary disciplinary databases. When one considers how databases such as the MLA, Library Literature, America: History & Life, etc. are produced, it’s no wonder that this is the case. The MLA, in particular, has gaps through which one could drive a truck, and LibraryLit isn’t worth $1.99. At any rate, I encounter more and more librarians who use Scholar as a primary tool for searching. Given that we are experts at searching, what does that say about the data sources/interfaces for which we pay? Nothing good, I’m afraid, hence the motivation for my original post.