A Party Where Everyone's Invited: APIs in the Digital Humanities

Rochelle Terman
September 09, 2011
Simple image diagramming how the social Internet works.

While on vacation, you take what can only be described as an absurdly adorable picture of your dog Coco, whose body is angled in front of the lens so as to mimic the fifth head of Mount Rushmore. The world needs to see this. You upload the picture to your Flickr account, and then copy that image onto your Facebook wall. You tweet, “President Coco”, which gets automatically posted on your Foursquare, to which someone comments "Cute!", which itself then appears on both Twitter and Facebook. While you’re at it, you share several YouTube videos of laughing babies, embedding them directly on your Google Page site “Shock and Awwwww”. Somebody sees these videos and ‘likes’ them, which then diverts the content onto their own Facebook page. All the while, it seems that every web advertisement you view has something to do with dogs, babies, or online dating sites.

It is perhaps a testament to our ability to adapt to new technology (or our short attention spans) that many of us forget such an episode would have been impossible even a few years ago. While Facebook, Flickr, and Google have existed for some time, until very recently they operated as separate, autonomous, and mutually exclusive platforms for communicating and sharing content among users. In other words, these tools couldn’t understand each other, and thus couldn’t work together.

What makes such collaboration possible are application programming interfaces (APIs). Just as a user interface facilitates interaction between humans and computers, an API sits between different software programs and facilitates their interaction. APIs are often referred to colloquially as “mashups”, but I prefer to think of them as bridges of sorts that link different areas of the web to produce a coherent, integrated whole.

Now imagine you care less about puppies and more about humanities archives. In recent years, several large repositories of cultural and scholarly data have become freely available on the Web. Unlike ‘twentieth-century’ databases, these repositories follow the current conventions and values of web design and usability, meaning that they are designed to provide tools for viewing, searching, and manipulating their contents, all with an aesthetically intuitive and uncluttered Web interface. They are presented so that anybody familiar with a simple Google search or email client can wield the power of large, complicated databases.

But here’s the problem: No online tool will survive in today’s age without being malleable and adaptable. By virtue of the complex dynamics of internet use, a particular tool will often be harnessed for purposes for which it was never intended. For instance, Mark Zuckerberg and Co. did not originally design Facebook as a photo-sharing tool. (Anybody else remember when Facebook didn’t even have a Wall?) But as photo-sharing tools (Flickr, Photobucket) gained popularity, users wanted the ability to combine the best of both tools.

Likewise, as more and more digital repositories crop up in various areas of the Web, scholars want the ability to interact with more than one tool or repository while minimizing the trade-offs inherent in choosing one or another.

Enter Dave Lester, Assistant Director at the Maryland Institute for Technology in the Humanities (MITH) and director of a digital humanities API Workshop, hosted February 25-26, 2011 by MITH at the University of Maryland. The workshop gathered 50 digital humanities scholars and developers, who demonstrated their APIs and discussed ways in which existing and future APIs could be leveraged for digital humanities projects.

According to Lester, because digital humanities repositories are so new, we do not yet know how they will be harnessed in academic research, teaching, and collaboration. “A scholar may, for instance, want to ask a question that an elegant but relatively simple search interface does not allow,” says Lester. “Another may want to combine the data from two archives together to create a visualization to illuminate previously unknown or unacknowledged connections.” APIs offer the best way to facilitate evolution, growth, and collaboration amongst archives, whatever the future holds.
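To make the second scenario concrete, here is a minimal Python sketch of combining records from two archives. Everything in it is an assumption for illustration: the two archives, their record fields, the shared "creator" key, and the placeholder sample data stand in for whatever JSON each archive's API would actually return.

```python
from collections import defaultdict


def combine_by_creator(*archives):
    """Group records from several archives under a shared 'creator' key.

    Each argument is a (archive_name, records) pair; the field names are
    hypothetical and would differ from one real archive to another.
    """
    combined = defaultdict(list)
    for name, records in archives:
        for record in records:
            # Tag each record with the archive it came from.
            combined[record["creator"]].append({"archive": name, **record})
    return dict(combined)


def shared_creators(combined):
    """Return creators who appear in more than one archive, sorted by name."""
    return sorted(
        creator
        for creator, records in combined.items()
        if len({r["archive"] for r in records}) > 1
    )


# Placeholder data standing in for two API responses (not real holdings).
letters = [
    {"creator": "Author A", "title": "Letter, 1863"},
    {"creator": "Author B", "title": "Letter, 1870"},
]
photographs = [
    {"creator": "Author A", "title": "Portrait, 1865"},
    {"creator": "Author C", "title": "Street scene, 1901"},
]

combined = combine_by_creator(("letters", letters), ("photographs", photographs))
print(shared_creators(combined))
```

The creators who turn up in both collections are exactly the kind of connection a visualization could then draw out. In practice the hard part is usually agreeing on the shared key, since two archives rarely spell a name the same way.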

Already, APIs are being developed to leverage existing assets in the humanities to the benefit of students and researchers. Flickr Commons presents the hidden treasures of the world’s public photography archives; Open Library (which is basically a Wikipedia for books) has developed a suite of APIs to help developers get up and running with its data; and Google Refine wields its magic powers for refining big ugly datasets into glistening gems of consistency.
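For a sense of what "getting up and running" can look like, here is a small Python sketch against Open Library's public search endpoint. It assumes the endpoint at openlibrary.org/search.json accepts "q" and "limit" parameters and returns a "docs" list with "title", "author_name", and "first_publish_year" fields; check those names against the current Open Library documentation before relying on them. The sample payload is a made-up placeholder so the parsing can be tried offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against Open Library's developer documentation.
SEARCH_ENDPOINT = "https://openlibrary.org/search.json"


def build_search_url(query, limit=5):
    """Build a search URL for a free-text query."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query, "limit": limit})


def summarize(payload):
    """Turn a search response into (title, first author, year) tuples."""
    rows = []
    for doc in payload.get("docs", []):
        # Records may lack an author list, so fall back to a placeholder.
        authors = doc.get("author_name") or ["unknown"]
        rows.append((doc.get("title"), authors[0], doc.get("first_publish_year")))
    return rows


def search(query, limit=5):
    """Fetch and summarize live results (requires network access)."""
    with urlopen(build_search_url(query, limit), timeout=10) as response:
        return summarize(json.load(response))


# Placeholder payload in the assumed response shape, for offline use.
SAMPLE = {
    "docs": [
        {
            "title": "Example Title",
            "author_name": ["Example Author"],
            "first_publish_year": 1900,
        },
        {"title": "Untitled"},
    ]
}

print(build_search_url("leaves of grass"))
print(summarize(SAMPLE))
```

Calling search("leaves of grass") would issue the live request. The URL building and the parsing are kept in separate functions so that each can be checked without touching the network.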

APIs are also working to record the Web itself as an important archive. The Internet Archive saves snapshots of the web in a digital library that uses APIs to share its data with researchers. It supports a number of projects, including the Wayback Machine, with over 150 billion web pages archived since 1996. (To see just one example of APIs in action, visit Understanding 9/11, where the Internet Archive has posted video from 20 television stations chronicling the grim week of September 11, 2001.)

The trend of API use in the digital humanities exhibits no signs of slowing down. Indeed, as API Workshop presenter Raymond Yee predicts, “every site of significance will eventually have an API.”

Dave Lester is now a student at the University of California, Berkeley School of Information. With Lester’s help, UC Berkeley may now become the hub of humanities API development, shaping a world in which accessing humanities archives will be as easy as sharing a photo of your dog Coco.