"What Is an Author" for the Digital Humanities?

Old illustration of a man kneeling and handing a donation of gold to a pope.

While digital technologies may drive innovation in virtually all aspects of humanistic study--offering new pedagogical tools, facilitating close collaboration with far-flung colleagues, and enabling online publishing (and thus prompting a reassessment of the peer review system)--in terms of the actual stuff of research, the most momentous breakthrough may be the methodology known as "distant reading." We’ve already devoted some attention to distant reading on this blog, and its centrality to the young field of the digital humanities was signalled by an article in the NY Times' Humanities 2.0 series and by the first installment of the same newspaper's recently launched Mechanic Muse column in the Sunday Book Review.

Most basically, distant reading is the practice of using data to read--or at least to understand the cultural significance of--books: tracking the ebb and flow of given terms, studying the frequency of critical strings of words. (The emergent field of Culturomics, research fueled by Google’s n-gram viewer and its dataset of the more than 500 billion words contained in about 5.2 million digitized books, is but one form of distant reading.)
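To make "tracking the ebb and flow of given terms" concrete, here is a minimal sketch in Python of counting word sequences by publication year. It is an illustration only, not Google's actual pipeline: the toy corpus and the function names (ngrams, counts_by_year, relative_frequency) are assumptions invented for this example.

```python
from collections import Counter, defaultdict


def ngrams(text, n):
    """Yield each run of n consecutive lowercased words in text."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])


def counts_by_year(corpus, n):
    """Map each publication year to a Counter of the n-grams found in it."""
    table = defaultdict(Counter)
    for year, text in corpus:
        table[year].update(ngrams(text, n))
    return table


def relative_frequency(table, phrase, year):
    """Share of a year's n-grams that match phrase (0.0 for an empty year)."""
    total = sum(table[year].values())
    return table[year][phrase] / total if total else 0.0


# Hypothetical toy corpus of (publication year, text) pairs.
# Note that no author or title is recorded, only the year.
TOY_CORPUS = [
    (1850, "the author writes and the author revises"),
    (1950, "the reader reads and the text speaks"),
]

if __name__ == "__main__":
    table = counts_by_year(TOY_CORPUS, 2)
    for year in sorted(table):
        print(year, relative_frequency(table, "the author", year))
```

Plotting such relative frequencies across many years is, roughly, what an n-gram viewer does; a real study would also need careful tokenization and a far larger corpus.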

In this academic context, as scholars step back from individual works to take in a broader panorama, the figure of the author appears at once blurry and bold. Within Culturomics, the author is all but absent: in order to avoid copyright infringement, Google's datasets are composed of sequences of up to five words--associated with a publication year, but not particular authors or works.

To the extent that the digital humanities represents a cohesive movement, we might find its dogma expressed in the Digital Humanities Manifesto, published by UCLA's Center for Digital Humanities. And the Manifesto is fairly unequivocal in its marginalization of authorship by way of a disregard for intellectual property rights:

“Copyright and IP standards must [...] be freed from the stranglehold of Capital. Pirate and pervert Disney materials on such a massive scale that Disney will have to sue… your entire neighborhood, school, or country. Practice digital anarchy by creatively undermining copyright and mashing up media.”  

(The updated, "2.0" version of the Manifesto is more cautious with respect to intellectual property, adding the stipulation: "Digital humanists defend the rights of content makers, whether authors, musicians, coders, designers, or artists, to exert control over their creations and to avoid unauthorized exploitation; but this control mustn’t compromise the freedom to rework, critique, and use for purposes of research and education. Intellectual property must open up, not close down the intellect and proprius.") 

While the digital humanities in general and distant reading in particular appear to marginalize the figure of the author, the traditional humanities seem to be marked by the return of the author, banished for almost fifty years by the seminal essays of Roland Barthes and Michel Foucault. [Two recent examples of such a return here at Berkeley are Dante and the Making of a Modern Author by Albert Ascoli (Italian Studies) and Shakespeare Only by Jeffrey Knapp (English).]

But even within the digital humanities--and despite the position of the Manifesto--the author may be gaining a new prominence, as distant reading is put to use to identify authors. Here at Berkeley, Associate Professor of English Bryan Wagner recently received a Digital Humanities Start-Up grant from the National Endowment for the Humanities to develop a text analysis tool for examining and visualizing grammatical and stylistic features to assist in authorship identification.

If the distant reading practices of Culturomics and the development of author attribution tools suggest opposing ideologies with respect to the figure of the author, the question may not be whether the author stands in the foreground or fades into the background of the digital humanities. Instead we might ask how our understanding of authorial authenticity will change as it is produced by data mining techniques rather than traditional philology, and what semiotic unconscious will be bared by so much data.


Image Credit:
Detail of The Donation of Constantine by Gianfrancesco Penni and/or Giulio Romano in the Apostolic Palace at the Vatican.
The most famous philological work of author (dis)identification long pre-dates digital technologies: Lorenzo Valla's 15th-century treatise exposing the documentation of Constantine's donation as a forgery.