How is a literary work like a genome?
According to Associate Professor of Classics Pramit Chaudhuri—who last week was named an American Council of Learned Societies (ACLS) Digital Innovation Fellow—literary references, like genetic sequences, mutate over time as one author picks up a phrase or theme from an older work and makes use of it in a new context.
Chaudhuri says that a Latin phrase like “immane nefas” (“enormous wrongdoing”), used in Virgil’s Aeneid in reference to the worst realms of the underworld, becomes “commune nefas” (“collective wrongdoing”) in Lucan’s later epic Pharsalia, implicating an entire community in the horrors of the Roman civil war.
And like genetic mutations, such intertextual references can be discovered computationally, using a tool from genomics called sequence alignment, which compares DNA sequences position by position, assigning positive scores to matching bases and negative scores to mismatches.
This insight led Chaudhuri and his collaborator Joseph Dexter, a PhD student in systems biology at Harvard, to apply sequence alignment at a character-by-character level to Latin epics. The tool they are building will allow scholars to quickly search and compare Latin texts to reveal phrases—like the Lucan and Virgil phrases “commune nefas” and “immane nefas”—that are similar but not identical.
Of those phrases, uncovered in less than a minute through a sequence alignment search of more than two dozen books of poetry, Chaudhuri says, “How would you find this echo using conventional means, computational or otherwise? A search for the word ‘nefas’ alone would generate too many results to be useful. ‘Commune’ and ‘immane’ are not lexically related; the relationship is purely one of sound and therefore of letter. What’s nice about this tool is it will identify those correspondences.”
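To make the idea concrete, here is a minimal sketch of character-level local alignment in the style of the Smith-Waterman algorithm from genomics. It is not the team's tool. The function name `local_align` and the scoring values (+2 for a match, -1 for a mismatch, -2 for a gap) are assumptions chosen only for illustration.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman-style local alignment of two strings.

    Returns (best_score, aligned_part_of_a, aligned_part_of_b).
    The scoring values are illustrative assumptions, not the project's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below 0, so an
    # alignment can start anywhere in either string.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(local_align("immane nefas", "commune nefas"))
# → (19, 'mmane nefas', 'mmune nefas')
```

Under these assumed scores, the two phrases align along "mmane nefas" and "mmune nefas": ten matching characters and a single mismatch ("a" against "u"). A whole-word search would miss that resemblance, since "immane" and "commune" are different words.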
With the fellowship, one of seven ACLS Digital Innovation Fellowships awarded nationally this year, Chaudhuri and his colleagues will be able to further develop a suite of digital tools to help scholars follow the intertextual threads that link broad sets of classical writings. In addition to sequence alignment, the team has been working on computational tools that can scan the meter of Latin verse, and on methods of studying intertextuality between classical Latin and Greek texts—which requires an analysis across languages and alphabets.
The fellowship provides funds to support both a sabbatical and the project itself—allowing Chaudhuri and his team to continue work begun last year with seed funding from the William H. Neukom Institute for Computational Science.
“The ACLS fellowship is ideal in a way that a simple sabbatical, or simple project funding, would not have been,” says Chaudhuri. “The sabbatical funding is an opportunity for me to focus entirely on this project. But as important is the project funding that will enable us to continue recruiting very talented undergraduates who can help on both the computer science and classics sides of the project.”
While the fellowship has been awarded to Chaudhuri individually, he stresses that the project has been collaborative and interdisciplinary from the start—relying heavily on the work of co-principal investigator Dexter as well as that of Tathagata Dasgupta, a senior research fellow at Harvard Medical School, and Nilesh Tripuraneni, a PhD candidate in machine learning at Cambridge University. Chaudhuri’s wife, Ayelet Haimson Lushkov, an assistant professor of classics at the University of Texas at Austin, is also a contributor.
Dartmouth undergraduates who have contributed to the work so far include Ajay Kannan ’15, James Brofos ’15, Jorge Bonilla Lopez ’16, and Lea Schroeder ’16. “Another goal of the project is to encourage involvement of undergrads in research, and undergraduate publication, both of which are more common in the sciences than the humanities,” says Dexter.
Chaudhuri and Dexter’s own partnership stems from a Dartmouth undergraduate classroom. In 2009 Dexter, then a high school student from Chester, Vt., took Chaudhuri’s advanced Latin course at Dartmouth. With Chaudhuri as a mentor, Dexter was able to publish original research in classical literary studies while at Princeton, where he majored in chemistry and also studied biology and classics.
The two stayed in touch and, while discussing recent developments in the computational study of intertextuality, began to explore the idea of applying sequence alignment methods to inexact textual matches.
The tools they are developing are, says Chaudhuri, “especially important for people who are approaching new sets of texts, where you have no background in the text in question, but you’re curious about what kinds of relations there might be. Suddenly you’re given access to that very easily.”
He points to the large body of medieval and neo-Latin works that are increasingly available in digital formats but have yet to be studied as extensively as better-known classical literature. “We’re opening up to anyone with Latin the possibility of working with this material in a more substantive way.”