Nov. 9, 2015
Dartmouth scientists have created an automatic speech analysis tool that expands the kinds of sociolinguistic dialect research that are technologically possible.
Socio-phoneticians, who study how accents and speech patterns vary in different communities, are often concerned with the sounds of vowels. Two people who have different accents, even within the United States, might produce their vowels with markedly different resonance frequencies. These resonance frequencies (also known as vowel formants) give linguists a precise, quantitative way to characterize accents. Previously, all analysis of vowel formants was done manually, but there has been recent interest in using computational methods to automate part of the process. One such program, FAVE (Forced Alignment & Vowel Extraction), developed at the University of Pennsylvania, automatically aligns a transcript with the speech and measures the formant values.
The bottleneck with such a program is that the speech transcripts still need to be provided by humans, which makes it difficult to analyze large amounts of speech data. Given that automatic speech recognition, or ASR, is rapidly becoming more accurate, the Dartmouth researchers wanted to study whether it would be feasible to build a tool that automatically analyzes dialect features in speech data without requiring a human transcriber. Such a tool would make it possible to quickly analyze virtually limitless hours of recordings, such as videos from YouTube, publicly available archives and large-scale personal interviews.
The Dartmouth researchers have developed a fully automated, open-access, user-friendly web application called DARLA (Dartmouth Linguistic Automation), which automatically generates transcriptions of uploaded data using speech recognition, filters out noisy tokens, and measures and plots formant frequencies, in formats convenient for linguistic analysis. Part of the system uses technology from the FAVE project at Penn. It also provides several options for users needing different levels of precision in their results.
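The filter-then-measure stage described above can be illustrated with a minimal Python sketch. It is not DARLA's code: the `VowelToken` structure, the bandwidth-based noise check, the 300 Hz threshold and the sample formant values are all hypothetical, chosen only to show how noisy tokens might be dropped before per-vowel formant averages are computed.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class VowelToken:
    """One measured vowel (hypothetical structure, not DARLA's format)."""
    vowel: str        # vowel class label, e.g. "AE"
    f1: float         # first formant in Hz
    f2: float         # second formant in Hz
    bandwidth: float  # formant bandwidth in Hz, used here as a noise proxy


# Assumed cutoff for illustration only; the real filtering criteria may differ.
MAX_BANDWIDTH_HZ = 300.0


def filter_tokens(tokens, max_bandwidth=MAX_BANDWIDTH_HZ):
    """Keep only tokens whose bandwidth is at or below the cutoff."""
    return [t for t in tokens if t.bandwidth <= max_bandwidth]


def mean_formants(tokens):
    """Return {vowel: (mean F1, mean F2)} for the given tokens."""
    grouped = defaultdict(list)
    for token in tokens:
        grouped[token.vowel].append(token)
    return {
        vowel: (mean(t.f1 for t in group), mean(t.f2 for t in group))
        for vowel, group in grouped.items()
    }


# Made-up measurements, not real speech data.
tokens = [
    VowelToken("AE", 700.0, 1800.0, 120.0),
    VowelToken("AE", 720.0, 1760.0, 150.0),
    VowelToken("AE", 400.0, 2500.0, 900.0),  # wide bandwidth: treated as noisy
    VowelToken("IY", 300.0, 2300.0, 100.0),
]

summary = mean_formants(filter_tokens(tokens))
```

In this sketch the third token is discarded before averaging, so it does not distort the mean for its vowel class. The resulting per-vowel (F1, F2) pairs are the kind of values a linguist would then plot to compare accents.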
The Dartmouth team has published DARLA-related work in Linguistics Vanguard and in the Proceedings of the North American Chapter of the Association for Computational Linguistics, and presented a workshop at the New Ways of Analyzing Variation conference last month.
“Fully automated vowel extraction methods still have a long way to go, but as ASR technologies continue to improve, we believe the DARLA system will be useful for more and more sociolinguistic research questions,” says DARLA co-developer Jim Stanford, an associate professor and sociolinguist. “We anticipate that a large amount of sociolinguistic research in the future will eventually use fully automated methods like DARLA for measuring vowel data, and so our work helps take a step in that direction.”
Sravana Reddy, the lead researcher on the project, designed, wrote and implemented the DARLA computational system, beginning during her Neukom postdoctoral fellowship at Dartmouth and continuing after it. Dartmouth student Irene Feng also helped with the initial website development.
Dartmouth Associate Professor Jim Stanford is available to comment at firstname.lastname@example.org.
Broadcast studios: Dartmouth has TV and radio studios available for interviews.