I split my time at Carnegie Mellon between two projects. My primary project dealt with identifying when
collaborative learning is occurring through speech; you can read more about it here in my final
paper. The second project was about code switching between English and Bantu languages.
Social Accommodation Theory states that when two people speak, their speaking styles will become either more or less similar depending on the outcome of their discourse. When a dialogue is constructive and the participants are engaging each other, they will start to speak more like each other. Each individual has a specific way of pronouncing different sounds, and the quantitative description of how that individual pronounces vowels is known as a vowel space. By comparing how the participants' vowel spaces change over the course of a dialogue and looking for convergence, we can estimate the amount of influence they had on each other.
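To make the idea of convergence concrete, here is a minimal sketch in Python. It is not the code from the project; it assumes each vowel token has already been reduced to its first two formant frequencies (F1 and F2, in Hz), that a vowel space can be summarized as the mean (F1, F2) position of each vowel, and that the gap between two speakers is the average Euclidean distance over the vowels they share. The function names and the sample numbers are made up for illustration, and a real analysis would likely normalize formants across speakers first.

```python
from math import dist
from statistics import mean

def vowel_space(tokens):
    """Summarize one speaker's tokens as {vowel: (mean F1, mean F2)}.

    Each token is a (vowel_label, f1_hz, f2_hz) tuple (an assumed format).
    """
    grouped = {}
    for vowel, f1, f2 in tokens:
        grouped.setdefault(vowel, []).append((f1, f2))
    return {
        vowel: (mean(p[0] for p in points), mean(p[1] for p in points))
        for vowel, points in grouped.items()
    }

def space_distance(space_a, space_b):
    """Average Euclidean distance between two vowel spaces over shared vowels."""
    shared = space_a.keys() & space_b.keys()
    if not shared:
        raise ValueError("the two speakers share no vowels to compare")
    return mean(dist(space_a[v], space_b[v]) for v in shared)

def convergence(a_early, b_early, a_late, b_late):
    """Early distance minus late distance; positive means the speakers moved closer."""
    early = space_distance(vowel_space(a_early), vowel_space(b_early))
    late = space_distance(vowel_space(a_late), vowel_space(b_late))
    return early - late

# Illustrative values only, not real measurements.
a_early = [("i", 300, 2300), ("a", 700, 1200)]
b_early = [("i", 340, 2200), ("a", 760, 1300)]
a_late = [("i", 310, 2270), ("a", 720, 1230)]
b_late = [("i", 330, 2240), ("a", 740, 1260)]

score = convergence(a_early, b_early, a_late, b_late)
print("converged" if score > 0 else "did not converge", round(score, 1))
```

Splitting the dialogue into just an early and a late window is the simplest possible choice; sliding the windows across the conversation would show when the speakers start to move toward each other.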
Transactive reasoning, in which a speaker builds their argument on the idea of another speaker, has consistently been shown to be an accurate predictor and an important part of learning. By identifying where transactive reasoning has occurred within a discourse, and charting the corresponding changes in the speakers' vowel spaces, we can train models to identify when transactive reasoning is occurring based on speech alone. This research supports the novel idea that you can glean information about how a knowledge base is acquired or transformed solely by identifying relevant changes in the sound waves produced in human speech.
I also worked on a project that dealt with deriving the social context of code switching between English and Bantu languages. In conversations between multilingual people there are often motivations to switch from the dominant, or matrix, language into a second language. The switch can occur for a reason as simple as the speaker looking for a word that they don't know or that doesn't exist in the matrix language. Code switching can also occur to affirm expressions of identity, to highlight a commonality between the speakers, or to exclude any other present party from understanding. Bantu languages are a branch of languages spoken in south and east Africa. While some of them, like Zulu and Swahili, are widespread and well documented, many of their dialects are very low resource. We want to be able to identify surreptitious conversation in low resource Bantu languages. I built topic segmentation models and ran other analyses on English-Bantu corpora in order to help the research group decide which resources would be most helpful in tackling this project.