W. Wang, A. Kathol, and H. Bratt, “Automatic detection of speaker attributes based on utterance text,” in Proc. Interspeech, 2011, pp. 2361–2364.
In this paper, we present models for detecting various attributes of a speaker based on uttered text alone. These attributes include whether the speaker is speaking his/her native language, the speaker’s age and gender, and the self-reported region of the speaker. We explore various lexical features as well as features inspired by Linguistic Inquiry and Word Count (LIWC) and the Dictionary of Affect in Language (DAL). Overall, the results suggest that when audio data is not available, effective feature sets drawn only from the uttered text, together with system combination of multiple classification algorithms, make it possible to build high-quality statistical models for detecting these speaker attributes, with performance comparable to that of systems that can exploit the audio data.
Index Terms: speaker attributes, machine learning, nativeness, gender, age, region
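
To make the idea of text-only attribute detection with system combination concrete, the following is a minimal sketch, not the authors' implementation. It assumes plain unigram counts as the lexical features, two toy classifiers (a multinomial Naive Bayes and a cosine nearest-centroid classifier), and combination by averaging their per-class scores. The utterances, labels, class names, and function names are all hypothetical and chosen only for illustration; the abstract does not specify the classifiers or the combination scheme used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumed, very simple lexical feature)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class NaiveBayes:
    """Multinomial Naive Bayes over unigrams with add-one smoothing."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.priors = Counter(labels)
        self.counts = {y: Counter() for y in self.labels}
        for text, y in zip(texts, labels):
            self.counts[y].update(tokenize(text))
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict_proba(self, text):
        n = sum(self.priors.values())
        logp = {}
        for y in self.labels:
            total = sum(self.counts[y].values()) + len(self.vocab)
            lp = math.log(self.priors[y] / n)
            for w in tokenize(text):
                if w in self.vocab:  # ignore out-of-vocabulary words
                    lp += math.log((self.counts[y][w] + 1) / total)
            logp[y] = lp
        # Normalize in a numerically stable way.
        m = max(logp.values())
        exp = {y: math.exp(v - m) for y, v in logp.items()}
        z = sum(exp.values())
        return {y: v / z for y, v in exp.items()}


class CentroidClassifier:
    """Scores each class by cosine similarity to its summed unigram vector."""

    def fit(self, texts, labels):
        self.centroids = {y: Counter() for y in sorted(set(labels))}
        for text, y in zip(texts, labels):
            self.centroids[y].update(tokenize(text))
        return self

    def predict_proba(self, text):
        vec = Counter(tokenize(text))
        sims = {y: cosine(vec, c) + 1e-9 for y, c in self.centroids.items()}
        z = sum(sims.values())
        return {y: s / z for y, s in sims.items()}


def combine(models, text):
    """System combination by averaging the per-class scores of all models."""
    avg = Counter()
    for model in models:
        for y, p in model.predict_proba(text).items():
            avg[y] += p / len(models)
    return max(avg, key=avg.get), dict(avg)


# Hypothetical toy data for a nativeness task (not taken from the paper).
TEXTS = [
    "i would like to book a flight",
    "could you tell me the time please",
    "i am want book the flight",
    "please to tell me what time is",
]
LABELS = ["native", "native", "nonnative", "nonnative"]

MODELS = [NaiveBayes().fit(TEXTS, LABELS), CentroidClassifier().fit(TEXTS, LABELS)]

if __name__ == "__main__":
    label, scores = combine(MODELS, "i would like to book a flight")
    print(label, scores)
```

Averaging scores is only one of several possible combination schemes; voting or a trained meta-classifier would fit the same interface. A realistic system would also need far more data and richer features than this toy setup uses.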