SRI Authors: Andreas Kathol, Colleen Richey, Dimitra Vergyri, Martin Graciarena
Mitra, V., Shriberg, E., Mitchell, M., Kathol, A., Richey, C., Vergyri, D., Graciarena, M. (2014, November). The SRI AVEC-2014 Evaluation System. Presented at the 22nd ACM International Conference on Multimedia, Orlando, FL.
Though depression is a common mental health problem with significant impact on human society, it often goes undetected. We explore a diverse set of features based only on spoken audio to understand which features correlate with self-reported depression scores according to the Beck depression rating scale. These features, many of which are novel for this task, include (1) estimated articulatory trajectories during speech production, (2) acoustic characteristics, (3) acoustic-phonetic characteristics and (4) prosodic features. Features are modeled using a variety of approaches, including support vector regression, a Gaussian backend and decision trees. We report results on the AVEC-2014 depression dataset and find that individual systems range from 9.18 to 11.87 in root mean squared error (RMSE), and from 7.68 to 9.99 in mean absolute error (MAE). Initial fusion brings further improvement; fusion and feature selection work is still in progress.
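For readers unfamiliar with the two error measures reported above, the following is a minimal sketch of how RMSE and MAE are computed between reference depression scores and a system's predictions. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the AVEC-2014 data or from the SRI system.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


if __name__ == "__main__":
    # Hypothetical self-reported scores and predictions (illustration only).
    reference = [10.0, 22.0, 5.0, 30.0]
    predicted = [12.0, 18.0, 9.0, 24.0]
    print(f"RMSE: {rmse(reference, predicted):.2f}")
    print(f"MAE:  {mae(reference, predicted):.2f}")
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE and is never smaller than MAE on the same predictions, which is consistent with the ranges reported above.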