Using Prosodic and Spectral Features in Detecting Depression in Elderly Males

Citation

M. H. Sanchez, D. Vergyri, L. Ferrer, C. Richey, P. Garcia, B. Knoth and W. Jarrold, “Using prosodic and spectral features in detecting depression in elderly males,” in Proc. Interspeech, 2011, pp. 3001–3004.

Abstract

As research in speech processing has matured, there has been much interest in paralinguistic speech processing problems, including assessing the speaker’s mental and psychological health. In this study, we focus on speech features that can identify the speaker’s emotional health, i.e., whether the speaker is depressed or not. We use prosodic speech measurements, such as pitch and energy, in addition to spectral features, such as formants and spectral tilt, and compute statistics of these features over different regions of the speech signal. These statistics are used as input features to a discriminative classifier that predicts the speaker’s depression state. We find that with an N-fold leave-one-out cross-validation setup, we can achieve a prediction accuracy of 81.3%, where random guessing yields 50%.
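The pipeline outlined in the abstract (frame-level measurements, summary statistics, a classifier, leave-one-out evaluation) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the pitch and energy tracks are synthetic, the choice of statistics is a guess, and a nearest-centroid rule stands in for the discriminative classifier, which the abstract does not name. All function names are hypothetical.

```python
import math
import random
import statistics


def summarize(track):
    """Collapse one frame-level track (e.g. pitch per frame) into summary statistics."""
    return [statistics.mean(track), statistics.pstdev(track), min(track), max(track)]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (an assumption): pick the class whose mean vector is closest to x."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [statistics.mean(col) for col in zip(*rows)]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Hold out each speaker once, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


def synthetic_speaker(rng, label):
    """Synthetic pitch and energy tracks. The class difference is invented purely so
    the example has something to separate; it is not a claim about real speech."""
    pitch_sd = 25.0 if label == 0 else 10.0
    pitch = [rng.gauss(115.0, pitch_sd) for _ in range(200)]
    energy = [rng.gauss(60.0, 5.0) for _ in range(200)]
    return summarize(pitch) + summarize(energy)


rng = random.Random(0)
labels = [0, 1] * 8  # 16 synthetic speakers, balanced so chance is 50%
features = [synthetic_speaker(rng, y) for y in labels]
accuracy = leave_one_out_accuracy(features, labels)
print(f"leave-one-out accuracy: {accuracy:.3f}")
```

With balanced classes, as in this toy setup, a classifier that guesses at random would score about 50%, which is the baseline the abstract compares against.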

