SRI’s AI-driven voice analysis could help screen for mental health conditions

Researchers at SRI are developing tools to help clinicians keep a close eye on depression, PTSD, and other mental health issues.

The way we speak can reveal a lot about our mental and emotional health. Signs of depression, post-traumatic stress, or suicidality can show up in the words and tone we use, how we present our ideas, or even in how our voice sounds as it passes by stressed muscles in our neck and throat.

Researchers at SRI have been developing an AI tool that could help analyze voice and assess a person’s mental state. The tool could provide additional information to clinicians making diagnoses and make it easier to screen people who may normally be hard to reach, such as those deployed in the military.

“Speech is a non-invasive way of collecting information and you don’t need access to high-quality equipment—it can be done over the phone,” said Dimitra Vergyri, director of the Speech Technology and Research Laboratory at SRI. “This could be a monitoring tool to provide information longitudinally, frequently, and at a low cost.”

Identifying emotional tones

Psychologists note that people with depression tend to speak in quieter, monotone voices and pause often. For those with anxiety, the tension affects the tone of voice and the pace of breathing.

Researchers are working towards identifying voice markers for post-traumatic stress disorder (PTSD), traumatic brain injury, and other conditions. While many of these markers overlap, Vergyri and her colleagues have used machine learning techniques to identify patterns across voice samples from multiple speakers and determine which features are correlated with a specific condition.

This tool is intended to be used by clinicians, Vergyri said. Her team’s goal is to use AI to provide quantifiable metrics that an expert can use, which means the tool needs to do more than simply provide a score—it needs to be transparent and explainable.

“If the tool’s scores indicate a high risk for a certain condition, clinicians want to know what the indicators are,” Vergyri said. “We’re trying to provide tangible information that can be useful to experts when they diagnose and provide treatment.”

Tools that spot inflections and change

The researchers also see the potential of the tool to provide frequent, long-term monitoring to help non-experts spot changes in someone’s mental health condition. If a person is taking medication for depression, for example, regular voice analysis could provide ongoing data about their condition. Under normal circumstances, they might only speak with a clinician occasionally, but this tool could help non-experts identify when it’s time to seek expert help because, say, a medication is not effective or something else has changed.

“This tool can also help caregivers feel confident that their loved ones are doing well in their daily lives. It’s just not feasible to talk to a clinician every day, or every week, especially if a person is far away or deployed overseas, for example,” Vergyri said. “This tool could provide consistent monitoring to help people receive the support they need.”

Using voice recordings, the team has demonstrated that their algorithms can objectively identify features of various mental health conditions. They published a paper showing that the algorithms can differentiate between patients with and without PTSD. They are continuing to collaborate with other groups to combine speech and video analysis into an even more comprehensive system.

“Speech is only one piece of the puzzle,” Vergyri said. “We’re all continuously aiming to improve our technology and get to the point where these tools can help people.”
