How Your Voice Can Reveal If You're Depressed

More people than ever are suffering from mental health problems, with one in six adults managing symptoms associated with mental ill health at any given time.

Depression can manifest itself in many ways, from insomnia to libido loss, tearfulness to anxiety. However, one company has developed an unusual way to tell if someone is depressed: by listening to their voice.

Research has shown that people suffering from depression speak with a reduced frequency range in vowel production, which means that they have a flatter-sounding voice. These measurable traits in someone's voice are called paraverbal features, and are also detectable in other mental illnesses, including post‐traumatic stress disorder.
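To make the idea of a "flatter" voice concrete, here is a minimal sketch of one simple measure: the span between a speaker's lowest and highest pitch, expressed in semitones. This is an illustration only, not the metric used in the research or by any company mentioned here. The function name, the sample pitch values, and the convention that 0 marks an unvoiced frame are all assumptions.

```python
import math

def pitch_range_semitones(f0_hz):
    """Span between the lowest and highest voiced pitch, in semitones."""
    # Assumed convention: a value of 0 marks a frame with no voiced pitch.
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        return 0.0
    # 12 semitones per octave; one octave is a doubling of frequency.
    return 12 * math.log2(max(voiced) / min(voiced))

# Made-up pitch tracks in Hz, for illustration only.
lively = [180, 220, 260, 0, 240, 200]
flat = [195, 200, 205, 0, 198, 202]

print(round(pitch_range_semitones(lively), 2))
print(round(pitch_range_semitones(flat), 2))
```

A narrower span for the second track is what a listener would hear as a more monotone voice.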

Kintsugi, an app start-up based in California, is trying to tap into these tell-tale markers of depression. It is building a machine-learning model that listens to a speaker's voice and estimates where they would fall on the PHQ-9 and GAD-7 scales, which rate the severity of depression (from 0 to 27) and anxiety (from 0 to 21), respectively.
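For readers unfamiliar with the two questionnaires: the PHQ-9 has nine items and the GAD-7 has seven, each answered from 0 to 3, giving totals of 0 to 27 and 0 to 21. The sketch below maps a total score to the commonly used severity bands. It illustrates the scales themselves, not how Kintsugi's model produces its estimates; the function and variable names are hypothetical.

```python
# Upper bound of each band (inclusive) and its label, per the standard cut-offs.
PHQ9_BANDS = [(4, "minimal"), (9, "mild"), (14, "moderate"),
              (19, "moderately severe"), (27, "severe")]
GAD7_BANDS = [(4, "minimal"), (9, "mild"), (14, "moderate"), (21, "severe")]

def severity(score, bands):
    """Return the severity label for a total questionnaire score."""
    max_score = bands[-1][0]
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    for upper, label in bands:
        if score <= upper:
            return label

print(severity(12, PHQ9_BANDS))  # a PHQ-9 total of 12
print(severity(12, GAD7_BANDS))  # a GAD-7 total of 12
```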

Stock image: woman speaks to therapist online. iStock / Getty Images Plus

Users speak regularly about their feelings in the Kintsugi app, talking through pre-set prompts or challenges, and the machine-learning model scores each recording on these scales.

"Kintsugi users are able to passively track PHQ-9 and GAD-7 scores for depression and anxiety in concert with their journal entries over time, allowing individuals to see progress without much more effort than just talking about top-of-mind issues," reads Kintsugi's website.

According to a blog post on its website, Kintsugi was awarded multiple Small Business Innovation Research grants from the National Science Foundation to develop this novel AI software to detect signs of clinical depression and anxiety from short speech clips. The more people who use the app, the better the machine-learning model gets at accurately detecting traits in voices that indicate mental ill health.

"Our neural network model has been trained on tens of thousands of depressed voices. So it can be like a set of psychiatrists, but it's much more sensitive. It can pick it up even when the depression is at mild or moderate levels," co-founder Rima Seiilova-Olson told TechCrunch.

So, what are the potential uses of this technology? According to Kintsugi, the data from the voice clips will help patients get diagnoses quickly, which will allow them to be treated faster if they need it.

Kintsugi isn't the only company with this idea: Cogito and Ellipsis Health have both developed AI systems that analyze the mental health markers of a speaker's voice, with the Ellipsis Health app showing "feasibility in using voice recordings to screen for depression and anxiety among various age groups", according to a study published in April in the journal Frontiers in Psychology.

Kintsugi's Chief Medical Officer, Prentice Tom, is quoted on the company's website as saying: "Real-time data that augments the clinician's ability to improve care and that can be easily embedded in current clinical workflows, such as Kintsugi's voice biomarker tool is a critical component necessary for us to move to a more efficient, quality driven, value based care health system."

The company has also developed an enterprise API (a software intermediary that allows applications to talk to each other) called Kintsugi Voice. It can be integrated into clinical call centers and remote patient monitoring apps, allowing the person on the other end of the call to gauge the speaker's emotional state.

"By uniquely focusing on how people are speaking versus what they say, [it] reduces language bias inherent across socioeconomic class and protects patient privacy by only analyzing features of the voice signal," the company's website states.

One concern facing all of these AI companies is data privacy. However, TechCrunch reported that "entries are encrypted in transit and at rest, but they are also shareable publicly if people are inclined to do that". In Kintsugi's case, natural language processing is not used, meaning that the actual words being spoken are not explicitly recorded, only the tonality of the speaker's voice.

Newsweek has asked Kintsugi for comment.