AI model combines speech and brain activity to diagnose depression

Depression is one of the most common mental illnesses. As many as 280 million people worldwide are affected by this disease, which is why researchers at Kaunas University of Technology (KTU) have developed an artificial intelligence (AI) model that helps to identify depression based on both speech and brain neural activity.


This multimodal approach, combining two different data sources, allows a more accurate and objective analysis of a person's emotional state, opening the door to a new phase of depression diagnosis.

"Depression is one of the most common mental disorders, with devastating consequences for both the individual and society, so we are developing a new, more objective diagnostic method that could become accessible to everyone in the future."

Rytis Maskeliūnas, professor at KTU and one of the authors of the invention

Scientists argue that while most diagnostic research for depression has traditionally relied on a single type of data, the new multimodal approach can provide better information about a person's emotional state.

Impressive accuracy using voice and brain activity data

This combination of speech and brain activity data achieved an impressive 97.53 per cent accuracy in diagnosing depression, significantly outperforming alternative methods. "This is because the voice adds data to the study that we cannot yet extract from the brain," explains Maskeliūnas.

According to Musyyab Yousufi, a KTU PhD student who contributed to the invention, the choice of data was carefully considered: "While it is believed that facial expressions might reveal more about a person's psychological state, this is quite easily falsifiable data. We chose voice because it can subtly reveal an emotional state, with the diagnosis affecting the pace of speech, intonation, and overall energy."

In addition, unlike electrical brain activity (EEG) or voice data, the face can directly reveal the severity of a person's state to a certain extent. "But we cannot violate patients' privacy, and also, collecting and combining data from several sources is more promising for further use," says the professor at the KTU Faculty of Informatics (IF).

Maskeliūnas emphasises that the EEG dataset used was obtained from the Multimodal Open Dataset for Mental Disorder Analysis (MODMA), as the KTU research group represents computer science rather than medical science. The MODMA EEG data were recorded for five minutes while participants were awake, at rest, and with their eyes closed. In the audio part of the experiment, the patients took part in a question-and-answer session and several activities focused on reading and describing pictures to capture their natural language and cognitive state.

AI will need to learn how to justify the diagnosis

The collected EEG and audio signals were transformed into spectrograms, allowing the data to be visualised. Special noise filters and pre-processing methods were applied to make the data noise-free and comparable, and a modified DenseNet-121 deep-learning model was used to identify signs of depression in the images. Each image reflected signal changes over time.
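To give a sense of what turning a signal into a spectrogram involves, here is a minimal sketch in plain Python. It is not the researchers' pipeline: the frame length, hop size, Hann window, and the synthetic 10 Hz test tone are all assumptions chosen for illustration, and real work would normally rely on a signal-processing library rather than a hand-written Fourier transform.

```python
import cmath
import math

FRAME_LEN = 64     # samples per analysis frame (illustrative assumption)
HOP = 32           # step between successive frames (illustrative assumption)
SAMPLE_RATE = 128  # samples per second of the synthetic signal below


def spectrogram(signal, frame_len=FRAME_LEN, hop=HOP):
    """Return a list of time frames, each a list of magnitudes per frequency bin."""
    # A Hann window tapers each frame to reduce spectral leakage at its edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive discrete Fourier transform of one windowed frame.
        magnitudes = []
        for k in range(n_bins):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic two-second 10 Hz sine wave; this is made-up data, not study data.
tone = [math.sin(2 * math.pi * 10 * t / SAMPLE_RATE)
        for t in range(SAMPLE_RATE * 2)]
spec = spectrogram(tone)

# The strongest frequency bin in the first frame, converted to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

Each row of `spec` is one moment in time and each column one frequency band, which is what allows a signal to be drawn as an image and passed to an image-classification model.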

The EEG showed waveforms of brain activity, and the sound showed frequency and intensity distributions.

The model included a custom classification layer trained to split the data into classes of healthy or depressed people. Successful classification was evaluated and then the accuracy of the application was assessed.
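As a rough illustration of how the accuracy of a two-class model of this kind can be measured, the sketch below compares predicted labels with true labels. The labels are invented for the example and the function names are hypothetical; this is not the study's evaluation code or its data.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    correct = sum(truth == pred for truth, pred in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive="depressed"):
    """Count true/false positives and negatives for the chosen positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # correctly flagged
            else:
                fp += 1  # healthy person flagged by mistake
        else:
            if truth == positive:
                fn += 1  # depressed person missed
            else:
                tn += 1  # correctly cleared
    return tp, fp, tn, fn


# Invented labels for eight hypothetical recordings (not study data).
y_true = ["depressed", "depressed", "healthy", "healthy",
          "depressed", "healthy", "healthy", "depressed"]
y_pred = ["depressed", "healthy", "healthy", "healthy",
          "depressed", "healthy", "depressed", "depressed"]

acc = accuracy(y_true, y_pred)                     # 6 of 8 correct
tp, fp, tn, fn = confusion_counts(y_true, y_pred)  # 3, 1, 3, 1
```

Accuracy alone does not show which kind of error a model makes, so missed cases and false alarms are usually reported alongside it.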

In the future, this AI model could speed up the diagnosis of depression, or even make it remote, and reduce the risk of subjective evaluations. This requires further clinical trials and improvements to the programme. However, Maskeliūnas adds that the latter aspect of the research might raise some challenges.

"The main problem with these studies is the lack of data because people tend to remain private about their mental health matters," he says. Another important aspect mentioned by the professor of the KTU Department of Multimedia Engineering is that the algorithm needs to be improved in such a way that it is not only accurate but also provides information to the medical professional on what led to this diagnostic result. "The algorithm still has to learn how to explain the diagnosis in a comprehensible way," says Maskeliūnas.

According to the KTU professor, due to the growing demand for AI solutions that directly affect people in areas such as healthcare, finance, and the legal system, similar requirements are becoming common. This is why explainable artificial intelligence (XAI), which aims to explain to the user why the model makes certain decisions and to increase their trust in the AI, is now gaining momentum.

Kaunas University of Technology

Yousufi, M., et al. (2024). Multimodal Fusion of EEG and Audio Spectrogram for Major Depressive Disorder Recognition Using Modified DenseNet121. Brain Sciences. doi.org/10.3390/brainsci14101018