Emotion Recognition and Stress Disorder Detection from Speech using Sparse Coding and Advanced Deep Learning Techniques

Chappidi Suneetha; Bhavsingh Maloth; Addepalli Lavanya

doi:10.22362/ijcert/2025/v12/i4/v12i402

Published: Apr 30, 2025

Keywords:

Sparse coding, Deep learning, Neural network, CNN, KT, Emotion recognition

Abstract

Human emotions can be read from a person's face, words, actions (gesture or posture), or even heart rate. Recent advances in machine learning and data fusion make it possible to equip computers with the ability to comprehend, identify, and evaluate human sentiment. Recognizing emotional states and diagnosing stress disorders from speech signals have both been research concerns over the past decade. Emotion recognition based on multichannel neurophysiological inputs is an increasingly useful computer-aided method for identifying emotional disorders, and it remains a difficult pattern recognition challenge. Conventional fusion techniques underutilize the correlation information between channels and frequency components. This paper shows that deep neural networks trained on emotion data can align with prior domain knowledge and acquire representations that are more accurate than those obtained with hand-crafted techniques. The work focuses on emotional state identification and develops the proposed Sparse Coding Technique-Deep Learning (SCT-DL) network models. This is done through two methods. The first is the Convolutional-Recurrent Neural Network (CR-NN), a deep learning model that can extract task-related characteristics, extract correlated data between channels, and incorporate the contextual information gained from this analysis. Because deep belief networks are complex, limited data sets such as the voice database are incompatible with this type of model. Hence the second method, Knowledge Transmission (KT), is implemented to deal with the issue of limited data. Its purpose is to enhance learning by drawing information from multiple source tasks and applying it to a single target task. The proposed models have been shown, statistically and experimentally, to be more effective than most current state-of-the-art techniques for recognizing emotional states.
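
The title and keywords name sparse coding, but the abstract does not spell out a formulation. As a point of reference only, the following is a minimal Python sketch of generic sparse coding by iterative shrinkage-thresholding (ISTA), which looks for a code with few non-zero entries that reconstructs a signal from dictionary atoms. It is not the SCT-DL model. The function names, the toy dictionary `D`, the signal `x`, and the parameters `lam` and `step` are all assumptions made for illustration.

```python
# Generic sparse coding via ISTA (iterative shrinkage-thresholding).
# Illustrative only: names, data, and parameters are hypothetical and
# do not come from the paper.

def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(D, x, lam=0.1, step=0.1, iters=500):
    """Approximately minimise 0.5*||x - D a||^2 + lam*||a||_1 over a.

    D is a list of rows (len(x) rows, one column per atom).
    step should not exceed 1 / (largest eigenvalue of D^T D).
    """
    n_rows, n_atoms = len(D), len(D[0])
    a = [0.0] * n_atoms
    for _ in range(iters):
        # Residual r = D a - x
        r = [sum(D[i][j] * a[j] for j in range(n_atoms)) - x[i]
             for i in range(n_rows)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n_rows))
             for j in range(n_atoms)]
        # Gradient step followed by soft-thresholding
        a = [soft_threshold(a[j] - step * g[j], step * lam)
             for j in range(n_atoms)]
    return a

# Toy overcomplete dictionary: three atoms (columns) in a 2-D signal space.
D = [[1.0, 0.0, 0.7071],
     [0.0, 1.0, 0.7071]]
x = [2.0, 0.0]
code = ista(D, x, lam=0.1, step=0.4)
```

For this toy input the code settles on the first atom alone, with a coefficient slightly below 2 because the L1 penalty shrinks it. In a speech setting the signal would be a feature frame and the dictionary would be learned from data, neither of which is shown here.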

How to Cite

[1]

Chappidi Suneetha, Bhavsingh Maloth, and Addepalli Lavanya, “Emotion Recognition and Stress Disorder Detection from Speech using Sparse Coding and Advanced Deep Learning Techniques”, Int. J. Comput. Eng. Res. Trends, vol. 12, no. 4, pp. 12–26, Apr. 2025.

Issue

Vol. 12 No. 4 (2025): April (2025) Issue

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

IJCERT Policy:

The published work presented in this paper is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This means that the content of this paper can be shared, copied, and redistributed in any medium or format, as long as the original author is properly attributed. Additionally, any derivative works based on this paper must also be licensed under the same terms. This licensing agreement allows for broad dissemination and use of the work while maintaining the author's rights and recognition.

By submitting this paper to IJCERT, the author(s) agree to these licensing terms and confirm that the work is original and does not infringe on any third-party copyright or intellectual property rights.
