Emotion Recognition and Stress Disorder Detection from Speech using Sparse Coding and Advanced Deep Learning Techniques

Chappidi Suneetha
Bhavsingh Maloth
Addepalli Lavanya

Abstract

Human emotions can be read from a person's face, words, actions (gesture and posture), or even heart rate. Thanks to recent advances in machine learning and data fusion, computers can now be equipped to comprehend, identify, and evaluate human sentiment. Recognizing emotional states and diagnosing stress disorders from speech signals have both been active concerns over the past decade. Emotion recognition based on multichannel neurophysiological inputs is an increasingly useful computer-aided method for identifying emotional disorders, but it remains a difficult pattern recognition challenge. Conventional fusion techniques underuse the correlation information between channels and frequency components. This paper shows that deep neural networks trained on emotion data can align with prior domain knowledge and learn representations that are more accurate than those obtained with hand-crafted techniques. The work focuses on emotional state identification and develops the proposed Sparse Coding Technique-Deep Learning (SCT-DL) network models, which comprise two methods. The first, the Convolutional-Recurrent Neural Network (CR-NN), is a deep learning model that extracts task-related features, captures correlations between channels, and incorporates the contextual information gained from this analysis. Because deep belief networks are complex, limited data sets such as the voice database are incompatible with this type of model. The second method, Knowledge Transmission (KT), is therefore implemented to deal with limited data. Its purpose is to enhance learning by drawing information from multiple source tasks and applying it to a single target task. The proposed models have been shown, statistically and experimentally, to be more effective than most current state-of-the-art techniques for recognizing emotional states.
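The abstract describes the CR-NN only at a high level, so the following is a minimal sketch of the general convolutional-recurrent idea and not the authors' SCT-DL implementation. It assumes the input is a sequence of frames with one value per channel, uses untrained random weights, and relies on hypothetical names (`conv1d_over_time`, `recurrent_summary`, `crnn_forward`) and arbitrary layer sizes chosen for illustration. The convolution mixes all channels inside a short time window (cross-channel correlation), and the recurrence carries context across windows.

```python
import math
import random


def conv1d_over_time(frames, kernels):
    """Slide each kernel over the time axis, mixing all channels per window.

    frames:  list of T frames, each a list of C channel values.
    kernels: list of K kernels, each a list of W rows holding C weights.
    Returns T - W + 1 feature vectors of length K, with ReLU applied.
    """
    width = len(kernels[0])
    features = []
    for t in range(len(frames) - width + 1):
        window = frames[t:t + width]
        row = []
        for kernel in kernels:
            total = sum(
                w * x
                for weights, frame in zip(kernel, window)
                for w, x in zip(weights, frame)
            )
            row.append(max(0.0, total))  # ReLU non-linearity
        features.append(row)
    return features


def recurrent_summary(features, w_in, w_rec):
    """Elman-style recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    hidden = [0.0] * len(w_rec)
    for x in features:
        hidden = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wr * hk for wr, hk in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def random_matrix(rows, cols, rng, scale=0.5):
    """Illustrative random weights; a real model would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def crnn_forward(frames, n_kernels=4, width=3, n_hidden=5, seed=0):
    """Untrained forward pass: convolution over time, then recurrence."""
    rng = random.Random(seed)
    n_channels = len(frames[0])
    kernels = [random_matrix(width, n_channels, rng) for _ in range(n_kernels)]
    w_in = random_matrix(n_hidden, n_kernels, rng)
    w_rec = random_matrix(n_hidden, n_hidden, rng)
    features = conv1d_over_time(frames, kernels)
    return features, recurrent_summary(features, w_in, w_rec)


# Synthetic 3-channel signal with 10 frames, used only to exercise the sketch.
frames = [[math.sin(0.3 * t + c) for c in range(3)] for t in range(10)]
features, hidden = crnn_forward(frames)
```

In a trained model the final hidden state would feed a classifier over emotion labels. Under the same assumptions, a transfer-style setup in the spirit of KT could reuse the convolution and recurrence weights learned on source tasks and fit only that final classifier on the small target data set.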

How to Cite
Chappidi Suneetha, Bhavsingh Maloth, and Addepalli Lavanya, “Emotion Recognition and Stress Disorder Detection from Speech using Sparse Coding and Advanced Deep Learning Techniques”, Int. J. Comput. Eng. Res. Trends, vol. 12, no. 4, pp. 12–26, Apr. 2025.

Section: Research Articles
