Speech Emotion Recognition with Machine Learning
Keywords:
Emotion, machine learning, speech recognition

Abstract
Recognising a person's emotion from their speech is
called Speech Emotion Recognition (SER). It enhances
interactivity between people and machines. Although
annotating audio is difficult because emotions are
subjective, SER makes it possible to predict a person's
emotional state; this is the same principle that dogs,
elephants, horses, and other animals use to understand
human emotion. There are several cues to a person's
emotional condition, including behaviour, facial
expression, pitch, and tone, and some of these can be
exploited to recognise emotion from speech. Classifiers
are trained to recognise speech emotions from a limited
number of data points. This study uses the Ryerson
Audio-Visual Database of Emotional Speech and Song
(RAVDESS) dataset, from which three essential features
are extracted: chroma, the Mel spectrogram, and Mel
Frequency Cepstral Coefficients (MFCC).