SVM-BASED EMOTION RECOGNITION FROM SPEECH WITH GTCC AND FREQUENCY FEATURES
DOI: https://doi.org/10.22190/FUACR250210003V
Print ISSN: 1820-6417
Online ISSN: 1820-6425