TR2004-082

Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification

- Xiong, Z., Radhakrishnan, R., Divakaran, A., Huang, T.S., "Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification", IEEE International Conference on Multimedia and Expo (ICME), DOI: 10.1109/ICME.2003.1221332, July 2003, vol. 3, pp. 397-400.
BibTeX
@inproceedings{Xiong2003jul2,
author = {Xiong, Z. and Radhakrishnan, R. and Divakaran, A. and Huang, T.S.},
title = {{Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification}},
booktitle = {IEEE International Conference on Multimedia and Expo (ICME)},
year = 2003,
volume = 3,
pages = {397--400},
month = jul,
doi = {10.1109/ICME.2003.1221332},
url = {https://www.merl.com/publications/TR2004-082}
}

Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

We present a comparison of 6 methods for classification of sports audio. For the feature extraction we have two choices: MPEG-7 audio features and Mel-scale Frequency Cepstrum Coefficients (MFCC). For the classification we also have two choices: Maximum Likelihood Hidden Markov Models (ML-HMM) and Entropic Prior HMM (EP-HMM). EP-HMM, in turn, has two variations: with and without trimming of the model parameters. We thus have 6 possible methods, each of which corresponds to one combination of feature set and classifier. Our results show that all the combinations achieve classification accuracy of around 90%, with the best and the second best being MPEG-7 features with EP-HMM and MFCC with ML-HMM, respectively.
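
The count of 6 methods follows from pairing each of the 2 feature sets with each of the 3 classifier variants (ML-HMM, plus EP-HMM with and without trimming). A minimal sketch of that enumeration is given below; the labels come from the abstract, while the variable and function names are illustrative assumptions and no feature extraction or HMM training is performed.

```python
from itertools import product

# Labels taken from the abstract; names below are hypothetical, for illustration only.
FEATURES = ["MPEG-7", "MFCC"]
CLASSIFIERS = [
    "ML-HMM",
    "EP-HMM (with trimming)",
    "EP-HMM (without trimming)",
]


def enumerate_methods(features, classifiers):
    """Return every (feature set, classifier) pairing as a list of tuples."""
    return list(product(features, classifiers))


methods = enumerate_methods(FEATURES, CLASSIFIERS)

# List the combinations: 2 feature sets x 3 classifier variants.
for index, (feature, classifier) in enumerate(methods, start=1):
    print(f"{index}. {feature} features + {classifier}")
```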

Related News & Events

NEWS ICME 2003: 7 publications by Chia Shen, Anthony Vetro, Ajay Divakaran and Huifang Sun
Date: July 6, 2003
Where: IEEE International Conference on Multimedia and Expo (ICME)
MERL Contacts: Anthony Vetro; Huifang Sun
Brief
- The papers "Multi-Camera Calibration, Object Tracking and Query Generation" by Porikli, F.M. and Divakaran, A., "Unsupervised Discovery of Multilevel Statistical Video Structures Using Hierarchical Hidden Markov Models" by Xie, L., Chang, S.-F., Divakaran, A. and Sun, H., "FGS Enhancement Layer Truncation with Minimized Intra-Frame Quality Variation" by Zhou, J., Shao, H.-R., Shen, C. and Sun, M.-T., "Object-Based Coding for Long-Term Archive of Surveillance Video" by Vetro, A., Haga, T., Sumi, K. and Sun, H., "Rate Allocation for FGS-Coded Video Using Composite Rate-Distortion Analysis" by Cheng, H., Zhang, X.M., Shi, Y.Q., Vetro, A. and Sun, H., "Audio Events Detection Based Highlights Extraction from Baseball, Golf and Soccer Games in a Unified Framework" by Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T.S. and "Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification" by Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T.S. were presented at the IEEE International Conference on Multimedia and Expo (ICME).
NEWS ICASSP 2003: 6 publications by Anthony Vetro and Ajay Divakaran
Date: April 6, 2003
Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
MERL Contact: Anthony Vetro
Brief
- The papers "Lossless Compression of Language Model Structure and Word Identifiers" by Raj, B. and Whittaker, E.W.D., "Rate-Distortion Analysis of the Multiple Description Motion Compensation Video Coding Scheme" by Lin, S., Vetro, A. and Wang, Y., "Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification" by Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T.S., "Audio Events Detection Based Highlights Extraction from Baseball, Golf and Soccer Games in a Unified Framework" by Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T.S., "Multi-Channel Source Separation by Factorial HMMs" by Reyes-Gomez, M.J., Raj, B. and Ellis, D.P.W. and "Tracking Noise via Dynamical Systems with a Continuum of States" by Singh, R. and Raj, B. were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

Related Publication

Xiong, Z., Radhakrishnan, R., Divakaran, A., Huang, T.S., "Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2003.1200048, April 2003, vol. 5, pp. 628-631.

BibTeX

@inproceedings{Xiong2003apr1,
author = {Xiong, Z. and Radhakrishnan, R. and Divakaran, A. and Huang, T.S.},
title = {{Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification}},
booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
year = 2003,
volume = 5,
pages = {628--631},
month = apr,
doi = {10.1109/ICASSP.2003.1200048},
issn = {1520-6149},
url = {https://ieeexplore.ieee.org/document/1200048}
}
