TR2005-078
Layered Dynamic Mixture Model for Pattern Discovery in Asynchronous Multi-Modal Streams
Xie, L., Kennedy, L., Chang, S.-F., Divakaran, A., Sun, H. and Lin, C.-Y., "Layered Dynamic Mixture Model for Pattern Discovery in Asynchronous Multi-Modal Streams", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005, vol. 2, pp. 1053-1056.
@inproceedings{Xie2005mar,
  author = {Xie, L. and Kennedy, L. and Chang, S.-F. and Divakaran, A. and Sun, H. and Lin, C.-Y.},
  title = {Layered Dynamic Mixture Model for Pattern Discovery in Asynchronous Multi-Modal Streams},
  booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
  year = 2005,
  volume = 2,
  pages = {1053--1056},
  month = mar,
  issn = {1520-6149},
  url = {https://www.merl.com/publications/TR2005-078}
}
Abstract:
We propose a layered dynamic mixture model for asynchronous multi-modal fusion for unsupervised pattern discovery in video. The lower layer of the model uses generative temporal structures such as a hierarchical hidden Markov model to convert the audio-visual streams into mid-level labels; it also models the correlations in text with probabilistic latent semantic analysis. The upper layer fuses the statistical evidence across diverse modalities with a flexible meta-mixture model that assumes loose temporal correspondence. Evaluation on a large news database shows that multi-modal clusters have better correspondence to news topics than audio-visual clusters alone; novel analysis techniques suggest that meaningful clusters occur when the prediction of salient features by the model concurs with those shown in the story clusters.
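The abstract names probabilistic latent semantic analysis (PLSA) as the text model in the lower layer. As a rough illustration of that general technique only, the following is a minimal sketch of textbook PLSA fitted by EM on a toy document-by-word count matrix. It is not the authors' implementation: the function names (`plsa`, `log_likelihood`, `normalize`), the toy counts, the number of topics, and the small smoothing constant are all assumptions made for this example.

```python
import math
import random


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-by-word count matrix (list of lists).

    Returns (p_w_z, p_z_d), where p_w_z[z][w] approximates P(w | z)
    and p_z_d[d][z] approximates P(z | d).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation, normalised into distributions.
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Tiny constant (an assumption) keeps every probability positive.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: posterior P(z | d, w) for this (document, word) pair.
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # M-step accumulation: expected counts per topic.
                for z in range(n_topics):
                    new_w_z[z][w] += n_dw * post[z]
                    new_z_d[d][z] += n_dw * post[z]
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Log-likelihood of the counts under P(w | d) = sum_z P(w | z) P(z | d)."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n_dw in enumerate(row):
            if n_dw:
                p = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
                total += n_dw * math.log(p)
    return total


if __name__ == "__main__":
    # Hypothetical toy corpus: 4 documents over a 4-word vocabulary.
    toy_counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
    word_given_topic, topic_given_doc = plsa(toy_counts, n_topics=2)
    print(log_likelihood(toy_counts, word_given_topic, topic_given_doc))
```

In a layered setup like the one described, per-document topic posteriors such as `p_z_d` could serve as the text-side evidence passed to an upper fusion layer; how the paper actually does this is not stated in the abstract.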
Related News & Events
NEWS: ICASSP 2005: 4 publications by Anthony Vetro, Ajay Divakaran, Huifang Sun and others
Date: March 18, 2005
Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
MERL Contacts: Anthony Vetro; Huifang Sun
Brief: The papers "Fast Adaptive Fuzzy Post-Filtering for Coding Artifacts Removal in Interlaced Video" by Nie, Y., Kong, H.-S., Vetro, A. and Barner, K., "Video Coding Using 3-D Dual-Tree Discrete Wavelet Transform" by Wang, B., Wang, Y., Selesnick, I. and Vetro, A., "A Companding Front End for Noise-Robust Automatic Speech Recognition" by Guinness, J., Raj, B., Schmidt-Nielsen, B., Turicchia, L. and Sarpeshkar, R. and "Layered Dynamic Mixture Model for Pattern Discovery in Asynchronous Multi-Modal Streams" by Xie, L., Kennedy, L., Chang, S.-F., Divakaran, A., Sun, H. and Lin, C.-Y. were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).