TR2013-118

A Generalized Discriminative Training Framework for System Combination


    •  Tachioka, Y., Watanabe, S., Le Roux, J., Hershey, J.R., "A Generalized Discriminative Training Framework for System Combination", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), DOI: 10.1109/ASRU.2013.6707703, December 2013, pp. 43-48.
      @inproceedings{Tachioka2013dec,
        author = {Tachioka, Y. and Watanabe, S. and {Le Roux}, J. and Hershey, J.R.},
        title = {A Generalized Discriminative Training Framework for System Combination},
        booktitle = {IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)},
        year = 2013,
        pages = {43--48},
        month = dec,
        doi = {10.1109/ASRU.2013.6707703},
        url = {https://www.merl.com/publications/TR2013-118}
      }
Abstract:

This paper proposes a generalized discriminative training framework for system combination that encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. For combination with a base system to improve performance, a complementary system should perform reasonably well on its own while tending to produce outputs that differ from those of the base system. These two somewhat opposing goals are difficult to balance in conventional heuristic combination approaches; our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge, track 2) and a large-vocabulary continuous speech recognition (LVCSR) task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.