NEWS    MERL's seamless speech recognition technology featured in Mitsubishi Electric Corporation press release

Date released: February 22, 2019


  • Date:

    February 13, 2019

  • Where:

    Tokyo, Japan

  • Description:

    Mitsubishi Electric Corporation announced that it has developed the world's first technology capable of highly accurate multilingual speech recognition without being informed which language is being spoken. The novel technology, Seamless Speech Recognition, incorporates Mitsubishi Electric's proprietary Maisart compact AI technology and is built on a single system that simultaneously identifies the spoken language and recognizes the speech. In tests involving five languages, the system achieved over 90 percent recognition accuracy without being told which language was being spoken. When five more lower-resource languages were added, accuracy remained above 80 percent. The technology can also recognize multiple people speaking simultaneously, whether in the same language or in different languages. A live demonstration involving a multilingual airport guidance system took place on February 13 in Tokyo, Japan. It was widely covered by the Japanese media, with reports by all six main Japanese TV stations and multiple articles in print and online newspapers, including Japan's top newspaper, Asahi Shimbun. The technology is based on recent research by MERL's Speech and Audio team.

    Link:

    Mitsubishi Electric Corporation Press Release

    Media Coverage:

    NHK, News (Japanese)
    NHK World, News (English), video report (starting at 4'38")
    TV Asahi, ANN news (Japanese)
    Nippon TV, News24 (Japanese)
    Fuji TV, Prime News Alpha (Japanese)
    TV Tokyo, World Business Satellite (Japanese)
    TV Tokyo, Morning Satellite (Japanese)
    TBS, News, N Studio (Japanese)
    The Asahi Shimbun (Japanese)
    The Nikkei Shimbun (Japanese)
    Nikkei xTech (Japanese)
    Response (Japanese)

  • Research Area:

    Speech & Audio

    •  Seki, H., Hori, T., Watanabe, S., Le Roux, J., Hershey, J., "A Purely End-to-end System for Multi-speaker Speech Recognition", arXiv, July 2018.
      @article{Seki2018jul2,
        author = {Seki, Hiroshi and Hori, Takaaki and Watanabe, Shinji and Le Roux, Jonathan and Hershey, John},
        title = {A Purely End-to-end System for Multi-speaker Speech Recognition},
        journal = {arXiv},
        year = 2018,
        month = jul,
        url = {https://arxiv.org/abs/1805.05826}
      }
    •  Seki, H., Watanabe, S., Hori, T., Le Roux, J., Hershey, J.R., "An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2018.8462180, April 2018, pp. 4919-4923.
      @inproceedings{Seki2018apr,
        author = {Seki, Hiroshi and Watanabe, Shinji and Hori, Takaaki and Le Roux, Jonathan and Hershey, John R.},
        title = {An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2018,
        pages = {4919--4923},
        month = apr,
        doi = {10.1109/ICASSP.2018.8462180},
        url = {https://www.merl.com/publications/TR2018-002}
      }
    •  Settle, S., Le Roux, J., Hori, T., Watanabe, S., Hershey, J.R., "End-to-End Multi-Speaker Speech Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2018.8461893, April 2018, pp. 4819-4823.
      @inproceedings{Settle2018apr,
        author = {Settle, Shane and Le Roux, Jonathan and Hori, Takaaki and Watanabe, Shinji and Hershey, John R.},
        title = {End-to-End Multi-Speaker Speech Recognition},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2018,
        pages = {4819--4823},
        month = apr,
        doi = {10.1109/ICASSP.2018.8461893},
        url = {https://www.merl.com/publications/TR2018-001}
      }
    •  Watanabe, S., Hori, T., Kim, S., Hershey, J.R., Hayashi, T., "Hybrid CTC/Attention Architecture for End-to-End Speech Recognition", IEEE Journal of Selected Topics in Signal Processing, DOI: 10.1109/JSTSP.2017.2763455, Vol. 11, No. 8, pp. 1240-1253, October 2017.
      @article{Watanabe2017oct,
        author = {Watanabe, Shinji and Hori, Takaaki and Kim, Suyoun and Hershey, John R. and Hayashi, Tomoki},
        title = {Hybrid CTC/Attention Architecture for End-to-End Speech Recognition},
        journal = {IEEE Journal of Selected Topics in Signal Processing},
        year = 2017,
        volume = 11,
        number = 8,
        pages = {1240--1253},
        month = oct,
        doi = {10.1109/JSTSP.2017.2763455},
        issn = {1941-0484},
        url = {https://www.merl.com/publications/TR2017-190}
      }
    •  Watanabe, S., Hori, T., Hershey, J.R., "Language Independent End-to-End Architecture For Joint Language and Speech Recognition", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), DOI: 10.1109/ASRU.2017.8268945, December 2017.
      @inproceedings{Watanabe2017dec,
        author = {Watanabe, Shinji and Hori, Takaaki and Hershey, John R.},
        title = {Language Independent End-to-End Architecture For Joint Language and Speech Recognition},
        booktitle = {IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)},
        year = 2017,
        month = dec,
        doi = {10.1109/ASRU.2017.8268945},
        url = {https://www.merl.com/publications/TR2017-182}
      }
    •  Hori, T., Watanabe, S., Zhang, Y., Chan, W., "Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM", Interspeech, August 2017.
      @inproceedings{Hori2017aug,
        author = {Hori, Takaaki and Watanabe, Shinji and Zhang, Yu and Chan, William},
        title = {Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM},
        booktitle = {Interspeech},
        year = 2017,
        month = aug,
        url = {https://www.merl.com/publications/TR2017-132}
      }
    •  Hori, T., Watanabe, S., Hershey, J.R., "Joint CTC/attention decoding for end-to-end speech recognition", Association for Computational Linguistics (ACL), DOI: 10.18653/v1/P17-1048, July 2017, pp. 518-529.
      @inproceedings{Hori2017jul,
        author = {Hori, Takaaki and Watanabe, Shinji and Hershey, John R.},
        title = {Joint CTC/attention decoding for end-to-end speech recognition},
        booktitle = {Association for Computational Linguistics (ACL)},
        year = 2017,
        pages = {518--529},
        month = jul,
        doi = {10.18653/v1/P17-1048},
        url = {https://www.merl.com/publications/TR2017-103}
      }
    •  Kim, S., Hori, T., Watanabe, S., "Joint CTC-Attention Based End-to-End Speech Recognition Using Multi-task Learning", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2017.
      @inproceedings{Kim2017mar,
        author = {Kim, Suyoun and Hori, Takaaki and Watanabe, Shinji},
        title = {Joint CTC-Attention Based End-to-End Speech Recognition Using Multi-task Learning},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2017,
        month = mar,
        url = {https://www.merl.com/publications/TR2017-016}
      }
    •  Seki, H., Hori, T., Watanabe, S., Le Roux, J., Hershey, J., "A Purely End-to-end System for Multi-speaker Speech Recognition", Annual Meeting of the Association for Computational Linguistics (ACL), July 2018, pp. 2620-2630.
      @inproceedings{Seki2018jul,
        author = {Seki, Hiroshi and Hori, Takaaki and Watanabe, Shinji and Le Roux, Jonathan and Hershey, John},
        title = {A Purely End-to-end System for Multi-speaker Speech Recognition},
        booktitle = {Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = 2018,
        pages = {2620--2630},
        month = jul,
        publisher = {Association for Computational Linguistics},
        url = {https://www.merl.com/publications/TR2018-104}
      }