Software & Data Downloads — TF-Locoformer

This code implements TF-Locoformer, a Transformer-based model with LOcal-modeling by COnvolution for speech enhancement and audio source separation, presented in our IWAENC 2024 paper. We provide training and inference scripts, as well as pretrained models for the WSJ0-2mix, Libri2Mix, WHAMR!, and DNS-Interspeech2020 datasets.

  •  Saijo, K., Wichern, G., Germain, F.G., Pan, Z., Le Roux, J., "TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement", International Workshop on Acoustic Signal Enhancement (IWAENC), September 2024.
    @inproceedings{Saijo2024sep2,
      author = {Saijo, Kohei and Wichern, Gordon and Germain, François G and Pan, Zexu and Le Roux, Jonathan},
      title = {TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement},
      booktitle = {International Workshop on Acoustic Signal Enhancement (IWAENC)},
      year = 2024,
      month = sep,
      url = {https://www.merl.com/publications/TR2024-126}
    }

Access software at https://github.com/merlresearch/tf-locoformer.