Alibaba rolls out models for speech recognition, speech synthesis, AI live speech translation, audio captioning, and multilingual OCR.
Abstract: Multi-talker speech recognition (MTASR) faces unique challenges in disentangling and transcribing overlapping speech. To address these challenges, this paper investigates the role of ...
Abstract: The Mixture of Experts (MoE) model is a promising approach for handling code-switching speech recognition (CS-ASR) tasks. However, the existing CS-ASR work on MoE has yet to leverage the ...