Alibaba Group’s new AI voice model has outranked major Western competitors on a global benchmark, with particular strength in complex Chinese dialects and accents. Developed by Alibaba’s Tongyi Lab, the Fun-Realtime-TTS-Preview model placed fifth on the global Artificial Analysis Speech Arena leaderboard, making it the only Chinese-engineered system in the top five. The Speech Arena, a San Francisco-based AI evaluation platform, assesses models on speech-to-text conversion, voice understanding, and natural-sounding speech generation. Separately, Alibaba’s Fun-Realtime-ASR model topped the Word Error Rate index with a 1.8% error rate, or roughly two transcription errors per 100 words.

