Dr Liumeng Xue | Audio and Speech Processing | Best Researcher Award

Dr Liumeng Xue, The Chinese University of Hong Kong, Shenzhen, China

Liumeng Xue is a postdoctoral researcher at The Chinese University of Hong Kong, Shenzhen, specializing in speech synthesis, voice conversion, and audio processing. With a Ph.D. and M.Eng. from Northwestern Polytechnical University, his research has significantly contributed to the fields of deep learning and machine learning for speech and language processing. Liumeng has collaborated with tech giants such as Microsoft, Tencent AI Lab, and JD AI Lab, enhancing technologies in speech synthesis and voice conversion. His work includes developing open-source platforms like Amphion for audio, music, and speech generation, and contributing to pioneering projects such as the visual analytics of diffusion models for singing voice conversion. Liumeng has published extensively in top conferences and journals, and holds patents in speech synthesis technologies. His innovative contributions continue to advance the field of audio and speech processing, making him a notable figure in this rapidly evolving area.

Publication Profile

Google Scholar

Strengths for the Award

  1. Extensive Research Experience: Liumeng Xue has a robust research background in audio, speech, and language processing, evidenced by his research projects at Microsoft, Tencent AI Lab, JD AI Lab, and The Chinese University of Hong Kong, Shenzhen (CUHKSZ). His focus areas include speech synthesis, voice conversion, deep learning, and machine learning, demonstrating versatility and depth in his field.
  2. Significant Contributions: His involvement in high-impact projects such as “Amphion: Open-source platform for audio, music, and speech generation” and “Visual analytics of diffusion model for singing voice conversion” highlights his ability to lead and innovate. The development of tools like Amphion, which has garnered significant community attention with 4.2k stars on GitHub, showcases his practical impact on the field.
  3. High-Quality Publications: Liumeng has authored and co-authored multiple peer-reviewed articles published in reputable conferences and journals, such as INTERSPEECH, IEEE/ACM Transactions on Audio, Speech, and Language Processing, and Neural Networks. His research has focused on cutting-edge topics like expressive TTS (text-to-speech) and speech synthesis for mixed-lingual and noisy environments, contributing novel insights to the domain.
  4. Recognized by Peers: His role as a reviewer for prestigious conferences and journals (TASLP, INTERSPEECH, ICASSP, ASRU) and his co-organization of IEEE SLT 2024 indicate recognition by the academic community. Additionally, receiving the INTERSPEECH 2022 Travel Grant underscores his standing as an emerging researcher.
  5. Innovative Patents: Liumeng holds patents related to encoder-decoder speech synthesis and controllable speech synthesis, showcasing his contribution to technological advancements and intellectual property in speech processing.

Areas for Improvement

  1. Broader Research Visibility: While Liumeng has significant academic publications and projects, there could be a greater emphasis on interdisciplinary collaboration and outreach beyond traditional speech and audio processing circles. Engaging in cross-disciplinary research could enhance the applicability of his work in diverse domains like healthcare, education, and entertainment.
  2. Leadership in Larger Collaborative Projects: Although he has demonstrated leadership in developing tools and research projects, leading larger, more collaborative international projects or initiatives could further enhance his profile as a global leader in his field. This could involve coordinating large-scale research efforts or leading multi-institutional grants.

Education 

Liumeng Xue completed his Doctor of Philosophy in 2023 at Northwestern Polytechnical University in Xi’an, China, specializing in speech synthesis and audio processing. His doctoral research focused on deep learning approaches to improve naturalness and expressiveness in speech synthesis. Prior to this, he earned his Master of Engineering from the same institution in 2017, where he laid the groundwork for his interest in audio and language processing technologies. During his master’s studies, Liumeng delved into topics like voice conversion and mixed-lingual text-to-speech synthesis, developing skills in machine learning and speech processing algorithms. His academic journey has been marked by a strong foundation in engineering principles and a focus on applying these to solve complex problems in speech and audio technology, preparing him for his current role as a postdoctoral researcher at The Chinese University of Hong Kong, Shenzhen, where he continues to explore innovative solutions in speech and audio processing.

Experience 

Liumeng Xue has extensive experience in speech synthesis and audio processing, having worked with leading technology companies and research institutions. Since 2023, he has been a postdoctoral researcher at The Chinese University of Hong Kong, Shenzhen, focusing on advanced speech and audio technologies. Liumeng has collaborated with Microsoft on multiple projects, including controllable emphatic speech synthesis and paragraph-level speech synthesis, where he improved naturalness and expressiveness in generated speech. His tenure at Tencent AI Lab involved developing robust voice conversion methods for noisy target speakers. At JD AI Lab, he contributed to the development of Mandarin-English mixed-lingual TTS systems. Liumeng’s experience also includes research stints at Microsoft and at the ASLP Lab at Northwestern Polytechnical University, where he worked on speaking style modeling and emotional speech synthesis. His research has consistently aimed at advancing the quality and capabilities of speech synthesis and conversion technologies.

Awards and Honors 

Liumeng Xue has been recognized for his contributions to the field of speech synthesis and audio processing. In 2022, he received the INTERSPEECH Travel Grant, acknowledging his research presentations at the conference. He has also been actively involved in the academic community, serving as a reviewer for prestigious journals and conferences such as IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), INTERSPEECH, ICASSP, ASRU, and ICMC. His commitment to advancing the field is further demonstrated by his role as a co-organizer of the 2024 IEEE Spoken Language Technology Workshop (IEEE SLT 2024). Liumeng’s academic and professional achievements are complemented by his patents in speech synthesis, reflecting his dedication to enhancing audio processing technologies. His accolades and professional activities highlight his influence and leadership in the speech and language processing community.

Research Focus 

Liumeng Xue’s research focuses on the intersection of speech synthesis, voice conversion, and audio processing, utilizing advanced techniques in deep learning and machine learning. His work aims to improve the naturalness, expressiveness, and quality of synthesized speech, with a particular emphasis on developing robust models that can handle diverse linguistic and acoustic challenges. Liumeng has explored various aspects of speech and language processing, including the development of end-to-end text-to-speech (TTS) models, speaking style transfer, and mixed-lingual TTS systems. His recent projects include constructing instruction-tuned language models for speech, audio, and music generation, and creating visualization systems for diffusion models in singing voice conversion. Liumeng’s research also extends to noise-independent speech representation learning for voice conversion and paragraph-level speech synthesis using prosodic cross-sentence information. His innovative work contributes to the advancement of speech technology, aiming to make audio processing more versatile and accessible.

Publication Top Notes

🎵 SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion (Under review, 2024)

🎧 Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit (Under review, 2024)

🔊 Single-Codec: Single-Codebook Speech Codec for High-Performance Speech Generation (INTERSPEECH, 2024)

📚 Text-Aware and Context-Aware Expressive Audiobook Speech Synthesis (INTERSPEECH, 2024)

🗣️ WenetSpeech4TTS: A 12,800-Hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark (INTERSPEECH, 2024)

🎼 ChatMusician: Understanding and Generating Music Intrinsically with LLM (ACL, 2024)

🗣️ SponTTS: Modeling and Transferring Spontaneous Style for TTS (ICASSP, 2024)

🎤 HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-Form TTS (ASRU, 2023)

🎶 Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder (ICASSP, 2023)

🔍 An Initial Investigation of Neural Replay Simulator for Over-the-Air Adversarial Perturbations to Automatic Speaker Verification (ICASSP, 2023)

Conclusion

Liumeng Xue demonstrates exceptional qualifications for the Best Researcher Award with his significant contributions to the fields of speech synthesis, voice conversion, and audio processing. His work, reflected in high-impact publications, patents, and his roles as a reviewer and organizer, shows a promising research trajectory and makes him a strong candidate for this award. Broadening his research collaborations and engaging in interdisciplinary projects could further elevate his profile and impact.
