Resume

Here is the resume in PDF.

My Google Scholar page and Researchmap site.

Basic info

Xin Wang

Project Associate Professor (in practice, a post-doc position)

National Institute of Informatics

2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan

Education

Ph.D: 2015 - 2018 National Institute of Informatics, SOKENDAI, Tokyo, Japan.

Fundamental frequency modeling for neural-network-based statistical parametric speech synthesis

Supervisor: Prof. Junichi Yamagishi

M.Sc.: 2012 - 2015 University of Science and Technology of China, Hefei, China.

Bi-directional optimization for concept-to-speech synthesis

Supervisor: Prof. Zhen-Hua Ling

B.Sc.: 2008 - 2012 University of Electronic Science and Technology of China, Chengdu, China.

Academic activity

Organizer

Guest editor

Reviewer

  • IEEE TASLP, TBIOM, TIFS, SPL, ICASSP, ASRU, SLT

  • ISCA Interspeech, Speech Synthesis Workshop, Odyssey Workshop, Computer Speech & Language, Speech Communication

  • IEICE Transactions on Information and Systems

  • EUSIPCO, BIOSIG

Session chair

Grants

  • 2023 - 2027, JST, PRESTO: Unified framework for speech privacy protection and anti-spoofing. PI: Xin Wang.

  • 2021 - 2023, JSPS, Wakate (21K17775): Speech privacy protection by high-quality, invertible, and extendable speech anonymization and de-anonymization. PI: Xin Wang.

  • 2020 - 2021, KAWAI: Deep-learning-based neural source-filtering models for fast and high-quality music signal generation. PI: Xin Wang.

  • 2021 - 2022, JST AIP Challenge: Enhanced End-to-End Multi-Instrument MIDI/sheet-to-Music Synthesis with Timbre and Style Transfer. PI: Xin Wang.

  • 2019 - 2021, JSPS, grant-for-startup (19K24371): One model for all sounds: fast and high-quality neural source-filter model for speech and non-speech waveform modeling. PI: Xin Wang.

  • 2021 - 2022, Google Research Grant: Optimizing a Speech Anti-spoofing Database. PI: Junichi Yamagishi. Collaborators: Xin Wang, Erica Cooper.

  • 2019 - 2020, Google AI Focused Research Awards Program in Japan: Robust and all-purpose neural source-filter models. PI: Junichi Yamagishi. Collaborators: Xin Wang, Erica Cooper.

Publication

Journal & book chapters

Speech Synthesis

  1. Xin Wang, Shinji Takaki, Junichi Yamagishi, Simon King and Keiichi Tokuda. A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pages 157-170. doi: 10.1109/TASLP.2019.2950099. 2020.

  2. Xin Wang, Shinji Takaki and Junichi Yamagishi. Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pages 402-415. doi: 10.1109/TASLP.2019.2956145. 2020.

  3. Xin Wang, Shinji Takaki and Junichi Yamagishi. Investigating very deep highway networks for parametric speech synthesis. Speech Communication, vol. 96, pages 1-9. doi: 10.1016/j.specom.2017.11.002. 2018.

  4. Xin Wang, Shinji Takaki and Junichi Yamagishi. Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, pages 1406-1419. doi: 10.1109/TASLP.2018.2828650. 2018.

  5. Xin Wang, Zhen-Hua Ling and Li-Rong Dai. Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering. Computer Speech & Language, vol. 38, pages 46-67. doi: 10.1016/j.csl.2015.12.003. 2016.

  6. Xin Wang, Shinji Takaki and Junichi Yamagishi. Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis. IEICE Transactions on Information and Systems, vol. E99.D, pages 2471-2480. doi: 10.1587/transinf.2016SLP0011. 2016.

  7. Yusuke Yasuda, Xin Wang and Junichi Yamagishi. Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis. Computer Speech & Language, vol. 67, pages 101183. doi: 10.1016/j.csl.2020.101183. 2021.

  8. Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki and Junichi Yamagishi. Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences. IEEE Access, vol. 8, pages 138149-138161. doi: 10.1109/ACCESS.2020.3011975. 2020.

    Speech anti-spoofing

  9. Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang and Zhen-Hua Ling. ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Computer Speech & Language, vol. 64, pages 101114. doi: 10.1016/j.csl.2020.101114. 2020.

  10. Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch and Kong Aik Lee. ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages (accepted). 2023.

  11. Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans and Junichi Yamagishi. The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages 1-13. doi: 10.1109/TASLP.2022.3233236. 2022.

  12. Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi H. Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi and Kong Aik Lee. ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech. IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 3, pages 252-265. doi: 10.1109/TBIOM.2021.3059479. 2021.

  13. Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi and Douglas A Reynolds. Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pages 2195-2210. doi: 10.1109/TASLP.2020.3009494. 2020.

  14. Xin Wang and Junichi Yamagishi. A Practical Guide to Logical Access Voice Presentation Attack Detection. Frontiers in Fake Media Generation and Detection, pages 169-214. doi: 10.1007/978-981-19-1524-6_8. 2022.

  15. Md Sahidullah, Héctor Delgado, Massimiliano Todisco, Andreas Nautsch, Xin Wang, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi and Kong-Aik Lee. Introduction to Voice Presentation Attack Detection and Recent Advances. Handbook of Biometric Anti-Spoofing, pages 339-385. doi: 10.1007/978-981-19-5288-3_13. 2023.

    Speaker anonymization

  16. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Language-Independent Speaker Anonymization Using Orthogonal Householder Neural Network. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages (accepted). 2023.

  17. Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O’Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco and Mohamed Maouche. The VoicePrivacy 2020 Challenge: Results and findings. Computer Speech & Language, pages 101362. doi: 10.1016/j.csl.2022.101362. 2022.

  18. Brij Mohan Lal Srivastava, Mohamed Maouche, Md Sahidullah, Emmanuel Vincent, Aurelien Bellet, Marc Tommasi, Natalia Tomashenko, Xin Wang and Junichi Yamagishi. Privacy and Utility of X-Vector Based Speaker Anonymization. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pages 2383-2395. doi: 10.1109/TASLP.2022.3190741. 2022.

Conference

Speech Synthesis

  1. Xin Wang and Junichi Yamagishi. Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model. Proc. Interspeech, pages 1992-1996. doi: 10.21437/Interspeech.2020-1018. 2020.

  2. Xin Wang, Shinji Takaki and Junichi Yamagishi. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis. Proc. ICASSP, pages 5916-5920. doi: 10.1109/ICASSP.2019.8682298. 2019.

  3. Xin Wang and Junichi Yamagishi. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis. Proc. SSW, pages 1-6. doi: 10.21437/SSW.2019-1. 2019.

  4. Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela and Junichi Yamagishi. A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis. Proc. ICASSP, pages 4804-4808. 2018.

  5. Xin Wang, Shinji Takaki and Junichi Yamagishi. An RNN-based quantized F0 model with multi-tier feedback links for text-to-speech synthesis. Proc. Interspeech, pages 1059-1063. 2017.

  6. Xin Wang, Minghui Dong and Zhenhua Ling. A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis. Proc. ICASSP, pages 5165-5169. doi: 10.1109/ICASSP.2016.7472662. 2016.

  7. Xin Wang, Shinji Takaki and Junichi Yamagishi. A comparative study of the performance of HMM, DNN, and RNN based speech synthesis systems trained on very large speaker-dependent corpora. Proc. SSW, pages 125-128. 2016.

  8. Xin Wang, Shinji Takaki and Junichi Yamagishi. Investigating very deep highway networks for parametric speech synthesis. Proc. SSW, pages 181-186. 2016.

  9. Xin Wang, Shinji Takaki and Junichi Yamagishi. Enhance the word vector with prosodic information for the recurrent neural network based TTS system. Proc. Interspeech, pages 2856-2860. 2016.

  10. Xin Wang, Zhen-Hua Ling and Li-Rong Dai. Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis. Proc. Interspeech, pages 2942-2946. 2014.

  11. Xin Wang, Zhen-Hua Ling and Li-Rong Dai. Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis. Proc. ISCSLP, pages 84-87. 2012.

  12. Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper and Junichi Yamagishi. How Similar or Different is Rakugo Speech Synthesizer to Professional Performers? Proc. ICASSP, pages 6488-6492. doi: 10.1109/ICASSP39728.2021.9414175. 2021.

  13. Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi and Zhenhua Ling. Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation. Proc. SLT, pages 477-484. doi: 10.1109/SLT48900.2021.9383611. 2021.

  14. Yusuke Yasuda, Xin Wang and Junichi Yamagishi. End-to-End Text-to-Speech Using Latent Duration Based on VQ-VAE. Proc. ICASSP, pages 5694-5698. doi: 10.1109/ICASSP39728.2021.9414499. 2021.

  15. Erica Cooper, Xin Wang and Junichi Yamagishi. Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis. Proc. SSW, pages 130-135. doi: 10.21437/SSW.2021-23. 2021.

  16. Yi Zhao, Xin Wang, Lauri Juvela and Junichi Yamagishi. Transferring neural speech waveform synthesizers to musical instrument sounds generation. Proc. ICASSP, pages 6269-6273. doi: 10.1109/ICASSP40776.2020.9053047. 2020.

  17. Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen and Junichi Yamagishi. Zero-shot multi-speaker text-to-speech with state-of-the-art neural speaker embeddings. Proc. ICASSP, pages 6184-6188. 2020.

  18. Yusuke Yasuda, Xin Wang and Junichi Yamagishi. Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment. Proc. ICASSP, pages 6724-6728. 2020.

  19. Yang Ai, Xin Wang, Junichi Yamagishi and Zhen-Hua Ling. Reverberation Modeling for Source-Filter-Based Neural Vocoder. Proc. Interspeech, pages 3560-3564. doi: 10.21437/Interspeech.2020-1613. 2020.

  20. Yusuke Yasuda, Xin Wang, Shinji Takaki and Junichi Yamagishi. Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language. Proc. ICASSP, pages 6905-6909. 2019.

  21. Fuming Fang, Xin Wang, Junichi Yamagishi and Isao Echizen. Audiovisual speaker conversion: jointly and simultaneously transforming facial expression and acoustic characteristics. Proc. ICASSP, pages 6795-6799. 2019.

  22. Shinji Takaki, Toru Nakashika, Xin Wang and Junichi Yamagishi. STFT spectral loss for training a neural speech waveform model. Proc. ICASSP, pages 7065-7069. 2019.

  23. Hieu-Thi Luong, Xin Wang, Junichi Yamagishi and Nobuyuki Nishizawa. Training multi-speaker neural text-to-speech systems using speaker-imbalanced speech corpora. Proc. Interspeech, pages 1303-1307. doi: 10.21437/Interspeech.2019-1311. 2019.

  24. Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li and Junichi Yamagishi. Joint training framework for text-to-speech and voice conversion using multi-source tacotron and WaveNet. Proc. Interspeech, pages 1298-1302. doi: 10.21437/Interspeech.2019-1357. 2019.

  25. Yusuke Yasuda, Xin Wang and Junichi Yamagishi. Initial investigation of encoder-decoder end-to-end TTS using marginalization of monotonic hard alignments. Proc. SSW, pages 211-216. doi: 10.21437/SSW.2019-38. 2019.

  26. Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki and Junichi Yamagishi. Rakugo speech synthesis using segment-to-segment neural transduction and style tokens - toward speech synthesis for entertaining audiences. Proc. SSW, pages 111-116. doi: 10.21437/SSW.2019-20. 2019.

  27. Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Mariko Kondo and Junichi Yamagishi. Cyborg speech: Deep multilingual speech synthesis for generating segmental foreign accent with natural prosody. Proc. ICASSP, pages 4799-4803. 2018.

  28. Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi and Paavo Alku. Speech waveform synthesis from MFCC sequences with generative adversarial networks. Proc. ICASSP, pages 5679-5683. 2018.

  29. Hieu-Thi Luong, Xin Wang, Junichi Yamagishi and Nobuyuki Nishizawa. Investigating accuracy of pitch-accent annotations in neural-network-based speech synthesis and denoising effects. Proc. Interspeech, pages 37-41. 2018.

  30. Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi and Tomi Kinnunen. Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data. Proc. Odyssey, pages 240-247. doi: 10.21437/Odyssey.2018-34. 2018.

  31. Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang and Junichi Yamagishi. Principles for learning controllable TTS from annotated and latent variation. Proc. Interspeech, pages 3956-3960. doi: 10.21437/Interspeech.2017-171. 2017.

  32. Lauri Juvela, Xin Wang, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi and Paavo Alku. Using text and acoustic features in predicting glottal excitation waveforms for parametric speech synthesis with recurrent neural networks. Proc. Interspeech, pages 2283-2287. 2016.

    Speech anti-spoofing

  33. Xin Wang and Junichi Yamagishi. Investigating Active-learning-based Training Data Selection for Speech Spoofing Countermeasure. Proc. SLT, pages 585-592. 2023.

  34. Xin Wang and Junichi Yamagishi. Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders. Proc. ICASSP, pages (accepted). 2023.

  35. Xin Wang and Junichi Yamagishi. Estimating the Confidence of Speech Spoofing Countermeasure. Proc. ICASSP, pages 6372-6376. doi: 10.1109/ICASSP43922.2022.9746204. 2022.

  36. Xin Wang and Junichi Yamagishi. Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures. Proc. Odyssey, pages 100-106. doi: 10.21437/Odyssey.2022-14. 2022.

  37. Xin Wang and Junichi Yamagishi. A comparative study on recent neural spoofing countermeasures for synthetic speech detection. Proc. Interspeech, pages 4259-4263. doi: 10.21437/Interspeech.2021-702. 2021.

  38. Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim and Jee-weon Jung. Towards Single Integrated Spoofing-aware Speaker Verification Embeddings. Proc. Interspeech, pages 3989-3993. doi: 10.21437/Interspeech.2023-1402. 2023.

  39. Chang Zeng, Xin Wang, Xiaoxiao Miao, Erica Cooper and Junichi Yamagishi. Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms. Proc. Interspeech, pages 1998-2002. doi: 10.21437/Interspeech.2023-125. 2023.

  40. Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans and Junichi Yamagishi. Range-Based Equal Error Rate for Spoof Localization. Proc. Interspeech, pages 3212-3216. doi: 10.21437/Interspeech.2023-1214. 2023.

  41. Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi and Nicholas Evans. Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation. Proc. Odyssey, pages 112-119. 2022.

  42. Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino and Nicholas Evans. An Initial Investigation for Detecting Partially Spoofed Audio. Proc. Interspeech, pages 4264-4268. doi: 10.21437/Interspeech.2021-738. 2021.

  43. Lin Zhang, Xin Wang, Erica Cooper and Junichi Yamagishi. Multi-task Learning in Utterance-level and Segmental-level Spoof Detection. Proc. ASVspoof Challenge workshop, pages 9-15. doi: 10.21437/ASVSPOOF.2021-2. 2021.

  44. Tomi Kinnunen, Andreas Nautsch, Md. Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi and Kong Aik Lee. Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing. Proc. Interspeech, pages 4299-4303. doi: 10.21437/Interspeech.2021-1522. 2021.

  45. Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans and Héctor Delgado. ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection. Proc. ASVspoof Challenge workshop, pages 47-54. doi: 10.21437/ASVSPOOF.2021-8. 2021.

  46. Massimiliano Todisco, Xin Wang, Ville Vestman, Md. Sahidullah, Héctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi H Kinnunen and Kong Aik Lee. ASVspoof 2019: future horizons in spoofed and fake audio detection. Proc. Interspeech, pages 1008-1012. doi: 10.21437/Interspeech.2019-2249. 2019.

    Speaker anonymization

  47. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models. Proc. Odyssey, pages 279-286. doi: 10.21437/Odyssey.2022-39. 2022.

  48. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions. Proc. Interspeech, pages 4426-4430. doi: 10.21437/Interspeech.2022-11065. 2022.

  49. Jean-François Bonastre, Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noé, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang and Junichi Yamagishi. Benchmarking and challenges in security and privacy for voice biometrics. Proc. 2021 ISCA Symposium on Security and Privacy in Speech Communication, pages 52-56. doi: 10.21437/SPSC.2021-11. 2021.

  50. Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé and Massimiliano Todisco. Introducing the VoicePrivacy Initiative. Proc. Interspeech, pages 1693-1697. doi: 10.21437/Interspeech.2020-1333. 2020.

  51. Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet and Marc Tommasi. Design Choices for X-Vector Based Speaker Anonymization. Proc. Interspeech, pages 1713-1717. doi: 10.21437/Interspeech.2020-2692. 2020.

  52. Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen, Massimiliano Todisco, Nicholas Evans and Jean-François Bonastre. Speaker anonymization using X-vector and neural waveform models. Proc. SSW, pages 155-160. doi: 10.21437/SSW.2019-28. 2019.

    Other topics

  53. Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao and Junichi Yamagishi. Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances. Proc. ICASSP, pages (accepted). 2022.

  54. Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao and Hsin-Min Wang. MOSNet: deep learning-based objective assessment for voice conversion. Proc. Interspeech, pages 1541-1545. doi: 10.21437/Interspeech.2019-2003. 2019.

  55. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki and Junichi Yamagishi. Investigating RNN-based speech enhancement methods for noise-robust text-to-speech. Proc. SSW, pages 146-152. 2016.

  56. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki and Junichi Yamagishi. Speech enhancement for a noise-robust text-to-speech synthesis system using deep recurrent neural networks. Proc. Interspeech, pages 352-356. 2016.

Talk

(Slides are available in Talk & slides)

  • 2023 Nov, VoicePersonae and ASVspoof workshop (link NOV-2023):

    • Talk 1: From DSP and DNN to DNN+DSP for waveform model

    • Talk 2: Harnessing data to improve speech spoofing countermeasures

  • 2023 Aug, Interspeech 2023 tutorial: Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning. Materials are available on GitHub.

  • 2023 Mar, SPSC webinar: Using vocoders to create training data for speech spoofing countermeasures (link MAR-2023).

  • 2022 Sep, SPSC Symposium: tutorial on speaker anonymization (software part) (link SEP-2022).

  • 2022 May, ICASSP 2022 short course: Inclusive Neural Speech Synthesis - neural vocoder part (link MAY-2022).

  • 2021 Dec, Speech Synthesis Forum, China Computer Federation: Two speech security issues after the speech synthesis boom (link OCT-2021).

  • 2021 Oct, JST Science Agora 2021, pre-Agora event: Deepfakes: High-tech Illusions to Trick the Human Brain, with Sascha Frühholz (University of Zurich), Erica Cooper, Florence Steiner (University of Zurich). Video is here.

  • 2021 July, Tutorial at ISCA 2021 Speech Processing Courses in Crete: Advancement in Neural Vocoders, with Prof. Yamagishi (link JUL-2021).

  • 2020 Nov., Tutorial at ISCA 2020 Speaker Odyssey: Neural statistical parametric speech synthesis (link DEC-2020).

  • 2020 July, Tutorial at ISCA 2020 Speech Processing Courses in Crete: Neural auto-regressive, source-filter and glottal vocoders for speech and music signals, with Prof. Yamagishi (link JUL-2020).

  • 2019 Sep, Fraunhofer IIS, invited talk: Neural waveform models for text-to-speech synthesis (link SEP-2019).

  • 2019 Jan, IEICE Technical Committee on Speech (SP), invited tutorial, Kanazawa, Japan: Tutorial on recent neural waveform models (link JAN-2019).

  • 2018 Nov, Nagoya Institute of Technology, Tokuda lab: Autoregressive neural networks for parametric speech synthesis (link JAN-2018).

  • 2018 Jun, University of Eastern Finland and Aalto University: Autoregressive neural networks for parametric speech synthesis (same content as above).

Awards & scholarship

Language

  • Mandarin

  • English (TOEFL 2015, 112/120)

  • Japanese (N1, 2021 Dec, 169/180)