.. temp documentation master file, created by
   sphinx-quickstart on Sun Aug 30 20:18:34 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. _label-resume:

Resume
******

Here is the `resume in PDF `_. See also my `Google Scholar page `_ and `Researchmap site `_.

Basic info
==========

| **Xin Wang**
| Project Associate Professor (in fact, post-doc)
| National Institute of Informatics
| 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan

Education
=========

| Ph.D: 2015 - 2018
| National Institute of Informatics, SOKENDAI, Tokyo, Japan.
| Fundamental frequency modeling for neural-network-based statistical parametric speech synthesis
| Supervisor: Prof. Junichi Yamagishi

* Thesis (submitted 2018-06-29): `PDF version `_
* Slides for thesis defense: `defense slides `_
* Appendix: `highway network `_, `SAR `_, `DAR `_, `VQ-VAE `_

| M.Sc.: 2012 - 2015
| University of Science and Technology of China, Hefei, China.
| Bi-directional optimization for concept-to-speech synthesis
| Supervisor: Prof. Zhen-Hua Ling

| B.Sc.: 2008 - 2012
| University of Electronic Science and Technology of China, Chengdu, China.

Academic activity
=================

Organizer

* `ASVspoof5 `_, ASVspoof challenges 2021, 2019
* `Voice Privacy Challenge `_ 2022, 2020
* APSIPA ASC 2019 special session on `Deep Generative Models for Media Clones and Its Detection `_
* ISCA Interspeech 2019 special session on `Automatic Speaker Verification Spoofing and Countermeasures Challenge 2019 (ASVSpoof 2019) `_
* IEEE ASRU 2019 special session on `ASVspoof 2019 `_

Guest editor

* Computer Speech and Language `Special issue on Advances in Automatic Speaker Verification Anti-spoofing `_

Reviewer

* IEEE TASLP, TBIOM, TIFS, SPL, ICASSP, ASRU, SLT
* ISCA Interspeech, Speech Synthesis Workshop, Odyssey workshop, Computer Speech & Language, Speech Communication
* IEICE Trans on Information and Systems
* EUSIPCO, BIOSIG

Session chair

* ICASSP 2023, `ACM MM 2022 DDAM Workshop `_, `ASVspoof workshop 2021 `_, Interspeech 2021, SSW 2019.

Grants
======

* 2023 - 2027, **JST, PRESTO**: Unified framework for speech privacy protection and anti-spoofing. PI: Xin Wang.
* 2021 - 2023, **JSPS, Wakate (21K17775)**: Speech privacy protection by high-quality, invertible, and extendable speech anonymization and de-anonymization. PI: Xin Wang.
* 2020 - 2021, **KAWAI**: Deep-learning-based neural source-filtering models for fast and high-quality music signal generation. PI: Xin Wang.
* 2021 - 2022, **JST AIP Challenge**: Enhanced End-to-End Multi-Instrument MIDI/sheet-to-Music Synthesis with Timber and Style Transfer. PI: Xin Wang.
* 2019 - 2021, **JSPS, grant-for-startup (19K24371)**: One model for all sounds: fast and high-quality neural source-filter model for speech and non-speech waveform modeling. PI: Xin Wang.
* 2021 - 2022, **Google Research Grant**: Optimizing a Speech Anti-spoofing Database. PI: Junichi Yamagishi. Collaborator: Xin Wang, Erica Cooper.
* 2019 - 2020, **Google AI Focused Research Awards Program in Japan**: Robust and all-purpose neural source-filter models. PI: Junichi Yamagishi.
  Collaborator: Xin Wang, Erica Cooper.

Publication
===========

Journal & book chapters
-----------------------

**Speech Synthesis**

#. **Xin Wang**, Shinji Takaki, Junichi Yamagishi, Simon King and Keiichi Tokuda. A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol: 28, pages 157-170. doi: 10.1109/TASLP.2019.2950099. 2020.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol: 28, pages 402-415. doi: 10.1109/TASLP.2019.2956145. 2020.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Investigating very deep highway networks for parametric speech synthesis. Speech Communication, vol: 96, pages 1-9. doi: 10.1016/j.specom.2017.11.002. 2018.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol: 26, pages 1406-1419. doi: 10.1109/TASLP.2018.2828650. 2018.
#. **Xin Wang**, Zhen-Hua Ling and Li-Rong Dai. Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering. Computer Speech & Language, vol: 38, pages 46-67. doi: 10.1016/j.csl.2015.12.003. 2016.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis. IEICE Transactions on Information and Systems, vol: E99.D, pages 2471-2480. doi: 10.1587/transinf.2016SLP0011. 2016.
#. Yusuke Yasuda, **Xin Wang** and Junichi Yamagishi. Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis. Computer Speech & Language, vol: 67, pages 101183. doi: https://doi.org/10.1016/j.csl.2020.101183. 2021.
#.
   Shuhei Kato, Yusuke Yasuda, **Xin Wang**, Erica Cooper, Shinji Takaki and Junichi Yamagishi. Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences. IEEE Access, vol: 8, pages 138149-138161. doi: 10.1109/ACCESS.2020.3011975. 2020.

**Speech anti-spoofing**

#. **Xin Wang**, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang and Zhen-Hua Ling. ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Computer Speech & Language, vol: 64, pages 101114. doi: 10.1016/j.csl.2020.101114. 2020.
#. Xuechen Liu, **Xin Wang**, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch and Kong Aik Lee. ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild. IEEE Trans. on Audio, Speech, and Language Processing, pages (accepted). 2023.
#. Lin Zhang, **Xin Wang**, Erica Cooper, Nicholas Evans and Junichi Yamagishi. The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages 1-13. doi: 10.1109/TASLP.2022.3233236. 2022.
#. Andreas Nautsch, **Xin Wang**, Nicholas Evans, Tomi H. Kinnunen, Ville Vestman, Massimiliano Todisco, Hector Delgado, Md Sahidullah, Junichi Yamagishi and Kong Aik Lee.
   ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech. IEEE Transactions on Biometrics, Behavior, and Identity Science, vol: 3, pages 252-265. doi: 10.1109/TBIOM.2021.3059479. 2021.
#. Tomi Kinnunen, Hector Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, **Xin Wang**, Md Sahidullah, Junichi Yamagishi and Douglas A Reynolds. Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol: 28, pages 2195-2210. doi: 10.1109/TASLP.2020.3009494. 2020.
#. **Xin Wang** and Junichi Yamagishi. A Practical Guide to Logical Access Voice Presentation Attack Detection. Frontiers in Fake Media Generation and Detection, pages 169-214. doi: 10.1007/978-981-19-1524-6_8. 2022.
#. Md Sahidullah, Héctor Delgado, Massimiliano Todisco, Andreas Nautsch, **Xin Wang**, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi and Kong-Aik Lee. Introduction to Voice Presentation Attack Detection and Recent Advances. Handbook of Biometric Anti-Spoofing, pages 339-385. doi: 10.1007/978-981-19-5288-3_13. 2023.

**Speaker anonymization**

#. Xiaoxiao Miao, **Xin Wang**, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Language-Independent Speaker Anonymization Using Orthogonal Householder Neural Network. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages (accepted). 2023.
#. Natalia Tomashenko, **Xin Wang**, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco and Mohamed Maouche. The VoicePrivacy 2020 Challenge: Results and findings. Computer Speech & Language, pages 101362. doi: https://doi.org/10.1016/j.csl.2022.101362. 2022.
#.
   Brij Mohan Lal Srivastava, Mohamed Maouche, Md Sahidullah, Emmanuel Vincent, Aurelien Bellet, Marc Tommasi, Natalia Tomashenko, **Xin Wang** and Junichi Yamagishi. Privacy and Utility of X-Vector Based Speaker Anonymization. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol: 30, pages 2383-2395. doi: 10.1109/TASLP.2022.3190741. 2022.

Conference
----------

**Speech Synthesis**

#. **Xin Wang** and Junichi Yamagishi. Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model. Proc. Interspeech, pages 1992-1996. doi: 10.21437/Interspeech.2020-1018. 2020.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis. Proc. ICASSP, pages 5916-5920. doi: 10.1109/ICASSP.2019.8682298. 2019.
#. **Xin Wang** and Junichi Yamagishi. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis. Proc. SSW, pages 1-6. doi: 10.21437/SSW.2019-1. 2019.
#. **Xin Wang**, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela and Junichi Yamagishi. A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis. Proc. ICASSP, pages 4804-4808. 2018.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. An RNN-based quantized F0 model with multi-tier feedback links for text-to-speech synthesis. Proc. Interspeech, pages 1059-1063. 2017.
#. **Xin Wang**, Minghui Dong and Zhenhua Ling. A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis. Proc. ICASSP, pages 5165-5169. doi: 10.1109/ICASSP.2016.7472662. 2016.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. A comparative study of the performance of HMM, DNN, and RNN based speech synthesis systems trained on very large speaker-dependent corpora. Proc. SSW, pages 125-128. 2016.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi.
   Investigating very deep highway networks for parametric speech synthesis. Proc. SSW, pages 181-186. 2016.
#. **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Enhance the word vector with prosodic information for the recurrent neural network based TTS system. Proc. Interspeech, pages 2856-2860. 2016.
#. **Xin Wang**, Zhen-Hua Ling and Li-Rong Dai. Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis. Proc. Interspeech, pages 2942-2946. 2014.
#. **Xin Wang**, Zhen-Hua Ling and Li-Rong Dai. Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis. Proc. ISCSLP, pages 84-87. 2012.
#. Shuhei Kato, Yusuke Yasuda, **Xin Wang**, Erica Cooper and Junichi Yamagishi. How Similar or Different is Rakugo Speech Synthesizer to Professional Performers? Proc. ICASSP, pages 6488-6492. doi: 10.1109/ICASSP39728.2021.9414175. 2021.
#. Yang Ai, Haoyu Li, **Xin Wang**, Junichi Yamagishi and Zhenhua Ling. Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation. Proc. SLT, pages 477-484. doi: 10.1109/SLT48900.2021.9383611. 2021.
#. Yusuke Yasuda, **Xin Wang** and Junichi Yamagishi. End-to-End Text-to-Speech Using Latent Duration Based on VQ-VAE. Proc. ICASSP, pages 5694-5698. doi: 10.1109/ICASSP39728.2021.9414499. 2021.
#. Erica Cooper, **Xin Wang** and Junichi Yamagishi. Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis. Proc. SSW, pages 130-135. doi: 10.21437/SSW.2021-23. 2021.
#. Yi Zhao, **Xin Wang**, Lauri Juvela and Junichi Yamagishi. Transferring neural speech waveform synthesizers to musical instrument sounds generation. Proc. ICASSP, pages 6269-6273. doi: 10.1109/ICASSP40776.2020.9053047. 2020.
#. Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, **Xin Wang**, Nanxin Chen and Junichi Yamagishi. Zero-shot multi-speaker text-to-speech with state-of-the-art neural speaker embeddings. Proc. ICASSP, pages 6184-6188. 2020.
#. Yusuke Yasuda, **Xin Wang** and Junichi Yamagishi.
   Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment. Proc. ICASSP, pages 6724-6728. 2020.
#. Yang Ai, **Xin Wang**, Junichi Yamagishi and Zhen-Hua Ling. Reverberation Modeling for Source-Filter-Based Neural Vocoder. Proc. Interspeech, pages 3560-3564. doi: 10.21437/Interspeech.2020-1613. 2020.
#. Yusuke Yasuda, **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language. Proc. ICASSP, pages 6905-6909. 2019.
#. Fuming Fang, **Xin Wang**, Junichi Yamagishi and Isao Echizen. Audiovisual speaker conversion: jointly and simultaneously transforming facial expression and acoustic characteristics. Proc. ICASSP, pages 6795-6799. 2019.
#. Shinji Takaki, Toru Nakashika, **Xin Wang** and Junichi Yamagishi. STFT spectral loss for training a neural speech waveform model. Proc. ICASSP, pages 7065-7069. 2019.
#. Hieu-Thi Luong, **Xin Wang**, Junichi Yamagishi and Nobuyuki Nishizawa. Training multi-speaker neural text-to-speech systems using speaker-imbalanced speech corpora. Proc. Interspeech, pages 1303-1307. doi: 10.21437/Interspeech.2019-1311. 2019.
#. Mingyang Zhang, **Xin Wang**, Fuming Fang, Haizhou Li and Junichi Yamagishi. Joint training framework for text-to-speech and voice conversion using multi-source tacotron and WaveNet. Proc. Interspeech, pages 1298-1302. doi: 10.21437/Interspeech.2019-1357. 2019.
#. Yusuke Yasuda, **Xin Wang** and Junichi Yamagishi. Initial investigation of encoder-decoder end-to-end TTS using marginalization of monotonic hard alignments. Proc. SSW, pages 211-216. doi: 10.21437/SSW.2019-38. 2019.
#. Shuhei Kato, Yusuke Yasuda, **Xin Wang**, Erica Cooper, Shinji Takaki and Junichi Yamagishi. Rakugo speech synthesis using segment-to-segment neural transduction and style tokens - toward speech synthesis for entertaining audiences. Proc. SSW, pages 111-116.
   doi: 10.21437/SSW.2019-20. 2019.
#. Gustav Eje Henter, Jaime Lorenzo-Trueba, **Xin Wang**, Mariko Kondo and Junichi Yamagishi. Cyborg speech: Deep multilingual speech synthesis for generating segmental foreign accent with natural prosody. Proc. ICASSP, pages 4799-4803. 2018.
#. Lauri Juvela, Bajibabu Bollepalli, **Xin Wang**, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi and Paavo Alku. Speech waveform synthesis from MFCC sequences with generative adversarial networks. Proc. ICASSP, pages 5679-5683. 2018.
#. Hieu-Thi Luong, **Xin Wang**, Junichi Yamagishi and Nobuyuki Nishizawa. Investigating accuracy of pitch-accent annotations in neural-network-based speech synthesis and denoising effects. Proc. Interspeech, pages 37-41. 2018.
#. Jaime Lorenzo-Trueba, Fuming Fang, **Xin Wang**, Isao Echizen, Junichi Yamagishi and Tomi Kinnunen. Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data. Proc. Odyssey, pages 240-247. doi: 10.21437/Odyssey.2018-34. 2018.
#. Gustav Eje Henter, Jaime Lorenzo-Trueba, **Xin Wang** and Junichi Yamagishi. Principles for learning controllable TTS from annotated and latent variation. Proc. Interspeech, pages 3956-3960. doi: 10.21437/Interspeech.2017-171. 2017.
#. Lauri Juvela, **Xin Wang**, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi and Paavo Alku. Using text and acoustic features in predicting glottal excitation waveforms for parametric speech synthesis with recurrent neural networks. Proc. Interspeech, pages 2283-2287. 2016.

**Speech anti-spoofing**

#. **Xin Wang** and Junichi Yamagishi. Investigating Active-learning-based Training Data Selection for Speech Spoofing Countermeasure. Proc. SLT, pages 585-592. 2023.
#. **Xin Wang** and Junichi Yamagishi. Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders. Proc. ICASSP, pages (accepted). 2023.
#. **Xin Wang** and Junichi Yamagishi.
   Estimating the Confidence of Speech Spoofing Countermeasure. Proc. ICASSP, pages 6372-6376. doi: 10.1109/ICASSP43922.2022.9746204. 2022.
#. **Xin Wang** and Junichi Yamagishi. Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures. Proc. Odyssey, pages 100-106. doi: 10.21437/Odyssey.2022-14. 2022.
#. **Xin Wang** and Junichi Yamagishi. A comparative study on recent neural spoofing countermeasures for synthetic speech detection. Proc. Interspeech, pages 4259-4263. doi: 10.21437/Interspeech.2021-702. 2021.
#. Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, **Xin Wang**, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim and Jee-weon Jung. Towards Single Integrated Spoofing-aware Speaker Verification Embeddings. Proc. Interspeech, pages 3989-3993. doi: 10.21437/Interspeech.2023-1402. 2023.
#. Chang Zeng, **Xin Wang**, Xiaoxiao Miao, Erica Cooper and Junichi Yamagishi. Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms. Proc. Interspeech, pages 1998-2002. doi: 10.21437/Interspeech.2023-125. 2023.
#. Lin Zhang, **Xin Wang**, Erica Cooper, Nicholas Evans and Junichi Yamagishi. Range-Based Equal Error Rate for Spoof Localization. Proc. Interspeech, pages 3212-3216. doi: 10.21437/Interspeech.2023-1214. 2023.
#. Hemlata Tak, Massimiliano Todisco, **Xin Wang**, Jee-weon Jung, Junichi Yamagishi and Nicholas Evans. Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation. Proc. Odyssey, pages 112-119. 2022.
#. Lin Zhang, **Xin Wang**, Erica Cooper, Junichi Yamagishi, Jose Patino and Nicholas Evans. An Initial Investigation for Detecting Partially Spoofed Audio. Proc. Interspeech, pages 4264-4268. doi: 10.21437/Interspeech.2021-738. 2021.
#. Lin Zhang, **Xin Wang**, Erica Cooper and Junichi Yamagishi.
   Multi-task Learning in Utterance-level and Segmental-level Spoof Detection. Proc. ASVspoof Challenge workshop, pages 9-15. doi: 10.21437/ASVSPOOF.2021-2. 2021.
#. Tomi Kinnunen, Andreas Nautsch, Md. Sahidullah, Nicholas Evans, **Xin Wang**, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi and Kong Aik Lee. Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing. Proc. Interspeech, pages 4299-4303. doi: 10.21437/Interspeech.2021-1522. 2021.
#. Junichi Yamagishi, **Xin Wang**, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans and Héctor Delgado. ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection. Proc. ASVspoof Challenge workshop, pages 47-54. doi: 10.21437/ASVSPOOF.2021-8. 2021.
#. Massimiliano Todisco, **Xin Wang**, Ville Vestman, Md. Sahidullah, Héctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi H Kinnunen and Kong Aik Lee. ASVspoof 2019: future horizons in spoofed and fake audio detection. Proc. Interspeech, pages 1008-1012. doi: 10.21437/Interspeech.2019-2249. 2019.

**Speaker anonymization**

#. Xiaoxiao Miao, **Xin Wang**, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models. Proc. Odyssey, pages 279-286. doi: 10.21437/Odyssey.2022-39. 2022.
#. Xiaoxiao Miao, **Xin Wang**, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko. Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions. Proc. Interspeech, pages 4426-4430. doi: 10.21437/Interspeech.2022-11065. 2022.
#. Jean-François Bonastre, Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, **Xin Wang** and Junichi Yamagishi.
   Benchmarking and challenges in security and privacy for voice biometrics. Proc. 2021 ISCA Symposium on Security and Privacy in Speech Communication, pages 52-56. doi: 10.21437/SPSC.2021-11. 2021.
#. Natalia Tomashenko, Brij Mohan Lal Srivastava, **Xin Wang**, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé and Massimiliano Todisco. Introducing the VoicePrivacy Initiative. Proc. Interspeech, pages 1693-1697. doi: 10.21437/Interspeech.2020-1333. 2020.
#. Brij Mohan Lal Srivastava, Natalia Tomashenko, **Xin Wang**, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet and Marc Tommasi. Design Choices for X-Vector Based Speaker Anonymization. Proc. Interspeech, pages 1713-1717. doi: 10.21437/Interspeech.2020-2692. 2020.
#. Fuming Fang, **Xin Wang**, Junichi Yamagishi, Isao Echizen, Massimiliano Todisco, Nicholas Evans and Jean-Francois Bonastre. Speaker anonymization using X-vector and neural waveform models. Proc. SSW, pages 155-160. doi: 10.21437/SSW.2019-28. 2019.

**Other topics**

#. Chang Zeng, **Xin Wang**, Erica Cooper, Xiaoxiao Miao and Junichi Yamagishi. Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances. Proc. ICASSP, pages (accepted). 2022.
#. Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, **Xin Wang**, Junichi Yamagishi, Yu Tsao and Hsin-Min Wang. MOSnet: deep learning-based objective assessment for voice conversion. Proc. Interspeech, pages 1541-1545. doi: 10.21437/Interspeech.2019-2003. 2019.
#. Cassia Valentini-Botinhao, **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Investigating RNN-based speech enhancement methods for noise-robust text-to-speech. Proc. SSW, pages 146-152. 2016.
#. Cassia Valentini-Botinhao, **Xin Wang**, Shinji Takaki and Junichi Yamagishi. Speech enhancement for a noise-robust text-to-speech synthesis system using deep recurrent neural networks. Proc. Interspeech, pages 352-356. 2016.

Talk
====

(Slides are available in :ref:`label-slide`)

* 2023 Nov, VoicePersonae and ASVspoof workshop (link :ref:`label-slide-2023-nov-1`):

  * Talk 1: ``From DSP and DNN to DNN+DSP for waveform model``
  * Talk 2: ``Harnessing data to improving speech spoofing countermeasure``

* 2023 Aug, Interspeech 2023 tutorial ``Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning``. Materials are available on `github `__.
* 2023 Mar, SPSC webinar: ``using vocoders to create training data for speech spoofing countermeasure`` (link :ref:`label-slide-2023-mar-1`).
* 2022 Sep, SPSC Symposium: ``tutorial on speaker anonymization (software part)`` (link :ref:`label-slide-2022-sep-1`).
* 2022 May, ICASSP 2022 short course: ``inclusive Neural Speech Synthesis - neural vocoder part`` (link :ref:`label-slide-2022-may-1`).
* 2021 Dec, Speech Synthesis Forum, China Computer Federation: ``Two speech security issues after the speech synthesis boom`` (link :ref:`label-slide-2021-oct-1`).
* 2021 Oct, JST Science Agora 2021, pre-Agora event: ``Deepfakes: High-tech Illusions to Trick the Human Brain.``, with Sascha Frühholz (University of Zurich), Erica Cooper, Florence Steiner (University of Zurich). Video is `here `_.
* 2021 July, Tutorial at ISCA 2021 Speech Processing Courses in Crete: ``Advancement in Neural Vocoders``, with Prof. Yamagishi (link :ref:`label-slide-2021-jul-1`).
* 2020 Nov, Tutorial at ISCA 2020 Speaker Odyssey: ``Neural statistical parametric speech synthesis`` (link :ref:`label-slide-2020-dec-1`).
* 2020 July, Tutorial at ISCA 2020 Speech Processing Courses in Crete: ``Neural auto-regressive, source-filter and glottal vocoders for speech and music signals``, with Prof. Yamagishi (link :ref:`label-slide-2020-jul-1`).
* 2019 Sep, Fraunhofer IIS, invited talk: ``Neural waveform models for text-to-speech synthesis`` (link :ref:`label-slide-2019-sep-1`).
* 2019 Jan, IEICE Technical Committee on Speech (SP), invited tutorial, Kanazawa, Japan: ``Tutorial on recent neural waveform models`` (link :ref:`label-slide-2019-jan-1`).
* 2018 Nov, Nagoya Institute of Technology, Tokuda lab: ``Autoregressive neural networks for parametric speech synthesis`` (link :ref:`label-slide-2018-jan-1`).
* 2018 Jun, University of Eastern Finland and Aalto University: ``Autoregressive neural networks for parametric speech synthesis`` (same content as above).

Awards & scholarship
====================

* Best paper award for `SSW 2016 `_, ISCA SynSig
* `SOKENDAI Award `_, SOKENDAI, Japan
* `Young Researcher's Award in Speech Field `_, IEICE ISS, Japan
* 11th IEEE Signal Processing Society Japan Student `Best Paper Award `_, IEEE Japan
* MEXT Scholarship (Ph.D 2015 - 2018), Japan

Language
========

* Mandarin
* English (TOEFL 2015, 112/120)
* Japanese (N1, 2021 Dec, 169/180)

.. toctree::
   :hidden:
   :maxdepth: 1