.. temp documentation master file, created by
   sphinx-quickstart on Sun Aug 30 20:18:34 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. _label-slide:

Talk & slides
*************

In most cases, I cannot directly share audio samples. Some samples can be found through the link in the PDF.

Talk
============

.. _label-slide-2024-apr-1:

APR-2024
--------

**ICASSP 2024 presentation**: Can Large-Scale Vocoded Spoofed Data Improve Speech Spoofing Countermeasure with a Self-Supervised Front End?

Using large-scale spoofed data to update the SSL front end of a speech anti-spoofing model.

* Paper: https://ieeexplore.ieee.org/document/10446331
* Slides: `PPT `__ and `PDF `__

.. _label-slide-2023-nov-1:

NOV-2023
--------

**VoicePersonae workshop talk 2: Harnessing data to improve speech spoofing countermeasures**

High-level summary of how to use vocoded data to train speech anti-spoofing models.

Slides can be downloaded here `dropbox `__.

**VoicePersonae workshop talk 1: DNN+DSP waveform model**

An overview talk given at the VoicePersonae workshop. The title is "From DSP and DNN to DNN/DSP: Neural speech waveform models and its applications in speech and music audio waveform modelling".

Slides can be downloaded here `dropbox `__.

.. _label-slide-2023-oct-31:

OCT-2023
--------

**Shonan Seminar: casual presentation**

During the No.182 Shonan Seminar https://shonan.nii.ac.jp/seminars/182/, I had the chance to introduce voice privacy.

Slides are available on `dropbox `__.

.. _label-slide-2023-aug-1:

AUG-2023
--------

**Interspeech Tutorial: anti-spoofing**

Interspeech 2023 tutorial: Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning.

Slides and notebook are available on `github `__.

.. _label-slide-2023-mar-1:

MAR-2023
--------

**SPSC Webinar: using vocoders to create spoofed data for speech spoofing countermeasures** for `ICASSP 2023 paper `__ "Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders".

Slides `in PDF `__ and `PPTX `__

.. _label-slide-2022-sep-1:

SEP-2022
--------

**SPSC Symposium: tutorial on speaker anonymization (software part)**

This short tutorial shows the basic process of speaker anonymization, using the baselines in Voice Privacy Challenge 2022.

The hands-on notebook is available on `Google Colab `__.

.. _label-slide-2022-may-1:

MAY-2022
--------

**ICASSP 2022 short course: neural vocoder**

This talk briefly summarizes a few representative neural vocoders. For a more detailed talk, please check :ref:`the slide for Advancement in Neural Vocoders `.

The hands-on materials used for this short course cover a few recent neural vocoders. There are step-by-step instructions on implementation, demonstrations with pre-trained models, and detailed explanations of some common DSP and deep learning techniques.

Please check `Google Colab `_.

.. _label-slide-2021-dec-1:

DEC-2021
--------

**Two Speech Security Issues after Speech Synthesis Boom**

This talk briefly introduces anti-spoofing (audio deepfake detection) and voice privacy. It is mainly for newcomers to these fields.

The slides can be found `on dropbox here (PPTX) `_, `(PDF) `_.

.. _label-slide-2021-oct-1:

OCT-2021
--------

**DeepFake: high-tech illusions to deceive human brains**

This is a talk given at JST Science Agora with Dr. Erica Cooper. It is an introduction to anti-spoofing (audio deepfake detection).

Here is the part presented by me: `Agora PDF `_ and `Agora PPT `_.

.. _label-slide-2021-jul-1:

JUL-2021
--------

**Advancement in Neural Vocoders**

This is a tutorial on neural vocoders, given at the ISCA 2021 Speech Processing Courses in Crete, with Prof. Yamagishi. It was a very long tutorial (>3 hours).
Slides are `on slideshare `_ (I only own part of them).

The hands-on materials were re-edited and uploaded to Google Colab. See :ref:`ICASSP 2022 short course: neural vocoder `.

.. _label-slide-2020-dec-1:

DEC-2020
--------

**Tutorial on Neural statistical parametric speech synthesis**

This is a tutorial on text-to-speech synthesis, given at ISCA Speaker Odyssey 2020. It is mainly on sequence-to-sequence TTS acoustic models (both soft- and hard-attention based approaches), but it also covers some basic ideas from the classical HMM-based approaches.

`PDF `_ and `PPT slides `_ are available. The video is on `youtube `_.

There are many audio samples collected from the reference papers' official websites or from open-domain data repositories.

.. _label-slide-2020-nov-1:

NOV-2020
--------

**Neural vocoders for speech and music signals**

This is an invited talk at YAMAHA, with Prof. Yamagishi. Nothing can be disclosed.

.. _label-slide-2020-jul-1:

JUL-2020
--------

**Neural auto-regressive, source-filter and glottal vocoders for speech and music signals**

This is the early version of the tutorial on neural vocoders, given at the ISCA 2020 Speech Processing Courses in Crete, with Prof. Yamagishi.

The hands-on materials were re-edited and uploaded to Google Colab. See :ref:`ICASSP 2022 short course: neural vocoder `.

.. _label-slide-2019-sep-1:

SEP-2019
--------

**Neural waveform models for text-to-speech synthesis**

Invited talk given at Fraunhofer IIS, Erlangen, Germany. This is about the neural source-filter vocoders and related experiments done up to 2019.

The slides are `here 1 `_

.. _label-slide-2019-jan-1:

JAN-2019
--------

**Tutorial on recent neural waveform models**

This is a talk on neural vocoders, but the contents and explanations are based on my knowledge at that time. It is out of date. Please check the tutorials above for my latest understanding.

IEICE Technical Committee on Speech (SP), invited tutorial, Kanazawa, Japan.

The slides are `here 2 `_

.. _label-slide-2018-jan-1:

JAN-2018
--------

**Autoregressive neural networks for parametric speech synthesis**

This is a talk on the previous-generation TTS system. It covers autoregressive models for F0 prediction.

It was given at Nagoya Institute of Technology, Tokuda lab, and Aalto University, Paavo Alku lab.

The slides are `here 3 `_

Conference presentation
=======================

Anti-spoofing: Interspeech 2021 presentation for `Comparative study on ASVspoof 2019 LA, PPT `_. Code is available at `git repo project/03-asvspoof-mega `_

NSF model (latest ver.): Interspeech 2020 presentation for cyclic-noise-NSF: `PPT `_ and `PDF slides `_. Natural samples are from `CMU-arctic `_

NSF model (2nd ver.): `SSW 2019 `_ for the paper Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis

NSF model (1st ver.): `ICASSP 2019 `_ for the paper Neural Source-Filter-Based Waveform Model for Statistical Parametric Speech Synthesis

Speech synthesis comparison: `ICASSP 2018 `_ for the paper A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis

Deep AR F0 model: `Interspeech 2017 slide `_ for the paper An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis.

Shallow AR model: `ICASSP 2017 slide `_ for the paper An Autoregressive Recurrent Mixture Density Network for Parametric Speech Synthesis.

Speech synthesis: `SSW 2016 slide `_ for the paper A Comparative Study of the Performance of HMM, DNN, and RNN Based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora.

Prosody embedding: `Interspeech 2016 slide `_ for the paper Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.

HMM-based speech synthesis: `ICASSP 2016 slide `_ for the paper A Full Training Framework of Cross-Stream Dependence Modelling for HMM-Based Singing Voice Synthesis.

MISC
====

On CURRENNT toolkit.
These slides were made a long time ago during weekends, and they may be sloppy :)

* CURRENNT `basics `_
* CURRENNT `LSTM explanation `_
* CURRENNT `CNN implementation `_
* CURRENNT `mixture density network `_
* CURRENNT `WaveNet `_

CURRENNT WaveNet is also explained in `another slide `_ with more figures.

.. toctree::
   :hidden:
   :maxdepth: 1