YSDA Speech Processing Course

Materials for each week are in ./week* folders

Course program

Week 1: Slides | Lecture | Seminar
- Lecture: Intro to Digital Signal Processing (DSP)
- Seminar: Implement DSP pipeline
- Homework (5pt): Implement mel-spectrogram transformations
Week 2:
- Lecture: Introduction to speech NN discriminative models. Voice Activity Detection (VAD) and Sound Event Detection (SED) tasks
- Seminar: Train VAD models
- Homework (15pt): Train SED models
Week 3:
- Lecture: Keyword Spotting and Speech Biometrics tasks
- Seminar: Train Biometrics model and look at embeddings
- Homework (20pt): Train Biometrics model to better quality
Week 4:
- Lecture: Speech Recognition I
- Seminar: Metrics and augmentations for speech recognition
- Homework (10pt): Implement CTC algorithm
Week 5:
- Lecture: Speech Recognition II, Pretraining
- Homework (5pt): Finetune Wav2Vec2
Week 6:
- Lecture: ASR Inference
- Seminar: Streaming ASR
- Homework (5pt): Seminar continuation
Week 7:
- Lecture: Text-to-Speech I, intro, preprocessor, metrics
Week 8:
- Lecture: Text-to-Speech II, Acoustic models and vocoding
- Seminar (5pt): Pitch estimation, Monotonic Alignment Search for phoneme duration estimation
- Homework (10pt): Train FastPitch model
Week 9:
- Lecture: Text-to-Speech III, Codecs
- Seminar: Vector Quantization, Residual Vector Quantization
Week 10:
- Lecture: Text-to-Speech IV, Tortoise and other transformers for TTS
- Homework (15pt): Write a codec transformer with a delayed pattern
Week 11:
- Lecture: Multimodality, How to build a big GPT with voice capabilities
Week 12:
- Lecture: Noise reduction
- Seminar: Streaming STFT and ISTFT
- Homework (15pt): Noise reduction model implementation
Week 13:
- Lecture: Acoustic Echo Cancellation (AEC) and Beamforming
- Homework (5pt): Basic AEC implementation

Course program for spring 2024

Week 1: Slides | Lecture | Seminar
- Lecture: Intro to Digital Signal Processing (DSP)
- Seminar: Implement DSP pipeline
Week 2: Slides | Lecture | Seminar
- Lecture: Introduction to speech NN discriminative models. Voice Activity Detection (VAD) and Sound Event Detection (SED) tasks
- Seminar: Train VAD models
- Homework: Train SED models
Week 3: Slides | Lecture | Seminar
- Lecture: Keyword Spotting and Speech Biometrics tasks
- Seminar: Train Biometrics model and look at embeddings
- Homework: Train Biometrics model to better quality
Week 4: Slides | Lecture | Seminar
- Lecture: Speech Recognition I
- Seminar: Metrics and augmentations for speech recognition
- Homework: Implement CTC algorithm
Week 5: Slides | Lecture
- Lecture: Speech Recognition II, Pretraining
- Homework: Finetune Wav2Vec2
Week 6: Slides | Lecture
- Lecture: Text-to-Speech I, intro, preprocessor, metrics
Week 7: Slides | Lecture
- Lecture: Text-to-Speech II, Acoustic models
- Seminar: Pitch estimation, Monotonic Alignment Search for phoneme duration estimation
- Homework: Train FastPitch model
Week 8: Slides, p1 | Lecture, p1 | Slides, p2 | Lecture, p2 | Seminar
- Lecture, p1: Text-to-Speech III, Vocoding
- Lecture, p2: Vector Quantization, Codecs
- Seminar: Vector Quantization, Residual Vector Quantization
Week 9: Slides | Lecture, p1 | Lecture, p2
- Lecture: Transformers for TTS
- Homework: Write inference for a pre-trained transformer
Week 10: Slides | Lecture | Seminar
- Lecture: Noise reduction
- Seminar: Streaming STFT and ISTFT
- Homework: Noise reduction model implementation
Week 11: Slides | Lecture
- Lecture: Acoustic Echo Cancellation (AEC) and Beamforming
Week 12: Slides | Lecture | Seminar
- Lecture: ASR Inference
- Seminar: Streaming ASR
Week 13: Slides | Lecture
- Lecture: Flow-based TTS + Voice Conversion

Contributors & course staff

Current:

Pavel Mazaev - spotter
Alex Rak - VAD, spotter, biometry
Mikhail Andreev - ASR
Stepan Kargaltsev - ASR
Evgeniia Elistratova - TTS
Roman Kail - TTS
Vladimir Platonov - TTS
Ivan Matvienko - TTS
Ravil Khisamov - VQE
Anton Parfiriev - AEC

Previous iteration:

Andrey Malinin - Course admin, lectures, seminars, homeworks
Vladimir Kirichenko - lectures, seminars, homeworks
Segey Dukanov - lectures, seminars, homeworks
Evgenii Shabalin - lecture and homework on conversion
