EE627A: Speech Signal Processing (Spring 2021)
Vipul Arora
Department of Electrical Engineering, IIT Kanpur
Course Objectives:
This course will be taught jointly with Prof. Rajesh Hegde.
I will teach the latter half, which focuses on ASR.
This part of the course aims at introducing the students to
topics in automatic speech recognition (ASR).
The course will deal with the concepts involved in building an ASR system.
Starting with conventional methods, it will touch upon the latest
deep-learning-based methods. The Kaldi and OpenFst toolkits will be introduced.
The lectures will focus on mathematical principles, and there will be coding-based assignments for implementation.
Topics:
- Conventional ASR systems
- Gaussian Mixture Models
- Hidden Markov Models
- Finite State Transducers
- Kaldi toolkit
- Hybrid HMM-DNN ASR systems
- End-to-end ASR systems
- Connectionist Temporal Classification
- Other topics of interest
References:
- "Automatic Speech Recognition: A Deep Learning Approach", D. Yu and L. Deng, Springer, 2016
- "Pattern Recognition and Machine Learning", C. M. Bishop, Springer, 2006 (corrected 2nd printing, 2011). https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
- "Deep Learning", I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016. https://www.deeplearningbook.org/