Basics
Series by Ketan Doshi
- 1. State-of-the-Art Techniques (What sound is and how it is digitized. What problems audio deep learning is solving in our daily lives. What spectrograms are and why they are all-important.)
- 2. Why Mel Spectrograms perform better (Processing audio data in Python. What Mel spectrograms are and how to generate them.)
- 3. Data Preparation and Augmentation (Enhancing spectrogram features for optimal performance through hyperparameter tuning and data augmentation.)
- 4. Sound Classification (End-to-end example and architecture for classifying ordinary sounds. A foundational application for a range of scenarios.)
- 5. Automatic Speech Recognition (Speech-to-text algorithm and architecture, using CTC loss and decoding to align sequences.)
- 6. Beam Search (An algorithm commonly used by speech-to-text and NLP applications to enhance predictions.)