automatic speech recognition (ASR)
Projects with this topic
This project provides a client package and example scripts to access the alphaspeech pro ASR APIs.
WSJ: Wall Street Journal corpus from ARPA in 1992, 1994 (LDC93S6A, LDC94S13A).
Vystadial: English part of Vystadial CTS corpus from Prague [Korvas et al. 2014].
VoxForge: free, open-source ASR dataset of crowdsourced speech from voxforge.org.
VCTK: Voice Cloning Toolkit dataset from CSTR, Edinburgh [Veaux et al. 2013].
TIMIT: famous corpus of American English with phone-level transcriptions (LDC93S1).
TED-LIUM 3: Release 3 of TED talk corpus from LIUM (SLR51).
TED-LIUM 2: Release 2 of TED talk corpus from LIUM (SLR19).
TED-LIUM 1: Release 1 of TED talk corpus from LIUM (SLR7).
Tatoeba: English sentences from the Tatoeba Project (https://tatoeba.org/eng).
synthcmd: Synthetic Speech Commands Dataset from Kaggle.
SWC: Spoken Wikipedia Corpus, crowdsourced speech of read Wikipedia articles.
Switchboard-1, release 2: famous ASR data set of CTS from the 1990s (LDC97S62).
ST-AEDS: Surfingtech American English Dataset of cellphone speech (SLR45).
speechcmd: Speech Commands Dataset from Google [Warden 2018].
Snips: SLU dataset from Snips (now part of Sonos) [Saade et al. 2019].
rt03: NIST 2003 Rich Transcription Evaluation Data for CTS (LDC2007S10).
RM: Resource Management v. 2.0 corpus from DARPA in the 1990s (LDC93S3A).
RedDots: corpus of short-duration utterances from mobile apps (sites.google.com/site/thereddotsproject).
PDA: Personal Digital Assistant speech dataset from CMU.