automatic speech recognition (ASR)
Projects with this topic
Noisy-VCTK: Noisy subset of VCTK (Voice Cloning Toolkit) dataset from CSTR.
LibriTTS: Librispeech for text-to-speech (TTS) corpus (SLR60).
CMU_ARCTIC dataset from CMU FestVox project (www.festvox.org/cmu_arctic).
CMU_SIN (speech-in-noise) dataset of Lombard speech from CMU FestVox project.
commonvoice: Common Voice dataset of crowdsourced speech from Mozilla.
DR-VCTK: device-recorded Voice Cloning Toolkit (DR-VCTK) dataset from CSTR.
VCTK: Voice Cloning Toolkit dataset from CSTR, Edinburgh [Veaux et al. 2013].
VoxForge: free, open-source ASR dataset of crowdsourced speech from voxforge.org.
RedDots: corpus of short-duration utterances from mobile apps (sites.google.com/site/thereddotsproject).
FRED: Freiburg English Dialect Corpus Sampler (FRED-S) of British English interviews.
WSJ: Wall Street Journal corpus from ARPA in 1992, 1994 (LDC93S6A, LDC94S13A).
AESL: American English Spoken Lexicon from 1 female speaker (LDC99L23).
Tatoeba: Tatoeba Project of English sentences (https://tatoeba.org/eng).
fluentcmd: Fluent Speech Commands Dataset for SLU from fluent.ai.
heysnips2: Hey Snips Dataset 2 for KWS from Sonos [Leroy et al. 2019].
LJ: LJ (Linda Johnson) Speech Corpus (v. 1.1), often used for TTS.
heysnips1: Hey Snips Dataset 1 for KWS from Sonos [Coucke et al. 2019].
AudioMNIST: free dataset of spoken digits (0-9) from 60 speakers.
Switchboard-1, release 2: widely used ASR dataset of conversational telephone speech (CTS) from the 1990s (LDC97S62).
TED-LIUM 1: Release 1 of TED talk corpus from LIUM (SLR7).