automatic speech recognition (ASR)
Projects with this topic
Speech-to-text (STT) transcription based on automatic speech recognition.
TIMIT: famous corpus of American English with phone-level transcriptions (LDC93S1).
LibriSpeech: large ASR data set of read books (SLR12) [Panayotov et al. 2015].
AN4: Alphanumeric or "census" database from CMU [Acero 1993].
Buckeye: Buckeye Speech Corpus (release 2) of interviews from Ohio State.
Proof-of-concept (POC) app for the Aida English app.
IndicTTS: Indian English speech from the IIT TTS Team.
EMIME Bilingual {Finnish,German,Mandarin}/English database (www.emime.org).
UCAM Bilingual database from EMIME (www.emime.org).
CRM (coordinate response measure) corpus [Bolia et al. 2000].
GRID: audiovisual corpus of grid-related commands, from Univ. Sheffield.
Lombard Grid: extension of Grid corpus with Lombard and normal speech.
CTIMIT: TIMIT played through cellphone network (LDC96S30).
Noisy-VCTK: Noisy subset of VCTK (Voice Cloning Toolkit) dataset from CSTR.
LibriTTS: text-to-speech (TTS) corpus derived from LibriSpeech (SLR60).
CMU_ARCTIC dataset from CMU FestVox project (www.festvox.org/cmu_arctic).
CMU_SIN (speech-in-noise) dataset of Lombard speech from CMU FestVox project.
commonvoice: Common Voice dataset of crowdsourced speech from Mozilla.
DR-VCTK: device-recorded Voice Cloning Toolkit dataset from CSTR.
VCTK: Voice Cloning Toolkit dataset from CSTR, Edinburgh [Veaux et al. 2013].