Provided by: 杜成玉
Download: http://www.openslr.org/12/

Overview

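Data source: https://www.zhihu.com/question/63383992/answer/222718972
This dataset is an audiobook corpus containing both text and speech: roughly 1,000 hours of read English speech sampled at 16 kHz, prepared by Vassil Panayotov. The data is derived from audiobooks read for the LibriVox project and has been carefully segmented and aligned. Recommended applications: natural speech understanding, analysis, and mining.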
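
As a rough illustration of how paired text and speech like this might be consumed, the sketch below walks an extracted copy of the corpus and pairs each transcript line with an audio file. It rests on assumptions not stated in this entry: that each chapter directory holds a `*.trans.txt` file whose lines have the form `UTTERANCE-ID TEXT`, and that the audio for each utterance sits next to it as `UTTERANCE-ID.flac`. The function names and the sample utterance ID are hypothetical and chosen only for this example.

```python
from pathlib import Path
from typing import Iterator, Tuple

SAMPLE_RATE_HZ = 16_000  # sampling rate given in the overview


def parse_transcript_line(line: str) -> Tuple[str, str]:
    """Split a line of the assumed form 'UTTERANCE-ID TEXT' into (id, text)."""
    utt_id, _, text = line.strip().partition(" ")
    return utt_id, text


def iter_utterances(root: Path) -> Iterator[Tuple[Path, str]]:
    """Yield (audio_path, transcript) pairs found under root.

    Assumes transcripts are stored as '*.trans.txt' and that the audio for
    each utterance is '<UTTERANCE-ID>.flac' in the same directory.
    """
    for trans_file in sorted(root.rglob("*.trans.txt")):
        for line in trans_file.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue  # skip blank lines
            utt_id, text = parse_transcript_line(line)
            yield trans_file.parent / f"{utt_id}.flac", text


def hours_to_samples(hours: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Convert a duration in hours to a number of audio samples."""
    return int(hours * 3600 * sample_rate_hz)


if __name__ == "__main__":
    # Hypothetical utterance ID, used only to show the assumed line format.
    print(parse_transcript_line("1234-5678-0000 HELLO WORLD"))
    # About 1,000 hours at 16 kHz, per the overview.
    print(hours_to_samples(1000))
```

The audio files are only located here, not decoded; reading FLAC would need an extra library, which is left out to keep the sketch dependency-free.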

Related papers

[1] Panayotov V, Chen G, Povey D, et al. Librispeech: an ASR corpus based on public domain audio books[C]//Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015: 5206-5210.
[2] Amodei D, Ananthanarayanan S, Anubhai R, et al. Deep speech 2: End-to-end speech recognition in English and Mandarin[C]//International Conference on Machine Learning. 2016: 173-182.
[3] Ko T, Peddinti V, Povey D, et al. Audio augmentation for speech recognition[C]//Sixteenth Annual Conference of the International Speech Communication Association. 2015.
[4] Soltau H, Liao H, Sak H. Neural speech recognizer: Acoustic-to-word LSTM model for large vocabulary speech recognition[J]. arXiv preprint arXiv:1610.09975, 2016.
[5] Chung Y A, Wu C C, Shen C H, et al. Audio word2vec: Unsupervised learning of audio segment representations using sequence-to-sequence autoencoder[J]. arXiv preprint arXiv:1603.00982, 2016.