The 20 Newsgroups Data Set

Contributor: 卢梦依
Download: http://qwone.com/~jason/20Newsgroups/

Introduction

Dataset Overview

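This data set contains text from newsgroup postings. The 20 Newsgroups collection gathers approximately 20,000 newsgroup documents, spread nearly evenly across 20 different newsgroups. The documents have the features typical of newsgroup posts: a subject line, an author, and quoted text from earlier messages.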
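The three features named above (subject, author, quoted text) can be pulled out of a raw post with Python's standard email parser. The following is a minimal sketch, not part of the data set's own tooling. The sample message is invented for illustration, and treating lines that begin with ">" as quoted text is an assumed convention that real posts may not always follow.

```python
from email import message_from_string

# Hypothetical sample message, written for illustration (not taken from the data set).
SAMPLE_POST = """\
From: alice@example.com (Alice)
Subject: Re: Which graphics card?
Organization: Example University

In article <123@example.com> bob@example.com writes:
> I am looking for a card that supports 1024x768.
> Any suggestions?

I have had good results with the one in our lab.
"""


def split_post(raw_text):
    """Split a raw newsgroup post into author, subject, quoted lines, and other body lines."""
    message = message_from_string(raw_text)
    body_lines = message.get_payload().splitlines()
    # Assumption: lines starting with '>' are quoted from an earlier message.
    quoted = [line for line in body_lines if line.lstrip().startswith(">")]
    # Everything else that is not blank counts as the poster's own text.
    original = [
        line for line in body_lines
        if line.strip() and not line.lstrip().startswith(">")
    ]
    return {
        "author": message["From"],
        "subject": message["Subject"],
        "quoted": quoted,
        "original": original,
    }


post = split_post(SAMPLE_POST)
print(post["author"])
print(post["subject"])
print(len(post["quoted"]), "quoted lines")
```

Separating quoted lines from the poster's own text matters for classification experiments, since quoted material repeats content from other messages in the same newsgroup.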

Files

Size: 20 MB
Type: plain text (txt)
Count: approximately 20,000 messages from 20 newsgroups
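To check the per-newsgroup message counts after downloading, a short script along these lines may help. It is a sketch under a stated assumption: that the extracted archive holds one directory per newsgroup with one file per message. This page does not describe the layout, so verify it against the actual download. The demo builds a tiny temporary directory tree instead of reading the real data, and the function names are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_posts_per_group(root_dir):
    """Count the files in each immediate subdirectory of root_dir.

    Assumes one subdirectory per newsgroup and one file per message.
    """
    counts = Counter()
    for group_dir in sorted(Path(root_dir).iterdir()):
        if group_dir.is_dir():
            counts[group_dir.name] = sum(1 for p in group_dir.iterdir() if p.is_file())
    return counts


def read_post(path):
    """Read one post as text. latin-1 is an assumed encoding; it decodes any byte sequence."""
    return Path(path).read_text(encoding="latin-1")


# Demo on a small temporary tree that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    for group in ("sci.space", "rec.autos"):
        group_dir = Path(tmp) / group
        group_dir.mkdir()
        for i in range(3):
            (group_dir / str(i)).write_text("Subject: test\n\nbody\n", encoding="latin-1")
    demo_counts = count_posts_per_group(tmp)

print(dict(demo_counts))
```

On the real data, pass the path of the extracted archive to count_posts_per_group and compare the totals with the figures listed above.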

Related Papers

1. Kim Y. Convolutional Neural Networks for Sentence Classification[J]. arXiv preprint, 2014.
2. Joulin A, Grave E, Bojanowski P, et al. Bag of Tricks for Efficient Text Classification[J]. 2016:427-431.
3. Zhang Y, Wallace B. A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification[J]. Computer Science, 2015.
4. Lee J Y, Dernoncourt F. Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks[J]. 2016:515-520.
5. Chen G, Ye D, Xing Z, et al. Ensemble Application of Convolutional and Recurrent Neural Networks for Multi-label Text Categorization[C]// International Joint Conference on Neural Networks. IEEE, 2017:2377-2383.