Adult Dataset

Contributor: 杜成玉
Download: http://www.cs.toronto.edu/~delve/data/adult/desc.html

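Overview

Source: https://www.jianshu.com/p/be23b3870d2e
This data was extracted from the 1994 US Census database and can be used to predict whether a resident's income exceeds $50K per year. The class variable indicates whether annual income exceeds $50K. The attribute variables include age, work class, education, occupation, race, and other important information. Notably, 7 of the 14 attribute variables are categorical.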
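
As a rough illustration of the prediction target described above, the class variable can be mapped to a binary label. This is a minimal sketch, not part of the original description: the label strings ">50K" and "<=50K", and the trailing period that appears on labels in some copies of the files, are assumptions based on the commonly distributed UCI files, and `income_label` is a hypothetical helper name.

```python
def income_label(raw):
    """Map a raw class string to 1 (income above 50K) or 0 (50K or below).

    Assumes the label strings ">50K" and "<=50K"; a trailing period and
    surrounding whitespace are tolerated and removed before comparison.
    """
    cleaned = raw.strip().rstrip(".")
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {raw!r}")
```

Raising an error on unknown strings, rather than defaulting to 0, keeps malformed rows from silently entering the negative class.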

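Dataset Characteristics

Source: http://archive.ics.uci.edu/ml/datasets/Adult
Data set characteristics: multivariate
Number of records: 48842
Area: social
Attribute types: categorical, integer
Number of attributes: 14
Associated tasks: classification
Missing values? Yes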
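
Since the data set has 14 attributes plus a class label and contains missing values, a loader usually needs to name the columns and detect the missing entries. The sketch below shows one way to do that with the Python standard library. It rests on assumptions that are not stated in this document: the column names follow the UCI description of the data, fields are comma-separated, and missing values are written as "?". The two rows in `SAMPLE` are illustrative rows in that assumed format, not guaranteed to be actual records, and `parse_adult` and `count_missing` are hypothetical helper names.

```python
import csv
import io

# Column names assumed from the UCI description (14 attributes + class label).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Illustrative rows in the assumed file format ("?" marks a missing value).
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, Husband, Asian-Pac-Islander, Male, 0, 0, 60, South, >50K
"""


def parse_adult(text):
    """Parse comma-separated rows into dicts, turning "?" into None."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        records.append({
            name: (None if value.strip() == "?" else value.strip())
            for name, value in zip(COLUMNS, row)
        })
    return records


def count_missing(records):
    """Count missing (None) values per column."""
    counts = {}
    for record in records:
        for name, value in record.items():
            if value is None:
                counts[name] = counts.get(name, 0) + 1
    return counts


records = parse_adult(SAMPLE)
missing = count_missing(records)
```

To apply the same functions to a downloaded copy, the file contents can be read into a string and passed to `parse_adult`. All values stay as strings here, so numeric columns such as `age` would still need converting with `int`.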

Related Papers

1. Ron Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. 1996.
2. Rakesh Agrawal and Ramakrishnan Srikant and Dilys Thomas. Privacy Preserving OLAP. SIGMOD Conference. 2005.
3. Rich Caruana and Alexandru Niculescu-Mizil. An Empirical Evaluation of Supervised Learning for ROC Area. ROCAI. 2004.
4. Rich Caruana and Alexandru Niculescu-Mizil and Geoff Crew and Alex Ksikes. Ensemble selection from libraries of models. ICML. 2004.
5. Bianca Zadrozny. Learning and evaluating classifiers under sample selection bias. ICML. 2004.
