Web Directory

  Datasets (19)


Regular Websites in this category

TechTC - Technion Repository of Text Categorization Datasets
- Provides a large number of diverse test collections for use in text categorization research.
- http://techtc.cs.technion.ac.il

Bilkent University Function Approximation Repository
- Datasets used for the experimental analysis of function approximation techniques, and for training and demonstration by the machine learning and statistics communities.
- http://funapp.cs.bilkent.edu.tr/DataSets/

Reuters-21578 Text Categorization Corpus
- A classic benchmark for text categorization algorithms.
- http://www.daviddlewis.com/resources/testcollections/reuters21578/

RISE: Repository of Information Sources used in Information Extraction tasks
- Repository of online information sources: test domains for information extraction and wrapper generation tools that learn extraction rules (extraction patterns).
- http://www.isi.edu/info-agents/RISE/

WordSimilarity-353 Test Collection
- Contains 353 English word pairs along with human-assigned similarity judgements.
- http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html

Web->KB dataset
- Web pages partitioned into classes, with hyperlink data. The dataset has been used for text categorization and learning to extract symbolic knowledge from the World Wide Web.
- http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/

Learning Relational Concepts from Sensor Data of a Mobile Robot
- A collection of data sets, each represented in first-order logic. Maintained at the University of Dortmund, Germany.
- http://www-ai.cs.uni-dortmund.de/FORSCHUNG/PROJEKTE/BLEARN2/data-sets.html

HS3D - Homo Sapiens Splice Sites Dataset
- HS3D (Homo Sapiens Splice Sites Dataset) is a database of Homo sapiens exon, intron and splice regions extracted from GenBank primate sequences Rel.123. The aim of this data set is to give standardized material to train and to assess the prediction accu...
- http://www.sci.unisannio.it/docenti/rampone/

Penn Treebank Project
- A corpus of parsed sentences. Used by many researchers for training data-driven parsing algorithms.
- http://www.cis.upenn.edu/~treebank/

Time Series Data Library
- A collection of over 500 time series, maintained by Rob Hyndman. Time series are organized by subject.
- http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/

Face recognition dataset
- A dataset of face images for face recognition algorithms.
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html

NIST Special Database 4
- This NIST database of fingerprint images contains 2000 8-bit grayscale fingerprint image pairs.
- http://www.nist.gov/srd/nistsd4.htm

The StatLib Datasets Archive
- A repository of datasets used in statistics and machine learning.
- http://lib.stat.cmu.edu/datasets/

National Space Science Data Center
- Provides access to a wide variety of astrophysics, space physics, solar physics, lunar and planetary data from NASA space flight missions, in addition to selected other data and some models and software.
- http://nssdc.gsfc.nasa.gov/

TREC Data
- Text datasets used in information retrieval and learning in text domains.
- http://trec.nist.gov/data.html

UCI Machine Learning Repository
- A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
- http://www.ics.uci.edu/~mlearn/MLRepository.html

DELVE - Data for Evaluating Learning in Valid Experiments
- Data for Evaluating Learning in Valid Experiments: a standardized environment designed to evaluate the performance of methods that learn relationships based primarily on empirical data. Delve makes it possible for users to compare their learning methods with...
- http://www.cs.utoronto.ca/~delve/

Dataset generator
- Datgen, formerly SCDS, is a computer program that generates data to systematically test programs that consume data. These synthetic datasets can be used to validate learning algorithms.
- http://www.datgen.com/

The RCSB Protein Data Bank (PDB)
- Archive of experimentally determined 3-D structures of biological macromolecules from the Brookhaven National Laboratory.
- http://www.rcsb.org/pdb/
