Nieme Machine Learning Library
- Nieme is a Machine Learning library for classification, regression and ranking. It implements several well-known algorithms and is specially designed for large-scale applications.
- http://www-connex.lip6.fr/%7emaes/wikihomepage/pmwiki.php?n=Nieme.Nieme
HMM and other statistical programs
- Implements Hidden Markov Models and their application to part-of-speech tagging. Also available: multivariate hypothesis-testing software for Gaussian data, and a ground-truth/metadata editing and visualization toolkit for OCR. [GPL]
- http://www.kanungo.com/software/software.html
ArrayMiner - ClassMarker
- Programmatically isolates similarities between scattered classes of genes. Expression driven. Uses a voting method along with k-nearest-neighbors classification. Very rich graphical interface. Samples of an unknown class are possible given enough
- http://www.optimaldesign.com/ArrayMiner/ClassMarker.htm
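The combination of voting and k-nearest-neighbors classification mentioned above can be illustrated generically. The following is a minimal sketch, not ArrayMiner's actual implementation: the function name `knn_vote`, the Euclidean distance, and the toy expression profiles are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_vote(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs (hypothetical layout).
    """
    # Sort training samples by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    # Each neighbor casts one vote for its own label.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-gene expression profiles with known classes.
samples = [
    ((0.10, 0.20), "A"),
    ((0.20, 0.10), "A"),
    ((0.90, 0.80), "B"),
    ((0.80, 0.90), "B"),
    ((0.85, 0.95), "B"),
]
predicted = knn_vote(samples, (0.15, 0.15), k=3)
```

With k = 3, two of the three nearest neighbors of the query belong to class "A", so the vote favors "A".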
Software Packages and Toolboxes
- Online software repository of the Department of Computer Science at the University of Pittsburgh. Everything from expert systems, finite-state machines, graphical models, linear programming, and machine learning through Turing machines is covered here an
- http://www.isp.pitt.edu/information/software.html
New Scientific Brainstorming Software for Inventors -- Windows, Mac and Linux
- Software developed to help your team brainstorm. Words in the user's idea sentence are replaced programmatically with new words from the program's categories, possibly creating ideas not previously thought of. Extensive word categories. [Commercial]
- http://www.paramind.net/pmscientificversion.html
GAlib: Matthew's Genetic Algorithms Library
- A toolset of genetic algorithm objects for C++ to perform optimization. Uses any representation and genetic operators. The documentation contains implementation and examples. Nice screenshots. PVM for distributed, parallel implementations. Includ
- http://lancet.mit.edu/ga/
SAM: Sequence Alignment and Modeling
- A collection of tools for creating and using HMMs for biological sequences. [AFL]
- http://bioweb.pasteur.fr/seqanal/motif/sam-uk.html
PRAPI: Pattern Recognition Application Programmer's Interface
- A library for many pattern recognition tasks. The package focuses mainly on image analysis but uses a general architecture and an XML-based data interchange format. Written in C++. [GPL]
- http://www.ee.oulu.fi/~topiolli/cppdocs/
Machine Learning Programs by Peter Clark
- A collection of downloadable packages including: KM - The Knowledge Machine, Guiding Inductive Learning with a Qualitative Model, LPE - Lazy Partial Evaluation, and CN2 - Rule induction from examples. [GPL]
- http://www.cs.utexas.edu/users/pclark/software/
HMMER: Biosequence Analysis
- A tool used to build HMMs from multiple alignments and calculate E-values. [GPL]
- http://hmmer.janelia.org/
BNET, Belief Network Tools & VisionKit, Computer Vision Components
- A developer toolkit for researchers and engineers to embed belief networks in software applications. Nice online demo. [Commercial]
- http://www.cra.com/commercial-products-services/bnet-engine-kit.asp
WebMO
- Web-based interface to computational chemistry. Has support for Gaussian 94/98/03, GAMESS, MolPro 2002, MOPAC 7/93/200x, NWChem 4.6+, QChem 2.1+, and Tinker 4.2+. Unix or Linux based. [Free]
- http://www.webmo.net/
SenseClusters
- Programs to cluster similar contexts together using unsupervised knowledge-lean methods for word sense discrimination, email categorization, and name discrimination. Written in Perl. [GNU]
- http://senseclusters.sourceforge.net/
What If: Web-based Scientific Discovery
- An algorithm engine that calculates everything from symmetry, torsion angles, and polar fraction through protein analysis and bond angles. Online version only. [Free]
- http://swift.cmbi.kun.nl/WIWWWI/
Pattern Matching Pointers
- Algorithms for searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays. [GPL]
- http://www.cs.ucr.edu/~stelo/pattern.html
FTP Repository Site List for Cognitive and Machine Learning
- Anonymous FTP sites from popular colleges and universities. To access the other pages, replace the 6 in the URL with a number from 1 to 23. [Free]
- http://hoohoo.ncsa.uiuc.edu/ftp/part6.html
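The page-number substitution described above can be sketched in a few lines. This assumes the other pages follow the same `partN.html` pattern as the listed URL; the helper name `part_urls` is hypothetical, and the sketch only builds the addresses without checking that they still resolve.

```python
# URL pattern taken from the entry above, with the page number left as a placeholder.
BASE = "http://hoohoo.ncsa.uiuc.edu/ftp/part{}.html"


def part_urls(first=1, last=23):
    """Return the list-page URLs for page numbers first..last (inclusive)."""
    return [BASE.format(n) for n in range(first, last + 1)]


urls = part_urls()
for url in urls:
    print(url)
```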
Tree Visualizer
- Software which allows one to navigate (fly) through the data tree, zoom in on interesting nodes, click on bars to get counts, and mark interesting places in the tree. Includes datasets for automobiles, voting, produce, and medical research. Uses LEDA, (
- http://www.sgi.com/tech/mlc/trees.html
MALLET: Advanced Machine Learning for Language
- An integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text. [GPL]
- http://mallet.cs.umass.edu/index.php/Main_Page
VIBES: Variational Inference for Bayesian Networks
- A software package that allows variational inference to be performed automatically on a Bayesian network. [GPL]
- http://vibes.sourceforge.net/
WinBUGS: The BUGS Project
- A stand-alone program that makes practical MCMC methods available to applied statisticians. The analysis can be controlled through a point-and-click interface, or a graphical interface can be constructed. The BUGS project also includes links to GeoBUG
- http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
OpenCyc
- Open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine. Can be used as the basis of a wide variety of intelligent applications including: rapid development of an ontology
- http://www.opencyc.org/
ELIE: An Adaptive Information Extraction System
- A tool for adaptive information extraction from text. Also included are a number of other text processing tools for POS tagging, chunking, gazetteer, and stemming. [GPL]
- http://www.aidanf.net/software/elie_an_adaptive_information_extraction_system
Sorting Algorithms for Machine Learning
- Various sorting algorithms including insertion, quick, merge, heap, Dutch National Flag, and radix with on-line demos. [Free]
- http://www.csse.monash.edu.au/%7Elloyd/tildeAlgDS/Sort/
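Of the algorithms listed above, the Dutch National Flag problem is the least self-explanatory by name: it partitions a sequence into three bands (less than, equal to, and greater than a pivot) in a single pass. The following is a minimal sketch of the standard three-pointer version, not code from the linked site; the function name and sample data are illustrative assumptions.

```python
def dutch_national_flag(items, pivot):
    """Return a copy of `items` partitioned into < pivot, == pivot, > pivot."""
    a = list(items)
    low, mid, high = 0, 0, len(a) - 1
    # Invariant: a[:low] < pivot, a[low:mid] == pivot, a[high+1:] > pivot.
    while mid <= high:
        if a[mid] < pivot:
            a[low], a[mid] = a[mid], a[low]
            low += 1
            mid += 1
        elif a[mid] > pivot:
            # The element swapped in from the right is unexamined, so mid stays put.
            a[mid], a[high] = a[high], a[mid]
            high -= 1
        else:
            mid += 1
    return a


flags = dutch_national_flag([2, 0, 1, 2, 1, 0], pivot=1)
```

The input list is left unmodified because the function works on a copy; an in-place variant would drop the `list(items)` call.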
Data Access Tools from the US Census Bureau
- General purpose data display and extraction tools that work with Census Bureau data. Census data available for pickup through Census Bureau employees only. [Free]
- http://www.census.gov/main/www/access.html
UCI Machine Learning Repository
- A repository of databases, domain theories, and data generators, hosted at the University of California, Irvine, and used by the machine learning community for the empirical analysis of machine learning algorithms. [Free]
- http://www.ics.uci.edu/~mlearn/MLRepository.html
LingPipe: Natural Language Processor (NLP)
- A suite of Java libraries for the linguistic analysis of human language which can link entity mentions to database entries, uncover relations, cluster documents, and discover significant trends. [GPL]
- http://www.alias-i.com/lingpipe/
Classification Toolbox for MATLAB
- A complete set of algorithms for classification, clustering, feature selection and reduction for Matlab. [Free]
- http://www.yom-tov.info/toolbox.html
Bayes Net Toolbox for Matlab
- Supports several inference algorithms and learning algorithms. Allows simulation of static and dynamic networks, including HMMs, IOHMMs, and Kalman filters. [GPL]
- http://bnt.sourceforge.net/
HRE API: A Portable Handwriting Recognition Engine
- This engine is a functionally complete interface for handwriting recognition. The API is written in ANSI C and has minimal reliance on the Windows system. There is a version ported to Linux. [GPL]
- http://playground.sun.com/pub/multimedia/handwriting/hre.html
BayesBuilder: Bayesian network construction tool
- This tool supports discrete gaussians and efficient noisy-OR nodes necessary in large networks. Node search, undo/redo, and automatic network layout are also supported. Written in C++ with a Java front end. [GPL]
- http://www.snn.ru.nl/nijmegen/index.php3?page=31
The AutoClass Project
- Takes a database of cases described by a combination of real- and discrete-valued attributes and automatically finds the natural classes in that data. It can be seen as a Naive Bayes classifier where the class node is hidden. [Free]
- http://ic-www.arc.nasa.gov/ic/projects/bayes-group/autoclass/
Spider: General Purpose Machine Learning Toolbox in Matlab
- An object-oriented environment for machine learning in Matlab. Algorithms can be plugged together and compared using, e.g., model selection, statistical tests, and visual plots. Algorithms may be downloaded separately. [GPL]
- http://www.kyb.tuebingen.mpg.de/bs/people/spider/index.html
(H)HMM Library and Designer
- This library allows probabilistic sequence models to be constructed using hidden Markov models (HMMs) and hierarchical hidden Markov models (HHMMs) in the OCaml programming language. [GPL]
- http://connex.lip6.fr/%7ebinsztok/hhld.html
The Torch Machine Learning Library
- This package forms a complete gradient descent machine learning library. Modules include support vector machines for classification and regression, ensemble models such as bagging or AdaBoost, non-parametric models such as k-nearest neighbors, Parzen regression,
- http://www.torch.ch
The PNC2 Rule Induction System
- Windows software tool that induces rules from your data using the PNC2 cluster algorithm. An integrated parameter-tuning component allows easy adjustment of the algorithm's behavior to the given problem without requiring any further knowledge. [GPL]
- http://www.newty.de/pnc2/index.html