Abridged List of Machine Learning Topics


Deep Learning

Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.
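
The phrase "multiple non-linear transformations" can be made concrete with a small sketch. This is not taken from any of the frameworks listed here; the function names and the hand-picked weights are assumptions chosen only for illustration. Each layer applies an affine map followed by tanh, and the layers are composed so that each output feeds the next.

```python
import math

def dense(weights, biases, inputs):
    """One non-linear transformation: affine map followed by tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(layers, inputs):
    """Compose the layers: the output of each one is the input of the next."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(weights, biases, activations)
    return activations

# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = forward(layers, [1.0, 2.0])
```

In practice the weights would be learned from data rather than written by hand; the sketch only shows the stacked structure.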



Torch – Torch7 is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and provides a very efficient implementation, thanks to an easy and fast scripting language, LuaJIT, and an underlying C implementation.

Theano – Theano is a Python library that makes writing deep learning models easy, and gives the option of training them on a GPU.

Caffe – Caffe is a deep learning framework developed with cleanliness, readability, and speed in mind. It was created by Yangqing Jia during his PhD at UC Berkeley, and is in active development by the Berkeley Vision and Learning Center (BVLC) and by community contributors. 

GraphLab – GraphLab Create provides image analysis tools; its Deep Learning package enables accurate, in-depth understanding of images and videos.

CUDA-Convnet – High-performance C++/CUDA implementation of convolutional neural networks.

Mocha – Deep Learning framework for Julia, inspired by the C++ framework Caffe.

ConvNetJS – JavaScript implementation of neural networks

Deeplearning4j – Java-based library designed for commercial-grade deep learning.


Schmidhuber, Jürgen. "Deep Learning in Neural Networks: An Overview." arXiv preprint arXiv:1404.7828 (2014).

Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35.8 (2013): 1798-1828.

Online Learning

Online machine learning is a model of induction that learns from one instance at a time, which reduces the amount of memory required.
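
A minimal sketch of the idea, assuming a perceptron as the learner (the tools listed in this section use more elaborate updates). The learner sees each example once as it arrives and keeps only the weight vector in memory, never the data set. The stream generator and its data are hypothetical.

```python
def online_perceptron(stream, n_features, learning_rate=1.0):
    """Consume (features, label) pairs one at a time; labels are -1 or +1."""
    weights = [0.0] * n_features
    bias = 0.0
    mistakes = 0
    for features, label in stream:
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if label * score <= 0:  # wrong side of the boundary (or on it): update
            weights = [w + learning_rate * label * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * label
            mistakes += 1
    return weights, bias, mistakes

def example_stream():
    """Hypothetical stand-in for data read from disk or a socket."""
    data = [([2.0, 1.0], 1), ([-1.0, -2.0], -1),
            ([1.5, 0.5], 1), ([-2.0, -1.0], -1)]
    for _ in range(5):
        yield from data

weights, bias, mistakes = online_perceptron(example_stream(), n_features=2)
```

Because the stream is a generator, memory use depends on the number of features, not on the number of examples.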


Vowpal Wabbit (VW) – a fast out-of-core learning system that serves as a research vehicle for online learning, reductions, cluster-parallel learning, and other areas.

Sofia-ML – Suite of Fast Incremental Algorithms for Machine Learning. Includes methods for learning classification and ranking models, using Pegasos SVM, SGD-SVM, ROMMA, Passive-Aggressive Perceptron, Perceptron with Margins, and Logistic Regression.


Agarwal, Alekh, et al. "A reliable effective terascale linear learning system." arXiv preprint arXiv:1110.4198 (2011).


Langford, John, and Yann LeCun. "Large-Scale Machine Learning"

Graphical Models


Hidden Markov Model in Julia – an implementation of hidden Markov models in the Julia language


Jordan, Michael I. "Graphical models." Statistical Science (2004): 140-155.


Koller, Daphne. "Probabilistic Graphical Models." 

Structured Predictions



PyStruct – PyStruct implements learning for structured prediction

Vowpal Wabbit (VW) – VW supports "learning to search" algorithms like Searn and DAgger. 


Daumé III, Hal, John Langford, and Stephane Ross. "Efficient programmable learning to search." arXiv preprint arXiv:1406.1837 (2014).

Ensemble Methods


H2O – In-memory prediction engine for big data science. Includes implementations of Distributed Random Forests and Neural Networks.


Breiman, Leo. "Random forests." Machine learning 45.1 (2001): 5-32.

Kernel Machines


LIBSVM – Support Vector Machines with linear, polynomial, radial basis function, and sigmoid kernels

SVMlight – Feature-rich Support Vector Machines implementation for small datasets
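
The four kernel types named in the LIBSVM entry can be written out as plain functions. This is an illustrative sketch, not LIBSVM's code; the parameter names gamma, coef0, and degree follow common convention, and the default values are assumptions.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function: decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

x, y = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(x, y),
    "polynomial": polynomial_kernel(x, y, degree=2),
    "rbf": rbf_kernel(x, y, gamma=0.1),
    "sigmoid": sigmoid_kernel(x, y, gamma=0.01),
}
```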




Le, Quoc, Tamás Sarlós, and Alex Smola. "Fastfood—approximating kernel expansions in loglinear time." Proceedings of the international conference on machine learning. 2013.

Hyper-parameter Optimization


Hyperopt – library for serial and parallel optimization over search spaces with real-valued, discrete, and conditional dimensions.
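
To illustrate what a search space with real-valued, discrete, and conditional dimensions looks like, here is a sketch of plain random search in standard-library Python. It does not use Hyperopt's API or its search algorithms; the search space, the toy loss, and all names are hypothetical.

```python
import math
import random

def sample_config(rng):
    """Draw one configuration from a hypothetical search space."""
    config = {"classifier": rng.choice(["svm", "knn"])}  # discrete choice
    if config["classifier"] == "svm":
        # Conditional, real-valued dimension: only exists for the SVM branch.
        config["C"] = 10 ** rng.uniform(-2, 3)
    else:
        # Conditional, discrete dimension: only exists for the k-NN branch.
        config["n_neighbors"] = rng.randint(1, 15)
    return config

def toy_loss(config):
    """Stand-in for a validation loss; a real one would train a model."""
    if config["classifier"] == "svm":
        return abs(math.log10(config["C"]) - 1.0)  # smallest near C = 10
    return 0.5 + abs(config["n_neighbors"] - 5) / 10.0

def random_search(n_trials, seed=0):
    """Keep the best configuration seen over n_trials independent draws."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        loss = toy_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = random_search(50)
```

The conditional structure is the point: "C" is sampled only when the SVM branch is chosen, so the two branches have different dimensions.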


Bergstra, James, Dan Yamins, and David D. Cox. "Hyperopt: A python library for optimizing the hyperparameters of machine learning algorithms." (2013).

Optimization



ml-ease – large-scale machine learning library from LinkedIn; it currently provides ADMM-based large-scale logistic regression

Pegasos – code that implements the Pegasos algorithm for solving SVMs in the primal form.
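
A minimal sketch of the Pegasos update, assuming a linear SVM without a bias term and without the optional projection step. At step t it picks one example at random, uses step size 1/(lambda * t), shrinks the weights, and adds the example only if it violates the margin. The data set, function name, and default parameters are hypothetical.

```python
import random

def pegasos(examples, lam=0.1, n_steps=1000, seed=0):
    """Stochastic sub-gradient descent on the primal SVM objective.

    examples: list of (features, label) pairs with labels -1 or +1.
    """
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    for t in range(1, n_steps + 1):
        x, y = rng.choice(examples)          # one random example per step
        eta = 1.0 / (lam * t)                # decreasing step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        w = [(1.0 - eta * lam) * wi for wi in w]   # regularization shrink
        if margin < 1.0:                     # hinge loss is active
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical linearly separable data.
examples = [([2.0, 1.0], 1), ([1.0, 2.0], 1),
            ([-1.0, -2.0], -1), ([-2.0, -1.0], -1)]
w = pegasos(examples)
```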



Recht, Benjamin, et al. "Hogwild: A lock-free approach to parallelizing stochastic gradient descent." Advances in Neural Information Processing Systems. 2011.

Boyd, Stephen, et al. "Distributed optimization and statistical learning via the alternating direction method of multipliers." Foundations and Trends® in Machine Learning 3.1 (2011): 1-122.

Shalev-Shwartz, Shai, et al. "Pegasos: Primal estimated sub-gradient solver for svm." Mathematical programming 127.1 (2011): 3-30.

Graphs



GraphX – Resilient Distributed Graph System on Spark

Titan – Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster.

Lumify – Analyze relationships, automatically discover paths between entities, and establish new links in 2D or 3D.

STINGER – dynamic/streaming graphs

Basic Linear Algebra Subprograms (BLAS) – standard building blocks for basic vector and matrix operations.


GraphBLAS – An effort to define standard building blocks for Graph Algorithms in the language of Linear Algebra.

An NSA Big Graph experiment – Brain scale graphs (100 billion vertices, 100 trillion edges)
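
"Graph algorithms in the language of linear algebra" can be illustrated with breadth-first search written as repeated Boolean matrix-vector products. This is a plain-Python sketch of the idea, not the GraphBLAS API; the function name and example graph are assumptions.

```python
def bfs_levels(adjacency, source):
    """Return the BFS level of every vertex, or None if unreachable.

    adjacency[i][j] is True when there is an edge from i to j.
    """
    n = len(adjacency)
    levels = [None] * n
    frontier = [i == source for i in range(n)]
    level = 0
    while any(frontier):
        for i in range(n):
            if frontier[i]:
                levels[i] = level
        # Boolean matrix-vector product (OR of ANDs) gives the vertices
        # reachable in one step; the mask drops those already visited.
        frontier = [
            levels[j] is None
            and any(frontier[i] and adjacency[i][j] for i in range(n))
            for j in range(n)
        ]
        level += 1
    return levels

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; vertex 4 is isolated.
n = 5
adjacency = [[False] * n for _ in range(n)]
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    adjacency[i][j] = True

levels = bfs_levels(adjacency, source=0)
```

Swapping the OR/AND pair for other operator pairs (for example min and plus) turns the same loop into other graph algorithms, which is the motivation for standardizing the building blocks.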

Hadoop / Spark



Spark MLlib – Apache Spark's scalable machine learning library.

Cloudera Oryx – provides simple, real-time large-scale machine learning / predictive analytics infrastructure.

Metronome – a suite of parallel iterative algorithms that run natively on Hadoop's Next Generation YARN platform

GPU learning


BIDMach – a very fast tool for machine learning, from small problems to terabyte scale. BIDMach claims to be the fastest system for many common machine learning tasks.

Deep Learning – see Theano and Torch


Bergstra, James, et al. "Theano: a CPU and GPU math compiler in Python." Proc. 9th Python in Science Conf. 2010.

Julia



Julia – Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments.

IJulia – Julia-language backend combined with the IPython interactive environment 


Bezanson, Jeff, et al. "Julia: A fast dynamic language for technical computing." arXiv preprint arXiv:1209.5145 (2012).

Lubin, Miles, and Iain Dunning. "Computing in Operations Research using Julia." arXiv preprint arXiv:1312.1431 (2013).

Robotics



Robot Operating System (ROS) – a set of software libraries and tools that help you build robot applications. 

MOOS-IvP – a set of open source C++ modules for providing autonomy on robotic platforms, in particular autonomous marine vehicles.


Quigley, Morgan, et al. "ROS: an open-source Robot Operating System." ICRA workshop on open source software. Vol. 3. No. 3.2. 2009.


John Leonard discusses SLAM, KinectFusion, etc.

Natural Language Processing



Natural Language Toolkit (NLTK) – easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

word2vec – an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications.

Global Vectors for Word Representation (GloVe) – unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.

gensim – topic modeling for humans

spaCy – industrial-strength NLP, thanks to Cython
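
The word2vec and GloVe entries both start from which words appear near which. The sketch below shows the raw inputs under simplifying assumptions: (target, context) pairs of the kind a skip-gram model trains on, and unweighted word-word co-occurrence counts of the kind GloVe aggregates. It is not code from either project; the function names, window size, and sentence are hypothetical.

```python
from collections import Counter

def skipgram_pairs(tokens, window=2):
    """List every (target, context) pair within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cooccurrence_counts(tokens, window=2):
    """Aggregate the pairs into global word-word co-occurrence counts."""
    return Counter(skipgram_pairs(tokens, window))

tokens = "the cat sat on the mat".split()
pairs = skipgram_pairs(tokens, window=1)
counts = cooccurrence_counts(tokens, window=1)
```

A real pipeline would run over a large corpus and then fit vectors to these pairs or counts; that training step is omitted here.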



Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "Glove: Global vectors for word representation." Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014) 12 (2014).

Le, Quoc, and Tomas Mikolov. "Distributed Representations of Sentences and Documents." ICML (2014).

Visualization



t-SNE – t-Distributed Stochastic Neighbor Embedding is a technique for dimensionality reduction that is well suited for the visualization of high-dimensional datasets

Processing – visualization programming language, development environment, and online community


Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.


Laurens van der Maaten. "Visualizing Data Using t-SNE"

Computer Vision


OpenCV – popular computer vision library designed for computational efficiency, with a strong focus on real-time applications.

CCV – C-based/Cached/Core Computer Vision Library


Jarrett, Kevin, et al. "What is the best multi-stage architecture for object recognition?" Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009.

Tompson, Jonathan, et al. "Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation." arXiv preprint arXiv:1406.2984 (2014).