# Abridged List of Machine Learning Topics

## Deep Learning

Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.
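As a rough illustration of "multiple non-linear transformations", the sketch below stacks three small layers, each an affine map followed by `tanh`. It is a minimal forward pass only, with random untrained weights; the helper names (`dense`, `tanh_layer`, `random_layer`, `forward`) and the layer sizes are assumptions made for this example and do not come from any of the libraries listed below.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine map: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def tanh_layer(inputs, weights, biases):
    """Affine map followed by an element-wise tanh non-linearity."""
    return [math.tanh(z) for z in dense(inputs, weights, biases)]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights in [-1, 1] and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Apply each layer in turn; the output of one layer feeds the next."""
    h = x
    for weights, biases in layers:
        h = tanh_layer(h, weights, biases)
    return h


rng = random.Random(0)
# Three stacked transformations: 4 -> 8 -> 8 -> 2 (sizes chosen arbitrarily).
layers = [random_layer(4, 8, rng), random_layer(8, 8, rng), random_layer(8, 2, rng)]
output = forward([0.5, -1.0, 2.0, 0.0], layers)
print(output)
```

A real framework would also learn the weights, typically by backpropagation; this sketch only shows how the transformations compose.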

### Software

Torch – Torch7 is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and provides a very efficient implementation, thanks to an easy and fast scripting language, LuaJIT, and an underlying C implementation.

Theano – Theano is a Python library that makes writing deep learning models easy, and gives the option of training them on a GPU.

Caffe – Caffe is a deep learning framework developed with cleanliness, readability, and speed in mind. It was created by Yangqing Jia during his PhD at UC Berkeley, and is in active development by the Berkeley Vision and Learning Center (BVLC) and by community contributors.

GraphLab – GraphLab Create includes image analysis tools; its Deep Learning package enables accurate and in-depth understanding of images and videos.

CUDA-Convnet – High-performance C++/CUDA implementation of convolutional neural networks.

Mocha – Deep learning framework for Julia, inspired by the C++ framework Caffe.

ConvNetJS – JavaScript implementation of neural networks.

Deeplearning4j – Java-based library designed for commercial-grade deep learning.

### Research

Schmidhuber, Jürgen. "Deep Learning in Neural Networks: An Overview." arXiv preprint arXiv:1404.7828 (2014).

Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35.8 (2013): 1798-1828.

### Talks

Ng, Andrew. "Deep Learning, Self-Taught Learning and Unsupervised Feature Learning"

Schmidhuber, Jürgen. "Deep Learning"

Bengio, Yoshua. "Deep Learning of Representations"

## Online Learning

Online machine learning is a model of induction that learns from one instance at a time, thus reducing the amount of memory required.
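To make "one instance at a time" concrete, here is a minimal sketch of an online perceptron. The learner sees each example once, updates its weights, and discards the example, so memory use stays constant regardless of stream length. The synthetic `stream` generator and the function names are assumptions for illustration; this is not how Vowpal Wabbit or Sofia-ML are implemented.

```python
import random


def stream(n, rng):
    """Hypothetical data source: yields (features, label) pairs one at a time."""
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 1 if x[0] + x[1] > 0 else -1
        yield x, y


def predict(weights, bias, x):
    """Return +1 or -1 depending on the sign of the linear score."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


def train_online(examples, n_features):
    """Perceptron updates: only the weights and bias are kept in memory."""
    weights = [0.0] * n_features
    bias = 0.0
    mistakes = 0
    for x, y in examples:
        if predict(weights, bias, x) != y:
            # Update on mistakes only, then move on to the next example.
            mistakes += 1
            weights = [w + y * xi for w, xi in zip(weights, x)]
            bias += y
    return weights, bias, mistakes


rng = random.Random(0)
weights, bias, mistakes = train_online(stream(5000, rng), n_features=2)
print(weights, bias, mistakes)
```

Because `stream` is a generator, the full dataset is never materialized; the same loop would work on examples read line by line from a file too large to fit in memory.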

### Software

Vowpal Wabbit (VW) – a fast out-of-core learning system that serves as a research vehicle for online learning, reductions, cluster-parallel learning, and other areas.

Sofia-ML – Suite of Fast Incremental Algorithms for Machine Learning. Includes methods for learning classification and ranking models, using Pegasos SVM, SGD-SVM, ROMMA, Passive-Aggressive Perceptron, Perceptron with Margins, and Logistic Regression.

### Research

Agarwal, Alekh, et al. "A reliable effective terascale linear learning system." arXiv preprint arXiv:1110.4198 (2011).

### Lectures

Langford, John, and Yann LeCun. "Large-Scale Machine Learning"

## Graphical Models

### Software

Hidden Markov Model in Julia – an implementation of hidden Markov models in the Julia language.

### Research

Jordan, Michael I. "Graphical models." Statistical Science (2004): 140-155.

### Lectures

Koller, Daphne. "Probabilistic Graphical Models."

## Structured Prediction

### Software

PyStruct – a Python library that implements learning for structured prediction.

Vowpal Wabbit (VW) – VW supports "learning to search" algorithms like Searn and DAgger.

### Research

Daumé III, Hal, John Langford, and Stephane Ross. "Efficient programmable learning to search." arXiv preprint arXiv:1406.1837 (2014).

## Ensemble Methods

### Software

H2O – In-memory prediction engine for big data science. Has an implementation of Distributed Random Forests and Neural Networks.

### Research

Breiman, Leo. "Random forests." Machine learning 45.1 (2001): 5-32.

## Kernel Machines

### Research

Le, Quoc, Tamás Sarlós, and Alex Smola. "Fastfood—approximating kernel expansions in loglinear time." Proceedings of the international conference on machine learning. 2013.

## Hyper-parameter Optimization

### Software

Hyperopt – library for serial and parallel optimization over search spaces with real-valued, discrete, and conditional dimensions.

### Research

Bergstra, James, Dan Yamins, and David D. Cox. "Hyperopt: A python library for optimizing the hyperparameters of machine learning algorithms." (2013).

## Optimization

### Research

Recht, Benjamin, et al. "Hogwild: A lock-free approach to parallelizing stochastic gradient descent." Advances in Neural Information Processing Systems. 2011.

Boyd, Stephen, et al. "Distributed optimization and statistical learning via the alternating direction method of multipliers." Foundations and Trends® in Machine Learning 3.1 (2011): 1-122.

Shalev-Shwartz, Shai, et al. "Pegasos: Primal estimated sub-gradient solver for svm." Mathematical programming 127.1 (2011): 3-30.

## Graphs

### Software

GraphX – Resilient Distributed Graph System on Spark

Titan – Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster.

Lumify – Analyze relationships, automatically discover paths between entities, and establish new links in 2D or 3D.

STINGER – dynamic/streaming graphs

Basic Linear Algebra Subprograms (BLAS) – low-level routines for vector and matrix operations, the building blocks for expressing graph algorithms in the language of linear algebra.

### Research

GraphBLAS – An effort to define standard building blocks for Graph Algorithms in the language of Linear Algebra.

An NSA Big Graph experiment – brain-scale graphs (100 billion vertices, 100 trillion edges)

## Hadoop / Spark

### Software

Spark MLlib – Apache Spark's scalable machine learning library.

Cloudera Oryx – provides simple, real-time large-scale machine learning / predictive analytics infrastructure.

Metronome – a suite of parallel iterative algorithms that run natively on Hadoop's Next Generation YARN platform.

### Research

Zaharia, Matei. "An Architecture for Fast and General Data Processing on Large Clusters." (2014).

## GPU learning

### Software

BIDMach – a very fast tool for machine learning, from small problems to terabyte scale. BIDMach claims to be the fastest system for many common machine learning tasks.

Deep Learning – see Theano & Torch

### Research

Bergstra, James, et al. "Theano: a CPU and GPU math compiler in Python." Proc. 9th Python in Science Conf. 2010.

## Julia

### Research

Bezanson, Jeff, et al. "Julia: A fast dynamic language for technical computing." arXiv preprint arXiv:1209.5145 (2012).

Lubin, Miles, and Iain Dunning. "Computing in Operations Research using Julia." arXiv preprint arXiv:1312.1431 (2013).

## Robotics

### Software

Robot Operating System (ROS) – a set of software libraries and tools that help you build robot applications.

MOOS-IvP – a set of open source C++ modules for providing autonomy on robotic platforms, in particular autonomous marine vehicles.

### Research

Quigley, Morgan, et al. "ROS: an open-source Robot Operating System." ICRA workshop on open source software. Vol. 3. No. 3.2. 2009.

### Podcast

John Leonard discusses SLAM, KinectFusion, etc.

## Natural Language Processing

### Software

Natural Language Toolkit (NLTK) – easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

word2vec – an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications.

Global Vectors for Word Representation (GloVe) – unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.

gensim – topic modeling for humans

spaCy – industrial-strength NLP, thanks to Cython

### Research

Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "Glove: Global vectors for word representation." Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014) 12 (2014).

Le, Quoc, and Tomas Mikolov. "Distributed Representations of Sentences and Documents." ICML, 2014.

## Visualization

### Software

t-SNE – t-Distributed Stochastic Neighbor Embedding is a technique for dimensionality reduction that is well suited for the visualization of high-dimensional datasets.

Processing – visualization programming language, development environment, and online community

### Research

Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.

### Talks

Van der Maaten, Laurens. "Visualizing Data Using t-SNE"

## Computer Vision

### Research

Jarrett, Kevin, et al. "What is the best multi-stage architecture for object recognition?" Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009.

Tompson, Jonathan, et al. "Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation." arXiv preprint arXiv:1406.2984 (2014).