Julia is a new language that could become the go-to choice for scientific computing, machine learning, data mining, large-scale linear algebra, and distributed and parallel computing. It uses LLVM-based just-in-time (JIT) compilation and aims to combine the speed of C with the dynamism of Ruby.
The creators of Julia wrote a manifesto to explain their motivation for creating yet another programming language. Jeff Bezanson, Stefan Karpinski, Viral Shah and Alan Edelman highlight Python's annoying dependencies, the JVM's unnecessary overhead, and the debugging pain of distributed systems like Hadoop as just a few of the reasons why Julia exists.
Julia holds a lot of promise because of a few fundamental design choices:
- Almost everything in Julia is written in Julia. This will get us out of the C/C++ and Fortran dependency-hell of scikit-learn.
- The type system makes it possible to experiment and iterate rapidly on data science problems. The documentation claims that "Julia’s type system is designed to be powerful and expressive, yet clear, intuitive and unobtrusive." In our experience this is in fact the case. For example, suppose we build a Hidden Markov Model (HMM) and initially treat all hidden states as Gaussian distributions, then want to try Exponential instead: we won't need to refactor the HMM code. If the HMM is designed correctly and references the abstract Distribution type from Distributions.jl, either Normal or Exponential can be used.
- Our limited testing suggests that equivalent code often runs two to three times as fast in Julia as in Python.
- GitHub is used for tracking all the Julia source code and for installing packages. Goodbye PyPI and Maven repos!
- Julia supports metaprogramming. This makes it possible for a program to transform and generate its own code, resulting in a new level of flexibility and powerful reflection capabilities.
- On the downside, Pandas is significantly more mature than Julia DataFrames.
- For NLP problems, Python is still a better choice. TextAnalysis.jl is very basic.
- John Myles White points out some challenges with the current Julia stats functionality that will be improved in v0.4.
- The Julia community is still small (but hopefully growing).
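To make the type-system point from the list above concrete, here is a minimal sketch of an HMM that only refers to an abstract emission type. Everything in it is hypothetical and written for illustration: `Emission`, `GaussianEmission`, `ExponentialEmission`, `density`, `HMM` and `likelihood` are made-up names standing in for what a package such as Distributions.jl provides, not its actual API. It also assumes current Julia 1.x syntax; the 0.3 release discussed in this post spelled these declarations with the `abstract` and `immutable`/`type` keywords instead.

```julia
# Hypothetical stand-ins for real distribution types; names are illustrative only.
abstract type Emission end

struct GaussianEmission <: Emission
    mu::Float64
    sigma::Float64
end

struct ExponentialEmission <: Emission
    rate::Float64
end

# One density method per concrete emission type; dispatch picks the right one.
density(d::GaussianEmission, x::Real) =
    exp(-0.5 * ((x - d.mu) / d.sigma)^2) / (d.sigma * sqrt(2pi))
density(d::ExponentialEmission, x::Real) =
    x < 0 ? 0.0 : d.rate * exp(-d.rate * x)

# The HMM is parameterized on the abstract type, so any Emission subtype fits.
struct HMM{E<:Emission}
    initial::Vector{Float64}      # initial state probabilities
    transition::Matrix{Float64}   # transition[i, j] = P(next = j | current = i)
    emissions::Vector{E}          # one emission distribution per hidden state
end

# Forward algorithm: likelihood of the observations under the model.
function likelihood(hmm::HMM, observations::Vector{Float64})
    alpha = hmm.initial .* [density(e, observations[1]) for e in hmm.emissions]
    for x in observations[2:end]
        alpha = (hmm.transition' * alpha) .* [density(e, x) for e in hmm.emissions]
    end
    return sum(alpha)
end

# Swapping the emission family touches only the construction, not the HMM code.
transition = [0.9 0.1; 0.2 0.8]
gaussian_hmm = HMM([0.5, 0.5], transition,
                   [GaussianEmission(0.0, 1.0), GaussianEmission(3.0, 1.0)])
exponential_hmm = HMM([0.5, 0.5], transition,
                      [ExponentialEmission(1.0), ExponentialEmission(0.2)])

observations = [0.3, 2.7, 3.1]
println(likelihood(gaussian_hmm, observations))
println(likelihood(exponential_hmm, observations))
```

The `likelihood` function never mentions Gaussian or Exponential. Adding another emission family would only need a new subtype and a matching `density` method.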
Getting Started on OS X
- Download and install Anaconda (only if you want to run Julia in IPython Notebook).
- Download and install Julia. The Mac OS X package (.dmg) contains Julia.app; drag the Julia icon to Applications.
- Symlink the julia binary into your path:
    sudo ln -s /Applications/Julia-0.3.6.app/Contents/Resources/julia/bin/julia /usr/bin/julia
- Run julia in a terminal (you should see the beautiful ASCII version of the logo).
- Start IPython Notebook with a Julia profile (in a terminal):
    ipython notebook --profile julia
- Gadfly.jl - plotting and data visualization package that conveniently installs most of the frequently used packages like DataFrames, Iterators, Distributions, etc.
- Cairo.jl - Cairo graphics library, used among other things to render PDFs from Gadfly charts
- DecisionTree.jl, Clustering.jl, MultivariateStats.jl - stats / machine learning tools
- DSP.jl - provides a number of common Digital Signal Processing (DSP) routines
- Graph.jl - provides graph types and algorithms like centrality, connected components, cycle detection, etc.
- Mocha.jl - deep learning framework inspired by the C++ framework Caffe
- Optim.jl - basic optimization algorithms in pure Julia
- Morsel.jl - a Sinatra-like micro framework for declaring routes and handling requests, built on top of HttpServer.jl and Meddle.jl
- PyCall.jl - if all else fails, call some Python library
- JavaCall.jl - reuse the millions of lines of Java code that are out there
- ~500 more packages
If you find a package that isn't registered you can install it by:
To update packages:
Pkg.update()  # for all packages
Examples and Tutorials
Introduction to Julia tutorial at SciPy 2014
Implementing Digital Filters in Julia
Videos from the Julia tutorial at MIT
Learn Bayes Theorem with Julia
Data Analysis in Julia with Data Frames
Is it Ready for Production?
Yes! We run Julia against massive volumes of data and process tens of thousands of transactions per second. We have successfully deployed Julia for graph analytics, non-parametric probability density functions, graphical models, DSP problems, etc.
We also use Julia in our fellowship. While we encourage fellows to check out Julia, we certainly do not insist on using it for every problem.
Getting Answers to Questions
The julia-users mailing list is for discussion around the usage of Julia.
JuliaCon 2015 will be held at the MIT Stata Center June 24 - June 28.