A little effort goes a long way! 

Fellows who completed our program now work at places like Enlitic, Sentient Technologies, Orange Silicon Valley and Uber Advanced Technologies Center (ATC). Here is your chance to demonstrate that you have the necessary skills (and passion) to join our program.

Step 1:  Submit an application.
Step 2:  Select one of the problems below that you will enjoy working on.
Step 3:  Ideally, perform your analysis in an IPython notebook.  Post the notebook on GitHub and submit your results.

The European Parliament Proceedings Parallel Corpus is a text dataset used for evaluating language detection engines. The 1.5GB corpus includes 21 languages spoken in the EU.

Create a machine learning model trained on this dataset and use it to make predictions on the following test set.

Language Detection Model
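
To make the language detection task concrete, here is a minimal sketch of one common baseline: a character trigram model with add-one smoothing. It is only an illustration, not the expected solution. The function names (`char_ngrams`, `train`, `predict`) and the toy sentences are made up for this example and do not come from the corpus.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded string."""
    padded = " " + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples):
    """samples: iterable of (language_code, sentence). Returns n-gram counts per language."""
    counts = {}
    for lang, sentence in samples:
        counts.setdefault(lang, Counter()).update(char_ngrams(sentence))
    return counts


def predict(counts, sentence):
    """Return the language with the highest add-one-smoothed log-likelihood."""
    vocab = len({gram for c in counts.values() for gram in c})
    best_lang, best_score = None, -math.inf
    for lang, c in counts.items():
        total = sum(c.values())
        score = sum(math.log((c[gram] + 1) / (total + vocab))
                    for gram in char_ngrams(sentence))
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


# Toy sentences stand in for the corpus; they are invented for illustration.
toy = [
    ("en", "the committee has approved the report"),
    ("en", "this is the position of the parliament"),
    ("fr", "le rapport de la commission est adopté"),
    ("fr", "nous devons prendre une décision"),
    ("de", "der Bericht wurde vom Ausschuss angenommen"),
    ("de", "wir müssen eine Entscheidung treffen"),
]
model = train(toy)
print(predict(model, "the parliament approved this"))
```

On the real corpus you would train on many sentences per language and report accuracy on the test set; a stronger model (for example, a linear classifier over n-gram features) is a natural next step.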

In this test, you are given the daily value of the S&P 500 index. Please calculate the stochastic volatility of the index and produce a rolling one-step-ahead forecast.

Stochastic Volatility of an Index
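
The "rolling one-step-ahead" part of this task can be sketched as follows. Note that this example swaps in an exponentially weighted moving average (EWMA) of squared returns, which is a simpler baseline and not a stochastic volatility model; it is here only to show the forecasting loop, where each forecast uses data up to the previous day. The decay `lam=0.94`, the `warmup` length and the price series are assumptions chosen for illustration.

```python
import math


def log_returns(prices):
    """Daily log returns from a list of index levels."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_ewma_forecast(returns, lam=0.94, warmup=5):
    """One-step-ahead variance forecasts.

    The forecast for day t is computed from returns up to day t-1 only.
    lam and warmup are illustrative choices, not values given by the task.
    """
    # Initialise the variance with the mean squared return over the warm-up window.
    var = sum(r * r for r in returns[:warmup]) / warmup
    forecasts = []
    for r in returns[warmup:]:
        forecasts.append(var)                 # forecast made before observing r
        var = lam * var + (1 - lam) * r * r   # update after observing r
    return forecasts


# Made-up index levels, for illustration only.
prices = [100.0, 101.0, 100.5, 102.0, 101.2, 103.0, 102.4, 104.1, 103.0, 105.2]
rets = log_returns(prices)
vol_forecasts = [math.sqrt(v) for v in rolling_ewma_forecast(rets)]
print(vol_forecasts)
```

A genuine stochastic volatility model treats the log-variance as a latent process and needs a filtering or sampling method to estimate it, but the evaluation loop keeps the same shape: forecast, observe, update.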

This dataset contains the trajectories of thousands of taxis operating in China. Your task is to read through the following paper and produce the first graphs (the distributions of distances and sampling time intervals).

Next, please pick the trajectory of a particular trip and determine its smoothed trajectory (using, for example, a Kalman filter or splines).

Trajectories of Taxis
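
As a starting point for the smoothing step, here is a minimal sketch of a forward Kalman filter with a random-walk state model, applied to latitude and longitude independently. It is a filter only (no backward smoothing pass), and the noise variances and the sample coordinates are assumptions made up for this example; a constant-velocity model or a full smoother would usually fit GPS tracks better.

```python
def kalman_filter_1d(zs, process_var=1e-5, meas_var=1e-3):
    """Filter one coordinate with a random-walk (local level) Kalman filter.

    process_var and meas_var are illustrative values, not tuned for any dataset.
    """
    x, p = zs[0], meas_var      # initial state estimate and its variance
    out = [x]
    for z in zs[1:]:
        p = p + process_var     # predict: state unchanged, uncertainty grows
        k = p / (p + meas_var)  # Kalman gain
        x = x + k * (z - x)     # correct the estimate with the measurement residual
        p = (1 - k) * p         # posterior variance
        out.append(x)
    return out


def smooth_trajectory(points, **kwargs):
    """points: list of (lat, lon) tuples. Each coordinate is filtered independently."""
    lats = kalman_filter_1d([pt[0] for pt in points], **kwargs)
    lons = kalman_filter_1d([pt[1] for pt in points], **kwargs)
    return list(zip(lats, lons))


# Hypothetical noisy GPS fixes, invented for illustration.
trip = [(30.000, 120.000), (30.012, 120.009), (30.018, 120.023),
        (30.035, 120.028), (30.039, 120.043), (30.052, 120.049)]
for lat, lon in smooth_trajectory(trip):
    print(round(lat, 4), round(lon, 4))
```

Plotting the raw and filtered points on the same axes is a quick way to check whether the noise settings are reasonable for the trip you picked.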


Use the US Dept. of Transportation on-time arrival data for non-stop domestic flights by major air carriers to predict arrival delays.

Build a binary classification model for predicting arrival delays or a regression model that predicts the extent of the delay.  Do not use departure delay as an input feature.

Airline On-Time Arrivals
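
Before modelling, it helps to fix the target and remove the forbidden feature. The sketch below turns one flight record into a feature dictionary and a binary label. The column names (`arr_delay_minutes`, `dep_delay_minutes` and the rest) and the 15-minute threshold are assumptions for illustration; check them against the actual data files and choose the threshold you want to defend.

```python
# Hypothetical column names; verify them against the downloaded data.
TARGET = "arr_delay_minutes"
EXCLUDED = {"dep_delay_minutes"}  # the challenge forbids departure delay as an input


def make_example(row, threshold_minutes=15):
    """Split one flight record into (features, binary_label)."""
    features = {k: v for k, v in row.items()
                if k != TARGET and k not in EXCLUDED}
    label = 1 if row[TARGET] >= threshold_minutes else 0
    return features, label


# A made-up record, for illustration only.
flight = {
    "carrier": "XX",
    "origin": "AAA",
    "dest": "BBB",
    "sched_dep_hour": 17,
    "dep_delay_minutes": 40,
    "arr_delay_minutes": 32,
}
features, label = make_example(flight)
print(features, label)
```

For the regression variant, keep the same feature filtering and use the arrival delay in minutes as the target instead of the thresholded label.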


The Global Terrorism Database (GTD) is an open-source database that includes information on terrorist events around the world from 1970 through 2014. A portion of the attacks has not been attributed to a particular terrorist group.

Use attack type, weapons used, description of the attack, etc. to build a model that can predict what group may have been responsible for an incident. 

Global Terrorism Attack Attribution

Submit Your Results
