Free Fellowship Focused on Hands-on Experience

The Startup.ML fellowship gives aspiring machine learning engineers the chance to hone their skills by building real-world applications. The number one qualification employers look for when hiring an ML engineering candidate is previous experience.

  • build scalable machine learning models with agile software development methodology
  • mentoring by experienced ML researchers
  • full-time for 4 months
  • pair program with other fellows and mentors
  • apply the latest research in deep learning, reinforcement learning, generative adversarial networks, etc.
  • program is offered in San Francisco, Oakland and New York

Fellows from previous cohorts are now in data science roles at Uber Advanced Technologies Center, Facebook, Baidu, Yelp, Orange,  etc.  See a complete list of our ~30 fellows.


Hiring Partners & Employers

Applying to the fellowship was the best thing I could have done for my career. There’s really no other program like it out there where you can take the lead on a project for a hedge fund and deliver a product that will actually be used. I gained invaluable experience in advanced ML methods that boosted my confidence in interviews and landed me where I am today!
— Trevor Lindsay, Facebook
Startup.ML provided a community of passionate machine learning practitioners and real world projects that helped solidify and deepen my knowledge while at the same time instilling confidence in my ability to bring significant, measurable value to clients.
— Alex Chao, Uber ATC

Past Fellows





Adversarial.AI is the first product launched from the fellowship program.

Billions of events are created every day by applications that collect telemetry, capture video and track the movement of physical assets. Although this data is rich, it can be hard to interpret and to extract value from.

Events may not have explicit labels (i.e., business meaning), but they contain useful structure that deep learning algorithms can discover.




One prominent area of focus since inception has been researching systematic trading strategies using artificial intelligence. Multibillion-dollar hedge funds and prop traders currently trade with strategies that were developed in our fellowship program.

Karén Chaltikian leads this area of our research. His background includes more than 17 years in the financial industry, including a decade at a hedge fund within BlackRock.




Lending.AI helps lenders make better decisions through artificial intelligence and novel use of data. We bring a modern approach to the underwriting process that uses the latest machine learning techniques to predict default, prepayment and delinquency risk.

We also use large-scale optimization and simulations to model capital reserves, liquidity and borrowing needs.


Fellowship Application Process

We offer open enrollment so you can apply at any time. We typically reach a decision in one to two weeks. 

Step 1:

Complete the application below

Step 2:

Work on a challenge problem and submit your results

Step 3:

Schedule a time to speak with a mentor or a former fellow

Program Location
Name *
When can you start? *
We have a rolling admission policy. Since there is no set curriculum, we admit fellows based on the needs of our projects.
Background *
Familiar with ML theory & math
Know how to code in Python
Familiar with neural networks and deep learning

Fellowship Frequently Asked Questions

Program Basics

How long is the program?

The program is 4 months on a full-time basis.  We do not currently offer a part-time option.

How much does it cost?

The program is free to the fellows.

Where are you located?

We are located at the Batchery in Oakland and Acceleprise in San Francisco.

Can the fellowship program be done remotely? 

A key aspect of the learning is in-person communication with mentors and other fellows. We don't believe the same level of collaboration is possible remotely, so we currently do not offer this option.

Do you sponsor visas?

Currently we do not have the ability to sponsor visas. 

The Bay Area is expensive. Do you offer a stipend or living accommodations?

At this point we don't offer any assistance.

Why this Fellowship?

How is this program different from other Data Science programs?

The fellows work on actual machine learning products that are used in production environments. Fellows work under the supervision of the mentor team.  Mentors are actively involved in the delivery of projects, including coding.

Fellows also have an opportunity to interact directly with our customers and get immediate feedback on their results.

What happens to fellows after they graduate? What jobs do they get?

Our fellows are now in machine learning roles at Uber ATC, Facebook, Enlitic, Sentient Technologies, Yelp, Orange, Pivotal, etc.

What type of projects will I get a chance to work on?

We apply deep learning and large-scale optimization expertise to finance and adversarial problems. Most of our projects involve deep learning and reinforcement learning on large data sets. 


What does the day-to-day look like?

The majority of the time is spent pair programming. We pair a fellow who is more proficient in quantitative skills with a fellow who is more proficient in software development. A project team typically consists of two fellows working under the supervision of a mentor.

We hold daily scrums and are very diligent about them. We use internal Slack channels, shared GitHub repos and Trello boards. Each week we hold a retrospective and an iteration planning session.

What tools will I get a chance to learn?

We are primarily a Python shop, but fellows are free to use whatever tools and techniques they believe are best suited to the problem. We typically use a variety of machine learning libraries, including TensorFlow, Keras, XGBoost, etc.

What percentage of the fellowship is actual model building?

Model building is an iterative process. Typically, we spend 50% of the time on data wrangling, 40% on modeling, and the remaining 10% on explaining results to business people.

Challenge Problem

We require applicants to work on a small challenge problem so we can assess their problem-solving and coding capabilities. Select a problem from the list below. Ideally, perform your analysis in a Jupyter notebook. Post the notebook on GitHub and submit your results.

Some hints for hacking our challenge:

  • Ask yourself: why would they have selected this problem for the challenge? What gotchas in this domain should I know about?
  • What is the highest level of accuracy that others have achieved with this dataset or with similar problems and datasets?
  • What types of visualizations will help me grasp the nature of the problem / data?
  • What feature engineering might help improve the signal?
  • Which modeling techniques are good at capturing the types of relationships I see in this data?
  • Now that I have a model, how can I be sure that I didn't introduce a bug in the code? If results are too good to be true, they probably are!
  • What are some of the weaknesses of the model, and how can the model be improved with additional work?
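
One way to act on the "too good to be true" hint is to compare any score you report against a trivial baseline, and to re-run the same pipeline on shuffled labels to confirm that the score collapses. The sketch below uses only the Python standard library. The function names and the toy labels are hypothetical illustrations, not part of the challenge.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [most_common_label] * len(y_test))


def shuffled_labels(y, seed=0):
    """Return a shuffled copy of the labels.

    Retraining on these labels should push the score down to roughly the
    baseline; if it stays high, features are probably leaking the target.
    """
    rng = random.Random(seed)
    y_shuffled = list(y)
    rng.shuffle(y_shuffled)
    return y_shuffled


# Toy labels (hypothetical): 80% of the examples belong to one class.
y_train = ["on_time"] * 80 + ["delayed"] * 20
y_test = ["on_time"] * 8 + ["delayed"] * 2

baseline = majority_baseline_accuracy(y_train, y_test)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

A model that reports 82% accuracy on data like this has learned very little beyond the class balance, which is why the baseline belongs next to every headline number.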

Language Detection

The European Parliament Proceedings Parallel Corpus is a text dataset used for evaluating language detection engines. The 1.5GB corpus includes 21 languages spoken in the EU.

Create a machine learning model trained on this dataset to predict the following test set.
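
As a starting point, a character n-gram model is a common baseline for language detection. The sketch below is a minimal multinomial naive Bayes over character trigrams, written with the standard library only. It is one possible approach rather than the expected solution: the class name, the uniform class prior, and the six toy sentences are all assumptions made for illustration, and a real submission would train on the corpus itself.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lower-cased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLanguageDetector:
    """Naive Bayes over character n-grams with add-one smoothing.

    Assumes a uniform prior over languages (hypothetical simplification).
    """

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.totals = Counter()             # language -> total n-grams seen
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            grams = char_ngrams(sentence, self.n)
            self.counts[label].update(grams)
            self.totals[label] += len(grams)
            self.vocab.update(grams)
        return self

    def predict(self, sentence):
        grams = char_ngrams(sentence, self.n)
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.counts):
            # Sum of smoothed log-likelihoods of each n-gram under this language.
            score = sum(
                math.log((self.counts[label][g] + 1)
                         / (self.totals[label] + vocab_size))
                for g in grams
            )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (hypothetical sentences, not taken from the corpus).
train_sentences = [
    "the committee approved the report on the budget",
    "we would like to thank the president for this debate",
    "la commission a approuvé le rapport sur le budget",
    "nous voudrions remercier le président pour ce débat",
    "der ausschuss hat den bericht über den haushalt angenommen",
    "wir möchten dem präsidenten für diese aussprache danken",
]
train_labels = ["en", "en", "fr", "fr", "de", "de"]

detector = NgramLanguageDetector().fit(train_sentences, train_labels)
print(detector.predict("the president thanked the committee"))
```

Character n-grams need no tokenizer and cope with unseen words, which makes them a reasonable first thing to beat before trying heavier models.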

Stochastic Volatility 

In this test, you are given the daily values of the S&P 500 index. Please calculate the stochastic volatility of the index and produce a rolling one-step-ahead forecast.
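
To make the "rolling one-step-ahead" requirement concrete, the sketch below produces a forecast for each day using only the returns observed before that day. Note the substitution: it uses an exponentially weighted moving average (EWMA) of squared returns, which is a simpler proxy and not a stochastic volatility model. The decay of 0.94, the warmup length, and the toy prices are assumptions chosen for illustration.

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def ewma_volatility_forecasts(returns, decay=0.94, warmup=5):
    """Rolling one-step-ahead volatility forecasts from an EWMA of r^2.

    forecasts[i] is the volatility predicted for returns[warmup + i],
    computed only from returns observed before that day.
    """
    if len(returns) <= warmup:
        raise ValueError("need more returns than warmup observations")
    # Seed the variance with the mean squared return over the warmup window.
    variance = sum(r * r for r in returns[:warmup]) / warmup
    forecasts = []
    for r in returns[warmup:]:
        # Record the forecast first, so it never sees the return it predicts.
        forecasts.append(math.sqrt(variance))
        # Then fold the newly observed return into the variance estimate.
        variance = decay * variance + (1 - decay) * r * r
    return forecasts


# Toy closing prices (hypothetical values, not real index data).
prices = [100.0, 101.0, 99.5, 100.5, 102.0, 101.0, 103.0, 102.5]
returns = log_returns(prices)
forecasts = ewma_volatility_forecasts(returns)

for offset, vol in enumerate(forecasts):
    print(f"return #{5 + offset}: forecast daily volatility {vol:.4f}")
```

The same loop structure (forecast, observe, update) carries over when the EWMA update is replaced by a filtering step from a proper stochastic volatility model.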

Airline On-Time Arrivals

Use the US Dept. of Transportation on-time arrival data for non-stop domestic flights by major air carriers to predict arrival delays.

Build a binary classification model for predicting arrival delays or a regression model that predicts the extent of the delay.  Do not use departure delay as an input feature.
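
The sketch below shows one way to build the binary target while keeping departure delay out of the inputs. The column names (`ARR_DELAY`, `DEP_DELAY`, `CARRIER`, `DISTANCE`), the 15-minute threshold, and the toy records are assumptions made for illustration; check them against the actual data files before relying on them.

```python
# Assumption: a flight counts as delayed when it arrives 15+ minutes late.
DELAY_THRESHOLD_MINUTES = 15

# Hypothetical column names. DEP_DELAY is excluded because the challenge
# forbids it; ARR_DELAY is excluded because it is the target itself.
EXCLUDED_COLUMNS = {"DEP_DELAY", "ARR_DELAY"}


def make_example(record):
    """Split one flight record into (features, label).

    Assumes the record has a numeric ARR_DELAY value in minutes.
    """
    label = int(record["ARR_DELAY"] >= DELAY_THRESHOLD_MINUTES)
    features = {
        name: value
        for name, value in record.items()
        if name not in EXCLUDED_COLUMNS
    }
    return features, label


# Toy records (hypothetical values).
records = [
    {"CARRIER": "AA", "DISTANCE": 2475, "DEP_DELAY": 30, "ARR_DELAY": 22},
    {"CARRIER": "DL", "DISTANCE": 760, "DEP_DELAY": 4, "ARR_DELAY": -5},
]

examples = [make_example(record) for record in records]
for features, label in examples:
    print(features, "->", "delayed" if label else "on time")
```

Dropping the excluded columns in one place, before any modeling code sees the data, makes it harder to reintroduce them by accident later.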

Global Terrorist Attacks

The Global Terrorism Database (GTD) is an open-source database of terrorist events around the world from 1970 through 2014. Some of the attacks have not been attributed to a particular terrorist group.

Use attack type, weapons used, description of the attack, etc. to build a model that can predict what group may have been responsible for an incident.