Scalable deep learning (talk)

By Ameet Talwalkar

May 01, 2018

I recently spoke at the AI Conference in NYC about some of the academic research underlying our efforts at Determined AI. My talk focuses on two fundamental bottlenecks in developing deep learning applications at scale. The first is exploring an architecture’s design space, which typically requires training tens to thousands of models with different hyperparameters. The second is model training itself, which becomes a major bottleneck when learning on massive datasets. In my talk, I first introduce Hyperband, a novel algorithm for hyperparameter optimization that is simple, flexible, theoretically sound, and an order of magnitude faster than leading competitors. I then present work aimed at understanding the underlying landscape of training deep learning models in parallel and distributed environments. I introduce Paleo, an analytical performance model that can quickly and accurately model the expected scalability and performance of putative parallel and distributed deep learning systems.
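For readers who want a feel for how Hyperband allocates training budget, here is a minimal Python sketch of its bracket schedule and successive-halving loop. It is an illustration under stated assumptions, not the implementation used at Determined AI: the function names (`bracket_schedule`, `hyperband`), the toy objective, and the defaults `max_resource=81` and `eta=3` are all chosen here for the example. The idea it shows is that each bracket starts many configurations on a small budget and repeatedly keeps roughly the best 1/eta of them while multiplying the budget by eta.

```python
import random


def bracket_schedule(max_resource=81, eta=3):
    """List, per bracket, the (num_configs, resource_per_config) rounds."""
    # Largest s_max with eta ** s_max <= max_resource.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1
    schedule = []
    for s in range(s_max, -1, -1):
        # Initial number of configs: ceil((s_max + 1) * eta**s / (s + 1)).
        n = ((s_max + 1) * eta ** s + s) // (s + 1)
        rounds = [(n // eta ** i, max_resource / eta ** (s - i))
                  for i in range(s + 1)]
        schedule.append(rounds)
    return schedule


def hyperband(sample_config, evaluate, max_resource=81, eta=3):
    """Return (lowest loss observed, config that achieved it)."""
    best = (float("inf"), None)
    for rounds in bracket_schedule(max_resource, eta):
        # Each bracket starts from a fresh random sample of configurations.
        configs = [sample_config() for _ in range(rounds[0][0])]
        for n_i, r_i in rounds:
            # Evaluate every surviving config with r_i units of resource.
            scored = sorted(((evaluate(c, r_i), c) for c in configs),
                            key=lambda t: t[0])
            best = min(best, scored[0], key=lambda t: t[0])
            # Keep roughly the best 1/eta of the configs for the next round.
            configs = [c for _, c in scored[:max(1, n_i // eta)]]
    return best


# Toy problem (hypothetical): one hyperparameter x in [0, 1]; the loss is
# lowest near x = 0.3 and improves as more resource is spent.
rng = random.Random(0)
loss, x = hyperband(
    sample_config=lambda: rng.uniform(0.0, 1.0),
    evaluate=lambda cfg, resource: (cfg - 0.3) ** 2 + 1.0 / resource,
)
```

In a real setting, `evaluate` would train a model with the given configuration for `resource` units (epochs, for example) and return a validation loss, and `sample_config` would draw from the actual hyperparameter search space.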

Check out the talk to learn more:

Contact us if you’d like to find out what we’re doing at Determined AI to tackle these core bottlenecks, and more generally to make your company’s data science team more productive.
