March '20 Newsletter

AI-Native Software Infrastructure

In our latest blog post, we discuss some of the theoretical and practical considerations that deep learning engineers run into as they attempt to scale training beyond a single machine. There's a lot to get right before functional distributed training even gets off the ground. Once it does, there is a rich space of optimizations to navigate, whose efficacy can depend on everything from your model architecture to your network topology.

By the way, here’s how easy it is to enable optimized distributed training in Determined AI:

Optimized distributed training with Determined
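As a sketch of what that looks like: in Determined, distributed training is enabled declaratively through the experiment configuration rather than by rewriting training code. The fragment below is illustrative, not taken from the post; the experiment name and all values are hypothetical, and field meanings follow Determined's experiment-config conventions.

```yaml
# Hypothetical experiment config sketch for Determined.
# Distributed training is requested via resources.slots_per_trial;
# Determined schedules the workers and wires up gradient communication.
name: resnet50-distributed        # hypothetical experiment name
resources:
  slots_per_trial: 16             # train a single trial across 16 GPUs
optimizations:
  aggregation_frequency: 4        # communicate gradients every 4 batches
hyperparameters:
  global_batch_size: 512          # split across all 16 slots
```

Changing `slots_per_trial` is the only switch needed to move between single-GPU and multi-GPU training; knobs like `aggregation_frequency` then let you trade communication overhead against gradient staleness.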