New video demo: LLM Batch Inference with Determined

By Corey Staten

November 08, 2023

In this talk from ML-at-Scale 2023, Corey shows how to use Determined’s Core API and Hugging Face Transformers to build and optimize batch inference workflows. He also discusses advanced parallelization techniques and shows how to apply them using Determined’s DeepSpeed integration. Warning: This video is code-heavy!
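To give a flavor of the pattern before you watch, here is a minimal sketch of data-parallel batch inference in plain Python. It is not the code from the talk. The helper names (`shard`, `batched`, `run_batch_inference`, `fake_generate`) are hypothetical, and `fake_generate` is a stand-in for a real model call such as a Hugging Face Transformers generation step. In an actual cluster job, `rank` and `world_size` would come from the training platform's distributed context rather than being passed by hand.

```python
from typing import Callable, Iterator, List, Sequence


def shard(items: Sequence[str], rank: int, world_size: int) -> List[str]:
    """Return the strided slice of `items` owned by worker `rank`."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return list(items[rank::world_size])


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def run_batch_inference(
    prompts: Sequence[str],
    rank: int,
    world_size: int,
    batch_size: int,
    generate: Callable[[List[str]], List[str]],
) -> List[str]:
    """Run `generate` over this worker's shard, one batch at a time."""
    outputs: List[str] = []
    for batch in batched(shard(prompts, rank, world_size), batch_size):
        outputs.extend(generate(batch))
    return outputs


def fake_generate(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for a model call; it only upper-cases prompts."""
    return [prompt.upper() for prompt in batch]
```

Each worker calls `run_batch_inference` with its own rank, so the prompts are split across workers without overlap. Swapping `fake_generate` for a real model call is the only change needed to do actual inference.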

Join the Determined community on Slack: Determined Slack
