Software Engineer, Machine Learning

ABOUT THE ROLE

As a Software Engineer focused on Machine Learning, you will have the opportunity (and responsibility!) to define major aspects of our product vision and guide the product roadmap. You’ll work closely with our technical customers to better understand their actual pain points when developing and deploying deep learning workflows; develop prototypes for new product functionality inspired by both customer feedback and cutting-edge deep learning research; and interact with our world-class systems engineers to translate these ML prototypes into production-quality product features.

REQUIREMENTS
  • Strong problem solving and analytical skills
  • Excellent communication skills, both written and verbal
  • Track record of academic research and publication in ML or a related field, OR 3+ years of industry experience delivering ML applications, especially those powered by deep learning
  • Experience using systems for large-scale data management, analytics, cluster scheduling, stream processing, or machine learning
  • Experience writing software in an industrial setting

PREFERRED
  • Experience collaborating with systems engineers to build machine learning systems
  • Ph.D. in Machine Learning or a related field

TEAMS & PROCESS

We are building a team of world-class engineers. Join us! We have one product and one team, where everyone is a worker-leader. We combine input from customers, engineers and company leadership to prioritize our work, and we work hard to make decisions transparent. We believe in tight feedback loops with customers, and in minimum valuable products.

We believe in just enough (but not too much) process; currently we run Scrum with two-week sprints. We use GitHub to manage our work; every PR must pass code review, lint, and tests. We run an extensive continuous integration pipeline to test our GPU features. We use Slack and G Suite, and have provisioned a video conferencing system for our remote workers.

TECHNICAL CHALLENGES

We have implemented, from scratch, a distributed, fault-tolerant GPU cluster manager and scheduler, purpose-built for DL and ML workloads. We have invented, published and implemented state-of-the-art hyperparameter optimization algorithms in our platform. We have numerous other research ideas ready to turn into product features that will differentiate us from our competitors.

TECH STACK

  • Go
  • Python
  • Docker
  • TensorFlow
  • PyTorch
  • Keras
  • Elm
  • Kubernetes
  • Mesos
  • PostgreSQL