December 29, 2020
Object detection is a common computer vision problem, the goal of which is to detect instances of a given class in an image. FasterRCNN is a popular deep learning network architecture for performing object detection. FasterRCNN does detection with a two-step approach: it first identifies bounding boxes that may contain objects and then classifies each bounding box. Today, we’ll walk through how to train FasterRCNN to perform object detection using Determined and PyTorch.
For this example, we’ll be training FasterRCNN on the Penn-Fudan Database for Pedestrian Detection and Segmentation. Thanks to the PyTorch FasterRCNN tutorial, it’s easy to get started. We will adapt code from this tutorial to run on Determined so that we can easily scale up training, run a hyperparameter search, and achieve a better final validation IOU.
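Since validation IOU (intersection over union) is the metric we optimize throughout this post, here is a minimal sketch of how it is computed for a single pair of boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; the example's own metric code may aggregate over boxes and images differently.

```python
def box_iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Degenerate (zero-area) boxes have no meaningful overlap.
    return intersection / union if union > 0 else 0.0
```

A predicted box that matches the ground truth exactly scores 1.0, and a box that misses it entirely scores 0.0.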
We’ve already organized the tutorial code to fit Determined’s PyTorch Trial interface. By organizing the model this way, we can use Determined to track our experiments, scale to distributed training, and do hyperparameter tuning. To get started, you’ll need to install Determined and configure the Determined CLI. The code for this example can be found here.
To run this example, first install Determined either locally or on the cloud. Since we will be running a hyperparameter search consisting of many training runs, we recommend running on the cloud.
Once you have Determined installed, you can train FasterRCNN and track the progress of training with:
det experiment create const.yaml .
The configuration of this experiment is defined in const.yaml:
description: fasterrcnn_coco_pytorch_const
data:
  url: https://determined-ai-public-datasets.s3-us-west-2.amazonaws.com/PennFudanPed/PennFudanPed.zip
hyperparameters:
  learning_rate: 0.005
  momentum: 0.9
  weight_decay: 0.0005
  global_batch_size: 2
searcher:
  name: single
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 800
entrypoint: model_def:ObjectDetectionTrial
For full documentation about how to configure experiments, check out the Determined experiment configuration documentation. Today, we will modify this configuration to run a hyperparameter search.
In our new configuration, called adaptive.yaml, we will add sweeps of the learning_rate and momentum hyperparameters:
hyperparameters:
  learning_rate:
    type: double
    minval: 0.0001
    maxval: 0.001
  momentum:
    type: double
    minval: 0.2
    maxval: 1.0
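To make the ranges concrete, the sketch below draws candidate configurations from the same intervals. It assumes each value is drawn uniformly between minval and maxval. Determined does the sampling itself when the experiment runs, so the names `SEARCH_SPACE` and `sample_hyperparameters` are hypothetical and only illustrate what the search space looks like.

```python
import random

# Ranges copied from the hyperparameters block above.
SEARCH_SPACE = {
    "learning_rate": (0.0001, 0.001),
    "momentum": (0.2, 1.0),
}


def sample_hyperparameters(space, rng):
    """Draw one candidate configuration, assuming uniform sampling per range."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


# A fixed seed keeps this illustration reproducible.
rng = random.Random(0)
candidates = [sample_hyperparameters(SEARCH_SPACE, rng) for _ in range(30)]
```

Each entry in `candidates` corresponds to the hyperparameters one trial of the search would train with.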
We will then configure the searcher with the search algorithm name, the optimization metric, and the size of the hyperparameter search. We will use the state-of-the-art ASHA algorithm. We’ll start with a small experiment, running 30 trials that each train for at most 8 batches.
searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 30
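For intuition about what an adaptive search does with this budget, here is a simplified sketch of synchronous successive halving, the idea that ASHA builds on. It is not Determined's implementation: ASHA promotes trials asynchronously rather than waiting for a whole round to finish, and the `successive_halving` function, its `divisor` argument, and the `toy_score` stand-in for training are all hypothetical.

```python
def successive_halving(configs, evaluate, min_batches, max_batches, divisor):
    """Train every config briefly, keep the best 1/divisor, repeat with more batches.

    `evaluate(config, batches)` must return a score where larger is better.
    Returns the (score, config) pair of the best surviving config.
    """
    survivors = list(configs)
    budget = min_batches
    while True:
        scored = sorted(
            ((evaluate(config, budget), config) for config in survivors),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Stop once the full budget has been spent or only one config remains.
        if budget >= max_batches or len(scored) == 1:
            return scored[0]
        keep = max(1, len(scored) // divisor)
        survivors = [config for _, config in scored[:keep]]
        budget = min(max_batches, budget * divisor)


def toy_score(config, batches):
    # Hypothetical stand-in for "train for `batches` batches, return val_avg_iou".
    return batches / (batches + 1.0) - abs(config["learning_rate"] - 0.0005)


learning_rates = (0.0001, 0.0003, 0.0005, 0.0007, 0.0009, 0.001, 0.0002, 0.0008)
configs = [{"learning_rate": lr} for lr in learning_rates]
best_score, best_config = successive_halving(
    configs, toy_score, min_batches=2, max_batches=8, divisor=2
)
```

Most of the compute goes to the configurations that look promising early, which is why a search over many trials can stay cheap.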
The final configuration looks like:
description: fasterrcnn_coco_pytorch_adaptive_search
data:
  url: https://determined-ai-public-datasets.s3-us-west-2.amazonaws.com/PennFudanPed/PennFudanPed.zip
hyperparameters:
  learning_rate:
    type: double
    minval: 0.0001
    maxval: 0.001
  momentum:
    type: double
    minval: 0.2
    maxval: 1.0
  weight_decay: 0.0005
  global_batch_size: 2
searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 30
entrypoint: model_def:ObjectDetectionTrial
This can be run from the command line:
det experiment create adaptive.yaml .
When training has completed, your model should obtain a validation IOU score of ~52.
Next, to further improve the IOU score, we’ll increase the size of the hyperparameter search to 300 trials:
searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 300
Determined automatically parallelizes our hyperparameter search across multiple machines. Our Determined cluster is configured to spin up as many as 40 agents during training, so even though we’re running 300 trials, training only takes minutes.
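To put the size of these searches in perspective, the arithmetic below compares an upper bound on the batches each search can train with the 800 batches of the single run in const.yaml. It assumes no trial trains past max_length, so max_trials times max_length is a ceiling rather than the actual cost, since an adaptive search stops many trials early. The helper name `max_total_batches` is hypothetical.

```python
def max_total_batches(max_trials, max_batches_per_trial):
    """Upper bound on batches trained across a search (assumes no trial exceeds max_length)."""
    return max_trials * max_batches_per_trial


single_run = 800                             # max_length in const.yaml
small_search = max_total_batches(30, 8)      # 240 batches at most
large_search = max_total_batches(300, 8)     # 2400 batches at most

# Ratio of the large search's ceiling to the single fixed-hyperparameter run.
ratio = large_search / single_run            # 3.0
```

Even the 300-trial search is bounded by about three single runs' worth of batches, and that work is spread across many agents.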
When training is complete, your model should obtain a validation IOU score of ~67.
We encourage you to give Determined a spin by trying this example or any others available in the Determined repository. If you have any questions along the way, hop on our community Slack or reach out on GitHub. We’d love to help!