Problem replicating YoloV3 result (COCO dataset) with training from scratch
I am trying to train YoloV3 on the COCO dataset from scratch. Here is what I did:
- Took the `cfg/yolo.py` file and made the necessary changes for YoloV3
  - Changed the loss function from `RegionLoss` to `MultiScaleRegionLoss`
  - Changed `GetBoundingBoxes` to `GetMultiScaleBoundingBoxes`
- Built the COCO dataset based on the provided `labels.py`
- Applied a test function against the COCO validation data during training
  - Basically, I extended `train.py` by applying a test function modified from the provided `test.py`
  - Then, I provided a second dataloader (the COCO validation dataloader) to the `TrainEngine`, and did something like:

    ```python
    @ln.engine.Engine.batch_end(5000)
    def test_against_valid_data(self):
        test_fn(self)
    ```
- However, since I am not really accustomed to the hyperparameters for Yolo training (I am more comfortable with training object recognition networks), I just used the ones set in `train.py`
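
For clarity, the periodic validation described in the list above boils down to the pattern below. This is only a library-independent sketch: `train_with_periodic_eval`, `train_step`, and `evaluate` are hypothetical names used for illustration, not lightnet API, and the real hook runs inside the engine rather than in a hand-written loop.

```python
from typing import Any, Callable, Iterable, List, Tuple


def train_with_periodic_eval(
    batches: Iterable[Any],
    train_step: Callable[[Any], None],
    evaluate: Callable[[], Any],
    every: int,
) -> List[Tuple[int, Any]]:
    """Run train_step on each batch and call evaluate after every `every` batches.

    All names here are hypothetical; this only mirrors the idea of a
    batch_end(N) hook, it does not use lightnet.
    """
    results = []
    for batch_idx, batch in enumerate(batches, start=1):
        train_step(batch)
        # Evaluate on the validation data once every `every` batches.
        if batch_idx % every == 0:
            results.append((batch_idx, evaluate()))
    return results
```

In my setup `every` would be 5000 and `evaluate` would correspond to the modified test function.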
However, after a long training run, the test against the COCO validation dataset is stuck at `NaN`. Since I thought there might be bugs in my code, I trained again from the start, but with the existing weights (`yolov3-coco.pt`) loaded into the model. This gives a non-`NaN` test result (which should mean there is no bug?).
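
To narrow down when the values first go bad, a check along these lines could be applied to the logged losses or mAP values. It is a minimal sketch in plain Python: `first_non_finite` is a hypothetical helper and the sample numbers are made up, not taken from my run.

```python
import math
from typing import Iterable, Optional


def first_non_finite(values: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or infinite value, or None if all are finite.

    Hypothetical helper for inspecting a logged metric history.
    """
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index
    return None


# Made-up loss history: the third entry is where it diverges.
example_losses = [12.4, 9.8, float("nan"), float("nan")]
diverged_at = first_non_finite(example_losses)
```

If the training loss itself becomes non-finite before the test metric does, that would point at the training setup (for example the learning rate) rather than at the test function.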
So, what should I do to reproduce training YoloV3 on COCO from scratch?
I also included the training graph screenshot here.
- The blue one is the result of training from scratch (I replaced `NaN` in `test_mAP` with 0)
- The red one is the result of training from `yolov3-coco.pt`
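
The `NaN` replacement for the blue curve amounts to something like the following. This is a sketch only: `nan_to_zero` is a hypothetical name and the values are illustrative.

```python
import math
from typing import List, Sequence


def nan_to_zero(values: Sequence[float]) -> List[float]:
    """Return a copy of values with every NaN replaced by 0.0 (for plotting only)."""
    return [0.0 if math.isnan(value) else value for value in values]


# Illustrative mAP history, not real results.
plotted = nan_to_zero([0.10, float("nan"), 0.30])
```

Note that this hides the `NaN` in the plot, so a flat line at 0 for the blue curve means the test returned `NaN`, not a real mAP of 0.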
Once again, thank you for the awesome library (including brambox)!