Prune the Dataset
Remove images without bounding boxes from the dataset to ensure training only uses labeled data.
After splitting your dataset, there may still be images that contain no annotations. These unlabeled images can negatively affect calibration and training. Use the pruning script to remove them.
When run from the finetune directory, the script defaults to data/images_raw for images and data/annotations for annotations. If your dataset is laid out differently, pass custom paths with the --images-dir and --annotations-dir arguments.
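```bash
# Remove images without boxes from your raw image set
python3 scripts/prune_images_without_boxes.py --delete

# Same, with custom locations (placeholder paths; substitute your own)
python3 scripts/prune_images_without_boxes.py --images-dir path/to/images --annotations-dir path/to/annotations --delete
```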
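To make the pruning step concrete, here is a minimal sketch of the kind of check such a script might perform. It is not the actual contents of prune_images_without_boxes.py. It assumes one plain-text annotation file per image with a matching file stem and one non-blank line per bounding box (YOLO-style labels). The function names, the image suffix list, and the `.txt` extension are all hypothetical, so adapt them to your annotation format.

```python
from pathlib import Path

# Assumption: these are the image types in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images_without_boxes(images_dir, annotations_dir):
    """Return images whose annotation file is missing or contains no boxes."""
    images_dir, annotations_dir = Path(images_dir), Path(annotations_dir)
    unlabeled = []
    for image in sorted(images_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumption: the annotation shares the image's stem, e.g. cat.jpg -> cat.txt.
        label = annotations_dir / (image.stem + ".txt")
        # Assumption: each non-blank line in the label file describes one box.
        has_boxes = label.is_file() and any(
            line.strip() for line in label.read_text().splitlines()
        )
        if not has_boxes:
            unlabeled.append(image)
    return unlabeled


def prune(images_dir, annotations_dir, delete=False):
    """List unlabeled images, and remove them from disk only when delete=True."""
    unlabeled = find_images_without_boxes(images_dir, annotations_dir)
    if delete:
        for image in unlabeled:
            image.unlink()
    return unlabeled
```

Calling `prune("data/images_raw", "data/annotations")` with the default `delete=False` only reports the candidates, which is a cheap way to review what would be removed before deleting anything.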