Current Situation
coco_detection_dataset() currently carries object detection and object segmentation at the same time. This comes with:
- the need to keep a huge 500 MB+ `$annotation` object, hosting all annotations for both tasks, in memory for the whole dataset lifetime (i.e. forever)
- complex documentation for the dataset that makes it confusing to run one task or the other, knowing that the two tasks will never run together (as they target different model architectures)
- the downloaded archive and `ann_zip` being stored at the root of `rappdirs::user_cache_dir("torch")` without a `prefix =`, with file names that do not point to the COCO dataset, so end users have to deal with a hardly identifiable 6 GB `val2014.zip` on disk when they hit the "disk full" exception...
Suggestion
As the two tasks have no common DNN implementation, we could easily (through instantiation) split `coco_detection_dataset()` into
- `coco_detection_dataset()`
- `coco_segmentation_dataset()`
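A minimal sketch of what the split could look like from the user side; the argument names (`root`, `train`, `download`) mirror other torchvision dataset constructors and are assumptions, not a final signature:

```r
library(torchvision)

# Detection-only dataset: `$annotation` only needs bounding boxes and labels.
det_ds <- coco_detection_dataset(
  root = "~/data/coco",  # illustrative location
  train = TRUE,
  download = TRUE
)

# Segmentation-only dataset: `$annotation` only needs masks / polygons.
seg_ds <- coco_segmentation_dataset(
  root = "~/data/coco",
  train = TRUE,
  download = TRUE
)
```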
On the website (see `pkgdown.yml`) we should also separate classification datasets from the non-classification datasets.
That would allow much more straightforward examples and a lower memory footprint for `$annotation`. It would also ease #170
- the archive and `ann_zip` should be moved into, or prefixed by, a `/coco` folder for better identification in the cache folder (see the sketch below)
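A minimal sketch of that layout, assuming a small internal helper (names are illustrative, not an existing function):

```r
# All COCO downloads land under <torch cache>/coco, so a 6 GB val2014.zip
# is immediately identifiable when the disk fills up.
coco_cache_dir <- function() {
  dir <- file.path(rappdirs::user_cache_dir("torch"), "coco")
  if (!dir.exists(dir)) dir.create(dir, recursive = TRUE)
  dir
}

archive <- file.path(coco_cache_dir(), "val2014.zip")
ann_zip <- file.path(coco_cache_dir(), "annotations_trainval2014.zip")  # upstream archive name
```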