Hi @fedorov
I stumbled across this while playing around with testing this PR.
There are a few places in the idc-index public API that allow you to choose between AWS and GCP buckets as the source location. The problem is that half the time, users should pass "gcp" if they want GCP buckets, and half the time they should pass "gcs". Specifically:
- `download_from_selection`, `download_dicom_instance`, `download_dicom_patients`, and `download_collection` expect "gcs" for Google and DO enforce it, so if you pass "gcp", you get an error
- `get_series_file_URLs` and `get_instance_file_URL` currently expect "gcp" for Google but do NOT enforce it. I.e. if you pass "gcs" expecting the Google URLs, you get back the "aws" URLs without any error or warning (as is the case with any string other than "gcp")
I suggest:
- Standardizing on "gcs", since the functions that use it are probably used more often
- Adding checks to `get_series_file_URLs` and `get_instance_file_URL` to validate the value passed to `source_bucket_location`
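
To make the second suggestion concrete, here is a rough sketch of the kind of check I have in mind. The helper name `validate_source_bucket_location` and the constant `VALID_SOURCE_BUCKET_LOCATIONS` are hypothetical (they are not part of idc-index today), and it assumes "aws" and "gcs" end up as the only accepted values:

```python
# Hypothetical helper, not existing idc-index code.
# Assumes "aws" and "gcs" are the only supported values after standardizing.
VALID_SOURCE_BUCKET_LOCATIONS = ("aws", "gcs")


def validate_source_bucket_location(source_bucket_location: str) -> str:
    """Return the value unchanged if it is supported, otherwise raise ValueError."""
    if source_bucket_location not in VALID_SOURCE_BUCKET_LOCATIONS:
        raise ValueError(
            f"source_bucket_location must be one of {VALID_SOURCE_BUCKET_LOCATIONS}, "
            f"got {source_bucket_location!r}"
        )
    return source_bucket_location
```

Calling something like this at the top of both functions would turn the current silent fallback to the "aws" URLs into an immediate error.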