Closed
Description
**Describe the bug**
We benchmarked VISTA-3D for accuracy (Dice score) in two runs: one locally to create the baselines, and one in our CI pipeline to create the benchmarks. However, we cannot reproduce the metrics; the baseline and benchmark values differ:
| Test Case | Class | Baseline Dice | Benchmark Dice |
|---|---|---|---|
| case1 | spleen | 0.9652 | 0.964 |
| case2 | aorta | 0.9682 | 0.969 |
| case3 | liver | 0.9177 | 0.92 |
| | hepatic tumor | 0.684 | 0.703 |
| case4 | lung tumor | 0.8779 | 0.863 |
| case5 | colon cancer primaries | 0.8421 | 0.801 |
| case6 | stomach | 0.936 | 0.932 |
| | inferior vena cava | 0.9087 | 0.901 |
| | pancreas | 0.781 | 0.703 |
| | vertebrae L1 | 0.9793 | 0.979 |
| | vertebrae T8 | 0.9769 | 0.979 |
| | brain | 0.8203 | 0.849 |
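
To make the size of the gap easier to see, the table can be reduced to per-class differences. The following is a minimal sketch: the scores are copied from the table above, while the helper names and the 0.01 tolerance are assumptions chosen for illustration, not part of our pipeline.

```python
# Dice scores copied from the table above: (case, class) -> (baseline, benchmark).
SCORES = {
    ("case1", "spleen"): (0.9652, 0.964),
    ("case2", "aorta"): (0.9682, 0.969),
    ("case3", "liver"): (0.9177, 0.92),
    ("case3", "hepatic tumor"): (0.684, 0.703),
    ("case4", "lung tumor"): (0.8779, 0.863),
    ("case5", "colon cancer primaries"): (0.8421, 0.801),
    ("case6", "stomach"): (0.936, 0.932),
    ("case6", "inferior vena cava"): (0.9087, 0.901),
    ("case6", "pancreas"): (0.781, 0.703),
    ("case6", "vertebrae L1"): (0.9793, 0.979),
    ("case6", "vertebrae T8"): (0.9769, 0.979),
    ("case6", "brain"): (0.8203, 0.849),
}


def dice_deltas(scores):
    """Return {(case, class): benchmark - baseline}, rounded to 4 decimals."""
    return {key: round(bench - base, 4) for key, (base, bench) in scores.items()}


def exceeding(scores, tolerance):
    """Return keys whose absolute difference exceeds `tolerance`, largest first."""
    deltas = dice_deltas(scores)
    flagged = [key for key, delta in deltas.items() if abs(delta) > tolerance]
    return sorted(flagged, key=lambda key: abs(deltas[key]), reverse=True)


if __name__ == "__main__":
    deltas = dice_deltas(SCORES)
    # The tolerance of 0.01 is a hypothetical threshold, not a project setting.
    for case, name in exceeding(SCORES, tolerance=0.01):
        print(f"{case:6s} {name:24s} {deltas[(case, name)]:+.4f}")
```

With this threshold, five of the twelve classes are flagged, and the largest gaps are on the small or tumor-like structures (pancreas, colon cancer primaries, brain, hepatic tumor, lung tumor).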
Here are the test cases that we used:
test_case:
  case1:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/spleen_2_image.nii.gz
    label: spleen_2_label.nii.gz
    gt:
      spleen: 1
    prompts:
      classes:
        - spleen
  case2:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/s0996.nii
    label: s0996_seg.nii
    gt:
      aorta: 7
    prompts:
      classes:
        - aorta
  case3:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/liver_129.nii.gz
    label: liver_129_seg.nii.gz
    gt:
      liver: 1
      hepatic tumor: 2
    prompts:
      classes:
        - liver
        - hepatic tumor
  case4:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/lung_034.nii.gz
    label: lung_034_seg.nii.gz
    gt:
      lung tumor: 1
    prompts:
      classes:
        - lung tumor
  case5:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/colon_203.nii
    label: colon_203_seg.nii
    gt:
      colon cancer primaries: 1
    prompts:
      classes:
        - colon cancer primaries
  case6:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/s0459.nii
    label: s0459_seg.nii
    gt:
      stomach: 6
      inferior vena cava: 8
      pancreas: 10
      vertebrae L1: 22
      vertebrae T8: 27
      brain: 44
    prompts:
      classes:
        - stomach
        - inferior vena cava
        - pancreas
        - vertebrae L1
        - vertebrae T8
        - brain
  case7:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/s0675.nii
    label: s0675_seg.nii
    gt:
      left rib 8: 59
      right rib 3: 66
      right rib 12: 75
      right iliopsoas: 96
      heart: 105
    prompts:
      classes:
        - left rib 8
        - right rib 3
        - right rib 12
        - right iliopsoas
        - heart

And here are the test cases for speed:
test_case:
  case1:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/256cubic.nii.gz
    prompts:
      classes: []
    size: [256, 256, 256]
  case2:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/256cubic.nii.gz
    prompts:
      classes:
        - spleen
    size: [256, 256, 256]
  case3:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512cubic.nii.gz
    prompts:
      classes: []
    size: [512, 512, 512]
  case4:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512cubic.nii.gz
    prompts:
      classes:
        - liver
    size: [512, 512, 512]
  case5:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512-768.nii.gz
    prompts:
      classes: []
    size: [512, 512, 768]
  case6:
    image: https://vista3d-nim-test-images.s3.amazonaws.com/test-vista3d-nim-image-url/512-768.nii.gz
    prompts:
      classes:
        - heart
    size: [512, 512, 768]
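
For reference, the `gt` mappings in the accuracy test cases give, for each class name, the integer value that class has in the label file. A per-class Dice score over such a mapping could be computed roughly as below. This is only a sketch: it assumes the prediction and the label have already been loaded as integer NumPy arrays of the same shape, and `pred_index` (class name to value in the prediction) is a hypothetical argument, since the prediction may use different indices than the label file.

```python
import numpy as np


def dice_score(pred_mask, gt_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two boolean masks of equal shape."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    gt_mask = np.asarray(gt_mask, dtype=bool)
    denom = pred_mask.sum() + gt_mask.sum()
    if denom == 0:
        # The class is absent from both volumes, so Dice is undefined.
        return float("nan")
    return 2.0 * np.logical_and(pred_mask, gt_mask).sum() / denom


def per_class_dice(pred, label, gt, pred_index):
    """Compute one Dice score per class.

    pred:       integer volume predicted by the model
    label:      integer ground-truth volume
    gt:         class name -> value in `label` (the `gt:` mapping of a test case)
    pred_index: class name -> value in `pred` (hypothetical, see lead-in)
    """
    return {
        name: dice_score(pred == pred_index[name], label == value)
        for name, value in gt.items()
    }


if __name__ == "__main__":
    # Tiny 2D stand-in for a volume, using the case3 mapping from above.
    label = np.array([[0, 1, 1], [2, 2, 0]])
    pred = np.array([[0, 1, 0], [2, 2, 2]])
    gt = {"liver": 1, "hepatic tumor": 2}
    print(per_class_dice(pred, label, gt, pred_index=gt))
```

Returning NaN for a class that is empty in both volumes is a design choice in this sketch; other conventions (for example, returning 1.0) give different averages.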
Environment
The baseline and benchmark are run on different machines but in the same container.
================================
Printing MONAI config...
================================
MONAI version: 1.4.0rc9
Numpy version: 1.24.4
Pytorch version: 2.5.0a0+872d972e41.nv24.08.01
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: fa1c1af79ef5387434f2a76744f75b5aaca09f0b
MONAI __file__: /usr/local/lib/python3.10/dist-packages/monai/__init__.py
Optional dependencies:
Pytorch Ignite version: 0.4.11
ITK version: 5.4.0
Nibabel version: 5.2.1
scikit-image version: 0.23.2
scipy version: 1.13.1
Pillow version: 10.4.0
Tensorboard version: 2.17.0
gdown version: 5.2.0
TorchVision version: 0.20.0a0
tqdm version: 4.66.4
lmdb version: 1.5.1
psutil version: 5.9.8
pandas version: 2.2.2
einops version: 0.7.0
transformers version: 4.40.2
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: 1.0.0
clearml version: 1.16.3
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.5 LTS
Platform: Linux-6.8.0-41-generic-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.10.12
Process name: pt_main_thread
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 8
Num logical CPUs: 16
Num usable CPUs: 16
CPU usage (%): [4.2, 2.8, 2.5, 2.1, 2.5, 2.6, 2.1, 17.5, 2.6, 2.6, 2.1, 2.0, 2.1, 1.8, 78.3, 7.2]
CPU freq. (MHz): 1839
Load avg. in last 1, 5, 15 mins (%): [3.5, 1.8, 0.6]
Disk usage (%): 38.2
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 125.7
Available memory (GB): 121.3
Used memory (GB): 2.9
================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.6
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: 1
cuDNN version: 90400
Current device: 0
Library compiled for CUDA architectures: ['sm_70', 'sm_72', 'sm_75', 'sm_80', 'sm_86', 'sm_87', 'sm_90', 'compute_90']
GPU 0 Name: NVIDIA A40
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 84
GPU 0 Total memory (GB): 44.4
GPU 0 CUDA capability (maj.min): 8.6
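
The GPU config above reports `TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: 1`, and the two runs use different machines. One thing that may be worth ruling out is reduced-precision (TF32) math and non-deterministic cuBLAS kernels. The sketch below builds an environment for a rerun with those switched off. Whether this explains the gap is unknown; the helper name and the script name in the usage comment are hypothetical.

```python
import os


def reproducibility_env(base=None):
    """Return a copy of the environment with precision/determinism knobs pinned.

    `base` defaults to the current process environment; it is not modified.
    """
    env = dict(os.environ if base is None else base)
    # Disables TF32 in NVIDIA libraries (cuBLAS, cuDNN).
    env["NVIDIA_TF32_OVERRIDE"] = "0"
    # The container sets this to 1 (see the config above); 0 is assumed to
    # leave PyTorch's own matmul precision setting in effect.
    env["TORCH_ALLOW_TF32_CUBLAS_OVERRIDE"] = "0"
    # Workspace setting that PyTorch's reproducibility notes list for
    # deterministic cuBLAS behavior.
    env["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    # Fixes Python's hash seed so hash-dependent iteration order is stable.
    env["PYTHONHASHSEED"] = "0"
    return env


if __name__ == "__main__":
    # Hypothetical usage; `run_benchmark.py` is a placeholder script name:
    #   subprocess.run(["python", "run_benchmark.py"],
    #                  env=reproducibility_env(), check=True)
    for key in ("NVIDIA_TF32_OVERRIDE", "TORCH_ALLOW_TF32_CUBLAS_OVERRIDE",
                "CUBLAS_WORKSPACE_CONFIG", "PYTHONHASHSEED"):
        print(key, "=", reproducibility_env()[key])
```

Running both the baseline and the benchmark with the same pinned environment would at least show whether the remaining difference comes from the hardware or from these settings.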