This issue will be used to track compilation failures for migraphx models on CPU and GPU. Compile failures for each model should have a link to an issue with a smaller reproducer in the notes column.
Notes:
- `migraphx_ORT__bert_base_cased_1` fails on CPU but passes on GPU, while other adjacent models fail for similar reasons on both. Very odd.
- Not including the tests `migraphx_sdxl__unet__model` and `migraphx_ORT__bert_large_uncased_1` because they cause a crash (likely OOM).
- Not including any of the tf models yet.
## CPU Status Table
The following report was generated with IREE compiler version iree-org/iree@caacf6c and torch-mlir version llvm/torch-mlir@2665ed3.
### Passing Summary
TOTAL TESTS = 30
Stage | # Passing | % of Total | % of Attempted |
---|---|---|---|
Setup | 30 | 100.0% | 100.0% |
IREE Compilation | 24 | 80.0% | 80.0% |
Gold Inference | 22 | 73.3% | 91.7% |
IREE Inference Invocation | 19 | 63.3% | 86.4% |
Inference Comparison (PASS) | 15 | 50.0% | 78.9% |
### Fail Summary
TOTAL TESTS = 30
Stage | # Failed at Stage | % of Total |
---|---|---|
Setup | 0 | 0.0% |
IREE Compilation | 6 | 20.0% |
Gold Inference | 2 | 6.7% |
IREE Inference Invocation | 3 | 10.0% |
Inference Comparison | 4 | 13.3% |
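The two summary tables above are consistent with each other: each stage's "% of Attempted" is computed against the count that survived the previous stage, and each fail count is the stage-to-stage difference. A minimal sketch of that arithmetic (not the report generator's actual code):

```python
# Number of tests still passing at the end of each stage, in pipeline order
# (taken from the Passing Summary table above).
TOTAL = 30
passing = {
    "Setup": 30,
    "IREE Compilation": 24,
    "Gold Inference": 22,
    "IREE Inference Invocation": 19,
    "Inference Comparison (PASS)": 15,
}

prev = TOTAL
for stage, count in passing.items():
    pct_total = 100.0 * count / TOTAL     # "% of Total"
    pct_attempted = 100.0 * count / prev  # "% of Attempted"
    failed_here = prev - count            # "# Failed at Stage"
    print(f"{stage}: {pct_total:.1f}% / {pct_attempted:.1f}%, {failed_here} failed")
    prev = count
```

For example, IREE Inference Invocation shows 19/22 = 86.4% of attempted, and 22 − 19 = 3 failures at that stage.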
### Test Run Detail
The tests were run with the following arguments:
`Namespace(device='local-task', backend='llvm-cpu', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=True, stages=None, skip_stages=None, benchmark=False, load_inputs=False, groups='all', test_filter='migraphx', testsfile=None, tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='mi_10_10.md')`
Test | Exit Status | Mean Benchmark Time (ms) | Notes |
---|---|---|---|
migraphx_agentmodel__AgentModel | compilation | None | iree-18268 iree-18412 torch-mlir-3651 |
migraphx_bert__bert-large-uncased | preprocessing | None | |
migraphx_bert__bertsquad-12 | Numerics | None | |
migraphx_cadene__dpn92i1 | PASS | None | |
migraphx_cadene__inceptionv4i16 | PASS | None | |
migraphx_cadene__resnext101_64x4di1 | PASS | None | |
migraphx_cadene__resnext101_64x4di16 | PASS | None | |
migraphx_huggingface-transformers__bert_mrpc8 | native_inference | None | |
migraphx_mlperf__bert_large_mlperf | Numerics | None | |
migraphx_mlperf__resnet50_v1 | PASS | None | |
migraphx_models__whisper-tiny-decoder | compiled_inference | None | |
migraphx_models__whisper-tiny-encoder | native_inference | None | |
migraphx_onnx-misc__taau_low_res_downsample_d2s_for_infer_time_fp16_opset11 | import_model | None | |
migraphx_onnx-model-zoo__gpt2-10 | preprocessing | None | |
migraphx_ORT__bert_base_cased_1 | PASS | None | |
migraphx_ORT__bert_base_uncased_1 | PASS | None | |
migraphx_ORT__bert_large_uncased_1 | PASS | None | |
migraphx_ORT__distilgpt2_1 | compiled_inference | None | |
migraphx_ORT__onnx_models__bert_base_cased_1_fp16_gpu | Numerics | None | |
migraphx_ORT__onnx_models__bert_large_uncased_1_fp16_gpu | Numerics | None | |
migraphx_ORT__onnx_models__distilgpt2_1_fp16_gpu | compiled_inference | None | |
migraphx_pytorch-examples__wlang_gru | PASS | None | |
migraphx_pytorch-examples__wlang_lstm | PASS | None | |
migraphx_sd__unet__model | import_model | None | |
migraphx_sdxl__unet__model | import_model | None | |
migraphx_torchvision__densenet121i32 | PASS | None | |
migraphx_torchvision__inceptioni1 | PASS | None | |
migraphx_torchvision__inceptioni32 | PASS | None | |
migraphx_torchvision__resnet50i1 | PASS | None | |
migraphx_torchvision__resnet50i64 | PASS | None |
## OLD STATUS (will update and migrate issues to the current table)
Test | Exit Status | Notes |
---|---|---|
migraphx_agentmodel__AgentModel | compilation | |
migraphx_bert__bert-large-uncased | compilation | iree-18269 Two IRs reported under this issue, exhibiting different behavior |
migraphx_bert__bertsquad-12 | compilation | iree-18267 torch-mlir-3647 |
migraphx_cadene__dpn92i1 | PASS | |
migraphx_cadene__inceptionv4i16 | PASS | |
migraphx_cadene__resnext101_64x4di1 | PASS | |
migraphx_cadene__resnext101_64x4di16 | PASS | |
migraphx_huggingface-transformers__bert_mrpc8 | compilation | iree-18413 |
migraphx_mlperf__bert_large_mlperf | compilation | iree-18297 |
migraphx_mlperf__resnet50_v1 | PASS | |
migraphx_models__whisper-tiny-decoder | compilation | torch-mlir-3647 |
migraphx_models__whisper-tiny-encoder | compilation | torch-mlir-3647 |
migraphx_onnx-misc__taau_low_res_downsample_d2s_for_infer_time_fp16_opset11 | construct_inputs | ORT issue with resize with f16 inputs? |
migraphx_onnx-model-zoo__gpt2-10 | compilation | shark-turbine-465 torch-mlir-615 torch-mlir-3293 |
migraphx_ORT__bert_base_cased_1 | Numerics | Passed when `--iree-input-demote-i64-to-i32` is not present; iree-18273 |
migraphx_ORT__bert_base_uncased_1 | Numerics | Passed when `--iree-input-demote-i64-to-i32` is not present |
migraphx_ORT__bert_large_uncased_1 | compilation | crashes: "MatMul" fails to legalize to stream.cmd.dispatch; iree-org/iree#18229 llvm/torch-mlir#3647 ?? |
migraphx_ORT__distilgpt2_1 | Numerics | |
migraphx_ORT__onnx_models__bert_base_cased_1_fp16_gpu | Numerics | |
migraphx_ORT__onnx_models__bert_large_uncased_1_fp16_gpu | Numerics | |
migraphx_ORT__onnx_models__distilgpt2_1_fp16_gpu | Numerics | |
migraphx_pytorch-examples__wlang_gru | Numerics | iree-18441 |
migraphx_pytorch-examples__wlang_lstm | Numerics | iree-18441 |
migraphx_sd__unet__model | import_model | Killed during MLIR import. Too big?? |
migraphx_sdxl__unet__model | import_model | Killed during MLIR import. Too big?? |
migraphx_torchvision__densenet121i32 | PASS | |
migraphx_torchvision__inceptioni1 | PASS | |
migraphx_torchvision__inceptioni32 | PASS | |
migraphx_torchvision__resnet50i1 | PASS | |
migraphx_torchvision__resnet50i64 | PASS |
## GPU Status Table
Last generated with pip-installed IREE tools at version:
- `iree-compiler 20240903.1005`
- `iree-runtime 20240903.1005`
### Summary
Stage | Count |
---|---|
Total | 21 (non-crashing, see table below) |
PASS | 12 |
Numerics | 2 |
results-summary | 0 |
postprocessing | 0 |
compiled_inference | up to 5 (crash during this stage; not included in total) |
compilation | 4 |
preprocessing | 0 |
import_model | 1 |
native_inference | 2 |
construct_inputs | 0 |
setup | 0 |
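A quick sanity check on the GPU summary, assuming the Total row counts only the tests that did not hard-crash the runner (so the "up to 5" compiled_inference crashes are excluded):

```python
# Non-zero stage counts from the GPU Summary table above; the crashing
# compiled_inference tests ("up to 5") are deliberately left out, matching
# the "non-crashing" Total of 21.
gpu_counts = {
    "PASS": 12,
    "Numerics": 2,
    "compilation": 4,
    "import_model": 1,
    "native_inference": 2,
}
non_crashing_total = sum(gpu_counts.values())
print(non_crashing_total)  # 21
```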
### Test Run Detail
The tests were run with the following arguments:
`Namespace(device='hip://1', backend='rocm', iree_compile_args=['iree-hip-target=gfx942'], mode='onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, load_inputs=False, groups='all', test_filter='migraphx', tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, report=True, report_file='9_3_migraphx.md')`
Test | Exit Status | Notes |
---|---|---|
migraphx_agentmodel__AgentModel | compilation | related : llvm/torch-mlir#3630 |
migraphx_bert__bert-large-uncased | compilation | operand return type issue (see CPU table) |
migraphx_bert__bertsquad-12 | compilation (without shape inference) / compiled_inference | 1. Without the shape-inference torch-mlir passes in the torch-to-iree pipeline, the result is an all-dynamic squeeze-dim op. 2. If torch-lower-to-backend-contract is used to get the shape information, the test crashes during inference with an OOB memory access |
migraphx_cadene__dpn92i1 | PASS | |
migraphx_cadene__inceptionv4i16 | PASS | |
migraphx_cadene__resnext101_64x4di1 | PASS | |
migraphx_cadene__resnext101_64x4di16 | PASS | |
migraphx_huggingface-transformers__bert_mrpc8 | native_inference | |
migraphx_mlperf__bert_large_mlperf | native_inference | |
migraphx_mlperf__resnet50_v1 | PASS | |
migraphx_onnx-misc__taau_low_res_downsample_d2s_for_infer_time_fp16_opset11 | import_model | |
migraphx_onnx-model-zoo__gpt2-10 | compilation | nod-ai/SHARK-ModelDev#465 llvm/torch-mlir#615 llvm/torch-mlir#3293 |
migraphx_ORT__bert_base_cased_1 | PASS | |
migraphx_ORT__bert_base_uncased_1 | PASS | |
migraphx_ORT__distilgpt2_1 | likely compiled_inference | crashes with "Memory access fault by GPU node-3 (Agent handle: 0x5595fe450840) on address 0x7f1811a56000. Reason: Unknown." |
migraphx_ORT__onnx_models__bert_base_cased_1_fp16_gpu | compiled_inference | causes a hard crash for trying to access memory out of bounds (Mi300x) |
migraphx_ORT__onnx_models__bert_large_uncased_1_fp16_gpu | compiled_inference | same crash as above |
migraphx_ORT__onnx_models__distilgpt2_1_fp16_gpu | likely compiled_inference | crashes with "Memory access fault by GPU node-3 (Agent handle: 0x5595fe450840) on address 0x7f1811a56000. Reason: Unknown." |
migraphx_pytorch-examples__wlang_gru | Numerics | |
migraphx_pytorch-examples__wlang_lstm | Numerics | |
migraphx_torchvision__densenet121i32 | PASS | |
migraphx_torchvision__inceptioni1 | PASS | |
migraphx_torchvision__inceptioni32 | PASS | |
migraphx_torchvision__resnet50i1 | PASS | |
migraphx_torchvision__resnet50i64 | PASS |
Note: the GPU table is missing the sd model tests (they run out of memory and kill the test run). This is probably happening during native inference, so it may need some investigation.
## Performance data with iree-benchmark-module on GPU
### Summary
Stage | Count |
---|---|
Total | 30 |
PASS | 13 |
Numerics | 3 |
results-summary | 0 |
postprocessing | 0 |
benchmark | 0 |
compiled_inference | 2 |
native_inference | 1 |
construct_inputs | 0 |
compilation | 8 |
preprocessing | 0 |
import_model | 3 |
setup | 0 |
### Test Run Detail
The tests were run with the following arguments:
`Namespace(device='local-task', backend='llvm-cpu', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, benchmark=True, load_inputs=False, groups='all', test_filter='migraphx', testsfile=None, tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='report.md')`
Test | Exit Status | Mean Benchmark Time (ms) | Notes |
---|---|---|---|
migraphx_agentmodel__AgentModel | compilation | None | |
migraphx_bert__bert-large-uncased | compilation | None | |
migraphx_bert__bertsquad-12 | compilation | None | |
migraphx_cadene__dpn92i1 | PASS | 457.4397828740378 | |
migraphx_cadene__inceptionv4i16 | PASS | 26072.668661984306 | |
migraphx_cadene__resnext101_64x4di1 | PASS | 995.6825857516378 | |
migraphx_cadene__resnext101_64x4di16 | PASS | 6324.309662605326 | |
migraphx_huggingface-transformers__bert_mrpc8 | compilation | None | |
migraphx_mlperf__bert_large_mlperf | PASS | 8195.630943014596 | |
migraphx_mlperf__resnet50_v1 | PASS | 219.81522629761858 | |
migraphx_models__whisper-tiny-decoder | compiled_inference | None | |
migraphx_models__whisper-tiny-encoder | native_inference | None | |
migraphx_onnx-misc__taau_low_res_downsample_d2s_for_infer_time_fp16_opset11 | import_model | None | |
migraphx_onnx-model-zoo__gpt2-10 | compilation | None | |
migraphx_ORT__bert_base_cased_1 | PASS | 817.4834945239127 | |
migraphx_ORT__bert_base_uncased_1 | compilation | None | |
migraphx_ORT__bert_large_uncased_1 | PASS | 2728.984761983156 | |
migraphx_ORT__distilgpt2_1 | compiled_inference | None | |
migraphx_ORT__onnx_models__bert_base_cased_1_fp16_gpu | Numerics | 2141.3577783387154 | |
migraphx_ORT__onnx_models__bert_large_uncased_1_fp16_gpu | Numerics | 6767.566671983029 | |
migraphx_ORT__onnx_models__distilgpt2_1_fp16_gpu | Numerics | 101.96079453453422 | |
migraphx_pytorch-examples__wlang_gru | compilation | None | |
migraphx_pytorch-examples__wlang_lstm | compilation | None | |
migraphx_sd__unet__model | import_model | None | |
migraphx_sdxl__unet__model | import_model | None | |
migraphx_torchvision__densenet121i32 | PASS | 2639.900082334255 | |
migraphx_torchvision__inceptioni1 | PASS | 627.4162046611309 | |
migraphx_torchvision__inceptioni32 | PASS | 22124.727455200627 | |
migraphx_torchvision__resnet50i1 | PASS | 284.1490000589854 | |
migraphx_torchvision__resnet50i64 | PASS | 11100.900294492021 |
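Since these reports are emitted as markdown tables (e.g. `report.md`), aggregate statistics over the passing benchmarks can be pulled straight out of the table text. A hypothetical helper, assuming the four-column row layout used above:

```python
import statistics

def benchmark_summary(markdown: str):
    """Collect mean benchmark times (ms) from PASS rows of a report table.

    Assumes rows shaped like: test | exit status | mean time (ms) | notes |
    """
    times = []
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Header and ---|--- separator rows fail the "PASS" check and are skipped.
        if len(cells) >= 3 and cells[1] == "PASS" and cells[2] not in ("None", ""):
            times.append(float(cells[2]))
    return len(times), statistics.mean(times), statistics.median(times)
```

Run over the detail table above, this should find the 13 PASS rows, matching the PASS count in the benchmark summary.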