Conversation

@ajrasane
Contributor

What does this PR do?

Type of change:
Example update

Overview:

  • Optimize the benchmarking function in the diffusers example
python diffusion_trt.py --model flux-dev --benchmark --model-dtype BFloat16 --skip-image --torch

Testing

Backbone-only inference latency (BFloat16):
  Average: 139.48 ms
  P50: 139.36 ms
  P95: 141.13 ms
  P99: 141.35 ms
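
For reference, percentile stats like these can be derived from per-iteration timings with a nearest-rank percentile. A minimal sketch (the `latencies_ms` sample values are illustrative, not the PR's actual measurements):

```python
# Compute average and percentile latency statistics from a list of
# per-iteration timings in milliseconds.
def latency_stats(latencies_ms):
    s = sorted(latencies_ms)

    def pct(p):
        # Nearest-rank percentile over the sorted samples.
        idx = min(len(s) - 1, max(0, int(round(p / 100.0 * len(s))) - 1))
        return s[idx]

    return {
        "avg": sum(s) / len(s),
        "p50": pct(50),
        "p95": pct(95),
        "p99": pct(99),
    }

# Illustrative sample only.
latencies_ms = [138.9, 139.2, 139.4, 139.5, 140.8, 141.2]
print(latency_stats(latencies_ms))
```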

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

@ajrasane ajrasane requested a review from a team as a code owner October 31, 2025 01:18
@ajrasane ajrasane self-assigned this Oct 31, 2025
@ajrasane ajrasane requested a review from cjluo-nv October 31, 2025 01:18
@codecov

codecov bot commented Oct 31, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.36%. Comparing base (ca94c96) to head (e4453f7).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #487   +/-   ##
=======================================
  Coverage   74.36%   74.36%           
=======================================
  Files         181      182    +1     
  Lines       18192    18216   +24     
=======================================
+ Hits        13529    13547   +18     
- Misses       4663     4669    +6     

☔ View full report in Codecov by Sentry.

@kevalmorabia97
Collaborator

Please make sure to run the internal GitLab diffusers CI/CD tests to verify they don't break with this change.

@ajrasane ajrasane force-pushed the ajrasane/benchmark_diffusers branch from 89f6c25 to 1aafbbc Compare November 7, 2025 19:28
Signed-off-by: ajrasane <[email protected]>
@ajrasane ajrasane force-pushed the ajrasane/benchmark_diffusers branch from 094aa94 to 646458a Compare November 7, 2025 20:05
```python
def forward_hook(_module, _input, _output):
    ...

_ = backbone(**dummy_inputs_dict)
end_event.record()
torch.cuda.synchronize()
```
Collaborator

I don't think you need to call sync here.

Contributor Author

The synchronization call is needed; otherwise we run into this error:

RuntimeError: Both events must be completed before calculating elapsed time.
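
That error is raised when `Event.elapsed_time` is read before both CUDA events have actually completed on the GPU, since `record()` only enqueues the event on the stream. A minimal sketch of the timing pattern, assuming a callable `fn` standing in for the backbone forward pass (names here are illustrative, not the PR's actual code):

```python
import torch


def benchmark_gpu_ms(fn, iters=10):
    """Time a GPU callable with CUDA events; returns per-iteration latency in ms."""
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    latencies = []
    for _ in range(iters):
        start.record()
        fn()
        end.record()
        # record() only enqueues the events; both must have completed
        # before elapsed_time() is valid, hence the synchronize here.
        torch.cuda.synchronize()
        latencies.append(start.elapsed_time(end))
    return latencies


if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")
    print(benchmark_gpu_ms(lambda: x @ x))
```

An alternative to a device-wide `torch.cuda.synchronize()` would be `end.synchronize()`, which waits only for that event, but either one satisfies the "both events must be completed" requirement.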

@ajrasane ajrasane enabled auto-merge (squash) November 10, 2025 11:09
@ajrasane ajrasane disabled auto-merge November 10, 2025 11:22
@ajrasane ajrasane requested a review from a team as a code owner November 10, 2025 11:33
@ajrasane ajrasane requested a review from i-riyad November 10, 2025 11:33
@ajrasane ajrasane enabled auto-merge (squash) November 10, 2025 11:34
@ajrasane ajrasane merged commit e74a468 into main Nov 10, 2025
26 checks passed
@ajrasane ajrasane deleted the ajrasane/benchmark_diffusers branch November 10, 2025 13:06
kevalmorabia97 pushed a commit that referenced this pull request Nov 10, 2025
mxinO pushed a commit that referenced this pull request Nov 11, 2025
Signed-off-by: mxin <[email protected]>