NVIDIA / TensorRT-Model-Optimizer Public

Pull requests: NVIDIA/TensorRT-Model-Optimizer

46 Open 226 Closed

Pull requests list

Enable Yarn RoPE in minitron pruning for gpt-oss support

#530 opened Nov 8, 2025 by kevalmorabia97 • Draft

Make wheel build manual CI job and diffusers test fix

#529 opened Nov 8, 2025 by kevalmorabia97

[2/n] Add Core Sparse Attention Infrastructure

#527 opened Nov 7, 2025 by kaix-nv

[BUG FIX 5616904] Add transformers version restoration after PTQ for VILA

#525 opened Nov 7, 2025 by yueshen2016

parallel eagle draft

#523 opened Nov 6, 2025 by yeyu-nvidia • Draft

[Bug #193] fix fp8 blockwise real quantization

#522 opened Nov 6, 2025 by meenchen

Support AWQ fake quant for vLLM MoE models

#521 opened Nov 6, 2025 by meenchen • Draft

Update custom file name patterns when copy files and remove problematic parameters in export

#520 opened Nov 6, 2025 by Edwardf0t1

Fix BMM style MoE export in fp8_pc_pt recipe

#515 opened Nov 5, 2025 by Edwardf0t1

Fix DQ1 output type error in DQ1->DQ2 for FP4 weights in NVFP4 model

#513 opened Nov 5, 2025 by vishalpandya1990

Alit/moe dev2

#508 opened Nov 4, 2025 by JRD971000 • Draft

[5591945][5589019@13][ONNX] Fix 'nodes not sorted' failure

#507 opened Nov 4, 2025 by gcunhase

Add decilm modelling code

#505 opened Nov 4, 2025 by danielkorzekwa

[OMNIML-2917] handle lm_head and other un-quantized modules correctly

#504 opened Nov 4, 2025 by shengliangxu

PyTorch geometric quantization support

#494 opened Nov 3, 2025 by i-riyad

Compress tutorial (PoC)

#492 opened Nov 3, 2025 by danielkorzekwa

Update benchmarking for diffusers

#487 opened Oct 31, 2025 by ajrasane

[Draft] [5526696] Add kv cache quantization support for onnx quantization

#486 opened Oct 31, 2025 by zhanghaoc

Fix/Improve vllm PTQ and Support multi-node with ray

#484 opened Oct 30, 2025 by mxinO

Yeyu/set block

#480 opened Oct 28, 2025 by yeyu-nvidia • Draft

feat: add onnxslim support

#478 opened Oct 28, 2025 by inisis

Feat: Eagle3 HF Online - support nemotron models

#463 opened Oct 25, 2025 by h-guo18

Add functional test cases for published checkpoints on HF

#455 opened Oct 21, 2025 by noeyy-mino

Preserve original rope scaling type in export due to transformers library AutoConfig issue

#452 opened Oct 17, 2025 by Edwardf0t1

[1/2] Registry interface for custom quantization functional backend

#449 opened Oct 17, 2025 by realAsma
