[release_v2190] release notes template #3731

AlexanderDokuchaev · 2025-11-07T19:44:54Z

Reason for changes

Upcoming release

Related tickets

176350

MaximProshin · 2025-11-11T07:29:03Z

@l-bat, please help with the list of new notebooks that use NNCF. I noticed only the following:

daniil-lyakhov · 2025-11-11T13:00:01Z

ReleaseNotes.md

+- General:
+  - ...
+- Features:
+  - The histogram aggregator was introduced, improving metrics for a number of classification models with PTQ.


daniil-lyakhov · 2025-11-11T13:00:30Z

ReleaseNotes.md

+Post-training Quantization:
+
+- Breaking changes:
+  - (OpenVINO) `nncf.CompressWeightsMode.E2M1` `mode` option is renamed to `nncf.CompressWeightsMode.MXFP4`.
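
For code that has to run against NNCF versions from both before and after this rename, one option is to look the member up by name and fall back to the legacy name. The sketch below is only an illustration: `resolve_mode` is a hypothetical helper, and `OldModes` / `NewModes` are stand-in enums with placeholder values so that the snippet runs without NNCF installed.

```python
from enum import Enum


def resolve_mode(mode_enum, preferred, legacy):
    """Return the member named `preferred`, falling back to `legacy` (hypothetical helper)."""
    members = mode_enum.__members__
    if preferred in members:
        return members[preferred]
    if legacy in members:
        return members[legacy]
    raise AttributeError(f"{mode_enum.__name__} has neither {preferred} nor {legacy}")


# Stand-in enums that only mimic the member names before and after the rename.
# The string values are placeholders, not the real NNCF values.
class OldModes(Enum):
    E2M1 = "old-placeholder"


class NewModes(Enum):
    MXFP4 = "new-placeholder"


old_mode = resolve_mode(OldModes, "MXFP4", "E2M1")
new_mode = resolve_mode(NewModes, "MXFP4", "E2M1")
print(old_mode.name, new_mode.name)  # E2M1 MXFP4
```

With NNCF installed, the same helper could presumably be called as `resolve_mode(nncf.CompressWeightsMode, "MXFP4", "E2M1")` and the result passed as the `mode` option of `nncf.compress_weights()`.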


daniil-lyakhov · 2025-11-11T13:00:54Z

ReleaseNotes.md

+  - ...
+- Features:
+  - The histogram aggregator was introduced, improving metrics for a number of classification models with PTQ.
+  - (OpenVINO) Introduced several new compression modes in `nncf.CompressWeightsMode`: `MXFP8`, `FP8`, and `FP4`. These can be used as the `mode` option in `nncf.compress_weights()` to apply the corresponding MXFP8, FP8, or FP4 precisions (experimental).


#3664
#3683

daniil-lyakhov · 2025-11-11T13:02:00Z

ReleaseNotes.md

+- Fixes:
+  - ...
+- Improvements:
+  - Maximum memory consumption during statistic collection has been reduced by releasing model output memory before the next statistic collection inference call.


nikita-savelyevv · 2025-11-12T10:58:31Z

ReleaseNotes.md

+- Features:
+  - The histogram aggregator was introduced, improving metrics for a number of classification models with PTQ.
+  - (OpenVINO) Introduced several new compression modes in `nncf.CompressWeightsMode`: `MXFP8`, `FP8`, and `FP4`. These can be used as the `mode` option in `nncf.compress_weights()` to apply the corresponding MXFP8, FP8, or FP4 precisions (experimental).
+  - The weight compression bitwidth distribution table now also displays the group size value for each compression data type.


nikita-savelyevv · 2025-11-12T10:58:51Z

ReleaseNotes.md

+- Known issues:
+  - ...
+- Other:
+  - Refined the handling of layers whose channel size is not divisible by the group size during weight compression. By default, an error is now raised in such cases; the error message suggests providing a different group size value or using `GroupSizeFallbackMode.ADJUST` to automatically adjust the group size for problematic layers.
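
To make the divisibility constraint concrete, here is a minimal sketch of the two behaviours described in this entry: raising an error by default, or adjusting the group size. The function name, the `min_group_size` parameter, and the halving policy are all assumptions made for illustration; the release note does not say how `GroupSizeFallbackMode.ADJUST` actually chooses the new group size.

```python
def pick_group_size(channel_size, group_size, adjust=False, min_group_size=16):
    """Return a group size that evenly divides channel_size (illustrative only)."""
    if channel_size % group_size == 0:
        return group_size
    if not adjust:
        # Default behaviour in this sketch: fail loudly and suggest a remedy.
        raise ValueError(
            f"Channel size {channel_size} is not divisible by group size {group_size}; "
            "provide a different group size or enable adjustment."
        )
    # Assumed adjustment policy: keep halving until the size divides the channel dimension.
    candidate = group_size // 2
    while candidate >= min_group_size:
        if channel_size % candidate == 0:
            return candidate
        candidate //= 2
    raise ValueError(f"No group size >= {min_group_size} divides {channel_size}")


print(pick_group_size(4096, 128))               # 128
print(pick_group_size(2880, 128, adjust=True))  # 64
```

Calling `pick_group_size(2880, 128)` without `adjust=True` raises `ValueError`, which mirrors the new default described above.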


nikita-savelyevv · 2025-11-12T10:59:13Z

ReleaseNotes.md

+  - (OpenVINO) Introduced several new compression modes in `nncf.CompressWeightsMode`: `MXFP8`, `FP8`, and `FP4`. These can be used as the `mode` option in `nncf.compress_weights()` to apply the corresponding MXFP8, FP8, or FP4 precisions (experimental).
+  - The weight compression bitwidth distribution table now also displays the group size value for each compression data type.
+- Fixes:
+  - Added an ignored pattern for position embedding layer in Segment Anything model.


nikita-savelyevv · 2025-11-12T11:00:43Z

ReleaseNotes.md

+  - Added an ignored pattern for position embedding layer in Segment Anything model.
+- Improvements:
+  - Maximum memory consumption during statistic collection has been reduced by releasing model output memory before the next statistic collection inference call.
+  - Reduced peak memory footprint for Bias Correction algorithm.


nikita-savelyevv · 2025-11-12T11:00:53Z

ReleaseNotes.md

+- Improvements:
+  - Maximum memory consumption during statistic collection has been reduced by releasing model output memory before the next statistic collection inference call.
+  - Reduced peak memory footprint for Bias Correction algorithm.
+  - (OpenVINO) Reduced the time (by up to 3x) and memory (by up to 1.5x) required to compress models to the `MXFP4` data type.


andrey-churkin · 2025-11-15T20:33:36Z

ReleaseNotes.md

+  - The histogram aggregator was introduced, improving metrics for a number of classification models with PTQ.
+  - (OpenVINO) Introduced several new compression modes in `nncf.CompressWeightsMode`: `MXFP8`, `FP8`, and `FP4`. These can be used as the `mode` option in `nncf.compress_weights()` to apply the corresponding MXFP8, FP8, or FP4 precisions (experimental).
+  - The weight compression bitwidth distribution table now also displays the group size value for each compression data type.
+  - (ONNX) Support for the SmoothQuant algorithm has been added to the ONNX backend for INT8 quantization.


#3644, #3687

andrey-churkin · 2025-11-15T20:33:52Z

ReleaseNotes.md

+  - (OpenVINO) Introduced several new compression modes in `nncf.CompressWeightsMode`: `MXFP8`, `FP8`, and `FP4`. These can be used as the `mode` option in `nncf.compress_weights()` to apply the corresponding MXFP8, FP8, or FP4 precisions (experimental).
+  - The weight compression bitwidth distribution table now also displays the group size value for each compression data type.
+  - (ONNX) Support for the SmoothQuant algorithm has been added to the ONNX backend for INT8 quantization.
+  - (ONNX) A new transformation has been added to optimize models by folding `QuantizeLinear` nodes with constant inputs into precomputed, quantized initializers. This behavior is controlled by the `COMPRESS_WEIGHTS` backend parameter, which is now enabled (`True`) by default.
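
As a rough illustration of what folding a `QuantizeLinear` node with constant inputs amounts to, the sketch below precomputes the INT8 values once instead of leaving the computation in the graph. It is a pure-Python, per-tensor approximation of the operator's arithmetic (round half to even, then saturate); the function name and the sample values are made up for the example, and this is not the NNCF transformation itself.

```python
def quantize_linear(values, scale, zero_point=0, qmin=-128, qmax=127):
    """Per-tensor INT8 quantization: saturate(round(x / scale) + zero_point)."""
    # Python's built-in round() rounds halves to the nearest even integer,
    # which is the rounding mode QuantizeLinear specifies.
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


# A constant FP32 weight that would otherwise feed a QuantizeLinear node.
weights = [1.0, -0.25, 100.0, 0.75]

# "Folding": compute the quantized constant ahead of time and store it
# as the initializer, so no QuantizeLinear node is needed at inference.
folded_initializer = quantize_linear(weights, scale=0.5)
print(folded_initializer)  # [2, 0, 127, 2]
```

The value `100.0` saturates at 127, and `-0.25 / 0.5 = -0.5` rounds to 0 under round-half-to-even.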


andrey-churkin · 2025-11-15T20:34:08Z

ReleaseNotes.md

+  - The weight compression bitwidth distribution table now also displays the group size value for each compression data type.
+  - (ONNX) Support for the SmoothQuant algorithm has been added to the ONNX backend for INT8 quantization.
+  - (ONNX) A new transformation has been added to optimize models by folding `QuantizeLinear` nodes with constant inputs into precomputed, quantized initializers. This behavior is controlled by the `COMPRESS_WEIGHTS` backend parameter, which is now enabled (`True`) by default.
+  - (ONNX) Support has been added for applying the Fast Bias/Bias Correction algorithm to `MatMul` + `Add` subgraphs where one of the inputs to the `Add` operation is a constant. Previously, these cases were skipped because the `MatMul` operation was not recognized as having a bias, preventing the algorithm from being applied.
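
The pattern described in this entry can be sketched on a toy graph: a `MatMul` is treated as having a bias when its output feeds an `Add` whose other input is a constant. The dictionary-based graph representation and the `find_matmul_bias` helper below are hypothetical and exist only to illustrate the matching rule; NNCF's own graph structures are different.

```python
def find_matmul_bias(nodes, constants):
    """Map each MatMul node name to the constant that acts as its bias."""
    producers = {out: node for node in nodes for out in node["outputs"]}
    bias_of = {}
    for node in nodes:
        if node["op"] != "Add" or len(node["inputs"]) != 2:
            continue
        # Check both operand orders: MatMul output on either side of the Add.
        for i, inp in enumerate(node["inputs"]):
            other = node["inputs"][1 - i]
            producer = producers.get(inp)
            if producer is not None and producer["op"] == "MatMul" and other in constants:
                bias_of[producer["name"]] = other
    return bias_of


nodes = [
    {"name": "mm1", "op": "MatMul", "inputs": ["x", "W1"], "outputs": ["mm1_out"]},
    {"name": "add1", "op": "Add", "inputs": ["mm1_out", "B1"], "outputs": ["y1"]},
    {"name": "mm2", "op": "MatMul", "inputs": ["y1", "W2"], "outputs": ["mm2_out"]},
    # This Add combines two activations, so mm2 has no constant bias.
    {"name": "add2", "op": "Add", "inputs": ["mm2_out", "x"], "outputs": ["y2"]},
]
constants = {"W1", "B1", "W2"}

print(find_matmul_bias(nodes, constants))  # {'mm1': 'B1'}
```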


andrey-churkin · 2025-11-15T20:34:24Z

ReleaseNotes.md

+  - (ONNX) Support has been added for applying the Fast Bias/Bias Correction algorithm to `MatMul` + `Add` subgraphs where one of the inputs to the `Add` operation is a constant. Previously, these cases were skipped because the `MatMul` operation was not recognized as having a bias, preventing the algorithm from being applied.
+- Fixes:
+  - Added an ignored pattern for position embedding layer in Segment Anything model.
+  - (ONNX) Fixed incorrect input handling for the `MatMulNBits` operation that previously caused graph breaks.


andrey-churkin · 2025-11-15T20:34:39Z

ReleaseNotes.md

+- Fixes:
+  - Added an ignored pattern for position embedding layer in Segment Anything model.
+  - (ONNX) Fixed incorrect input handling for the `MatMulNBits` operation that previously caused graph breaks.
+  - (ONNX) Resolved an issue with INT4 weight compression in the `Gemm` operation when `transB=1`.


release notes template

d8f03d7

AlexanderDokuchaev requested a review from a team as a code owner November 7, 2025 19:44

github-actions bot added the documentation and release target labels Nov 7, 2025

AlexanderDokuchaev requested review from MaximProshin, andrey-churkin, andreyanufr, anzr299, daniil-lyakhov, l-bat, ljaljushkin and nikita-savelyevv November 7, 2025 19:46

Add list of OV notebooks with NNCF to release notes

0533467

l-bat approved these changes Nov 11, 2025

Update ReleaseNotes.md

57e9750

daniil-lyakhov reviewed Nov 11, 2025

daniil-lyakhov approved these changes Nov 11, 2025

ljaljushkin approved these changes Nov 11, 2025

Update ReleaseNotes.md

0ec4746

nikita-savelyevv reviewed Nov 12, 2025

nikita-savelyevv approved these changes Nov 12, 2025

anzr299 approved these changes Nov 12, 2025

AlexanderDokuchaev and others added 2 commits November 14, 2025 01:35

Update ReleaseNotes.md

9bbb0f3

Update ReleaseNotes.md

d79f3d0

andrey-churkin approved these changes Nov 15, 2025

andrey-churkin reviewed Nov 15, 2025

AlexanderDokuchaev added 2 commits November 16, 2025 16:59

Update ReleaseNotes.md

44d905b

Update ReleaseNotes.md

2118930
