
Conversation

@DariaMityagina (Contributor) commented Oct 10, 2025

Details:

The task is to explore moving the reshaping process into the plugin. The compiler would then receive either a network with batch size 1 (if reshaping succeeds) or a network with a non-1 batch size (if reshaping fails). This approach aims to simplify batch handling and to reduce dependencies on the compiler.

Flow:

  1. Plugin receives a dynamically batched model
  2. Plugin tries to reshape it to batch=1
  3. If success, plugin invokes the compiler to compile the model with batch=1
  4. Plugin dumps the blob in the FS
  5. A user calls benchmark_app -m model.blob -data_shape [4, 3, 224, 224]
  6. Plugin loads the model onto the device
  7. Plugin sees the batch -> it creates N (4) infer requests
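
For illustration, here is a minimal C++ sketch of steps 1-3, assuming a hypothetical debatchIfPossible() helper (the real plugin internals differ):

```cpp
#include <openvino/openvino.hpp>

// Hypothetical helper: try to reshape the incoming model to batch = 1.
// Returns a batch-1 clone on success, or the untouched original model
// when reshaping fails (the compiler then handles batching itself).
std::shared_ptr<ov::Model> debatchIfPossible(const std::shared_ptr<ov::Model>& model) {
    auto clone = model->clone();  // never mutate the user's model
    try {
        ov::set_batch(clone, 1);  // step 2: attempt the reshape to batch = 1
        return clone;             // step 3: this clone goes to the compiler
    } catch (const ov::Exception&) {
        return model;             // reshape failed: keep the non-1 batch
    }
}
```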

Tickets:

  • E-176749

Metadata<CURRENT_METADATA_VERSION>(blobSizesBeforeVersioning,
CURRENT_OPENVINO_VERSION,
initBlobSizes,
originalBatchSize)
Contributor: Can we pass here information from the CompiledModel itself, not from the _graph?

Contributor Author: Done, thanks! 93181ac

Contributor:

Suggested change
originalBatchSize)
_batchSize)

@DariaMityagina DariaMityagina changed the title [Dynamic batch] Investigate refactoring opportunities for batch management in Plugin and Compiler [Dynamic batch] Investigate refactoring opportunities for batch management in Plugin and Compiler - new metadata version Oct 10, 2025
@DariaMityagina DariaMityagina marked this pull request as ready for review October 10, 2025 09:47
@sivanov-work (Contributor) left a comment:

LGTM with minor comments

@DariaMityagina DariaMityagina added this to the 2025.4 milestone Oct 10, 2025
@DariaMityagina DariaMityagina self-assigned this Oct 10, 2025
@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch from 3d11a35 to 93181ac Compare October 13, 2025 07:18
@DariaMityagina (Contributor Author):

@PatrikStepan hi! Could you please take another look at this PR?

@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch 2 times, most recently from 61e0dd0 to b5a2daa Compare October 13, 2025 10:57
Comment on lines 764 to 766
// If we have successfully debatched the model on the PLUGIN side, we should
// avoid repeating the same in the compiler by resetting the batch mode
updateBatchMode(ov::intel_npu::BatchMode::COMPILER);
Contributor:

@DariaMityagina @pereanub what if we remove BATCH_MODE from the config here? From the compiler source code, it seems to have the same effect.
BATCH_MODE is still a private property; today it is added to the list of compiler configs only by the plugin itself. With the changes from this PR, maybe it should be added only when the fallback to the compiler implementation is required?

Contributor Author:

With changes from this PR maybe it should be added only when the fallback on the compiler implementation is required?

That's exactly what's happening here, let's take a closer look.

If BATCH_MODE = AUTO or PLUGIN:

  • Verify if the model is compatible with batching on the plugin side
  • Attempt to reshape the model using set_batch(1)
  • If this is successful, we continue with the PLUGIN BATCH_MODE
    • Write/Read batch size into NPU plugin metadata
    • The reshaped model is sent to the compiler. BATCH_MODE must be set to COMPILER
    • Adjust the metadata post-compilation to retain the original batch value
  • If it isn't successful:
    • If BATCH_MODE = AUTO: we fall back to BATCH_MODE = COMPILER
    • If BATCH_MODE = PLUGIN: we fail
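
A condensed sketch of that decision logic, assuming the BatchMode enum from the plugin's private properties and a hypothetical tryPluginDebatch() helper standing in for the validation plus set_batch(1) attempt (not the actual plugin code):

```cpp
#include <openvino/openvino.hpp>
#include <intel_npu/npu_private_properties.hpp>  // assumed location of BatchMode

bool tryPluginDebatch(const std::shared_ptr<ov::Model>& model);  // hypothetical

void resolveBatchMode(ov::intel_npu::BatchMode mode, const std::shared_ptr<ov::Model>& model) {
    if (mode == ov::intel_npu::BatchMode::COMPILER) {
        return;  // the compiler applies its own batching mechanism
    }
    if (tryPluginDebatch(model)) {
        // PLUGIN path: the batch size goes into the NPU plugin metadata,
        // and the reshaped (batch-1) model is sent to the compiler
        return;
    }
    if (mode == ov::intel_npu::BatchMode::AUTO) {
        return;  // fall back to the compiler implementation
    }
    OPENVINO_THROW("BATCH_MODE = PLUGIN, but the model could not be debatched");
}
```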

@PatrikStepan (Contributor) commented Oct 13, 2025:

Right, I was challenging this part: "BATCH_MODE must be set to COMPILER". If BATCH_MODE is not found in the config, the compiler will attempt to apply its own batching mechanism anyway.
Can BATCH_MODE remain a hint only for the plugin?

  • If set to PLUGIN: try to debatch the model; if debatching succeeds, pass the modified model to the compiler without any extra configs, throw otherwise
  • If set to AUTO or not set at all: try to debatch the model; if debatching succeeds, pass the modified model to the compiler without any extra configs, fall back to the compiler implementation otherwise
  • If set to COMPILER, or on fallback to the compiler implementation: pass the original model to the compiler without any extra configs

@DariaMityagina (Contributor Author) commented Oct 13, 2025:

I see. I removed the BATCH_MODE must be set to COMPILER requirement after successful debatching (here fb6363b).

However, I think it still makes sense to pass the config value for the remaining cases to avoid duplicating the plugin-side validation logic in the compiler. This way we can directly use the updated value from the plugin instead of re-implementing/repeating the same checks just to set BATCH_MODE.

At least in the current state. We can adjust this implementation later through a dedicated task with focused improvements.

Thanks!

Contributor Author:

Discussed offline.

Contributor Author:

NPU_BATCH_MODE is now a Runtime option: 977e1b1

Thanks!

Contributor Author:

Not applicable anymore.


// Handle batch mode configuration
std::optional<ov::Dimension> originalBatch = std::nullopt;
if (localConfig.isAvailable(ov::intel_npu::batch_mode.name())) {
Contributor:

BATCH_MODE is not available when the compiler does not support this option, but batching could still be handled on the plugin side (in some cases at least). Isn't that the actual requirement that needs to be addressed in this PR?


@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch 4 times, most recently from 977e1b1 to 081c268 Compare October 15, 2025 02:13
@pereanub (Contributor) left a comment:

It is not very clear to me how you handle the fixed batch size case today.
E.g.: the batch size of the network I/O is not dynamic but fixed to 3x... How is this supposed to work today?

@DariaMityagina (Contributor Author) commented Oct 15, 2025

It is not very clear to me how you handle the fixed batch size case today. E.g.: the batch size of the network I/O is not dynamic but fixed to 3x... How is this supposed to work today?

@pereanub, static batch size is fixed here:

Instead of:

So, when batch=3 (static) is used, e.g.:

  • We run validation checks to determine if the model is compatible with plugin batching
  • We identify the batch_size, which equals 3 in this scenario
  • We call graph->set_batch_size(batchSize.value());
  • Within zero_infer_request, we call _graph->get_batch_size() (this returns the previously set value of 3)
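
In rough C++, the static-batch path described above boils down to something like this; set_batch_size()/get_batch_size() mirror the IGraph methods mentioned in this thread, while the surrounding names are stand-ins:

```cpp
#include <memory>
#include <openvino/openvino.hpp>
#include <intel_npu/common/igraph.hpp>  // assumed location of IGraph

// Stand-in sketch of the static-batch handling, not the actual plugin code.
void handleStaticBatch(const std::shared_ptr<const ov::Model>& model,
                       const std::shared_ptr<intel_npu::IGraph>& graph) {
    const auto& batchDim = model->input(0).get_partial_shape()[0];
    if (batchDim.is_static() && batchDim.get_length() > 1) {
        graph->set_batch_size(batchDim.get_length());  // e.g. 3 for a 3xCxHxW input
    }
}

// Later, inside zero_infer_request:
//   const auto batchSize = _graph->get_batch_size();  // returns the stored 3
// and the request handles the batched buffers accordingly.
```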

@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch from f33bf5e to 095954f Compare October 15, 2025 17:09
…and Compiler - review (remove setting batch_mode to COMPILER after debatching, private methods, tests)
…and Compiler - review (helpers to a separate file, remove get_batch_size for compiledModel, BATCH_MODE is a runtime option)
…and Compiler - review (BATCH_MODE is needed, tests are not skipped)
…and Compiler - review (if compiler doesn't support BatchMode)
… Plugin and Compiler - review (batchSize type)"

This reverts commit 095954f.
…and Compiler - review (if compiler doesn't support BatchMode) - fixes
…and Compiler - review (use partial shape instead of _compiledModel, coverity)
…and Compiler - review (protect original model, minor comments)
@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch from f852a99 to a16d4d8 Compare October 19, 2025 11:45
@DariaMityagina (Contributor Author):

@PatrikStepan @pereanub, @alexandruenache1111 hi! Could you please take another look at this PR? All comments have been addressed.

@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch-metadata-used branch 2 times, most recently from c3498e4 to 6d228f7 Compare October 19, 2025 20:22
@alexandruenache1111 (Contributor) left a comment:

LGTM on the metadata side.

@DariaMityagina (Contributor Author):

@razvanapetroaie hi! Could you please take another look at this PR? We would like to merge this today.

@razvanapetroaie (Contributor) left a comment:

Nothing important imo. I can handle the nits in one of my PRs, so you won't have to rerun the CI jobs just for this.

/**
* @details The number of init schedules, along with the size of each init binary object are read in addition to the
* information provided by the previous metadata versions.
*/
Contributor:

The comment is wrong: this version adds the batch size value, nothing related to init schedules. The same applies to the write function. You may leave this without comments if you wish; I think the description of the class makes it obvious enough.

Metadata<CURRENT_METADATA_VERSION>(blobSizesBeforeVersioning,
CURRENT_OPENVINO_VERSION,
initBlobSizes,
originalBatchSize)
Contributor:

Suggested change
originalBatchSize)
_batchSize)

const std::shared_ptr<IGraph>& graph,
const FilteredConfig& config);
const FilteredConfig& config,
const std::optional<int64_t>& batchSize);
Contributor:

Could use a comment in the doxygen passage above on why we ended up passing this value separately. One would expect the batch size to be visible in the ov::Model object.

: inputDescriptor.shapeFromCompiler;

if (batchSize.has_value()) {
shape[intel_npu::utils::BATCH_AXIS] = ov::Dimension(batchSize.value());
Contributor:

One potential problem that might occur here:
This shape is supposed to be the exact shape used by the original ov::Model object (before compilation). This convention allows using I/O descriptors of the ov::Model as identifiers for the CompiledModel and its derived InferenceRequest. The shape should also match in order to successfully identify I/Os.

If I interpret the code correctly, if the original model used a bounded interval as the batch axis value, you are overriding it with -1 (this line), so it doesn't match anymore. Now, maybe bounded values for the batch axis don't make much sense from an ML perspective. And we also have the same mismatching issue nowadays, since shapeFromIRModel does not store dynamic shapes. So maybe this issue is not really relevant.
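
To make the mismatch concrete, an illustrative-only snippet (the shapes are invented):

```cpp
#include <openvino/core/partial_shape.hpp>

ov::PartialShape original{ov::Dimension(1, 8), 3, 224, 224};         // bounded batch axis
ov::PartialShape overridden{ov::Dimension::dynamic(), 3, 224, 224};  // after the -1 override
// original == overridden is false, so matching I/O descriptors by
// exact shape no longer identifies this input.
```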

@DariaMityagina DariaMityagina added this pull request to the merge queue Oct 20, 2025
Merged via the queue into openvinotoolkit:master with commit 6872b07 Oct 20, 2025
178 checks passed
@DariaMityagina DariaMityagina deleted the icv/dm/plugin_batch-metadata-used branch October 20, 2025 13:27

Labels

category: NPU (OpenVINO NPU plugin), Code Freeze
