[Dynamic batch] Investigate refactoring opportunities for batch management in Plugin and Compiler #31691
Conversation
Force-pushed 93b4287 to d299f5c
auto metadata = graph->get_metadata();
for (auto& in : metadata.inputs) {
    if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
        in.shapeFromIRModel.value()[0] = originalBatch;
Maybe keep originalShapesMap to avoid using [0].
    }
}
graph->set_metadata(metadata);
}
This section is necessary to preserve the original batch information. After the model is reshaped in lines 660-676 and compiled in line 736, the metadata's shapeFromIRModel will reflect the reshaped version rather than the original.
Points to consider: Is it possible to avoid altering the metadata? Can we eliminate the dependence on it in dynamic batch scenarios?
Force-pushed 197d408 to 21eb1ef
…and Compiler - clean up
bool modelDeBached = false;
ov::Dimension originalBatch;
if (localConfig.isAvailable(ov::intel_npu::batch_mode.name()) && modelForCompilation->is_dynamic()) {
TODO: check static batching as well.
try {
    _logger.info("Attempting to handle batching on the plugin side.");
    originalBatch = ov::get_batch(modelForCompilation);
    ov::set_batch(modelForCompilation, 1);
set_batch() is naive and can sometimes fail, especially when the batch dimension is not specified in the layout information. I'd recommend adding an attempt to call debatchDynamicModel in case of an exception.
auto metadata = graph->get_metadata();
for (auto& in : metadata.inputs) {
    if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = originalBatch;
What if we set the entire originalShape rather than just originalBatch? That way we don't speculate about the actual BATCH_AXIS position, which may not be index 0.
Yes, there was such an idea: #31691 (comment)
Thanks!
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = originalBatch;
    }
}
graph->set_metadata(metadata);
I'd propose extending the NetworkMetadata class by aggregating additional layout information, or better, introducing a new class along the lines of PluginNetworkMetadata that holds both the NetworkMetadata and the layouts.
The purpose of adding this layout is to let the user specify it, so that we stick to it instead of speculating about the BATCH_AXIS position, which is not equal to 0 in the general case, as we established in previous PRs.
Force-pushed 1ee72b3 to 528435d
…and Compiler - review
Force-pushed 3de8df7 to 2ba4a52
…and Compiler - review - WIP
…and Compiler - review - WIP
…and Compiler - review - WIP
Force-pushed 6cb439f to 9349a91
…and Compiler - review - WIP
Force-pushed c0fd2b0 to 2e9f5d7
…and Compiler - clang
…and Compiler - MLIR fixx
for (auto& in : networkMeta.inputs) {
    if (in.shapeFromIRModel.has_value() && batchSize.has_value()) {
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}
for (auto& out : networkMeta.outputs) {
    if (out.shapeFromIRModel.has_value() && batchSize.has_value()) {
        out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}
Suggested change (hoist the batchSize.has_value() check out of the loops):

if (batchSize.has_value()) {
    for (auto& in : networkMeta.inputs) {
        if (in.shapeFromIRModel.has_value()) {
            in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
    for (auto& out : networkMeta.outputs) {
        if (out.shapeFromIRModel.has_value()) {
            out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
}
Closed as a duplicate of #32350.
Details:
The concept in this PR approach:
Need to modify the logic to be aligned with:
Tickets: