
Conversation


@DariaMityagina DariaMityagina commented Aug 12, 2025

Closed as a duplicate of #32350.

Details:

  • The task is to explore moving the reshaping process into the plugin. The compiler would then receive either a network with batch size 1 (if reshaping succeeds) or a network with a non-1 batch size (if reshaping fails). This approach aims to simplify batch handling and reduce dependencies on the compiler.

The concept in this PR approach:

  1. Verify if the model is compatible with batching on the plugin side.
  2. Attempt to reshape the model using set_batch(1).
  3. If this is successful, we continue with the PLUGIN batch. The reshaped model is sent to the compiler, and afterward, we adjust the metadata post-compilation to retain the original batch value.

The logic needs to be modified to align with the following:

  1. Verify if the model is compatible with batching on the plugin side.
  2. Attempt to reshape the model using set_batch(1).
  3. If this is successful, we continue with the PLUGIN batch.
  4. Write/Read batch size into NPU plugin metadata (write when compiling a model, read when importing a model).
  5. The reshaped model is sent to the compiler. BATCH_MODE must be set to COMPILER.
  6. Adjust the metadata post-compilation to retain the original batch value. To keep backward/forward compatibility, create or update the blob metadata from what is stored in the plugin metadata: create it in the CiD case, update it in the MLIR case.

Tickets:

  • E-176749

@DariaMityagina DariaMityagina self-assigned this Aug 12, 2025
@github-actions github-actions bot added the category: NPU OpenVINO NPU plugin label Aug 12, 2025
    auto metadata = graph->get_metadata();
    for (auto& in : metadata.inputs) {
        if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
            in.shapeFromIRModel.value()[0] = originalBatch;
DariaMityagina (author) commented:

Maybe keep originalShapesMap to avoid using [0].

        }
    }
    graph->set_metadata(metadata);
}
DariaMityagina (author) commented on Aug 14, 2025:

This section is necessary to preserve the original batch information. After reshaping the model in lines 660-676 and compiling it in line 736, the metadata will reflect shapeFromIRModel as the reshaped version, rather than the original.

DariaMityagina (author) commented:

Points to consider: Is it possible to avoid altering the metadata? Can we eliminate dependence on it when dealing with dynamic batch scenarios?

@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch branch 2 times, most recently from 197d408 to 21eb1ef Compare August 15, 2025 10:45

bool modelDeBached = false;
ov::Dimension originalBatch;
if (localConfig.isAvailable(ov::intel_npu::batch_mode.name()) && modelForCompilation->is_dynamic()) {
DariaMityagina (author) commented:

TODO: check static batching as well.

    try {
        _logger.info("Attempting to handle batching on the plugin side.");
        originalBatch = ov::get_batch(modelForCompilation);
        ov::set_batch(modelForCompilation, 1);
A reviewer (Contributor) commented:

set_batch() is naive and sometimes cannot proceed, especially when the batch dimension is not specified in the layout information. I'd recommend adding an attempt to do debatchDynamicModel in case of an exception.

    auto metadata = graph->get_metadata();
    for (auto& in : metadata.inputs) {
        if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
            in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = originalBatch;
A reviewer (Contributor) commented:

What if we set the entire originalShape rather than just originalBatch, so that we do not speculate about the actual BATCH_AXIS position, which may not be index 0?

DariaMityagina (author) replied:

Yes, there was such an idea: #31691 (comment)
Thanks!

        }
    }
    graph->set_metadata(metadata);
A reviewer (Contributor) commented:

I'd propose extending the NetworkMetadata class by aggregating additional layout information, or better, introducing a new class like PluginNetworkMetadata that holds both NetworkMetadata and the layouts.

The purpose of adding this layout is to let the user specify it, so that we stick to it instead of speculating about the BATCH_AXIS position, which is not equal to 0 in the generic case, as we ensured in previous PRs.

@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch branch 6 times, most recently from 3de8df7 to 2ba4a52 Compare August 21, 2025 12:40
@DariaMityagina DariaMityagina force-pushed the icv/dm/plugin_batch branch 4 times, most recently from 6cb439f to 9349a91 Compare August 25, 2025 18:29
Comment on lines +267 to +277
for (auto& in : networkMeta.inputs) {
    if (in.shapeFromIRModel.has_value() && batchSize.has_value()) {
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}
for (auto& out : networkMeta.outputs) {
    if (out.shapeFromIRModel.has_value() && batchSize.has_value()) {
        out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}

DariaMityagina (author) commented:

Suggested change (hoist the repeated batchSize.has_value() check out of the loops):

if (batchSize.has_value()) {
    for (auto& in : networkMeta.inputs) {
        if (in.shapeFromIRModel.has_value()) {
            in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
    for (auto& out : networkMeta.outputs) {
        if (out.shapeFromIRModel.has_value()) {
            out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
}
