[Dynamic batch] Investigate refactoring opportunities for batch management in Plugin and Compiler #31691
Conversation
Force-pushed 93b4287 to d299f5c
auto metadata = graph->get_metadata();
for (auto& in : metadata.inputs) {
    if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
        in.shapeFromIRModel.value()[0] = originalBatch;
Maybe keep originalShapesMap to avoid using [0].
    }
}
graph->set_metadata(metadata);
}
This section is necessary to preserve the original batch information. After the model is reshaped in lines 660-676 and compiled in line 736, the metadata's shapeFromIRModel will reflect the reshaped version rather than the original.
Points to consider: Is it possible to avoid altering the metadata? Can we eliminate the dependence on it in dynamic batch scenarios?
Force-pushed 197d408 to 21eb1ef
…and Compiler - clean up
bool modelDeBached = false;
ov::Dimension originalBatch;
if (localConfig.isAvailable(ov::intel_npu::batch_mode.name()) && modelForCompilation->is_dynamic()) {
TODO: check static batching as well.
try {
    _logger.info("Attempting to handle batching on the plugin side.");
    originalBatch = ov::get_batch(modelForCompilation);
    ov::set_batch(modelForCompilation, 1);
set_batch() is naive and can sometimes fail, especially when the batch dimension is not specified in the layout information. I'd recommend adding an attempt to call debatchDynamicModel in case of an exception.
auto metadata = graph->get_metadata();
for (auto& in : metadata.inputs) {
    if (in.shapeFromIRModel.has_value() && originalBatch.get_max_length() != 1) {
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = originalBatch;
What if we set the entire originalShape rather than just originalBatch? That way we don't speculate about the actual BATCH_AXIS position, which may not be index 0.
Yes, there was such an idea: #31691 (comment)
Thanks!
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = originalBatch;
    }
}
graph->set_metadata(metadata);
I'd propose extending the NetworkMetadata class by aggregating additional layout information, or better, introducing a new class along the lines of PluginNetworkMetadata that holds both the NetworkMetadata and the layouts.
The purpose of adding this layout is to let the user specify it, so that we stick to it instead of speculating about the BATCH_AXIS position, which is not equal to 0 in the general case, as we established in previous PRs.
Force-pushed 1ee72b3 to 528435d
…and Compiler - review
Force-pushed 3de8df7 to 2ba4a52
…and Compiler - review - WIP
…and Compiler - review - WIP
…and Compiler - review - WIP
Force-pushed 6cb439f to 9349a91
…and Compiler - review - WIP
Force-pushed c0fd2b0 to 2e9f5d7
…and Compiler - clang
…and Compiler - MLIR fixx
for (auto& in : networkMeta.inputs) {
    if (in.shapeFromIRModel.has_value() && batchSize.has_value()) {
        in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}
for (auto& out : networkMeta.outputs) {
    if (out.shapeFromIRModel.has_value() && batchSize.has_value()) {
        out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
    }
}
Suggested change (hoist the batchSize.has_value() check out of the loops):

if (batchSize.has_value()) {
    for (auto& in : networkMeta.inputs) {
        if (in.shapeFromIRModel.has_value()) {
            in.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
    for (auto& out : networkMeta.outputs) {
        if (out.shapeFromIRModel.has_value()) {
            out.shapeFromIRModel.value()[intel_npu::utils::BATCH_AXIS] = ov::Dimension(1, batchSize.value());
        }
    }
}
Closed as a duplicate of #32350.
Details:
The concept in this PR approach:
Need to modify the logic to be aligned with:
Tickets: