[NPU] Model serialization/deserialization without weights copies #31939
Open
razvanapetroaie wants to merge 42 commits into openvinotoolkit:master from razvanapetroaie:CVS-169982-weights-serialization
+647
−222
Changes from 30 commits
Commits (42, all authored by razvanapetroaie):
d717f92 Implementing the NPU plugin deserializer, it doesn't do anything spec…
55adf77 Addinge a serializer that does nothing special
8f3e760 Adding a new config option
ea58a81 Starting to refactor the plugin-driver adapter
7800811 Done refactoring
7393f90 Tweaking the deserializer. First weights-copy solution that seems to …
f2e1e64 Adding the same extensions used by the driver-compiler adapter
553b23e Storing the first serializer attempt
f93ef2b Second attempt
63117b1 First solution that seems to be working
a2362fe Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
5a49dea Linux time measurements
5b34fc2 windowstime measurements
57201e3 Renaming the config
38bb262 Adding a new config option for setting the weights size threshold
cf12d7e Revert "windowstime measurements"
a27cddf SERIALIZATION_WEIGHTS_SIZE_THRESHOLD ammend
bbad999 Renamed to VCLSerializer
5313bb6 Added a weights size threshold
7a097d4 Adding one more time measurement
932627e Avoiding one model clone
8a165a4 Shorter tags for the new attribute
9e2010f Serializer - writing custon data using the OV interface
731dafd Moving the deserializer code
1152407 Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
7a74cf4 Comments, code style
fde9c7a Removing measurements
5a2ad5a Test tweak
9f322d0 Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
0b8f541 more test tweak
695f227 Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
5bba959 just comments and attribute tags
742ee91 virtual dtor
afca0f8 Basic test for weightless serializer
bf11a90 reduced copy-pasta in the "serialize_model_to_stream" functions
38ca6e6 just a comment
352f853 Reusing the weightless writer -> significant time boost if weights are
9fd1b2b Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
5fdf849 post-merge build fix
0493b3e Const values instead of hardcoded ones, serialization keys
3cc1798 copyright fix
e95fae3 Merge remote-tracking branch 'upstream/master' into CVS-169982-weight…
41 changes: 41 additions & 0 deletions
src/plugins/intel_npu/src/al/include/intel_npu/weights_pointer_attribute.hpp
@@ -0,0 +1,41 @@

```cpp
// Copyright (C) 2018-2025 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include "openvino/core/runtime_attribute.hpp"

namespace intel_npu {

/**
 * @brief Attribute containing the memory address of a weights buffer and the size of the buffer in bytes.
 * @details Used as part of the serialization/deserialization algorithms in order to allow processing models without
 * copying weights.
 */
class WeightsPointerAttribute : public ov::RuntimeAttribute {
public:
    OPENVINO_RTTI("WeightsPointerAttribute", "0", RuntimeAttribute);

    WeightsPointerAttribute() = delete;

    WeightsPointerAttribute(const void* pointer, const size_t size)
        : memory_pointer(reinterpret_cast<size_t>(pointer)),
          byte_size(size) {}

    /**
     * @note The names of the attributes have been kept short in order to save some memory (there may be a lot of
     * "ov::Constant" nodes in a model). Also, two characters should be sufficient to avoid collisions. "np" stands for
     * "NPU pointer", "ns" for "NPU size".
     */
    bool visit_attributes(ov::AttributeVisitor& visitor) override {
        visitor.on_attribute("np", memory_pointer);
        visitor.on_attribute("ns", byte_size);
        return true;
    }

    size_t memory_pointer;
    size_t byte_size;
};

}  // namespace intel_npu
```
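The idea behind `WeightsPointerAttribute` (record where a weights buffer lives and how large it is, instead of copying its bytes) can be illustrated without OpenVINO. The sketch below is only an illustration under stated assumptions: `WeightsRef`, `make_weights_ref` and `weights_view` are hypothetical names that do not appear in the PR, and it uses `std::uintptr_t` for the address where the PR uses `size_t`. Such a reference is only meaningful inside the same process and while the original buffer is still alive.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, OpenVINO-free stand-in for the attribute in the diff:
// it records the address and byte size of a weights buffer, not its contents.
struct WeightsRef {
    std::uintptr_t address;  // buffer address stored as an integer
    std::size_t byte_size;   // size of the buffer in bytes
};

// Builds a reference to an existing buffer; no weights bytes are copied.
inline WeightsRef make_weights_ref(const void* pointer, std::size_t size) {
    return WeightsRef{reinterpret_cast<std::uintptr_t>(pointer), size};
}

// Recovers a read-only view of the original buffer from the stored address.
// Valid only in the same process and while the buffer has not been freed.
inline const std::uint8_t* weights_view(const WeightsRef& ref) {
    return reinterpret_cast<const std::uint8_t*>(ref.address);
}
```

A consumer that receives the two integers can read the weights in place through `weights_view`, which is what makes a copy-free serialization path possible.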
@@ -75,6 +75,11 @@ class writer_streambuf final : public std::streambuf {

```cpp
        }
    }

    pos_type seekpos(pos_type pos, std::ios_base::openmode which) override {
        writeIt = startIt + pos;
        return pos;
    }

    OutputIt startIt;
    OutputIt writeIt;
};
```

Review comment on `seekpos`: The new serialization algorithm is based on …
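Overriding `seekpos` is what lets `std::ostream::seekp` move the write position of a custom stream buffer, so a serializer can go back and overwrite bytes it emitted earlier. The following is a minimal, self-contained sketch of that mechanism and not the plugin's implementation: `BufferWriter` and `write_with_backpatch` are hypothetical names, the buffer is a `std::vector<char>` rather than a generic output iterator, and the bounds check in `seekpos` is an addition that the diff above does not have.

```cpp
#include <cstddef>
#include <ios>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for a seekable write-only stream buffer.
class BufferWriter final : public std::streambuf {
public:
    explicit BufferWriter(std::vector<char>& storage) : storage_(storage), write_pos_(0) {}

protected:
    // Called for single characters because no put area is configured.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof())) {
            return traits_type::not_eof(ch);
        }
        put(traits_type::to_char_type(ch));
        return ch;
    }

    // Called for bulk writes.
    std::streamsize xsputn(const char* data, std::streamsize count) override {
        for (std::streamsize i = 0; i < count; ++i) {
            put(data[i]);
        }
        return count;
    }

    // Moves the write position; std::ostream::seekp(pos) ends up here.
    pos_type seekpos(pos_type pos, std::ios_base::openmode) override {
        const std::streamoff offset = static_cast<std::streamoff>(pos);
        if (offset < 0 || static_cast<std::size_t>(offset) > storage_.size()) {
            return pos_type(off_type(-1));  // signals failure to the stream
        }
        write_pos_ = static_cast<std::size_t>(offset);
        return pos;
    }

private:
    // Overwrites in place when inside the buffer, appends otherwise.
    void put(char c) {
        if (write_pos_ < storage_.size()) {
            storage_[write_pos_] = c;
        } else {
            storage_.push_back(c);
        }
        ++write_pos_;
    }

    std::vector<char>& storage_;
    std::size_t write_pos_;
};

// Writes a placeholder header, then seeks back and patches it.
inline std::string write_with_backpatch() {
    std::vector<char> storage;
    BufferWriter buffer(storage);
    std::ostream out(&buffer);
    out << "0000payload";  // placeholder header followed by data
    out.seekp(0);          // routed to BufferWriter::seekpos
    out << "0007";         // overwrite the placeholder in place
    return std::string(storage.begin(), storage.end());
}
```

Without the `seekpos` override the base-class implementation reports failure, so `seekp` would set `failbit` on the stream and the second write would be dropped.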
82 changes: 0 additions & 82 deletions
src/plugins/intel_npu/src/compiler_adapter/include/ir_serializer.hpp
This file was deleted.
Will tune this later to see which value yields the best performance. For now, we assume 0 is the best candidate (only weights pointers & sizes are stored).
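One way to read the threshold mentioned in this comment is sketched below. This is an assumption, not the PR's code: `WeightsStorage` and `choose_storage` are hypothetical names, and the direction of the comparison (buffers of at least `threshold` bytes are stored as pointer and size, smaller ones are copied inline) is inferred only from the statement that a threshold of 0 stores nothing but pointers and sizes.

```cpp
#include <cstddef>

// Hypothetical names; the PR's actual types and option handling are not shown on this page.
enum class WeightsStorage {
    InlineCopy,     // the weights bytes are written into the serialized model
    PointerAndSize  // only the buffer address and byte size are written
};

// Assumed rule: buffers of at least `threshold` bytes are referenced by
// pointer and size; smaller buffers are copied inline.
// With threshold == 0, every buffer is stored as pointer and size.
inline WeightsStorage choose_storage(std::size_t byte_size, std::size_t threshold) {
    return byte_size >= threshold ? WeightsStorage::PointerAndSize
                                  : WeightsStorage::InlineCopy;
}
```

Under this reading, raising the threshold trades a larger serialized model (more inline copies) for fewer per-constant attributes, which is the kind of balance the comment says will be tuned later.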