-
Couldn't load subscription status.
- Fork 2.8k
[NPUW] Phi-3 2K accuracy fix #32426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NPUW] Phi-3 2K accuracy fix #32426
Conversation
2a1bacc to
62a8876
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try to minimize the diff
| public: | ||
| OPENVINO_MATCHER_PASS_RTTI("npuw::LLMCompiledModel::Phi3SlidingMask"); | ||
|
|
||
| explicit Phi3SlidingMask() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems in this particular case explicit is not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
| // Searching for Phi3 sliding mask pattern to extend it to work with right-padded | ||
| // past tokens and left-padded present tokens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Searching for Phi3 sliding mask pattern to extend it to work with right-padded | |
| // past tokens and left-padded present tokens. | |
| // Search for the Phi3 sliding mask pattern to extend it to work with right-padded | |
| // past tokens and left-padded present tokens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
| // past tokens = 3 | ||
| // present tokens = 5 (starting with current_pos_id = 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity, how can we get 5 present tokens with 3 past tokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be done in speculative decoding case, for example
However, 3 and 5 case was described as general simple case for dynamic shapes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you introduced a new pass, why there are changes to the existing code? Have you also moved stuff around?
Since the changes are quite complex nowadays, I'd avoid the unnecessary diffs to focus only on the relevant ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
f7e27e9 to
4fed7fb
Compare
|
|
||
| LOG_DEBUG("Try patch Phi-3 sliding window mask, if it exists."); | ||
| patch_phi3_sliding_mask(kvcache_model); | ||
| ov::save_model(kvcache_model, "swa_patched.xml"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
6fadef6 to
8b1a295
Compare
c4bb342 to
b142dc0
Compare
Details:
Phi3SlidingMaskpass that replaces formula :to
to create correct final sliding window mask for left-padded present tokens and right-padded past ones
Tickets: