-
Notifications
You must be signed in to change notification settings - Fork 575
pd: support dpa1 #4414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pd: support dpa1 #4414
Conversation
for more information, see https://pre-commit.ci
…y to coverage newly added code
for more information, see https://pre-commit.ci
edea0aa to
63ccdbd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
source/tests/pd/model/water/multitask_sharefit.json (2)
11-15: Document the rationale behind selection parametersThe selection parameters
[46, 92, 4]seem arbitrary. Consider adding a comment explaining why these specific values were chosen and their significance in the context of water molecule simulations.
56-61: Consider adjusting the minimum learning rateThe
stop_lrvalue of 3.51e-08 is extremely small. Consider using a larger value (e.g., 1e-6) as training might not benefit from such small learning rates and could unnecessarily extend training time.source/tests/pd/test_multitask.py (1)
71-76: Document the conditions for parameter sharingThe conditions for parameter sharing in fitting networks exclude certain parameters (
bias_atom_eandcase_embd). Consider adding a comment explaining why these specific parameters are excluded from sharing.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
source/tests/pd/model/water/multitask_sharefit.json(1 hunks)source/tests/pd/test_multitask.py(6 hunks)
🧰 Additional context used
🪛 GitHub Check: CodeQL
source/tests/pd/test_multitask.py
[notice] 48-48: Unused global variable
The global variable 'multitask_sharefit_template' is not used.
🔇 Additional comments (2)
source/tests/pd/model/water/multitask_sharefit.json (1)
64-82: LGTM! Well-structured loss weighting strategy
The loss configuration implements a good strategy by:
- Initially prioritizing force accuracy (pref_f: 1000 → 1)
- Gradually increasing energy contribution (pref_e: 0.02 → 1)
- Maintaining identical configurations for both models
source/tests/pd/test_multitask.py (1)
43-48:
Remove or utilize the unused template variable
The multitask_sharefit_template variable is loaded but never used in the code. Either:
- Remove it if it's not needed, or
- Update the tests to utilize this shared fitting configuration
🧰 Tools
🪛 GitHub Check: CodeQL
[notice] 48-48: Unused global variable
The global variable 'multitask_sharefit_template' is not used.
|
@coderabbitai resolve |
This reverts commit e8167ce.
Support DPA-2 in paddle backend. This PR will be updated after #4414 is merged. ### Training curve:  ### Accuracy test(left: paddle, right: torch):  Ralated optimization of Paddle framework: - [x] PaddlePaddle/Paddle#69349 - [x] PaddlePaddle/Paddle#69333 - [x] PaddlePaddle/Paddle#69479 - [x] PaddlePaddle/Paddle#69515 - [x] PaddlePaddle/Paddle#69487 - [x] PaddlePaddle/Paddle#69661 - [x] PaddlePaddle/Paddle#69660 - [x] PaddlePaddle/Paddle#69596 - [x] PaddlePaddle/Paddle#69556 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced new classes for molecular descriptors: `DescrptDPA2`, `DescrptBlockRepformers`, `DescrptSeTTebd`, and `DescrptBlockSeTTebd`. - Added new functions for tensor operations and descriptor management, enhancing the capabilities of the module. - Updated JSON configurations for multitask models to refine selection criteria and data paths. - **Bug Fixes** - Improved error handling and parameter validation across various descriptor classes. - **Documentation** - Enhanced test coverage for new descriptor functionalities and configurations. - **Tests** - Added new test classes to validate the functionality of `DescrptDPA2` and multitask training scenarios. - Expanded test capabilities for descriptor classes based on installed dependencies. - Updated existing tests to support new configurations and functionalities. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Summary of this PR:
softmax_grad,p_norm_grad,split_grad, andconcat_gradto the composite operator blacklist(https://github.com/deepmodeling/deepmd-kit/pull/4414/files#diff-e678abb052b278f8a479f8d13b839a9ec0effd9923478a850bc13758f918e1e9R134-R148) to significantly improve model execution speed (reducing the time taken from 100% more than PyTorch to about 10% to 15% more).related PR: lanpa/tensorboardX#728
Training curve:
Accuracy test(left: paddle, right: torch):
Ralated optimization of Paddle framework:
paddle.whereandpaddle.where_in eager mode PaddlePaddle/Paddle#69556Summary by CodeRabbit
Release Notes
New Features
DescrptDPA1,DescrptBlockSeAtten, andLayerNorm, enhancing the modeling capabilities for molecular simulations.DPAtomicModeland various descriptor classes.TestDenoiseModelDPA1andTestDenoiseModelDPA2.ModelWrapperclass to clarify the handling of model parameters and state management.Bug Fixes
Documentation
Tests