
Conversation

HydrogenSulfate (Collaborator) commented Nov 25, 2024

Summary of this PR:

  1. Upload the DPA-1 related code.
  2. Merge a large amount of code from the develop branch.
  3. Add all eager composite operators except softmax_grad, p_norm_grad, split_grad, and concat_grad to the composite operator blacklist (https://github.com/deepmodeling/deepmd-kit/pull/4414/files#diff-e678abb052b278f8a479f8d13b839a9ec0effd9923478a850bc13758f918e1e9R134-R148). This significantly improves model execution speed, reducing the overhead relative to PyTorch from roughly 100% to about 10-15% (see the sketch below the list).
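
For readers unfamiliar with the mechanism, a minimal sketch of how such a blacklist could be wired up is given below. It is an illustration only: `core.set_prim_eager_enabled` and `core._set_prim_backward_blacklist` are assumed from recent Paddle releases rather than copied from this diff, and `all_composite_grad_ops` is a placeholder for the full operator list in the linked file.

```python
# Hypothetical sketch, not the code from this PR: enable Paddle's eager
# composite (prim) gradients while blacklisting every composite grad op
# except the handful that measured faster in decomposed form.
from paddle.framework import core

# Grad ops that keep the decomposed (composite) implementation.
KEEP_COMPOSITE = {"softmax_grad", "p_norm_grad", "split_grad", "concat_grad"}


def enable_eager_prim(all_composite_grad_ops: list[str], enable: bool = True) -> None:
    """Enable eager composite gradients, blacklisting all but KEEP_COMPOSITE."""
    core.set_prim_eager_enabled(enable)
    if enable:
        blacklist = [op for op in all_composite_grad_ops if op not in KEEP_COMPOSITE]
        # Assumed Paddle internal API for registering the backward blacklist.
        core._set_prim_backward_blacklist(*blacklist)
```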

Related PR: lanpa/tensorboardX#728

Training curve:

[Figure: training_curves_comparison_eager_opt]

Accuracy test (left: Paddle, right: PyTorch):

[Figure: accuracy comparison, Paddle vs. PyTorch]

Related optimizations of the Paddle framework:

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced several new classes for molecular descriptors, including DescrptDPA1, DescrptBlockSeAtten, and LayerNorm, enhancing the modeling capabilities for molecular simulations.
    • Added new JSON configuration files for model parameters and multitask models related to water simulations.
    • Implemented new test classes for validating the functionality of the DPAtomicModel and various descriptor classes.
    • Added new test classes for evaluating denoising models, including TestDenoiseModelDPA1 and TestDenoiseModelDPA2.
    • Enhanced the ModelWrapper class to clarify the handling of model parameters and state management.
  • Bug Fixes

    • Improved internal logic for handling model state saving and loading, ensuring consistency in outputs.
  • Documentation

    • Enhanced type hints and return annotations across various classes and methods for better clarity.
  • Tests

    • Expanded the testing framework with new test cases for denoising models and descriptor functionalities, ensuring robust validation of features.
    • Activated previously skipped tests for energy models, improving test coverage.
    • Enhanced multitask training tests with new configuration handling and test classes.

HydrogenSulfate and others added 30 commits November 2, 2024 11:14
njzjz requested a review from iProzd November 30, 2024 20:31
HydrogenSulfate (Collaborator, Author) commented:

@iProzd The "pd: DPA-1" and "pd: DPA-2" PRs are ready for review.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
source/tests/pd/model/water/multitask_sharefit.json (2)

11-15: Document the rationale behind selection parameters

The selection parameters [46, 92, 4] seem arbitrary. Consider adding a comment explaining why these specific values were chosen and their significance in the context of water molecule simulations.
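
One hypothetical way to act on this suggestion is shown below; the `_comment` key and the neighbor-statistics provenance are assumptions about how the rationale could be recorded, not content from this PR.

```python
# Illustrative descriptor fragment with the suggested rationale recorded
# alongside the values. The "_comment" key and the neighbor-stat provenance
# are assumptions, not taken from multitask_sharefit.json.
descriptor_cfg = {
    "type": "se_atten",
    "sel": [46, 92, 4],
    "_comment": "per-type max neighbor counts, estimated from neighbor "
                "statistics of the water training data",
}
```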


56-61: Consider adjusting the minimum learning rate

The stop_lr value of 3.51e-08 is extremely small. Consider using a larger value (e.g., 1e-6) as training might not benefit from such small learning rates and could unnecessarily extend training time.
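
To make the concern concrete, here is a rough illustration of a deepmd-style exponential decay that reaches this stop_lr. The start_lr, num_steps, and decay_steps values are made up, since the full JSON is not shown, and the exact schedule deepmd-kit uses may differ.

```python
# Rough illustration with assumed values (start_lr, num_steps, decay_steps are
# not from the JSON under review); deepmd-kit's exact schedule may differ.
start_lr, stop_lr = 1.0e-3, 3.51e-8
num_steps, decay_steps = 100_000, 5_000

# Per-interval decay factor chosen so the rate hits stop_lr at num_steps.
decay_rate = (stop_lr / start_lr) ** (decay_steps / num_steps)


def lr_at(step: int) -> float:
    return start_lr * decay_rate ** (step // decay_steps)


print(f"{lr_at(50_000):.2e}")   # ~5.9e-06: already tiny at mid-training
print(f"{lr_at(100_000):.2e}")  # ~3.5e-08: the stop_lr in question
```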

source/tests/pd/test_multitask.py (1)

71-76: Document the conditions for parameter sharing

The conditions for parameter sharing in fitting networks exclude certain parameters (bias_atom_e and case_embd). Consider adding a comment explaining why these specific parameters are excluded from sharing.
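
As a hypothetical illustration of the rule being reviewed (not the actual deepmd-kit sharing code), the exclusion can be pictured as filtering a state dict before copying it between fitting networks:

```python
# Hypothetical sketch of sharing fitting-net parameters across tasks while
# keeping per-task quantities (atomic energy bias, case embedding) private.
# Key names and structure are illustrative, not the real deepmd-kit layout.
PRIVATE_SUFFIXES = ("bias_atom_e", "case_embd")


def share_fitting_params(src: dict, dst: dict) -> dict:
    """Return dst updated with src's parameters, except the private ones."""
    shared = {k: v for k, v in src.items() if not k.endswith(PRIVATE_SUFFIXES)}
    return {**dst, **shared}
```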

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 95ac41d and 63ccdbd.

📒 Files selected for processing (2)
  • source/tests/pd/model/water/multitask_sharefit.json (1 hunks)
  • source/tests/pd/test_multitask.py (6 hunks)
🧰 Additional context used
🪛 GitHub Check: CodeQL
source/tests/pd/test_multitask.py

[notice] 48-48: Unused global variable
The global variable 'multitask_sharefit_template' is not used.

🔇 Additional comments (2)
source/tests/pd/model/water/multitask_sharefit.json (1)

64-82: LGTM! Well-structured loss weighting strategy

The loss configuration implements a good strategy by:

  1. Initially prioritizing force accuracy (pref_f: 1000 → 1)
  2. Gradually increasing energy contribution (pref_e: 0.02 → 1)
  3. Maintaining identical configurations for both models
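
For intuition about the strategy above: DeePMD-style losses typically move each prefactor from its start value toward its limit value as the learning rate decays. The snippet below sketches that schedule assuming the common linear-in-learning-rate interpolation, which may not match deepmd-kit's Paddle implementation exactly.

```python
# Assumed schedule: prefactors interpolate between start and limit values in
# proportion to the current learning rate. Illustrative only; the deepmd-kit
# loss code may use a slightly different formula.
def pref(lr: float, start_lr: float, start_pref: float, limit_pref: float) -> float:
    return limit_pref + (start_pref - limit_pref) * lr / start_lr


start_lr = 1.0e-3
for lr in (1.0e-3, 1.0e-4, 1.0e-5):
    pe = pref(lr, start_lr, start_pref=0.02, limit_pref=1.0)    # energy weight rises
    pf = pref(lr, start_lr, start_pref=1000.0, limit_pref=1.0)  # force weight falls
    print(f"lr={lr:.0e}  pref_e={pe:.3f}  pref_f={pf:.1f}")
```
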
source/tests/pd/test_multitask.py (1)

43-48: ⚠️ Potential issue

Remove or utilize the unused template variable

The multitask_sharefit_template variable is loaded but never used in the code. Either:

  1. Remove it if it's not needed, or
  2. Update the tests to utilize this shared fitting configuration
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 48-48: Unused global variable
The global variable 'multitask_sharefit_template' is not used.

njzjz (Member) commented Dec 13, 2024

@coderabbitai resolve

njzjz requested a review from iProzd December 13, 2024 20:38
njzjz enabled auto-merge December 17, 2024 22:27
njzjz added this pull request to the merge queue December 17, 2024
Merged via the queue into deepmodeling:devel with commit e8167ce on December 18, 2024
60 checks passed
HydrogenSulfate added a commit to HydrogenSulfate/deepmd-kit that referenced this pull request Dec 18, 2024
HydrogenSulfate deleted the add_dpa1 branch December 19, 2024 07:56
github-merge-queue bot pushed a commit that referenced this pull request Dec 25, 2024
Support DPA-2 in the Paddle backend. This PR will be updated after #4414 is merged.

### Training curve:


![training_curves_comparison_dpa2](https://github.com/user-attachments/assets/29bdeffa-cf2d-4586-afcf-7df0569997c3)



### Accuracy test (left: Paddle, right: PyTorch):


![image](https://github.com/user-attachments/assets/5bff55f3-1c39-4b95-93f0-68783e794716)


Related optimizations of the Paddle framework:
- [x] PaddlePaddle/Paddle#69349
- [x] PaddlePaddle/Paddle#69333
- [x] PaddlePaddle/Paddle#69479
- [x] PaddlePaddle/Paddle#69515
- [x] PaddlePaddle/Paddle#69487
- [x] PaddlePaddle/Paddle#69661
- [x] PaddlePaddle/Paddle#69660
- [x] PaddlePaddle/Paddle#69596
- [x] PaddlePaddle/Paddle#69556

## Summary by CodeRabbit

- **New Features**
  - Introduced new classes for molecular descriptors: `DescrptDPA2`, `DescrptBlockRepformers`, `DescrptSeTTebd`, and `DescrptBlockSeTTebd`.
  - Added new functions for tensor operations and descriptor management, enhancing the capabilities of the module.
  - Updated JSON configurations for multitask models to refine selection criteria and data paths.

- **Bug Fixes**
  - Improved error handling and parameter validation across various descriptor classes.

- **Documentation**
  - Enhanced test coverage for new descriptor functionalities and configurations.

- **Tests**
  - Added new test classes to validate the functionality of `DescrptDPA2` and multitask training scenarios.
  - Expanded test capabilities for descriptor classes based on installed dependencies.
  - Updated existing tests to support new configurations and functionalities.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>