feat: add huber loss #4684
Conversation
📝 Walkthrough
This pull request integrates a conditional Huber loss computation into multiple loss modules across different frameworks. A new function, `custom_huber_loss`, is added to each framework's energy loss module and is applied to the energy, force, virial, and atomic-energy terms when enabled.
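For orientation, here is a minimal, framework-agnostic sketch of such a Huber helper (NumPy-based; the actual per-framework implementations in this PR may differ in reduction and validation):

```python
import numpy as np


def custom_huber_loss(predictions, targets, delta=1.0):
    """Piecewise Huber loss: quadratic within ``delta``, linear beyond it (mean-reduced)."""
    error = targets - predictions
    abs_error = np.abs(error)
    quadratic = 0.5 * error**2
    linear = delta * (abs_error - 0.5 * delta)
    return np.mean(np.where(abs_error <= delta, quadratic, linear))


# Example: small errors are penalized quadratically, large ones only linearly.
pred = np.array([0.0, 0.0, 0.0])
target = np.array([0.005, 0.02, 1.0])
print(custom_huber_loss(pred, target, delta=0.01))
```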
Actionable comments posted: 0
🔭 Outside diff range comments (1)
source/tests/universal/dpmodel/loss/test_loss.py (1)
57-102: 💡 Verification agent · 🧩 Analysis chain
Consider extending Huber loss to Energy Spin loss
Since `LossParamEnergy` now supports Huber loss via the `use_huber` parameter, consider whether this functionality should also be added to `LossParamEnergySpin`. If Huber loss is generally beneficial for energy-related losses, it would make sense to implement it consistently across similar loss functions.
🏁 Script executed:
```bash
#!/bin/bash
# Check if Huber loss is implemented in the EnergySpinLoss class

# Search for EnergySpinLoss class implementation
rg "class EnergySpinLoss" -A 5

# Check if there's a spin-specific loss implementation that might need Huber loss
rg "def custom_huber_loss" -A 5
```
Length of output: 2124
Action: Consider Extending Huber Loss Support for Energy Spin Loss
After verifying the codebase, it appears that while custom Huber loss is implemented for various energy loss functions (as seen in files like `deepmd/tf/loss/ener.py`, `deepmd/pd/loss/ener.py`, and `deepmd/dpmodel/loss/ener.py`), the `EnergySpinLoss` class in `deepmd/pt/loss/ener_spin.py` does not incorporate Huber loss functionality. Given that `LossParamEnergy` now includes support for Huber loss via the `use_huber` parameter, it may be beneficial to align the Energy Spin loss implementation for consistency and potentially improved robustness.
Points for Consideration:
- Parameter Addition: Introduce a `use_huber` (or similar) parameter in `EnergySpinLoss` to enable Huber loss behavior.
- Functionality Parity: Leverage the existing `custom_huber_loss` implementations as a guideline for this extension.
- Test Updates: Update the tests in `source/tests/universal/dpmodel/loss/test_loss.py` to cover scenarios with and without Huber loss when using the Energy Spin loss variant.
🧹 Nitpick comments (12)
source/tests/universal/dpmodel/loss/test_loss.py (1)
17-17: Consider adding `huber_delta` parameter to tests
While you've added the `use_huber` parameter, I notice there's no corresponding `huber_delta` parameter in the test configuration, which was mentioned in other files. Consider adding this parameter to fully test the Huber loss functionality with different delta values, which controls the transition point between L1 and L2 loss.
```diff
-def LossParamEnergy(
-    starter_learning_rate=1.0,
-    pref_e=1.0,
-    pref_f=1.0,
-    pref_v=1.0,
-    pref_ae=1.0,
-    use_huber=False,
-):
+def LossParamEnergy(
+    starter_learning_rate=1.0,
+    pref_e=1.0,
+    pref_f=1.0,
+    pref_v=1.0,
+    pref_ae=1.0,
+    use_huber=False,
+    huber_delta=1.0,
+):
```
Then update the input_dict:
"start_pref_ae": pref_ae, "limit_pref_ae": pref_ae / 2, "use_huber": use_huber, + "huber_delta": huber_delta,And possibly add it to parameterization:
"pref_v": (1.0, 0.0), "pref_ae": (1.0, 0.0), "use_huber": (False, True), + "huber_delta": (1.0, 0.5),Also applies to: 36-36, 49-49
deepmd/pt/loss/ener.py (2)
26-32: Consider validating the `delta` parameter in the new `custom_huber_loss`.
While the implementation is standard Huber logic, it would be safer to raise an error if `delta < 0`. For instance:
```diff
 def custom_huber_loss(predictions, targets, delta=1.0):
+    if delta < 0:
+        raise ValueError("The Huber delta should be non-negative.")
     error = targets - predictions
     ...
```
214-221: Duplicate Huber vs. L2 logic across multiple loss branches.
Repeated `if not self.use_huber … else …` structures for energy, force, virial, and atom energy may be condensed into a helper function to reduce duplication. This could simplify maintenance and ensure consistency.
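One possible shape for such a helper in the PyTorch module (a sketch only; `_term_loss` and the exact reductions are illustrative, not code from the PR):

```python
import torch
import torch.nn.functional as F


def custom_huber_loss(predictions, targets, delta=1.0):
    # piecewise Huber loss with mean reduction, as described in this PR
    error = targets - predictions
    abs_error = torch.abs(error)
    quadratic = 0.5 * error**2
    linear = delta * (abs_error - 0.5 * delta)
    return torch.mean(torch.where(abs_error <= delta, quadratic, linear))


def _term_loss(pred, label, use_huber, huber_delta):
    """Select between plain L2 and Huber loss for a single loss term (illustrative helper)."""
    if not use_huber:
        return F.mse_loss(pred, label, reduction="mean")
    return custom_huber_loss(pred, label, delta=huber_delta)
```

Each of the energy, force, virial, and atom-energy branches could then call `_term_loss(...)` once, scaled by its prefactor, instead of repeating the conditional.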
deepmd/dpmodel/loss/ener.py (2)
20-27: Consider enforcing non-negative `delta` in `custom_huber_loss`.
The logic is correct, but raising an error when `delta` is negative would prevent anomalous results:
```diff
 def custom_huber_loss(predictions, targets, delta=1.0):
+    if delta < 0:
+        raise ValueError("The Huber delta should be non-negative.")
     xp = array_api_compat.array_namespace(predictions, targets)
     ...
```
171-179: Introducing Huber-based energy loss.
Consolidating L2 and Huber under a single condition is correct, though you could factor out the repeated logic to reduce duplication.
deepmd/pd/loss/ener.py (3)
56-57: Consider reviewing default Huber parameters
Setting `use_huber=False` by default is fine, but a delta of `0.01` might be quite small for many scenarios. You may wish to confirm whether this default threshold is broadly applicable or should be adjusted to a more typical value (e.g., 1.0).
103-109: Improve the parameter documentation
These added docstrings are helpful. You might consider including a usage example or linking to references explaining Huber loss advantages in training to guide new users.
146-150: Runtime check is clear
Raising a `RuntimeError` when Huber loss is combined with force-based configurations not yet supported is a good defensive measure. Consider adding a note in the docstring clarifying that Huber loss is unavailable for those terms.
deepmd/tf/loss/ener.py (4)
76-84: Documentation for Huber parameters
Clear explanation of how Huber loss transitions from L2 to L1. Including a brief usage example could further clarify for new users.
138-143: Restriction on combining Huber loss with certain force features
The explicit runtime error is a good safeguard. If future implementation is planned, consider explaining the rationale or timeline in a comment or docstring.
357-357: Naming consideration for the summary
`l2_loss_summary` might be misleading if Huber loss is used. Consider a more general label like `loss_summary`.
386-386: Storing and returning total loss
Assigning `self.l2_l = loss` continues the older naming. You might rename it for clarity, e.g. `self.total_loss`, since the loss is no longer purely L2 when Huber loss is enabled.
Also applies to: 388-388
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- deepmd/dpmodel/loss/ener.py (11 hunks)
- deepmd/pd/loss/ener.py (11 hunks)
- deepmd/pt/loss/ener.py (11 hunks)
- deepmd/tf/loss/ener.py (9 hunks)
- deepmd/utils/argcheck.py (2 hunks)
- source/tests/consistent/loss/test_ener.py (3 hunks)
- source/tests/universal/dpmodel/loss/test_loss.py (3 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
source/tests/consistent/loss/test_ener.py (1)
source/tests/consistent/loss/common.py (1)
LossTest(4-5)
deepmd/pt/loss/ener.py (4)
deepmd/dpmodel/loss/ener.py (1)
custom_huber_loss (20-27)
deepmd/pd/loss/ener.py (1)
custom_huber_loss (26-32)
deepmd/tf/loss/ener.py (1)
custom_huber_loss (28-34)
source/tests/consistent/loss/test_ener.py (1)
data (64-78)
⏰ Context from checks skipped due to timeout of 90000ms (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Analyze (python)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Test C++ (true)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build C++ (cpu, cpu)
🔇 Additional comments (39)
source/tests/universal/dpmodel/loss/test_loss.py (3)
17-17: LGTM! New parameter for Huber loss.
The addition of the `use_huber` parameter with a default value of `False` ensures backward compatibility while enabling the new Huber loss functionality.
36-36: Good implementation of the parameter in the input dictionary.
The `use_huber` parameter is correctly passed through to the output dictionary, making it available for the loss calculations.
49-49: LGTM! Proper test parameterization for the new feature.
The test parameterization correctly includes both `False` and `True` values for `use_huber`, ensuring test coverage for both standard and Huber loss calculations.
deepmd/utils/argcheck.py (2)
2449-2457: Well-documented Huber loss parameters.
The documentation clearly explains the purpose and behavior of the Huber loss function, including how it transitions between L2 and L1 loss based on the threshold delta.
2571-2584: Good implementation of Huber loss parameters.
The implementation of `use_huber` and `huber_delta` parameters is clean and follows the existing pattern in the codebase. The default values (`False` for `use_huber` and `0.01` for `huber_delta`) provide sensible defaults for this functionality.
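A sketch of what that pattern typically looks like with `dargs` Argument declarations (the doc strings and exact placement in the energy-loss argument list are assumptions, not copied from the PR):

```python
from dargs import Argument

# hypothetical doc strings; the real ones live in deepmd/utils/argcheck.py
doc_use_huber = "Enable Huber loss for energy/force/virial terms."
doc_huber_delta = "Threshold delta at which the loss switches from quadratic to linear."

huber_args = [
    Argument("use_huber", bool, optional=True, default=False, doc=doc_use_huber),
    Argument("huber_delta", float, optional=True, default=0.01, doc=doc_huber_delta),
]
```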
source/tests/consistent/loss/test_ener.py (4)
24-24: Use of parameterized decorator is a good test strategy.
Importing and using `parameterized` here is helpful for systematically testing multiple parameter sets, enhancing test coverage.
59-61: Parameterizing test scenarios for Huber usage.
This approach clearly tests both standard and Huber-enabled scenarios in a single test class, improving reliability.
65-65: Extracting the `use_huber` parameter.
This assignment is clear and consistent with the parameterized approach.
75-77: Zeroing out prefactors when `use_huber` is true.
Setting `start_pref_pf` and `limit_pref_pf` to 0.0 conditionally aligns with the code restriction that Huber loss is not combined with prefactor forces. Consider adding a negative test to confirm the expected behavior if these prefactors were non-zero when Huber is enabled.
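A hedged sketch of such a negative test (the class name `EnergyStdLoss`, the import path, and the exact keyword arguments are assumptions inferred from this review; adjust to the real constructor):

```python
import pytest

from deepmd.pt.loss.ener import EnergyStdLoss  # assumed import path and class name


def test_huber_rejects_prefactor_force():
    # Enabling Huber loss together with a non-zero prefactor-force weight
    # is expected to raise, per the runtime check added in this PR.
    with pytest.raises(RuntimeError):
        EnergyStdLoss(
            starter_learning_rate=1.0,
            start_pref_e=1.0,
            limit_pref_e=1.0,
            start_pref_pf=1.0,
            limit_pref_pf=1.0,
            use_huber=True,
        )
```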
deepmd/pt/loss/ener.py (6)
56-57: New constructor parameters for Huber support.
Defining `use_huber` and `huber_delta` in the signature is a clear extension for conditional Huber loss computation.
102-110: Well-documented new parameters.
The docstring provides a concise explanation about `use_huber` and `huber_delta`, improving clarity for maintainers.
143-151: Runtime error for unsupported Huber configurations.
Raising an exception if combined usage is not implemented prevents undefined or incorrect behavior and is a robust safeguard.
276-285: Same Huber vs. L2 logic repetition.
352-360: Same Huber vs. L2 logic repetition.
381-389: Same Huber vs. L2 logic repetition.
deepmd/dpmodel/loss/ener.py (8)
49-50: Introducing `use_huber` and `huber_delta` in the constructor.
These parameters are consistent with the new Huber loss support.
79-86: Runtime error for unimplemented force modes with Huber.
This check helps avoid silent misbehavior in cases where force-based features conflict with Huber loss.
167-167: Initializing the `loss` aggregator to 0.
A simple and effective way to accumulate terms. Looks good.
183-191: Duplicate Huber vs. L2 logic.
201-209: Duplicate Huber vs. L2 logic.
219-227: Duplicate Huber vs. L2 logic.
263-265: Capturing the final loss result.
Storing intermediate loss values in `self.l2_l` and `self.l2_more` before returning can aid in debugging or logging.
355-355: Version increment ensures backward compatibility.
Updating to version 2 properly reflects the introduction of Huber-related parameters.
deepmd/pd/loss/ener.py (7)
26-32: Validate correctness of the custom Huber loss implementation
The piecewise definition and the mean reduction closely follow the standard Huber loss formula, and the implementation looks accurate.
213-221: Energy term Huber logic
Conditionally using Huber loss for the energy term is consistent with the rest of the design. To ensure correctness, please confirm the ordering of (predictions, targets) across all calls to `custom_huber_loss` is intentional and consistent.
277-284: Force term Huber logic
The usage of Huber loss for forces is analogous to the energy term. Consider validating shapes or dimensional consistency for force arrays in case further expansions (e.g., multi-dimensional or large-batch) are introduced.
363-371: Virial term Huber logic
The conditional Huber application for virial looks correct. No issues identified.
392-400: Atomic energy term Huber logic
Similarly consistent approach for atomic energy. The code is straightforward and matches the style used for the other terms.
498-498: Version bump to 2
This ensures serialization includes the new Huber-related parameters. No further concerns.
534-534: Check version compatibility
You are correctly narrowing the acceptable versions to accommodate the new features. The lack of explicit backward handling for older versions might be acceptable if such usage is unsupported.
deepmd/tf/loss/ener.py (9)
28-34: Implementation of custom Huber loss
This follows the standard piecewise definition correctly. If desired, you could explore using `tf.keras.losses.Huber(delta=...)` to reduce custom code.
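For reference, a minimal sketch of the suggested built-in alternative (delta value and tensors are illustrative; the default Keras reduction averages over the batch, roughly matching the mean reduction of the custom helper):

```python
import tensorflow as tf

huber = tf.keras.losses.Huber(delta=0.01)

targets = tf.constant([1.0, 2.5, -0.3])
predictions = tf.constant([1.1, 2.0, -0.25])
loss = huber(targets, predictions)  # scalar Huber loss averaged over the batch
```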
106-107: New constructor parameters
Having `use_huber=False` and `huber_delta=0.01` as defaults is logically consistent. Ensure these defaults align with typical dataset scales.
293-304: Initial loss setup and energy Huber logic
Setting `loss = 0` first helps maintain consistent accumulation. The conditional Huber branch for energy properly matches the earlier lines. No major concerns.
307-307: Force term Huber logic
Mirrors the approach used for energy. Confirm the shared force shape remains valid when extending to different spin or multi-system cases.
Also applies to: 310-315
320-320: Virial term Huber logic
The usage of Huber loss for virial is parallel to energy/force. No issues noted.
Also applies to: 323-328
333-333: Atomic energy Huber logic
Using Huber for atomic energy matches the approach for energy. This keeps the code design consistent across loss components.
Also applies to: 336-341
351-351: No Huber logic for generalized force
Retaining the standard L2 path is consistent with the earlier explicit error blocking Huber for generalized force use cases.
482-482: Updated serialization version
Version 2 properly signals that the new Huber parameters are included.
520-520: Deserialization version check
This ensures alignment with the updated serialization. Minimal or no backward compatibility is acceptable if older versions are no longer in use.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- deepmd/pd/loss/ener.py (11 hunks)
- deepmd/tf/loss/ener.py (15 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pd/loss/ener.py
⏰ Context from checks skipped due to timeout of 90000ms (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Analyze (python)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C++ (cpu, cpu)
🔇 Additional comments (12)
deepmd/tf/loss/ener.py (12)
28-34: New Huber loss implementation looks good.
The implementation correctly calculates Huber loss with a smooth transition between L2 loss (quadratic) for small errors and L1 loss (linear) for large errors based on the delta threshold.
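For reference, the piecewise form being described, with error $e$ and threshold $\delta$ (matching the formula quoted in the argument documentation):

```math
L_\delta(e) =
\begin{cases}
  \tfrac{1}{2}\,e^{2}, & |e| \le \delta,\\
  \delta\left(|e| - \tfrac{1}{2}\,\delta\right), & |e| > \delta.
\end{cases}
```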
76-83: Well-documented parameters.
The documentation for the `use_huber` and `huber_delta` parameters is thorough, explaining both what the Huber loss does and how it transitions between L2 and L1 behavior.
106-107: Parameters added with sensible defaults.
The new parameters `use_huber` and `huber_delta` are added with sensible default values (`False` and `0.01` respectively), maintaining backward compatibility.
136-143: Good runtime compatibility check.
The code properly validates that Huber loss is not used with incompatible features (prefactor forces, generalized forces, or relative forces), which prevents potential issues during training.
311-323: Good conditional implementation of Huber loss for energy.
The code correctly applies either L2 or Huber loss based on the `use_huber` flag, properly normalizing the inputs to the Huber loss function.
325-334: Good conditional implementation of Huber loss for forces.
The force loss calculation correctly applies Huber loss when the flag is enabled, with proper reshaping of tensors.
338-347: Good conditional implementation of Huber loss for virial.
The virial loss calculation correctly applies Huber loss when the flag is enabled, with proper normalization and reshaping.
351-360: Good conditional implementation of Huber loss for atomic energies.
The atomic energy loss calculation correctly applies Huber loss when the flag is enabled, with proper reshaping of tensors.
375-375: Appropriate summary name update.
The summary name is updated to reflect that the loss might not be L2 loss when Huber loss is used, ensuring clarity in TensorBoard visualizations.
404-406: Variable name change from l2_loss to loss.
The variable name is appropriately changed from `l2_loss` to `loss` to accurately reflect that it may not be L2 loss when Huber loss is enabled.
500-500: Correct version update and serialization of new parameters.
The version number is appropriately updated from 1 to 2, and the new parameters `use_huber` and `huber_delta` are included in the serialized data.
Also applies to: 517-518
538-538: Version compatibility check updated.
The version compatibility check is updated to handle both version 1 and version 2, ensuring backward compatibility with previously saved models.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##            devel    #4684      +/-   ##
==========================================
+ Coverage   84.80%   84.82%   +0.01%
==========================================
  Files         692      692
  Lines       66410    66511     +101
  Branches     3539     3539
==========================================
+ Hits        56320    56418      +98
- Misses       8949     8951       +2
- Partials     1141     1142       +1
```
☔ View full report in Codecov by Sentry.
Actionable comments posted: 0
🧹 Nitpick comments (1)
deepmd/utils/argcheck.py (1)
2449-2456: Documentation formatting needs adjustment for RST compatibility.
The Huber loss documentation looks good in terms of content, but needs formatting adjustments to be consistent with other documentation in the codebase:
- Add a linebreak before each bullet point to match other docstrings in this file
- Ensure proper RST formatting for bullet points
```diff
 doc_use_huber = (
     "Enables Huber loss calculation for energy/force/virial terms with user-defined threshold delta (D). "
-    "The loss function smoothly transitions between L2 and L1 loss: \n\n"
-    "- For absolute prediction errors within D: quadratic loss (0.5 * (error**2)) \n\n"
-    "- For absolute errors exceeding D: linear loss (D * \\|error\\| - 0.5 * D) \n\n"
+    "The loss function smoothly transitions between L2 and L1 loss: \n\n\
+    - For absolute prediction errors within D: quadratic loss (0.5 * (error**2)) \n\n\
+    - For absolute errors exceeding D: linear loss (D * \\|error\\| - 0.5 * D) \n\n\
     "Formula: loss = 0.5 * (error**2) if \\|error\\| <= D else D * (\\|error\\| - 0.5 * D). "
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- deepmd/utils/argcheck.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Analyze (python)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Test C++ (true)
- GitHub Check: Analyze (c-cpp)
🔇 Additional comments (1)
deepmd/utils/argcheck.py (1)
2571-2584: The Huber loss parameters are well implemented.
The implementation of the `use_huber` and `huber_delta` parameters looks good. They follow the same pattern as other parameters in the codebase, with appropriate types, defaults, and documentation.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- deepmd/pd/loss/ener.py (11 hunks)
- deepmd/pt/loss/ener.py (11 hunks)
- deepmd/tf/loss/ener.py (15 hunks)
- deepmd/utils/argcheck.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
deepmd/pt/loss/ener.py (4)
deepmd/pd/loss/ener.py (1)
custom_huber_loss (26-32)
deepmd/tf/loss/ener.py (1)
custom_huber_loss (28-34)
deepmd/dpmodel/loss/ener.py (1)
custom_huber_loss (20-27)
source/tests/consistent/loss/test_ener.py (1)
data (64-78)
deepmd/pd/loss/ener.py (3)
deepmd/tf/loss/ener.py (1)
custom_huber_loss (28-34)
deepmd/pt/loss/ener.py (1)
custom_huber_loss (26-32)
deepmd/dpmodel/loss/ener.py (1)
custom_huber_loss (20-27)
⏰ Context from checks skipped due to timeout of 90000ms (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Analyze (python)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build C++ (cpu, cpu)
🔇 Additional comments (18)
deepmd/utils/argcheck.py (2)
2449-2456: Documentation formatting looks good with proper linebreaks and escaping
The documentation for the new Huber loss parameters is well-formatted with:
- Clear descriptions of the Huber loss functionality
- Proper linebreaks before list items
- Correctly escaped pipe characters (`|`) to prevent RST from interpreting them as links
This addresses the formatting concerns from the previous review.
2571-2584: Parameter implementation looks good
The implementation of the `use_huber` and `huber_delta` parameters is correctly done with appropriate default values:
- `use_huber` defaults to `False` (disabled by default)
- `huber_delta` defaults to `0.01` (reasonable threshold for Huber loss)
These parameters match the functionality described in the PR objectives for conditional Huber loss computation.
deepmd/pt/loss/ener.py (5)
26-32: Good implementation of the Huber loss function.
The `custom_huber_loss` function correctly implements the Huber loss, providing a smooth transition between L2 loss for small errors and L1 loss for larger errors. This helps with training stability when dealing with outliers.
56-57: Well-documented new parameters for Huber loss.
The new parameters `use_huber` and `huber_delta` are well documented, with clear explanations of what they do and how the Huber loss function works. The documentation clearly explains the threshold behavior and formula.
Also applies to: 102-109
213-221: Consistent implementation of Huber loss across different loss terms.
The implementation of Huber loss is consistent across energy, force, virial, and atomic energy loss terms. Each section correctly checks the `use_huber` flag and applies either standard L2 loss or Huber loss accordingly.
Also applies to: 276-284, 352-360, 381-389
487-487: Proper serialization of new parameters.
The new `use_huber` and `huber_delta` parameters are correctly included in the serialization method, and the version number is incremented from 1 to 2 to reflect these changes.
Also applies to: 504-505
523-523: Ensure backward compatibility in deserialization.
The deserialization method correctly uses `check_version_compatibility` to handle both old (version 1) and new (version 2) serialized data.
deepmd/pd/loss/ener.py (4)
26-32: Good implementation of Huber loss for Paddle.
The `custom_huber_loss` function correctly implements the Huber loss using Paddle's API, providing the same functionality as in other frameworks.
56-57: Good documentation for Huber loss parameters.
The documentation for `use_huber` and `huber_delta` clearly explains the parameters and the threshold behavior of Huber loss, addressing the previous review comment asking for clarification about the delta parameter.
Also applies to: 102-109
213-221: Consistent implementation of conditional Huber loss across all loss components.
The implementation correctly applies Huber loss to energy, force, virial, and atomic energy loss terms when `use_huber` is enabled, maintaining consistency with the PyTorch and TensorFlow implementations.
Also applies to: 276-284, 363-371, 392-400
498-498: Proper versioning and serialization.
The version number is correctly updated, new parameters are added to serialization, and the deserialization method appropriately handles version compatibility.
Also applies to: 515-516, 534-534
deepmd/tf/loss/ener.py (7)
28-34: Well-implemented Huber loss for TensorFlow.
The `custom_huber_loss` function correctly implements the Huber loss using TensorFlow operations, consistent with the implementations in other frameworks.
76-83: Good documentation for Huber loss parameters.
The documentation for `use_huber` and `huber_delta` clearly explains the threshold behavior and formula of the Huber loss function.
Also applies to: 106-107
136-143: Good validation of Huber loss compatibility.
The code correctly checks for incompatible features and raises informative error messages when Huber loss is used with atom prefactors, generalized forces, or relative forces.
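The kind of guard being described might look like the following sketch (the flag names `has_pf`, `has_gf`, and `relative_f` are assumptions based on this review, not the actual attributes):

```python
def check_huber_compatibility(use_huber, has_pf, has_gf, relative_f):
    """Illustrative guard only; the real check lives in the loss constructor."""
    if use_huber and (has_pf or has_gf or relative_f is not None):
        raise RuntimeError(
            "Huber loss is not implemented for prefactor, generalized, or relative forces."
        )
```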
163-164: Fixed potential uninitialized variable warnings.
These additions properly initialize variables that could otherwise be uninitialized, addressing static analysis warnings seen in previous reviews.
Also applies to: 189-190, 201-202, 210-211, 228-229, 238-239, 248-249, 308-309
311-322: Consistent implementation of conditional Huber loss.
The implementation correctly checks the `use_huber` flag and applies either standard L2 loss or Huber loss accordingly for energy, force, virial, and atomic energy loss terms.
Also applies to: 325-333, 338-346, 351-359
500-500: Proper versioning and serialization.
The version number is correctly updated, new parameters are added to serialization, and the deserialization method appropriately handles version compatibility.
Also applies to: 517-518, 538-538
311-311: Improved variable naming for clarity.
Changing from `l2_loss` to `loss` in various places makes the code more accurate since the loss is no longer exclusively L2 when Huber loss is enabled.
Also applies to: 374-375, 404-406