-
Notifications
You must be signed in to change notification settings - Fork 575
fix: set multiprocessing start method to 'fork' in main function(python3.14 defaults to forkserver) #5019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: devel
Are you sure you want to change the base?
Conversation
…on3.14 defaults to forkserver)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a compatibility issue with Python 3.14, which defaults to the 'forkserver' multiprocessing start method instead of 'fork'. The change explicitly sets the multiprocessing start method to 'fork' at the application entry point to maintain consistent behavior across Python versions.
Key Changes:
- Added multiprocessing start method configuration in the main function to explicitly set it to 'fork'
- Included error handling with logging for both successful configuration and failures
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds a startup block in Changes
Sequence Diagram(s)sequenceDiagram
participant Main as deepmd/main.py
participant MP as multiprocessing
participant Log as logger
participant CLI as arg parsing & backend validation
Main->>MP: import multiprocessing
alt import succeeds
Main->>MP: set_start_method("fork", force=True)
alt set_start_method succeeds
MP-->>Log: return success
Log-->>Main: "multiprocessing start method set to fork"
else set_start_method fails
MP-->>Log: raise Exception
Log-->>Main: warn("failed to set start method")
end
else import fails
MP-->>Log: ImportError
Log-->>Main: warn("multiprocessing import failed")
end
Main->>CLI: continue with argument parsing & backend validation
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
deepmd/main.py (1)
992-993
: Past review suggestions not addressed.Previous reviews suggested:
- Line 992: Change
logging.info()
tologging.debug()
since successful initialization typically doesn't need info-level logging- Line 993: Catch specific exceptions (
RuntimeError
,ValueError
) instead of broadException
These suggestions remain valid. The static analysis tool also flags the broad exception catching.
As per coding guidelines, please run
ruff check .
before committing to address the BLE001 warning.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepmd/main.py
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Always run
ruff check .
andruff format .
before committing changes to Python code
Files:
deepmd/main.py
🪛 Ruff (0.14.0)
deepmd/main.py
993-993: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Test C++ (true)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test C++ (false)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
🔇 Additional comments (1)
deepmd/main.py (1)
987-994
: Logging occurs before CLI configuration is applied.This code logs messages before
parse_args()
(line 996) processes the--log-level
and--log-path
arguments. The log messages will use Python's default logging configuration, so user-specified log levels and paths won't apply to these early messages.This is acceptable for early initialization messages, but be aware that users won't be able to control these logs via CLI arguments.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #5019 +/- ##
==========================================
- Coverage 84.23% 84.23% -0.01%
==========================================
Files 709 709
Lines 70074 70082 +8
Branches 3619 3619
==========================================
+ Hits 59026 59032 +6
- Misses 9880 9882 +2
Partials 1168 1168 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this PR?
logging.debug("Successfully set multiprocessing start method to 'fork'.") | ||
else: | ||
logging.debug("Skipping fork start method on Windows (not supported).") | ||
except (RuntimeError, ValueError) as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of try...except
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python3.14 by default use forkserver as the start mode of multiprocessing. As a result, with this mode, the NUM_WORKERS will be set to 0. So the setting of this mode is required at the beginning of the program. The try and except is for the compatibility of windows since it does not support this behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why NUM_WORKERS
is set to 0. @caic99 Do you know details?
This module is the command line module. It seems you should move to the PyTorch module.
It seems that line 996 is for Windows. Will it touch line 997?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why
NUM_WORKERS
is set to 0. @caic99 Do you know details?
The current implementation of deepmd dataloader requires either using fork as the worker launching method, or setting the number of workers to 0, or an error will be raised. I've tried it in earlier versions of Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why
NUM_WORKERS
is set to 0. @caic99 Do you know details?The current implementation of deepmd dataloader requires either using fork as the worker launching method, or setting the number of workers to 0, or an error will be raised. I've tried it in earlier versions of Python.
Is there any suggestion of how to set this mode in a better way?
Summary by CodeRabbit