[AutoParallel] Refactor llama3.1 model in intermediate api #2859
base: develop
Conversation
Thanks for your contribution!
Codecov Report

❌ Patch coverage is 19.14%, below the target coverage of 80.00%. You can increase the patch coverage or adjust the target coverage.

@@           Coverage Diff            @@
##             develop    #2859  +/-  ##
=========================================
  Coverage          ?   31.91%
=========================================
  Files             ?      419
  Lines             ?    67382
  Branches          ?        0
=========================================
  Hits              ?    21505
  Misses            ?    45877
  Partials          ?        0
    register_pp_reshard_information(config.num_hidden_layers)
except:
    print("Not register llama pp reshard information.")
Under what circumstances can this call fail? And what is the impact of not registering this reshard information?
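A narrower handler that logs the reason would make the failure mode visible. A minimal sketch, assuming `register_pp_reshard_information` and a module-level `logger` are already in scope:

```python
try:
    # Register the pipeline-parallel reshard mapping for every transformer layer.
    register_pp_reshard_information(config.num_hidden_layers)
except Exception as e:  # a bare `except:` would also swallow SystemExit/KeyboardInterrupt
    logger.warning(
        "Failed to register llama pp reshard information: %s. "
        "Pipeline-parallel resharding may fall back to the default behavior.",
        e,
    )
```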
Whether the model's input and output word embeddings should be tied. Note that this is only relevant if the
model has an output word embedding layer.
run_single_model (`bool`, *optional*, defaults to `False`):
If this is meant to express a non-parallel mode, the name is not self-explanatory. Please rename it so developers can understand it more easily, e.g. run_without_parallelism or run_in_non_parallel_mode. If dp and sharding are still allowed in this mode, consider whether there is a more suitable name.
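A purely illustrative shape for the rename; the field name `run_in_non_parallel_mode` is just one of the suggestions above, and the class name `AutoParallelArguments` is made up for the sketch:

```python
from dataclasses import dataclass, field


@dataclass
class AutoParallelArguments:
    # Hypothetical replacement for `run_single_model`, per the naming suggestion above.
    run_in_non_parallel_mode: bool = field(
        default=False,
        metadata={"help": "Run the model without tensor/pipeline/expert parallelism."},
    )
```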
    any(architecture in str(config.architectures) for architecture in architectures_to_check)
    and training_args.data_parallel_degree > 1
):
    training_args.use_expert_parallel = True
Is EP allowed in single-card mode?
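If expert parallelism should stay off on a single card, the switch could additionally be gated on the actual world size. A rough sketch, reusing the names from the diff above:

```python
import paddle.distributed as dist

is_moe_model = any(
    architecture in str(config.architectures) for architecture in architectures_to_check
)
# Only turn on expert parallelism when more than one rank is actually running;
# a single-card job keeps the default (dense) execution path.
if is_moe_model and training_args.data_parallel_degree > 1 and dist.get_world_size() > 1:
    training_args.use_expert_parallel = True
```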
    training_args.use_expert_parallel = True

if model_args.continue_training:
    # NOTE(gongenlei): new add
Remove the NOTE.
if training_args.autotuner_benchmark:
    model = model_class.from_config(config, dtype=dtype)
else:
    model = model_class.from_pretrained(
For warm start, shouldn't this follow the paddle.LazyGuard pattern used below?
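For reference, the LazyGuard pattern the comment alludes to might look roughly like this; whether `from_pretrained` should be wrapped at all is exactly the open question, and the argument names follow the usual PaddleNLP convention rather than this PR's code:

```python
import paddle

if training_args.autotuner_benchmark:
    # Benchmark path: build parameters from config only, no checkpoint needed.
    model = model_class.from_config(config, dtype=dtype)
else:
    # Warm start: LazyGuard defers parameter allocation so weights can be
    # materialized directly on their target placements when loaded.
    with paddle.LazyGuard():
        model = model_class.from_pretrained(
            model_args.model_name_or_path,
            config=config,
            dtype=dtype,
        )
```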
liym27
left a comment
LGTM
Before submitting

- Add test cases into the tests folder. If there are codecov issues, please add test cases first.

PR types

Function optimization

PR changes

Models

Description
Building on the trainer refactor (#2801), this PR optimizes the llama 3.1 model networking interface.
The model graph is now built uniformly through modeling.py; modeling_network.py and modeling_auto.py are removed.
The logic of this PR is as follows:
Users enable auto parallelism in the yaml config as follows:
Intermediate-level API, dynamic semi-auto parallel:
Single card:
Dynamic manual parallel:
Launch the llama3 script: