Version-based GPU configuration and QoS addition #1092
Conversation
This pull request was exported from Phabricator. Differential Revision: D78778304

Summary: Slurm 24.11.0rc1 and beyond no longer support GRES per task, so we need to request `gpus-per-node` in sbatch to ensure failure-free allocation. https://github.com/SchedMD/slurm/blob/master/CHANGELOG/slurm-24.11.md

Changes here:
1. Introduced Slurm-version-based GPU request configuration.
2. Introduced an optional QoS parameter that can be used to control job priority.

Reviewed By: kiukchung

Differential Revision: D78778304
Merged ae55901 into meta-pytorch:main
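With this change, a job submitted on Slurm 24.11+ with the optional QoS set would carry sbatch directives roughly like the following (job name, node count, GPU count, and QoS name are illustrative, not values from this PR):

```shell
#!/bin/bash
#SBATCH --job-name=trainer
#SBATCH --nodes=2
#SBATCH --gpus-per-node=8   # used instead of a per-task GRES request on Slurm >= 24.11
#SBATCH --qos=high          # optional QoS parameter, controls job priority
```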