-> *AlgoPerf* is a suite of benchmarks and competitions to measure neural network training speedups due to algorithmic improvements in both training algorithms and models. This is the repository for the *AlgoPerf: Training Algorithms benchmark* and its associated competition. It is developed by the [MLCommons Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/). This repository holds the [**competition rules**](/COMPETITION_RULES.md), the [**technical documentation**](/DOCUMENTATION.md) of the benchmark, [**getting started guides**](/GETTING_STARTED.md), and the benchmark code. For a detailed description of the benchmark design, see our [**paper**](https://arxiv.org/abs/2306.07179).
+> *AlgoPerf* is a suite of benchmarks and competitions to measure neural network training speedups due to algorithmic improvements in both training algorithms and models. This is the repository for the *AlgoPerf: Training Algorithms benchmark* and its associated competition. It is developed by the [MLCommons Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/). This repository holds the [**competition rules**](/docs/COMPETITION_RULES.md), the [**technical documentation**](/docs/DOCUMENTATION.md) of the benchmark, [**getting started guides**](/docs/GETTING_STARTED.md), and the benchmark code. For a detailed description of the benchmark design, see our [**paper**](https://arxiv.org/abs/2306.07179).

 ---
@@ -45,9 +45,9 @@
 > [!TIP]
 > **If you have any questions about the benchmark competition or you run into any issues, please feel free to contact us.** Either [file an issue](https://github.com/mlcommons/algorithmic-efficiency/issues), ask a question on [our Discord](https://discord.gg/5FPXK7SMt6) or [join our weekly meetings](https://mlcommons.org/en/groups/research-algorithms/).

-You can install this package and dependencies in a [Python virtual environment](/GETTING_STARTED.md#python-virtual-environment) or use a [Docker/Singularity/Apptainer container](/GETTING_STARTED.md#docker) (recommended).
+You can install this package and dependencies in a [Python virtual environment](/docs/GETTING_STARTED.md#python-virtual-environment) or use a [Docker/Singularity/Apptainer container](/docs/GETTING_STARTED.md#docker) (recommended).
 We recommend using a Docker container (or alternatively, a Singularity/Apptainer container) to ensure a similar environment to our scoring and testing environments.
-Both options are described in detail in the [**Getting Started**](/GETTING_STARTED.md) document.
+Both options are described in detail in the [**Getting Started**](/docs/GETTING_STARTED.md) document.

 *TL;DR to install the Jax version for GPU run:*
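The TL;DR commands themselves are unchanged by this diff and therefore not shown here; only `pip3 install -e '.[full]'` is visible in the hunk context below. As a rough, hedged sketch (the `jax_gpu` extra and the JAX CUDA wheel index are assumptions, not taken from this diff), a GPU-ready JAX install would look roughly like:

```bash
# Hedged sketch: the extra names and the wheel index URL are assumptions;
# consult GETTING_STARTED.md for the canonical install commands.
git clone https://github.com/mlcommons/algorithmic-efficiency.git
cd algorithmic-efficiency

# JAX with CUDA support (assumes a 'jax_gpu' extra and the public JAX CUDA wheel index).
pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'

# Remaining workload and development dependencies (the '[full]' extra appears in the hunk below).
pip3 install -e '.[full]'
```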
@@ -67,7 +67,7 @@ pip3 install -e '.[full]'

 ## Getting Started

-For detailed instructions on developing and scoring your own algorithm in the benchmark see the [Getting Started](/GETTING_STARTED.md) document.
+For detailed instructions on developing and scoring your own algorithm in the benchmark see the [Getting Started](/docs/GETTING_STARTED.md) document.
@@ ... @@
-The [Call for Submissions](/CALL_FOR_SUBMISSIONS.md) announces the first iteration of the AlgoPerf: Training Algorithms competition based on the benchmark by the same name. This document also contains the schedule and key dates for the competition.
+The [Call for Submissions](/docs/CALL_FOR_SUBMISSIONS.md) announces the first iteration of the AlgoPerf: Training Algorithms competition based on the benchmark by the same name. This document also contains the schedule and key dates for the competition.

 ### Competition Rules

-The competition rules for the *AlgoPerf: Training Algorithms* competition can be found in the separate [**Competition Rules**](/COMPETITION_RULES.md) document.
+The competition rules for the *AlgoPerf: Training Algorithms* competition can be found in the separate [**Competition Rules**](/docs/COMPETITION_RULES.md) document.

 ### Technical Documentation of the Benchmark & FAQs

-We provide additional technical documentation of the benchmark and answer frequently asked questions in a separate [**Documentation**](/DOCUMENTATION.md) page. Suggestions, clarifications and questions can be raised via pull requests, creating an issue, or by sending an email to the [working group](mailto:[email protected]).
+We provide additional technical documentation of the benchmark and answer frequently asked questions in a separate [**Documentation**](/docs/DOCUMENTATION.md) page. Suggestions, clarifications and questions can be raised via pull requests, creating an issue, or by sending an email to the [working group](mailto:[email protected]).

 ## Contributing

 We invite everyone to look through our rules, documentation, and codebase and submit issues and pull requests, e.g. for rules changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influencing the benchmark's design decisions, please [join the weekly meetings](https://mlcommons.org/en/groups/research-algorithms/) and consider becoming a member of the working group.

-Our [**Contributing**](/CONTRIBUTING.md) document provides further MLCommons contributing guidelines and additional setup and workflow instructions.
+Our [**Contributing**](/docs/CONTRIBUTING.md) document provides further MLCommons contributing guidelines and additional setup and workflow instructions.

docs/CALL_FOR_SUBMISSIONS.md (2 additions, 2 deletions)
@@ -19,11 +19,11 @@ Please fill out the (mandatory but non-binding) [**registration form**](https://
 - **Submission deadline: April 4th, 2024** *(moved by a week from the initial March 28th, 2024)*
 - [Announcement of all results](https://mlcommons.org/2024/08/mlc-algoperf-benchmark-competition/): August 1st, 2024

-For a detailed and up-to-date timeline see the [Competition Rules](/COMPETITION_RULES.md).
+For a detailed and up-to-date timeline see the [Competition Rules](/docs/COMPETITION_RULES.md).

 ## Participation

-For details on how to participate in the competition, please refer to our [Competition Rules](/COMPETITION_RULES.md). To learn more about the benchmark, see our [technical documentation](/DOCUMENTATION.md). The benchmark is further motivated, explained, and justified in the accompanying [paper](https://arxiv.org/abs/2306.07179). We require all submissions to be provided under the open-source [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
+For details on how to participate in the competition, please refer to our [Competition Rules](/docs/COMPETITION_RULES.md). To learn more about the benchmark, see our [technical documentation](/docs/DOCUMENTATION.md). The benchmark is further motivated, explained, and justified in the accompanying [paper](https://arxiv.org/abs/2306.07179). We require all submissions to be provided under the open-source [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).

docs/DOCUMENTATION.md (6 additions, 7 deletions)
@@ -65,7 +65,7 @@ The intention is that a training algorithm submission will be broadly applicable

 ### Competition Rules

-For a description of the competition rules and how to submit a training algorithm to the AlgoPerf: Training Algorithms Benchmark, see the [Competition Rules](/COMPETITION_RULES.md), which details the entire competition process.
+For a description of the competition rules and how to submit a training algorithm to the AlgoPerf: Training Algorithms Benchmark, see the [Competition Rules](/docs/COMPETITION_RULES.md), which details the entire competition process.

 ### Submissions
@@ -222,7 +222,6 @@ def update_params(
 - Cannot replace the model parameters with pre-trained ones.
 - Batch norm should work here because the `model_fn` will return updated batch norm moving averages when it is told to with `update_batch_norm`.

-
 ###### Prepare for evaluation function

 ```python
@@ -278,7 +277,7 @@ def data_selection(

 In general, with noisy, non-deterministic training, evaluation frequency can affect training time measurements as more "bites of the apple" potentially allows the training code to exploit instability. We also want to discourage submissions from complicated and unrealistic logic that attempts to guess when training is close to complete and increases the evaluation rate, while not producing a well-sampled training curve at the start of training. Simply allowing submissions complete freedom over evaluation frequency encourages competitors to work to minimize the number of evaluations, which distracts from the primary goal of finding better training algorithms.

-Submissions are eligible for an untimed eval every `eval_period` seconds. Before proceeding to evaluation, the submission can prepare the model through a call to `prepare_for_eval`, effectively modifying the model parameters and state as well as the optimizer state. Any additional evaluations performed by the submission code count against the runtime for scoring.
+Submissions are eligible for an untimed eval every `eval_period` seconds. Before proceeding to evaluation, the submission can prepare the model through a call to `prepare_for_eval`, effectively modifying the model parameters and state as well as the optimizer state. Any additional evaluations performed by the submission code count against the runtime for scoring.
 The harness that runs the submission code will attempt to eval every `eval_period` seconds by checking between each submission step (call of `update_params`) whether it has been at least `eval_period` seconds since the last eval; if so, the submission is given the possibility to prepare for evaluation (through a timed call to `prepare_for_eval`). If the accumulated runtime does not exceed the maximum allowed runtime after the preparation step, the clock is paused, and the submission is evaluated. This means that if calls to `update_params` typically take a lot more than `eval_period` seconds, such submissions will not receive as many untimed evals as a submission that had an `update_params` function that took less time. However, for appropriate settings of `eval_period`, we expect this to be quite rare. Submissions are always free to restructure their `update_params` code to split work into two subsequent steps to regain the potential benefits of these untimed model evaluations. For each workload, the `eval_period` will be set such that the total evaluation time is roughly between 10% and 20% of the total training time for the target-setting runs.
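To make the timing rules above concrete, here is a minimal, hypothetical sketch of that harness loop; the object and method names (`submission`, `workload.eval_model`, etc.) are illustrative stand-ins, not the benchmark's actual API:

```python
import time


def run_submission(workload, submission, eval_period, max_runtime):
    """Hypothetical sketch of the timed/untimed bookkeeping described above."""
    accumulated_time = 0.0   # timed seconds counted against the budget
    last_eval_time = 0.0     # accumulated time at the most recent untimed eval
    global_step = 0

    while accumulated_time < max_runtime:
        # Timed: one submission step (a call of `update_params`).
        start = time.monotonic()
        submission.update_params(global_step)
        accumulated_time += time.monotonic() - start
        global_step += 1

        # Between steps: has it been at least `eval_period` timed seconds since the last eval?
        if accumulated_time - last_eval_time >= eval_period:
            # Timed: the submission may prepare model parameters/state and optimizer state.
            start = time.monotonic()
            submission.prepare_for_eval(global_step)
            accumulated_time += time.monotonic() - start

            # Only if the budget is not yet exhausted is the clock paused for an untimed eval.
            if accumulated_time < max_runtime:
                workload.eval_model()  # untimed: does not count against the runtime
                last_eval_time = accumulated_time
```

Note how a long `update_params` step can skip past several `eval_period` boundaries and therefore receive fewer untimed evals, which is exactly the behavior the paragraph above describes.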

 #### Valid submissions
@@ -475,7 +474,7 @@ Our scoring procedure uses the held-out workloads only to penalize submissions t

 NOTE: Submitters are no longer required to self-report results for AlgoPerf competition v0.5.

-The qualification set is designed for submitters that may not have the compute resources to self-report on the full set of [fixed](#fixed-workloads) and [held-out workloads](#randomized-workloads). They may instead self-report numbers on this smaller qualification set. The best-performing submissions may then qualify for compute sponsorship offering a free evaluation on the full benchmark set and therefore the possibility to win [awards and prizes](/COMPETITION_RULES.md#prizes).
+The qualification set is designed for submitters that may not have the compute resources to self-report on the full set of [fixed](#fixed-workloads) and [held-out workloads](#randomized-workloads). They may instead self-report numbers on this smaller qualification set. The best-performing submissions may then qualify for compute sponsorship offering a free evaluation on the full benchmark set and therefore the possibility to win [awards and prizes](/docs/COMPETITION_RULES.md#prizes).

 The qualification set consists of the same [fixed workloads](#fixed-workloads) as mentioned above, except for both workloads on *ImageNet*, both workloads on *LibriSpeech*, and the *fastMRI* workload. The remaining three workloads (*WMT*, *Criteo 1TB*, and *OGBG*) form the qualification set. There are no [randomized workloads](#randomized-workloads) in the qualification set. The qualification set of workloads aims to have a combined runtime of roughly 24 hours on the [benchmarking hardware](#benchmarking-hardware).
@@ -585,7 +584,7 @@ GPUs with higher per GPU memory, please monitor your memory usage to make sure i

 #### How do I run this on my SLURM cluster?

-You may run into issues with `sudo` and `docker` on a SLURM cluster. To run the workloads on a SLURM cluster you can use Apptainer (previously Singularity); see this [section](/GETTING_STARTED.md#using-singularityapptainer-instead-of-docker).
+You may run into issues with `sudo` and `docker` on a SLURM cluster. To run the workloads on a SLURM cluster you can use Apptainer (previously Singularity); see this [section](/docs/GETTING_STARTED.md#using-singularityapptainer-instead-of-docker).
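As a rough illustration of that Apptainer route (the image reference and bind paths below are placeholders; the supported workflow is documented in GETTING_STARTED.md), converting the Docker image to a SIF file and running it without root would look something like:

```bash
# Hedged sketch: the image reference and bind paths are placeholders, not the
# repository's published names; see GETTING_STARTED.md for the supported workflow.

# Convert the Docker image to a SIF file (no docker daemon or sudo needed at runtime).
apptainer build algoperf.sif docker://<algoperf-docker-image>

# Run with GPU access (--nv) and the dataset directory bind-mounted into the container.
apptainer run --nv --bind /path/to/data:/data algoperf.sif
```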

 #### How can I run this on my AWS/GCP/Azure cloud project?
@@ -627,13 +626,13 @@ You only have to use the benchmarking hardware for runs that are directly involv

 NOTE: Submitters are no longer required to self-report results for AlgoPerf competition v0.5.

-Submitters unable to self-fund scoring costs can instead self-report only on the [qualification set of workloads](/COMPETITION_RULES.md#qualification-set) that excludes some of the most expensive workloads. Based on this performance on the qualification set, the working group will provide - as funding allows - compute to evaluate and score the most promising submissions. Additionally, we encourage researchers to reach out to the [working group](mailto:[email protected]) to find potential collaborators with the resources to run larger, more comprehensive experiments for both developing and scoring submissions.
+Submitters unable to self-fund scoring costs can instead self-report only on the [qualification set of workloads](/docs/COMPETITION_RULES.md#qualification-set) that excludes some of the most expensive workloads. Based on this performance on the qualification set, the working group will provide - as funding allows - compute to evaluate and score the most promising submissions. Additionally, we encourage researchers to reach out to the [working group](mailto:[email protected]) to find potential collaborators with the resources to run larger, more comprehensive experiments for both developing and scoring submissions.

 #### Can I submit previously published training algorithms as submissions?

 Yes, you may, as long as it isn't an exact copy of an existing submission.
 For example, you may submit the Adam optimizer with your particularly effective hyperparameter search space and hyperparameter configuration, as different choices for hyperparameter values and/or search spaces constitute different training algorithms and are potential sources of innovation.
-That said, while submitting Adam with some novel heuristic to set various hyperparameters, some especially effective hyperparameter search space, or your single best hyperparameter configuration is fine, avoid making multiple submissions that only differ by their hyperparameter configuration without a convincing justification they are substantially different (see ["Can I submit multiple times to the benchmark competition?"](/COMPETITION_RULES.md#can-i-submit-multiple-times-to-the-benchmark-competition), above).
+That said, while submitting Adam with some novel heuristic to set various hyperparameters, some especially effective hyperparameter search space, or your single best hyperparameter configuration is fine, avoid making multiple submissions that only differ by their hyperparameter configuration without a convincing justification they are substantially different (see ["Can I submit multiple times to the benchmark competition?"](/docs/COMPETITION_RULES.md#can-i-submit-multiple-times-to-the-benchmark-competition), above).