[CI] Write Routine Maintenance Document #578
This patch introduces a document that describes all of the routine maintenance actions that need to be performed on the premerge cluster, currently only version updates.
`.github/workflows/containers/github-action-ci-windows/Dockerfile` has an
argument called `RUNNER_VERSION` near the bottom of the file that needs to
be updated to the new version.
3. The libc++ CI container - The `docker-compose` manifest at
It's not clear what you mean by "The docker-compose manifest". Are you talking about the GITHUB_RUNNER_VERSION (in the docker-compose.yml file)? Or something else?
argument called `RUNNER_VERSION` near the bottom of the file that needs to
be updated to the new version.
3. The libc++ CI container - The `docker-compose` manifest at
`libcxx/utils/ci/docker-compose.yml` needs to be updated to pull in the latest
Actually, the entire #3 set of instructions is a bit confusing. The libc++ instructions you reference mention how to grab a new container image that already contains a particular runner version, but they don't explicitly state how to create a container with the correct runner version, which is what #1 and #2 seem to be about.
`libcxx/utils/ci/docker-compose.yml` needs to be updated to pull in the latest
runner images using the [libc++ instructions](https://libcxx.llvm.org/Contributing.html#updating-the-ci-testing-container-images)

### Other Container Image Software
Which repo alert would trigger this set of instructions? (It's not clear that any of the repos mentioned above correspond to this.)
for building LLVM, like CMake, ninja, and the toolchain itself. Keeping
most of these up to date is ideal.

A large amount of the software comes from the distribution and thus does not
"the distribution" -- What distribution? And how do you get notified that it's time to update it?
1. Modify `.github/workflows/containers/github-action-ci/Dockerfile` locally
to use the latest `ubuntu:xx.04` image.
2. Ensure that `monolithic-linux.sh` with all options enabled runs successfully.
Are there instructions for how to test monolithic-linux.sh with all options enabled?
2. Ensure that `monolithic-linux.sh` with all options enabled runs successfully.
Push any needed changes.
3. Push the updated `Dockerfile` and update the workflow in
`.github/workflows/build-ci-container.yml` to use the correct image name.
For #3: Can you be more explicit about what to update in build-ci-container.yml? Do you mean the 'container_name' in the build-ci-container job, or something else? What's the right value to update it to?
Updating explicitly versioned software (just LLVM in the Linux container, but
most software in the Windows container) just requires bumping the version number
and pushing the new image to the monorepo. In the Linux container, the LLVM
version can be bumped by changing the `LLVM_VERSION` environment variable. In
Do the "containers" you're talking about here refer to the Dockerfiles mentioned above, under .github/workflows/containers/github-action-ci{-windows}?
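To make the quoted Linux step concrete, here is a minimal sketch of what "bumping the `LLVM_VERSION` environment variable" might look like when done programmatically. It assumes the Dockerfile declares the variable on a single line of the form `ENV LLVM_VERSION=<version>`, which the patch text does not confirm; the helper name `bump_env_version`, the base image tag, and both version numbers are hypothetical.

```python
import re


def bump_env_version(dockerfile_text: str, name: str, new_version: str) -> str:
    """Rewrite the single `ENV <name>=<old>` line to use new_version.

    Assumes (hypothetically) that the variable is declared exactly once,
    on its own line, in `ENV NAME=value` form.
    """
    pattern = re.compile(rf"^(ENV\s+{re.escape(name)}=)\S+$", re.MULTILINE)
    # A function replacement avoids ambiguity between the group reference
    # and the digits at the start of the new version string.
    updated, count = pattern.subn(lambda m: m.group(1) + new_version, dockerfile_text)
    if count != 1:
        raise ValueError(f"expected exactly one ENV {name} line, found {count}")
    return updated


# Hypothetical Dockerfile contents, for illustration only.
sample = "FROM ubuntu:24.04\nENV LLVM_VERSION=19.1.0\nRUN echo done\n"
bumped = bump_env_version(sample, "LLVM_VERSION", "20.1.0")
print(bumped)
```

Failing loudly when the line is missing or duplicated is deliberate: a silent no-op would leave the image on the old version without anyone noticing.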
and pushing the new image to the monorepo. In the Linux container, the LLVM
version can be bumped by changing the `LLVM_VERSION` environment variable. In
the Windows container, versions are controlled by the `--version` flag passed
to `choco install` commands.
The Windows Dockerfile doesn't contain any 'choco install' commands for installing LLVM that I can see?
### Windows Edition

Whenever a new Windows Server datacenter edition (eg 2025) is supported by GKE,
How do you get notifications about this?
1. Modify `.github/workflows/containers/github-action-ci-windows/Dockerfile`
locally to get it building on the new Windows Server version. This requires
having a host at that version as Windows Server containers can only run on a
"having a host at that version"? A host Windows machine? Where/how do you get a local Windows machine for this?
`Dockerfile` has previously been enough.
2. When that is ready, push your changes in a PR along with changes to
`.github/workflows/build-ci-container-windows.yml` so that it uses the new
Windows Server edition.
So, this involves changing every "windows-20xx" in that file (both the runs-on values and the container_name) to the new "windows-20yy" version?
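If that reading is right, the edit amounts to a token substitution. The sketch below illustrates it on an in-memory string; the workflow fragment, the `ci-windows-2022` container name, and the helper `retarget_windows_version` are all hypothetical stand-ins rather than the real contents of `build-ci-container-windows.yml`.

```python
def retarget_windows_version(text: str, old: str, new: str) -> tuple[str, int]:
    """Replace every `windows-<old>` token with `windows-<new>`.

    Returns the rewritten text and the number of replacements made, so the
    caller can check that nothing was missed or unexpectedly matched.
    """
    old_token = f"windows-{old}"
    new_token = f"windows-{new}"
    count = text.count(old_token)
    return text.replace(old_token, new_token), count


# Hypothetical workflow fragment, for illustration only.
workflow = (
    "jobs:\n"
    "  build:\n"
    "    runs-on: windows-2022\n"
    "    env:\n"
    "      container_name: ci-windows-2022\n"
)
updated, replaced = retarget_windows_version(workflow, "2022", "2025")
print(replaced)
print(updated)
```

Returning the replacement count makes it easy to compare against the number of occurrences a reviewer expects to see in the diff.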
Windows Server edition.
3. Test locally that the pushed container image can run the
`monolithic-windows.sh` script with all projects enabled successfully. Make
and submit any changes that are needed.
Roughly same question as above: How do you test monolithic-windows.sh with all projects enabled locally?
3. Test locally that the pushed container image can run the
`monolithic-windows.sh` script with all projects enabled successfully. Make
and submit any changes that are needed.
4. Duplicate all of the versioned Windows resources in the terraform
So, for example, if we're upgrading from windows-2022 to windows-2025, I would KEEP all the windows-2022 resources defined in all the *.tf files (under premerge), but would now also have windows-2025 versions of them?
Also, I assume I would need to copy windows_2022_runner_values.yaml to windows_2025_runner_values.yaml and update the values inside to use 2025 instead of 2022?
Also, both architecture.md and cluster-management.md explicitly mention 'windows_2022' or 'windows-2022', so if the version gets updated then the docs need to be updated too.
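One way to catch leftover mentions like these is to scan the tree for both spellings of the old version once the old resources are finally retired. The following is a rough sketch under stated assumptions: the helper `find_stale_references`, the set of file suffixes, and the temporary files it creates are hypothetical and only stand in for the real repository layout.

```python
import re
import tempfile
from pathlib import Path


def find_stale_references(root: Path, old: str, suffixes=(".md", ".tf", ".yaml")):
    """Yield (path, line_number, line) for lines mentioning the old version.

    Matches both `windows_<old>` and `windows-<old>`, since the review notes
    that both spellings appear in the docs and configuration.
    """
    pattern = re.compile(rf"windows[-_]{re.escape(old)}")
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            lines = path.read_text(encoding="utf-8").splitlines()
            for number, line in enumerate(lines, start=1):
                if pattern.search(line):
                    yield path, number, line.strip()


# Build a throwaway tree with hypothetical contents to demonstrate the scan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "architecture.md").write_text("Runs on windows-2022 nodes.\n", encoding="utf-8")
    (root / "main.tf").write_text('name = "windows_2025"\n', encoding="utf-8")
    hits = [(p.name, n) for p, n, _ in find_stale_references(root, "2022")]

print(hits)
```

While both runner sets coexist, hits in the `.tf` and values files are expected; the scan is most useful for the documentation and for the final cleanup.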
Also, should probably mention that this is in the llvm-zorg repo, since most of the other steps are in the llvm-project repo.
configuration and apply it.
5. Switch the workflow in `.github/workflows/premerge.yaml` to the new
runner set.
6. Sunset the existing sunset in 1-2 weeks to give time for people utilizing
"Sunset the existing sunset" seems like a typo. Also, you should be more explicit about what 'sunsetting' means here. I assume you mean: Delete all the old resources that you KEPT in step 4 when you duplicated everything. Are there other steps beyond that?