feat: transparent buildx support for remote buildkit (#6732) #9648
Conversation
…Tools#6732)

* detect-buildx global config option for backward compatibility
* cache-tag global config option to customize cache destination
* new CacheTo in DockerArtifact in configuration yaml (for docker build --cache-to)
* export LoadDockerConfig to read ~/.docker/config.json for buildx detection
* fix: avoid loading image via buildx if no docker daemon is accessible
* fix: remote lookup / import missing in buildx workaround
* fix: image import if no docker daemon is available (under buildx)
* adjust cache reference preserving tag and default cacheTo if not given
* parse buildx metadata to extract ImageID digest

Initially based on ebekebe's GoogleContainerTools#8172 patch ebekebe@1c1fdeb

Signed-off-by: [email protected]
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Hi team,
I'm currently reviewing this pull request and will provide feedback in a few minutes. In the meantime, here's a summary of the changes for you and other reviewers to quickly get up to speed:
This pull request, `feat: transparent buildx support for remote buildkit (#6732)`, aims to improve Skaffold's build process by adding transparent support for `docker buildx`. The primary goal is to enable users to leverage the advanced features of `buildx`, such as remote buildkit builders and improved caching, without significant changes to their existing Skaffold configurations.

Here's a breakdown of the changes:

- New Global Configs:
  - `detect-buildx`: Enables automatic detection of `buildx` as the default builder.
  - `cache-tag`: Allows overriding the default cache tagging strategy, useful for managing caches across different branches or environments.
- Build Process Enhancements:
  - The build process now intelligently detects and utilizes `buildx` if available and configured.
  - Reduced dependency on the local Docker daemon when using `buildx`, enhancing security and flexibility.
  - Added `cacheTo` support in the Skaffold schema to specify custom cache destinations.
  - Added `--cache-to` support to the Docker CLI build process when using `buildx`.
- Backward Compatibility: All changes are designed to be backward compatible. If `buildx` is not detected or used, Skaffold will fall back to the traditional Docker builder.
Key File Changes:
The majority of changes are focused within the `pkg/skaffold/build` directory, specifically affecting the cache and Docker build logic. Significant modifications were made to `cache.go`, `lookup.go`, and `docker.go` to handle `buildx` integration and caching strategies. The schema (`config.go`) was also updated to include the new `cacheTo` field and `detect-buildx` option.

Note: The pull request mentions follow-up work to support additional advanced `buildkit` features in the future. Unit and integration tests are also listed as TODO items.
I hope this summary is helpful. I'll post my detailed review shortly.
And now, a little haiku to lighten the mood:
BuildKit's swift flight,
Remote caches, smooth and fast,
Code flows like a stream.
Signed-off-by: <[email protected]>
for more info see https://github.com/orgs/community/discussions/45969

tl;dr buildx attestations include additional metadata that confuses registry tools (breaking the integration test)

```
docker buildx imagetools inspect localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
```

```
Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
MediaType: application/vnd.oci.image.index.v1+json
Digest:    sha256:...

Manifests:
  Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType: application/vnd.oci.image.manifest.v1+json
  Platform:  linux/arm64

  Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType: application/vnd.oci.image.manifest.v1+json
  Platform:  linux/amd64

  Name:        localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    unknown/unknown
  Annotations:
    vnd.docker.reference.digest: sha256:...
    vnd.docker.reference.type:   attestation-manifest

  Name:        localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    unknown/unknown
  Annotations:
    vnd.docker.reference.digest: sha256:...
    vnd.docker.reference.type:   attestation-manifest
```

Using `BUILDX_NO_DEFAULT_ATTESTATIONS=1` fixes the issue, removing the unknown platforms:

```
Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:...

Manifests:
  Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64

  Name:      localhost:5000/my-app:v2.11.1-120-g8adcfa106-dirty@sha256:...
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64
```
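For reference, this is roughly how the fix is applied to a build invocation (the env var and flags are documented by Docker; the image name is illustrative):

```
# BUILDX_NO_DEFAULT_ATTESTATIONS=1 disables the provenance/SBOM attestation
# manifests, so the pushed index only contains the real platform manifests:
BUILDX_NO_DEFAULT_ATTESTATIONS=1 docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t localhost:5000/my-app:dev \
  --push .
```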
Hi team! ping :-) I've updated the new schema changes, following the new version 23 generated by @menahyouyeah in #9741, so now the schema test passes. In my personal fork all checks are passing: https://github.com/reingart/skaffold/pull/1/checks including:

Also I've written a tentative design proposal: buildx-transparent-support.md, trying to address requests in related PRs like #8172 (comment) by @gsquared94 regarding multiple buildx instances, actionable errors and multi-platform images. Let me know if you need anything else to evaluate the PR; any recommendation is welcome.
Hi, I'm currently looking to use buildx with Skaffold. One of the use cases I have is being able to authenticate with a package feed during the build process. I have spiked out an approach using custom build scripts so I can use buildx commands. One of the features I've used is Docker secret mounts, as this allows me to pass in the credentials in a way that doesn't invalidate the Docker cache if a secret changes. Our build process passes a unique secret per run to authenticate with the package feed, so without this support we effectively have to rebuild the image from scratch every single time, which adds minutes onto our builds. Would it be possible to add support for build secrets as part of this PR?
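For context, a rough sketch of the secret-mount pattern described above (the file names and the Dockerfile step are illustrative, not part of this PR):

```
# Dockerfile side (BuildKit syntax), shown here as comments:
#   RUN --mount=type=secret,id=feed_token \
#       FEED_TOKEN=$(cat /run/secrets/feed_token) ./restore-packages.sh
#
# The secret is mounted only for that RUN step and never becomes layer
# content, so a new token per build does not invalidate the cache:
docker buildx build \
  --secret id=feed_token,src=./feed_token.txt \
  -t my-app:dev .
```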
This PR is a huge leap forward. We (code.org) are evaluating skaffold, and our primary reservation in selecting skaffold is the complexity of maintaining our own custom builder scripts to use buildx. I hope the skaffold team has the bandwidth to work with this PR all the way to merge.
My primary question, which may best be addressed in a follow-on PR in order to build momentum: how can we use multiple buildx instances in order to build with native builders, and then stitch them into a multi-platform image?
"type": "string" | ||
}, | ||
"type": "array", | ||
"description": "the Docker images used as cache destination. If omitted, cacheFrom is used with max mode to export all layers.", |
Not terribly important, since most people playing with this (today at least) will already know what buildx `--cache-to` accepts, but I wouldn't mind a link to the docker docs here.
Ok, will add a link to the docs: https://docs.docker.com/reference/cli/docker/buildx/build/#cache-to (I see ko and other builders have similar links)
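For readers landing here, a typical invocation per those docs (image names illustrative):

```
docker buildx build \
  --cache-from type=registry,ref=my-app:cache \
  --cache-to   type=registry,ref=my-app:cache,mode=max \
  -t my-app:dev .
```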
Since `docker build` and `docker buildx build` essentially share the same options, no major modifications are needed to Skaffold for backward compatibility.

This proposal implements the logic to detect if buildx is the default builder (looking for an alias in the docker config).
To [set buildx as the default builder](https://github.com/docker/buildx?tab=readme-ov-file#set-buildx-as-the-default-builder), the command `docker buildx install` should be used.
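For illustration, `docker buildx install` writes an alias into `~/.docker/config.json`, which is what the detection logic can look for:

```
docker buildx install
# ~/.docker/config.json now contains:
#   { "aliases": { "builder": "buildx" } }
# so a plain `docker build` is routed to `docker buildx build`
```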
Not a must-have by any means, but it'd be nice to have an explicit way at the skaffold.yml level, like `useBuildkit: true`, to specify that buildx should be used, and therefore error out if it's not available.
A more actionable error would be great if buildx is not installed properly (that is included in the design doc for further phases), but I explicitly would like to avoid a configuration in skaffold.yaml because:
- users that don't have buildx installed would still be able to build
- no modification of skaffold.yaml is needed to switch between CI and local envs

This way, buildx support is transparent and the user can decide to use it or not, just by setting the config parameter with `skaffold config set buildx-builder ...`
Yes, some advanced features would not work (like exporting the cache), but this would not be critical for local development (and won't work either if proper registry push credentials are not given).
That thinking makes sense to me, I do like your idea of providing graceful "fallback". That said, in the case of very big projects, if cache-from in particular isn't working, I'd personally prefer my "app developers" to get an error that they really need to configure buildx, or at the very least a noisy warning. Many of them will silently not configure buildx, and will later complain that builds take 30 minutes, when if cache-from was working, it'd be like 5 minutes.

`skaffold config` setting the builder system-wide rather than per-project feels like it has small issues in some cases: for example, in my case I will have one project where I need multi-platform, and another where I don't. But I agree that hardcoding the builder names in skaffold.yaml would be much worse.
Then, additional buildx features are available, like multi-platform support and different cache destinations.

Cache adjustment is extended via the `cache-tag` config option, useful to point to `latest` or a generic cache tag, instead of the one generated for this build
(via `tagPolicy`, which would be useless in most distributed cases, as it can be invalidated by minor changes, especially if using `inputDigest`).
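A sketch of the effect (assuming `cache-tag` is the global config key this proposal adds; tag values illustrative):

```
skaffold config set cache-tag cache
# build tag (from tagPolicy): my-app:v2.11.1-120-g8adcfa106-dirty  (new on every commit)
# cache reference:            my-app:cache                         (stable across builds)
```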
This is a great feature, appreciate the thoughtfulness here.
```
docker buildx install
```

Then Skaffold should be configured to detect buildx (default builder) and set a generic cache tag:
I believe, per below in the doc, these are optional. If my understanding is correct, it'd be great if this doc lists these as "can be" rather than "should be".
```
-if !b.pushImages {
-	// All of the builders will rely on a local Docker:
+if !b.buildx && !b.pushImages {
+	// Most builders will rely on a local Docker (except when using a remote buildkit via buildx):
```
How does this relate to using a local buildkit? Will these have to basically dump to local docker?
If there is a local docker daemon, it should have a "default" buildkit builder.
If I understand correctly, it is currently being used when the skaffold.yaml has `useBuildkit: true`.
It can be used with buildx too, by executing `skaffold config set buildx-builder default`.
Let me know if I understood the question correctly or if more info is needed.
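For example (output abbreviated; the `default` entry is the docker daemon's built-in buildkit):

```
docker buildx ls
# NAME/NODE   DRIVER/ENDPOINT   STATUS
# default *   docker            running
skaffold config set buildx-builder default
```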
```
-if platforms.IsMultiPlatform() && !SupportsMultiPlatformBuild(*artifact) {
+// buildx creates multiplatform images via buildkit directly
+if platforms.IsMultiPlatform() && !SupportsMultiPlatformBuild(*artifact) && !b.buildx {
 	built, err = CreateMultiPlatformImage(ctx, out, artifact, tag, platforms, artifactBuilder)
```
I appreciate the simplicity of this approach, and this is really a curiosity more than a comment. Is there a reason not to instead extend CreateMultiPlatformImage to run individual buildx instances for each platform, and then stitch them into a single manifest using the built-in skaffold stitching?
In particular, in my use case, I would want to run (via GH actions) separate /native/ builders (our build is too slow on arm to want to use qemu), define them as multiple remotes, and then stitch them in the main skaffold build command.
Using native buildkit multi-platform support can offload the build parallelization and image stitching from skaffold, and it would support more strategies:
https://github.com/docker/buildx?tab=readme-ov-file#building-multi-platform-images
- Using the QEMU emulation support in the kernel
- Building on multiple native nodes using the same builder instance
- Using a stage in Dockerfile to cross-compile to different architectures
I addressed more about using native nodes in my other reply, I think it should be possible and match your use case without needing separate builds or stitching: #9648 (comment)
That makes great sense to me, thanks for explaining, and it's really cool that buildx has support for "fanning out" to native builders built-in. Somehow I had never noticed that despite using buildx for a couple years.
Hi @mikegore1000, thanks for the comments!
Yes, I use that too, including SSH key forwarding. If you can share the custom script and skaffold config, I could take a look and tell you how to adapt it to use this PR.
AFAIK build secrets are already supported by BuildKit in skaffold (natively), and this PR supports them even with remote builders (both docker secrets and ssh mounts). I will try to implement a more advanced example with these features (secrets and ssh mounts + caching) in a future stage, once the code stabilizes (proposed in the design document).
Hi @snickell, thanks for the comments and review!
The current PR already supports multi-platform builds without needing to stitch different images (see the "build multiplatform images with buildx" integration test, covering both amd64 & arm64). I think this could even work for native builders; according to the buildx documentation, this could be implemented just by adding the nodes (see the sketch below).
I didn't test this scenario, but it should work, as it is just buildx configuration outside of skaffold.
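A sketch of that multi-node setup (builder name and SSH endpoints are hypothetical):

```
# one builder instance with two native nodes, one per architecture:
docker buildx create --name native --node amd64 ssh://user@amd64-host
docker buildx create --name native --append --node arm64 ssh://user@arm64-host
# buildx then fans a multi-platform build out to the matching nodes:
docker buildx build --builder native --platform linux/amd64,linux/arm64 -t my-app .
```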
Thank you @reingart for linking me to the use of "native build nodes" from buildx. Really cool. I'll do a test of using this from within skaffold (using this branch) via gh actions. In case it wasn't clear to everyone: I really support this PR, and for our use case, it's a very good fit. I would love to see this PR merged as-is.
Hi @reingart, I'm pretty new to Skaffold so must have missed the built-in secret support. That plus your PR means we can do everything without any build scripts, and it's clear how I'd go about setting that up. Thanks for the help. Really looking forward to this getting merged in!

Really looking forward to having this change merged in.

@reingart Are you still working on this PR?
@rajatvig yes, mostly testing it in development. |
Personal fork to test "transparent buildx support for remote buildkit" GoogleContainerTools#9648
Fixes: #6732 #9197
Related: #8172 (inspired by a prior ebekebe fork, with a more general approach)
Description
`docker buildx` is an enhanced builder using BuildKit that can replace the traditional `docker build` command, with almost the same syntax (transparently). This adds several distributed features and advanced capabilities:
These features are very useful for corporate CI/CD, for example when using GitLab, where the use case requires:

The remote BuildKit support is useful in cases where a privileged docker daemon is not possible or desirable due to security policies (e.g. untrusted images or review pipelines).

Daemon-less mode is also useful to offload container building from developers' notebooks, sharing and reusing remote caches more effectively.
The `cache-tag` option is useful to point to `latest` or a generic cache tag, instead of the one generated for this build (via `tagPolicy`, which would be useless in most distributed cases, as it can be invalidated by minor changes, especially if using `inputDigest`). Having it as an option could allow different branches to have different cache destinations (production vs development cache).
User facing changes
To use BuildKit transparently, BuildX should be installed as the default builder (this creates an alias in the docker config) and a remote instance created:
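A sketch of that setup (the builder name and endpoint are illustrative):

```
docker buildx install   # set buildx as the default builder (adds a docker config alias)
docker buildx create --name remote-builder \
  --driver remote tcp://buildkitd.example.com:1234
```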
Then Skaffold should be configured to detect buildx and set a default cache tag:
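For example (key names per the changes summary below; values illustrative):

```
skaffold config set buildx-builder remote-builder
skaffold config set cache-tag cache
```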
Optionally, in skaffold.yaml the cache source and destination can be configured with `cacheFrom` and `cacheTo` with the image name:

- the `cache-tag` will be used (in this example `my-app-image:cache`)
- `type=registry,mode=max` (only if push images is enabled)

This cache behavior is intended to provide a transparent user experience, with similar functionality compared to traditional local docker builds, without additional boilerplate or different configuration / patches for a CI workflow.
Finally, the user can run `skaffold dev` or `skaffold build` as usual.

Follow-up Work
Future work: #2110 to include support for more advanced BuildKit features, like multiple build contexts, for an improved mono-repo experience for large code-bases.
TODO:
Changes summary:
- `buildx-builder` to enable buildx detection and support for different builders
- `cache-tag` to override default cache tagging (instead of the generated tag)
- `cacheTo` added to the skaffold schema for the cache destination (optional, default is to use the new `cacheFrom` + cache tag)
- `--cache-to` support in Docker CLI build if using the BuildX CLI

All changes are backward compatible, so if the developer is not using buildx, the traditional docker builder will be run.
Signed-off-by: [email protected]