Contributing guidelines
- I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- ... the documentation does not mention anything about my problem
- ... there are no open or closed issues that are related to my problem
Description
Overriding OTEL-related bake variables (intended for building an image) can introduce artificially long build delays depending on the variable values.
(Note: though a real issue, this is not something I encountered in real usage; this is mainly a companion to compose issue docker/compose#13157, which is something I encountered.)
Expected behaviour
For a bake variable used only in the building of an image, I would expect the value to influence build time to the extent that it impacts the build cache. Even more specifically, I would expect a fully-cached build to complete in sub-second time.
Actual behaviour
Depending on the variable name and value, a delay of ten seconds (or more) can occur even when the build is fully cached.
Buildx version
github.com/docker/buildx v0.26.1 1a8287f
Docker info
Client: Docker Engine - Community
Version: 28.3.3
Context: default
Debug Mode: false
Plugins:
ai: Docker AI Agent - Ask Gordon (Docker Inc.)
Version: v1.9.3
Path: /home/robertovillarreal/.docker/cli-plugins/docker-ai
buildx: Docker Buildx (Docker Inc.)
Version: v0.26.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.39.1
Path: /usr/libexec/docker/cli-plugins/docker-compose
model: Docker Model Runner (EXPERIMENTAL) (Docker Inc.)
Version: v0.1.36
Path: /usr/libexec/docker/cli-plugins/docker-model
scan: Docker Scan (Docker Inc.)
Version: v0.23.0
Path: /usr/libexec/docker/cli-plugins/docker-scan
Server:
Containers: 11
Running: 4
Paused: 0
Stopped: 7
Images: 97
Server Version: 28.3.3
Storage Driver: overlayfs
driver-type: io.containerd.snapshotter.v1
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
CDI spec directories:
/etc/cdi
/var/run/cdi
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc sysbox-runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
runc version: v1.2.5-0-g59923ef
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.8.0-60-generic
Operating System: Ubuntu 24.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 30.56GiB
Name: l-9jylpn3
ID: B6K2:2BOW:BSIE:WIGE:RODV:GC2B:JMYF:6XP4:25AT:3S4Q:6634:3OII
Docker Root Dir: /var/lib/docker
Debug Mode: true
File Descriptors: 67
Goroutines: 127
System Time: 2025-08-25T20:52:31.9142915-06:00
EventsListeners: 1
Experimental: true
Insecure Registries:
<snip>
192.168.1.0/24
::1/128
127.0.0.0/8
Registry Mirrors:
http://localhost:5005/
http://localhost:5006/
http://localhost:5007/
Live Restore Enabled: false
Default Address Pools:
Base: 172.25.0.0/16, Size: 24
Builders list
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
buildkit-dev docker-container
\_ buildkit-dev0 \_ unix:///var/run/docker.sock inactive
jd docker-container
\_ jd0 \_ unix:///var/run/docker.sock running v0.22.0 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (7 more)
<snip inactive but sensitive entries>
temp docker-container
\_ temp0 \_ unix:///var/run/docker.sock inactive
default* docker
\_ default \_ default running v0.23.2 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (6 more)
teamx docker
\_ teamx \_ teamx running v0.23.2 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (6 more)
worker docker
\_ worker \_ worker running v0.23.2 linux/amd64 (+2), linux/arm64, linux/arm (+2), linux/ppc64le, (5 more)
Configuration
Bake file:
variable "OTEL_TRACES_EXPORTER" {
  type    = string
  default = "none"
}
target "default" {
  dockerfile-inline = <<-EOT
    FROM busybox
    ARG OTEL_TRACES_EXPORTER
    RUN echo "using $OTEL_TRACES_EXPORTER"
  EOT
  args = {
    OTEL_TRACES_EXPORTER = OTEL_TRACES_EXPORTER
  }
}
Very fast, as expected:
$ date && docker buildx bake && date
Mon Aug 25 09:00:30 PM MDT 2025
[+] Building 0.2s (7/7) FINISHED docker:default
<snip>
Mon Aug 25 09:00:30 PM MDT 2025
Always takes ten seconds (note that the builder still reports 0.2 seconds, as above, whereas the wall-clock time is ten seconds):
$ date; OTEL_TRACES_EXPORTER=otlp docker buildx bake default; date
Mon Aug 25 09:04:19 PM MDT 2025
[+] Building 0.2s (7/7) FINISHED docker:default
<snip>
Mon Aug 25 09:04:29 PM MDT 2025
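To make the gap between the reported build time and the wall-clock time easier to compare across runs, the date-before-and-after approach can be wrapped in a small helper. This is only a sketch: `wall_ms` is a hypothetical name, and it assumes GNU `date` (for `%N`), as found on Ubuntu.

```shell
# Hypothetical helper: print the wall-clock duration of a command in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
wall_ms() {
  local start end
  start=$(date +%s%N)
  # Discard the command's own output so that only the duration is printed.
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Example usage (requires Docker and the bake file above):
#   wall_ms docker buildx bake default
#   wall_ms env OTEL_TRACES_EXPORTER=otlp docker buildx bake default
```

Going through `env` for the second call keeps the variable scoped to that one command, independent of how the shell treats assignments placed in front of a function call.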
To help illustrate the lag:
$ date; OTEL_TRACES_EXPORTER=otlp docker buildx bake default --progress rawjson; date
Mon Aug 25 09:06:09 PM MDT 2025
{"vertexes":[{"digest":"sha256:032bddc7348073368c320605544d844c00e2b5f7e6ed7271de7ecf8e6e49821d","name":"[internal] load local bake definitions","started":"2025-08-25T21:06:10.011840056-06:00"}]}
<snip... the time between these two is 0.2 seconds>
{"vertexes":[{"digest":"sha256:cf54b426da55281043924583d6743b1f70151b6a0169f1b1d3ee7de26f96edee","name":"exporting to image","started":"2025-08-26T03:06:10.234018654Z","completed":"2025-08-26T03:06:10.283281877Z"}]}
<ten seconds between last printed vertex and bake execution>
Mon Aug 25 09:06:20 PM MDT 2025
Build logs
Additional info
My reproduction is not something I'd do in reality: I commonly use OTEL environment variables in a Dockerfile, but they are always static values and not something I'd change at build time. Somebody else might, though. Although I discovered this by accident (docker/compose#13157), it is a grey area. The BUILDX_*, DOCKER_*, etc. environment variables are obviously more or less 'protected', but the OTEL_* variables are not. In my example, there doesn't appear to be a way for the user to say "I only want to influence my bake file" or "I want to influence buildx telemetry", or worse... "I want to influence both".
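Since nothing marks OTEL_* variables as reserved, a quick check of which ones the current shell would pass on to `docker buildx bake` can help when diagnosing an unexpected delay. A minimal sketch follows; `otel_vars` is a hypothetical helper name, not part of buildx.

```shell
# Hypothetical helper: list the exported OTEL_* variables that a child process
# (such as docker buildx bake) would inherit from this shell.
otel_vars() {
  # grep exits non-zero when nothing matches; treat that as "none set".
  env | grep '^OTEL_' || true
}

# Example usage:
#   otel_vars
```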
I chose OTEL_TRACES_EXPORTER in my reproduction because, in the absence of other OTEL variables, it consistently gives a ten-second lag. But if my example had used OTEL_EXPORTER_OTLP_ENDPOINT, it would be unlikely that one value would be 'correct' both for inclusion in the image and for buildx telemetry. And there would be no way to provide each (buildx itself, and the image being created) with its own 'correct' value.
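One possible workaround, not verified against this exact setup, is to keep the value out of the environment entirely and override the target's build arg with bake's `--set` flag. The sketch below assumes the `default` target from the bake file above. `bake_with_arg` and the `DOCKER` override are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: pass a build arg to a bake target with --set instead of
# exporting it as an environment variable that buildx itself might also read.
# DOCKER can be overridden (for example with "echo") to preview the command.
bake_with_arg() {
  local target name value
  target=$1
  name=$2
  value=$3
  ${DOCKER:-docker} buildx bake --set "${target}.args.${name}=${value}" "${target}"
}

# Example usage (requires Docker and the bake file above):
#   bake_with_arg default OTEL_TRACES_EXPORTER otlp
```

This only covers the "I only want to influence my bake file" case. It does not give the environment variable itself two separate meanings.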
Though a fix for this would likely be low priority (on the bake side), I thought your thoughts on potential solutions might help influence what the compose folks do. I noticed that #2447 seems somewhat related (esp. the solutions/strategies discussed).