Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
196 changes: 107 additions & 89 deletions flytekit/image_spec/default_builder.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
import json
import os
import platform
import re
import shutil
import subprocess
Expand Down Expand Up @@ -158,6 +157,49 @@
echo "export PATH=$$PATH" >> $$HOME/.profile
""")

NIX_DOCKER_FILE_TEMPLATE = Template("""\
# Use Ubuntu as base instead of nixpkgs/nix for better compatibility
FROM ubuntu:24.04

# Install curl and other basic dependencies needed for the installer
RUN apt-get update -y && \
apt-get install -y \
curl \
sudo \
xz-utils \
git \
ca-certificates \
rsync && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

# Install Nix using cache mount so it persists across builds
RUN --mount=type=cache,target=/nix,id=nix-determinate \
curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | \
sh -s -- install linux \
--determinate \
--extra-conf "sandbox = true" \
--extra-conf "max-substitution-jobs = 256" \
--extra-conf "http-connections = 256" \
--extra-conf "download-buffer-size = 1073741824" \
--init none \
--no-confirm

# Create a working directory for the build
WORKDIR /build

# Build with cache mount - reuses the same cache across builds
RUN --mount=type=bind,source=.,target=/build/ \
--mount=type=cache,target=/nix,id=nix-determinate \
--mount=type=cache,target=/root/.cache/nix,id=nix-git-cache \
--mount=type=cache,target=/var/lib/containers/cache,id=container-cache \
. /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh && \
nix run .#docker.copyTo -- docker://$IMAGE_NAME --dest-creds "AWS:$ECR_TOKEN" \
--image-parallel-copies 32 \
--dest-creds "AWS:$ECR_TOKEN"
Comment on lines +197 to +199
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Duplicate --dest-creds flag in NIX_DOCKER_FILE_TEMPLATE

The NIX_DOCKER_FILE_TEMPLATE passes --dest-creds "AWS:$ECR_TOKEN" twice to the nix run .#docker.copyTo command.

Root Cause

At flytekit/image_spec/default_builder.py:197-199, the template contains:

nix run .#docker.copyTo -- docker://$IMAGE_NAME --dest-creds "AWS:$ECR_TOKEN" \
--image-parallel-copies 32 \
--dest-creds "AWS:$ECR_TOKEN"

The --dest-creds argument appears on both line 197 and line 199. Depending on how skopeo (the underlying tool for copyTo) handles duplicate flags, this could cause unexpected behavior or errors. At minimum it's redundant; at worst it could cause authentication failures if the tool doesn't handle duplicate credential flags gracefully.

Impact: Every nix-based image build will pass duplicate credential flags to the copy tool.

Suggested change
nix run .#docker.copyTo -- docker://$IMAGE_NAME --dest-creds "AWS:$ECR_TOKEN" \
--image-parallel-copies 32 \
--dest-creds "AWS:$ECR_TOKEN"
nix run .#docker.copyTo -- docker://$IMAGE_NAME --dest-creds "AWS:$ECR_TOKEN" \
--image-parallel-copies 32
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

""")


def get_flytekit_for_pypi():
"""Get flytekit version on PyPI."""
from flytekit import __version__
Expand Down Expand Up @@ -729,7 +771,16 @@ def create_docker_context(image_spec: ImageSpec, tmp_dir: Path):
uv_venv_install = ""

if image_spec.nix:
return
docker_content = NIX_DOCKER_FILE_TEMPLATE.substitute(
IMAGE_NAME=image_spec.image_name(),
COPY_LOCAL_PACKAGES=copy_local_packages,
ECR_TOKEN=subprocess.run(
["aws", "ecr", "get-login-password", "--region", "us-west-2"],
capture_output=True,
text=True,
check=True,
).stdout.strip(),
)
Comment on lines 773 to +783
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 ECR token fetched unconditionally in create_docker_context for nix builds, even when not pushing

create_docker_context always calls aws ecr get-login-password when nix=True, even when the build is local-only (push=False) or has no registry configured.

Root Cause

At flytekit/image_spec/default_builder.py:773-782, the nix branch unconditionally runs:

ECR_TOKEN=subprocess.run(
    ["aws", "ecr", "get-login-password", "--region", "us-west-2"],
    capture_output=True, text=True, check=True,
).stdout.strip(),

This is called from _build_image at line 883 via create_docker_context(image_spec, tmp_path) — the push parameter is not passed through. In the previous code (before the revert), the ECR token was only fetched when push and image_spec.registry was true (flytekit/image_spec/default_builder.py:826-829 in the old code).

This means:

  1. Running _build_image(spec, push=False) with nix=True will fail with a CalledProcessError if AWS CLI isn't configured, even though no push is intended.
  2. Running with nix=True but no ECR registry will also unnecessarily fail.
  3. The ECR token is baked into the Dockerfile as plaintext, which is a credential exposure risk (the token is written to disk and embedded in Docker build layers).

Impact: Nix builds are broken for local-only testing (push=False) and for non-ECR registries. Additionally, ECR credentials are exposed in the Dockerfile on disk.

Prompt for agents
In flytekit/image_spec/default_builder.py, the create_docker_context function at line 773-783 unconditionally fetches an ECR token when nix=True. This should be conditional on whether a push is actually needed and whether the registry is ECR. However, create_docker_context does not receive the push parameter. The fix requires either: (1) passing a push parameter to create_docker_context and only fetching the ECR token when push=True and the registry is ECR, or (2) moving the ECR token fetching logic out of create_docker_context and into _build_image where the push parameter is available, and passing it as a parameter to the template substitution. Additionally, consider using Docker build secrets (--secret) instead of embedding the token in the Dockerfile to avoid credential exposure.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

else:
docker_content = DOCKER_FILE_TEMPLATE.substitute(
UV_PYTHON_INSTALL_COMMAND=uv_python_install_command,
Expand Down Expand Up @@ -786,6 +837,8 @@ def build_image(self, image_spec: ImageSpec) -> str:
)

def _build_image(self, image_spec: ImageSpec, *, push: bool = True) -> str:
# For testing, set `push=False`` to just build the image locally and not push to
# registry
unsupported_parameters = [
name
for name, value in vars(image_spec).items()
Expand All @@ -795,102 +848,67 @@ def _build_image(self, image_spec: ImageSpec, *, push: bool = True) -> str:
msg = f"The following parameters are unsupported and ignored: {unsupported_parameters}"
warnings.warn(msg, UserWarning, stacklevel=2)

# Check if build tools are available
import shutil

if image_spec.use_depot:
if not shutil.which("depot"):
raise RuntimeError(
"Depot is not installed or not in PATH. "
"Please install depot (https://depot.dev/docs/installation) or use Docker instead by setting use_depot=False"
)
else:
if not shutil.which("docker"):
raise RuntimeError(
"Docker is not installed or not in PATH. "
"Please install Docker (https://docs.docker.com/get-docker/) or use depot by setting use_depot=True"
)

# Check if Docker daemon is running
try:
result = run(["docker", "info"], capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(
f"Docker daemon is not running or not accessible. Error: {result.stderr}\n"
"Please start Docker daemon or use depot by setting use_depot=True"
)
except Exception as e:
raise RuntimeError(
f"Failed to check Docker daemon status: {str(e)}\n"
"Please ensure Docker is properly installed and running, or use depot by setting use_depot=True"
)
Comment on lines +868 to +879
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 except Exception catches and re-wraps the RuntimeError raised for non-running Docker daemon

When the Docker daemon check fails (returncode != 0), the specific RuntimeError is caught by the broad except Exception handler and re-raised with a less informative message.

Root Cause

At flytekit/image_spec/default_builder.py:868-879:

try:
    result = run(["docker", "info"], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"Docker daemon is not running or not accessible. Error: {result.stderr}\n"
            "Please start Docker daemon or use depot by setting use_depot=True"
        )
except Exception as e:
    raise RuntimeError(
        f"Failed to check Docker daemon status: {str(e)}\n"
        "Please ensure Docker is properly installed and running, or use depot by setting use_depot=True"
    )

When result.returncode != 0, the RuntimeError at line 871 is raised inside the try block. This is immediately caught by except Exception as e at line 875, and a new RuntimeError is raised wrapping the original. The user sees a confusing double-wrapped message like:

Failed to check Docker daemon status: Docker daemon is not running or not accessible. Error: ...
Please start Docker daemon or use depot by setting use_depot=True
Please ensure Docker is properly installed and running, or use depot by setting use_depot=True

The except block was intended to catch errors from run() itself (e.g., FileNotFoundError), not from the explicit RuntimeError raise.

Impact: Users see a confusing, double-wrapped error message when Docker daemon is not running.

Suggested change
try:
result = run(["docker", "info"], capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(
f"Docker daemon is not running or not accessible. Error: {result.stderr}\n"
"Please start Docker daemon or use depot by setting use_depot=True"
)
except Exception as e:
raise RuntimeError(
f"Failed to check Docker daemon status: {str(e)}\n"
"Please ensure Docker is properly installed and running, or use depot by setting use_depot=True"
)
# Check if Docker daemon is running
try:
result = run(["docker", "info"], capture_output=True, text=True)
except Exception as e:
raise RuntimeError(
f"Failed to check Docker daemon status: {str(e)}\n"
"Please ensure Docker is properly installed and running, or use depot by setting use_depot=True"
)
if result.returncode != 0:
raise RuntimeError(
f"Docker daemon is not running or not accessible. Error: {result.stderr}\n"
"Please start Docker daemon or use depot by setting use_depot=True"
)
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.


with tempfile.TemporaryDirectory() as tmp_dir:
tmp_path = Path(tmp_dir)
create_docker_context(image_spec, tmp_path)

if image_spec.nix:
if not shutil.which("nix"):
raise RuntimeError(
"Nix is not installed or not in PATH. "
"Please install Nix (https://nixos.org/download)"
)
platform_to_nix_system = {
"linux/amd64": "x86_64-linux",
"linux/arm64": "aarch64-linux",
}
nix_system = platform_to_nix_system.get(image_spec.platform)
if not nix_system:
raise RuntimeError(
f"Unsupported platform for nix builds: {image_spec.platform}. "
f"Supported: {', '.join(platform_to_nix_system.keys())}"
)
os_suffix = "darwin" if platform.system() == "Darwin" else "linux"
machine_to_nix = {"x86_64": f"x86_64-{os_suffix}", "aarch64": f"aarch64-{os_suffix}", "arm64": f"aarch64-{os_suffix}"}
local_system = machine_to_nix.get(platform.machine(), f"x86_64-{os_suffix}")
is_cross_build = nix_system != local_system

if push and image_spec.registry:
ecr_token = subprocess.run(
["aws", "ecr", "get-login-password", "--region", "us-west-2"],
capture_output=True, text=True, check=True,
).stdout.strip()
if is_cross_build:
docker_attr = f"packages.{local_system}.docker-{nix_system}.copyTo"
click.secho(f"Cross-build: {nix_system} image via {local_system} n2c", fg="yellow")
else:
docker_attr = f"packages.{nix_system}.docker.copyTo"
command = [
"nix", "run",
f"path:{tmp_dir}#{docker_attr}", "--",
f"docker://{image_spec.image_name()}",
"--dest-creds", f"AWS:{ecr_token}",
"--image-parallel-copies", "32",
]
else:
command = ["nix", "build", f"path:{tmp_dir}#packages.{nix_system}.docker"]
elif image_spec.use_depot:
if not shutil.which("depot"):
raise RuntimeError(
"Depot is not installed or not in PATH. "
"Please install depot (https://depot.dev/docs/installation) or use Docker instead by setting use_depot=False"
)
if image_spec.use_depot:
command = [
"depot", "build",
"--tag", f"{image_spec.image_name()}",
"--platform", image_spec.platform,
"depot",
"build",
"--tag",
f"{image_spec.image_name()}",
"--platform",
image_spec.platform,
]
if image_spec.registry and push:
command.append("--push")
command.append(tmp_dir)
if image_spec.nix:
command.extend(["--project", "bf5bv9t2mj"])
else:
if not shutil.which("docker"):
raise RuntimeError(
"Docker is not installed or not in PATH. "
"Please install Docker (https://docs.docker.com/get-docker/)"
)
try:
result = run(["docker", "info"], capture_output=True, text=True)
except Exception as e:
raise RuntimeError(
f"Failed to check Docker daemon status: {str(e)}\n"
"Please ensure Docker is properly installed and running."
)
if result.returncode != 0:
raise RuntimeError(
f"Docker daemon is not running or not accessible. Error: {result.stderr}\n"
"Please start Docker daemon."
)
command = [
"docker", "image", "build",
"--tag", f"{image_spec.image_name()}",
"--platform", image_spec.platform,
"docker",
"image",
"build",
"--tag",
f"{image_spec.image_name()}",
"--platform",
image_spec.platform,
]
if image_spec.registry and push:
command.append("--push")
command.append(tmp_dir)

log_command = list(command)
for i, arg in enumerate(log_command):
if arg == "--dest-creds" and i + 1 < len(log_command):
log_command[i + 1] = "[REDACTED]"
click.secho(f"Run command: {' '.join(log_command)} ", fg="blue")
result = run(command)
if result.returncode != 0:
raise RuntimeError(
f"Build command failed with exit code {result.returncode}: "
f"{' '.join(log_command)}"
)

if image_spec.registry and push and not image_spec.nix:
command.append("--push")
command.append(tmp_dir)

concat_command = " ".join(command)
click.secho(f"Run command: {concat_command} ", fg="blue")
run(command, check=True)
return image_spec.image_name()
4 changes: 2 additions & 2 deletions flytekit/image_spec/image_spec.py
Original file line number Diff line number Diff line change
Expand Up @@ -248,7 +248,7 @@ class ImageSpec:
If the option is set by the user, then that option is of course used.
copy: List of files/directories to copy to /root. e.g. ["src/file1.txt", "src/file2.txt"]
python_exec: Python executable to use for install packages
use_depot: Whether to use depot to build the image. If True, the image will be built using depot. If False, the image will be built using docker. Defaults to False (docker).
use_depot: Whether to use depot to build the image. If True, the image will be built using depot. If False, the image will be built using docker.
uv_export_args: Extra arguments to pass to uv export.
vendor_local: Whether to vendor the local project into the image.
nix: Whether to use nix to build the image. If True, the image will be built using nix. If False, the image will be built using docker.
Expand Down Expand Up @@ -280,7 +280,7 @@ class ImageSpec:
copy: Optional[List[str]] = None
python_exec: Optional[str] = None
install_project: Optional[bool] = False
use_depot: Optional[bool] = False
use_depot: Optional[bool] = True
uv_export_args: str = ""
vendor_local: Optional[bool] = True
nix: Optional[bool] = False
Expand Down
4 changes: 3 additions & 1 deletion tests/flytekit/unit/core/image_spec/test_error_handling.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ def test_docker_not_installed(self, mock_which):
# Verify
assert "Docker is not installed or not in PATH" in str(exc_info.value)
assert "https://docs.docker.com/get-docker/" in str(exc_info.value)
assert "use_depot=True" in str(exc_info.value)

@patch('shutil.which')
@patch('flytekit.image_spec.default_builder.run')
Expand All @@ -102,6 +103,7 @@ def test_docker_daemon_not_running(self, mock_run, mock_which):

# Verify
assert "Docker daemon is not running" in str(exc_info.value)
assert "use_depot=True" in str(exc_info.value)

@patch('shutil.which')
def test_depot_not_installed(self, mock_which):
Expand Down Expand Up @@ -157,4 +159,4 @@ def test_envd_not_installed(self, mock_which):
# Verify
assert "envd is not installed or not in PATH" in str(exc_info.value)
assert "https://github.com/tensorchord/envd#installation" in str(exc_info.value)
assert "builder='default'" in str(exc_info.value)
assert "builder='default'" in str(exc_info.value)