Skip to content

[image-spec]: fix transitive nix flake path inputs not copied to Docker build context#47

Open
Vervious wants to merge 1 commit intomasterfrom
devin/1772492757-fix-transitive-nix-flake-path-inputs
Open

[image-spec]: fix transitive nix flake path inputs not copied to Docker build context#47
Vervious wants to merge 1 commit intomasterfrom
devin/1772492757-fix-transitive-nix-flake-path-inputs

Conversation

@Vervious
Copy link

@Vervious Vervious commented Mar 2, 2026

Why are the changes needed?

Training CI tests (e.g. training-pretrain-aws-test) fail with:

error: path '/build/local_packages/rust/shared/exautils/flake.nix' does not exist

The existing discover_nix_flake_path_inputs() only finds direct path inputs declared in flake.nix via regex. It does not discover transitive path dependencies recorded in flake.lock — i.e., dependencies of dependencies.

Concrete example in the exa_ml flake:

  1. exa_ml/flake.nixminos2_rust (direct, copied ✓)
  2. minos2_rust/flake.nixrust-exautils (transitive via flake.lock, not copied ✗)
  3. rust-exautils/flake.nixrust-base (transitive via flake.lock, not copied ✗)

When Nix evaluates the root flake inside the Docker build, it reads the root flake.lock which references these transitive paths — but the directories were never copied into local_packages/, so the build fails.

What changes were proposed in this pull request?

flytekit/image_spec/image_spec.py — New function discover_transitive_nix_flake_path_inputs():

  • Parses flake.lock JSON to find all nodes with locked.type == "path" and a non-empty parent chain (transitive deps)
  • Resolves each node's absolute source path by recursively walking the parent chain (parent[-1] = declaring parent; path is relative to that parent's directory)
  • Uses a cache to avoid redundant resolution
  • Skips nodes already handled as direct inputs (via already_resolved set)

flytekit/image_spec/default_builder.py — In _copy_local_packages_and_update_lock():

  • After copying direct flake path inputs, calls discover_transitive_nix_flake_path_inputs() and copies the discovered transitive deps into local_packages/
  • No flake.lock path rewriting is needed for transitive deps because local_packages/ mirrors the monorepo directory structure, preserving relative paths between packages

How was this patch tested?

Verified with an inline integration test that simulates the exa_ml dependency chain (root → minos2_rustrust-exautilsrust-base) and confirms both transitive deps are discovered with correct absolute paths.

No committed unit tests were added — the function is exercised only through CI's Nix image builds.

Human review checklist

  • Verify parent[-1] interpretation: The code assumes the last element of the parent chain in flake.lock v7 is the declaring parent whose directory the path is relative to. Confirm this matches Nix's flake.lock spec.
  • Silent skip on unresolvable transitive deps: When _find_git_root() returns None for a transitive dep, it's silently skipped (continue). Direct inputs raise ValueError instead. Consider whether a warning/error would be more appropriate to surface issues early.
  • Large directory copies: Transitive deps like rust-base could point to the entire rust/ directory. Confirm that copying large subtrees with gitignore filtering doesn't cause excessive build context size/time.
  • Bare except Exception: JSON parsing failure returns empty list silently. Could mask real issues if flake.lock is corrupted.

Link to Devin Session: https://app.devin.ai/sessions/46b404cede304b2b8ce0d143523914d8
Requested by: @Vervious (benchan@exa.ai)

…ild context

discover_nix_flake_path_inputs() only found direct path inputs from
flake.nix. Transitive path dependencies (e.g. rust-exautils referenced
by minos2_rust, rust-base referenced by rust-exautils) were not copied
to local_packages/, causing 'path does not exist' errors during nix
builds in CI.

Add discover_transitive_nix_flake_path_inputs() which parses flake.lock
for all path-type nodes with non-empty parent chains, resolves their
absolute paths by walking the parent chain, and copies them to
local_packages/. Since local_packages/ mirrors the monorepo directory
structure, relative paths between transitive deps are preserved
automatically without needing flake.lock path rewrites.

Co-Authored-By: benchan <ben@vervious.com>
@devin-ai-integration
Copy link

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant