Skip to content

direct: Fix permissions state path to match input config schema#4703

Open
denik wants to merge 93 commits intomainfrom
denik/permissions-reference
Open

direct: Fix permissions state path to match input config schema#4703
denik wants to merge 93 commits intomainfrom
denik/permissions-reference

Conversation

@denik
Copy link
Contributor

@denik denik commented Mar 11, 2026

Changes

  • Add EmbeddedSlice field name convention to struct walkers in libs/structs/ — when a struct field is named EmbeddedSlice, walkers treat it as transparent (no path segment added), so its elements appear directly at the parent path
  • Apply this to PermissionsState: rename Permissions field to EmbeddedSlice, making state paths like resources.jobs.foo.permissions[0] match input config paths (previously resources.jobs.foo.permissions.permissions[0])
  • Change state file version to 2 and introduce automatic migration from 0 & 1 to 2.

Why

The direct deployment engine's permissions state used a wrapper struct that added an extra permissions segment to paths. This caused a mismatch with input config paths, preventing dependency tracking between permissions and their parent resources. With this fix, state and config paths are consistent.

Tests

  • New acceptance & invariant tests for references from/to permissions.
  • New invariant test that checks that bundle deployed with previous fixed version (0.293.0) does not have drift when CLI is upgraded to latest.

@eng-dev-ecosystem-bot
Copy link
Collaborator

eng-dev-ecosystem-bot commented Mar 11, 2026

Commit: 79d09ab

Run: 23202024150

Env 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 7 1 9 268 796 7:35
🟨​ aws windows 7 1 9 270 794 6:06
🔄​ aws-ucws linux 2 7 9 364 711 7:41
🔄​ aws-ucws windows 2 7 9 366 709 5:51
💚​ azure linux 2 11 271 794 6:10
💚​ azure windows 2 11 273 792 4:53
🔄​ azure-ucws linux 2 1 11 369 707 7:21
🔄​ azure-ucws windows 2 1 11 371 705 6:27
💚​ gcp linux 2 11 267 797 5:47
💚​ gcp windows 2 11 269 795 4:18
18 interesting tests: 9 SKIP, 7 KNOWN, 1 flaky, 1 RECOVERED
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 🔄​f 🔄​f 💚​R 💚​R 🔄​f 🔄​f 💚​R 💚​R
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🙈​ TestAccept/bundle/resources/postgres_branches/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/update_protected 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/without_branch_id 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_projects/update_display_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🔄​ TestAccept/ssh/connect-serverless-gpu 🙈​s 🙈​s 🔄​f 🔄​f 🙈​s 🙈​s 🔄​f 🔄​f 🙈​s 🙈​s
💚​ TestAccept/ssh/connection 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
Top 20 slowest tests (at least 2 minutes):
duration env testname
3:28 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:26 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:12 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:11 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:10 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:09 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:09 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:08 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:07 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:47 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:47 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:46 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:46 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:45 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:43 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:31 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:21 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:20 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:06 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:06 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

@denik denik temporarily deployed to test-trigger-is March 11, 2026 10:41 — with GitHub Actions Inactive
@denik denik force-pushed the denik/permissions-reference branch from 66d736a to cf0178e Compare March 11, 2026 16:07
@denik denik temporarily deployed to test-trigger-is March 11, 2026 16:08 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 11, 2026 16:46 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 11, 2026 16:49 — with GitHub Actions Inactive
@denik denik force-pushed the denik/permissions-reference branch from 849df2a to 7824eb6 Compare March 11, 2026 22:31
@denik denik temporarily deployed to test-trigger-is March 11, 2026 22:31 — with GitHub Actions Inactive
Copy link
Member

@simonfaltum simonfaltum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Agent Swarm Review] Verdict: Not ready yet

  • 1 Critical
  • 2 Major
  • 2 Gap
  • 2 Nit
  • 2 Suggestion

The core idea (EmbeddedSlice convention for struct walkers) is sound and well-implemented across libs/structs/. However, there is a critical backward-compatibility issue: the JSON tag json:"_,omitempty" on PermissionsState.EmbeddedSlice means old state files using the "permissions" key will silently load as empty, erasing all permission state. The fix is straightforward: change to json:"permissions,omitempty" since the EmbeddedSlice convention is driven by Go field name, not JSON tag.

See inline comments for details.

return err
}
*p = PermissionsState(raw)
migratePermissionLevel(p.EmbeddedSlice)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Agent Swarm Review] [Critical]

State backward compatibility: JSON key change from "permissions" to "_" breaks old state files.

EmbeddedSlice uses json:"_,omitempty", so old state files containing "permissions": [...] will silently load as empty. The custom UnmarshalJSON only migrates permission_level to level within elements, but does NOT handle the outer key name change. This erases all permission state on the first plan/deploy after upgrading.

Both reviewers independently found this issue and confirmed it in cross-review.

Suggestion: Change to json:"permissions,omitempty". The EmbeddedSlice convention is driven entirely by the Go field name, not the JSON tag. This preserves backward compat, produces readable JSON output, and eliminates the need for key-name migration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do want to rename permissions to _. The breakage will indeed cause a drift for all permissions resources. The drift will be removed on next "bundle deploy". However, given that direct engine is well adopted, I think it makes sense to migrate the state properly.

I bumped the state version to 2 and added migration function that takes old struct and produces a new one.

@denik denik force-pushed the denik/permissions-reference branch from 7824eb6 to 2b2b440 Compare March 13, 2026 12:39
@denik denik temporarily deployed to test-trigger-is March 13, 2026 12:40 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 12:53 — with GitHub Actions Inactive
@denik denik force-pushed the denik/permissions-reference branch from 29d6d69 to d0a4d17 Compare March 13, 2026 14:21
@denik denik temporarily deployed to test-trigger-is March 13, 2026 14:21 — with GitHub Actions Inactive
@denik denik requested a review from simonfaltum March 13, 2026 14:27
@denik denik temporarily deployed to test-trigger-is March 13, 2026 14:29 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 14:35 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 14:37 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 14:54 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 15:06 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 15:19 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 15:30 — with GitHub Actions Inactive
@denik denik temporarily deployed to test-trigger-is March 13, 2026 15:47 — with GitHub Actions Inactive
denik and others added 27 commits March 17, 2026 16:24
Cross-resource permission references don't work in terraform mode:
databricks_job doesn't expose permissions as output attributes, so
${databricks_job.job_b.permissions[0].level} can't be resolved by Terraform.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…comment

Tests both directions of cross-resource references:
- permission fields referencing job tag values (job tag -> permission group_name)
- permission fields referencing another job's permissions (permission -> permission)

Also expands the comment explaining why permission references don't work in
terraform mode (interpolator converts paths but databricks_job has no
permissions output attributes).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds infrastructure to verify the current CLI can deploy on top of state
produced by an older version:

- continuity/prepare.py: run with a version (e.g. v0.293.0) to deploy each
  invariant config using that CLI version and save the resulting state files
  to continuity/<version>/<config>.state.json. Uses -forcerun to bypass the
  Local=false guard so it only runs when explicitly invoked.

- continuity/prepare/: acceptance test that does the deploy+save; excluded
  from normal CI via Local=false/Cloud=false.

- continuity/v0.293.0/: continuity test that loads the committed state file,
  deploys with the current CLI, then verifies no drift.

- 24 state files generated from v0.293.0 (released 2026-03-11). Resources
  with name-based IDs unrecognised by the mock server are excluded from the
  test (dashboard, database_catalog, secret_scope, synced_database_table).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…CLI at test time

Instead of pre-generating state files from v0.293.0, the test runner now always
downloads and caches v0.293.0 and exposes it as $CLI_293. The continue_293 test
deploys using $CLI_293 against the mock server, then deploys with the current CLI,
verifying no drift. This ensures the state is always consistent with the mock server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Group names (viewers, admins) in job_permission_ref and job_cross_resource_ref
configs don't exist in real workspaces, causing 404 errors on cloud.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace 'viewers' (non-existent in real workspaces) with 'users' (built-in
Databricks group). This allows the job_permission_ref and job_cross_resource_ref
configs to run on cloud without requiring custom group setup.

Remove cloud exclusion from no_drift test that was added to work around the
non-existent group names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ature not in v0.293.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nuity

The new config deploys a job with a permission entry (group 'users' with CAN_VIEW)
and is included in the continue_293 test to verify that the current CLI can deploy
on top of state produced by v0.293.0 without drift.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… variable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Running an older CLI on state produced by a newer CLI is unsupported;
require an upgrade rather than silently treating it as up-to-date.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te/permission_level_migration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These are separate sub-resource adapters with their own entries; walking them
from the parent produced duplicate paths with different types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduce embedFieldIndex to cache the field index per reflect.Type,
so both findEmbedField and findEmbedFieldType share a single sync.Map
lookup and field access becomes an O(1) v.Field(idx) call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@denik denik force-pushed the denik/permissions-reference branch from 334d8b3 to 79d09ab Compare March 17, 2026 15:25
@denik denik enabled auto-merge March 17, 2026 15:25
@denik denik deployed to test-trigger-is March 17, 2026 15:26 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants