-
Notifications
You must be signed in to change notification settings - Fork 66
[multicast] implicit group lifecycle with IP pool integration #9450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This PR also addresses permission models, object deletion, and error handling questions related to reserved addresses presented in @askfongjojo's testing Google Doc (default IP Pools are covered in a follow-up, stacked PR). In thinking through the *Groups* API, permission scopes, and flexibility, @rcgoodfellow mentioned this consideration: > Do we need an explicit notion of a group object at all? Or can > instances simply allocate/deallocate group IPs from pools, and there is > no explicit management of group objects. With Fleet admins having access control to create pools and link silos to a pool, we arrived at the idea of replacing the current explicit multicast group CRUD with an implicit lifecycle, where groups are created upon the first member join and deleted when the last member leaves. **Note**: Most of the PR's changes are test-related due to moving away from the explicit multicast group(s) lifecycle. Auth Model: - Discovery (fleet-scoped): - Read/list groups and list members: any authenticated user in the same fleet. - Membership (project-scoped): - Join/leave requires Instance::Modify on the specific instance. - Creation control: - Implicit group creation only when the s silo is linked to a suitable multicast pool (by name or by explicit IP in that pool). Behavior: - Implicit lifecycle: - Create on first join (idempotent); delete when last member leaves (atomic mark-for-removal, reconciler schedules cleanup). - Addressing and validation: - Implicit allocation from the s linked multicast pools. - SSM/ASM semantics enforced: - IPv4 SSM 232/8 and IPv6 ff3x::/32 - Error handling: - Reserved/invalid multicast ranges rejected at pool/range add time. API: - Primary flows: - Group-centric member management: POST/DELETE /v1/multicast-groups/{group}/members - Instance-centric join/leave: PUT/DELETE /v1/instances/{instance}/multicast-groups/{group} - Discovery endpoints remain for list/view; there is no explicit group create/update/delete. - This is a *breaking* change, but multicast is not yet enabled or available in production Key changes: - Implicit group model; groups exist while they have members. - IP pool integration for multicast allocation with silo link gating. - Simplified API centered on join/leave flows. - Add multicast_ip to the member table for responses. - For consistency, move to `Instant` type over `SystemTime` for mcast-related caches Follow-ups (stacked PRs) - [ ] Remove MVLAN from group data model. - [ ] Default IP pool support (IPv4/IPv6 Followrequire unicast/multicast). - [ ] Dendrite: use omicron-common constants for validation.
e44bc27 to
1c9d172
Compare
Previously, internal multicast groups accepted admin-scoped addresses
including admin-local (ff04), site-local (ff05), and org-local (ff08).
This narrows the scope to only admin-local (ff04::/16), which is what
Omicron *now* dictates.
- [ ] This should be merged after
oxidecomputer/omicron#9450 is reviewed
and merged into Omicron. We now make Dendrite/Dpd match Omicron
consistently for validation.
Key changes:
- Remove IPV6_SITE_LOCAL_PATTERN and IPV6_ORG_SCOPE_PATTERN from P4
- Update P4 table entries to only match admin-local (size 4→2)
- Add ADMIN_LOCAL_PREFIX const to dpd-types with RFC doc links
- Update validation to use `is_admin_local_multicast()` from oxnet v0.1.4
- Bump to API version 2 for doc changes (only)
- Update README with OpenAPI generation instructions
- Use new multicast subnet constants from `omicron-common` for validation
Previously, internal multicast groups accepted admin-scoped addresses
including admin-local (ff04), site-local (ff05), and org-local (ff08).
This narrows the scope to only admin-local (ff04::/16), which is what
Omicron *now* dictates.
- [ ] This should be merged after
oxidecomputer/omicron#9450 is reviewed
and merged into Omicron. We now make Dendrite/Dpd match Omicron
consistently for validation.
Key changes:
- Remove IPV6_SITE_LOCAL_PATTERN and IPV6_ORG_SCOPE_PATTERN from P4
- Update P4 table entries to only match admin-local (size 4→2)
- Add ADMIN_LOCAL_PREFIX const to dpd-types with RFC doc links
- Update validation to use `is_admin_local_multicast()` from oxnet v0.1.4
- Bump to API version 2 for doc changes (only)
- Update README with OpenAPI generation instructions
- Use new multicast subnet constants from `omicron-common` for validation
|
@zeeshanlakhani - The revised authn model and implicit creation/deletion logic make sense to me. I'll make another pass tomorrow to ensure I didn't miss anything but the mental model of multicast group lifecycle seems straightforward enough. |
This PR adds omdb commands to inspect multicast state:
- `omdb db multicast groups` - list multicast groups with optional
state and pool name filters
- `omdb db multicast members` - list group members with filters for group-id,
group-name, group-ip, state, and sled-id
- `omdb db multicast info` - show detailed info for a specific group
- `omdb db multicast pools` - list multicast IP pools
We also include:
- Background task status display for multicast_reconciler
- Integration tests for all multicast omdb commands
Follows the multicast lifecycle work in
#9450.
Introduce API version `VERSION_MULTICAST_IMPLICIT_LIFECYCLE_UPDATES`
(v2025120500) to support the transition from explicit to implicit
multicast group lifecycle management.
Changes in new API version:
- Groups are created implicitly when first member joins
- Groups are deleted implicitly when last member leaves
- Instance create/update accept `MulticastGroupIdentifier` (name, UUID,
or multicast IP address) instead of just `NameOrId`
- MulticastGroupMemberAdd now has optional `source_ips` for SSM
Backward compatibility (v20251120):
- Add `v20251120` module with compatibility types using `NameOrId`
- Explicit group create/update/delete endpoints marked deprecated
- Proper base64 validation for user_data via shared UserData serde helper
Also includes:
- Add version_policy to techport server for omdb compatibility
Includes: - Remove GLOP (233/8), admin-scoped (239/8), and specific reserved address (NTP, Cisco Auto-RP, PTP) restrictions from IP pool validation - Only link-local multicast (224.0.0.0/24) is now rejected (not routable) - Add ASM pool fallback when join-by-name with source_ips finds no SSM pool linked - Allow source filtering on ASM addresses (IGMPv3/MLDv2 supports this) - SSM addresses still require sources per RFC 4607 The previous restrictions were overly conservative. Customers may have legitimate use cases for GLOP (AS-based allocations), admin-scoped (organization-local multicast), and protocol-specific addresses.
…rce TODO wrt Dendrite
Includes:
- Add shared `put_upsert` helper for idempotent PUT+CREATED requests, for 201 responses
- Add pool_selection.rs tests for SSM/ASM fallback behavior
- ASM sources TODO/workaround:
- Only send sources to DPD for SSM groups (232/8 IPv4, ff3x:: IPv6)
- ASM groups get `None` for sources, meaning "any source allowed"
- Temporary fix until dendrite accepts ASM source filtering (upcoming PR)
- Schema
- Bump version 213.0. 214.0.0 (post-merge_
|
I'll update the schema/versions next review @rcgoodfellow @internet-diglett. |
Previously, each silo could only have one default IP pool. This change allows one default pool per (pool_type, ip_version) combination, enabling silos to have separate defaults for: - Unicast IPv4 - Unicast IPv6 - Multicast IPv4 - Multicast IPv6 This work previously branched off #9451, but is now off `main`, involving changes that have to do with the mcast lifecycle changes. Includes: - Each default can now be set or unset and demoted independently. Unsetting the unicast IPv4 default does not affect the multicast IPv4 default, for example. - Add `pool_type` and `ip_version` columns to `ip_pool_resource` (denormalized from parent `ip_pool` for unique index) - Replace unique index with partial index on (resource_id, pool_type, ip_version) WHERE is_default = true - Rename `IpPoolResourceLink` to `IncompleteIpPoolResource` to reflect that pool_type/ip_version are actually populated by the linking query - Add `ip_version` field to API params for default pool disambiguation - API versioning for backwards compatibility with older clients
…zl/mcast-implicit-lifecycle
SSM addresses must always use specific_sources. This adds the `SourceFilterState` type with a `has_any_source_member` flag so SSM ignores it (always filters) while ASM respects it. [multicast] Fix SSM source filtering per RFC 4607
…zl/mcast-implicit-lifecycle
rcgoodfellow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zeeshanlakhani my comments have been addressed. Sorry for the delay in getting back to this.
When joining a multicast group by name, if multiple default multicast pools of different IP versions (IPv4/IPv6) are linked to the silo, the API now accepts an optional `ip_version` parameter to select the correct pool.
Includes:
- Split multicast-implicit-lifecycle migration into 8 files (per DDL)
- Align column ordering (schema.rs, models) with dbinit.sql
- Fix raw SQL column order in group allocation and member attach CTEs
- Expand dataplane trait for DPD handling
- Update RPW reconciler for per-member source filtering
- Add comprehensive tests post IPv6 merge
- Revert changes to merged multicast-group-support migration
…ecycle Includes fixes within the conflict(s): - Versioned API cleanup: direct infallible conversions in v2025121200 (removed .expect() through v2026010300), removed unused duplicate impl - Extended path_param! macro to support custom types (MulticastGroupIdentifier) - DB constraint consistency: aligned params.rs and migrations for IPv4/IPv6 blocked ranges (up10, up11) - Test cleanup: clear out some redundancy
This PR also addresses permission models, object deletion, and error handling questions related to reserved addresses presented in @askfongjojo's testing Google Doc (default IP Pools are covered in a follow-up, stacked PR).
In thinking through the Groups API, permission scopes, and flexibility, @rcgoodfellow mentioned this consideration:
With Fleet admins having access control to create pools and link silos to a pool, we arrived at the idea of replacing the current explicit multicast group CRUD with an implicit lifecycle, where groups are created upon the first member join and deleted when the last member leaves.
Note: Most of the PR's changes are test-related due to moving away from the explicit multicast group(s) lifecycle.
Auth Model:
Behavior:
API:
Key changes:
Instanttype overSystemTimefor mcast-related cachesThis also fixes the flaky test issue in #9588.
Follow‑ups (stacked [and related] PRs)
omdbadditions for multicast introspection -> [multicast] add omdb mcast commands for groups, members, pools #9464