-
Notifications
You must be signed in to change notification settings - Fork 30
fix(declarative): Support nested $ref patterns in dynamic stream templates (do not merge) #708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix(declarative): Support nested $ref patterns in dynamic stream templates (do not merge) #708
Conversation
- Add generic object support to stream_template field in DynamicDeclarativeStream - Add generic object support to requester field in SimpleRetriever schemas - Add generic object support to stream field in ParentStreamConfig - Fixes validation regression introduced in PR #560 (commit f52bc2e) - Enables Airtable connector validation while maintaining backward compatibility - Allows with additional properties (e.g. path, http_method) Resolves validation issue: 'Validation against json schema defined in declarative_component_schema.yaml schema failed' Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1723407187-fix-dynamic-stream-template-validation#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1723407187-fix-dynamic-stream-template-validation Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a schema validation regression that prevented the Airtable connector from validating against the declarative component schema. The issue was caused by overly restrictive validation that only accepted direct $ref
patterns, breaking Airtable's nested reference pattern.
- Added generic
$ref
object support to multiple schema fields to allow nested reference patterns - Maintained backward compatibility while enabling more flexible reference patterns
- Fixed validation errors that prevented Airtable connector from working in the Connector Builder UI
type: string | ||
pattern: "^#/definitions/" | ||
required: ["$ref"] | ||
additionalProperties: true |
Copilot
AI
Aug 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting additionalProperties: true
allows any additional properties beyond $ref
, which could lead to unexpected behavior or security issues. Consider restricting this to only allow known safe properties or set to false
if additional properties aren't necessary.
additionalProperties: true | |
additionalProperties: false |
Copilot uses AI. Check for mistakes.
properties: | ||
$ref: | ||
type: string | ||
pattern: "^#/definitions/" |
Copilot
AI
Aug 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern ^#/definitions/
only validates the prefix but doesn't prevent potentially malicious or malformed references like #/definitions/../../../sensitive/path
. Consider using a more restrictive pattern that validates the complete reference structure.
pattern: "^#/definitions/" | |
pattern: "^#/definitions/[A-Za-z0-9_-]+$" |
Copilot uses AI. Check for mistakes.
📝 WalkthroughWalkthroughExpanded the declarative component schema to accept an inline object wrapper containing a $ref to definitions in six anyOf locations, alongside existing direct $ref usage. No runtime or control-flow changes; only schema validation broadened to allow both reference shapes. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested labels
Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (6)
1511-1517
: Broadened retriever ref shape looks good; consider de-duplicating the wrapper schema.Allowing an inline object with a local "$ref" plus extra fields is spot on for nested and embellished refs, and the pattern anchor restricts external refs. Would you consider extracting this wrapper into a shared definition (e.g., LocalManifestRefObject) to avoid repeating the same schema in six places and prevent future drift, wdyt?
Example:
+ LocalManifestRefObject: + description: Inline wrapper for a local manifest reference under the top-level 'definitions' of the manifest. + type: object + properties: + $ref: + type: string + pattern: "^#/definitions/" + required: ["$ref"] + additionalProperties: trueThen here:
- - type: object - properties: - $ref: - type: string - pattern: "^#/definitions/" - required: ["$ref"] - additionalProperties: true + - "$ref": "#/definitions/LocalManifestRefObject"
2424-2430
: LGTM; same inline $ref wrapper for DynamicSchemaLoader.retriever.This mirrors the earlier change and maintains consistency. Would you also swap this repeated block for the shared LocalManifestRefObject if you adopt the de-dup refactor above, wdyt?
3231-3237
: LGTM; ParentStreamConfig.stream now supports the wrapper.This unblocks nested parent stream refs while keeping local-only restriction. If you adopt the shared definition, would you replace this block as well, wdyt?
3678-3684
: LGTM; SimpleRetriever.requester now supports wrapper refs.Nice for reuse across retriever shapes. Same de-duplication suggestion applies here if you add a LocalManifestRefObject, wdyt?
4233-4239
: LGTM; HttpComponentsResolver.retriever updated consistently.Good consistency across components. Would you also add a brief description to the wrapper definition (or to the shared one) clarifying it points to the manifest’s top-level 'definitions' (not JSON Schema defs), wdyt?
4374-4380
: Fix for DynamicDeclarativeStream.stream_template looks correct; please add tests for regression coverage.This addresses the regression for nested stream templates while preventing external refs; could you add/extend tests to cover:
- passing cases: direct "$ref" and wrapper form,
- wrapper with extra fields (e.g., path/http_method) accepted,
- rejection of external refs (http://, file://),
- optionally a negative case where the wrapper points to a non-stream definition to document expected behavior?
Happy to draft test cases if helpful, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: destination-motherduck
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
fix(declarative): Support nested $ref patterns in dynamic stream templates
Summary
This PR fixes a schema validation regression introduced in PR #560 (commit f52bc2e) that prevented the Airtable connector from validating against the declarative component schema. The issue occurred when the
stream_template
field inDynamicDeclarativeStream
was changed from a simple$ref
to ananyOf
structure that only accepted direct references toDeclarativeStream
orStateDelegatingStream
, breaking Airtable's nested reference pattern{"$ref": "#/definitions/streams/airtable_stream"}
.Key Changes:
$ref
object support tostream_template
field inDynamicDeclarativeStream
$ref
object support torequester
field in multiple SimpleRetriever schema definitions$ref
object support tostream
field inParentStreamConfig
$ref
objects with additional properties (e.g.,path
,http_method
) as used by AirtableThe fix maintains backward compatibility while enabling more flexible reference patterns needed by dynamic stream connectors like Airtable.
Review & Testing Checklist for Human
$ref
pattern support doesn't break existing connectors that use direct references^#/definitions/
pattern prevents invalid external references while allowing valid nested referencesRecommended Test Plan:
$ref
patternsDiagram
Notes
$ref
pattern#/definitions/streams/airtable_stream
Session Details:
Summary by CodeRabbit