Commits
21 commits
ff01ea0
feat: Add metadata models package with dynamic schema download
devin-ai-integration[bot] Oct 21, 2025
63930c6
refactor: Move metadata models to airbyte_cdk.test.models.connector_m…
devin-ai-integration[bot] Oct 21, 2025
62902f6
refactor: Move models to generated subdirectory and add convenience i…
devin-ai-integration[bot] Oct 21, 2025
3de3af0
refactor: Generate metadata models as single file with JSON schema ou…
devin-ai-integration[bot] Oct 21, 2025
933d478
style: Apply ruff formatting to build script
devin-ai-integration[bot] Oct 21, 2025
c89faab
docs: Move metadata models documentation to CONTRIBUTING.md
devin-ai-integration[bot] Oct 27, 2025
6980060
Merge branch 'main' into devin/1760999875-add-metadata-models
aaronsteers Oct 27, 2025
a56208b
chore: revert unrelated format changes on other generated file
aaronsteers Oct 27, 2025
0f48425
Delete airbyte_cdk/test/models/connector_metadata/README.md
aaronsteers Oct 27, 2025
07d7014
docs: clean up docstring (merged content from `README.md`)
aaronsteers Oct 27, 2025
c63223a
feat: Replace HTTP downloads with sparse git clone for metadata schemas
devin-ai-integration[bot] Oct 27, 2025
da4371f
Revert accidental formatting of generated file declarative_component_…
devin-ai-integration[bot] Oct 27, 2025
5373480
Add exclusions for auto-generated files in ruff and pre-commit configs
devin-ai-integration[bot] Oct 27, 2025
7e4e3f4
Fix JSON schema consolidation to properly resolve references
devin-ai-integration[bot] Oct 27, 2025
015a60e
Remove $schema and $id from definitions to fix IDE validation
devin-ai-integration[bot] Oct 27, 2025
fe4b9cc
Refactor: Extract metadata generation into separate script
devin-ai-integration[bot] Oct 27, 2025
66d4eeb
Move metadata generation to poe tasks instead of shell script
devin-ai-integration[bot] Oct 27, 2025
23837eb
Replace Dagger with uvx in metadata generation script
devin-ai-integration[bot] Oct 27, 2025
c686574
Simplify metadata generation: generate Python from JSON schema
devin-ai-integration[bot] Oct 27, 2025
ba912fe
Fix schema consolidation per CodeRabbit feedback
devin-ai-integration[bot] Oct 27, 2025
3c2a4f8
Add type annotations and fix formatting
devin-ai-integration[bot] Oct 27, 2025
79 changes: 79 additions & 0 deletions airbyte_cdk/metadata_models/README.md
@@ -0,0 +1,79 @@
# Airbyte Metadata Models

This package contains Pydantic models for validating Airbyte connector `metadata.yaml` files.

## Overview

The models are automatically generated from JSON Schema YAML files maintained in the [airbytehq/airbyte](https://github.com/airbytehq/airbyte) repository at:
```
airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src/
```

During the CDK build process (`poetry run poe build`), these schemas are downloaded from GitHub and used to generate Pydantic models via `datamodel-code-generator`.

## Usage

### Validating a metadata.yaml file

```python
from pathlib import Path
import yaml
from airbyte_cdk.metadata_models import ConnectorMetadataDefinitionV0

# Load metadata.yaml
metadata_path = Path("path/to/metadata.yaml")
metadata_dict = yaml.safe_load(metadata_path.read_text())

# Validate using Pydantic
try:
    metadata = ConnectorMetadataDefinitionV0(**metadata_dict)
    print("✓ Metadata is valid!")
except Exception as e:
    print(f"✗ Validation failed: {e}")
```

### Accessing metadata fields

```python
from airbyte_cdk.metadata_models import ConnectorMetadataDefinitionV0

metadata = ConnectorMetadataDefinitionV0(**metadata_dict)

# Access fields with full type safety
print(f"Connector: {metadata.data.name}")
print(f"Docker repository: {metadata.data.dockerRepository}")
print(f"Docker image tag: {metadata.data.dockerImageTag}")
print(f"Support level: {metadata.data.supportLevel}")
```

### Available models

The main model is `ConnectorMetadataDefinitionV0`, which includes nested models for:

- `ConnectorType` - Source or destination
- `ConnectorSubtype` - API, database, file, etc.
- `SupportLevel` - Community, certified, etc.
- `ReleaseStage` - Alpha, beta, generally_available
- `ConnectorBreakingChanges` - Breaking change definitions
- `ConnectorReleases` - Release information
- `AllowedHosts` - Network access configuration
- And many more...
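As a minimal illustration of how these generated classes behave, the sketch below uses a hypothetical stand-in model (the field names and enum values are illustrative, not the real generated code): pydantic v1-style models with `Extra.forbid` reject unknown fields, and plain strings are coerced into enum members on validation.

```python
from enum import Enum

from pydantic.v1 import BaseModel, Extra, ValidationError


class SupportLevel(Enum):  # illustrative values only
    community = "community"
    certified = "certified"


class ConnectorData(BaseModel):
    class Config:
        extra = Extra.forbid  # unknown fields are rejected, as in the generated models

    name: str
    supportLevel: SupportLevel


# Valid input: the string is coerced into an enum member.
data = ConnectorData(name="source-example", supportLevel="certified")
assert data.supportLevel is SupportLevel.certified

# Invalid input: an unexpected field raises a ValidationError under Extra.forbid.
try:
    ConnectorData(name="source-example", supportLevel="certified", extra_field=1)
except ValidationError:
    print("rejected unknown field")
```

The real models follow the same pattern, so a typo'd key in `metadata.yaml` surfaces as a `ValidationError` rather than being silently ignored (except for models like `AllowedHosts` that explicitly use `Extra.allow`).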

## Regenerating Models

Models are regenerated automatically when you run:

```bash
poetry run poe build
```

This command:
1. Downloads the latest schema YAML files from the airbyte repository
2. Generates Pydantic models using `datamodel-code-generator`
3. Outputs models to `airbyte_cdk/metadata_models/generated/`
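The steps above can be sketched as shell commands. This is an illustrative approximation, not the actual poe task: paths and flags are assumptions, though the commit history indicates the real implementation uses a sparse git clone and runs `datamodel-code-generator` via `uvx`.

```shell
# 1. Fetch only the schema sources via a sparse clone of the airbyte monorepo (illustrative).
git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/airbytehq/airbyte.git /tmp/airbyte
git -C /tmp/airbyte sparse-checkout set \
  airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src

# 2. Generate Pydantic models from the JSON Schema YAML files.
uvx --from datamodel-code-generator datamodel-codegen \
  --input /tmp/airbyte/airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src \
  --input-file-type jsonschema \
  --output airbyte_cdk/metadata_models/generated/
```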

## Schema Source

The authoritative schemas are maintained in the [airbyte monorepo](https://github.com/airbytehq/airbyte/tree/master/airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src).

Any changes to metadata validation should be made there, and will be automatically picked up by the CDK build process.
1 change: 1 addition & 0 deletions airbyte_cdk/metadata_models/__init__.py
@@ -0,0 +1 @@
from .generated import *
48 changes: 48 additions & 0 deletions airbyte_cdk/metadata_models/generated/ActorDefinitionResourceRequirements.py
@@ -0,0 +1,48 @@
# generated by datamodel-codegen:
# filename: ActorDefinitionResourceRequirements.yaml

from __future__ import annotations

from enum import Enum
from typing import List, Optional

from pydantic.v1 import BaseModel, Extra, Field


class ResourceRequirements(BaseModel):
    class Config:
        extra = Extra.forbid

    cpu_request: Optional[str] = None
    cpu_limit: Optional[str] = None
    memory_request: Optional[str] = None
    memory_limit: Optional[str] = None


class JobType(Enum):
    get_spec = "get_spec"
    check_connection = "check_connection"
    discover_schema = "discover_schema"
    sync = "sync"
    reset_connection = "reset_connection"
    connection_updater = "connection_updater"
    replicate = "replicate"


class JobTypeResourceLimit(BaseModel):
    class Config:
        extra = Extra.forbid

    jobType: JobType
    resourceRequirements: ResourceRequirements


class ActorDefinitionResourceRequirements(BaseModel):
    class Config:
        extra = Extra.forbid

    default: Optional[ResourceRequirements] = Field(
        None,
        description="if set, these are the requirements that should be set for ALL jobs run for this actor definition.",
    )
    jobSpecific: Optional[List[JobTypeResourceLimit]] = None
39 changes: 39 additions & 0 deletions airbyte_cdk/metadata_models/generated/AirbyteInternal.py
@@ -0,0 +1,39 @@
# generated by datamodel-codegen:
# filename: AirbyteInternal.yaml

from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic.v1 import BaseModel, Extra, Field


class Sl(Enum):
    integer_0 = 0
    integer_100 = 100
    integer_200 = 200
    integer_300 = 300


class Ql(Enum):
    integer_0 = 0
    integer_100 = 100
    integer_200 = 200
    integer_300 = 300
    integer_400 = 400
    integer_500 = 500
    integer_600 = 600


class AirbyteInternal(BaseModel):
    class Config:
        extra = Extra.allow

    sl: Optional[Sl] = None
    ql: Optional[Ql] = None
    isEnterprise: Optional[bool] = False
    requireVersionIncrementsInPullRequests: Optional[bool] = Field(
        True,
        description="When false, version increment checks will be skipped for this connector",
    )
18 changes: 18 additions & 0 deletions airbyte_cdk/metadata_models/generated/AllowedHosts.py
@@ -0,0 +1,18 @@
# generated by datamodel-codegen:
# filename: AllowedHosts.yaml

from __future__ import annotations

from typing import List, Optional

from pydantic.v1 import BaseModel, Extra, Field


class AllowedHosts(BaseModel):
    class Config:
        extra = Extra.allow

    hosts: Optional[List[str]] = Field(
        None,
        description="An array of hosts that this connector can connect to. AllowedHosts not being present for the source or destination means that access to all hosts is allowed. An empty list here means that no network access is granted.",
    )
70 changes: 70 additions & 0 deletions airbyte_cdk/metadata_models/generated/ConnectorBreakingChanges.py
@@ -0,0 +1,70 @@
# generated by datamodel-codegen:
# filename: ConnectorBreakingChanges.yaml

from __future__ import annotations

from datetime import date
from enum import Enum
from typing import Any, Dict, List, Optional

from pydantic.v1 import AnyUrl, BaseModel, Extra, Field, constr


class DeadlineAction(Enum):
    auto_upgrade = "auto_upgrade"
    disable = "disable"


class StreamBreakingChangeScope(BaseModel):
    class Config:
        extra = Extra.forbid

    scopeType: Any = Field("stream", const=True)
    impactedScopes: List[str] = Field(
        ...,
        description="List of streams that are impacted by the breaking change.",
        min_items=1,
    )


class BreakingChangeScope(BaseModel):
    __root__: StreamBreakingChangeScope = Field(
        ...,
        description="A scope that can be used to limit the impact of a breaking change.",
    )


class VersionBreakingChange(BaseModel):
    class Config:
        extra = Extra.forbid

    upgradeDeadline: date = Field(
        ...,
        description="The deadline by which to upgrade before the breaking change takes effect.",
    )
    message: str = Field(
        ..., description="Descriptive message detailing the breaking change."
    )
    deadlineAction: Optional[DeadlineAction] = Field(
        None, description="Action to do when the deadline is reached."
    )
    migrationDocumentationUrl: Optional[AnyUrl] = Field(
        None,
        description="URL to documentation on how to migrate to the current version. Defaults to ${documentationUrl}-migrations#${version}",
    )
    scopedImpact: Optional[List[BreakingChangeScope]] = Field(
        None,
        description="List of scopes that are impacted by the breaking change. If not specified, the breaking change cannot be scoped to reduce impact via the supported scope types.",
        min_items=1,
    )


class ConnectorBreakingChanges(BaseModel):
    class Config:
        extra = Extra.forbid

    __root__: Dict[constr(regex=r"^\d+\.\d+\.\d+$"), VersionBreakingChange] = Field(
        ...,
        description="Each entry denotes a breaking change in a specific version of a connector that requires user action to upgrade.",
        title="ConnectorBreakingChanges",
    )
15 changes: 15 additions & 0 deletions airbyte_cdk/metadata_models/generated/ConnectorBuildOptions.py
@@ -0,0 +1,15 @@
# generated by datamodel-codegen:
# filename: ConnectorBuildOptions.yaml

from __future__ import annotations

from typing import Optional

from pydantic.v1 import BaseModel, Extra


class ConnectorBuildOptions(BaseModel):
    class Config:
        extra = Extra.forbid

    baseImage: Optional[str] = None
36 changes: 36 additions & 0 deletions airbyte_cdk/metadata_models/generated/ConnectorIPCOptions.py
@@ -0,0 +1,36 @@
# generated by datamodel-codegen:
# filename: ConnectorIPCOptions.yaml

from __future__ import annotations

from enum import Enum
from typing import List

from pydantic.v1 import BaseModel, Extra


class SupportedSerializationEnum(Enum):
    JSONL = "JSONL"
    PROTOBUF = "PROTOBUF"
    FLATBUFFERS = "FLATBUFFERS"


class SupportedTransportEnum(Enum):
    STDIO = "STDIO"
    SOCKET = "SOCKET"


class DataChannel(BaseModel):
    class Config:
        extra = Extra.forbid

    version: str
    supportedSerialization: List[SupportedSerializationEnum]
    supportedTransport: List[SupportedTransportEnum]


class ConnectorIPCOptions(BaseModel):
    class Config:
        extra = Extra.forbid

    dataChannel: DataChannel