93 changes: 93 additions & 0 deletions docs/rfcs/component-schema.md
@@ -0,0 +1,93 @@
# Component Schema

## Overview

OpenTelemetry components expose configuration as a public API.
This public API is currently expressed directly in Go structs.

This RFC discusses expressing the public API of components as JSON schema files.

## Past approaches

### `configschema`

The [`configschema`](https://pkg.go.dev/github.com/open-telemetry/opentelemetry-collector-contrib/cmd/configschema) Go tool was part of opentelemetry-collector-contrib and was used to build a specific YAML schema.

It required importing all Go modules as dependencies of schemabuilder, which caused build issues in the repository.

See https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/30187

### Golang JSON schema reverse-engineering

This approach consists of reverse-engineering the JSON schema from the Golang config structs.
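As a rough illustration of what deriving a schema from a struct can look like, here is a minimal sketch using reflection. It is not the code from the linked PR: the `Config` struct, the `schemaFor` and `jsonType` helpers, and the choice to read `mapstructure` tags are all assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// Config is a hypothetical component configuration, not a real collector struct.
type Config struct {
	Endpoint string `mapstructure:"endpoint"`
	Timeout  int    `mapstructure:"timeout"`
	Insecure bool   `mapstructure:"insecure"`
}

// jsonType maps a Go kind to a JSON schema type name.
func jsonType(k reflect.Kind) string {
	switch k {
	case reflect.String:
		return "string"
	case reflect.Bool:
		return "boolean"
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return "integer"
	case reflect.Float32, reflect.Float64:
		return "number"
	case reflect.Slice, reflect.Array:
		return "array"
	default:
		return "object"
	}
}

// schemaFor builds a flat JSON schema from the tagged, exported fields of a struct value.
func schemaFor(v any) map[string]any {
	t := reflect.TypeOf(v)
	props := map[string]any{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := strings.Split(f.Tag.Get("mapstructure"), ",")[0]
		if !f.IsExported() || name == "" || name == "-" {
			continue
		}
		props[name] = map[string]any{"type": jsonType(f.Type.Kind())}
	}
	return map[string]any{"type": "object", "properties": props}
}

func main() {
	out, err := json.MarshalIndent(schemaFor(Config{}), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A sketch like this only sees Go kinds. For example, a `time.Duration` field would be reported as `integer`, even though collector configuration files usually write durations as strings such as `5s`.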

This approach ran into issues with the Golang generation from JSON schema.

See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003 for historical context.

### mdatagen Config struct generation

This is [an in-progress approach](https://github.com/open-telemetry/opentelemetry-collector/pull/13155) where the metadata.yaml file of the component contains a config key, under which a JSON schema is present.

This approach is demonstrated with a POC showing the batchprocessor configuration.

This approach runs into issues when a configuration is composed by embedding a library configuration such as confighttp.

### checkapi schema check

This is an [in-progress approach](https://github.com/open-telemetry/opentelemetry-go-build-tools/pull/1148) where checkapi, a tool built as part of opentelemetry-go-build-tools, is used to check that
the schema present under the config key of metadata.yaml matches the Config struct fields.

This approach is limited to checking the presence of fields in the Config struct.

The tool can output the elements missing from the schema; as such, it can give a first draft of the component's schema.
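The kind of presence check described here can be sketched as follows. This is not how checkapi is implemented; the `Config` struct, the `schemaJSON` constant, and the `missingFields` helper are hypothetical, and the use of reflection on `mapstructure` tags is an assumption made to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sort"
	"strings"
)

// Config is a hypothetical component configuration used only for illustration.
type Config struct {
	Endpoint string `mapstructure:"endpoint"`
	Timeout  int    `mapstructure:"timeout"`
}

// schemaJSON is a hypothetical schema that forgets the "timeout" field.
const schemaJSON = `{"type":"object","properties":{"endpoint":{"type":"string"}}}`

// missingFields returns the mapstructure names of struct fields that have
// no matching entry under "properties" in the schema.
func missingFields(cfg any, schema []byte) ([]string, error) {
	var s struct {
		Properties map[string]json.RawMessage `json:"properties"`
	}
	if err := json.Unmarshal(schema, &s); err != nil {
		return nil, err
	}
	var missing []string
	t := reflect.TypeOf(cfg)
	for i := 0; i < t.NumField(); i++ {
		name := strings.Split(t.Field(i).Tag.Get("mapstructure"), ",")[0]
		if name == "" || name == "-" {
			continue
		}
		if _, ok := s.Properties[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	missing, err := missingFields(Config{}, []byte(schemaJSON))
	if err != nil {
		panic(err)
	}
	fmt.Println("missing from schema:", missing)
}
```

With the inputs above, the only reported name is `timeout`. The list of missing names is what could seed a first draft of the schema.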

The tool can also map imported types such as confighttp.ClientConfig to a reference to a remote schema.

## Goals

We want to use JSON schema to help formulate and stabilize the public API of the components.

JSON schema consumers use it to:
* Create component configurations.
* Validate component configurations.
* Version the configuration of components.
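To make the validation use case concrete, here is a minimal sketch of a consumer checking a configuration against a schema. It covers only a small, hypothetical subset of JSON schema (`properties`, `required`, and a few scalar types) and treats unknown keys as errors, which is an assumption of this example. A real consumer would more likely rely on a full JSON schema validator library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"sort"
)

// property and schema model a small, hypothetical subset of JSON schema.
type property struct {
	Type string `json:"type"`
}

type schema struct {
	Properties map[string]property `json:"properties"`
	Required   []string            `json:"required"`
}

// matches reports whether a decoded JSON value has the given schema type.
func matches(v any, want string) bool {
	switch x := v.(type) {
	case string:
		return want == "string"
	case bool:
		return want == "boolean"
	case float64: // encoding/json decodes every JSON number as float64
		return want == "number" || (want == "integer" && x == math.Trunc(x))
	default:
		return false // arrays, objects and null are out of scope for this sketch
	}
}

// validate returns one message per problem found, sorted for stable output.
func validate(s schema, cfg map[string]any) []string {
	var problems []string
	for _, name := range s.Required {
		if _, ok := cfg[name]; !ok {
			problems = append(problems, fmt.Sprintf("%s: required key is missing", name))
		}
	}
	for key, val := range cfg {
		prop, ok := s.Properties[key]
		if !ok {
			problems = append(problems, fmt.Sprintf("%s: unknown key", key))
		} else if !matches(val, prop.Type) {
			problems = append(problems, fmt.Sprintf("%s: expected %s", key, prop.Type))
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	schemaJSON := `{"properties":{"endpoint":{"type":"string"},"timeout":{"type":"integer"}},"required":["endpoint"]}`
	configJSON := `{"timeout":"5s","retries":3}`
	var s schema
	var cfg map[string]any
	if err := json.Unmarshal([]byte(schemaJSON), &s); err != nil {
		panic(err)
	}
	if err := json.Unmarshal([]byte(configJSON), &cfg); err != nil {
		panic(err)
	}
	for _, p := range validate(s, cfg) {
		fmt.Println(p)
	}
}
```

The sample configuration is deliberately wrong in three ways: `endpoint` is absent, `retries` is not declared in the schema, and `timeout` is a string where an integer is expected.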

See https://github.com/open-telemetry/opentelemetry-collector/issues/9769 for discussion of the goals.

See https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/42214 for a more recent discussion of goals and roadmap.

## Current behavior

### Validation via the `validate` command

The collector can be run with the `validate` command, which validates the YAML configuration file.

### Configuration SIG

There is a [SIG](https://github.com/open-telemetry/opentelemetry-configuration) dedicated to OpenTelemetry Configuration; members of the SIG should review and sign off on this approach.

## Roadmap

To integrate JSON schema into components, we will need to make incremental changes with an opt-in approach.

### Initial schemas

We add initial schemas by hand to libraries in opentelemetry-collector.

The schemas are added to the library metadata.yaml files.
Member

Is it possible not to couple the solution with metadata.yaml? I.e. it can be used as a source of the schema but ideally the tooling should be able to accept alternative location of config schemas.

Contributor Author

It's a JSON schema in a YAML file, so there are maybe a couple of ways:

  1. We use a yaml anchor and you define the config key in a different section.
  2. You use a JSON schema with a single reference to another schema.

Note if we move the schema to a separate location, we need to be able to resolve it if we lint with checkapi.

Contributor

Looked at your draft PR, and I'm worried that these schema blocks are too big for the metadata.yaml file. Could we have this be a path relative to the metadata.yaml? Could a go.mod link a dependency that embeds the schema? Could we refer to versioned URLs in other repositories? https://github.com/open-telemetry/opentelemetry-collector/pull/13726/files#r2334122066

Contributor Author

@jmacd I think you might be right that metadata.yaml becomes a bit big with this, but as I mention above, we can build a reference model so it's possible to link to another file. I also think we need to try, see how difficult or messy it gets, and get wiser.

My goal is to leverage our git tags for any references to versioned schemas, I guess.


[A PR is open to implement this.](https://github.com/open-telemetry/opentelemetry-collector/pull/13726)

### CheckAPI - optional

We upgrade checkapi to 0.27.1 and turn on validation of the schema in metadata.yaml when one is present.

We configure checkapi with references to the schemas defined in the previous step.

### Add a schema to each component

We create a schema for each component that opts in, starting with simple components and then adding more complex ones to validate the approach.

### Tie schema configuration to stability levels

We add a requirement to the stable stability level: the component must define a configuration schema.