@aliibii aliibii commented Oct 16, 2025

Description
This commit adds plugin execution tracing capability to the OpenTelemetry plugin, allowing users to trace individual plugin phases (rewrite, access, header_filter, body_filter, log) as child spans of the main request trace.

Changes:

  • Added trace_plugins configuration object with the following properties:
    • enabled: boolean to enable/disable plugin execution tracing (default: false)
    • plugin_span_kind: string enum ("internal" or "server") for observability provider compatibility (default: "internal")
    • excluded_plugins: array of plugin names to exclude from tracing (default: ["opentelemetry", "prometheus"])
  • Enhanced plugin execution in plugin.lua with OpenTelemetry span creation and finishing
  • Added proper span hierarchy: plugin phase spans are nested under main request spans
  • Added span context management with stack-based tracking for nested spans
  • Added upstream attributes (upstream.addr, upstream.host, upstream.ip, upstream.port) to the main request span in before_proxy phase
  • Updated test suite to reflect new schema structure and added tests for new features
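
The `trace_plugins` object above can be illustrated with a small Lua sketch. The field names and default values come from the list above. The `should_trace` helper and the plugin name `limit-count` are hypothetical and used only for illustration. This is not the code added by this PR, just one way the `enabled` and `excluded_plugins` settings might be applied.

```lua
-- Hypothetical helper, NOT code from this PR: decides whether a plugin's
-- phases would be traced, using the defaults listed in the description.
local DEFAULT_EXCLUDED = { "opentelemetry", "prometheus" }

local function should_trace(trace_plugins, plugin_name)
    -- Tracing is opt-in: a missing object or enabled ~= true disables it.
    if type(trace_plugins) ~= "table" or trace_plugins.enabled ~= true then
        return false
    end
    -- Fall back to the documented default exclusion list.
    local excluded = trace_plugins.excluded_plugins or DEFAULT_EXCLUDED
    for _, name in ipairs(excluded) do
        if name == plugin_name then
            return false
        end
    end
    return true
end

-- Example value for the trace_plugins object (field names from the description).
local trace_plugins = {
    enabled = true,
    plugin_span_kind = "internal",  -- or "server"
    excluded_plugins = { "opentelemetry", "prometheus" },
}

print(should_trace(trace_plugins, "limit-count"))  -- true
print(should_trace(trace_plugins, "prometheus"))   -- false
print(should_trace(nil, "limit-count"))            -- false
```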

New OpenTelemetry API functions available to plugins via api_ctx.otel:

  • api_ctx.otel.start_span(span_info) - Create custom spans

    • Parameters: span_info table with optional fields: name, kind, attributes, parent
    • Returns: span context object or nil
    • Automatically tracks spans in a stack for proper parent-child relationships
  • api_ctx.otel.stop_span(span_ctx, error_msg) - Finish spans with error handling

    • Parameters: span_ctx (from start_span), optional error_msg string
    • Sets error status on span if error_msg provided
    • Automatically manages span stack
  • api_ctx.otel.current_span() - Get current span context

    • Returns: most recently started span context (top of stack) or nil
    • Useful for adding attributes or creating child spans
  • api_ctx.otel.get_plugin_context(plugin_name, phase) - Get plugin phase span context

    • Parameters: plugin_name (string), phase (string)
    • Returns: span context for the specified plugin phase or nil
    • Useful for plugins that want to reference or extend existing plugin phase spans
  • api_ctx.otel.with_span(span_info, fn) - Create span, execute function, and automatically finish span

    • Parameters: span_info table (same as start_span), fn function to execute
    • The function receives span_ctx as its first parameter, allowing access to the span for setting attributes
    • Automatically handles span creation, execution, error handling, and cleanup
    • Returns function results in error-first pattern (err, ...values)
    • Sets span status to ERROR if the function raises a Lua error or returns an error
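
The behaviour described for `start_span`, `stop_span`, `current_span`, and `with_span` can be sketched with a small in-memory stand-in. Everything here is an assumption made for illustration: `new_otel_stub`, the `status` and `finished` fields, defaulting `parent` to the top of the stack, and the reading that `fn` itself returns `(err, ...values)`. It is not the implementation in this PR, and real span contexts come from the OpenTelemetry library rather than plain tables.

```lua
-- Hypothetical stand-in for api_ctx.otel; NOT the plugin's implementation.
local function new_otel_stub()
    local stack = {}
    local otel = {}

    function otel.start_span(span_info)
        span_info = span_info or {}
        local span_ctx = {
            name = span_info.name or "unnamed",
            kind = span_info.kind or "internal",
            attributes = span_info.attributes or {},
            -- Assumption: without an explicit parent, use the top of the stack.
            parent = span_info.parent or stack[#stack],
            status = "UNSET",
        }
        stack[#stack + 1] = span_ctx
        return span_ctx
    end

    function otel.stop_span(span_ctx, error_msg)
        if not span_ctx then
            return
        end
        if error_msg then
            span_ctx.status = "ERROR"
            span_ctx.error_msg = error_msg
        end
        span_ctx.finished = true
        -- Remove the span from the stack wherever it sits.
        for i = #stack, 1, -1 do
            if stack[i] == span_ctx then
                table.remove(stack, i)
                break
            end
        end
    end

    function otel.current_span()
        return stack[#stack]
    end

    -- Finishes the span from pcall's results and returns (err, ...values).
    local function finish(span_ctx, ok, err, ...)
        if not ok then
            local msg = tostring(err)  -- fn raised a Lua error
            otel.stop_span(span_ctx, msg)
            return msg
        end
        otel.stop_span(span_ctx, err and tostring(err) or nil)
        return err, ...
    end

    function otel.with_span(span_info, fn)
        local span_ctx = otel.start_span(span_info)
        return finish(span_ctx, pcall(fn, span_ctx))
    end

    return otel
end

-- Usage: in a real plugin the table would be api_ctx.otel instead of the stub.
local otel = new_otel_stub()
local parent = otel.start_span({ name = "request" })

local err, token = otel.with_span({ name = "auth.lookup" }, function(span_ctx)
    span_ctx.attributes["cache.hit"] = true
    return nil, "abc"  -- error-first: no error, one value
end)

local failed
local err2 = otel.with_span({ name = "auth.verify" }, function(span_ctx)
    failed = span_ctx
    error("boom")
end)
```

In this sketch a successful call leaves the stack as it was before `with_span`, and a raised error is converted into a returned error message while the span is marked `ERROR`, which matches the lifecycle the description attributes to `with_span`.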

Features:

  • Plugin Phase Tracing: Creates child spans for each plugin phase execution automatically
  • Span Kind Control: Supports "internal" (default) and "server" span kinds for observability provider compatibility
  • Configurable: Can be enabled/disabled via trace_plugins.enabled configuration
  • Plugin Exclusion: Can exclude specific plugins from tracing (e.g., opentelemetry, prometheus)
  • Proper Hierarchy: Plugin spans are correctly nested under main request spans
  • Upstream Attributes: Upstream information (addr, host, ip, port) is automatically attached to the main request span
  • Error Handling: Proper error status and message propagation for plugin execution errors
  • No-op API: The API is always available; its functions are no-ops when tracing is disabled
  • Stack Management: Automatic span stack tracking for nested span hierarchies
  • Convenient Span Management: with_span provides automatic span lifecycle management with error handling

Which issue(s) this PR fixes:

Resolves #12510

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@aliibii aliibii force-pushed the feat/opentelemetry-plugin-tracing branch 2 times, most recently from f456f22 to 8393636 Compare October 17, 2025 10:35
@aliibii aliibii marked this pull request as ready for review October 18, 2025 09:55
@dosubot dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.), enhancement (New feature or request), and plugin labels Oct 18, 2025
@aliibii aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from 8393636 to 70731ad Compare October 22, 2025 19:06
@aliibii aliibii force-pushed the feat/opentelemetry-plugin-tracing branch 2 times, most recently from 087aa81 to d478351 Compare November 2, 2025 12:01
This commit adds plugin execution tracing capability to the OpenTelemetry plugin,
allowing users to trace individual plugin phases (rewrite, access, header_filter,
body_filter, log) as child spans of the main request trace.

Changes:
- Added trace_plugins configuration option (default: false, opt-in)
- Added plugin_span_kind configuration for observability provider compatibility
- Enhanced plugin execution with OpenTelemetry span creation and finishing
- Added comprehensive request context attributes to plugin spans
- Updated documentation with examples and usage instructions
- Added comprehensive test suite for the new functionality

Features:
- Plugin Phase Tracing: Creates child spans for each plugin phase execution
- Rich Context: Includes HTTP method, URI, hostname, user agent, route info, and service info
- Configurable: Can be enabled/disabled via trace_plugins configuration
- Span Kind Control: Supports internal (default) and server span kinds for observability provider compatibility
- Proper Hierarchy: Plugin spans are correctly nested under main request spans
- Performance: Minimal overhead when disabled (default behavior)

Configuration:
- trace_plugins: boolean (default: false) - Enable/disable plugin tracing
- plugin_span_kind: string (default: 'internal') - Span kind for plugin spans
  - 'internal': Standard internal operation (may be excluded from metrics)
  - 'server': Server-side operation (typically included in service-level metrics)

This addresses GitHub issue apache#12510 and provides end-to-end tracing visibility
for APISIX plugin execution phases.
@aliibii aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from d478351 to 8af172f Compare November 2, 2025 12:16
@dosubot dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) label and removed the size:XL (This PR changes 500-999 lines, ignoring generated files.) label Nov 2, 2025
@juzhiyuan
Member

  1. Just approved to run tests.
  2. @bzp2010 @moonming @Baoyuantop, could you take a look?

@aliibii aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from 2cba5ed to a81fd3e Compare November 3, 2025 19:54
@Baoyuantop
Contributor

Hi @aliibii, thanks for your contribution. Could you please fix the failed CI?
