| 1 | +# Observability Roadmap |
| 2 | + |
| 3 | +## Current State |
| 4 | + |
| 5 | +The project currently implements structured logging with Pino, which provides: |
| 6 | + |
| 7 | +- Structured JSON logging in production |
| 8 | +- Correlation ID support via environment variables |
| 9 | +- Context propagation through child loggers |
| 10 | +- Performance-optimized logging |
| 11 | + |
| 12 | +## OpenTelemetry Integration (Future) |
| 13 | + |
| 14 | +The logger module has been designed with OpenTelemetry integration in mind. The following placeholders and patterns are ready for future implementation: |
| 15 | + |
| 16 | +### Trace Context Integration |
| 17 | + |
| 18 | +The `withTraceContext` helper function is ready to integrate with OpenTelemetry: |
| 19 | + |
| 20 | +```typescript |
| 21 | +import { logger, withTraceContext } from './logger.js'; |
| 22 | +import { trace } from '@opentelemetry/api'; |
| 23 | + |
| 24 | +// Future implementation example |
| 25 | +const span = trace.getActiveSpan(); |
| 26 | +const spanContext = span?.spanContext(); |
| 27 | + |
| 28 | +const tracedLogger = withTraceContext(logger, spanContext?.traceId, spanContext?.spanId); |
| 29 | + |
| 30 | +tracedLogger.info('Operation with trace context'); |
| 31 | +``` |
| 32 | + |
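As a rough illustration of the pattern above, a helper of this kind can be written on top of child loggers. The sketch below is hypothetical: `bindTraceContext`, `ChildCapableLogger`, and the `trace_id`/`span_id` field names are assumptions made for this example, not the project's actual `withTraceContext` implementation. It only assumes a logger that exposes a `child(bindings)` method, which Pino loggers do.

```typescript
// Minimal logger shape assumed for this sketch; Pino loggers expose child(bindings).
interface ChildCapableLogger {
  child(bindings: Record<string, unknown>): ChildCapableLogger;
}

// Hypothetical helper: attaches trace identifiers to every entry of the returned logger.
function bindTraceContext(
  logger: ChildCapableLogger,
  traceId?: string,
  spanId?: string,
): ChildCapableLogger {
  // Without a trace ID there is nothing to correlate, so return the logger unchanged
  // rather than attaching empty fields.
  if (!traceId) return logger;

  const bindings: Record<string, unknown> = { trace_id: traceId };
  if (spanId) bindings.span_id = spanId;
  return logger.child(bindings);
}
```

Returning the original logger when no span is active keeps call sites free of conditionals: the same code path works with or without tracing enabled.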
| 33 | +### Correlation ID from OpenTelemetry |
| 34 | + |
| 35 | +Currently, correlation IDs come from environment variables. In the future, they will be extracted from OpenTelemetry context: |
| 36 | + |
| 37 | +```typescript |
| 38 | +// Current (placeholder) |
| 39 | +correlationId: process.env.CORRELATION_ID; |
| 40 | + |
| 41 | +// Future (with OpenTelemetry) |
| 42 | +correlationId: trace.getActiveSpan()?.spanContext().traceId; |
| 43 | +``` |
| 44 | + |
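During a migration it may be useful to keep the environment variable as a fallback when no span is active. The following is a minimal sketch under that assumption; `resolveCorrelationId` and `SpanContextLike` are hypothetical names introduced here, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Subset of the OpenTelemetry SpanContext fields used by this sketch.
interface SpanContextLike {
  traceId: string;
}

// An all-zero trace ID is defined as invalid by W3C Trace Context.
const INVALID_TRACE_ID = '0'.repeat(32);

// Hypothetical helper: prefer the active trace ID, fall back to CORRELATION_ID.
function resolveCorrelationId(
  spanContext: SpanContextLike | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  const traceId = spanContext?.traceId;
  if (traceId && traceId !== INVALID_TRACE_ID) return traceId;
  return env.CORRELATION_ID;
}
```

In application code the first argument would come from `trace.getActiveSpan()?.spanContext()` and the second from `process.env`.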
| 45 | +### Planned OpenTelemetry Features |
| 46 | + |
| 47 | +1. **Automatic Trace Context Propagation** |
| 48 | + - Extract trace and span IDs from active OpenTelemetry context |
| 49 | + - Automatically attach to all log entries |
| 50 | + - Correlate logs with distributed traces |
| 51 | + |
| 52 | +2. **Metrics Integration** |
| 53 | + - Export custom metrics alongside logs |
| 54 | + - Track application performance indicators |
| 55 | + - Integrate with Prometheus/Grafana |
| 56 | + |
| 57 | +3. **Distributed Tracing** |
| 58 | + - Instrument HTTP requests/responses |
| 59 | + - Track database queries |
| 60 | + - Monitor external service calls |
| 61 | + - Support W3C Trace Context propagation |
| 62 | + |
| 63 | +4. **Exporters Configuration** |
| 64 | + |
| 65 | + ```typescript |
| 66 | + // Future configuration example |
| 67 | + import { NodeSDK } from '@opentelemetry/sdk-node'; |
| 68 | + import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'; |
| 69 | + import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http'; |
| 70 | + import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'; |
| 71 | + const sdk = new NodeSDK({ |
| 72 | + traceExporter: new OTLPTraceExporter({ |
| 73 | + url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, |
| 74 | + }), |
| 75 | + metricReader: new PeriodicExportingMetricReader({ |
| 76 | + exporter: new OTLPMetricExporter({ |
| 77 | + url: process.env.OTEL_EXPORTER_OTLP_METRICS_ENDPOINT, |
| 78 | + }), |
| 79 | + }), |
| 80 | + }); |
| 81 | + ``` |
| 82 | + |
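The W3C Trace Context propagation listed under Distributed Tracing carries trace identity in a `traceparent` HTTP header of the form `version-traceid-parentid-flags`. The OpenTelemetry SDK handles this through its propagators, so the sketch below is only an illustration of the header format, not something the application would need to implement. The names `parseTraceparent` and `ParsedTraceparent` are hypothetical, and the sketch accepts version `00` only.

```typescript
interface ParsedTraceparent {
  traceId: string;  // 32 lowercase hex characters
  parentId: string; // 16 lowercase hex characters
  sampled: boolean; // lowest bit of the trace-flags byte
}

// Version 00 layout: 2-hex version, 32-hex trace ID, 16-hex parent ID, 2-hex flags.
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): ParsedTraceparent | undefined {
  const match = TRACEPARENT_V00.exec(header.trim());
  if (!match) return undefined;

  const [, traceId, parentId, flags] = match;
  // All-zero trace or parent IDs are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined;

  return {
    traceId,
    parentId,
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}
```

A malformed or unsupported header yields `undefined`, in which case a service would typically start a new trace.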
| 83 | +## Migration Path |
| 84 | + |
| 85 | +When implementing OpenTelemetry: |
| 86 | + |
| 87 | +1. **Install OpenTelemetry packages** |
| 88 | + |
| 89 | + ```bash |
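| 90 | + pnpm add @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/exporter-metrics-otlp-http |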
| 91 | + ``` |
| 92 | + |
| 93 | +2. **Initialize OpenTelemetry SDK** |
| 94 | + - Create `src/telemetry.ts` for SDK initialization |
| 95 | + - Configure exporters (OTLP, Jaeger, Zipkin, etc.) |
| 96 | + - Set up auto-instrumentation |
| 97 | + |
| 98 | +3. **Update Logger Integration** |
| 99 | + - Modify `getLoggerConfig()` to extract correlation IDs from OpenTelemetry context |
| 100 | + - Update `withTraceContext()` to use active span context |
| 101 | + - Add span events for important log entries |
| 102 | + |
| 103 | +4. **Instrument Application Code** |
| 104 | + - Add custom spans for business operations |
| 105 | + - Track custom metrics |
| 106 | + - Implement baggage propagation for metadata |
| 107 | + |
| 108 | +## Environment Variables |
| 109 | + |
| 110 | +Future OpenTelemetry configuration will use standard environment variables: |
| 111 | + |
| 112 | +```bash |
| 113 | +# Service identification |
| 114 | +OTEL_SERVICE_NAME=agentic-node-ts-starter |
| 115 | +# service.version has no dedicated variable; it is set through OTEL_RESOURCE_ATTRIBUTES below |
| 116 | + |
| 117 | +# Exporter configuration |
| 118 | +OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 |
| 119 | +OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces |
| 120 | +OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:4318/v1/metrics |
| 121 | + |
| 122 | +# Resource attributes |
| 123 | +OTEL_RESOURCE_ATTRIBUTES=service.version=1.0.0,deployment.environment=production,service.namespace=myapp |
| 124 | + |
| 125 | +# Sampling |
| 126 | +OTEL_TRACES_SAMPLER=parentbased_traceidratio |
| 127 | +OTEL_TRACES_SAMPLER_ARG=0.1 |
| 128 | +``` |
| 129 | + |
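The sampling settings above combine two rules: a span with a parent follows the parent's sampling decision, and a root span is sampled with the configured probability (here 0.1, roughly 10% of new traces). The sketch below illustrates that logic in simplified form. It is not the SDK's actual algorithm: `shouldSample` and `ParentContext` are hypothetical names, and deriving the decision from the first 8 hex characters of the trace ID is an assumption chosen to keep the example short and deterministic.

```typescript
interface ParentContext {
  sampled: boolean; // sampled flag propagated from the parent span
}

// Simplified parent-based, ratio-based sampling decision.
function shouldSample(traceId: string, ratio: number, parent?: ParentContext): boolean {
  // Parent-based rule: a child span inherits the parent's decision.
  if (parent) return parent.sampled;

  // Root span: decide from the trace ID so that every service
  // evaluating the same trace ID reaches the same answer.
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  const bucket = parseInt(traceId.slice(0, 8), 16); // value in 0 .. 2^32 - 1
  return bucket < ratio * 0x100000000;
}
```

Basing the root decision on the trace ID rather than on a random number keeps the decision consistent for all spans of a trace.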
| 130 | +## Benefits of Future Integration |
| 131 | + |
| 132 | +1. **Unified Observability**: Logs, traces, and metrics in one platform |
| 133 | +2. **Root Cause Analysis**: Correlate logs with specific trace spans |
| 134 | +3. **Performance Monitoring**: Track latency and throughput across services |
| 135 | +4. **Error Tracking**: Automatically capture and trace errors |
| 136 | +5. **Service Dependencies**: Visualize service communication patterns |
| 137 | +6. **SLO/SLI Tracking**: Monitor service level objectives with metrics |
| 138 | + |
| 139 | +## Compatible Backends |
| 140 | + |
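| 141 | +Because OpenTelemetry exports telemetry over the vendor-neutral OTLP protocol, the integration should work with any backend that accepts OTLP or offers a compatible exporter, including: |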
| 142 | + |
| 143 | +- **Cloud Providers** |
| 144 | + - AWS X-Ray |
| 145 | + - Google Cloud Trace |
| 146 | + - Azure Application Insights |
| 147 | + |
| 148 | +- **Open Source** |
| 149 | + - Jaeger |
| 150 | + - Zipkin |
| 151 | + - Grafana Tempo |
| 152 | + - SigNoz |
| 153 | + |
| 154 | +- **Commercial** |
| 155 | + - Datadog |
| 156 | + - New Relic |
| 157 | + - Honeycomb |
| 158 | + - Dynatrace |
| 159 | + - Splunk |
| 160 | + |
| 161 | +## References |
| 162 | + |
| 163 | +- [OpenTelemetry JavaScript Documentation](https://opentelemetry.io/docs/instrumentation/js/) |
| 164 | +- [OpenTelemetry Specification](https://opentelemetry.io/docs/specs/otel/) |
| 165 | +- [W3C Trace Context](https://www.w3.org/TR/trace-context/) |
| 166 | +- [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/) |