Commit 45fb9d4

traces article
1 parent 630d92a commit 45fb9d4

1 file changed: 40 additions & 56 deletions
@@ -1,86 +1,70 @@
11
---
2-
title: OpenTelemetry metrics
3-
subTitle: An introduction to OpenTelemetry's most data-efficient signal
4-
displayTitle: OpenTelemetry Metrics
5-
description: OpenTelemetry Metrics play a critical role in monitoring applications by offering a way to capture and analyze key metrics in a standardized, scalable manner. Whether you're managing a complex microservices architecture or a simpler system, OpenTelemetry helps track essential statistics that reveal the health and performance of your services.
6-
date: 2024-10-18
2+
title: OpenTelemetry traces
3+
subTitle: An introduction to OpenTelemetry's most readable tool
4+
displayTitle: OpenTelemetry Traces
5+
description: OpenTelemetry traces capture how individual operations within your system interact over time. A trace follows a request as it flows through a system, recording the relationships between different operations. Traces are particularly useful in distributed systems, where multiple services or components interact. However, they are equally valuable for monolithic applications, providing insights even when everything runs in a single process.
6+
date: 2024-10-30
77
author: Nocnica Mellifera
88
githubUser: serverless-mom
99
displayDescription:
1010
Learn more about OpenTelemetry & Monitoring with Checkly. Explore metrics, one of the three pillars of observability.
1111
menu:
1212
learn:
1313
parent: "OpenTelemetry"
14-
weight: 3
14+
weight: 4
1515
---
1616

17-
**OpenTelemetry Metrics** play a critical role in monitoring applications by offering a way to capture and analyze key metrics in a standardized, scalable manner. Whether you're managing a complex microservices architecture or a simpler system, OpenTelemetry helps track essential statistics that reveal the health and performance of your services.
17+
# An Introduction to OpenTelemetry Traces
1818

19-
---
20-
21-
## What are Metrics?
22-
23-
Metrics represent **quantitative measurements** of your system’s health and behavior. They provide insights into performance trends, such as:
24-
25-
- **CPU usage** over time
26-
- **Request rates** per endpoint
27-
- **Error counts** or failure rates
28-
- **Latency** in handling requests
29-
30-
Metrics are lightweight and highly efficient to collect, aggregate, and query. They help identify patterns and anomalies without burdening storage, making them suitable for continuous monitoring at scale.
31-
32-
### Types of Metrics in OpenTelemetry:
33-
34-
- **Counter**: Measures occurrences or events, such as the number of requests handled.
35-
- **Gauge**: Captures values that fluctuate, like memory usage.
36-
- **Histogram**: Measures the distribution of values, such as response time percentiles.
37-
38-
Explore further in the [OpenTelemetry Metrics Documentation](https://opentelemetry.io/docs/concepts/signals/metrics/).
39-
40-
41-
## Why Metrics Matter
42-
43-
In a **microservices** environment, metrics are indispensable for:
19+
## What Are OpenTelemetry Traces?
4420

45-
- **Performance monitoring**: Identifying bottlenecks or degraded performance.
46-
- **Capacity planning**: Forecasting when additional resources are required.
47-
- **Incident detection**: Alerting teams about abnormal system behavior.
21+
OpenTelemetry traces capture how individual operations within your system interact over time. A trace follows a request as it flows through a system, recording the relationships between different operations. Traces are particularly useful in distributed systems, where multiple services or components interact. However, they are equally valuable for monolithic applications, providing insights even when everything runs in a single process.
4822

49-
Metrics are often **the first step** in identifying that something has gone wrong. If a metric shows unusual values (e.g., a spike in response time), you can investigate further by drilling into traces or logs to find the root cause.
23+
## Key Concepts in OpenTelemetry Traces
5024

51-
## Metrics vs. Traces
25+
1. **Spans:**
26+
- The core unit in a trace.
27+
- Represents an individual operation.
28+
- Each span has a name, a start and end time, and metadata (attributes) as key-value pairs.
29+
- Spans can be nested to reflect parent-child relationships.
5230

53-
Metrics have a number of advantages over tracing. Metrics are much more data efficient, generally at the collector level it’s possible to compress hundreds of individual metrics reported to a single packet of data sent on to the metrics backend. Further, metrics show broad trends whereas a trace, no matter how interesting, will always cover only a single request.
31+
2. **Trace Context:**
32+
- Propagates trace identifiers across process boundaries.
33+
- Helps track related spans across multiple services or components.
5434

55-
Should you use metrics instead of traces to monitor your service? Absolutely not. Metrics will always present average performance, and the specific information needed to really understand root causes will be elusive. Further, even with high resolution timeseries metrics it’s very hard to go from worrying metrics to find matching log data of a problem. Finally, modern traces can effectively show information about asynchronous requests as they contribute to overall request time, something that’s very hard to tease out of bare metrics.
35+
3. **Automatic Instrumentation:**
36+
- Some languages and frameworks allow tracing without code changes by using instrumentation agents.
37+
- This approach quickly provides a basic trace structure, capturing incoming requests and outgoing calls such as HTTP and database requests.
5638

57-
## Setting up OpenTelemetry Metrics
39+
4. **Manual Instrumentation:**
40+
- Developers use OpenTelemetry APIs to create spans where deeper insights are needed.
41+
- Useful for tracking specific application logic or attaching custom attributes.
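The concepts above can be sketched with a small toy model. This is not the OpenTelemetry API: `Span`, `start_span`, `to_traceparent`, and `from_traceparent` are hypothetical names chosen for illustration, and only the Python standard library is used. The header layout follows the W3C Trace Context `traceparent` format (version, 32-hex trace ID, 16-hex parent span ID, flags), which is one common way trace identifiers cross process boundaries.

```python
import secrets
import time
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


# Toy model for illustration only; this is NOT the OpenTelemetry API.
@dataclass
class Span:
    name: str
    trace_id: str                    # shared by every span in one trace
    span_id: str                     # unique per span
    parent_id: Optional[str] = None  # None marks the root span
    attributes: Dict[str, object] = field(default_factory=dict)
    start_time: float = 0.0
    end_time: float = 0.0


_current_span: ContextVar[Optional[Span]] = ContextVar("current_span", default=None)
finished_spans: List[Span] = []


@contextmanager
def start_span(name: str, attributes: Optional[Dict[str, object]] = None) -> Iterator[Span]:
    """Open a span; nesting `with` blocks produces parent-child links."""
    parent = _current_span.get()
    span = Span(
        name=name,
        trace_id=parent.trace_id if parent else secrets.token_hex(16),
        span_id=secrets.token_hex(8),
        parent_id=parent.span_id if parent else None,
        attributes=dict(attributes or {}),
        start_time=time.time(),
    )
    token = _current_span.set(span)
    try:
        yield span
    finally:
        span.end_time = time.time()
        _current_span.reset(token)
        finished_spans.append(span)


def to_traceparent(span: Span) -> str:
    """Serialize identifiers in the W3C `traceparent` layout (sampled flag set)."""
    return f"00-{span.trace_id}-{span.span_id}-01"


def from_traceparent(header: str) -> Tuple[str, str]:
    """Recover (trace_id, parent_span_id) on the receiving side."""
    _version, trace_id, parent_id, _flags = header.split("-")
    return trace_id, parent_id


with start_span("GET /checkout", {"http.method": "GET"}) as root:
    with start_span("db.query", {"db.table": "orders"}) as child:
        # The header a downstream service would receive with an outgoing call.
        header = to_traceparent(child)
```

Here the outer `with` block plays the role of a span that automatic instrumentation might create for an incoming request, while the inner one resembles a manually added span with a custom attribute. A real application would use the OpenTelemetry SDK for its language rather than a hand-rolled model like this.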
5842

59-
### Auto-Instrumentation vs. Manual Instrumentation
43+
## Tracing in Monolithic vs. Distributed Systems
6044

61-
1. **Auto-Instrumentation**: Many popular frameworks and libraries come with automatic OpenTelemetry instrumentation, requiring minimal setup.
62-
2. **Manual Instrumentation**: Developers can manually add metrics within the application code by using SDKs to track specific business metrics (e.g., purchases per hour).
45+
Though OpenTelemetry is often associated with microservices, its principles apply equally to monoliths. Even when working with a single application, external dependencies like databases, message queues, or third-party services make distributed tracing beneficial. Instrumenting a monolith provides visibility into which operations are slow, how many database calls occur per request, and which API calls contribute to latency.
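As a rough illustration of the questions a trace can answer for a monolith, the sketch below summarizes the spans of a single request. The span records, their `kind` labels, and the durations are all made-up sample data, and `summarize_request` is a hypothetical helper; a tracing backend would normally do this kind of aggregation for you.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical span records for one request handled by a monolith.
request_spans: List[Dict] = [
    {"name": "GET /orders", "kind": "server", "duration_ms": 182.0},
    {"name": "SELECT orders", "kind": "db", "duration_ms": 35.0},
    {"name": "SELECT customers", "kind": "db", "duration_ms": 41.0},
    {"name": "POST payments-api/charge", "kind": "http_client", "duration_ms": 90.0},
]


def summarize_request(spans: List[Dict]) -> Dict[str, object]:
    """Count child operations by kind and find the slowest one."""
    children = [s for s in spans if s["kind"] != "server"]
    calls_by_kind = Counter(s["kind"] for s in children)
    slowest = max(children, key=lambda s: s["duration_ms"])
    return {
        "db_calls": calls_by_kind["db"],
        "external_calls": calls_by_kind["http_client"],
        "db_time_ms": sum(s["duration_ms"] for s in children if s["kind"] == "db"),
        "slowest_operation": slowest["name"],
    }


summary = summarize_request(request_spans)
print(summary)
```

With this sample data, the summary shows two database calls per request and points at the third-party payment call as the largest contributor to latency.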
6346

64-
Learn more about instrumentation options in the [OpenTelemetry SDK Guide](https://opentelemetry.io/docs/instrumentation/).
47+
### Example: Intercom’s Tracing Journey
6548

66-
## Example Metric Pipeline
49+
Intercom, a company that offers customer communication tools, transitioned from using structured logs to adopting tracing incrementally. They started by instrumenting API and database calls, which provided immediate value. Over time, they instrumented more of their service, improving their understanding of internal workflows and onboarding processes.
6750

68-
With OpenTelemetry, you can collect, process, and export metrics using **Collectors**. Here’s a high-level example of a typical metric pipeline:
51+
## Logs and Traces: A Complementary Approach
6952

70-
1. **Data Collection**: Metrics are generated by instrumented services.
71-
2. **Processing**: The OpenTelemetry Collector aggregates and processes the data (e.g., batching or filtering metrics).
72-
3. **Exporting**: Metrics are sent to observability platforms like **Prometheus** or **Grafana**.
53+
Organizations often have an existing logging infrastructure when adopting tracing. OpenTelemetry’s logs bridge connects existing structured logs to traces by attaching trace and span identifiers to log records. This keeps logs and traces correlated without requiring a complete overhaul of existing logging practices.
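The idea of correlating logs with traces can be sketched using only Python's standard `logging` module. This is a stand-in for illustration, not the OpenTelemetry logs bridge itself: `current_trace` and `TraceContextFilter` are hypothetical names, and the identifiers are sample values that a tracing library would normally supply.

```python
import io
import logging
from contextvars import ContextVar
from typing import Dict

# Hypothetical holder for the active trace identifiers.
current_trace: ContextVar[Dict[str, str]] = ContextVar(
    "current_trace", default={"trace_id": "-", "span_id": "-"}
)


class TraceContextFilter(logging.Filter):
    """Attach the active trace and span IDs to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        ids = current_trace.get()
        record.trace_id = ids["trace_id"]
        record.span_id = ids["span_id"]
        return True  # never drop the record, only enrich it


stream = io.StringIO()  # captured in memory so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s")
)
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

# Simulate logging inside an active span, then outside of any span.
token = current_trace.set(
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "span_id": "00f067aa0ba902b7"}
)
logger.info("payment authorized")
current_trace.reset(token)
logger.info("worker idle")

log_lines = stream.getvalue().splitlines()
```

Existing log statements stay unchanged; only the handler configuration gains a filter, which is why this style of correlation suits incremental adoption.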
7354

74-
Learn how to configure a collector in the [OpenTelemetry Collector Guide](learn/opentelemetry/otel-collector/).
55+
### Gradual Migration with Logs Bridge
7556

57+
Organizations can gradually convert significant logs into spans, as seen with Loan Market, an Australian financial services company. This approach allows incremental adoption of tracing without interrupting existing workflows.
7658

59+
## Benefits of OpenTelemetry Tracing
7760

78-
## Best Practices for Metrics in OpenTelemetry
61+
- **Visibility:** Quickly identify slow or failing operations.
62+
- **Efficiency:** Diagnose complex issues by tracking dependencies and relationships.
63+
- **Onboarding:** Help new developers understand system behavior through visualized traces.
64+
- **Adaptability:** Apply the same tracing model across monoliths, microservices, and hybrid systems.
7965

80-
- **Optimize cardinality**: Avoid creating too many distinct labels, as this can overwhelm storage and query systems.
81-
- **Set appropriate aggregation intervals**: Batch data intelligently to balance between real-time insights and system load.
82-
- **Use meaningful names**: Clearly describe the purpose of each metric to make dashboards and alerts easier to understand.
83-
- **Standardize naming early**: While OpenTelemetry defines standard language for a number of concepts, actual metric naming is not standardized. As such it's possible to report `total-web-shop-checkout-time` and `webShopCheckoutTime_total` as two totally separate metrics even though they should be aggregated. No standard is perfect, of course, and to normalize data before it's stored, use the [filtering tools in the OpenTelemetry collector](learn/opentelemetry/otel-filtering/).
66+
## Getting Started
8467

68+
To begin, select your language and follow its OpenTelemetry documentation to add automatic instrumentation, or use the OpenTelemetry API to create spans manually. Many libraries already support tracing out of the box, which makes it easier to adopt tracing incrementally.
8569

86-
OpenTelemetry metrics provide a robust foundation for observability, helping teams proactively monitor performance and detect issues before they escalate. With the right setup and tooling, you can gain comprehensive insights into your applications, enabling faster resolution times and improved reliability.
70+
Incorporating OpenTelemetry traces helps developers detect problems earlier, understand their systems better, and respond effectively to user issues. Whether your application is a monolith, a microservice, or somewhere in between, traces provide the insight you need to optimize and troubleshoot your software.