Configuring OpenTelemetry Collector
Feature Overview
You can configure the OpenTelemetry Collector to meet your observability needs. Before diving into the workings of Collector configuration, it's important to familiarize yourself with the following:
- Data Collection Concepts, to understand the repositories applicable to the OpenTelemetry Collector.
- Security Guidelines.
Configuration Structure
The structure of any Collector configuration file consists of four types of pipeline components that interact with telemetry data: receivers, processors, exporters, and connectors.
Once each pipeline component is configured, you must enable it through a pipeline defined in the service section of the configuration file.
In addition to pipeline components, you can configure Extensions, which provide additional functionalities that can be added to the Collector, such as diagnostic tools. Extensions do not need direct access to telemetry data and are enabled through the service section.
Below is an example of a Collector configuration that includes receivers, processors, exporters, and three extensions.
Important Note: While it's common to bind endpoints to localhost when all clients are local, our sample configuration uses the "unspecified" address 0.0.0.0 for convenience. The Collector currently defaults to using localhost. For detailed information on these endpoint configuration values, see the Security Measures to Prevent Denial of Service Attacks.
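Below is a sketch of such a configuration. The endpoints follow the 0.0.0.0 convention mentioned above; the backend address otelcol:4317 and the choice of components are illustrative:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  otlp:
    endpoint: otelcol:4317   # illustrative backend address

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```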
Note that receivers, processors, exporters, and pipelines are defined using a component identifier in the type[/name] format, such as otlp or otlp/2. As long as the identifier is unique, you can define the same type of component multiple times. For example:
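A sketch of two OTLP receivers defined under distinct identifiers and referenced separately in pipelines (endpoints are illustrative):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/2:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55690

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    traces/2:
      receivers: [otlp/2]
      processors: [batch]
      exporters: [otlp]
```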
The configuration can also include other files, allowing the Collector to merge them into a single in-memory YAML configuration:
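For example, a main configuration file can pull in a separate exporters.yaml using the file provider's ${file:...} syntax (the receiver and pipeline shown are illustrative):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters: ${file:exporters.yaml}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [otlp]
```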
The exporters.yaml file might contain:
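For example (the hostname is illustrative):

```yaml
otlp:
  endpoint: otelcol.observability.svc.cluster.local:4317
```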
The resulting in-memory configuration would be:
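```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp:
    endpoint: otelcol.observability.svc.cluster.local:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [otlp]
```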
Receivers
Receivers collect telemetry data from one or more sources. They can be either pull-based or push-based and may support one or more data sources.
Typically, a receiver ingests data in a specified format, converts it to an internal format, and then passes it to the processors and exporters defined in the applicable pipeline.
Receivers are configured in the receivers section. By default, no receivers are configured. You must configure one or more receivers. Many receivers come with default settings, so specifying the receiver name may be sufficient. If you need to customize or modify the default configuration, you can do so in this section. Any settings you specify will override the defaults (if they exist).
Note: Configuring a receiver does not enable it. A receiver is enabled by adding it to the appropriate pipeline in the service section.
Below are common receiver configuration examples.
Tip: For more detailed receiver configurations, refer to the receiver README.
OTLP Receiver
The OTLP receiver ingests traces, metrics, and logs using the OpenTelemetry Protocol (OTLP).
YAML Example
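A sketch enabling both transports on their conventional ports:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP over gRPC
      http:
        endpoint: 0.0.0.0:4318   # OTLP over HTTP
```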
protocols Parameter Descriptions
Jaeger Receiver
The Jaeger receiver ingests traces in the Jaeger format.
YAML Example
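A sketch enabling the Jaeger receiver's protocols on their conventional ports:

```yaml
receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      thrift_compact:
        endpoint: 0.0.0.0:6831
      thrift_binary:
        endpoint: 0.0.0.0:6832
```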
protocols Parameter Description
Zipkin Receiver
The Zipkin receiver ingests traces in both Zipkin v1 and v2 formats.
YAML Example
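A sketch using the conventional Zipkin port:

```yaml
receivers:
  zipkin:
    endpoint: 0.0.0.0:9411
```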
zipkin Parameter Description
Processors
Processors modify or transform data collected by receivers before sending it to exporters. Data processing is based on the rules or settings defined for each processor, which may include filtering, deleting, renaming, or recalculating telemetry data. The order of processors in the pipeline determines the sequence in which the Collector applies processing operations to signals.
Processors are optional, but some are recommended.
You can configure processors in the processors section of the Collector configuration file. Any settings you specify will override default values (if they exist).
Note: Configuring a processor does not enable it. A processor must be enabled by adding it to the appropriate pipeline in the service section. By default, no processors are enabled.
Below are examples of common processor configurations.
Tip: You can find a complete list of processors by combining the lists from opentelemetry-collector-contrib and opentelemetry-collector. For more detailed processor configurations, refer to the processor README.
Batch Processor
The Batch Processor batches and compresses spans, metrics, or logs based on size or time. Batching can reduce the number of outgoing requests made by exporters and helps regulate the flow of telemetry from one or more receivers in the pipeline.
YAML Example
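A minimal sketch (the values are illustrative; the parameters are described below):

```yaml
processors:
  batch:
    timeout: 5s                # flush the batch after 5s even if it is not full
    send_batch_size: 8192      # flush once this many items have accumulated
    send_batch_max_size: 10000 # upper bound on the size of a sent batch
```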
batch Parameter Description
Memory Limiter Processor
The Memory Limiter Processor periodically checks the Collector's memory usage and begins refusing data when a soft memory limit is exceeded, preventing out-of-memory scenarios. This processor supports spans, metrics, and logs. It is typically placed first in the pipeline, immediately after the receivers, so that the senders can retry with the same data and back pressure can be applied to incoming traffic. When memory usage exceeds the hard limit, the Memory Limiter Processor forces garbage collection.
YAML Example
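A minimal sketch (the values are illustrative; the parameters are described below):

```yaml
processors:
  memory_limiter:
    check_interval: 1s    # how often memory usage is measured
    limit_mib: 4000       # hard limit
    spike_limit_mib: 800  # soft limit = limit_mib - spike_limit_mib
```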
memory_limiter Parameter Description
Filter Processor
The Filter Processor filters spans, metrics, or logs based on the conditions you define in its configuration. A typical use case for the Filter Processor is to discard telemetry data irrelevant to the observability system, such as non-critical logs or spans, to reduce noise in the data.
Filtering works with allow and deny lists, which include or exclude telemetry based on regular expressions and resource attributes. You can also use the OpenTelemetry Transformation Language (OTTL) to better describe the signals you want to filter. The processor supports all pipeline types.
YAML Example
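A sketch dropping spans that match any of the OTTL conditions (the attribute values are illustrative):

```yaml
processors:
  filter/ottl:
    error_mode: ignore
    traces:
      span:
        # spans matching either condition are dropped
        - 'attributes["container.name"] == "app_container_1"'
        - 'resource.attributes["host.name"] == "localhost"'
```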
filter/ottl Parameter Description
Metrics Transform Processor
The Metrics Transform Processor shares some features with the Attributes Processor and is typically used to perform the following tasks:
- Add, rename, or delete label keys and values.
- Scale and aggregate metrics based on labels or label values.
Note: The processor supports renaming and aggregation only within a single batch of metrics. It does not perform any cross-batch aggregation, so do not use it to aggregate metrics from multiple sources, such as multiple nodes or clients.
For a complete list of supported operations, see Available Operations.
YAML Example
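A sketch renaming a single metric in place (metric names are illustrative):

```yaml
processors:
  metricstransform:
    transforms:
      - include: system.cpu.usage
        match_type: strict
        action: update
        new_name: system.cpu.usage_time
```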
The Metrics Transform processor also supports the use of regular expressions, allowing transformation rules to be applied to multiple metric names or metric labels simultaneously. The following example renames cluster_name to cluster-name across all metrics:
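A sketch of that label rename, matching every metric name with a regular expression:

```yaml
processors:
  metricstransform/labels:
    transforms:
      - include: ^.*$          # apply to all metrics
        match_type: regexp
        action: update
        operations:
          - action: update_label
            label: cluster_name
            new_label: cluster-name
```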
Available Operations
The processor can perform the following operations:
- Rename metrics. For example, renaming `system.cpu.usage` to `system.cpu.usage_time`.
- Add labels. For instance, you can add a new label `identifier` with a value of `1` to all points.
- Rename label keys. For example, renaming the label `state` to `cpu_state`.
- Rename label values. For example, renaming the value `idle` to `-` within the `state` label.
- Delete data points. For instance, removing all points where the `state` label has a value of `idle`.
- Switch data types. You can change `int` data points to `double` data points.
- Scale values. For example, multiplying values by 1000 to convert from seconds to milliseconds.
- Aggregate across label sets. For example, retaining only the `state` label and averaging all points with the same value for this label.
- Aggregate across label values. For example, summing points with the values `user` or `system` in the `state` label as `used = user + system`.

The following rules apply:

- You can only apply operations to one or more metrics using a `strict` or `regexp` filter.
- Using the `action` property, you can:
  - Update your metrics in place (`update`).
  - Duplicate and update the duplicated metrics (`insert`).
  - Combine your metrics into a newly inserted metric by merging all data points from a set of matched metrics into a single metric (`combine`). The original matched metrics are also removed.
- When renaming metrics, the capture groups in the `regexp` filter will be expanded.
Transform Processor
The Transform Processor modifies matched spans, metrics, or logs through statements. Use cases include, but are not limited to, converting metrics to different types, replacing or removing keys, and setting fields based on predefined conditions.
Statements are functions in the OpenTelemetry Transformation Language (OTTL) and are applied to telemetry data according to their order in the list. The transform processor includes additional functions for converting metric types. Statements transform data according to the OTTL context you define, such as Span or DataPoint.
Supported contexts for the transform processor:
- `trace_statements`: `resource`, `scope`, `span`, and `spanevent`
- `metric_statements`: `resource`, `scope`, `metric`, and `datapoint`
- `log_statements`: `resource`, `scope`, and `log`
Statements can transform telemetry data of higher contexts. For example, a statement applied to a data point can access the metric and resource of that data point. Lower contexts cannot be accessed; for example, you cannot use a span statement to transform individual span events. Typically, statements are associated with the context you want to transform.
Example of using the transform processor to set the status of a span. The following example sets the span status to Ok when the http.request.status_code attribute is 400:
YAML Example
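A sketch of that status-setting rule:

```yaml
processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # set the span status to Ok when the status-code attribute is 400
          - set(status.code, STATUS_CODE_OK) where attributes["http.request.status_code"] == 400
```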
The error_mode field describes how the processor reacts to errors when processing statements:
"error_mode: ignore"tells the processor to ignore errors and continue execution. This is the default error mode."error_mode: propagate"tells the processor to return an error. As a result, the collector will drop the data.
Advanced Features
- You can also use the transform processor to modify span names based on span attributes or extract span attributes from the span name. For examples, see the transform processor's example configuration file.
- The transform processor also offers advanced attribute transformation capabilities. The transform processor allows end users to specify transformations for metrics, logs, and traces using the OpenTelemetry Transformation Language (OTTL).
- For more information on OTTL functions and syntax, see:
- OTTL Syntax: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/README.md
- OTTL Functions: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/ottlfuncs
- OTTL Contexts: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts
Exporters
Exporters send data to one or more backends or destinations. Exporters can be pull-based or push-based and may support one or more data sources.
You can configure exporters in the exporters section of the Collector configuration file. Most exporters require at least a destination and security settings like authentication tokens or TLS certificates. Any settings you specify will override the defaults (if they exist).
Note: Configuring an exporter does not enable it. An exporter needs to be enabled by adding it to the appropriate pipeline in the service section. By default, no exporters are enabled.
The Collector requires one or more exporters. Below are common examples of exporter configurations:
Tip: Some exporters require x.509 certificates to establish a secure connection, as described in Setting up Certificates. For more detailed exporter configurations, refer to the exporter README.
OTLP Exporter
The OTLP exporter sends traces, metrics, and logs in OTLP format over gRPC. By default, this exporter requires TLS and offers queuing and retry functionality. To send OTLP data over HTTP, use the OTLP/HTTP exporter instead; see the OTLP/HTTP exporter for instructions.
YAML Example
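A sketch pointing at a gRPC backend (the endpoint and certificate paths are illustrative):

```yaml
exporters:
  otlp:
    endpoint: otelcol2.example.com:4317  # gRPC endpoint of the backend
    tls:
      cert_file: cert.pem
      key_file: cert-key.pem
```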
otlp Parameter Description
OTLP HTTP Exporter
The OTLP HTTP Exporter exports traces and metrics using the OpenTelemetry Protocol (OTLP).
YAML Example
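A sketch (the endpoint is illustrative):

```yaml
exporters:
  otlphttp:
    endpoint: https://otlp.example.com:4318
```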
otlphttp Parameter Description
Debug Exporter
The Debug Exporter outputs telemetry data to the console for debugging purposes.
YAML Example
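```yaml
exporters:
  debug:
    verbosity: detailed   # log full pipeline data to the console
```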
The `verbosity` field controls the detail level of logged exports (`detailed` | `normal` | `basic`). When set to `detailed`, pipeline data is logged in detail.
Load Balancing Exporter
The Load Balancing Exporter can export spans, metrics, and logs to multiple backends. It uses routing policies to distribute telemetry data across backends: you can configure `routing_key` to classify telemetry data into groups and map these groups to specific endpoints.
Using the Load Balancing Exporter, you can also send data to other running OpenTelemetry Collector instances via collector endpoints. For example, you could send all traces to one running collector instance and all logs to another instance. This allows you to process or manipulate your data in different collector environments.
YAML Example
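A sketch routing by service name to a static list of backends (the hostnames are illustrative):

```yaml
exporters:
  loadbalancing:
    routing_key: service   # group telemetry by service name
    protocol:
      otlp:
        tls:
          insecure: true   # illustrative; use TLS in production
    resolver:
      static:
        hostnames:
          - backend-1:4317
          - backend-2:4317
```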
loadbalancing Parameter Description
Prometheus Exporter
The Prometheus Exporter exports metric data in Prometheus or OpenMetrics format, allowing Prometheus servers to scrape the data. Below are the details for configuring and using the Prometheus Exporter.
Required Configuration
- `endpoint`: Specifies the address for exposing metric data, with the path `/metrics`. This must be configured and has no default value.

Optional Configuration
- `const_labels`: Key-value pair labels applied to each exported metric. Not set by default.
- `namespace`: If set, all exported metrics use this namespace. No default value.
- `send_timestamps`: Whether to send timestamps for metric samples in the response. Default is `false`.
- `metric_expiration`: Defines how long metrics are exposed without updates. Default is `5m`.
- `resource_to_telemetry_conversion`: When `enabled` is `true`, resource attributes are converted to metric labels. Default is `false`.
- `enable_open_metrics`: When `true`, metrics are exported using the OpenMetrics format. Default is `false`.
- `add_metric_suffixes`: Whether to add type and unit suffixes. Default is `true`.
TLS Configuration
TLS certificates can be set using ca_file, cert_file, and key_file to ensure secure communication.
Example YAML Configuration
Below is an example configuration for the Prometheus Exporter:
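A sketch exercising the parameters listed above (the certificate paths, namespace, and labels are illustrative):

```yaml
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889   # metrics are served at /metrics on this address
    tls:
      ca_file: ca.pem
      cert_file: cert.pem
      key_file: cert-key.pem
    namespace: example       # prefix applied to all exported metrics
    const_labels:
      env: dev               # label added to every exported metric
    send_timestamps: true
    metric_expiration: 180m
    resource_to_telemetry_conversion:
      enabled: true
    enable_open_metrics: true
    add_metric_suffixes: false
```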
This configuration exposes Prometheus metrics at 0.0.0.0:8889/metrics and configures TLS certificates and other parameters.
Usage Recommendations
- In OpenTelemetry, metric names and labels are standardized to comply with Prometheus naming conventions.
- By default, resource attributes are added to the `target_info` metric. You can use Prometheus queries to select and group these attributes as metric labels.
- To simplify queries and grouping, it is recommended to use the transform processor to convert commonly used resource attributes directly into metric labels.
Connectors
Connectors link two pipelines, acting both as exporters and receivers. A connector consumes data as an exporter at the end of one pipeline and sends data as a receiver at the start of another pipeline. The consumed and sent data can be of the same or different types. You can use connectors to aggregate, duplicate, or route data.
You can configure one or more connectors in the connectors section of the Collector configuration file. By default, no connectors are configured. Each type of connector is designed to handle a combination of one or more data types and can only be used for connecting the corresponding data type pipelines.
Note: Configuring a connector does not enable it. Connectors need to be enabled by adding them to the relevant pipelines in the service section.
Tip: For more detailed connector configuration, refer to the connector README.
ASM Service Graph Connector
The ASM Service Graph Connector builds a map representing the relationships between various services in a system. This connector analyzes span data and generates metrics describing the relationships between services. These metrics can be used by data visualization applications (such as Grafana) to draw a service graph.
Note: This component is a custom version of the community component Service Graph Connector and cannot be directly replaced with the community-native component. Specific parameter differences are explained below.
Service topologies are very useful in many use cases:
- Inferring the topology of a distributed system. As distributed systems grow, they become increasingly complex. Service graphs help you understand the structure of the system.
- Providing a high-level overview of system health. Service graphs display error rates, latency, and other relevant data.
- Offering a historical view of the system topology. Distributed systems change frequently, and service graphs provide a way to view how these systems evolve over time.
YAML Example
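Because this is a custom component, its exact fields may differ from the community connector; the sketch below follows the community Service Graph Connector's options and pipeline wiring, and should be checked against the parameter description that follows:

```yaml
connectors:
  asmservicegraph:
    latency_histogram_buckets: [100ms, 250ms, 1s, 5s, 10s]  # buckets for edge latency metrics
    dimensions:
      - cluster            # extra span attributes attached as metric dimensions
    store:
      ttl: 2s              # how long unpaired edges are kept in memory
      max_items: 200       # maximum number of in-flight edges

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [asmservicegraph]   # connector consumes spans as an exporter
    metrics:
      receivers: [asmservicegraph]   # and emits metrics as a receiver
      exporters: [prometheus]
```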
asmservicegraph Parameter Description
Extensions
Extensions add capabilities to the Collector. For example, extensions can automatically add authentication features to receivers and exporters.
Extensions are optional components used to extend the functionality of the Collector for tasks unrelated to processing telemetry data. For example, you can add extensions for health monitoring, service discovery, or data forwarding.
You can configure extensions in the extensions section of the Collector configuration file. Most extensions come with default settings, so you only need to specify the name of the extension to configure it. Any specified settings will override the default values (if any).
Note: Configuring an extension does not enable it. Extensions must be enabled in the service section. By default, no extensions are configured.
Tip: For more detailed extension configuration, refer to the extension README.
Service Section
The service section is used to enable components in the Collector based on the configuration in the receivers, processors, exporters, and extensions sections. If a component is configured but not defined in the service section, it will not be enabled.
The service section includes three subsections:
- `extensions`: A list of extensions to enable. For example:
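```yaml
service:
  extensions: [health_check, pprof, zpages]
```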
- `pipelines`: Configures pipelines, which can be of the following types:
  - `traces`: Collects and processes trace data.
  - `metrics`: Collects and processes metric data.
  - `logs`: Collects and processes log data.
  A pipeline consists of a set of `receivers`, `processors`, and `exporters`. Before including a receiver, processor, or exporter in a pipeline, make sure its configuration is defined in the corresponding section. You can use the same receiver, processor, or exporter in multiple pipelines. When a processor is referenced in multiple pipelines, each pipeline gets a separate instance of that processor.

  Here is an example pipeline configuration. Note that the order of processors determines the processing sequence of the data:
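A sketch with illustrative component choices:

```yaml
service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [otlp, prometheus]
    traces:
      receivers: [otlp, zipkin]
      processors: [memory_limiter, batch]  # memory_limiter runs before batch
      exporters: [otlp, zipkin]
```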
- `telemetry`: The `telemetry` section is used to set up the observability of the Collector itself. It consists of `logs` and `metrics` subsections. For information on how to configure these signals, see Activating Internal Telemetry in the Collector.
Additional Information
Environment Variables
Collector configurations support using and extending environment variables. For example, to use values stored in DB_KEY and OPERATION environment variables, you can write:
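For example, an attributes processor whose key and action come from the environment:

```yaml
processors:
  attributes/example:
    actions:
      - key: ${env:DB_KEY}        # resolved from the DB_KEY environment variable
        action: ${env:OPERATION}  # resolved from the OPERATION environment variable
```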
Use $$ to represent a literal $. For example, to represent $DataVisualization, you can write:
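```yaml
exporters:
  prometheus:
    endpoint: prometheus:9090
    namespace: $$DataVisualization   # resolves to the literal $DataVisualization
```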
Proxy Support
Exporters using the net/http package support the following proxy environment variables:
- HTTP_PROXY: Address of the HTTP proxy.
- HTTPS_PROXY: Address of the HTTPS proxy.
- NO_PROXY: Addresses that should bypass the proxy.
If these variables are set when the Collector starts, exporters will proxy or bypass proxy traffic according to these environment variables, regardless of the protocol.
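For example, to route exporter traffic through a proxy when starting the Collector (the binary name and proxy address are illustrative):

```sh
HTTPS_PROXY=http://proxy.example.com:3128 ./otelcol --config=config.yaml
```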
Authentication
Most receivers exposing HTTP or gRPC ports can be protected using the Collector's authentication mechanisms. Similarly, most exporters using HTTP or gRPC clients can add authentication to outgoing requests.
The authentication mechanism in the Collector uses the extension mechanism, allowing custom authenticators to be integrated into the Collector distribution. Each authentication extension can be used in two ways:
- As a client authenticator for `exporters`, adding authentication data to outgoing requests.
- As a server authenticator for `receivers`, authenticating incoming connections.
For a list of known authenticators, see the Registry.
To add a server authenticator to a receiver in the Collector, follow these steps:
- Add the authenticator extension and its configuration under `.extensions`.
- Add a reference to the authenticator under `.services.extensions` so the Collector loads it.
- Add a reference to the authenticator under `.receivers.<your-receiver>.<http-or-grpc-config>.auth`.
The following example uses an OIDC authenticator on the receiving side, suitable for remote Collectors receiving data via the OpenTelemetry Collector as a proxy:
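A sketch of that receiving side (the issuer URL and audience are illustrative):

```yaml
extensions:
  oidc:
    issuer_url: http://localhost:8080/auth/realms/opentelemetry
    audience: collector

receivers:
  otlp/auth:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        auth:
          authenticator: oidc   # authenticate incoming connections

exporters:
  debug:

service:
  extensions: [oidc]
  pipelines:
    traces:
      receivers: [otlp/auth]
      processors: []
      exporters: [debug]
```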
On the proxy side, this example configures the OTLP exporter to obtain OIDC tokens and add them to each RPC sent to the remote Collector:
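A sketch of the proxy side using the oauth2client extension (the credentials, token URL, and remote endpoint are illustrative):

```yaml
extensions:
  oauth2client:
    client_id: agent
    client_secret: some-secret
    token_url: http://localhost:8080/auth/realms/opentelemetry/protocol/openid-connect/token

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp/auth:
    endpoint: remote-collector:4317
    auth:
      authenticator: oauth2client   # attach OIDC tokens to outgoing RPCs

service:
  extensions: [oauth2client]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [otlp/auth]
```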
Configuring certificates
For secure communication in production environments, use TLS certificates or mTLS for mutual authentication. Follow these steps to generate a self-signed certificate, or use your current certificate provider for production certificates.
Install cfssl and create the following csr.json file:
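For example (the organization name matches the CA described below; the hosts are illustrative):

```json
{
  "hosts": ["localhost", "127.0.0.1"],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "O": "OpenTelemetry Example"
    }
  ]
}
```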
Then run the following commands:
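```sh
# generate the CA (ca.pem, ca-key.pem)
cfssl genkey -initca csr.json | cfssljson -bare ca
# generate a client certificate (cert.pem, cert-key.pem) signed by that CA
cfssl gencert -ca ca.pem -ca-key ca-key.pem csr.json | cfssljson -bare cert
```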
This will create two certificates:
- A certificate authority (CA) named "OpenTelemetry Example" stored in
ca.pem, with the associated key inca-key.pem. - A client certificate stored in
cert.pem, signed by the OpenTelemetry Example CA, with the associated key incert-key.pem.
Overriding Settings
You can override Collector settings using the --set option. Settings defined this way will be merged into the final configuration after all --config sources are resolved and merged.
The following example shows how to override settings in nested sections:
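A sketch (the otelcol binary name is illustrative); each `::` separates one level of nesting, so this overrides the debug exporter's verbosity:

```sh
otelcol --config=config.yaml --set "exporters::debug::verbosity=detailed"
```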
Important Note: The --set option does not support setting keys containing dots (.) or equals signs (=).