Configuring the distributed tracing platform with Service Mesh
Alauda Service Mesh supports distributed tracing through integration with the following components:
- Alauda Build of Jaeger v2: A customized distribution based on the open source Jaeger project. It provides end-to-end visibility into requests across complex distributed systems. In v2, the Jaeger instance is deployed as an OpenTelemetryCollector custom resource managed by the Alauda Build of OpenTelemetry v2 Operator.
- Alauda Build of OpenTelemetry v2: Based on the OpenTelemetry project, this Operator manages the lifecycle of both the Jaeger v2 instance and the OpenTelemetry Collector that fronts it.
The OpenTelemetry Collector acts as an intermediary for telemetry signals. It supports multiple data formats and provides a standardized pipeline for processing and exporting telemetry to backends such as Jaeger.
Configuring distributed tracing data collection with Service Mesh
You can integrate Alauda Service Mesh with Alauda Distributed Tracing by deploying a Jaeger v2 instance and an OpenTelemetry Collector, then configuring Istio to export trace data through them.
The Jaeger v2 instance and the OpenTelemetry Collector are not service-mesh-specific: a single Jaeger v2 and OpenTelemetry Collector pair (deployed by default in the jaeger-system namespace) can serve both Service Mesh and other workloads in the cluster. This section only covers the Service Mesh-specific configuration; refer to the Alauda Distributed Tracing documentation for the underlying installation steps.
Prerequisites
- The Alauda Build of OpenTelemetry v2 Operator is installed. See Installing the Alauda Build of OpenTelemetry v2 Operator.
- A Jaeger v2 instance is deployed. See Deploying the Alauda Build of Jaeger v2.
  INFO: The JAEGER_ES_INDEX_PREFIX variable referenced in the install procedure controls the prefix of the Elasticsearch indices that store trace data. The default value, acp-${CLUSTER_NAME}, works for a single-cluster deployment. For a service mesh deployment, choose the prefix according to the topology of the mesh:
  - For a single-cluster service mesh, we recommend ending the prefix with the cluster name, for example acp-cluster-1.
  - For a multi-cluster service mesh, traces from all clusters must be stored in the same index family; we recommend ending the prefix with the meshID, for example acp-mesh-1. Use the same JAEGER_ES_INDEX_PREFIX when running the install procedure on every cluster in the mesh so that the Jaeger UI can correlate spans across clusters.
- An OpenTelemetry Collector is deployed. See Deploying the OpenTelemetry Collector.
- An Istio instance is created.
- An Istio CNI instance is created.
Procedure
Update the Istio resource to enable tracing and define the OpenTelemetry tracing provider:
Example: Enabling tracing via meshConfig
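A minimal sketch of an Istio resource with tracing enabled could look like the following. It assumes the resource is named default, that the cluster serves the sailoperator.io/v1 API version, and that the Collector accepts OTLP over gRPC on port 4317. The Collector Service name is a placeholder, not a value taken from this document.

```yaml
apiVersion: sailoperator.io/v1   # assumption: adjust to the API version served in your cluster
kind: Istio
metadata:
  name: default
spec:
  values:
    meshConfig:
      enableTracing: true
      extensionProviders:
        - name: otel
          opentelemetry:
            # Placeholder: replace <collector-service> with the real Service name.
            service: "<collector-service>.jaeger-system.svc.cluster.local"
            port: 4317   # assumption: the standard OTLP gRPC port
```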
- The service field is the FQDN of the OpenTelemetry Collector Service. The default value targets the Collector deployed in the jaeger-system namespace as described in Deploying the OpenTelemetry Collector. Replace it with the actual Collector address if you deployed the Collector in a different namespace or with a different instance name.
To apply this configuration, patch the Istio resource:
This command uses a JSON merge patch, which replaces the entire meshConfig.extensionProviders array. If the Istio resource already defines other extension providers, they will be overwritten. To preserve them, edit the resource with kubectl edit istio default and append the otel entry by hand, or use a JSON Patch (--type=json) that appends to /spec/values/meshConfig/extensionProviders/-.
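As an illustration of the JSON Patch alternative, the patch body could be written as follows. This is a sketch in YAML (kubectl patch accepts patch bodies in JSON or YAML); the file name and the Collector Service name are placeholders. The add operation with a trailing /- appends to the existing array, and it fails if the extensionProviders array does not exist yet.

```yaml
# otel-provider-patch.yaml (hypothetical file name)
# JSON Patch: append the otel provider without replacing existing entries.
- op: add
  path: /spec/values/meshConfig/extensionProviders/-
  value:
    name: otel
    opentelemetry:
      # Placeholder: replace <collector-service> with the real Service name.
      service: "<collector-service>.jaeger-system.svc.cluster.local"
      port: 4317   # assumption: the standard OTLP gRPC port
```

Assuming the file is saved under that name, it could be applied with kubectl patch istio default --type=json --patch-file otel-provider-patch.yaml.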
Update the Telemetry resource to enable the tracing provider defined in the meshConfig:
Example Istio Telemetry resource
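A minimal sketch of such a Telemetry resource is shown here. The resource name, the namespace, the API version, and the 100 percent sampling rate are assumptions for illustration; only the provider name otel comes from the meshConfig described in this section.

```yaml
apiVersion: telemetry.istio.io/v1   # assumption: older Istio releases serve v1alpha1 instead
kind: Telemetry
metadata:
  name: otel-demo           # hypothetical name
  namespace: istio-system   # assumption: the mesh root namespace, so the setting applies mesh-wide
spec:
  tracing:
    - providers:
        - name: otel        # must match the extension provider name in meshConfig
      randomSamplingPercentage: 100   # assumption: sample every request while verifying the setup
```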
To apply this configuration, patch the Telemetry resource:
Once you have verified that traces appear, lower the randomSamplingPercentage value to reduce the share of requests that are sampled for tracing.
Uninstalling distributed tracing
If you no longer need the distributed tracing integration with Service Mesh, remove the configuration in the order below.
Removing the Service Mesh tracing configuration
Before removing the underlying components, detach the mesh from the OpenTelemetry Collector so that Istio stops sending spans.
- Edit the Telemetry resource and remove the tracing providers entry that references the otel provider:
  Alternatively, remove the tracing configuration non-interactively with kubectl patch:
- Edit the Istio resource and remove the meshConfig.extensionProviders entry named otel, or set meshConfig.enableTracing to false:
  Alternatively, set meshConfig.enableTracing to false non-interactively with kubectl patch:
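The two non-interactive variants can be sketched as JSON merge patch bodies written in YAML. The file names are hypothetical, and each document below belongs in its own file. In a JSON merge patch, a null value deletes the field it targets.

```yaml
# telemetry-disable-tracing.yaml (hypothetical file name)
# JSON merge patch: a null value deletes the spec.tracing field.
spec:
  tracing: null
---
# istio-disable-tracing.yaml (hypothetical file name)
# JSON merge patch: turns tracing off without touching extensionProviders.
spec:
  values:
    meshConfig:
      enableTracing: false
```

Assuming those file names, the patches could be applied with kubectl patch telemetry <name> -n <namespace> --type=merge --patch-file telemetry-disable-tracing.yaml and kubectl patch istio default --type=merge --patch-file istio-disable-tracing.yaml, where <name> and <namespace> identify your Telemetry resource.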
Uninstalling the OpenTelemetry Collector and Jaeger v2
Skip this step if other workloads in the cluster still rely on the OpenTelemetry Collector or the Jaeger v2 instance.
For instructions on deleting the OpenTelemetry Collector instance, the Jaeger v2 instance, and (optionally) the Alauda Build of OpenTelemetry v2 Operator, see Uninstalling Alauda Distributed Tracing.