Distributed Tracing and Service Mesh

Configuring distributed tracing platform with Service Mesh

Alauda Service Mesh supports distributed tracing through integration with the following components:

  • Alauda Build of Jaeger: A customized distribution based on the open source Jaeger project. It provides end-to-end visibility into requests across complex distributed systems.

  • Alauda Build of OpenTelemetry: Based on the OpenTelemetry project, this component simplifies telemetry data collection across metrics, logs, and traces by managing the OpenTelemetry Collector and workload instrumentation.

The OpenTelemetry Collector acts as an intermediary for telemetry signals. It supports multiple data formats and provides a standardized pipeline for processing and exporting telemetry to backends such as Jaeger.

Configuring distributed tracing data collection with Service Mesh

You can integrate Alauda Service Mesh with OpenTelemetry to instrument, generate, collect, and export OpenTelemetry traces, metrics, and logs to analyze and understand your software's performance and behavior.

Prerequisites

  • Alauda Build of Jaeger is installed.
  • Alauda Build of OpenTelemetry is installed.
  • A Jaeger instance is installed and configured in the cpaas-system namespace.
  • An Istio instance is created.
  • An Istio CNI instance is created.

Procedure

Create an OpenTelemetryCollector resource in the istio-system namespace:

Example OpenTelemetry Collector in istio-system namespace

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel
  namespace: istio-system
spec:
  observability:
    metrics: {}
  deploymentUpdateStrategy: {}
  config:
    processors:
      batch: {}
    exporters:
      debug: {}
      otlp:
        endpoint: 'dns:///jaeger-prod-collector-headless.cpaas-system:4317'
        balancer_name: round_robin
        tls:
          insecure: true
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: '0.0.0.0:4317'
    service:
      pipelines:
        traces:
          exporters:
            - debug
            - otlp
          processors:
            - batch
          receivers:
            - otlp

In this example, the exporters.otlp.endpoint field points to the Jaeger collector service in the cpaas-system namespace.

Update the Istio resource to enable tracing and define the OpenTelemetry tracing provider:

Example: Enabling tracing via meshConfig

apiVersion: sailoperator.io/v1
kind: Istio
metadata:
  name: default
  # ...
spec:
  namespace: istio-system
  # ...
  values:
    meshConfig:
      enableTracing: true
      extensionProviders:
      - name: otel
        opentelemetry:
          port: 4317
          service: otel-collector.istio-system.svc.cluster.local

In this example, the service field is the address of the OpenTelemetry Collector service in the istio-system namespace.

Update the Telemetry resource to enable the tracing provider defined in the meshConfig:

Example Istio Telemetry resource

apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: asm-default
  namespace: istio-system
  # ...
spec:
  # ...
  tracing:
    - providers:
        - name: otel
      randomSamplingPercentage: 100

NOTE

After you verify that traces are visible, lower the randomSamplingPercentage value, or remove the field to fall back to the default, to reduce the number of requests that are traced.
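
As a minimal sketch of the adjustment described in the note, the same Telemetry resource could be updated with a lower sampling rate. The value 10 is an illustrative assumption, not a recommendation from this guide; choose a rate that matches your traffic volume and storage budget.

```yaml
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: asm-default
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: otel
      # Illustrative value (assumption): trace roughly 10% of requests
      # instead of the 100% used while verifying the setup.
      randomSamplingPercentage: 10
```

Because the resource lives in the istio-system namespace, which is the mesh root namespace in this example, the sampling rate applies mesh-wide unless a more specific Telemetry resource overrides it.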