OTel

OpenTelemetry (OTel) is an open-source project aimed at providing a vendor-neutral standard for collecting, processing, and exporting telemetry data in distributed systems, such as microservices architectures. It helps developers analyze the performance and behavior of software more easily, thus facilitating the diagnosis and resolution of application issues.

Terminology

TermExplanation
TraceThe data submitted to the OTel Server, which is a collection of related events or operations used to track the flow of requests in distributed systems; each Trace consists of multiple Spans.
SpanAn independent operation or event within a Trace that includes start time, duration, and other relevant information.
OTel ServerAn OTel server capable of receiving and storing Trace data, such as Jaeger, Prometheus, etc.
JaegerAn open-source distributed tracing system used for monitoring and troubleshooting microservices architectures, supporting integration with OpenTelemetry.
AttributesKey-value pairs attached to a Trace or Span to provide additional contextual information. This includes Resource Attributes and Span Attributes; see Attributes for more information.
SamplerA strategy component that determines whether to sample and report a Trace. Different sampling strategies can be configured, such as full sampling, proportional sampling, etc.
ALB (Another Load Balancer)A software or hardware device that distributes network requests across available nodes in a cluster; the load balancer (ALB) used in the platform is a layer 7 software load balancer, which can be configured to monitor traffic with OTel. ALB supports submitting Traces to a specified Collector and allows different sampling strategies; it also supports configuring whether to submit Traces at the Ingress level.
FT (Frontend)The port configuration for ALB, specifying port-level configurations.
RuleRouting rules on the port (FT) used to match specific routes.
HotROD (Rides on Demand)A sample application provided by Jaeger to demonstrate the use of distributed tracing; refer to Hot R.O.D. - Rides on Demand for more details.
hotrod-with-proxySpecifies the addresses of HotROD's internal microservices via environment variables; refer to hotrod-with-proxy for more details.

Prerequisites

  • Ensure that an operable ALB exists: Create or use an existing ALB, where the name of the ALB is replaced with <otel-alb> in this document. For instructions on creating an ALB, refer to Creating Load Balancer.

  • Ensure that there is an OTel data reporting server address: This address will hereinafter be referred to as <jaeger-server>.

Steps

Update ALB Configuration

  1. On the Master node of the cluster, use the CLI tool to execute the following command to edit the ALB configuration.

    kubectl edit alb2 -n cpaas-system <otel-alb> # Replace <otel-alb> with the actual ALB name
  2. Add the following fields under the spec.config section.

    otel:
      enable: true
      exporter:
        collector:
          address: "<jaeger-server>" # Replace <jaeger-server> with the actual OTel data reporting server address
          request_timeout: 1000
    

    Example configuration once completed:

    spec:
      address: 192.168.1.1
      config:
        otel:
         enable: true
         exporter:
           collector:
             address: "http://jaeger.default.svc.cluster.local:4318"
             request_timeout: 1000
        antiAffinityKey: system
        defaultSSLCert: cpaas-system/cpaas-system
        defaultSSLStrategy: Both
        gateway:
        ...
    type: nginx
    
  3. Execute the following command to save the updates. After the update, the ALB will default to enabling OpenTelemetry, and all request Trace information will be reported to the Jaeger Server.

    :wq

Configuring OTel in Ingress

  • Enable or Disable OTel on Ingress

    By configuring whether to enable OTel on Ingress, you can better monitor and debug the request flow of applications, identifying performance bottlenecks or errors by tracing requests as they propagate between different services.

    Steps

    Add the following configuration under the metadata.annotations field of Ingress:

    nginx.ingress.kubernetes.io/enable-opentelemetry: "true"

    Parameter Explanation:

    • nginx.ingress.kubernetes.io/enable-opentelemetry: When set to true, it indicates that the Ingress controller enables OpenTelemetry functionality while processing requests through this Ingress, which means request Trace information will be collected and reported. When set to false or this annotation is removed, it means that request Trace information will not be collected or reported.
  • Enable or Disable OTel Trust on Ingress

    OTel Trust determines whether Ingress trusts and uses the Trace information (e.g., trace ID) from incoming requests.

    Steps

    Add the following configuration under the metadata.annotations field of Ingress:

    nginx.ingress.kubernetes.io/opentelemetry-trust-incoming-span: "true"

    Parameter Explanation:

    • nginx.ingress.kubernetes.io/opentelemetry-trust-incoming-span: When set to true, the Ingress continues to use already existing Trace information, helping maintain consistency in cross-service tracing, allowing the entire request chain to be fully traced and analyzed in the distributed tracing system. When set to false, it will generate new tracing information for the request, which may cause the request to be treated as part of a new tracing chain after entering the Ingress, interrupting cross-service trace continuity.
  • Add Different OTel Configurations on Ingress

    This configuration allows you to customize OTel's behavior and data export methodology for different Ingress resources, enabling fine-grained control over each service's tracing strategy or target.

    Steps

    Add the following configuration under the metadata.annotations field of Ingress:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        alb.ingress.cpaas.io/otel: >
         {
            "enable": true,
            "exporter": {
                "collector": {
                    "address": "<jaeger-server>", # Replace <jaeger-server> with the actual OTel data reporting server address, e.g., "address": "http://128.0.0.1:4318"
                    "request_timeout": 1000
                }
            }
         }
    

    Parameter Explanation:

    • exporter: Specifies how the collected Trace data is sent to the OTel Collector (the OTel data reporting server).
    • address: Specifies the address of the OTel Collector.
    • request_timeout: Specifies the request timeout.

Using OTel in Applications

The following configuration shows the complete OTel configuration structure, which can be used to define how to enable and use OTel features in applications.

On the cluster Master node, use the CLI tool to execute the following command to get the complete OTel configuration structure.

kubectl get crd alaudaloadbalancer2.crd.alauda.io -o json|jq ".spec.versions[2].schema.openAPIV3Schema.properties.spec.properties.config.properties.otel"

Echoed Result:

{ "otel": { "enable": true } "exporter": { "collector": { "address": "" }, }, "flags": { "hide_upstream_attrs": false "notrust_incoming_span": false "report_http_request_header": false "report_http_response_header": false }, "sampler": { "name": "", "options": { "fraction": "" "parent_name": "" }, }, }

Parameter Explanation:

ParameterDescription
otel.enableWhether to enable OTel functionality.
exporter.collector.addressThe address of the OTel data reporting server, supporting http/https protocols and domain names.
flags.hide_upstream_attrsWhether to report information about upstream rules.
flag.notrust_incoming_spanWhether to trust and use the OTel Trace information (e.g., trace ID) from incoming requests.
flags.report_http_request_headerWhether to report request headers.
flags.report_http_response_headerWhether to report response headers.
sampler.nameSampling strategy name; see Sampling Strategies for details.
sampler.options.fractionSampling rate.
sampler.options.parent_nameThe parent strategy for parent_base sampling strategies.

Inheritance

By default, if the ALB configures certain OTel parameters and FT is not configured, FT will inherit the parameters from the ALB as its own configuration; that is, FT inherits the ALB configuration, while Rule can inherit configurations from both ALB and FT.

  • ALB: The configuration on the ALB is typically global and default. You can configure global parameters such as Collector addresses here, which will be inherited by the lower-level FT and Rule.

  • FT: FT can inherit configurations from ALB, meaning that certain OTel parameters that are not configured on FT will use the configuration from ALB. However, FT can also be refined further; for instance, you can choose to selectively enable or disable OTel on FT without affecting other FT or the global settings of ALB.

  • Rule: Rule can inherit configurations from both ALB and FT. However, Rule can also be refined further; for instance, a specific Rule can choose not to trust the incoming OTel Trace information or to adjust the sampling strategies.

Steps

By configuring the spec.config.otel field in the YAML files of ALB, FT, and Rule, you can add OTel-related configuration.

Additional Notes

Sampling Strategies

ParameterExplanation
always onAlways report all tracing data.
always offNever report tracing data.
traceid-ratioDecide whether to report based on traceid. The format of traceparent is xx-traceid-xx-flag, where the first 16 characters of traceid represent a 32-bit hexadecimal integer. If this integer is less than fraction multiplied by 4294967295 (i.e., (2^32-1)), it will be reported.
parent-baseDecide whether to report based on the flag part of the traceparent in the request. When the flag is 01, it will be reported; for example: curl -v "http://$ALB_IP/" -H 'traceparent: 00-xx-xx-01'; when the flag is 02, it will not be reported; for example: curl -v "http://$ALB_IP/" -H 'traceparent: 00-xx-xx-02'.

Attributes

  • Resource Attributes

    These attributes are reported by default.

    ParameterDescription
    hostnameThe hostname of the ALB Pod
    service.nameThe name of the ALB
    service.namespaceThe namespace where the ALB resides
    service.typeDefault is ALB
    service.instance.idThe name of the ALB Pod
  • Span Attributes

    • Attributes reported by default:

      ParameterDescription
      http.status_codeStatus code
      http.request.resend_countRetry count
      alb.rule.rule_nameThe name of the rule matched by this request
      alb.rule.source_typeThe type of the rule matched by this request, currently only Ingress
      alb.rule.source_nameThe name of the Ingress
      alb.rule.source_nsThe namespace where the Ingress resides
    • Attributes reported by default but can be excluded by modifying the flag.hide_upstream_attrs field:

      ParameterDescription
      alb.upstream.svc_nameThe name of the Service (internal route) to which traffic is forwarded
      alb.upstream.svc_nsThe namespace where the Service (internal route) being forwarded resides
      alb.upstream.peerThe IP address and port of the Pod being forwarded to
    • Attributes not reported by default but can be reported by modifying the flag.report_http_request_header field:

      ParameterDescription
      **http.request.header.<header>**Request Header
    • Attributes not reported by default but can be reported by modifying the flag.report_http_response_header field:

      ParameterDescription
      **http.response.header.<header>**Response Header

Configuration Example

The following YAML configuration deploys an ALB and uses Jaeger as the OTel server, with Hotrod-proxy as the demonstration backend. By configuring Ingress rules, when clients request the ALB, the traffic will be forwarded to HotROD. Additionally, the communication between internal microservices of HotROD is also routed through the ALB.

  1. Save the following YAML as a file named all.yaml.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hotrod
    spec:
      replicas: 1
      selector:
        matchLabels:
          service.cpaas.io/name: hotrod
          service_name: hotrod
      template:
        metadata:
          labels:
            service.cpaas.io/name: hotrod
            service_name: hotrod
        spec:
          containers:
            - name: hotrod
              env:
                - name: PROXY_PORT
                  value: "80"
                - name: PROXY_ADDR
                  value: "otel-alb.default.svc.cluster.local:"
                - name: OTEL_EXPORTER_OTLP_ENDPOINT
                  value: "http://jaeger.default.svc.cluster.local:4318"
              image: theseedoaa/hotrod-with-proxy:latest
              imagePullPolicy: IfNotPresent
              command: ["/bin/hotrod","all","-v"]
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: hotrod-frontend
    spec:
      ingressClassName: otel-alb
      rules:
      - http:
          paths:
          - backend:
              service:
                name: hotrod
                port:
                  number: 8080
            path: /dispatch
            pathType: ImplementationSpecific
          - backend:
              service:
                name: hotrod
                port:
                  number: 8080
            path: /frontend
            pathType: ImplementationSpecific
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: hotrod-customer
    spec:
      ingressClassName: otel-alb
      rules:
      - http:
          paths:
          - backend:
              service:
                name: hotrod
                port:
                  number: 8081
            path: /customer
            pathType: ImplementationSpecific
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: hotrod-route
    spec:
      ingressClassName: otel-alb
      rules:
      - http:
          paths:
          - backend:
              service:
                name: hotrod
                port:
                  number: 8083
            path: /route
            pathType: ImplementationSpecific
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hotrod
    spec:
      internalTrafficPolicy: Cluster
      ipFamilies:
        - IPv4
      ipFamilyPolicy: SingleStack
      ports:
        - name: frontend
          port: 8080
          protocol: TCP
          targetPort: 8080
        - name: customer
          port: 8081
          protocol: TCP
          targetPort: 8081
        - name: router
          port: 8083
          protocol: TCP
          targetPort: 8083
      selector:
        service_name: hotrod
      sessionAffinity: None
      type: ClusterIP
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: jaeger
    spec:
      replicas: 1
      selector:
        matchLabels:
          service.cpaas.io/name: jaeger
          service_name: jaeger
      template:
        metadata:
          labels:
            service.cpaas.io/name: jaeger
            service_name: jaeger
        spec:
          containers:
            - name: jaeger
              env:
               - name: LOG_LEVEL
                 value: debug
              image: jaegertracing/all-in-one:1.58.1
              imagePullPolicy: IfNotPresent
          hostNetwork: true
          tolerations:
            - operator: Exists
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: jaeger
    spec:
      internalTrafficPolicy: Cluster
      ipFamilies:
        - IPv4
      ipFamilyPolicy: SingleStack
      ports:
        - name: http
          port: 4318
          protocol: TCP
          targetPort: 4318
      selector:
        service_name: jaeger
      sessionAffinity: None
      type: ClusterIP
    ---
    apiVersion: crd.alauda.io/v2
    kind: ALB2
    metadata:
      name: otel-alb
    spec:
      config:
        loadbalancerName: otel-alb
        otel:
          enable: true
          exporter:
            collector:
              address: "http://jaeger.default.svc.cluster.local:4318"
              request_timeout: 1000
        projects:
        - ALL_ALL
        replicas: 1
        resources:
          alb:
            limits:
              cpu: 200m
              memory: 2Gi
            requests:
              cpu: 50m
              memory: 128Mi
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 50m
            memory: 128Mi
      type: nginx
    
  2. In the CLI tool, execute the following command to deploy Jaeger, ALB, HotROD, and all necessary CRs for testing.

    kubectl apply ./all.yaml
  3. Execute the following command to get the access address of Jaeger.
    export JAEGER_IP=$(kubectl get po -A -o wide |grep jaeger | awk '{print $7}');echo "http://$JAEGER_IP:16686"
  4. Execute the following command to obtain the access address of otel-alb.

    export ALB_IP=$(kubectl get po -A -o wide|grep otel-alb | awk '{print $7}');echo $ALB_IP
  5. Execute the following command to send a request to HotROD via ALB. Here, ALB will report the Trace to Jaeger.

    curl -v "http://<$ALB_IP>:80/dispatch?customer=567&nonse=" # Replace <$ALB_IP> in the command with the access address of otel-alb obtained in the previous steps
  6. Open the access address of Jaeger obtained in Step 3 to view the results.