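Installation
WARNING
This deployment document applies only to scenarios where the container platform is integrated with the tracing system.
The tracing component and the service mesh component are mutually exclusive. If you have already deployed the service mesh component, uninstall it first.
This guide walks cluster administrators through installing the tracing system on an Alauda Container Platform cluster.
Prerequisites:
- You can access the Alauda Container Platform cluster with an account that has the platform-admin-system permission.
- You have installed the kubectl CLI.
- An Elasticsearch component is set up to store tracing data, and you have its access URL and Basic Auth credentials.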
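The prerequisites above can be checked from a terminal before you start. The following is a minimal sketch rather than part of the official procedure: the function names `require_kubectl` and `es_http_status` are hypothetical, and it assumes that `curl` is installed and that the Elasticsearch endpoint answers HTTP requests at its base URL.

```shell
#!/bin/bash
# Hypothetical helpers for checking the prerequisites; not part of the product.

# Succeeds if the kubectl CLI can be found on PATH.
require_kubectl() {
  command -v kubectl >/dev/null 2>&1
}

# Prints the HTTP status code that Elasticsearch returns for the given
# URL and Basic Auth credentials (plain text here, not base64).
es_http_status() {
  local url="$1" user="$2" pass="$3"
  curl -sS -o /dev/null -w '%{http_code}' -u "${user}:${pass}" "${url}"
}

# Example usage (all values are placeholders):
#   require_kubectl || echo "kubectl not found"
#   es_http_status 'https://xxx' 'user' 'password'
#   A 200 status suggests that the URL and credentials work.
```

If Elasticsearch uses a self-signed certificate, `curl` may need an extra option such as `-k` for this check to get through.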
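Installing the Jaeger Operator
Installing the Jaeger Operator through the web console
You can install the Jaeger Operator from Marketplace → OperatorHub in Alauda Container Platform, which lists the available Operators.
Steps
- In the Platform Management view of the web console, select the cluster where you want to deploy the Jaeger Operator, then navigate to Marketplace → OperatorHub.
- Use the search box to search the catalog for Alauda build of Jaeger, then click the Alauda build of Jaeger tile.
- Read the introductory information about the Operator on the Alauda build of Jaeger page, then click Install.
- On the Install page:
  - Click Install.
  - Verify that Status shows Succeeded to confirm that the Jaeger Operator installed correctly.
- Check that all components of the Jaeger Operator installed successfully. Log in to the cluster from a terminal and run the following command: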
kubectl -n jaeger-operator get csv
Example output
NAME                     DISPLAY           VERSION   REPLACES   PHASE
jaeger-operator.vx.x.0   Jaeger Operator   x.x.0                Succeeded
If the PHASE field shows Succeeded, the Operator and its components installed successfully.
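The PHASE check can also be scripted, which is handy when the installation is automated. This is a minimal sketch under stated assumptions: `csv_all_succeeded` is a hypothetical helper name, and it relies only on `kubectl get csv` with a JSONPath expression over the `status.phase` field of each ClusterServiceVersion.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the namespace contains at least one
# CSV and every CSV in it reports the Succeeded phase.
csv_all_succeeded() {
  local namespace="$1" phases phase
  phases=$(kubectl -n "${namespace}" get csv -o jsonpath='{.items[*].status.phase}')
  [ -n "${phases}" ] || return 1
  for phase in ${phases}; do
    [ "${phase}" = "Succeeded" ] || return 1
  done
}

# Example usage:
#   csv_all_succeeded jaeger-operator && echo "Jaeger Operator is ready"
```

The same helper can be pointed at any other Operator namespace by changing its argument.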
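Deploying the Jaeger instance
The Jaeger instance and its related resources are installed with the install-jaeger.sh script, which takes three parameters:
- --es-url: the access URL of Elasticsearch.
- --es-user-base64: the Elasticsearch Basic Auth username, base64-encoded.
- --es-pass-base64: the Elasticsearch Basic Auth password, base64-encoded.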
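Because the two credential parameters must already be base64-encoded, it may help to see one way of producing them. The sketch below is illustrative only: `b64` is a hypothetical helper name and the credentials shown in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: base64-encode a value. printf avoids encoding a
# trailing newline, and tr removes any line wrapping from the output.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example usage (credentials are placeholders):
#   ES_USER_BASE64=$(b64 'elastic')
#   ES_PASS_BASE64=$(b64 'changeme')
#   ./install-jaeger.sh --es-url='https://xxx' \
#     --es-user-base64="${ES_USER_BASE64}" --es-pass-base64="${ES_PASS_BASE64}"
```

Using `printf '%s'` instead of `echo` matters here: `echo` appends a newline, which would be encoded into the stored username or password and could cause authentication against Elasticsearch to fail.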
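Copy the installation script from DETAILS, log in to the cluster where you want to install it, save it as install-jaeger.sh, grant it execute permission, and run it: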
DETAILS
#!/bin/bash
set -euo pipefail
# get arg
while [ "$#" -gt 0 ]; do
  case $1 in
    --es-url=*)
      ES_URL="${1#*=}"
      ;;
    --es-user-base64=*)
      ES_USER_BASE64="${1#*=}"
      ;;
    --es-pass-base64=*)
      ES_PASS_BASE64="${1#*=}"
      ;;
    *)
      echo "unknown argument: $1"
      exit 1
      ;;
  esac
  shift
done
# print arg
echo "ES_URL: $ES_URL"
echo "ES_USER_BASE64: $ES_USER_BASE64"
echo "ES_PASS_BASE64: $ES_PASS_BASE64"
# get global-info from ConfigMap
CLUSTER_NAME=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.clusterName}')
echo "CLUSTER_NAME: ${CLUSTER_NAME}"
ISSUER_URL=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.oidcIssuer}')
CLIENT_ID=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.oidcClientID}')
CLIENT_SECRET=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.oidcClientSecret}')
CLIENT_SECRET_BASE64=$(echo -n "${CLIENT_SECRET}" | base64 -w0)
PLATFORM_URL=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.platformURL}')
echo "PLATFORM_URL: ${PLATFORM_URL}"
TARGET_NAMESPACE="cpaas-system"
JAEGER_BASEPATH="clusters/$CLUSTER_NAME/acp/jaeger"
_apply_resource() {
  if [ -z "$1" ]; then
    echo "Usage: _apply_resource <yaml_content>"
    return 1
  fi
  local yaml_content="$1"
  echo "$yaml_content" | kubectl apply -f -
}
_install_configmap() {
  local yaml_content=$(cat <<EOF
apiVersion: v1
data:
  OAUTH2_PROXY_CLIENT_ID: $CLIENT_ID
  OAUTH2_PROXY_COOKIE_SECURE: "false"
  OAUTH2_PROXY_EMAIL_DOMAINS: "*"
  OAUTH2_PROXY_HTTP_ADDRESS: 0.0.0.0:4180
  OAUTH2_PROXY_INSECURE_OIDC_ALLOW_UNVERIFIED_EMAIL: "true"
  OAUTH2_PROXY_OIDC_ISSUER_URL: $ISSUER_URL
  OAUTH2_PROXY_PROVIDER: oidc
  OAUTH2_PROXY_PROXY_PREFIX: /$JAEGER_BASEPATH/oauth2
  OAUTH2_PROXY_REDIRECT_URL: $PLATFORM_URL/$JAEGER_BASEPATH/oauth2/callback
  OAUTH2_PROXY_SCOPE: openid profile email groups ext
  OAUTH2_PROXY_SKIP_JWT_BEARER_TOKENS: "true"
  OAUTH2_PROXY_SKIP_PROVIDER_BUTTON: "true"
  OAUTH2_PROXY_SSL_INSECURE_SKIP_VERIFY: "true"
  OAUTH2_PROXY_UPSTREAMS: http://127.0.0.1:16686
kind: ConfigMap
metadata:
  name: jaeger-oauth2-proxy
  namespace: $TARGET_NAMESPACE
EOF
)
  _apply_resource "$yaml_content"
}
_install_secret() {
  local yaml_content=$(cat <<EOF
apiVersion: v1
data:
  OAUTH2_PROXY_CLIENT_SECRET: $CLIENT_SECRET_BASE64
  OAUTH2_PROXY_COOKIE_SECRET: $CLIENT_SECRET_BASE64
kind: Secret
metadata:
  name: jaeger-oauth2-proxy
  namespace: $TARGET_NAMESPACE
type: Opaque
---
apiVersion: v1
data:
  ES_PASSWORD: $ES_PASS_BASE64
  ES_USERNAME: $ES_USER_BASE64
kind: Secret
metadata:
  name: jaeger-elasticsearch-basic-auth
  namespace: $TARGET_NAMESPACE
type: Opaque
EOF
)
  _apply_resource "$yaml_content"
}
_install_sa() {
  local yaml_content=$(cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jaeger-prod-acp
  namespace: $TARGET_NAMESPACE
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jaeger-prod-acp
  namespace: $TARGET_NAMESPACE
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jaeger-prod-acp
  namespace: $TARGET_NAMESPACE
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jaeger-prod-acp
subjects:
- kind: ServiceAccount
  name: jaeger-prod-acp
  namespace: $TARGET_NAMESPACE
EOF
)
  _apply_resource "$yaml_content"
}
_install_jaeger() {
  local yaml_content=$(cat <<EOF
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-prod
  namespace: $TARGET_NAMESPACE
spec:
  collector:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: jaeger-prod-collector
          topologyKey: kubernetes.io/hostname
    replicas: 1
    resources:
      limits:
        cpu: "2"
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 256Mi
  imagePullSecrets:
  - name: global-registry-auth
  ingress:
    enabled: false
  labels:
    service_name: jaeger
  query:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: jaeger-prod-query
          topologyKey: kubernetes.io/hostname
    annotations:
      oauth2-proxy.github.io/image: ""
      oauth2-proxy.github.io/inject: "true"
      oauth2-proxy.github.io/oidc-configmap: jaeger-oauth2-proxy
      oauth2-proxy.github.io/oidc-secret: jaeger-oauth2-proxy
      oauth2-proxy.github.io/proxyCPULimit: 100m
      oauth2-proxy.github.io/proxyMemoryLimit: 128Mi
    options:
      query:
        base-path: /$JAEGER_BASEPATH
    replicas: 1
    resources:
      limits:
        cpu: "1"
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 256Mi
  resources:
    limits:
      cpu: 100m
      memory: 300Mi
    requests:
      cpu: 100m
      memory: 300Mi
  sampling:
    options: {}
  serviceAccount: jaeger-prod-acp
  storage:
    dependencies:
      enabled: false
      resources: {}
      schedule: 55 23 * * *
    elasticsearch:
      name: elasticsearch
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
    esIndexCleaner:
      enabled: true
      numberOfDays: 7
      resources: {}
      schedule: 55 23 * * *
    esRollover:
      resources: {}
      schedule: 0 0 * * *
    options:
      es.asm.cname: jaeger-elasticsearch-basic-auth
      es.asm.cnamespace: $TARGET_NAMESPACE
      es.index-prefix: acp-tracing-$CLUSTER_NAME
      es.max-span-age: 168h0m0s
      es.server-urls: $ES_URL
      es.tls.enabled: true
      es.tls.skip-host-verify: true
    secretName: ""
    type: elasticsearch
  strategy: production
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  ui:
    options:
      dependencies:
        menuEnabled: false
EOF
)
  _apply_resource "$yaml_content"
}
_install_pod_monitor() {
  local yaml_content=$(cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  labels:
    monitoring: pods
    prometheus: kube-prometheus
  name: jaeger-monitor
  namespace: $TARGET_NAMESPACE
spec:
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    matchNames:
    - $TARGET_NAMESPACE
  podMetricsEndpoints:
  - interval: 60s
    path: /metrics
    port: admin-http
  selector:
    matchLabels:
      app.kubernetes.io/instance: jaeger-prod
EOF
)
  _apply_resource "$yaml_content"
}
_install_ingress() {
  local alb_annotation=""
  if [[ "$CLUSTER_NAME" == "global" ]]; then
    alb_annotation=$(cat <<EOF
    alb.ingress.cpaas.io/rewrite-request: |
      {"headers_var":{"Authorization":"cookie_cpaas_id_token"}}
EOF
)
  fi
  local ingress_class=""
  if [[ "$CLUSTER_NAME" != "global" ]]; then
    ingress_class="  ingressClassName: cpaas-system"
  fi
  local yaml_content=$(cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jaeger-query
  namespace: $TARGET_NAMESPACE
  annotations:
    nginx.ingress.kubernetes.io/enable-cors: "true"
$alb_annotation
spec:
$ingress_class
  rules:
  - http:
      paths:
      - path: /$JAEGER_BASEPATH
        pathType: ImplementationSpecific
        backend:
          service:
            name: jaeger-prod-query
            port:
              number: 4180
EOF
)
  _apply_resource "$yaml_content"
}
# Run the installation
_install_configmap
_install_secret
_install_sa
_install_jaeger
_install_pod_monitor
_install_ingress
echo "Jaeger installation completed"
Script execution example:
./install-jaeger.sh --es-url='https://xxx' --es-user-base64='xxx' --es-pass-base64='xxx'
Script output example:
ES_URL: https://xxx
ES_USER_BASE64: xxx
ES_PASS_BASE64: xxx
CLUSTER_NAME: cluster-xxx
PLATFORM_URL: https://xxx
INSTALLED_CSV: jaeger-operator.vx.x.x
OAUTH2_PROXY_IMAGE: build-harbor.alauda.cn/3rdparty/oauth2-proxy/oauth2-proxy:vx.x.x
configmap/jaeger-oauth2-proxy created
secret/jaeger-oauth2-proxy created
secret/jaeger-elasticsearch-basic-auth created
serviceaccount/jaeger-prod-acp created
role.rbac.authorization.k8s.io/jaeger-prod-acp created
rolebinding.rbac.authorization.k8s.io/jaeger-prod-acp created
jaeger.jaegertracing.io/jaeger-prod created
podmonitor.monitoring.coreos.com/jaeger-monitor created
ingress.networking.k8s.io/jaeger-query created
Jaeger installation completed
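The script output only confirms that the resources were created, not that the Jaeger Pods are running. A minimal sketch of a follow-up check is shown below. `wait_for_jaeger` is a hypothetical helper, and the Deployment names `jaeger-prod-collector` and `jaeger-prod-query` are assumptions based on the Jaeger Operator's usual `<instance>-collector` and `<instance>-query` naming, so adjust them if your cluster differs.

```shell
#!/bin/bash
# Hypothetical helper: waits for the Jaeger workloads to finish rolling out.
# The Deployment names are assumed, not taken from the installation script.
wait_for_jaeger() {
  local namespace="${1:-cpaas-system}" timeout="${2:-180s}" name
  for name in jaeger-prod-collector jaeger-prod-query; do
    kubectl -n "${namespace}" rollout status "deployment/${name}" \
      --timeout="${timeout}" || return 1
  done
}

# Example usage:
#   wait_for_jaeger cpaas-system 300s && echo "Jaeger is running"
```

`kubectl rollout status` blocks until the Deployment is available or the timeout expires, so the helper returns a non-zero status if either workload fails to start.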
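Installing the OpenTelemetry Operator
Installing the OpenTelemetry Operator through the web console
You can install the OpenTelemetry Operator from Marketplace → OperatorHub in Alauda Container Platform, which lists the available Operators.
Steps
- In the Platform Management view of the web console, select the cluster where you want to deploy the OpenTelemetry Operator, then navigate to Marketplace → OperatorHub.
- Use the search box to search the catalog for Alauda build of OpenTelemetry, then click the Alauda build of OpenTelemetry tile.
- Read the introduction to the Operator on the Alauda build of OpenTelemetry page, then click Install.
- On the Install page:
  - Click Install.
  - Verify that Status shows Succeeded to confirm that the OpenTelemetry Operator installed correctly.
- Check that all components of the OpenTelemetry Operator installed successfully. Log in to the cluster from a terminal and run the following command: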
kubectl -n opentelemetry-operator get csv
Example output
NAME                            DISPLAY                  VERSION   REPLACES   PHASE
opentelemetry-operator.vx.x.0   OpenTelemetry Operator   x.x.0                Succeeded
If the PHASE field shows Succeeded, the Operator and its components installed successfully.
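Deploying the OpenTelemetry instance
The OpenTelemetry instance and its related resources are installed with the install-otel.sh script.
Copy the installation script from DETAILS, log in to the cluster where you want to install it, save it as install-otel.sh, grant it execute permission, and run it: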
DETAILS
#!/bin/bash
set -euo pipefail
TARGET_NAMESPACE="cpaas-system"
# Get global information from the ConfigMap
CLUSTER_NAME=$(kubectl get configmap global-info -n kube-public -o jsonpath='{.data.clusterName}')
echo "CLUSTER_NAME: ${CLUSTER_NAME}"
_apply_resource() {
  if [ -z "$1" ]; then
    echo "Usage: _apply_resource <yaml_content>"
    return 1
  fi
  local yaml_content="$1"
  echo "$yaml_content" | kubectl apply -f -
}
_install_rbac() {
  local yaml_content=$(cat <<EOF
apiVersion: v1
imagePullSecrets:
- name: global-registry-auth
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: $TARGET_NAMESPACE
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector:cpaas-system:cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: otel-collector
  namespace: $TARGET_NAMESPACE
EOF
)
  _apply_resource "$yaml_content"
}
_install_otel_collector() {
  local yaml_content=$(cat <<EOF
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel
  namespace: $TARGET_NAMESPACE
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: otel-collector
        topologyKey: kubernetes.io/hostname
  config:
    exporters:
      debug: {}
      otlp:
        balancer_name: round_robin
        endpoint: dns:///jaeger-prod-collector-headless.$TARGET_NAMESPACE:4317
        tls:
          insecure: true
      prometheus:
        endpoint: 0.0.0.0:8889
    extensions:
      health_check:
        endpoint: 0.0.0.0:13133
    processors:
      batch: {}
      filter/metric_apis:
        metrics:
          datapoint:
          - attributes["http.route"] == "/actuator/health" or attributes["uri"] == "/actuator/health"
          - attributes["http.route"] == "/actuator/prometheus" or attributes["uri"] == "/actuator/prometheus"
      transform:
        metric_statements:
        - context: datapoint
          statements:
          - delete_key(attributes, "inner.client.ms.name")
          - delete_key(attributes, "inner.client.ms.namespace")
          - delete_key(attributes, "inner.client.cluster.name")
          - delete_key(attributes, "inner.client.env.type")
          - set(attributes["namespace"], resource.attributes["k8s.namespace.name"])
          - set(attributes["container"], resource.attributes["k8s.container.name"])
          - set(attributes["service_name"], resource.attributes["service.name"])
          - set(attributes["pod"], resource.attributes["k8s.pod.name"])
      memory_limiter:
        check_interval: 5s
        limit_percentage: 85
        spike_limit_percentage: 25
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    service:
      extensions:
      - health_check
      pipelines:
        metrics:
          exporters:
          - debug
          - prometheus
          processors:
          - memory_limiter
          - filter/metric_apis
          - transform
          - batch
          receivers:
          - otlp
        traces:
          exporters:
          - debug
          - otlp
          processors:
          - memory_limiter
          - batch
          receivers:
          - otlp
      telemetry:
        logs:
          level: info
        metrics:
          address: 0.0.0.0:8888
          level: detailed
  managementState: managed
  mode: deployment
  replicas: 1
  resources:
    limits:
      cpu: "2"
      memory: 1Gi
    requests:
      cpu: 250m
      memory: 512Mi
  securityContext:
    readOnlyRootFilesystem: true
    runAsNonRoot: true
  serviceAccount: otel-collector
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  upgradeStrategy: automatic
EOF
)
  _apply_resource "$yaml_content"
}
_install_instrumentation() {
  local yaml_content=$(cat <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: acp-common-java
  namespace: $TARGET_NAMESPACE
spec:
  env:
  - name: SERVICE_CLUSTER
    value: "$CLUSTER_NAME"
  - name: OTEL_TRACES_EXPORTER
    value: otlp
  - name: OTEL_METRICS_EXPORTER
    value: otlp
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://otel-collector.$TARGET_NAMESPACE:4317
  - name: OTEL_SERVICE_NAME
    value: \$(SERVICE_NAME).\$(SERVICE_NAMESPACE)
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: service.namespace=\$(SERVICE_NAMESPACE),cluster.name=\$(SERVICE_CLUSTER)
  sampler:
    type: parentbased_traceidratio
    argument: "1"
EOF
)
  _apply_resource "$yaml_content"
}
_install_service_monitor() {
  local yaml_content=$(cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    monitoring: services
    prometheus: kube-prometheus
  name: otel-collector-monitoring
  namespace: $TARGET_NAMESPACE
spec:
  endpoints:
  - interval: 60s
    path: /metrics
    port: monitoring
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    matchNames:
    - $TARGET_NAMESPACE
  selector:
    matchLabels:
      app.kubernetes.io/instance: $TARGET_NAMESPACE.otel
      operator.opentelemetry.io/collector-service-type: monitoring
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    monitoring: services
    prometheus: kube-prometheus
  name: otel-collector
  namespace: $TARGET_NAMESPACE
spec:
  endpoints:
  - honorLabels: true
    interval: 60s
    path: /metrics
    port: prometheus
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    matchNames:
    - $TARGET_NAMESPACE
  selector:
    matchLabels:
      app.kubernetes.io/instance: $TARGET_NAMESPACE.otel
      operator.opentelemetry.io/collector-service-type: base
EOF
)
  _apply_resource "$yaml_content"
}
_install_rbac
_install_otel_collector
_install_instrumentation
_install_service_monitor
echo "OpenTelemetry installation completed"
Script execution example:
./install-otel.sh
Script output example:
CLUSTER_NAME: cluster-xxx
serviceaccount/otel-collector created
clusterrolebinding.rbac.authorization.k8s.io/otel-collector:cpaas-system:cluster-admin created
opentelemetrycollector.opentelemetry.io/otel created
instrumentation.opentelemetry.io/acp-common-java created
servicemonitor.monitoring.coreos.com/otel-collector-monitoring created
servicemonitor.monitoring.coreos.com/otel-collector created
OpenTelemetry installation completed
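Enabling the feature switch
The tracing system is currently in the Alpha stage, so you must manually enable the acp-tracing-ui feature switch in the Feature Switch view.
Then navigate to the Container Platform view and open Observability → Tracing to view the tracing feature.
Uninstalling tracing
Deleting the OpenTelemetry instance
Log in to the cluster where tracing is installed and run the following commands to delete the OpenTelemetry instance and its related resources.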
kubectl -n cpaas-system delete servicemonitor otel-collector-monitoring
kubectl -n cpaas-system delete servicemonitor otel-collector
kubectl -n cpaas-system delete instrumentation acp-common-java
kubectl -n cpaas-system delete opentelemetrycollector otel
kubectl delete clusterrolebinding otel-collector:cpaas-system:cluster-admin
kubectl -n cpaas-system delete serviceaccount otel-collector
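If the cleanup is interrupted and has to be run again, `kubectl delete` reports an error for resources that are already gone. One way to make the commands safe to repeat is sketched below. `delete_if_present` is a hypothetical wrapper name, and it relies on the `--ignore-not-found` flag of `kubectl delete`.

```shell
#!/bin/bash
# Hypothetical wrapper: deletes a resource and treats "already deleted"
# as success, so the cleanup can be re-run safely.
delete_if_present() {
  kubectl delete "$@" --ignore-not-found
}

# Example usage:
#   delete_if_present -n cpaas-system servicemonitor otel-collector
#   delete_if_present clusterrolebinding otel-collector:cpaas-system:cluster-admin
```

The wrapper passes its arguments through unchanged, so it works for both namespaced and cluster-scoped resources.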
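Uninstalling the OpenTelemetry Operator
You can uninstall the OpenTelemetry Operator from the Platform Management view of the web console.
Steps
- Go to Marketplace → OperatorHub and use the search box to search for Alauda build of OpenTelemetry.
- Click the Alauda build of OpenTelemetry tile to open its details page.
- On the Alauda build of OpenTelemetry details page, click the Uninstall button in the upper-right corner.
- In the "Uninstall opentelemetry-operator?" dialog, click Uninstall.
Deleting the Jaeger instance
Log in to the cluster where tracing is installed and run the following commands to delete the Jaeger instance and its related resources.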
kubectl -n cpaas-system delete ingress jaeger-query
kubectl -n cpaas-system delete podmonitor jaeger-monitor
kubectl -n cpaas-system delete jaeger jaeger-prod
kubectl -n cpaas-system delete rolebinding jaeger-prod-acp
kubectl -n cpaas-system delete role jaeger-prod-acp
kubectl -n cpaas-system delete serviceaccount jaeger-prod-acp
kubectl -n cpaas-system delete secret jaeger-oauth2-proxy
kubectl -n cpaas-system delete secret jaeger-elasticsearch-basic-auth
kubectl -n cpaas-system delete configmap jaeger-oauth2-proxy
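After running the commands above, you may want to confirm that nothing was left behind before uninstalling the Operator. The sketch below is one possible check. `resource_gone` is a hypothetical helper, and it relies on `kubectl get --ignore-not-found` printing nothing when the resource does not exist.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the named resource no longer exists.
# Returns non-zero if the resource is still present or if kubectl fails.
resource_gone() {
  local namespace="$1" kind="$2" name="$3" found
  found=$(kubectl -n "${namespace}" get "${kind}" "${name}" \
    --ignore-not-found -o name) || return 1
  [ -z "${found}" ]
}

# Example usage:
#   resource_gone cpaas-system jaeger jaeger-prod && echo "jaeger-prod removed"
```

Checking the Jaeger custom resource first is a reasonable choice, since the Operator has to be running to process its deletion.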
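Uninstalling the Jaeger Operator
You can uninstall the Jaeger Operator from the Platform Management view of the web console.
Steps
- Go to Marketplace → OperatorHub and use the search box to search for Alauda build of Jaeger.
- Click the Alauda build of Jaeger tile to open its details page.
- On the Alauda build of Jaeger details page, click the Uninstall button in the upper-right corner.
- In the "Uninstall jaeger-operator?" dialog, click Uninstall.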