Install

Overview

The monitoring component serves as the infrastructure for monitoring, alerting, inspection, and health checking functions within the observability module. This document describes how to install the ACP Monitoring with Prometheus plugin or the ACP Monitoring with VictoriaMetrics plugin within a cluster.

Installation Preparation

Before installing the monitoring components, ensure the following conditions are met:

  • The appropriate monitoring component has been selected by referring to the Monitoring Component Selection Guide.
  • When installing in a workload cluster, ensure that the global cluster can access port 11780 of the workload cluster.
  • If you need to use storage classes or persistent volume storage for monitoring data, create the corresponding resources in the Storage section in advance.

Install the ACP Monitoring with Prometheus Plugin via console

Installation Procedures

  1. Navigate to App Store Management > Cluster Plugins and select the target cluster.

  2. Locate the ACP Monitoring with Prometheus plugin and click Install.

  3. Configure the following parameters:

    Parameter                 Description
    Scale Configuration       Supports three configurations: Small Scale, Medium Scale, and Large Scale:
                              - Default values are set based on the recommended load test values of the platform
                              - You can choose or customize quotas based on the actual cluster scale
                              - Default values will be updated with platform versions; for fixed configurations, custom settings are recommended
    Storage Type              - LocalVolume: Local storage with data stored on specified nodes
                              - StorageClass: Automatically generates persistent volumes using a storage class
                              - PV: Utilizes existing persistent volumes
                              Note: Storage configuration cannot be modified after installation
    Replica Count             Sets the number of monitoring component pods
                              Note: Prometheus supports only single-node installation
    Parameter Configuration   Data parameters for the monitoring component can be adjusted as needed
  4. Click Install to complete the installation.

Access Method

Once installation is complete, the components can be accessed at the following addresses (replace the placeholders in angle brackets with actual values):

Component       Access Address
Thanos          <platform_access_address>/clusters/<cluster>/prometheus
Prometheus      <platform_access_address>/clusters/<cluster>/prometheus-0
Alertmanager    <platform_access_address>/clusters/<cluster>/alertmanager
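
As a small illustration, the address patterns above can be filled in programmatically. The following is a minimal Python sketch; the function name access_addresses and the host https://acp.example.com are hypothetical and not part of the platform.

```python
def access_addresses(platform_access_address: str, cluster: str) -> dict:
    """Fill in the access address patterns for one cluster."""
    # Drop a trailing slash so the joined address has no double slash.
    base = f"{platform_access_address.rstrip('/')}/clusters/{cluster}"
    return {
        "Thanos": f"{base}/prometheus",
        "Prometheus": f"{base}/prometheus-0",
        "Alertmanager": f"{base}/alertmanager",
    }


# Hypothetical platform address, used only for illustration.
for component, address in access_addresses("https://acp.example.com", "global").items():
    print(f"{component}: {address}")
```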

Install the ACP Monitoring with Prometheus Plugin via YAML

1. Check available versions

Ensure the plugin has been published by checking for the ModulePlugin and ModuleConfig resources in the global cluster:


# kubectl get moduleplugin | grep prometheus 
prometheus                       30h
# kubectl get moduleconfig | grep prometheus
prometheus-v4.1.0                30h

This indicates that the ModulePlugin prometheus exists in the cluster and version v4.1.0 is published.
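
When several versions are published, the version strings can be pulled out of the command output. The Python sketch below assumes that ModuleConfig names follow the pattern <plugin>-<version>, as in the sample output above; the function name published_versions is hypothetical.

```python
def published_versions(moduleconfig_output: str, plugin: str) -> list:
    """Extract versions from the text printed by `kubectl get moduleconfig`.

    Assumption: each ModuleConfig is named `<plugin>-<version>`,
    for example prometheus-v4.1.0.
    """
    prefix = plugin + "-"
    versions = []
    for line in moduleconfig_output.splitlines():
        fields = line.split()
        # The first column holds the resource name; header lines do not match.
        if fields and fields[0].startswith(prefix):
            versions.append(fields[0][len(prefix):])
    return versions
```

The input is the captured output of the kubectl command, so the function itself needs no cluster access.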

2. Create a ModuleInfo

Create a ModuleInfo resource to install the plugin. The example below lists the configurable parameters; adjust the values for your environment:

kind: ModuleInfo
apiVersion: cluster.alauda.io/v1alpha1
metadata:
  name: global-prometheus
  labels:
    cpaas.io/cluster-name: global
    cpaas.io/module-name: prometheus
    cpaas.io/module-type: plugin
spec:
  version: v4.1.0
  config:
    storage:
      type: LocalVolume
      capacity: 40
      nodes:
        - xxx.xxx.xxx.xx
      path: /cpaas/monitoring
      storageClass: ""
      pvSelectorK: ""
      pvSelectorV: ""
    replicas: 1
    components:
      prometheus:
        retention: 7
        scrapeInterval: 60
        scrapeTimeout: 45
        resources: null
      nodeExporter:
        port: 9100
        resources: null
      alertmanager:
        resources: null
      kubeStateExporter:
        resources: null
      prometheusAdapter:
        resources: null
      thanosQuery:
        resources: null
    size: Small

Example resources settings, using prometheus as the component:

spec:
  config:
    components:
      prometheus:
        resources:
          limits:
            cpu: 2000m
            memory: 2000Mi
          requests:
            cpu: 1000m
            memory: 1000Mi

For more details, refer to Monitor Component Capacity Planning.
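
Kubernetes rejects a container whose requests exceed its limits, so a resources block can be sanity-checked before it is applied. The Python sketch below handles only the quantity forms used in this document (CPU as cores or millicores, memory with Ki, Mi, or Gi suffixes); it is not a complete Kubernetes quantity parser, and all function names are hypothetical.

```python
MEMORY_UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def cpu_millicores(value: str) -> int:
    """Convert '2000m' or '2' to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return round(float(value) * 1000)


def memory_bytes(value: str) -> int:
    """Convert '2000Mi', '1Gi', or a plain byte count to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def requests_within_limits(resources: dict) -> bool:
    """Return True when both CPU and memory requests are <= their limits."""
    requests, limits = resources["requests"], resources["limits"]
    return (
        cpu_millicores(requests["cpu"]) <= cpu_millicores(limits["cpu"])
        and memory_bytes(requests["memory"]) <= memory_bytes(limits["memory"])
    )
```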

YAML field reference (Prometheus):

Field path                                          Description
metadata.labels.cpaas.io/cluster-name               Target cluster name where the plugin is installed.
metadata.labels.cpaas.io/module-name                Must be prometheus.
metadata.labels.cpaas.io/module-type                Must be plugin.
metadata.name                                       ModuleInfo name (e.g., <cluster>-prometheus).
spec.version                                        Plugin version to install.
spec.config.storage.type                            Storage type: LocalVolume, StorageClass, or PV.
spec.config.storage.capacity                        Storage size for Prometheus (Gi). Minimum 30 Gi recommended.
spec.config.storage.nodes                           Node list when storage.type=LocalVolume. Up to 1 node supported.
spec.config.storage.path                            LocalVolume path when storage.type=LocalVolume.
spec.config.storage.storageClass                    StorageClass name when storage.type=StorageClass.
spec.config.storage.pvSelectorK                     PV selector key when storage.type=PV.
spec.config.storage.pvSelectorV                     PV selector value when storage.type=PV.
spec.config.replicas                                Replica count; only applicable to StorageClass/PV types.
spec.config.components.prometheus.retention         Data retention days.
spec.config.components.prometheus.scrapeInterval    Scrape interval seconds; applies to ServiceMonitors without interval.
spec.config.components.prometheus.scrapeTimeout     Scrape timeout seconds; must be less than scrapeInterval.
spec.config.components.prometheus.resources         Resource settings for Prometheus.
spec.config.components.nodeExporter.port            Node Exporter port (default 9100).
spec.config.components.nodeExporter.resources       Resource settings for Node Exporter.
spec.config.components.alertmanager.resources       Resource settings for Alertmanager.
spec.config.components.kubeStateExporter.resources  Resource settings for Kube State Exporter.
spec.config.components.prometheusAdapter.resources  Resource settings for Prometheus Adapter.
spec.config.components.thanosQuery.resources        Resource settings for Thanos Query.
spec.config.size                                    Monitoring scale: Small, Medium, or Large.
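
Some of the constraints in the field reference (the allowed storage types, the single-node limit for LocalVolume, and a scrape timeout shorter than the scrape interval) can be checked before the manifest is applied. The Python sketch below works on the spec.config mapping loaded as a dictionary. Treating path, storageClass, and the PV selector pair as required for their storage type is an interpretation of the table, not a documented rule, and the function name check_config is hypothetical.

```python
STORAGE_TYPES = ("LocalVolume", "StorageClass", "PV")
SIZES = ("Small", "Medium", "Large")


def check_config(config: dict, scraper: str = "prometheus") -> list:
    """Return a list of problems found in a spec.config mapping.

    `scraper` names the component that carries scrapeInterval and
    scrapeTimeout ("prometheus" in the manifest above).
    """
    problems = []
    storage = config.get("storage") or {}
    storage_type = storage.get("type")
    if storage_type not in STORAGE_TYPES:
        problems.append("storage.type must be LocalVolume, StorageClass, or PV")
    elif storage_type == "LocalVolume":
        if len(storage.get("nodes") or []) > 1:
            problems.append("storage.nodes supports up to 1 node")
        if not storage.get("path"):  # assumption: path is required here
            problems.append("storage.path is expected for LocalVolume")
    elif storage_type == "StorageClass":
        if not storage.get("storageClass"):  # assumption: required here
            problems.append("storage.storageClass is expected for StorageClass")
    elif not (storage.get("pvSelectorK") and storage.get("pvSelectorV")):
        # Remaining case is PV; assumption: both selector fields are required.
        problems.append("pvSelectorK and pvSelectorV are expected for PV")

    component = (config.get("components") or {}).get(scraper) or {}
    interval = component.get("scrapeInterval")
    timeout = component.get("scrapeTimeout")
    if interval is not None and timeout is not None and timeout >= interval:
        problems.append("scrapeTimeout must be less than scrapeInterval")

    size = config.get("size")
    if size is not None and size not in SIZES:
        problems.append("size must be Small, Medium, or Large")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the platform accepts the manifest.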

3. Verify installation

Since the ModuleInfo name changes upon creation, locate the resource via label to check the plugin status and version:

kubectl get moduleinfo -l cpaas.io/module-name=prometheus
NAME                                             CLUSTER         MODULE        DISPLAY_NAME   STATUS    TARGET_VERSION   CURRENT_VERSION   NEW_VERSION
global-e671599464a5b1717732c5ba36079795          global          prometheus    prometheus     Running   v4.1.0           v4.1.0            v4.1.0

Field explanations:

  • NAME: ModuleInfo resource name
  • CLUSTER: Cluster where the plugin is installed
  • MODULE: Plugin name
  • DISPLAY_NAME: Display name of the plugin
  • STATUS: Installation status; Running means successfully installed and running
  • TARGET_VERSION: Intended installation version
  • CURRENT_VERSION: Version before installation
  • NEW_VERSION: Latest available version for installation
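
For scripted checks, the tabular output can be turned into records keyed by the column names listed above. The Python sketch below splits on whitespace, which assumes that no cell is empty or contains spaces (true for the sample output, but not guaranteed for every DISPLAY_NAME). The function names are hypothetical.

```python
def parse_moduleinfo_table(output: str) -> list:
    """Parse `kubectl get moduleinfo` text output into a list of dicts.

    Assumption: cells are non-empty and contain no spaces, so each row
    splits into exactly as many fields as the header.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def is_running(row: dict) -> bool:
    """STATUS of Running means the plugin is installed and running."""
    return row.get("STATUS") == "Running"
```

A script can then poll the command until is_running returns True for the target cluster's row.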

Install the ACP Monitoring with VictoriaMetrics Plugin via console

Prerequisites

  • If you install only the VictoriaMetrics agent, ensure that the VictoriaMetrics Center has been installed in another cluster.

Installation Procedures

  1. Navigate to App Store Management > Cluster Plugins and select the target cluster.

  2. Locate the ACP Monitoring with VictoriaMetrics plugin and click Install.

  3. Configure the following parameters:

    Parameter                 Description
    Scale Configuration       Supports three configurations: Small Scale, Medium Scale, and Large Scale:
                              - Default values are set based on the recommended load test values of the platform
                              - You can choose or customize quotas based on the actual cluster scale
                              - Default values will be updated with platform versions; for fixed configurations, custom settings are recommended
    Install Agent Only        - Off: Install the complete VictoriaMetrics component suite
                              - On: Install only the VMAgent collection component, which relies on the VictoriaMetrics Center
    VictoriaMetrics Center    Select the cluster where the complete VictoriaMetrics component has been installed
    Storage Type              - LocalVolume: Local storage with data stored on specified nodes
                              - StorageClass: Automatically generates persistent volumes using a storage class
                              - PV: Utilizes existing persistent volumes
    Replica Count             Sets the number of monitoring component pods:
                              - LocalVolume storage type does not support multiple replicas
                              - For other storage types, please refer to on-screen prompts for configuration
    Parameter Configuration   Data parameters for the monitoring component can be adjusted
                              Note: Data may temporarily exceed the retention period before being deleted
  4. Click Install to complete the installation.

Install the ACP Monitoring with VictoriaMetrics Plugin via YAML

1. Check available versions

Ensure the plugin has been published by checking for the ModulePlugin and ModuleConfig resources in the global cluster:


# kubectl get moduleplugin | grep victoriametrics    
victoriametrics                       30h
# kubectl get moduleconfig | grep victoriametrics
victoriametrics-v4.1.0                30h

This indicates that the ModulePlugin victoriametrics exists in the cluster and version v4.1.0 is published.

2. Create a ModuleInfo

Create a ModuleInfo resource to install the plugin. The example below lists the configurable parameters; adjust the values for your environment:

kind: ModuleInfo
apiVersion: cluster.alauda.io/v1alpha1
metadata:
  name: business-1-victoriametrics
  labels:
    cpaas.io/cluster-name: business-1
    cpaas.io/module-name: victoriametrics
    cpaas.io/module-type: plugin
spec:
  version: v4.1.0
  config:
    storage:
      type: LocalVolume
      capacity: 40
      nodes:
        - xxx.xxx.xxx.xx
      path: /cpaas/monitoring
      storageClass: ""
      pvSelectorK: ""
      pvSelectorV: ""
    replicas: 1
    agentOnly: false
    agentReplicas: 1
    crossClusterDependency:
      victoriametrics: ""
    components:
      nodeExporter:
        port: 9100
        resources: null
      vmstorage:
        retention: 7
        resources: null
      kubeStateExporter:
        resources: null
      vmalert:
        resources: null
      prometheusAdapter:
        resources: null
      vmagent:
        scrapeInterval: 60
        scrapeTimeout: 45
        resources: null
      vminsert:
        resources: null
      alertmanager:
        resources: null
      vmselect:
        resources: null
    size: Small

Example resources settings, using vmagent as the component:

spec:
  config:
    components:
      vmagent:
        resources:
          limits:
            cpu: 2000m
            memory: 2000Mi
          requests:
            cpu: 1000m
            memory: 1000Mi

For more details, refer to Monitor Component Capacity Planning.

YAML field reference (VictoriaMetrics):

Field path                                          Description
metadata.labels.cpaas.io/cluster-name               Target cluster name where the plugin is installed.
metadata.labels.cpaas.io/module-name                Must be victoriametrics.
metadata.labels.cpaas.io/module-type                Must be plugin.
metadata.name                                       ModuleInfo name (e.g., <cluster>-victoriametrics).
spec.version                                        Plugin version to install.
spec.config.storage.type                            Storage type: LocalVolume, StorageClass, or PV.
spec.config.storage.capacity                        Storage size for VictoriaMetrics (Gi). Minimum 30 Gi recommended.
spec.config.storage.nodes                           Node list when storage.type=LocalVolume. Up to 1 node supported.
spec.config.storage.path                            LocalVolume path when storage.type=LocalVolume.
spec.config.storage.storageClass                    StorageClass name when storage.type=StorageClass.
spec.config.storage.pvSelectorK                     PV selector key when storage.type=PV.
spec.config.storage.pvSelectorV                     PV selector value when storage.type=PV.
spec.config.replicas                                Replica count; LocalVolume does not support multiple replicas.
spec.config.components.vmstorage.retention          Data retention days for vmstorage.
spec.config.components.vmagent.scrapeInterval       Scrape interval seconds; applies to ServiceMonitors without interval.
spec.config.components.vmagent.scrapeTimeout        Scrape timeout seconds; must be less than scrapeInterval.
spec.config.components.vmstorage.resources          Resource settings for vmstorage.
spec.config.components.nodeExporter.port            Node Exporter port (default 9100).
spec.config.components.nodeExporter.resources       Resource settings for Node Exporter.
spec.config.components.alertmanager.resources       Resource settings for Alertmanager.
spec.config.components.kubeStateExporter.resources  Resource settings for Kube State Exporter.
spec.config.components.prometheusAdapter.resources  Resource settings for Prometheus Adapter (used for HPA/custom metrics).
spec.config.components.vmagent.resources            Resource settings for vmagent.
spec.config.size                                    Monitoring scale: Small, Medium, or Large.
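
Two VictoriaMetrics rules from the console parameter description can also be checked on the manifest: LocalVolume storage does not support multiple replicas, and an agent-only installation relies on a VictoriaMetrics Center in another cluster. The Python sketch below assumes that crossClusterDependency.victoriametrics holds the name of the cluster running the VictoriaMetrics Center; that mapping is inferred from the example manifest and is not confirmed by this document. The function name is hypothetical.

```python
def check_victoriametrics(config: dict) -> list:
    """Return problems found in the spec.config of a victoriametrics ModuleInfo."""
    problems = []
    storage_type = (config.get("storage") or {}).get("type")
    replicas = config.get("replicas", 1)
    if storage_type == "LocalVolume" and replicas > 1:
        problems.append("LocalVolume storage does not support multiple replicas")

    if config.get("agentOnly"):
        # Assumption: this field names the cluster that runs the
        # VictoriaMetrics Center used by an agent-only installation.
        center = (config.get("crossClusterDependency") or {}).get("victoriametrics")
        if not center:
            problems.append("agentOnly expects a VictoriaMetrics Center cluster")
    return problems
```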

3. Verify installation

Since the ModuleInfo name changes upon creation, locate the resource via label to check the plugin status and version:

kubectl get moduleinfo -l cpaas.io/module-name=victoriametrics
NAME                                             CLUSTER         MODULE            DISPLAY_NAME     STATUS    TARGET_VERSION   CURRENT_VERSION   NEW_VERSION
global-e671599464a5b1717732c5ba36079795          global          victoriametrics   victoriametrics  Running   v4.1.0           v4.1.0            v4.1.0

Field explanations:

  • NAME: ModuleInfo resource name
  • CLUSTER: Cluster where the plugin is installed
  • MODULE: Plugin name
  • DISPLAY_NAME: Display name of the plugin
  • STATUS: Installation status; Running means successfully installed and running
  • TARGET_VERSION: Intended installation version
  • CURRENT_VERSION: Version before installation
  • NEW_VERSION: Latest available version for installation