Metrics are used to quantitatively describe the operating status of a system, and each metric consists of four basic elements:
Metric name: cpu_usage
Value: 85.5
Labels: {pod="nginx-1", namespace="default"}
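As a rough illustration, the elements above can be modeled as a small record. This is a minimal sketch, not the platform's implementation: the `Sample` class and its field names are hypothetical, and the timestamp field is an assumption based on the Prometheus data model, since only three example elements are shown here. The rendered line follows the Prometheus text exposition layout of name, labels, value, and an optional millisecond timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Sample:
    """Hypothetical container for one metric sample."""
    name: str                      # metric name, e.g. cpu_usage
    value: float                   # measured value, e.g. 85.5
    labels: Dict[str, str] = field(default_factory=dict)  # identifying key/value pairs
    timestamp_ms: int = 0          # assumed element: collection time in ms since epoch

    def to_text(self) -> str:
        # Sort labels so the rendered line is deterministic.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.name}{{{label_str}}} {self.value} {self.timestamp_ms}"


# The timestamp below is an arbitrary illustrative value.
sample = Sample(
    name="cpu_usage",
    value=85.5,
    labels={"pod": "nginx-1", "namespace": "default"},
    timestamp_ms=1700000000000,
)
line = sample.to_text()
print(line)
```

Labels are what distinguish one time series from another: the same metric name with a different pod label is a separate series.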
PromQL is the query language for Prometheus, used to query and aggregate metric data from the monitoring system.
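To make this concrete, the sketch below builds a request URL for an instant query against the Prometheus HTTP API endpoint `/api/v1/query`. The host `prometheus.example:9090` and the helper `build_query_url` are hypothetical, and `cpu_usage` is the illustrative metric name used earlier, not necessarily a metric the platform exposes. The code only constructs the URL and does not send a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Average the illustrative cpu_usage metric per namespace,
# restricted to series whose namespace label is "default".
promql = 'avg by (namespace) (cpu_usage{namespace="default"})'

# Assumed address; replace with the monitoring endpoint of your cluster.
url = build_query_url("http://prometheus.example:9090/", promql)

# Decoding the query string recovers the original expression.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The label matcher inside the braces filters series, while `avg by (namespace)` aggregates the remaining series into one value per namespace.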
The platform provides a set of preset, commonly used monitoring metrics drawn from long-term operational experience. You can use these metrics directly when configuring alarm rules or creating monitoring dashboards, with no additional configuration.
The Exporter is a component for collecting monitoring data, with primary responsibilities including:
ServiceMonitor is used to declaratively manage monitoring configurations and primarily defines:
Alarm rules define the specific conditions for triggering alarms:
Alarm policies organize multiple alarm rules together for unified configuration:
Notification policies manage the rules for sending alarm messages:
Notification templates customize the display format of alarm messages:
A dashboard is a collection of related panels that provides an overall view of the system status. It supports flexible layouts and can organize panels in rows or columns.
Panels are visual representations of monitoring data and support various display types.
A data source is the configuration that specifies where monitoring data is read from. Currently, only the monitoring components of the current cluster are supported as data sources; custom data sources are not yet supported.
Variables serve as placeholders for values and can be used in metric queries. Through the variable selector at the top of the dashboard, you can dynamically adjust query conditions so that chart content updates in real time.
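The placeholder idea can be sketched with Python's standard `string.Template`. The `$namespace` syntax is an assumption borrowed from common dashboard tools, since the platform's actual variable syntax is not stated here, and the query and the helper `render_query` are illustrative only.

```python
from string import Template

# Query with a placeholder; "$namespace" is an assumed variable syntax.
query_template = Template('cpu_usage{namespace="$namespace"}')


def render_query(selected_namespace: str) -> str:
    """Substitute the value chosen in the variable selector (hypothetical helper)."""
    return query_template.substitute(namespace=selected_namespace)


# Changing the selection produces a different query, so the chart is re-queried.
default_query = render_query("default")
system_query = render_query("kube-system")
print(default_query)
print(system_query)
```

Each change in the selector yields a new concrete query from the same template, which is why a single panel definition can serve many namespaces or pods.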