Monitoring and Alerts

The embedded monitoring panel provides data on RabbitMQ resources, performance, and related metrics. You can use this data for monitoring and alerting, and you can configure notification policies for the alerts.

The clearly presented monitoring data supports decisions during operational inspections and performance tuning, and the alert and notification mechanism helps keep RabbitMQ instances running stably.

Monitoring

By default, the platform collects commonly used RabbitMQ resource and performance metrics. You can view real-time data for these metrics on the instance's Monitoring tab.

Category | Metrics
Node Monitoring | Monitoring of node available memory, disk, file descriptors, and other metrics
Message Monitoring | Monitoring of sent and received messages
Queue Monitoring | Overall monitoring of queues
Channel Monitoring | Overall monitoring of channels
Connection Monitoring | Overall monitoring of client connections

The collected RabbitMQ metrics are mostly aggregate metrics for each component, so the default monitoring panel cannot accurately track the status of a specific queue or exchange.

Alerts

Go to the Application Service's Alerts > Alert Policies page to create an alert policy corresponding to RabbitMQ (you may also use the platform-wide alert feature to create alerts).

Configuring Alert Policies

To enable alerts, first create an alert policy in the Application Service. An alert policy describes the objects to monitor, the conditions under which an alert is triggered, and how you are notified of the alert.

The platform includes the following built-in alert metrics:

Name | Recommended Trigger Conditions | Description
Instance Availability | != 1, and persists for 30 seconds | Monitors the availability of the cluster
Channel Count | Set thresholds based on actual specifications and client applications | Real-time monitoring of the number of channels
Message Write Frequency | Set thresholds based on actual specifications | Real-time monitoring of message write frequency
Connection Count | Set thresholds based on actual specifications | Real-time monitoring of the number of connections
Queue Count | Set thresholds based on actual specifications | Real-time monitoring of the number of queues
Node CPU Utilization | > 80%, and persists for 30 seconds | Real-time monitoring of CPU usage; if CPU usage is too high, consider scaling.
Node Memory Utilization | > 80%, and persists for 30 seconds | Real-time monitoring of memory usage; if memory usage is too high, scaling is needed immediately.
Node Storage Space Utilization | > 80%, and persists for 30 seconds | Real-time monitoring of storage space; if usage is too high, consider scaling.

These alert metrics allow for the quick creation of alert policies.
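
The recommended trigger conditions combine a comparison with a duration (for example, "> 80%, and persists for 30 seconds"). The sketch below shows one plausible reading of that rule: the alert fires only if every sample since the start of the breach satisfies the comparison and the breach has lasted at least the configured duration. The names TriggerCondition and should_fire are hypothetical, and the platform's actual evaluation logic may differ.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Comparison operators that a trigger condition may use.
COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


@dataclass(frozen=True)
class TriggerCondition:
    """Hypothetical model of one trigger condition, e.g. '> 80 for 30 s'."""
    comparator: str
    threshold: float
    duration_seconds: float


def should_fire(condition: TriggerCondition,
                samples: List[Tuple[float, float]]) -> bool:
    """Return True if the condition has held without interruption for at
    least duration_seconds, measured up to the most recent sample.

    samples is a list of (timestamp_seconds, value), oldest first.
    """
    compare = COMPARATORS[condition.comparator]
    breach_start: Optional[float] = None
    for timestamp, value in samples:
        if compare(value, condition.threshold):
            # Remember when the current uninterrupted breach began.
            if breach_start is None:
                breach_start = timestamp
        else:
            # A healthy sample resets the breach window.
            breach_start = None
    if breach_start is None:
        return False
    latest_timestamp = samples[-1][0]
    return latest_timestamp - breach_start >= condition.duration_seconds
```

With this reading, a brief CPU spike that recovers within 30 seconds does not raise an alert, which is the usual reason for pairing a threshold with a persistence window.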

In addition to the built-in alert metrics, you can define custom alert metrics. A custom alert requires you to write a PromQL expression yourself and submit it in the alert metric form. PromQL expressions can be complex, so edit and debug them in the built-in Prometheus Console before submitting. For PromQL basics, refer to the official PromQL documentation.
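
As an illustration of what a custom alert expression might look like, the sketch below builds a utilization-style PromQL expression and the URL for testing it against the Prometheus HTTP API instant-query endpoint (/api/v1/query). The metric names are assumptions taken from the rabbitmq_prometheus plugin and may not match what the platform's exporter actually exposes; the base URL is a placeholder. Check the available metric names in the Prometheus Console first.

```python
from urllib.parse import urlencode

# ASSUMPTION: these metric names follow the rabbitmq_prometheus plugin.
# The platform's exporter may expose different names.
MEMORY_USED = "rabbitmq_process_resident_memory_bytes"
MEMORY_LIMIT = "rabbitmq_resident_memory_limit_bytes"


def utilization_expr(used_metric: str, limit_metric: str,
                     threshold_percent: int) -> str:
    """Build a PromQL expression that returns series whose utilization,
    as a percentage, exceeds threshold_percent."""
    return f"100 * {used_metric} / {limit_metric} > {threshold_percent}"


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query (GET /api/v1/query).

    base_url is a placeholder such as 'http://prometheus.example:9090'.
    """
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Example: memory utilization above 80 percent.
memory_alert_expr = utilization_expr(MEMORY_USED, MEMORY_LIMIT, 80)
memory_alert_url = instant_query_url("http://prometheus.example:9090/",
                                     memory_alert_expr)
```

Once the expression returns the expected series in the console, the same expression text can be pasted into the alert metric form.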

For more information on configuring and using alerts, please refer to .