
Monitoring and Alerts

The platform provides monitoring for Redis instances through integrated dashboards. These features let you analyze performance, track resource utilization, and configure alerts so that problems can be handled proactively.

Monitoring

The platform automatically collects key resource utilization and operational performance metrics for Redis instances. These metrics can be viewed in real time on the instance's Monitoring tab.

Category	Metrics
Cluster Status Monitoring	Key count statistics, command execution metrics, replication lag, etc.
Resource Monitoring	Memory utilization, network traffic patterns, storage consumption, etc.
Performance Monitoring	Connection count, network I/O throughput, command latency, etc.

Alerts

To configure alerting rules for Redis instances, navigate to the Alerts > Rules page in Alauda Application Service.

Configuring Alert Rules

To set up alerting, create an alert rule in Alauda Application Service. An alert rule defines the monitoring targets, the threshold conditions that trigger notifications, and the notification delivery mechanisms.

The platform provides the following pre-configured alert indicators:

Indicator	Recommended Threshold	Description
Instance Status	!= 1, sustained for 30 seconds	Monitors instance availability and operational state
Key Access Hit Rate	< 80%, sustained for 30 seconds	Monitors cache efficiency; a low hit rate means frequent cache misses and may call for strategy adjustments (increasing TTL values, optimizing key patterns, etc.)
Average Response Time	> 0.1s, sustained for 30 seconds	Monitors command execution latency; prolonged response times may indicate CPU constraints, excessive workload, or BigKey operations
Master-Slave Failover	= 1, sustained for 30 seconds	Detects master-slave role transitions that may indicate underlying infrastructure issues or Redis node failures
Inbound Bandwidth per Node	Environment-specific thresholds	Monitors network ingress in real time to prevent bandwidth saturation from affecting service availability
Outbound Bandwidth per Node	Environment-specific thresholds	Monitors network egress in real time to prevent bandwidth saturation from affecting service availability
Client Connections per Node	Environment-specific thresholds	Monitors connection patterns to detect potential connection leaks or abnormal access patterns
CPU Utilization per Node	> 80%, sustained for 30 seconds	Monitors CPU consumption; sustained high utilization may require capacity planning and scaling
Memory Utilization per Node	> 80%, sustained for 30 seconds	Monitors memory usage; approaching capacity limits requires immediate scaling to prevent eviction or OOM conditions
Storage Utilization per Node	> 80%, sustained for 30 seconds	Monitors persistent storage usage for RDB/AOF configurations; high utilization requires capacity expansion
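
The "sustained for 30 seconds" qualifier in the recommended thresholds means that a single sample crossing the threshold is not enough: the condition has to hold continuously for the whole window. The Python sketch below illustrates that idea for the CPU utilization threshold. It is only an illustration and not the platform's implementation; the helper name `breach_sustained`, the sample format, and the sample values are assumptions made for this example.

```python
from typing import Callable, Iterable, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


def breach_sustained(
    samples: Iterable[Sample],
    condition: Callable[[float], bool],
    duration_s: float = 30.0,
) -> bool:
    """Return True if `condition` holds continuously for at least `duration_s`.

    Hypothetical helper: samples are assumed to be ordered by timestamp.
    """
    breach_start: Optional[float] = None  # when the current breach began
    for ts, value in samples:
        if condition(value):
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # condition cleared, so the window resets
    return False


# CPU utilization per node as a fraction, sampled every 10 seconds (made-up values).
cpu = [(0, 0.85), (10, 0.90), (20, 0.70), (30, 0.88), (40, 0.91), (50, 0.93), (60, 0.95)]
fires = breach_sustained(cpu, lambda v: v > 0.80, duration_s=30)
```

In this made-up series the dip at the 20-second sample resets the window, so the condition only counts as sustained once the samples from 30 to 60 seconds have all stayed above 80%.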

These pre-configured indicators let you set up alert rules quickly. For more advanced monitoring requirements, custom alert indicators can be defined using Prometheus query syntax (PromQL), for example:

(1/(1+(avg(irate(redis_keyspace_misses_total{namespace=~"<namespace>", pod=~"<podname prefix>-.*"}[5m])) by(namespace,service) / (avg(irate(redis_keyspace_hits_total{namespace=~"<namespace>", pod=~"<podname prefix>-.*"}[5m])) by(namespace,service)+1))))

This example computes the key access hit rate as a fraction between 0 and 1. redis_keyspace_misses_total and redis_keyspace_hits_total are Prometheus-collected metrics, <namespace> filters resources by namespace, and <podname prefix> specifies the Pod name pattern for resources managed by a Deployment or StatefulSet. For comprehensive information on metric queries, refer to the official PromQL documentation.
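
The arithmetic inside the hit-rate expression can be checked by hand. The Python sketch below mirrors only the outer formula, 1 / (1 + misses / (hits + 1)), applied to per-second rates that Prometheus would compute with irate; the function name `smoothed_hit_rate` and the input values are assumptions for illustration.

```python
def smoothed_hit_rate(hits_per_s: float, misses_per_s: float) -> float:
    """Mirror the outer arithmetic of the query: 1 / (1 + misses / (hits + 1)).

    The "+ 1" added to the hit rate keeps the ratio finite when no hits
    are recorded in the window.
    """
    return 1.0 / (1.0 + misses_per_s / (hits_per_s + 1.0))


# Made-up rates: 99 hits/s and 25 misses/s.
rate = smoothed_hit_rate(99.0, 25.0)

# A threshold of "< 80%" corresponds to comparing this fraction with 0.8.
below_threshold = smoothed_hit_rate(50.0, 50.0) < 0.8
```

Because the result is a fraction rather than a percentage, a rule based on this expression would compare it against 0.8 to match the recommended 80% threshold, unless the expression is first multiplied by 100.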

For detailed guidance on alert configuration and management, see .
