By default, Redis uses a single-threaded model to handle read and write requests. High CPU usage can lead to slower database responses, affecting normal business operations.
You can check CPU usage from the perspective of the instance itself or the Pods.
Use Redis's info
command to get CPU usage statistics for the Redis instance itself. You can also find the corresponding metrics in Redis Instance Details > Monitoring.
Metric Name | Meaning |
---|---|
Cpu User | used_cpu_user: CPU usage in user mode. used_cpu_user_children: CPU used by background processes in user mode. |
Cpu Sys | used_cpu_sys: CPU usage in system mode. used_cpu_sys_children: CPU used by background processes in system mode. |
CPU usage statistics for each Pod under Redis will be displayed on a per Pod basis, showing CPU statistics for each container underneath. You can find the corresponding Pod CPU monitoring in Redis Instance Details > Replicas > StatefulSet > Pods.
Metric Name | Meaning |
---|---|
CPU Usage | CPU usage of each Pod. |
Check the Redis logs for any error or warning messages.
Use system-level monitoring tools like top or htop to check the CPU usage, memory usage, and other information of the Redis process, and determine whether it is only the Redis process's CPU usage that is high or if the entire system is under heavy load.
Check if the maxmemory
parameter in the Redis configuration file is set reasonably. If Redis exceeds the maxmemory
limit, it may lead to high CPU usage.
Verify that the persistence method of Redis is correct. If the Redis persistence configuration is unreasonable, such as frequent AOF operations, it might cause high CPU usage.
Check for slow queries.
In a Redis instance started with default parameters, queries that take longer than 20ms are considered slow queries. The execution time refers to the time taken to execute a query command excluding client response (talking), sending replies, and other IO operations. You can customize the slow query threshold and retention count through the parameters slowlog-log-slower-than
and slowlog-max-len
. You can view the slow queries using the Redis slowlog get
command.
Generally, the CPU usage of Redis commands is closely related to their time complexity. It is commonly believed that O(N) complexity or above should be approached cautiously in terms of traffic assessment during business code design. For the time complexity of each Redis command, please refer to the Redis Official Documentation.
The most common slow logs include KEYS
, LRANGE
, EVAL
, HGETALL
, PUBSUB
, etc. These generally involve commands that scan large datasets or involve heavy computation logic in Redis. Since Redis supports various data types, common data types like hash and list have high limits on length, leading to potential big key issues in practical usage.
Check if there is excessive client request volume to Redis. An excessive volume of client requests may lead to high CPU usage.
If there are no apparent slow queries, consider whether the client request volume to Redis is excessive. If there is a sudden increase in CPU usage without any changes to the business code, monitor the QPS of the Redis instance to see if it has spiked.
If it is confirmed that the surge in CPU usage is due to normal business traffic growth or due to hot keys, you can address this through architectural upgrades:
For extremely hot hash keys, consider splitting them into string types and distributing them across multiple Redis instances.
For extremely hot string keys, if rate limiting is not feasible within your business architecture, consider increasing multi-thread configuration and scaling up the CPU specifications of the instance.
For read-heavy scenarios, you can consider adding read-only instances in Sentinel mode to handle high-concurrency read requests, or adding caching logic in the business code; in cluster mode, you can consider increasing sharding.
For write-heavy and read-light scenarios, it is recommended to use Redis in cluster mode.
If you are using Redis in cluster mode, check for uneven traffic distribution. When there is a response delay in the cluster, if the CPU usage of all nodes providing external services within the cluster is high, consider changing the data distribution rules or scaling the instances.
If Lua scripts are used in Redis, check the scripts for any issues that may lead to high CPU usage.