Shard Scaling in Cluster Mode

Redis Cluster architecture provides horizontal scalability through data distribution across multiple shards. As workload requirements evolve, you may need to adjust the shard count to optimize performance, resource utilization, and data distribution. This document outlines the technical considerations, resource requirements, and implementation guidelines for shard scaling operations.


Prerequisites

Before modifying the shard configuration in a Redis Cluster, ensure the following conditions are met:

  • Operational Status: The cluster must be in a stable Running state with all nodes functioning correctly (a quick check is shown after this list).
  • Resource Availability: Sufficient memory resources must be available across the cluster to accommodate data redistribution during the scaling process.
  • Maintenance Window: Schedule scaling operations during periods of low traffic to minimize potential service impact.
  • Network Capacity: Verify adequate network bandwidth between Redis nodes to support efficient data migration.
  • Infrastructure Stability: Ensure the underlying infrastructure is stable to prevent node failures during the scaling operation, which could compromise data integrity.
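
As a quick pre-check, the reported state of the instance can be inspected with kubectl. The exact columns shown depend on the printer columns defined by the operator's custom resource definition; the instance name c6 and namespace default below follow the examples used later in this document:

# Verify the instance reports a stable Running state before scaling
$ kubectl -n default get redis c6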

Resource Requirements

Proper resource planning is critical for successful shard scaling operations. The following memory-related concepts are important to understand:

  • Data Memory: The actual memory consumed by the Redis dataset. It is observable via the info memory command (see the example after this list); this value fluctuates as Redis performs memory optimization and garbage collection.
  • Maximum Data Memory: The highest Data Memory observed across all Redis nodes. Used as a reference point for capacity planning.
  • Maximum Running Memory: Maximum Data Memory ÷ 0.8. The coefficient 0.8 is the recommended ratio between maxmemory and the total memory allocated to the instance; a lower ratio increases the overhead margin but reduces memory efficiency.
  • Migration Data Memory: The aggregate data volume held by the shards being decommissioned. During scale-down operations, this data must be redistributed across the remaining shards.
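
As an example of observing Data Memory, redis-cli reports the used_memory fields in its info memory output. The pod and container names below (redis-c6-0, redis) are illustrative placeholders; substitute the actual names of your shard pods:

# Inspect the current data memory on one shard node
$ kubectl -n default exec redis-c6-0 -c redis -- redis-cli info memory | grep -E '^used_memory(_human)?:'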

Scale-Up Operations

When increasing the number of shards, each instance should satisfy:

Instance Memory Allocation ≥ Maximum Running Memory

This formula is sufficient when data is evenly distributed across shards. In environments with data imbalance or with BigKeys present, plan for additional capacity, because data migration may concentrate load on specific shards.
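
For example, with an illustrative Maximum Data Memory of 8 GiB, the Maximum Running Memory is 8 GiB ÷ 0.8 = 10 GiB, so each instance should be allocated at least 10 GiB of memory.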

Scale-Down Operations

When decreasing the number of shards, each remaining instance should satisfy:

Instance Memory Allocation ≥ (Maximum Data Memory of Retained Shards + Migration Data Memory) ÷ 0.8

This formula accounts for the redistribution of data from the decommissioned shards. Unlike scale-up operations, scale-down operations must satisfy this requirement even when data is unevenly distributed, because the migrated data may concentrate on specific remaining shards.
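
For example, when scaling from 4 shards to 3, with an illustrative 6 GiB Maximum Data Memory on the retained shards and 3 GiB of Migration Data Memory on the decommissioned shard, each remaining instance should be allocated at least (6 GiB + 3 GiB) ÷ 0.8 = 11.25 GiB.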

Procedure

CLI

Shard scaling is controlled through the spec.replicas.cluster.shard field in the Redis custom resource (see API documentation for detailed parameter specifications).

# Configure the cluster to use 4 shards
$ kubectl -n default patch redis c6 --type=merge --patch='{"spec": {"replicas":{"cluster":{"shard": 4}}}}'
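
To validate the change against the API server before applying it, the patch can first be submitted with kubectl's standard server-side dry run (not specific to this operator):

# Preview the patch without persisting it
$ kubectl -n default patch redis c6 --type=merge --patch='{"spec": {"replicas":{"cluster":{"shard": 4}}}}' --dry-run=server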

To monitor the scaling operation progress:

$ kubectl -n default get redis c6 -w
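
For more detail than the watch output provides, kubectl describe works on any custom resource; the status fields and events it displays depend on the operator's CRD definition:

# Show detailed status and recent events for the instance
$ kubectl -n default describe redis c6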

After the shard configuration change is submitted, the system transitions the instance to the Data Balancing state and manages the slot migration process autonomously. While the instance is in this state, all modifications other than deletion are restricted.

Client Connectivity During Scaling

During shard redistribution, Redis slot migration affects client request handling as follows:

  1. When a client requests a key in a slot that is being migrated:

    • If the key has already been migrated to the target shard, Redis returns a redirection error (ASK while the slot is still being migrated, or MOVED once it has been fully reassigned)
    • If the key has not yet been migrated, the request is served by the original shard
  2. Cluster-aware Redis clients handle these redirection errors automatically by reissuing the request to the appropriate shard, as illustrated in the example after this list
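
As a minimal illustration of redirection handling, redis-cli in cluster mode (-c) follows MOVED and ASK redirections automatically, which is the behavior cluster-aware client libraries implement. The pod name, container name, and key below are illustrative placeholders:

# redis-cli -c follows MOVED/ASK redirections, like a cluster-aware client
$ kubectl -n default exec -it redis-c6-0 -c redis -- redis-cli -c get mykey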

Performance Advisory: Shard scaling during high-traffic periods can nearly double the effective client request volume, because each redirected request is issued twice, potentially creating network congestion. Schedule scaling operations during periods of reduced activity whenever possible.

Operation Duration

The time required to complete shard scaling operations depends on multiple factors:

  • Total data volume being redistributed
  • Network throughput between nodes
  • Instance resource specifications
  • Presence of BigKeys requiring migration
  • Ongoing workload intensity

Typically, shard scaling operations take between 20 minutes and several hours to complete. Larger datasets and more complex migration patterns extend this duration.