Shard Scaling in Cluster Mode

Redis Cluster architecture provides horizontal scalability through data distribution across multiple shards. As workload requirements evolve, you may need to adjust the shard count to optimize performance, resource utilization, and data distribution. This document outlines the technical considerations, resource requirements, and implementation guidelines for shard scaling operations.


Prerequisites

Before modifying the shard configuration in a Redis Cluster, ensure the following conditions are met:

  • Operational Status: The cluster must be in a stable Running state with all nodes functioning correctly (a quick check is shown after this list).
  • Resource Availability: Sufficient memory resources must be available across the cluster to accommodate data redistribution during the scaling process.
  • Maintenance Window: Schedule scaling operations during periods of low traffic to minimize potential service impact.
  • Network Capacity: Verify adequate network bandwidth between Redis nodes to support efficient data migration.
  • Infrastructure Stability: Ensure the underlying infrastructure is stable to prevent node failures during the scaling operation, which could compromise data integrity.
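
As a quick pre-check, the reported state of the instance can be inspected with kubectl. The exact columns shown depend on the printer columns defined by the operator's custom resource definition; the instance name c6 and namespace default below follow the examples used later in this document:

# Verify the instance reports a stable Running state before scaling
$ kubectl -n default get redis c6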

Resource Requirements

Proper resource planning is critical for successful shard scaling operations. The following memory-related concepts are important to understand:

  • Data Memory: The actual memory consumed by the Redis dataset. It is observable via the info memory command (see the example after this list); this value fluctuates as Redis performs memory optimization and garbage collection.
  • Maximum Data Memory: The highest Data Memory observed across all Redis nodes. Used as a reference point for capacity planning.
  • Maximum Running Memory: Maximum Data Memory ÷ 0.8. The coefficient 0.8 is the recommended ratio between maxmemory and the total memory allocated to the instance; a lower ratio increases the overhead margin but reduces memory efficiency.
  • Migration Data Memory: The aggregate data volume held by the shards being decommissioned. During scale-down operations, this data must be redistributed across the remaining shards.
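
As an example of observing Data Memory, redis-cli reports the used_memory fields in its info memory output. The pod and container names below (redis-c6-0, redis) are illustrative placeholders; substitute the actual names of your shard pods:

# Inspect the current data memory on one shard node
$ kubectl -n default exec redis-c6-0 -c redis -- redis-cli info memory | grep -E '^used_memory(_human)?:'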

Scale-Up Operations

When increasing the number of shards, each instance should satisfy:

Instance Memory Allocation ≥ Maximum Running Memory

This formula is sufficient when data is evenly distributed across shards. In environments with data imbalance or with BigKeys present, plan for additional capacity, because data migration may concentrate load on specific shards.
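
For example, with an illustrative Maximum Data Memory of 8 GiB, the Maximum Running Memory is 8 GiB ÷ 0.8 = 10 GiB, so each instance should be allocated at least 10 GiB of memory.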

Scale-Down Operations

When decreasing the number of shards, each remaining instance should satisfy:

Instance Memory Allocation ≥ (Maximum Data Memory of Retained Shards + Migration Data Memory) ÷ 0.8

This formula accounts for the redistribution of data from the decommissioned shards. Unlike scale-up operations, scale-down operations must satisfy this requirement even when data is unevenly distributed, because the migrated data may concentrate on specific remaining shards.
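
For example, when scaling from 4 shards to 3, with an illustrative 6 GiB Maximum Data Memory on the retained shards and 3 GiB of Migration Data Memory on the decommissioned shard, each remaining instance should be allocated at least (6 GiB + 3 GiB) ÷ 0.8 = 11.25 GiB.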

Procedure

CLI

Shard scaling is controlled through the spec.replicas.cluster.shard field in the Redis custom resource (see API documentation for detailed parameter specifications).

# Configure the cluster to use 4 shards
$ kubectl -n default patch redis c6 --type=merge --patch='{"spec": {"replicas":{"cluster":{"shard": 4}}}}'
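
To validate the change against the API server before applying it, the patch can first be submitted with kubectl's standard server-side dry run (not specific to this operator):

# Preview the patch without persisting it
$ kubectl -n default patch redis c6 --type=merge --patch='{"spec": {"replicas":{"cluster":{"shard": 4}}}}' --dry-run=server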

To monitor the scaling operation progress:

$ kubectl -n default get redis c6 -w
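
For more detail than the watch output provides, kubectl describe works on any custom resource; the status fields and events it displays depend on the operator's CRD definition:

# Show detailed status and recent events for the instance
$ kubectl -n default describe redis c6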

After the shard configuration change is submitted, the system transitions the instance to the Data Balancing state and manages the slot migration process autonomously. While the instance is in this state, all modifications other than deletion are restricted.

Client Connectivity During Scaling

During shard redistribution, Redis slot migration affects client request handling as follows:

  1. When a client requests a key in a slot that is being migrated:

    • If the key has already been migrated to the target shard, Redis returns a redirection error (ASK while the slot is still being migrated, or MOVED once it has been fully reassigned)
    • If the key has not yet been migrated, the request is served by the original shard
  2. Cluster-aware Redis clients handle these redirection errors automatically by reissuing the request to the appropriate shard, as illustrated in the example after this list
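
As a minimal illustration of redirection handling, redis-cli in cluster mode (-c) follows MOVED and ASK redirections automatically, which is the behavior cluster-aware client libraries implement. The pod name, container name, and key below are illustrative placeholders:

# redis-cli -c follows MOVED/ASK redirections, like a cluster-aware client
$ kubectl -n default exec -it redis-c6-0 -c redis -- redis-cli -c get mykey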

Performance Advisory: Shard scaling during high-traffic periods can nearly double the effective client request volume, because each redirected request is issued twice, potentially creating network congestion. Schedule scaling operations during periods of reduced activity whenever possible.

Operation Duration

The time required to complete shard scaling operations depends on multiple factors:

  • Total data volume being redistributed
  • Network throughput between nodes
  • Instance resource specifications
  • Presence of BigKeys requiring migration
  • Ongoing workload intensity

Typically, shard scaling operations take between 20 minutes and several hours to complete. Larger datasets and more complex migration patterns extend this duration.