Limitations and Risks


Only IP connection addresses are currently supported

When a target instance connects to the source, only addresses in <ip:port> form are supported; domain names are not currently supported.
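As an illustration of the address restriction, the following minimal sketch rejects domain names before a connection address is submitted. The function name `is_ip_port` is hypothetical, and treating only IPv4 literals as valid is an assumption; the source does not say whether IPv6 addresses are accepted.

```python
import ipaddress


def is_ip_port(address: str) -> bool:
    """Return True if address has the form <IPv4 literal>:<port>.

    Hypothetical helper: IPv4-only validation is an assumption, not a
    documented rule of the disaster recovery feature.
    """
    host, sep, port = address.rpartition(":")
    if not sep or not (port.isascii() and port.isdigit()):
        return False
    try:
        # Raises ValueError for domain names and malformed addresses.
        ipaddress.IPv4Address(host)
    except ValueError:
        return False
    return 1 <= int(port) <= 65535
```

For example, `is_ip_port("10.0.0.5:6379")` passes, while `is_ip_port("redis.example.com:6379")` is rejected because the host part is a domain name.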

The disaster recovery Proxy address itself is not highly available. If you expose the source Proxy via NodePort, place it behind a front-end load balancer to ensure HA; alternatively, use MetalLB to provide a highly available virtual IP for the Proxy.

Only Redis 6.0 is currently supported; disaster recovery cannot be enabled on other versions

Only Redis 6.0 instances support disaster recovery; support for Redis 7.2 is still under development. Once disaster recovery is enabled on a Redis 6.0 instance, that instance cannot be upgraded to Redis 7.2.

Once disaster recovery is enabled on a Redis instance, it cannot be disabled

Once enabled, the Redis disaster recovery module writes AUX records, such as the corresponding offset and oplog id, to the persistent Redis RDB file. Master-slave synchronization within the Redis instance also carries special synchronization instructions. To preserve data reliability and cluster availability, the module therefore cannot be uninstalled after it has been enabled.

A Redis disaster recovery instance writes oplogs to storage in real time (fsync runs once per second). When an oplog file reaches its size limit (3 GB), it is split into a new slice; the cleanup logic periodically marks unused oplog slice files and removes them. Oplog reads and writes therefore run at the same time as RDB generation, which increases the load on storage. Local SSD storage is recommended to maintain persistence performance.

With the RDB parameter template, the save parameter actively triggers snapshots and oplog cleanup, so that newly connected target instances can synchronize a snapshot quickly. With the AOF parameter template, you still need to set a save parameter so that RDB persistence runs regularly and the local data is more reliable. Enabling RDB and AOF persistence together places higher performance demands on storage, so confirm that you are using local SSD storage.

To build a disaster recovery network, the Redis instances at both ends must use the same architecture

The Redis disaster recovery plug-in currently synchronizes data through oplogs. Oplog entries depend on their context and are idempotent, so oplog data cannot be re-sharded and submitted to different shards. This imposes the following requirements:

  1. A sentinel-mode source can only have sentinel-mode targets
  2. A cluster-mode source can only have cluster-mode targets
  3. In cluster mode, the shards of the target must be consistent with the source
  4. In cluster mode, the slot distribution of the target must be consistent with the source
  5. In cluster mode, shards cannot be added or removed while synchronization is online
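The requirements above can be sketched as a pre-flight check run before two instances are linked. Everything here is illustrative: the `Topology` class, its fields, and `compatibility_problems` are hypothetical names, and comparing slot ranges while ignoring shard order is an assumption about what "consistent with the source" means.

```python
from dataclasses import dataclass
from typing import List, Tuple

SlotRange = Tuple[int, int]  # inclusive (start, end) hash slot range


@dataclass(frozen=True)
class Topology:
    """Hypothetical description of one Redis instance's architecture."""
    mode: str  # "sentinel" or "cluster"
    # One tuple of slot ranges per shard; left empty for sentinel mode.
    shard_slots: Tuple[Tuple[SlotRange, ...], ...] = ()


def compatibility_problems(source: Topology, target: Topology) -> List[str]:
    """Return the reasons the target cannot pair with the source."""
    if source.mode != target.mode:
        return [f"mode mismatch: {source.mode} source, {target.mode} target"]
    problems = []
    if source.mode == "cluster":
        if len(source.shard_slots) != len(target.shard_slots):
            problems.append("shard count differs from the source")
        # Assumption: shard order is irrelevant, only the slot sets matter.
        elif sorted(source.shard_slots) != sorted(target.shard_slots):
            problems.append("slot distribution differs from the source")
    return problems
```

An empty result means the two architectures match under these assumptions. The check says nothing about requirement 5, which concerns scaling operations performed after the link is established.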

A Redis disaster recovery source instance can have at most 15 target instances

Due to the service_id limitation, a disaster recovery source can have at most 15 target instances. Cascading and multi-active are not currently supported: only a star topology is supported, not tree or ring links.
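To make the topology rules concrete, the sketch below checks a planned set of synchronization links against the 15-target limit and the star-only constraint. The function `validate_links` and the representation of links as (source, target) name pairs are assumptions made for illustration; they are not part of the product.

```python
from typing import Dict, List, Set, Tuple

MAX_TARGETS = 15  # limit stated above, imposed by service_id


def validate_links(links: List[Tuple[str, str]]) -> List[str]:
    """Check planned (source, target) links; return a list of problems."""
    targets_by_source: Dict[str, Set[str]] = {}
    for source, target in links:
        targets_by_source.setdefault(source, set()).add(target)

    problems = []
    for source, targets in sorted(targets_by_source.items()):
        if len(targets) > MAX_TARGETS:
            problems.append(
                f"{source} has {len(targets)} targets (max {MAX_TARGETS})"
            )
    # A target that is also a source implies a tree, ring, or multi-active link.
    all_targets = {target for _, target in links}
    for name in sorted(all_targets & set(targets_by_source)):
        problems.append(f"{name} is both a target and a source")
    return problems
```

For instance, `validate_links([("a", "b"), ("b", "c")])` reports that `b` is both a target and a source, which corresponds to an unsupported cascade.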

The following Redis 6.0 write commands are not currently supported by disaster recovery

migrate, pubsub-related commands, and stream-related commands. For pubsub and stream workloads, it is recommended to use an MQ component as an alternative.
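One way to audit an application for these commands is to scan the command lines it issues. The sketch below is illustrative only: the source names command categories rather than individual commands, so the specific names listed are an assumption drawn from the Redis 6.0 command set, and `find_unsupported` is a hypothetical helper.

```python
from typing import Iterable, List

# Assumed membership of each category; the source lists categories only.
PUBSUB_COMMANDS = frozenset(
    {"PUBLISH", "SUBSCRIBE", "UNSUBSCRIBE", "PSUBSCRIBE", "PUNSUBSCRIBE", "PUBSUB"}
)
STREAM_WRITE_COMMANDS = frozenset(
    {"XADD", "XDEL", "XTRIM", "XGROUP", "XACK", "XCLAIM", "XREADGROUP", "XSETID"}
)
UNSUPPORTED = PUBSUB_COMMANDS | STREAM_WRITE_COMMANDS | {"MIGRATE"}


def find_unsupported(command_lines: Iterable[str]) -> List[str]:
    """Return the command lines whose command name is in UNSUPPORTED."""
    flagged = []
    for line in command_lines:
        parts = line.split()
        # Redis command names are case-insensitive, so compare in upper case.
        if parts and parts[0].upper() in UNSUPPORTED:
            flagged.append(line)
    return flagged
```

Running it over `["SET k v", "xadd s * f v", "PUBLISH ch hi"]` flags the second and third entries and leaves `SET` alone.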

After the source fails, first disconnect the target from the source

After the source fails and before performing the disaster recovery switch, first disconnect the synchronization link between the target and the source. Otherwise, the source may recover unexpectedly with incorrect data and pollute the target.