Redis Cluster Concepts
Overview
Redis Cluster is a distributed implementation of Redis that provides horizontal scalability and high availability without the need for external tools like Redis Sentinel. When deployed through Portworx Data Services (PDS) on Kubernetes, Redis Cluster allows organizations to harness the performance of Redis with the resiliency, automation, and flexibility of a cloud-native platform.
In PDS, Redis Cluster deployments are managed by the PDS Deployments Operator, which automates lifecycle operations including provisioning, scaling, failover, backup, and restore. Redis Cluster on PDS utilizes Kubernetes-native constructs and Portworx Enterprise for persistent storage, ensuring reliable and scalable in-memory data services.
PDS supports Redis Cluster and tracks upstream releases to ensure compatibility and timely availability of new versions. Supported versions are listed here.
Clustering
Redis Cluster partitions data across multiple nodes using sharding. Each node holds a subset of slots (from the total 16,384 hash slots), and clients interact with multiple nodes transparently.
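To make the slot mapping concrete, the following is a minimal sketch of how a key is assigned to one of the 16,384 hash slots. It follows the rule in the Redis Cluster specification (CRC16 of the key, modulo 16384, with only the text inside the first non-empty `{...}` hashed when a hash tag is present). The function names `crc16_xmodem` and `key_slot` are illustrative and not part of PDS or any client library.

```python
TOTAL_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot for a key, honoring hash tags."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag such as {user1000} narrows what gets hashed.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % TOTAL_SLOTS


if __name__ == "__main__":
    print(key_slot("foo"))
    # Keys sharing a hash tag land in the same slot, so they live on the same shard.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Cluster-aware clients perform this calculation internally, so application code normally never calls it directly. It is mainly useful for reasoning about which keys end up on the same shard.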
Key Features
- Automatic Sharding: Data is automatically partitioned across nodes.
- High Availability: Each shard can be replicated to one or more replica nodes. On failure, a replica is promoted.
- Client-Aware Topology: Redis clients are aware of the cluster topology and route requests accordingly.
- No Central Coordinator: Cluster metadata is shared among nodes, removing the need for a central controller.
To enable high availability, PDS requires Redis Cluster to be deployed with at least 3 master nodes, each with at least one replica. Nodes are deployed using Kubernetes StatefulSets and automatically form a cluster through a bootstrapping process managed by the PDS Operator.
Redis Cluster topology is automatically rebalanced when scaling horizontally. Pods are distributed across different Kubernetes worker nodes using Stork and Portworx volumes, maximizing fault tolerance.
Replication
Application-Level Replication
Each master node in the Redis Cluster can have one or more replicas. These replicas stay in sync with the master node asynchronously. In case of master node failure, a replica is promoted to restore service availability.
- Write operations: directed to the master node that owns the corresponding hash slot.
- Read operations: can be offloaded to replicas if the client supports reading from replicas.
Storage-Level Replication
PDS also leverages Portworx Enterprise for storage-level replication. Each Redis pod uses a persistent volume provisioned with a defined number of replicas (for example, 2). These replicas span across different worker nodes or availability zones, ensuring faster recovery and high availability during node failures.
Storage-level replication complements application-level replication and reduces RTO by making the data available even before application-level recovery mechanisms kick in.
Configuration
Redis Cluster configuration parameters can be overridden using environment variables specified in the Application Configuration Template. Users can adjust performance, memory handling, eviction policies, timeouts, and more.
PDS simplifies the process of customizing and managing these configurations at scale. For a full list of configurable parameters, refer to the PDS Redis Cluster supported configurations.
Scaling
PDS supports both vertical and horizontal scaling of Redis Cluster deployments.
Vertical Scaling
Vertical scaling increases or decreases the CPU and memory resources allocated to Redis pods. Changes are rolled out with minimal disruption using Kubernetes rolling updates.
Horizontal Scaling
Redis Cluster supports dynamic horizontal scaling by:
- Adding more shards (masters) to increase write and memory capacity.
- Adding more replicas to existing shards to improve read throughput and fault tolerance.
The PDS operator manages resharding operations during scaling events, minimizing client impact and redistributing hash slots evenly across new nodes.
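As an illustration of what an even distribution of hash slots looks like, the sketch below splits the 16,384 slots into contiguous ranges for a given number of masters. This is only a model for reasoning about capacity: the helper `even_slot_ranges` is hypothetical, and the exact ranges that the PDS operator assigns during resharding may differ.

```python
TOTAL_SLOTS = 16384


def even_slot_ranges(num_masters: int) -> list[tuple[int, int]]:
    """Split the slot space into contiguous, near-equal (first, last) ranges."""
    if num_masters < 1:
        raise ValueError("num_masters must be at least 1")
    base, extra = divmod(TOTAL_SLOTS, num_masters)
    ranges = []
    start = 0
    for index in range(num_masters):
        # The first `extra` masters take one additional slot each.
        size = base + (1 if index < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for count in (3, 4):
        print(count, even_slot_ranges(count))
```

Under this model, growing from 3 to 4 masters reduces each shard from about 5,461 slots to 4,096, so roughly a quarter of the keyspace moves to the new shard.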
Connectivity
PDS provisions service endpoints for Redis Cluster pods based on user selection (LoadBalancer or ClusterIP). These services provide stable IPs and DNS records. Redis client connections can use:
- Redis round robin endpoint (rr): routes to all the nodes in round-robin.
- Individual pod endpoints (vip): routes to specific Redis pods.
Endpoint Types
Service Name | Details |
---|---|
redcl-<name>-<namespace>-<pod-id>-vip | Connects to a specific Redis Cluster node |
redcl-<name>-<namespace>-rr | Round Robin Endpoint to all nodes |
Redis clients that support clustering (for example, redis-cli, Jedis, Lettuce, ioredis) will automatically discover the topology and route commands to the appropriate node based on hash slots.
PDS provisions a default administrator user named pds with full access privileges. You can retrieve the connection credentials from the PDS UI.
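The following sketch shows one way a Python application might build the endpoint names from the table above and connect as the pds user. It assumes the redis-py client, which is not mentioned in this document, and the default Redis port 6379; the port PDS actually exposes, the fully qualified address, and the pod identifier format should be taken from the PDS UI. The helper names are hypothetical.

```python
def rr_endpoint(name: str, namespace: str) -> str:
    """Round-robin service name, following redcl-<name>-<namespace>-rr."""
    return f"redcl-{name}-{namespace}-rr"


def vip_endpoint(name: str, namespace: str, pod_id: str) -> str:
    """Per-pod service name, following redcl-<name>-<namespace>-<pod-id>-vip."""
    return f"redcl-{name}-{namespace}-{pod_id}-vip"


def connect(host: str, password: str, port: int = 6379):
    """Open a cluster-aware connection as the default pds user.

    Assumes redis-py is installed; it is imported here so the helpers
    above remain usable without it. The port is an assumption.
    """
    from redis.cluster import RedisCluster

    return RedisCluster(
        host=host,
        port=port,
        username="pds",
        password=password,
        read_from_replicas=True,  # allow reads to be served by replicas
    )


if __name__ == "__main__":
    # "cache" and "prod" are placeholder deployment and namespace names.
    print(rr_endpoint("cache", "prod"))
    print(vip_endpoint("cache", "prod", "0"))
```

A cluster-aware client only needs one reachable node to discover the rest of the topology, so the round-robin endpoint is a natural starting address.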
Access Control & User Management
Portworx Data Services (PDS) enhances Redis deployments with built-in support for managing users and ACLs (Access Control Lists). This system ensures consistent and secure access across all Redis nodes.
This ACL synchronization capability is unique to PDS and is not available in open source Redis.
Key Capabilities
- A master node (called the origin) manages ACLs centrally and pushes updates.
- Other nodes sync the updated users.acl file as needed.
- PDS ensures a safe failover if the origin master becomes unresponsive.
- Redis user settings are automatically included in deployment backups.
Managing Redis Users
You can create, update, and delete users using the redis-cli command-line tool or desktop applications such as RedisInsight.
Example:
- Create a user with limited access:
  ACL SETUSER appuser on >mysecurepassword ~app:* +GET +SET
- Save the ACL file:
  ACL SAVE

- User management is supported only within PDS-managed Redis deployments.
- ACL syncs occur at regular intervals or when changes are detected, not instantly.
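When users are created from application code rather than typed into redis-cli, it can help to assemble the ACL rule programmatically. The sketch below builds the same ACL SETUSER command shown in the example from its parts. The helper `acl_setuser_args` is hypothetical and only produces the argument list; it does not talk to Redis.

```python
def acl_setuser_args(
    username: str,
    password: str,
    key_patterns: list[str],
    commands: list[str],
) -> list[str]:
    """Build the argument list for an ACL SETUSER command.

    `on` enables the user, `>` adds a password, `~` grants a key pattern,
    and `+` allows a command.
    """
    args = ["ACL", "SETUSER", username, "on", f">{password}"]
    args += [f"~{pattern}" for pattern in key_patterns]
    args += [f"+{command}" for command in commands]
    return args


if __name__ == "__main__":
    args = acl_setuser_args("appuser", "mysecurepassword", ["app:*"], ["GET", "SET"])
    print(" ".join(args))
```

With a client such as redis-py (an assumption, not something this document prescribes), the resulting list could be sent with `client.execute_command(*args)`, followed by `ACL SAVE` to persist the change.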
Backup and Restore
Redis Cluster in PDS supports full-cluster backup and restore operations using Redis RDB and Portworx snapshots.
Backups
Backups are taken using Redis Cluster RDB snapshots and stored in dedicated Portworx volumes. Backups can be scheduled or taken ad hoc. After a backup is complete, the PDS Backup Operator uses Portworx’s cloud snapshot feature to replicate the volume to an offsite object store.
This approach supports the 3-2-1 backup strategy: 3 copies of data, on 2 types of media, with 1 offsite copy.
Restore
Redis restore operations are performed out-of-place, which means the data is not restored directly into the original Redis instance or dataset. Instead, it is restored into a new or separate Redis instance, or a different namespace/database. This ensures no accidental data overwrites and allows for forensics or troubleshooting on restored data.
- Only full database restores are supported; restoring individual keys is not.
- Passwords and credentials will be rolled back to the state at the time of backup.
Monitoring
Each Redis Cluster deployment in PDS includes a Prometheus exporter. Metrics are exposed in the Prometheus format and can be visualized using Grafana or other tools.
Metrics are available through the PDS Control Plane’s Prometheus endpoint and include details like memory usage, keyspace statistics, replication lag, and command stats.
For the complete list of supported metrics and example usage, refer to the PDS Redis Cluster Metrics.