Concepts
Overview
Cassandra is a widely used NoSQL database management system (DBMS). While several distributions of Cassandra are available, PDS uses Apache Cassandra.
PDS follows the Apache Cassandra release lifecycle, meaning new releases are available on PDS shortly after they become generally available (GA). Likewise, older versions of Cassandra are removed once they reach end-of-life. The list of currently supported Cassandra versions can be found here.
Like all data services in PDS, Cassandra deployments run within Kubernetes. PDS includes a component called the PDS Deployments Operator, which manages the deployment of all PDS data services. The operator extends the functionality of Kubernetes by implementing a custom resource called cassandra. The cassandra resource type represents a Cassandra cluster, allowing standard Kubernetes tools to be used to manage Cassandra clusters, including scaling, monitoring, and upgrading.
Learn more about the PDS architecture.
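Because cassandra deployments are represented as custom resources, they can be inspected with any standard Kubernetes client. The sketch below uses the official Kubernetes Python client; the API group, version, and plural are assumptions for illustration, so check the CRD installed in your cluster for the actual values.

```python
# Minimal sketch: list cassandra custom resources with the Kubernetes Python client.
# The group/version/plural values are assumptions; inspect the CRD in your cluster
# (for example, `kubectl get crd | grep cassandra`) for the real ones.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod
api = client.CustomObjectsApi()

cassandras = api.list_cluster_custom_object(
    group="deployments.pds.io",  # assumed API group
    version="v1",                # assumed version
    plural="cassandras",         # assumed plural
)

for item in cassandras.get("items", []):
    meta = item["metadata"]
    print(f'{meta["namespace"]}/{meta["name"]}')
```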
Connectivity
Any standard Cassandra clients or drivers can be used with PDS, including cqlsh and nodetool. Each Cassandra node is bound to default ports and made accessible with a fully qualified domain name (FQDN).
PDS enables efficient routing of requests and can be used with topology-aware drivers. Where possible, it is recommended to configure clients or drivers with each of the Cassandra nodes as endpoints. Clients can then route their requests directly to the Cassandra nodes where data live instead of requiring an extra hop through a load balancer.
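As a sketch of this pattern, the example below configures the DataStax Python driver (cassandra-driver) with every node as a contact point and a token-aware load-balancing policy. The FQDNs and data center name are placeholders, and authentication (required in PDS) is shown in the next example.

```python
# Sketch: topology-aware client configuration with the DataStax Python driver.
# Node FQDNs and the local data center name are placeholders.
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

contact_points = [
    "cassandra-0.example.internal",
    "cassandra-1.example.internal",
    "cassandra-2.example.internal",
]

# TokenAwarePolicy sends each request to a replica that owns the data,
# avoiding an extra coordinator hop where possible.
profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="dc1"))
)

cluster = Cluster(contact_points, execution_profiles={EXEC_PROFILE_DEFAULT: profile})
```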
Cassandra clusters in PDS are configured to use both authentication and authorization. Specifically, the PasswordAuthenticator and CassandraAuthorizer are used by default, though these can be overridden. PDS will create a default administrator user called pds which can be used to connect to the database initially. This user can be used to create additional users and can be dropped if needed.
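For instance, a client could connect with the default pds administrator and create an application-specific user, as sketched below with the DataStax Python driver. The password, role, and keyspace names are placeholders.

```python
# Sketch: connect as the default `pds` administrator and create an application user.
# Password, role, and keyspace names are placeholders.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

auth = PlainTextAuthProvider(username="pds", password="<pds-password>")
cluster = Cluster(["cassandra-0.example.internal"], auth_provider=auth)
session = cluster.connect()

session.execute(
    "CREATE ROLE IF NOT EXISTS app_user WITH PASSWORD = '<app-password>' AND LOGIN = true"
)
session.execute("GRANT ALL PERMISSIONS ON KEYSPACE app_ks TO app_user")
```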
Clustering
Like many NoSQL systems, Cassandra is a distributed database meant to be run as a cluster of individual nodes. This allows Cassandra to be fault-tolerant and highly available when deployed with the right configuration.
Although PDS allows Cassandra to be deployed with any number of nodes, high availability can only be achieved when running with three or more nodes. Smaller clusters should only be considered in development environments.
Cassandra clusters allow nodes to be grouped together into data centers. These data centers can represent real-world data centers (for example, different regions of the world), or they can represent logical groupings (for example, separate workloads). While PDS allows Cassandra deployments to be configured with multiple data centers, all Cassandra nodes from a PDS deployment must reside in a single Kubernetes cluster.
Cassandra racks are not used in PDS as hardware racks are usually abstracted away in Kubernetes clusters.
Because Cassandra is a distributed database, it must make trade-offs between consistency, availability, and partition tolerance; per the CAP theorem, it can guarantee only two of the three. Cassandra allows users to tune the consistency requirements to favor consistency or availability, but partition tolerance is always guaranteed. This makes Cassandra well-suited to run in a dynamic environment like Kubernetes.
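As an illustration of that tuning, the sketch below sets a per-statement consistency level with the DataStax Python driver, reusing the session from the Connectivity example; the keyspace and table names are placeholders. LOCAL_QUORUM favors consistency, while LOCAL_ONE favors availability when some replicas are unreachable.

```python
# Sketch: tune consistency per statement (reuses `session` from the earlier example).
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

strict_read = SimpleStatement(
    "SELECT * FROM app_ks.orders WHERE id = %s",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,  # favors consistency
)
fast_write = SimpleStatement(
    "INSERT INTO app_ks.events (id, payload) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_ONE,     # favors availability
)

session.execute(strict_read, ("order-123",))
session.execute(fast_write, ("event-456", '{"type": "example"}'))
```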
PDS leverages the partition tolerance of Cassandra by spreading Cassandra nodes across Kubernetes worker nodes when possible. PDS utilizes the Storage Orchestrator for Kubernetes (Stork), in combination with Kubernetes storageClasses, to intelligently schedule pods. By provisioning storage from different worker nodes, and then scheduling pods to be hyper-converged with the persistent volumes, PDS deploys Cassandra clusters in a way that maximizes fault tolerance, even if entire worker nodes or availability zones are impacted.
Replication
Application replication
Cassandra naturally distributes data over all nodes in the cluster through a process known as partitioning. Cassandra uses a consistent hashing algorithm to determine how the data is spread between nodes.
Optionally, partitions can be replicated to multiple nodes so that failure of any node does not result in data loss. Commonly, users choose a replication factor of three in an HA setup. This ensures that Cassandra will replicate each partition across three nodes in the cluster.
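For example, a keyspace with a replication factor of three can be created as sketched below, reusing the session from the Connectivity example; the keyspace and data center names are placeholders.

```python
# Sketch: create a keyspace replicated three times within one data center
# (reuses `session` from the earlier example; names are placeholders).
session.execute(
    """
    CREATE KEYSPACE IF NOT EXISTS app_ks
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
    """
)
```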
Application replication is used in Cassandra to accomplish load balancing and high availability. With data replicated to multiple nodes, more nodes are able to respond to requests for data. Likewise, in the event of node failure, the cluster may still be available to serve client requests.
Storage replication
PDS takes data replication further. Storage in PDS is provided by Portworx Enterprise, which itself allows data to be replicated at the storage level.
Each Cassandra node in PDS is configured to store data to a persistent volume which is provisioned by Portworx Enterprise. These Portworx volumes can, in turn, be configured to replicate data to multiple volume replicas. It is recommended to use two volume replicas in PDS in combination with application replication in Cassandra.
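The sketch below is illustrative only, since PDS provisions and manages storage on your behalf: it shows, via the Kubernetes Python client, the kind of Portworx-backed storageClass that carries two volume replicas. The class name is a placeholder, and the provisioner and repl parameter reflect common Portworx usage rather than PDS defaults.

```python
# Illustrative sketch only: PDS manages storage for you. This shows a
# Portworx-backed StorageClass with two volume replicas via the Kubernetes
# Python client. The class name is a placeholder.
from kubernetes import client, config

config.load_kube_config()
storage_api = client.StorageV1Api()

sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="px-repl2-example"),
    provisioner="pxd.portworx.com",  # Portworx CSI provisioner
    parameters={"repl": "2"},        # two storage-level replicas per volume
)
storage_api.create_storage_class(sc)
```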
While the additional level of replication results in write amplification, storage-level replication solves a different problem than application-level replication: it reduces downtime in failure scenarios, lowering the Recovery Time Objective (RTO).
Portworx volume replicas are able to ensure that data are replicated to different Kubernetes worker nodes or even different availability zones. This maximizes the ability of the Kubernetes scheduler to schedule Cassandra pods.
For example, in cases where Kubernetes worker nodes are unschedulable, pods can be scheduled on other worker nodes where data already exist. What’s more, pods can start instantly and service traffic immediately without waiting for Cassandra to replicate data to the pod. This further extends the durability of Cassandra by ensuring performance is not impacted by a pod restart or node failure.
Scaling
Because of the ease with which databases can be deployed and managed in PDS, it is common for customers to deploy many of them. Likewise, because it is easy to scale databases in PDS, it is common for customers to start with smaller clusters and then add resources when needed. PDS supports both vertical scaling (i.e., CPU/memory) and horizontal scaling (i.e., nodes) of Cassandra clusters.
Vertical scaling
Vertical scaling refers to the process of adding hardware resources to (scaling up) or removing hardware resources from (scaling down) database nodes. In the context of Kubernetes and PDS, these hardware resources are virtualized CPU and memory.
PDS allows Cassandra nodes to be dynamically reprovisioned with additional or reduced CPU and/or memory. These changes are applied in a rolling fashion across each of the nodes in the Cassandra cluster. For Cassandra clusters with more than a single node, this operation is non-disruptive to client applications.
Horizontal scaling
Horizontal scaling refers to the process of adding database nodes to (scaling out) or removing database nodes from (scaling in) a cluster. Currently, PDS supports only scaling out of clusters. This is accomplished by adding additional pods to the existing Cassandra cluster by updating the replica count of the statefulSet.
When new pods are started, they discover the other Cassandra nodes and join the cluster. PDS uses the auto bootstrap functionality of Cassandra to automatically redistribute data from existing nodes to the new nodes.
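One way to confirm that new nodes have joined is to query the cluster's system tables, as sketched below, reusing the session from the Connectivity example: the coordinator appears in system.local and every other member appears in system.peers.

```python
# Sketch: confirm cluster membership after a scale-out
# (reuses `session` from the earlier example).
local = session.execute(
    "SELECT data_center, rack, release_version FROM system.local"
).one()
print("coordinator:", local.data_center, local.rack, local.release_version)

for peer in session.execute("SELECT peer, data_center, rack FROM system.peers"):
    print("peer:", peer.peer, peer.data_center, peer.rack)
```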
If you reduce the size of your Cassandra cluster, you will need to manually run nodetool removenode from one of the remaining Cassandra nodes.
Backup and Restore
Backup and restore functionality for Cassandra in PDS is provided by Medusa. Backups can be taken ad hoc or can be performed on a schedule. Both full and differential backups are supported.
To learn more about ad hoc and scheduled backups in PDS, see Backup a data service.
Backups
Each Cassandra deployment in PDS is bundled with Medusa. Medusa is configured to take a backup of the entire Cassandra cluster and to store the data to a dedicated Portworx volume.
The backup volume is shared across all nodes in the Cassandra cluster which allows for state to be shared between the Cassandra nodes. By using a dedicated volume, the backup process does not interfere with the performance of the database.
Once Medusa has completed the backup, the PDS Backup Operator makes a copy of the volume to a remote object store. This feature, which is provided by Portworx Enterprise's cloud snapshot functionality, allows PDS to fulfill the 3-2-1 backup strategy: three copies of data on two types of storage media, with one copy offsite.
Restores
Restoring Cassandra in PDS is done out-of-place. That is to say, PDS will deploy a new Cassandra cluster and restore the data to the new cluster. This prevents users from accidentally overwriting data and allows users to stand up multiple copies of their databases for debugging or forensic purposes.
PDS does not currently support restoration of individual nodes or of specific tables or keyspaces.
The new Cassandra deployment will have the same properties as the original at the time of backup. If a backup was taken of a Cassandra cluster with three nodes, and the cluster was later scaled to five nodes, restoring from backup would deploy a three-node cluster. Effectively, PDS restores the state of the deployment, not just the data.
As with any database restore, data written after the backup was taken will be lost. Because user passwords are stored within the database, restoring from backup will also revert any password changes. Therefore, to ensure access to the restored database, users must manage their own credentials.
Monitoring
Cassandra allows for monitoring via JMX. While users can continue to use their existing JMX tooling if they choose, PDS also extends the monitoring capabilities of Cassandra so that modern tools such as Grafana can be used with Cassandra.
With each Cassandra deployment, PDS bundles a metrics exporter known as the jmx_exporter. This exporter is provided by the Prometheus community and makes Cassandra JMX metrics available in the Prometheus text-based format.
The jmx_exporter is configured for use with Cassandra by whitelisting specific MBeans at the keyspace, cluster, and JVM levels. As an optimization measure, table-level metrics are not exported by default.
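As a sketch, the metrics endpoint can be read with any HTTP client; the example below uses Python's requests library. The node FQDN, port, and metric filter are assumptions, as they depend on how the exporter is configured in your deployment.

```python
# Sketch: read Cassandra metrics in Prometheus text format from the bundled
# jmx_exporter. The FQDN, port, and metric filter are assumptions.
import requests

resp = requests.get("http://cassandra-0.example.internal:9000/metrics", timeout=10)
resp.raise_for_status()

# Print keyspace-related metric lines as an example; the actual metric names
# depend on the exporter's rule configuration.
for line in resp.text.splitlines():
    if not line.startswith("#") and "keyspace" in line:
        print(line)
```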