Snapshot group of PVCs in AWS EKS
Summary and Key concepts
Summary:
This article provides a comprehensive guide on creating and managing group snapshots of Portworx volumes in Kubernetes using the GroupVolumeSnapshot
Custom Resource Definition (CRD). It covers prerequisites such as ensuring Stork is installed and operational, and explains how to create group snapshots for multiple PVCs, track snapshot status, and restore PVCs from the snapshots. The document also details how to set pre- and post-snapshot rules to manage application behavior during snapshot operations. It includes an example of taking group snapshots for Cassandra PVCs and restoring them, along with instructions for managing cross-namespace snapshots.
Kubernetes Concepts:
- PersistentVolumeClaim (PVC): A user request for storage in Kubernetes, used for creating and restoring group snapshots.
- StorageClass: Defines how storage is dynamically provisioned for PVCs and snapshots.
- Annotations: Used to add metadata to objects such as PVCs and snapshots, including cross-namespace snapshot restoration.
- StatefulSet: Manages the deployment of stateful applications, ensuring each pod maintains a stable identity.
Portworx Concepts:
-
Stork: A Portworx extension that handles advanced data management operations such as snapshots and backups in Kubernetes.
-
VolumeSnapshot: A snapshot of a Portworx volume that can be used for backup and recovery.
This document will show you how to create group snapshots of Portworx volumes and how you can restore those snapshots to use them in pods.
Pre-requisites
Installing Stork
This requires that you already have Stork installed and running on your Kubernetes cluster. If you fetched the Portworx specs from the Portworx spec generator in Portworx Central and used the default options, Stork is already installed.
Creating group snapshots
To take group snapshots, you need to use the GroupVolumeSnapshot CRD object. Here is a simple example:
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
name: cassandra-groupsnapshot
spec:
pvcSelector:
matchLabels:
app: cassandra
Above spec will take a group snapshot of all PVCs that match labels app=cassandra
.
The Examples section has a more detailed end-to-end example.
Above spec will keep all the snapshots local to the Portworx cluster. If you intend on backing up the group snapshots to cloud (S3 endpoint), refer to Create group cloud snapshots.
The GroupVolumeSnapshot
object also supports specifying pre and post rules that are run on the application pods using the volumes being snapshotted. This allows users to quiesce the applications before the snapshot is taken and resume I/O after the snapshot is taken. Refer to 3D Snapshots for more detailed documentation on that.
Checking status of group snapshots
A new VolumeSnapshot object will get created for each PVC that matches the given pvcSelector
. For example, if the label selector app: cassandra
matches 3 PVCs, you will have 3 volumesnapshot objects.
You can track the status of the group volume snapshots using:
kubectl describe groupvolumesnapshot <group-snapshot-name>
This will show the latest status and will also list the VolumeSnapshot objects once it's complete. Below is an example of the status section of the cassandra group snapshot.
Status:
Stage: Final
Status: Successful
Volume Snapshots:
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 1015874155818710382
Parent Volume ID: 763613271174793816
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 1130064992705573378
Parent Volume ID: 1081147806034223862
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 175241555565145805
Parent Volume ID: 237262101530372284
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
You can see 3 VolumeSnapshots which are part of the group snapshot. The name of the VolumeSnapshot is in the Volume Snapshot Name field. For more details on the VolumeSnapshot, you can do:
kubectl get volumesnapshot.volumesnapshot.external-storage.k8s.io/<volume-snapshot-name> -o yaml
Snapshots across namespaces
When creating a group snapshot, you can specify a list of namespaces to which the group snapshot can be restored. Below is an example of a group snapshot which can be restored into prod-01 and prod-02 namespaces.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
name: cassandra-groupsnapshot
spec:
pvcSelector:
matchLabels:
app: cassandra
restoreNamespaces:
- prod-01
- prod-02
Restoring from group snapshots
Previous section describes how to list the volume snapshots that are part of a group snapshot. Once you have the names the VolumeSnapshot objects, you can use them to create PVCs from them.
When you install Stork, it also creates a storage class called stork-snapshot-sc. This storage class can be used to create PVCs from snapshots.
To create a PVC from a snapshot, add the snapshot.alpha.kubernetes.io/snapshot
annotation to refer to the snapshot name. If the snapshot exists in another namespace, you should specify the snapshot namespace with the stork.libopenstorage.org/snapshot-source-namespace
annotation in the PVC.
The Retain policy is important if you need to keep the volume in place, even after removing the Kubernetes objects from a cluster.
- As shown in the following example, the storageClassName should be the Stork StorageClass
stork-snapshot-sc
. - When using this storage class the PVC is creating with
delete
as Retain policy. However, if the source PVC is having the policy asretain
, then this will not be inherited to the restored PVC. After the restore, you should manually verify the retain policy and change it if needed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-snap-clone
annotations:
snapshot.alpha.kubernetes.io/snapshot: mysql-snapshot
spec:
accessModes:
- ReadWriteOnce
storageClassName: stork-snapshot-sc
resources:
requests:
storage: 2Gi
Once you apply the above spec, you will see a PVC created by Stork. This PVC will be backed by a Portworx volume clone of the snapshot created above.
kubectl get pvc
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
default mysql-data Bound pvc-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447 2Gi RWO px-mysql-sc 2d
default mysql-snap-clone Bound pvc-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447 2Gi RWO stork-snapshot-sc 2s
Examples
Group snapshot for all cassandra PVCs
In below example, we will take a group snapshot for all PVCs in the default namespace and that have labels app: cassandra.
Step 1: Deploy cassandra statefulset and PVCs
Following spec creates a replica 3 cassandra statefulset. Each replica pod will use its own PVC.
##### Portworx storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: portworx-repl2
provisioner: pxd.portworx.com
parameters:
repl: "2"
---
apiVersion: v1
kind: Service
metadata:
labels:
app: cassandra
name: cassandra
spec:
clusterIP: None
ports:
- port: 9042
selector:
app: cassandra
---
apiVersion: "apps/v1"
kind: StatefulSet
metadata:
name: cassandra
spec:
selector:
matchLabels:
app: cassandra
serviceName: cassandra
replicas: 3
template:
metadata:
labels:
app: cassandra
spec:
containers:
- name: cassandra
image: gcr.io/google-samples/cassandra:v12
imagePullPolicy: Always
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7001
name: tls-intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
resources:
limits:
cpu: "500m"
memory: 1Gi
requests:
cpu: "500m"
memory: 1Gi
securityContext:
capabilities:
add:
- IPC_LOCK
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
env:
- name: MAX_HEAP_SIZE
value: 512M
- name: HEAP_NEWSIZE
value: 100M
- name: CASSANDRA_SEEDS
value: "cassandra-0.cassandra.default.svc.cluster.local"
- name: CASSANDRA_CLUSTER_NAME
value: "K8Demo"
- name: CASSANDRA_DC
value: "DC1-K8Demo"
- name: CASSANDRA_RACK
value: "Rack1-K8Demo"
- name: CASSANDRA_AUTO_BOOTSTRAP
value: "false"
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
# These volume mounts are persistent. They are like inline claims,
# but not exactly because the names need to match exactly one of
# the stateful pod volumes.
volumeMounts:
- name: cassandra-data
mountPath: /cassandra_data
# These are converted to volume claims by the controller
# and mounted at the paths mentioned above.
volumeClaimTemplates:
- metadata:
name: cassandra-data
labels:
app: cassandra
annotations:
volume.beta.kubernetes.io/storage-class: portworx-repl2
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 2Gi
Step 2: Wait for all cassandra pods to be running
List the cassandra pods:
kubectl get pods -l app=cassandra
NAME READY STATUS RESTARTS AGE
cassandra-0 1/1 Running 0 3m
cassandra-1 1/1 Running 0 2m
cassandra-2 1/1 Running 0 1m
Once you see all the 3 pods, you can also list the cassandra PVCs.
kubectl get pvc -l app=cassandra
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cassandra-data-cassandra-0 Bound pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7 2Gi RWO stork-snapshot-sc 3m
cassandra-data-cassandra-1 Bound pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7 2Gi RWO stork-snapshot-sc 2m
cassandra-data-cassandra-2 Bound pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7 2Gi RWO stork-snapshot-sc 1m
Step 3: Take the group snapshot
Apply the following spec to take the cassandra group snapshot. Portworx will quiesce I/O on all volumes before triggering their snapshots.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
name: cassandra-group-snapshot
spec:
pvcSelector:
matchLabels:
app: cassandra
Once you apply the above object you can check the status of the snapshots using kubectl
:
kubectl describe groupvolumesnapshot cassandra-group-snapshot
While the group snapshot is in progress, the status will reflect as InProgress. Once complete, you should see a status stage as Final and status as Successful.
Name: cassandra-group-snapshot
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"stork.libopenstorage.org/v1alpha1","kind":"GroupVolumeSnapshot","metadata":{"annotations":{},"name":"cassandra-group-snapshot"
,"namespac...
API Version: stork.libopenstorage.org/v1alpha1
Kind: GroupVolumeSnapshot
Metadata:
Cluster Name:
Creation Timestamp: 2019-01-14T18:02:16Z
Generation: 0
Resource Version: 18184467
Self Link: /apis/stork.libopenstorage.org/v1alpha1/namespaces/default/groupvolumesnapshots/cassandra-group-snapshot
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
Options: <nil>
Post Snapshot Rule:
Pre Snapshot Rule:
Pvc Selector:
Match Labels:
App: cassandra
Status:
Stage: Final
Status: Successful
Volume Snapshots:
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 1015874155818710382
Parent Volume ID: 763613271174793816
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 1130064992705573378
Parent Volume ID: 1081147806034223862
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: 175241555565145805
Parent Volume ID: 237262101530372284
Task ID:
Volume Snapshot Name: cassandra-group-snapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7 16s
Above we can see that creation of cassandra-group-snapshot created 3 volumesnapshots:
- cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
- cassandra-group-snapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
- cassandra-group-snapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
These correspond to the PVCs cassandra-data-cassandra-0, cassandra-data-cassandra-1 and cassandra-data-cassandra-2 respectively.
You can also describe these individual volume snapshots using
kubectl describe volumesnapshot cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Name: cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Namespace: default
Labels: <none>
Annotations: <none>
API Version: volumesnapshot.external-storage.k8s.io/v1
Kind: VolumeSnapshot
Metadata:
Cluster Name:
Creation Timestamp: 2019-01-14T18:02:47Z
Owner References:
API Version: stork.libopenstorage.org/v1alpha1
Kind: GroupVolumeSnapshot
Name: cassandra-group-snapshot
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Resource Version: 18184459
Self Link: /apis/volumesnapshot.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
Persistent Volume Claim Name: cassandra-data-cassandra-0
Snapshot Data Name: cassandra-group-snapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Status:
Conditions:
Last Transition Time: 2019-01-14T18:02:47Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Creation Timestamp: <nil>
Events: <none>
Deleting group snapshots
To delete group snapshots, you need to delete the GroupVolumeSnapshot
that was used to create the group snapshots. Stork will delete all other volumesnapshots that were created for this group snapshot.
kubectl delete groupvolumesnapshot cassandra-group-snapshot