Version: 3.2

Cloud backups for group of PVCs in OCP on bare metal

Summary and Key concepts

Summary:

This article provides instructions on how to create and manage group cloud snapshots of Portworx volumes in Kubernetes. It covers prerequisites such as installing Stork and configuring cloud credentials for connecting to cloud storage providers like S3. The article details the process of creating group cloud snapshots using the GroupVolumeSnapshot Custom Resource Definition (CRD), tracking the status of snapshots, and restoring data from snapshots. It also explains how to set pre- and post-snapshot rules and manage retries for failed snapshots. The guide includes an example of taking a group cloud snapshot of Cassandra PVCs and restoring the data in pods.

Kubernetes Concepts:

  • PersistentVolumeClaim (PVC): Requests for storage in Kubernetes, which can be snapshotted and restored.
  • Annotations: Metadata used to manage group snapshots and specify cloud backup options.
  • Namespace: Used to isolate Kubernetes resources, and group snapshots can be restored across multiple namespaces.

Portworx Concepts:

  • Stork: A Kubernetes extension for managing Portworx snapshots, backups, and restores.

  • GroupVolumeSnapshot: A Portworx CRD for taking snapshots of multiple PVCs as a group, including support for cloud backups.

  • VolumeSnapshot: A snapshot of a Portworx volume, which can be used for restoring data.

  • VolumeSnapshotRestore: A resource used for restoring volumes from snapshots, including in-place restores with Stork.

This document will show you how to create group cloud snapshots of Portworx volumes and how you can clone those snapshots to use them in pods.

Pre-requisites

Installing Stork

This requires that you already have Stork installed and running on your Kubernetes cluster. If you fetched the Portworx specs from the Portworx spec generator in Portworx Central and used the default options, Stork is already installed.

Configuring cloud secrets

To create cloud snapshots, you need to set up secrets with Portworx. Portworx uses these secrets to connect to and authenticate with the configured cloud provider.

Follow the instructions in the create and configure credentials section to set up secrets.

Creating group cloud snapshots

To take group snapshots, use the GroupVolumeSnapshot CRD object and set the portworx/snapshot-type option to cloud. Here is a simple example:

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud

The above spec takes a group snapshot of all PVCs that match the label app=cassandra.

The Examples section has a more detailed end-to-end example.
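
The spec above still has to be submitted to the cluster. As a minimal sketch, the shell function below prints the same manifest for an arbitrary object name and label so that it can be piped to oc apply. The function name render_group_cloudsnap and its arguments are hypothetical helpers used for illustration; they are not part of Stork or oc.

```shell
# Hypothetical helper (not part of Stork or oc): print a GroupVolumeSnapshot
# manifest that selects PVCs labelled KEY=VALUE and backs them up to the cloud.
# Usage: render_group_cloudsnap cassandra-group-cloudsnapshot app cassandra | oc apply -f -
render_group_cloudsnap() (
  name=$1
  key=$2
  value=$3
  cat <<EOF
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: ${name}
spec:
  pvcSelector:
    matchLabels:
      ${key}: ${value}
  options:
    portworx/snapshot-type: cloud
EOF
)
```

Printing the manifest instead of applying it directly makes it easy to review the object before it reaches the cluster.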

note

The above spec backs up the snapshots to a cloud S3 endpoint. If you intend to take snapshots that are local to the cluster, refer to Create local group snapshots.

The GroupVolumeSnapshot object also supports specifying pre and post rules that are run on the application pods using the volumes being snapshotted. This allows users to quiesce the applications before the snapshot is taken and resume I/O after the snapshot is taken. Refer to 3D Snapshots for more detailed documentation on that.

Checking status of group cloud snapshots

A new VolumeSnapshot object will get created for each PVC that matches the given pvcSelector. For example, if the label selector app: cassandra matches 3 PVCs, you will have 3 volumesnapshot objects.

You can track the status of the group volume snapshots using:

oc describe groupvolumesnapshot <group-snapshot-name>

This will show the latest status and will also list the VolumeSnapshot objects once it's complete. Below is an example of the status section of the cassandra group snapshot.

Status:
  Stage:   Final
  Status:  Successful
  Volume Snapshots:
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/763613271174793816-922960401583326548
        Snapshot Type:  cloud
    Parent Volume ID:      763613271174793816
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-66490f4172c7
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/1081147806034223862-518034075073409747
        Snapshot Type:  cloud
    Parent Volume ID:      1081147806034223862
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-b62951dcca0e
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/237262101530372284-299546281563771622
        Snapshot Type:  cloud
    Parent Volume ID:      237262101530372284
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-ee3b13f7c03f
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7

The output lists the 3 volume snapshots that are part of the group snapshot. The name of each volume snapshot is in the Volume Snapshot Name field. For more details on a volumesnapshot, run:

oc get volumesnapshot.volumesnapshot.external-storage.k8s.io/<volume-snapshot-name> -o yaml
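
Instead of re-running oc describe by hand, the status can be polled from a script. The sketch below is a hypothetical helper, not a Stork feature. It assumes that the JSON paths .status.stage and .status.status correspond to the Stage and Status fields printed by oc describe; verify this against your cluster with oc get groupvolumesnapshot <group-snapshot-name> -o yaml.

```shell
# Hypothetical helper: poll a GroupVolumeSnapshot until it succeeds or fails.
# Assumption: .status.stage and .status.status are the JSON names of the
# "Stage" and "Status" fields shown by `oc describe`.
# Usage: wait_for_group_snapshot cassandra-group-cloudsnapshot [polls] [delay-seconds]
wait_for_group_snapshot() (
  name=$1
  tries=${2:-30}   # number of polls before giving up
  delay=${3:-10}   # seconds to sleep between polls
  state=""
  while [ "$tries" -gt 0 ]; do
    state=$(oc get groupvolumesnapshot "$name" \
      -o jsonpath='{.status.stage}/{.status.status}')
    case "$state" in
      Final/Successful) echo "$name: $state"; exit 0 ;;
      */Failed)         echo "$name: $state" >&2; exit 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "$name: gave up waiting (last state: $state)" >&2
  exit 2
)
```

The function returns 0 on success, 1 on failure, and 2 when the polling budget runs out, so a calling script can tell a failed snapshot from a slow one.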

Retries of group cloud snapshots

If a cloud groupvolumesnapshot fails to trigger, it will be retried. However, by default, if a cloud groupvolumesnapshot fails after it has been triggered successfully, it is marked as Failed and is not retried.

If you want to change this behavior, set the maxRetries field in the spec. The example below performs up to 3 retries on failure.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  maxRetries: 3
  options:
    portworx/snapshot-type: cloud

When maxRetries is set, the NumRetries field in the status of the groupvolumesnapshot indicates the number of retries performed.

Snapshots across namespaces

When creating a group snapshot, you can specify a list of namespaces to which the group snapshot can be restored. Below is an example of a group cloud snapshot which can be restored into prod-01 and prod-02 namespaces.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-groupsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud
  restoreNamespaces:
    - prod-01
    - prod-02

Restoring from group cloud snapshots

The previous sections describe how to list the volume snapshots that are part of a group snapshot. Once you have the names of the VolumeSnapshot objects, you can restore them in place or create new PVCs from them.

When you perform an in-place restore to a PVC, Stork takes the pods using that PVC offline, restores the volume from the snapshot, then brings the pods back online.

note

In-place restore using VolumeSnapshotRestore works only for applications deployed using the Stork scheduler. If you're not using the Stork scheduler, Portworx displays the following error when you describe the VolumeSnapshotRestore resource:

Events:
  Type     Reason  Age               From   Message
  ----     ------  ----              ----   -------
  Warning  Failed  5s (x2 over 15s)  stork  application not scheduled by stork scheduler

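
To catch this condition before attempting a restore, you can check which scheduler each application pod was assigned to. The sketch below reads the standard spec.schedulerName field of each pod. The function name is hypothetical, and the scheduler name stork is an assumption based on the event message above; adjust it if your Stork scheduler is registered under a different name.

```shell
# Hypothetical helper: print the pods matching a label selector that were NOT
# scheduled by the scheduler named "stork" (assumed name, see lead-in).
# Usage: pods_not_using_stork app=mysql
pods_not_using_stork() (
  selector=$1
  oc get pods -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.schedulerName}{"\n"}{end}' |
    awk '$2 != "stork" { print $1 }'
)
```

An empty result means that every matching pod reports stork as its scheduler.
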
  1. Create a VolumeSnapshotRestore YAML file specifying the following:

    • apiVersion as stork.libopenstorage.org/v1alpha1
    • kind as VolumeSnapshotRestore
    • metadata.name with the name of the object that performs the restore
    • metadata.namespace with the name of the target namespace
    • spec.sourceName with the name of the snapshot you want to restore
    • spec.sourceNamespace with the namespace in which the snapshot resides

    The following example defines a VolumeSnapshotRestore object called mysql-snap-inrestore in the default namespace. It restores data from a snapshot called mysql-snapshot, which was created in the mysql-snap-restore-splocal namespace:

    apiVersion: stork.libopenstorage.org/v1alpha1
    kind: VolumeSnapshotRestore
    metadata:
      name: mysql-snap-inrestore
      namespace: default
    spec:
      sourceName: mysql-snapshot
      sourceNamespace: mysql-snap-restore-splocal

  2. Place the spec into a file called mysql-cloud-snapshot-restore.yaml and apply it:

    oc apply -f mysql-cloud-snapshot-restore.yaml

  3. You can enter the following command to see the status of the restore process:

    storkctl get volumesnapshotrestore
    NAME                   SOURCE-SNAPSHOT   SOURCE-SNAPSHOT-NAMESPACE   STATUS       VOLUMES   CREATED
    mysql-snap-inrestore   mysql-snapshot    default                     Successful   1         23 Sep 19 21:55 EDT

    You can also use the oc describe command to retrieve more detailed information about the status of the restore process. Example:

    oc describe volumesnapshotrestore mysql-snap-inrestore

    Name:         mysql-snap-inrestore
    Namespace:    default
    Labels:       <none>
    Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                    {"apiVersion":"stork.libopenstorage.org/v1alpha1","kind":"VolumeSnapshotRestore","metadata":{"annotations":{},"name":"mysql-snap-inrestore...
    API Version:  stork.libopenstorage.org/v1alpha1
    Kind:         VolumeSnapshotRestore
    Metadata:
      Creation Timestamp:  2019-09-23T17:24:30Z
      Generation:          5
      Resource Version:    904014
      Self Link:           /apis/stork.libopenstorage.org/v1alpha1/namespaces/default/volumesnapshotrestores/mysql-snap-inrestore
      UID:                 xxxxxxxx-xxxx-xxxx-xxxx-000c295d6364
    Spec:
      Group Snapshot:    false
      Source Name:       mysql-snapshot
      Source Namespace:  default
    Status:
      Status:  Successful
      Volumes:
        Namespace:  default
        Pvc:        mysql-data
        Reason:     Restore is successful
        Snapshot:   k8s-volume-snapshot-xxxxxxxx-xxxx-xxxx-xxxx-320ff611f4ca
        Status:     Successful
        Volume:     pvc-xxxxxxxx-xxxx-xxxx-xxxx-000c295d6364
    Events:
      Type    Reason      Age  From   Message
      ----    ------      ---- ----   -------
      Normal  Successful  0s   stork  Snapshot in-Place Restore completed

When you install Stork, it also creates a storage class called stork-snapshot-sc. This storage class can be used to create PVCs from snapshots.

To create a PVC from a snapshot, add the snapshot.alpha.kubernetes.io/snapshot annotation to refer to the snapshot name. If the snapshot exists in another namespace, you should specify the snapshot namespace with the stork.libopenstorage.org/snapshot-source-namespace annotation in the PVC.

The Retain policy is important if you need to keep the volume in place, even after removing the Kubernetes objects from a cluster.

note
  • As shown in the following example, the storageClassName should be the Stork StorageClass stork-snapshot-sc.
  • When you use this storage class, the PVC is created with a reclaim policy of Delete. If the source PVC has a reclaim policy of Retain, the restored PVC does not inherit it. After the restore, manually verify the reclaim policy and change it if needed.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-snap-clone
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: mysql-snapshot
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi

Once you apply the above spec, you will see a PVC created by Stork. This PVC will be backed by a Portworx volume clone of the snapshot created above.

oc get pvc

NAMESPACE   NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
default     mysql-data         Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447   2Gi        RWO            px-mysql-sc         2d
default     mysql-snap-clone   Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447   2Gi        RWO            stork-snapshot-sc   2s
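
When the snapshot lives in a different namespace than the new PVC, the PVC also needs the stork.libopenstorage.org/snapshot-source-namespace annotation described earlier in this section. The sketch below prints such a PVC manifest. The function name, its arguments, and the PVC name in the usage line are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: print a PVC manifest that clones a snapshot stored in
# another namespace. Both annotations and the storage class come from the
# text above; the default size matches the earlier example.
# Usage: render_snapshot_pvc cassandra-0-clone <volume-snapshot-name> default | oc apply -n prod-01 -f -
render_snapshot_pvc() (
  pvc=$1
  snapshot=$2
  source_ns=$3
  size=${4:-2Gi}
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc}
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: ${snapshot}
    stork.libopenstorage.org/snapshot-source-namespace: ${source_ns}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: ${size}
EOF
)
```

For a group snapshot, the target namespace must be one of the namespaces listed in restoreNamespaces when the group snapshot was created.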

Examples

Group cloud snapshot for all cassandra PVCs

The example below takes a group snapshot of all PVCs in the default namespace that have the label app: cassandra and backs it up to the cloud S3 endpoint configured in the Portworx cluster.

Step 1: Deploy cassandra statefulset and PVCs

The following spec creates a cassandra statefulset with 3 replicas. Each replica pod uses its own PVC.

##### Portworx storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: portworx-repl2
provisioner: pxd.portworx.com
parameters:
  repl: "2"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
    - port: 9042
  selector:
    app: cassandra

---

apiVersion: "apps/v1"
kind: StatefulSet
metadata:
  name: cassandra
spec:
  selector:
    matchLabels:
      app: cassandra
  serviceName: cassandra
  replicas: 3
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v12
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: CASSANDRA_AUTO_BOOTSTRAP
            value: "false"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
      labels:
        app: cassandra
      annotations:
        volume.beta.kubernetes.io/storage-class: portworx-repl2
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi

Step 2: Wait for all cassandra pods to be running

List the cassandra pods:

oc get pods -l app=cassandra

NAME          READY   STATUS    RESTARTS   AGE
cassandra-0   1/1     Running   0          3m
cassandra-1   1/1     Running   0          2m
cassandra-2   1/1     Running   0          1m

Once all 3 pods are running, you can also list the cassandra PVCs.

oc get pvc -l app=cassandra

NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
cassandra-data-cassandra-0   Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7   2Gi        RWO            stork-snapshot-sc   3m
cassandra-data-cassandra-1   Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7   2Gi        RWO            stork-snapshot-sc   2m
cassandra-data-cassandra-2   Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7   2Gi        RWO            stork-snapshot-sc   1m
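
The two checks in this step can also be scripted. The sketch below waits for the pods with oc wait and then verifies that every PVC with the same label is Bound. The function name cassandra_ready and the 300 second timeout are assumptions made for illustration.

```shell
# Hypothetical helper: succeed only when all pods matching the selector are
# Ready and all PVCs matching the selector are Bound.
# Usage: cassandra_ready app=cassandra
cassandra_ready() (
  selector=${1:-app=cassandra}
  # Block until every matching pod reports Ready, or the timeout expires.
  oc wait --for=condition=Ready pod -l "$selector" --timeout=300s || exit 1
  # Collect the names of matching PVCs whose phase is not Bound.
  pending=$(oc get pvc -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}' |
    awk '$2 != "Bound" { print $1 }')
  [ -z "$pending" ] || { echo "PVCs not Bound: $pending" >&2; exit 1; }
)
```

Running this check before Step 3 helps ensure that the label selector of the group snapshot only matches volumes that are already provisioned.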

Step 3: Take the group cloud snapshot

Apply the following spec to take the cassandra group snapshot. Portworx will quiesce I/O on all volumes before triggering their snapshots.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
name: cassandra-group-cloudsnapshot
spec:
pvcSelector:
matchLabels:
app: cassandra
options:
portworx/snapshot-type: cloud

Once you apply the above object you can check the status of the snapshots using oc:

oc describe groupvolumesnapshot cassandra-group-cloudsnapshot

While the group snapshot is in progress, the status shows InProgress. Once it completes, the stage shows Final and the status shows Successful.

Name:         cassandra-group-cloudsnapshot
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"stork.libopenstorage.org/v1alpha1","kind":"GroupVolumeSnapshot","metadata":{"annotations":{},"name":"cassandra-group-cloudsnapshot","nam...
API Version:  stork.libopenstorage.org/v1alpha1
Kind:         GroupVolumeSnapshot
Metadata:
  Cluster Name:
  Creation Timestamp:  2019-01-14T20:30:13Z
  Generation:          0
  Resource Version:    18212101
  Self Link:           /apis/stork.libopenstorage.org/v1alpha1/namespaces/default/groupvolumesnapshots/cassandra-group-cloudsnapshot
  UID:                 xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
  Options:
    Portworx / Snapshot - Type:  cloud
  Post Snapshot Rule:
  Pre Snapshot Rule:
  Pvc Selector:
    Match Labels:
      App:  cassandra
Status:
  Stage:   Final
  Status:  Successful
  Volume Snapshots:
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/763613271174793816-922960401583326548
        Snapshot Type:  cloud
    Parent Volume ID:      763613271174793816
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-66490f4172c7
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/1081147806034223862-518034075073409747
        Snapshot Type:  cloud
    Parent Volume ID:      1081147806034223862
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-b62951dcca0e
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:    xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/237262101530372284-299546281563771622
        Snapshot Type:  cloud
    Parent Volume ID:      237262101530372284
    Task ID:               xxxxxxxx-xxxx-xxxx-xxxx-ee3b13f7c03f
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Events:  <none>

The output above shows that creating cassandra-group-cloudsnapshot created 3 volumesnapshots:

  1. cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
  2. cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
  3. cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7

These correspond to the PVCs cassandra-data-cassandra-0, cassandra-data-cassandra-1 and cassandra-data-cassandra-2 respectively.
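
The snapshot names do not have to be copied by hand. As a rough sketch, the function below extracts them from the oc describe output by matching the Volume Snapshot Name field shown above. The function name is hypothetical, and the approach relies on the describe output keeping that field label.

```shell
# Hypothetical helper: print the names of the VolumeSnapshot objects that
# belong to a GroupVolumeSnapshot, one per line, by filtering the
# "Volume Snapshot Name:" lines of the describe output.
# Usage: group_snapshot_members cassandra-group-cloudsnapshot
group_snapshot_members() (
  oc describe groupvolumesnapshot "$1" |
    sed -n 's/^[[:space:]]*Volume Snapshot Name:[[:space:]]*//p'
)
```

The resulting names can be passed to oc describe volumesnapshot or used in the snapshot annotation of a new PVC.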

You can also describe these individual volume snapshots using:

oc describe volumesnapshot cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7

Name:         cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  volumesnapshot.external-storage.k8s.io/v1
Kind:         VolumeSnapshot
Metadata:
  Cluster Name:
  Creation Timestamp:  2019-01-14T20:30:49Z
  Owner References:
    API Version:     stork.libopenstorage.org/v1alpha1
    Kind:            GroupVolumeSnapshot
    Name:            cassandra-group-cloudsnapshot
    UID:             xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
  Resource Version:  18212097
  Self Link:         /apis/volumesnapshot.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
  UID:               xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
  Persistent Volume Claim Name:  cassandra-data-cassandra-0
  Snapshot Data Name:            cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Status:
  Conditions:
    Last Transition Time:  2019-01-14T20:30:49Z
    Message:               Snapshot created successfully and it is ready
    Reason:
    Status:                True
    Type:                  Ready
  Creation Timestamp:      <nil>
Events:                    <none>

Deleting group snapshots

To delete group snapshots, you need to delete the GroupVolumeSnapshot that was used to create the group snapshots. Stork will delete all other volumesnapshots that were created for this group snapshot.

oc delete groupvolumesnapshot cassandra-group-cloudsnapshot
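
To confirm that the member snapshots are gone as well, you can list the remaining volumesnapshots after the delete. The sketch below is a hypothetical helper. It assumes that member snapshot names start with the group snapshot name, as in the example output above, and it uses the fully qualified resource name from the earlier oc get command. Cleanup may not be instantaneous, so a non-empty result right after the delete does not necessarily indicate a problem.

```shell
# Hypothetical helper: delete a GroupVolumeSnapshot, then report any
# VolumeSnapshot whose name still starts with "<group-snapshot-name>-".
# Usage: delete_group_snapshot cassandra-group-cloudsnapshot
delete_group_snapshot() (
  name=$1
  oc delete groupvolumesnapshot "$name" || exit 1
  # `-o name` prints <resource>/<name>, so match on "/<group-name>-".
  left=$(oc get volumesnapshot.volumesnapshot.external-storage.k8s.io -o name |
    grep "/${name}-" || true)
  [ -z "$left" ] || { echo "still present:"; echo "$left"; exit 1; }
)
```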