Cloud Snapshots
A cloud snapshot is a point-in-time copy of a volume or group of volumes that Portworx uploads from the source storage system to a remote cloud storage location, such as a configured S3-compliant endpoint like AWS S3. You can use cloud snapshots to protect data against cluster-level failures and enable disaster recovery across regions or clusters. Cloud snapshots support long-term retention, off-site backups, and compliance with data protection policies.
By default, Portworx groups the backup data into 10 MB objects, compresses the objects, and uploads them to the cloud. You can increase the CloudSnap object size from the default 10 MB to 100 MB. Using larger object sizes improves the backup and restore performance for large volumes, particularly in environments using NFS backends. For information on how to increase the CloudSnap object size, see Increase the CloudSnap object size.
This topic describes how you can create cloud snapshots of Portworx volumes and FlashArray Direct Access (FADA) volumes, and how you can clone those snapshots to use them in pods. Cloud snapshots on FADA volumes and PX volumes use the same failure handling, cloud upload, and restore mechanisms.
Prerequisites
- Ensure that you have two running Portworx clusters. For information on how to install Portworx on a cluster, see System Requirements.
- Ensure that you have an object store. Cloud snapshots work with Amazon S3, Azure Blob, Google Cloud Storage, or any S3-compatible object store. If you do not have an object store, Portworx by Pure Storage recommends using MinIO. For information on how to install MinIO, see the MinIO Quickstart Guide.
- Ensure that you have a secret store provider. For information on how to configure a secret store provider, see Secret store management.
- Ensure that Stork is installed and running on your cluster. Note: If you generated the Portworx specs using the default options in Portworx Central, Stork is already installed.
- Set up secrets in Portworx to connect and authenticate with the configured cloud provider, as shown in the example after this list. For more information, see the Create and Configure Credentials section.
- To increase the CloudSnap object size to 100 MB, ensure that each node in your cluster has at least 16 CPU cores and 16 GB of RAM.
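For example, with an S3-compatible object store, you can create the credentials using pxctl on one of the Portworx nodes. This is a minimal sketch; the region, endpoint, and the credential name px-s3-creds are placeholder values:
pxctl credentials create \
  --provider s3 \
  --s3-access-key <access-key> \
  --s3-secret-key <secret-key> \
  --s3-region us-east-1 \
  --s3-endpoint s3.amazonaws.com \
  px-s3-creds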
Limitations
- The following are the limitations of cloud snapshots on FADA volumes:
  - Cloud snapshots on FADA volumes do not support group snapshots.
  - Cloud snapshots on FADA volumes do not support schedules in Portworx. However, schedules through Stork are supported.
- If a CloudSnap is created with a 100 MB object size, restoring it to a cluster configured with a 10 MB object size fails.
(Optional) Increase the CloudSnap object size
To increase the CloudSnap object size from the default 10 MB to 100 MB, set the PX_CLOUDSNAP_LARGE_OBJECT_SUPPORT environment variable to "true" in the Portworx configuration:
spec:
  env:
  - name: PX_CLOUDSNAP_LARGE_OBJECT_SUPPORT
    value: "true"
After you apply this change, the Portworx cluster automatically restarts to pick up the new setting.
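If you deployed Portworx with the Operator, the environment variable goes under spec.env of the StorageCluster resource. A minimal sketch, assuming a cluster named px-cluster in the kube-system namespace:
apiVersion: core.libopenstorage.org/v1
kind: StorageCluster
metadata:
  name: px-cluster
  namespace: kube-system
spec:
  env:
  # Enables 100 MB CloudSnap objects; requires 16 CPU cores and 16 GB RAM per node
  - name: PX_CLOUDSNAP_LARGE_OBJECT_SUPPORT
    value: "true"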
Create and Restore a Cloud Snapshot in the Same Cluster
This section describes how to create a snapshot and restore it to the same Portworx cluster. You can either snapshot individual PVCs one by one or snapshot a group of PVCs.
You cannot use an older version of Portworx to restore a cloud snapshot created with a newer one. For example, if you are running Portworx 3.3, you cannot restore a cloud snapshot created with Portworx 3.4.
Creating a Cloud Snapshot of a Single PVC
The cloud snapshot method supports the following annotations:
- portworx/snapshot-type: Indicates the type of snapshot. For cloud snapshots, the value should be cloud.
- portworx/cloud-cred-id (Optional): Specifies the credentials UUID if you have configured credentials for multiple cloud providers.
- portworx.io/cloudsnap-incremental-count: Specifies the number of incremental cloud snapshots after which a full backup is taken.
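For example, a snapshot that uses a specific cloud credential and takes a full backup after every six incremental backups could combine the annotations as follows. This is a sketch; the credentials UUID placeholder and the count of 6 are illustrative values:
metadata:
  annotations:
    portworx/snapshot-type: cloud
    portworx/cloud-cred-id: <credentials-uuid>
    portworx.io/cloudsnap-incremental-count: "6"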
Example
Below, we create a cloud snapshot for a PVC called mysql-data backed by a Portworx volume.
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot
  namespace: default
  annotations:
    portworx/snapshot-type: cloud
spec:
  persistentVolumeClaimName: mysql-data
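Save the spec to a file and apply it; the file name below is an assumption:
kubectl apply -f mysql-snapshot.yaml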
After you apply the above object, you can check the status of the snapshot by appending its name to the following command:
- Kubernetes
- OpenShift
kubectl get volumesnapshot.volumesnapshot.external-storage.k8s.io/mysql-snapshot
oc get volumesnapshot.volumesnapshot.external-storage.k8s.io/mysql-snapshot
NAME AGE
volumesnapshots/mysql-snapshot 2s
- Kubernetes
- OpenShift
kubectl get volumesnapshotdatas
oc get volumesnapshotdatas
NAME AGE
volumesnapshotdatas/k8s-volume-snapshot-xxxxxxxx-xxxx-xxxx-xxxx-5a34ec89e61c 1s
The creation of the volumesnapshotdatas object indicates that the snapshot was created. If you describe the volumesnapshotdatas object, you can see the Portworx cloud snapshot ID and the PVC for which the snapshot was created.
- Kubernetes
- OpenShift
kubectl describe volumesnapshotdatas
oc describe volumesnapshotdatas
Name: k8s-volume-snapshot-xxxxxxxx-xxxx-xxxx-xxxx-5a34ec89e61c
Namespace:
Labels: <none>
Annotations: <none>
API Version: volumesnapshot.external-storage.k8s.io/v1
Kind: VolumeSnapshotData
Metadata:
Cluster Name:
Creation Timestamp: 2018-03-08T03:17:02Z
Deletion Grace Period Seconds: <nil>
Deletion Timestamp: <nil>
Resource Version: 29989636
Self Link: /apis/volumesnapshot.external-storage.k8s.io/v1/k8s-volume-snapshot-xxxxxxxx-xxxx-xxxx-xxxx-5a34ec89e61c
UID: xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447
Spec:
Persistent Volume Ref:
Kind: PersistentVolume
Name: pvc-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-33c5ab8d4d8e/149813028909420894-125009403033610837-incr
Volume Snapshot Ref:
Kind: VolumeSnapshot
Name: default/mysql-snapshot-xxxxxxxx-xxxx-xxxx-xxxx-0214683e8447
Status:
Conditions:
Last Transition Time: <nil>
Message:
Reason:
Status:
Type:
Creation Timestamp: <nil>
Events: <none>
Creating a Cloud Snapshot of a Group of PVCs
To take group snapshots, you need to use the GroupVolumeSnapshot CRD object and pass in portworx/snapshot-type as cloud. Here is a simple example:
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud
The above spec takes a group snapshot of all PVCs that match the label app=cassandra.
The Examples section has a more detailed end-to-end example.
The above spec backs up the snapshots to the configured cloud S3 endpoint. If you intend to take snapshots that are local to the cluster, refer to Create local group snapshots.
The GroupVolumeSnapshot object also supports pre- and post-snapshot rules that run on the application pods using the volumes being snapshotted, as shown in the sketch below. This allows you to quiesce the applications before the snapshot is taken and resume I/O after it completes. For more information, see 3D Snapshots.
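As a sketch, rules are referenced by name from the GroupVolumeSnapshot spec. The preExecRule and postExecRule field names and the rule names below are assumptions based on the Stork rule mechanism described in 3D Snapshots:
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  preExecRule: cassandra-presnap-rule    # hypothetical Rule object that quiesces the app
  postExecRule: cassandra-postsnap-rule  # hypothetical Rule object that resumes I/O
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud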
Checking status of group cloud snapshots
A new VolumeSnapshot object gets created for each PVC that matches the given pvcSelector.
For example, if the label selector app: cassandra matches three PVCs, you have three volumesnapshot objects.
You can track the status of the group volume snapshots using:
- Kubernetes
- OpenShift
kubectl describe groupvolumesnapshot <group-snapshot-name>
oc describe groupvolumesnapshot <group-snapshot-name>
This command shows the latest status and, after the group snapshot completes, lists the VolumeSnapshot objects.
Below is an example of the status section of the cassandra group snapshot.
Status:
Stage: Final
Status: Successful
Volume Snapshots:
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/763613271174793816-922960401583326548
Snapshot Type: cloud
Parent Volume ID: 763613271174793816
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-66490f4172c7
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/1081147806034223862-518034075073409747
Snapshot Type: cloud
Parent Volume ID: 1081147806034223862
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-b62951dcca0e
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/237262101530372284-299546281563771622
Snapshot Type: cloud
Parent Volume ID: 237262101530372284
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-ee3b13f7c03f
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
You can see three volume snapshots that are part of the group snapshot. The name of each volume snapshot is in the Volume Snapshot Name field. For more details on a volumesnapshot, run:
- Kubernetes
- OpenShift
kubectl get volumesnapshot.volumesnapshot.external-storage.k8s.io/<volume-snapshot-name> -o yaml
oc get volumesnapshot.volumesnapshot.external-storage.k8s.io/<volume-snapshot-name> -o yaml
Retries of group cloud snapshots
If a cloud GroupVolumeSnapshot fails to trigger, it is retried. However, by default, if a cloud GroupVolumeSnapshot fails after it has been triggered or started successfully, it is marked Failed and is not retried.
To change this behavior, you can set the maxRetries field in the spec. In the example below, up to three retries are performed on failure.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  maxRetries: 3
  options:
    portworx/snapshot-type: cloud
When maxRetries is set, the NumRetries field in the status of the groupvolumesnapshot indicates the number of retries performed.
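You can inspect the retry count directly from the status; this sketch assumes the field is serialized as numRetries:
kubectl get groupvolumesnapshot cassandra-group-cloudsnapshot -o jsonpath='{.status.numRetries}'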
Snapshots across namespaces
When creating a group snapshot, you can specify a list of namespaces to which the group snapshot can be restored.
Below is an example of a group cloud snapshot that can be restored into the prod-01 and prod-02 namespaces.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-groupsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud
  restoreNamespaces:
  - prod-01
  - prod-02
Examples
Group cloud snapshot for all cassandra PVCs
In the example below, we take a group snapshot of all PVCs in the default namespace that have the label app: cassandra, and back it up to the S3 endpoint configured in the Portworx cluster.
Step 1: Deploy cassandra statefulset and PVCs
The following spec creates a cassandra StatefulSet with three replicas. Each replica pod uses its own PVC.
##### Portworx storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: portworx-repl2
provisioner: pxd.portworx.com
parameters:
repl: "2"
---
apiVersion: v1
kind: Service
metadata:
labels:
app: cassandra
name: cassandra
spec:
clusterIP: None
ports:
- port: 9042
selector:
app: cassandra
---
apiVersion: "apps/v1"
kind: StatefulSet
metadata:
name: cassandra
spec:
selector:
matchLabels:
app: cassandra
serviceName: cassandra
replicas: 3
template:
metadata:
labels:
app: cassandra
spec:
containers:
- name: cassandra
image: gcr.io/google-samples/cassandra:v12
imagePullPolicy: Always
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7001
name: tls-intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
resources:
limits:
cpu: "500m"
memory: 1Gi
requests:
cpu: "500m"
memory: 1Gi
securityContext:
capabilities:
add:
- IPC_LOCK
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
env:
- name: MAX_HEAP_SIZE
value: 512M
- name: HEAP_NEWSIZE
value: 100M
- name: CASSANDRA_SEEDS
value: "cassandra-0.cassandra.default.svc.cluster.local"
- name: CASSANDRA_CLUSTER_NAME
value: "K8Demo"
- name: CASSANDRA_DC
value: "DC1-K8Demo"
- name: CASSANDRA_RACK
value: "Rack1-K8Demo"
- name: CASSANDRA_AUTO_BOOTSTRAP
value: "false"
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
# These volume mounts are persistent. They are like inline claims,
# but not exactly because the names need to match exactly one of
# the stateful pod volumes.
volumeMounts:
- name: cassandra-data
mountPath: /cassandra_data
# These are converted to volume claims by the controller
# and mounted at the paths mentioned above.
volumeClaimTemplates:
- metadata:
name: cassandra-data
labels:
app: cassandra
annotations:
volume.beta.kubernetes.io/storage-class: portworx-repl2
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 2Gi
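Save the above spec to a file and apply it; the file name is an assumption:
kubectl apply -f cassandra-statefulset.yaml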
Step 2: Wait for all cassandra pods to be running
List the cassandra pods:
- Kubernetes
- OpenShift
kubectl get pods -l app=cassandra
oc get pods -l app=cassandra
NAME READY STATUS RESTARTS AGE
cassandra-0 1/1 Running 0 3m
cassandra-1 1/1 Running 0 2m
cassandra-2 1/1 Running 0 1m
Once all three pods are running, you can also list the cassandra PVCs.
- Kubernetes
- OpenShift
kubectl get pvc -l app=cassandra
oc get pvc -l app=cassandra
NAME                         STATUS    VOLUME                                      CAPACITY   ACCESS MODES   STORAGECLASS     AGE
cassandra-data-cassandra-0   Bound     pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7    2Gi        RWO            portworx-repl2   3m
cassandra-data-cassandra-1   Bound     pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7    2Gi        RWO            portworx-repl2   2m
cassandra-data-cassandra-2   Bound     pvc-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7    2Gi        RWO            portworx-repl2   1m
Step 3: Take the group cloud snapshot
Apply the following spec to take the cassandra group snapshot. Portworx quiesces I/O on all volumes before triggering their snapshots.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud
After you apply the above object, you can check the status of the group snapshot:
- Kubernetes
- OpenShift
kubectl describe groupvolumesnapshot cassandra-group-cloudsnapshot
oc describe groupvolumesnapshot cassandra-group-cloudsnapshot
While the group snapshot is in progress, the status shows InProgress. After it completes, the Stage field shows Final and the Status field shows Successful.
Name: cassandra-group-cloudsnapshot
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"stork.libopenstorage.org/v1alpha1","kind":"GroupVolumeSnapshot","metadata":{"annotations":{},"name":"cassandra-group-cloudsnapshot","nam...
API Version: stork.libopenstorage.org/v1alpha1
Kind: GroupVolumeSnapshot
Metadata:
Cluster Name:
Creation Timestamp: 2019-01-14T20:30:13Z
Generation: 0
Resource Version: 18212101
Self Link: /apis/stork.libopenstorage.org/v1alpha1/namespaces/default/groupvolumesnapshots/cassandra-group-cloudsnapshot
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
Options:
Portworx / Snapshot - Type: cloud
Post Snapshot Rule:
Pre Snapshot Rule:
Pvc Selector:
Match Labels:
App: cassandra
Status:
Stage: Final
Status: Successful
Volume Snapshots:
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/763613271174793816-922960401583326548
Snapshot Type: cloud
Parent Volume ID: 763613271174793816
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-66490f4172c7
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/1081147806034223862-518034075073409747
Snapshot Type: cloud
Parent Volume ID: 1081147806034223862
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-b62951dcca0e
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Data Source:
Portworx Volume:
Snapshot Id: xxxxxxxx-xxxx-xxxx-xxxx-4b6f09463a98/237262101530372284-299546281563771622
Snapshot Type: cloud
Parent Volume ID: 237262101530372284
Task ID: xxxxxxxx-xxxx-xxxx-xxxx-ee3b13f7c03f
Volume Snapshot Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Events: <none>
Above, we can see that the creation of cassandra-group-cloudsnapshot created three volumesnapshots:
- cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
- cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
- cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
These correspond to the PVCs cassandra-data-cassandra-0, cassandra-data-cassandra-1 and cassandra-data-cassandra-2 respectively.
You can also describe these individual volume snapshots:
- Kubernetes
- OpenShift
kubectl describe volumesnapshot cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
oc describe volumesnapshot cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Namespace: default
Labels: <none>
Annotations: <none>
API Version: volumesnapshot.external-storage.k8s.io/v1
Kind: VolumeSnapshot
Metadata:
Cluster Name:
Creation Timestamp: 2019-01-14T20:30:49Z
Owner References:
API Version: stork.libopenstorage.org/v1alpha1
Kind: GroupVolumeSnapshot
Name: cassandra-group-cloudsnapshot
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Resource Version: 18212097
Self Link: /apis/volumesnapshot.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
UID: xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Spec:
Persistent Volume Claim Name: cassandra-data-cassandra-0
Snapshot Data Name: cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-xxxxxxxx-xxxx-xxxx-xxxx-080027ee1df7
Status:
Conditions:
Last Transition Time: 2019-01-14T20:30:49Z
Message: Snapshot created successfully and it is ready
Reason:
Status: True
Type: Ready
Creation Timestamp: <nil>
Events: <none>
Restoring Cloud Snapshots in the Same Cluster
After you create a cloud snapshot, you can restore it to a new PVC or the original PVC.
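With Stork, you can restore a cloud snapshot into a new PVC by referencing the snapshot from a PVC that uses the stork-snapshot-sc StorageClass. A minimal sketch, assuming the mysql-snapshot created earlier and a hypothetical claim name:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restored   # hypothetical name for the restored claim
  annotations:
    # References the VolumeSnapshot to restore from
    snapshot.alpha.kubernetes.io/snapshot: mysql-snapshot
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi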