Manage storage for KubeVirt VMs in ARO
This feature is under Directed Availability. Please engage with your Portworx representative if you are interested and need to enable it in your environment under the current guidelines.
KubeVirt is an extension for Kubernetes that offers the flexibility to manage traditional VM-based workloads alongside modern containerized applications within a Kubernetes framework.
Portworx provides resources that VMs can use for both their initial startup process and for retaining data even when they are not running. To utilize OpenShift features, such as live migration, these volumes must have the `ReadWriteMany` access mode.
Follow the instructions on this page to create a StorageClass, which you can use to create the necessary PVCs.
Prerequisites
- An OpenShift cluster that supports KubeVirt.
- OpenShift Virtualization is enabled.
Create a StorageClass
To ensure PVCs are compatible with KubeVirt virtual machines, they must be configured with the `ReadWriteMany` access mode and use NFS version 3.0 with the `nolock` mount option, as shown in the `sharedv4_mount_options` parameter below. To meet these requirements, create PVCs from a StorageClass with the following parameters configured:

```yaml
sharedv4: "true"
sharedv4_mount_options: vers=3.0,nolock
```
- Create the `px-kubevirt-sc.yaml` file:

  ```yaml
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: portworx-rwx-kubevirt
  provisioner: pxd.portworx.com
  parameters:
    repl: "3"
    sharedv4: "true"
    sharedv4_mount_options: vers=3.0,nolock
  volumeBindingMode: WaitForFirstConsumer
  allowVolumeExpansion: true
  ```

  Note:
  - The `volumeBindingMode=WaitForFirstConsumer` flag enables Portworx to intelligently place the volumes. For more information, see the KubeVirt page.
  - PVCs used directly by the VMs should not include the `cdi.kubevirt.io/storage.bind.immediate.requested=true` annotation, because this annotation overrides the `WaitForFirstConsumer` setting in the StorageClass.
  - When migrating from vSphere to OpenShift Container Platform with Forklift or the Migration Toolkit for Virtualization, use `volumeBindingMode=Immediate` for a successful migration.
- Run the following command to apply your StorageClass:

  ```
  oc apply -f px-kubevirt-sc.yaml
  ```
Portworx optimizes volume placement and access for KubeVirt VMs using principles of hyperconvergence and collocation:
- Shared volumes for live migration: Live migration requires two `virt-launcher` pods to run simultaneously, with the same volume mounted in both pods. The type of shared volume, either bind-mounted or NFS-mounted, depends on where the volume is attached when the `virt-launcher` pod is created:
  - Bind-mount: Used when the volume is attached on the same node where the `virt-launcher` pod starts, optimizing performance through hyperconvergence.
  - NFS-mount: Used when the volume is attached on a different node. Volumes can switch between bind-mount and NFS-mount by live-migrating or restarting the VM.
- Collocation: Portworx ensures that multiple volumes used by a single KubeVirt VM are placed on the same set of replica nodes. This automatic collocation simplifies achieving hyperconvergence.
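To see where a volume's replicas were placed, you can inspect it from a Portworx node with `pxctl`; the volume name below is a placeholder:

```
pxctl volume inspect pvc-xxxx-xxx-xxx
```

The output includes a section listing the nodes that hold each replica; volumes belonging to the same VM should show the same set of replica nodes.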
Create a PVC
You can create PVCs using one of the following methods, and Portworx will automatically recognize them as KubeVirt volumes. Ensure these PVCs are configured with the `RWX` access mode:
- Virtualization tab in the OpenShift web console
- Konveyor Forklift or Migration Toolkit for Virtualization
- Containerized Data Importer's (CDI) DataVolume
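As an example of the CDI method, the following DataVolume sketch imports a VM image into an `RWX` PVC backed by the StorageClass created earlier; the DataVolume name and image URL are hypothetical placeholders:

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-rootdisk                    # hypothetical name
  namespace: <vm-namespace>
spec:
  source:
    http:
      url: "https://example.com/images/fedora.qcow2"   # hypothetical image URL
  pvc:
    accessModes:
    - ReadWriteMany                        # required for live migration
    storageClassName: portworx-rwx-kubevirt
    resources:
      requests:
        storage: 10Gi
```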
Once PVCs are created, run the following command to verify that they have the `RWX` access mode:

```
oc get pvc -n <vm-namespace>
NAME                  STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS            AGE
<your-kubevirt-pvc>   Bound    pvc-xxxx-xxx-xxx   1Gi        RWX            portworx-rwx-kubevirt   15h
```

The output should show the PVCs with the `RWX` access mode.
If you are creating a PVC using some other mechanism, ensure the following:
- Add the `portworx.io/app: kubevirt` annotation to the PVC spec, as shown in the sketch after this list. This ensures that Portworx applies KubeVirt-specific logic when processing the volume.
- Maintain the same HA or replication factor for all volumes associated with a VM.
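A minimal sketch of such a manually created PVC, assuming the StorageClass from earlier; the PVC name is a hypothetical placeholder:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-kubevirt-data                   # hypothetical name
  namespace: <vm-namespace>
  annotations:
    portworx.io/app: kubevirt              # tells Portworx to apply KubeVirt-specific logic
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: portworx-rwx-kubevirt
  resources:
    requests:
      storage: 10Gi
```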
Create a VM
Refer to the applicable version of the OpenShift documentation to create a KubeVirt VM.
For OpenShift version 4.14 or newer, add the `cdi.kubevirt.io/storage.usePopulator: "false"` annotation to your VM spec, as shown below:
```yaml
dataVolumeTemplates:
- metadata:
    annotations:
      cdi.kubevirt.io/storage.usePopulator: "false"
```
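For context, a minimal VirtualMachine sketch with this annotation in place might look like the following; the VM name, image URL, and sizing are hypothetical placeholders:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora-vm                          # hypothetical name
spec:
  running: true
  dataVolumeTemplates:
  - metadata:
      name: fedora-vm-rootdisk
      annotations:
        cdi.kubevirt.io/storage.usePopulator: "false"
    spec:
      source:
        http:
          url: "https://example.com/images/fedora.qcow2"   # hypothetical image URL
      pvc:
        accessModes:
        - ReadWriteMany
        storageClassName: portworx-rwx-kubevirt
        resources:
          requests:
            storage: 10Gi
  template:
    spec:
      domain:
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
        resources:
          requests:
            memory: 2Gi
      volumes:
      - name: rootdisk
        dataVolume:
          name: fedora-vm-rootdisk
```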
Once the VMs are created, each VM will start running in a `virt-launcher` pod.
KubeVirt facilitates the live migration of VMs with Portworx `ReadWriteMany` volumes. However, the underlying libvirtd lacks this capability, prohibiting such live migrations. To address this, the Stork webhook controller modifies the `virt-launcher` pod manifest. It achieves this by inserting a special shared library through `LD_PRELOAD`. This library intercepts the `statfs()` system call made by libvirtd when accessing a Portworx volume. Here is the code of this shared library.
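As an illustration of the general `LD_PRELOAD` interposition technique only (this is not the actual Stork library linked above), a minimal sketch might look like the following; the mount-path prefix checked here and the reported filesystem type are hypothetical assumptions:

```c
/* Hypothetical sketch of an LD_PRELOAD interposer for statfs(); the real
 * library used by Stork differs.
 * Build: gcc -shared -fPIC -o libpx_statfs.so px_statfs.c -ldl
 * Use:   LD_PRELOAD=/path/to/libpx_statfs.so libvirtd ... */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <sys/vfs.h>
#include <linux/magic.h>

static int (*real_statfs)(const char *, struct statfs *);

int statfs(const char *path, struct statfs *buf)
{
    if (!real_statfs) {
        /* Resolve the real statfs() from the next object in the link chain (libc). */
        real_statfs = (int (*)(const char *, struct statfs *))dlsym(RTLD_NEXT, "statfs");
    }
    int rc = real_statfs(path, buf);

    /* Hypothetical check: if the path lives under the kubelet mount tree used
     * by Portworx volumes, report the filesystem as NFS so that libvirtd
     * treats it as shared storage and permits live migration. */
    if (rc == 0 && strncmp(path, "/var/lib/kubelet/", 17) == 0) {
        buf->f_type = NFS_SUPER_MAGIC;
    }
    return rc;
}
```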
Portworx ensures:
- The newly created VMs (even those created with operators such as Konveyor Forklift or the Migration Toolkit for Virtualization) have their volumes collocated during creation. Stork will schedule the VMs on nodes where volume replicas exist, making the VMs hyperconverged (bind-mounted).
- During planned node maintenance, OpenShift will live-migrate the VMs out of that node. When OpenShift reboots the node, Portworx will perform a sharedv4 service (NFS) failover, and as part of this failover, it will live-migrate the VMs to ensure they are hyperconverged once again.
- Existing VMs with non-collocated volumes will be identified and corrected by a background job.
Manage KubeVirt VMs during Portworx node upgrades
When upgrading Portworx on a node, the Portworx Operator manages KubeVirt VMs by initiating a live migration before the upgrade begins. Here’s what happens during this process:
- Eviction notice: As the operator attempts to evict VMs from a node, it generates an event on the storage node stating:

  ```
  Warning: UpdatePaused - The update of the storage node <node-name> has been paused because there are 3 KubeVirt VMs running on the node. Portworx will live-migrate the VMs, and the update will proceed once no VMs remain on this node.
  ```
- Migration failure: If the operator cannot successfully live-migrate a VM, the upgrade is paused, and the following event is recorded:

  ```
  Warning: FailedToEvictVM - Live migration <migration-name> failed for VM <vm-namespace>/<vm-name> on node <node-name>. Please stop or migrate the VM manually to continue the update of the storage node.
  ```
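To check for these events, you can list warning events in the namespace where Portworx is installed; the namespace below is an assumption, so adjust it to match your deployment:

```
oc get events -n portworx --field-selector type=Warning
```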
Manage KubeVirt VMs with adjusted filesystem overhead
When creating VMs from templates on Portworx filesystem volumes, you might encounter an error due to insufficient filesystem overhead. To resolve this issue, follow the steps below to increase the overhead:
When you add a virtual machine disk to a PVC that uses the filesystem volume mode, you must ensure that there is enough space on the PVC for the VM disk and for file system overhead, such as metadata. By default, OpenShift Virtualization reserves 5.5% of the PVC space for overhead, reducing the space available for virtual machine disks by that amount. You can configure a different overhead value by editing the HyperConverged Operator (HCO) object.
Prerequisite
- Install the OpenShift CLI (`oc`).
The following procedure explains how to change the default file system overhead value to 8%:
- Edit the HCO object:

  ```
  oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
  ```
- Populate the fields to set the overhead to 8%. For example:

  ```yaml
  spec:
    filesystemOverhead:
      global: "0.08"
      storageClass:
        <storage_class_name>: "0.08"
  ```

  where:
  - `global`: The default file system overhead percentage used for any storage classes that do not already have a set value. Setting `global: "0.08"` reserves 8% of the PVC for file system overhead.
  - `<storage_class_name>`: If you want to set the overhead for a specific storage class to 8%, replace `<storage_class_name>` with the name of your storage class.
- Save and exit the editor to update the HCO object.
- Verify changes to CDIConfig:

  ```
  oc get cdiconfig -o yaml
  ```
- View your specific changes to CDIConfig:

  ```
  oc get cdiconfig -o jsonpath='{.items..status.filesystemOverhead}'
  ```
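If the change was applied, the output should resemble the following sketch; your storage class names will differ:

```
{"global":"0.08","storageClass":{"portworx-rwx-kubevirt":"0.08"}}
```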
By following these steps, you can adjust the filesystem overhead to 8%, ensuring that there is enough space on the PVC for the VM disk and file system overhead. For further details, refer to the Red Hat article.
Opt out of Live Migration
If you prefer not to have the operator live-migrate VMs during the Portworx upgrade, add the `operator.libopenstorage.org/evict-vms-during-update: "false"` annotation to the StorageCluster object. This annotation can also be used to resume an update that has been paused due to the presence of KubeVirt VMs.
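For example, you can set the annotation with `oc annotate`; the StorageCluster name and namespace below are placeholders for your deployment:

```
oc -n <px-namespace> annotate storagecluster <storagecluster-name> operator.libopenstorage.org/evict-vms-during-update=false --overwrite
```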
This feature is designed to upgrade one node at a time, which is the default setting. Upgrading multiple nodes simultaneously can lead to VMs being paused or restarted, and may stall the upgrade process.
Further Reading
For more details on virtual machine live migration, refer to the Virtual machine live migration documentation.