Manage Shared Block Device (RWX Block) for KubeVirt VMs
The Portworx RWX Block volume is designed and qualified only for use with the KubeVirt VM use case. It isn't intended for use as a generic ReadWriteMany (RWX) block volume type in other scenarios. Contact support to confirm suitability for other use cases (outside of the KubeVirt VM) before deployment.
KubeVirt is an extension for Kubernetes that offers the flexibility to manage traditional VM-based workloads alongside modern containerized applications within a Kubernetes framework.
Portworx supports volumes for virtual machines (VMs) as storage disks in various configurations. To enable OpenShift features such as live migration, these volumes must support the ReadWriteMany (RWX) access mode. Portworx supports the RWX block volume type, which offers improved performance in certain scenarios, and also supports the RWX file system (FS) volume type.
Portworx does not support the following configurations with Shared Raw Block for KubeVirt VMs:
- Synchronous disaster recovery
- Asynchronous disaster recovery for virtual machines (VMs) on Portworx Raw Block volumes that were migrated from another environment, such as VMware
Prerequisites
- An OpenShift cluster that supports KubeVirt.
- OpenShift Virtualization is enabled.
- Ensure that you are running the latest versions of OpenShift Virtualization and the Migration Toolkit for Virtualization Operator that are compatible with your current OpenShift version. Failing to do so may result in issues with live migration operations and other functionalities.
- Review the Known issues.
Create a StorageClass
Create a StorageClass if one does not already exist. The following is an example StorageClass.
OpenShift Virtualization (OSV) versions 4.18.3 and earlier have a known issue that causes discards to be handled incorrectly when used with Portworx block devices. As a workaround, you can disable discards for the Portworx volume by including the parameter `nodiscard: true` in the StorageClass.
- Create the `px-kubevirt-sc.yaml` file:

  ```yaml
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: px-rwx-block-kubevirt
  provisioner: pxd.portworx.com
  parameters:
    repl: "3"
    nodiscard: "true" # Disables discard operations on the block device to help avoid known compatibility issues on OpenShift Container Platform (OCP) versions 4.18 and earlier.
  volumeBindingMode: Immediate
  allowVolumeExpansion: true
  ```

- Run the following command to apply your StorageClass:

  ```shell
  oc apply -f px-kubevirt-sc.yaml
  ```
Create a PVC
- Using the above StorageClass, define a PVC with the following configuration:

  - `accessModes`: `ReadWriteMany` for shared volume access.
  - `volumeMode`: `Block` to create a raw block device volume.

  ```yaml
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: rwx-disk-1
    labels:
      portworx.io/app: kubevirt
  spec:
    accessModes:
      - ReadWriteMany
    resources:
      requests:
        storage: 100Gi
    storageClassName: px-rwx-block-kubevirt
    volumeMode: Block
  ```

- Apply this PVC to your cluster:

  ```shell
  kubectl apply -f pvc.yaml
  ```
When deploying KubeVirt VMs, reference the PVC created in the previous step to attach the Portworx RWX raw block volume to the VM. Ensure the VM configuration specifies the correct StorageClass and volume.
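The following is a minimal sketch of how the PVC can be referenced in a VM specification; the disk name `datadisk-1` and the `virtio` bus are illustrative assumptions, while `rwx-disk-1` matches the example PVC created above:

```yaml
# Fragment of a VirtualMachine spec, under spec.template.spec
domain:
  devices:
    disks:
    - name: datadisk-1              # hypothetical disk name
      disk:
        bus: virtio
volumes:
- name: datadisk-1                  # must match the disk name above
  persistentVolumeClaim:
    claimName: rwx-disk-1           # RWX block PVC created in the previous step
```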
Once the VM is running with the specified PVC, you can perform live migration using OpenShift's native functionality. The shared RWX volume ensures data consistency during the migration process by allowing simultaneous read/write operations for the source and destination nodes.
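Live migration is typically started from the OpenShift console or with `virtctl migrate`. As a minimal declarative sketch, it can also be requested with a `VirtualMachineInstanceMigration` object; the object name `migrate-rwx-vm` is an assumption, and `<vm-name>` is the name of your running VM:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-rwx-vm              # hypothetical name for this migration request
spec:
  vmiName: <vm-name>                # name of the running VirtualMachineInstance to migrate
```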
If you are manually creating a PVC and attaching it to a VM, ensure the following:
- Add the `portworx.io/app: kubevirt` label to the PVC spec, as shown in the example PVC above. This ensures that Portworx applies KubeVirt-specific logic when processing the volume.
- Maintain the same HA or replication factor for all volumes associated with a VM.
VM configuration guidelines for Portworx raw block volumes
When using Portworx RWX block volumes with VMs, specific configurations are required to ensure compatibility and performance. The following guidance outlines considerations for root and data disks, block sizes, and bootloaders.
Block size and bootloader compatibility
The VM disk configuration defaults to a 512-byte block size. The hypervisor handles the translation so that 512-byte operations work over a provisioned Portworx storage disk with a 4096-byte block size, so no configuration changes are needed to ensure block size compatibility.
- Portworx block volumes always use a 4096-byte block size.
- VM disks default to a 512-byte block size unless otherwise specified. Specifying a logical block size of 512 bytes and a physical block size of 4096 bytes in the VM disk specification is an optional configuration detail within the VM that may help applications or file systems optimize performance, if supported.
- VM root disks also contain a bootloader, which can be either EFI or BIOS. BIOS supports booting only from disks with a 512-byte block size. EFI supports booting from disks with either a 4096-byte or 512-byte block size. They are independent bootloading mechanisms. The root disk configuration determines which bootloader is used.
- You can identify the root disk configuration by examining the QCOW2 image or the disk partition table.
- EFI requires an EFI system partition, a GPT partition table, and related components.
- BIOS requires a partition marked as bootable.
- For example, RHEL configures its QCOW2 cloud images to boot using both EFI and BIOS by creating two partitions that contain the required information. Note that not all distributions support this configuration.
VM spec example with custom block size
```yaml
...
spec:
  domain:
    devices:
      disks:
      - bootOrder: 1
        blockSize:
          custom:
            logical: 512
            physical: 4096
        disk:
          bus: virtio
        name: rootdisk
      - name: fio-data-disk-1
        blockSize:
          custom:
            logical: 4096
            physical: 4096
...
```
- Set `bootOrder: 1` to indicate the root disk.
- If `blockSize` is not specified, the default is 512 for both logical and physical sizes.
- For additional data disks, specifying both logical and physical block sizes as 4096 is recommended for improved performance.
VM bootloader spec example
Portworx supports both UEFI and BIOS bootloaders.

- To use UEFI, define the `bootloader` section with `efi: {}`.
- If no `bootloader` section is present, BIOS is used by default.
```yaml
...
spec:
  domain:
    firmware:
      bootloader:
        efi: {}
...
```
Supported configuration matrix
VM Root disk
| S.No | Portworx block size (bytes) | VM physical block size (bytes) | VM logical block size (bytes) | Bootloader (root disk only) | Supported |
|---|---|---|---|---|---|
| 1 | 4096 | 4096 / 512 | 512 | BIOS | Yes |
| 2 | 4096 | 4096 / 512 | 512 | EFI | Yes |
| 3 | 4096 | 4096 | 4096 | BIOS | No |
| 4 | 4096 | 4096 | 4096 | EFI only (qcow2 with 4K block size) | Yes |
Additional disks
| S.No | Portworx block size (bytes) | VM physical block size (bytes) | VM logical block size (bytes) | Supported |
|---|---|---|---|---|
| 1 | 4096 | 4096 | 512 | Yes |
| 2 | 4096 | 4096 | 4096 | Yes (Recommended) |
Create a VM
Refer to the applicable version of the OpenShift documentation and KubeVirt user guide to create a KubeVirt VM.
Once the VMs are created, each VM will start running in a `virt-launcher` pod.
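The following is a minimal sketch of a VirtualMachine manifest that ties the pieces on this page together. The VM name, memory sizing, run strategy, the data disk name, and the root-disk PVC name `rwx-root-disk` (assumed to already contain a bootable OS image, for example cloned from a golden PVC) are illustrative assumptions; the bootloader, block sizes, and data PVC reference follow the examples above:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: rwx-block-vm                 # hypothetical VM name
spec:
  runStrategy: Always                # start the VM as soon as it is created
  template:
    spec:
      domain:
        firmware:
          bootloader:
            efi: {}                  # UEFI boot; omit this section to boot with BIOS
        memory:
          guest: 4Gi                 # illustrative sizing
        devices:
          disks:
          - name: rootdisk
            bootOrder: 1
            blockSize:
              custom:
                logical: 512
                physical: 4096
            disk:
              bus: virtio
          - name: datadisk-1
            blockSize:
              custom:
                logical: 4096
                physical: 4096
            disk:
              bus: virtio
      volumes:
      - name: rootdisk
        persistentVolumeClaim:
          claimName: rwx-root-disk   # hypothetical RWX block PVC containing the OS image
      - name: datadisk-1
        persistentVolumeClaim:
          claimName: rwx-disk-1      # RWX block data PVC created earlier on this page
```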
Manage KubeVirt VMs during Portworx node upgrades
When upgrading Portworx on a node, the Portworx Operator manages KubeVirt VMs by initiating a live migration before the upgrade begins. Here’s what happens during this process:
- Eviction notice: As the operator attempts to evict virtual machines (VMs) from a node, it generates the following event if it is unable to migrate the VMs:

  Warning: UpdatePaused - The update of the storage node <node-name> has been paused because there are 3 KubeVirt VMs running on the node. Portworx will live-migrate the VMs and the storage node will be updated after there are no VMs left on this node.

- Migration failure: If the operator cannot successfully live-migrate a VM, the upgrade is paused, and the following event is recorded:

  Warning: FailedToEvictVM - Live migration <migration-name> failed for VM <vm-namespace>/<vm-name> on node <node-name>. Please stop or migrate the VM manually to continue the update of the storage node.
Known issues
- A known issue in `libvirt` affects the use of 4K block volumes and may cause VMs to pause due to I/O errors. This issue is resolved in OpenShift Container Platform (OCP) version 4.16 or later. For more information, see the Red Hat solution.
- OpenShift Virtualization (OSV) versions 4.18.3 and earlier contain a known issue in which discards are handled incorrectly when used with Portworx block devices. As a workaround, disable discards on the Portworx volume by including the parameter `nodiscard: true` in the StorageClass. The `nodiscard` setting can also be toggled after PVC creation by using the command `pxctl volume update --nodiscard on <your_pvc_name>`. After updating, restart the pod for the change to take effect.
- When VMs are migrated from a VMware environment to a KubeVirt environment with `VolumeMode: Block`, the persistent volume claims (PVCs) may appear to have no free space remaining or to be thick-provisioned. This can increase pool usage and may trigger unnecessary volume expansion.
- Golden image and golden PVC behavior

  A golden image is a preconfigured virtual machine disk image used as a standard template to create consistent VMs.

  A golden PVC is a pre-provisioned persistent volume claim that contains a golden image. These PVCs are cloned to provision new VMs using a consistent and repeatable method.

  - Golden PVCs created via HTTP import can behave like thick-provisioned volumes, consuming more physical storage than the actual image size. Cloned volumes from these PVCs, particularly when using Portworx Raw Block volumes, may exhibit increased capacity usage and degraded performance compared to sharedv4 volumes. To optimize space efficiency, run defragmentation inside the guest VM.
  - VMs and their associated PVCs may be scheduled on the same node as the golden PVC. This can lead to resource bottlenecks. To avoid this, maintain multiple golden PVCs.
  - If a golden PVC is created with a replication factor of 3 and a node or Portworx restart occurs during VM creation, the resulting cloned PVC may be created with a replication factor of 2. There is no automatic correction, but replication can be manually restored using `repl add` on the affected PVC.
  - When multiple VMs are created at the same time from a single golden PVC, some VMs may display a "Running" status while the guest operating system is unresponsive. To recover the VM, perform a power cycle using the OpenShift UI or the `virtctl` CLI.
  - When multiple VM disks are cloned from a single template PVC, all the resulting clone volumes reside on the same set of replica nodes. This can create I/O hot spots and degrade performance for those nodes. To reduce this risk, create multiple template PVCs distributed across different nodes and round-robin between them when provisioning new VMs.