Physical GPU Passthrough Environment Preparation

Physical GPU passthrough assigns a physical Graphics Processing Unit (GPU) on the host directly to a virtual machine in a virtualization environment. The virtual machine accesses and uses the physical GPU directly, achieving graphics performance close to that of running on a physical machine. This avoids the performance bottleneck of an emulated virtual graphics adapter and thus improves overall performance.

Constraints and Limitations

The physical GPU passthrough functionality requires the kubevirt-gpu-device-plugin; however, there is currently no ARM64 image available for the kubevirt-gpu-device-plugin, so this functionality cannot be used on nodes with an ARM64 CPU architecture.
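
Before proceeding, you can confirm the CPU architecture of the node. This is a generic check, not part of the official procedure.

uname -m # expect x86_64 (amd64); aarch64/arm64 nodes cannot use this feature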

Prerequisites

Chart and Image Preparation

Obtain the following Chart and images and upload them to an image repository. This document uses build-harbor.example.cn as an example repository address. For the specific method of obtaining the Chart and images, please contact the relevant personnel.

Chart

  • build-harbor.example.cn/example/chart-gpu-operator:v23.9.1

Images

  • build-harbor.example.cn/3rdparty/nvidia/gpu-operator:v23.9.0
  • build-harbor.example.cn/3rdparty/nvidia/cloud-native/gpu-operator-validator:v23.9.0
  • build-harbor.example.cn/3rdparty/nvidia/cuda:12.3.1-base-ubi8
  • build-harbor.example.cn/3rdparty/nvidia/kubevirt-gpu-device-plugin:v1.2.4
  • build-harbor.example.cn/3rdparty/nvidia/nfd/node-feature-discovery:v0.14.2

Enabling IOMMU

The procedure for enabling IOMMU varies across different operating systems. Please refer to the documentation of the corresponding operating system. This document uses CentOS as an example, and all commands should be executed in the terminal.

  1. Edit the /etc/default/grub file and add intel_iommu=on iommu=pt to the GRUB_CMDLINE_LINUX configuration option. These parameters apply to Intel CPUs; see the note after this list for AMD hosts.

    GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet intel_iommu=on iommu=pt"
  2. Execute the following command to generate the grub.cfg file.

    grub2-mkconfig -o /boot/grub2/grub.cfg
  3. Restart the server.

  4. Run the following command to confirm whether IOMMU has been enabled. If the output contains IOMMU enabled, it has been enabled successfully.

    dmesg | grep -i iommu
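
Note: intel_iommu=on applies to Intel CPUs only. On AMD hosts, the IOMMU (AMD-Vi) is generally enabled by default on recent kernels; the commonly used equivalent GRUB line is sketched below for the same CentOS example (the grub2-mkconfig and reboot steps are unchanged). Treat this as a hint to verify against your hardware and OS documentation rather than a tested procedure.

GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet amd_iommu=on iommu=pt"

On either vendor, dmesg | grep -i -e DMAR -e IOMMU shows the IOMMU initialization messages after reboot.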

Operating Steps

Note: All commands below should be executed in the CLI tool on the corresponding cluster Master node unless otherwise specified.

Create Namespace

Execute the following command to create a namespace named gpu-system. If the output displays namespace/gpu-system created, it indicates that the creation was successful.

kubectl create ns gpu-system
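
If the namespace may already exist, for example during a re-installation, the idempotent variant below avoids an AlreadyExists error. This is a standard kubectl pattern rather than a required step.

kubectl create ns gpu-system --dry-run=client -o yaml | kubectl apply -f -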

Deploy gpu-operator

  1. Execute the following command to deploy the gpu-operator.

    export REGISTRY=<registry> # Replace <registry> with the repository address where the gpu-operator image is located, e.g.: export REGISTRY=build-harbor.example.cn
    cat <<EOF | kubectl create -f -
    apiVersion: operator.alauda.io/v1alpha1
    kind: AppRelease
    metadata:
      annotations:
        auto-recycle: "true"
        interval-sync: "true"
      name: gpu-operator
      namespace: gpu-system
    spec:
      destination:
        cluster: ""
        namespace: "gpu-operator"
      source:
        charts:
        - name: <chartName> # Replace <chartName> with the actual chart path, e.g.: name: example/chart-gpu-operator
          releaseName: gpu-operator
          targetRevision: v23.9.1
        repoURL: $REGISTRY
        timeout: 120
      values:
        global:
          registry:
            address: $REGISTRY
        nfd:
          enabled: true
        sandboxWorkloads:
          enabled: true
          defaultWorkload: "vm-passthrough"
    EOF
  2. Execute the following command to check if the gpu-operator has synchronized. If SYNC shows as Synced, it indicates that it has synchronized successfully.

    kubectl -n gpu-system get apprelease gpu-operator

    Output information:

    NAME           SYNC     HEALTH   MESSAGE        UPDATE   AGE
    gpu-operator   Synced   Ready    chart synced   28s      32s
  3. Execute the following command to retrieve the names of all nodes and find the GPU node name.

    kubectl get nodes -o wide
  4. Execute the following command to check whether the GPU node has any passthrough-capable GPU. If the output contains GPU information similar to nvidia.com/GK210GL_TESLA_K80, passthrough-capable GPUs are present. (A loop that checks all nodes at once is sketched after this list.)

    kubectl get node <gpu-node-name> -o jsonpath='{.status.allocatable}' # Replace <gpu-node-name> with the GPU node name obtained from Step 3

    Output information:

    {"cpu":"39","devices.kubevirt.io/kvm":"1k","devices.kubevirt.io/tun":"1k","devices.kubevirt.io/vhost-net":"1k","ephemeral-storage":"426562784165","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"122915848Ki","nvidia.com/GK210GL_TESLA_K80":"8","pods":"256"}
  5. At this point, the gpu-operator has been successfully deployed.
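
Checking nodes one at a time with the jsonpath command above gets tedious on large clusters. The loop below is a small convenience sketch, not part of the official procedure, that prints every allocatable nvidia.com/* resource per node:

for node in $(kubectl get nodes -o name); do
  echo "== ${node} =="
  # Print only the nvidia.com entries from the node's allocatable resources
  kubectl get "${node}" -o jsonpath='{.status.allocatable}' | tr ',' '\n' | grep 'nvidia.com' || echo "(no GPU resources)"
done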

Configure Kubevirt

  1. Execute the following command to enable the disableMDevConfiguration feature gate. If a message similar to hyperconverged.hco.kubevirt.io/kubevirt-hyperconverged patched is returned, it was enabled successfully.

    kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='[{"op": "add", "path": "/spec/featureGates/disableMDevConfiguration", "value": true }]'
  2. In the terminal of the GPU node, execute the following command to obtain the pciDeviceSelector. The 10de:102d part of the output is the value of pciDeviceSelector.

    lspci -nn | grep -i nvidia

    Output information:

    04:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    05:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    08:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    09:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    85:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    86:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    89:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
    8a:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
  3. Execute the following command to retrieve the names of all nodes and find the GPU node name.

    kubectl get nodes -o wide
  4. Execute the following command to obtain the resourceName. The nvidia.com/GK210GL_TESLA_K80 part in the output is the value of resourceName.

    kubectl get node <gpu-node-name> -o jsonpath='{.status.allocatable}' # Replace <gpu-node-name> with the GPU node name obtained from Step 3

    Output information:

    {"cpu":"39","devices.kubevirt.io/kvm":"1k","devices.kubevirt.io/tun":"1k","devices.kubevirt.io/vhost-net":"1k","ephemeral-storage":"426562784165","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"122915848Ki","nvidia.com/GK210GL_TESLA_K80":"8","pods":"256"}
  5. Execute the following command to add the passthrough GPU.

    Note: When replacing the <pci-devices-id> part in the command below with the pciDeviceSelector value obtained in Step 2, all letters in the pciDeviceSelector must be converted to uppercase. For example, if the pciDeviceSelector value obtained is 10de:102d, it should be replaced with export DEVICE=10DE:102D.

    • Adding a single GPU card

      export DEVICE=<pci-devices-id> # Replace <pci-devices-id> with the pciDeviceSelector obtained in Step 2, e.g.: export DEVICE=10DE:102D
      export RESOURCE=<resource-name> # Replace <resource-name> with the resourceName obtained in Step 4, e.g.: export RESOURCE=nvidia.com/GK210GL_TESLA_K80
      kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='
      [
        {
          "op": "add",
          "path": "/spec/permittedHostDevices",
          "value": {
            "pciHostDevices": [
              {
                "externalResourceProvider": true,
                "pciDeviceSelector": "'"$DEVICE"'",
                "resourceName": "'"$RESOURCE"'"
              }
            ]
          }
        }
      ]'
    • Adding multiple GPU cards

      Note: When adding multiple GPU cards, each pciDeviceSelector value used to replace <pci-devices-id> must be unique.

      export DEVICE1=<pci-devices-id1> # Replace <pci-devices-id1> with the pciDeviceSelector obtained in Step 2
      export RESOURCE1=<resource-name1> # Replace <resource-name1> with the resourceName obtained in Step 4
      export DEVICE2=<pci-devices-id2> # Replace <pci-devices-id2> with the pciDeviceSelector obtained in Step 2
      export RESOURCE2=<resource-name2> # Replace <resource-name2> with the resourceName obtained in Step 4
      kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='
      [
        {
          "op": "add",
          "path": "/spec/permittedHostDevices",
          "value": {
            "pciHostDevices": [
              {
                "externalResourceProvider": true,
                "pciDeviceSelector": "'"$DEVICE1"'",
                "resourceName": "'"$RESOURCE1"'"
              },
              {
                "externalResourceProvider": true,
                "pciDeviceSelector": "'"$DEVICE2"'",
                "resourceName": "'"$RESOURCE2"'"
              }
            ]
          }
        }
      ]'
    • Adding new GPU cards after GPU cards have already been added (an append-style alternative that avoids computing the index is sketched after this list)

      export DEVICE=<pci-devices-id> # Replace <pci-devices-id> with the pciDeviceSelector obtained in Step 2
      export RESOURCE=<resource-name> # Replace <resource-name> with the resourceName obtained in Step 4
      export INDEX=<index> # Zero-based array index. For example, if one GPU card has already been added and you are adding another, use export INDEX=1
      kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='
      [
        {
          "op": "add",
          "path": "/spec/permittedHostDevices/pciHostDevices/'"${INDEX}"'",
          "value": {
            "externalResourceProvider": true,
            "pciDeviceSelector": "'"$DEVICE"'",
            "resourceName": "'"$RESOURCE"'"
          }
        }
      ]'
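
Two optional follow-ups, assuming the DEVICE and RESOURCE variables exported above. First, JSON Patch (RFC 6902) accepts - as the array index meaning "append to the end", which avoids computing INDEX by hand; this is an alternative sketch, not the documented procedure:

kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='
[
  {
    "op": "add",
    "path": "/spec/permittedHostDevices/pciHostDevices/-",
    "value": {
      "externalResourceProvider": true,
      "pciDeviceSelector": "'"$DEVICE"'",
      "resourceName": "'"$RESOURCE"'"
    }
  }
]'

Second, you can inspect the currently permitted devices before and after patching:

kubectl get hco kubevirt-hyperconverged -n kubevirt -o jsonpath='{.spec.permittedHostDevices}'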

Result Verification

After completing the above configuration steps, if the corresponding physical GPU can be selected when creating the virtual machine, it indicates that the physical GPU passthrough environment has been successfully prepared.

Note: The passthrough configuration in the preceding steps must be completed in advance; otherwise the physical GPU cannot be selected when creating a virtual machine.

  1. Go to Container Platform.

  2. In the left navigation bar, click Virtualization > Virtual Machines.

  3. Click Create Virtual Machine.

  4. Configure the Physical GPU (Alpha) parameter for the virtual machine.

    Parameter               Description
    Physical GPU (Alpha)    Select the model of the configured physical GPU. Only one physical GPU can be assigned to each virtual machine.
  5. At this point, the physical GPU passthrough environment has been successfully prepared. (For reference, a YAML sketch of the equivalent hostDevices configuration is shown after this list.)
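
For readers who create virtual machines from YAML rather than the form, selecting a physical GPU corresponds to a hostDevices entry in the KubeVirt VirtualMachine spec. The fragment below is a minimal sketch, not a complete manifest: the other required VM fields are omitted, the deviceName must match the resourceName configured earlier, and the metadata name and device alias are illustrative.

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-with-gpu # illustrative name
spec:
  template:
    spec:
      domain:
        devices:
          hostDevices:
          - deviceName: nvidia.com/GK210GL_TESLA_K80 # must match the configured resourceName
            name: gpu1 # arbitrary alias for this device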

Delete the Virtual Machine with Passthrough GPU

  1. Go to Container Platform.

  2. In the left navigation bar, click Virtualization > Virtual Machines.

  3. On the list page, click ⋮ > Delete on the right side of the virtual machine to be deleted; or click the virtual machine's name to open its details page, then click Actions > Delete.

  4. Input the confirmation information to delete the virtual machine with passthrough GPU.

  5. On the Master node of the cluster where the GPU resides, use the CLI tool to execute the following command to remove the GPU-related configuration from KubeVirt. (This removes all permitted host devices at once; a sketch for removing a single entry follows this list.)

    kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='[{"op": "remove", "path": "/spec/permittedHostDevices"}]'
  6. After deletion, if the corresponding physical GPU model can no longer be selected when creating a virtual machine through Container Platform, the deletion was successful. Please refer to Select Physical GPU Model for the specific steps to create a virtual machine.
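
The remove patch above deletes the entire permittedHostDevices block, that is, every passthrough device at once. To remove only a single entry, target its zero-based array index instead; the index 0 below is an assumption, adjust it to the entry to be removed:

kubectl patch hco kubevirt-hyperconverged -n kubevirt --type='json' -p='
[
  {"op": "remove", "path": "/spec/permittedHostDevices/pciHostDevices/0"}
]'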

Uninstall gpu-operator

  1. Use the CLI tool on the corresponding cluster Master node for the GPU to execute the following command to uninstall the gpu-operator.

    kubectl -n gpu-system delete apprelease gpu-operator

    Output information:

    apprelease.operator.alauda.io "gpu-operator" deleted
  2. Execute the command, and if you receive a response similar to the one below, it indicates that the gpu-operator has been successfully uninstalled.

    kubectl -n gpu-system get apprelease gpu-operator

    Output information:

    Error from server (NotFound): appreleases.operator.alauda.io "gpu-operator" not found