Install for Huawei Ascend NPU
This chapter covers the end-to-end installation steps for clusters with Huawei Ascend NPUs. For NVIDIA GPUs, see Install for NVIDIA GPU.
Starting from Alauda Build of Hami v2.8, the deploy form exposes an Enable Ascend toggle that lets hami-scheduler schedule Huawei Ascend NPUs alongside (or instead of) NVIDIA GPUs.
Unlike NVIDIA — where turning the toggle on deploys both hami-scheduler and the NVIDIA device plugin — turning Enable Ascend on only configures hami-scheduler with Ascend awareness and ships the hami-scheduler-device ConfigMap that contains the Ascend virtualization templates. The Ascend device plugin DaemonSet is not bundled in the HAMi chart and must be installed separately.
TOC

- Prerequisites
- Step 1: Install Ascend driver and Ascend Docker Runtime
- Step 2: Enable Ascend in the HAMi deploy form
- Step 3: Deploy the HAMi ascend-device-plugin
- Step 4: Verify
- Troubleshooting

Prerequisites
- Cluster administrator access to your ACP cluster
- Kubernetes version: v1.16+
- ACP version: v4.0+
- One or more nodes with a supported Huawei Ascend NPU (Ascend 910B, Ascend 310P, etc.)
Step 1: Install Ascend driver and Ascend Docker Runtime
Alauda Build of NPU Operator is the recommended way to install the Ascend driver, firmware, and Ascend Docker Runtime on the NPU nodes. Follow Installing Alauda Build of NPU Operator and pay attention to the deploy form options:
After the operator finishes, verify the driver on each NPU node:
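A quick check, run directly on an NPU node (`npu-smi` ships with the Ascend driver; the version-file path below is the common default and may differ on your nodes):

```shell
# List the NPUs the driver can see; a healthy node shows every chip
# together with its health state and version information.
npu-smi info

# The driver version file is typically installed under /usr/local/Ascend
# (adjust the path if your installation uses a different prefix).
cat /usr/local/Ascend/driver/version.info
```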
Step 2: Enable Ascend in the HAMi deploy form
- Open Administrator -> Marketplace -> Cluster Plugin, switch to the target cluster, then deploy or update Alauda Build of Hami.
- In the deploy form, turn on Enable Ascend. Enable NVIDIA can be left on (mixed cluster) or turned off (NPU-only cluster).
- Submit the form. After the rollout finishes, `hami-scheduler` is started with `--enable-ascend=true` and the `hami-scheduler-device` ConfigMap includes the Ascend `vnpus` templates that the device plugin in step 3 consumes.
Enable NVIDIA and Enable Ascend are independent. You can turn either of them off, but you should keep at least one device type enabled.
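One way to confirm the rollout took effect (this assumes the chart installs into `kube-system`; adjust the namespace if your deployment differs):

```shell
# The scheduler pod spec should now carry the Ascend flag.
kubectl -n kube-system get pods -o yaml | grep -- --enable-ascend

# The device ConfigMap should contain the Ascend virtualization templates.
kubectl -n kube-system get configmap hami-scheduler-device -o yaml | grep vnpus
```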
Step 3: Deploy the HAMi ascend-device-plugin
- Label every NPU node so the DaemonSet only runs where it is needed:
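For example (replace `<npu-node-name>` with each NPU node's name):

```shell
# The DaemonSet in the manifest below selects nodes carrying ascend=on.
kubectl label node <npu-node-name> ascend=on

# Confirm which nodes now match the selector.
kubectl get nodes -l ascend=on
```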
- Download the DaemonSet manifest `ascend-device-plugin.yaml` and apply it to the business cluster. The manifest creates:
  - A `ServiceAccount` (in the `kube-system` namespace), plus a `ClusterRole` and `ClusterRoleBinding`, all named `hami-ascend`.
  - A `DaemonSet` named `hami-ascend-device-plugin` scheduled to nodes labelled `ascend=on`.
  - A mount of the `hami-scheduler-device` ConfigMap in the DaemonSet pods; this ConfigMap was created by the HAMi chart in step 2 and contains the Ascend virtualization templates (`vnpus`).
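Applying the manifest and waiting for the rollout looks like this (the DaemonSet name follows the manifest described above):

```shell
kubectl apply -f ascend-device-plugin.yaml
kubectl -n kube-system rollout status daemonset/hami-ascend-device-plugin
```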
- To use a custom plugin image (for example, an internal mirror), edit the `image` field in `ascend-device-plugin.yaml` before applying.
Step 4: Verify
- Check that the device plugin pods are running on every NPU node:
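For example:

```shell
# Expect one Running pod per node labelled ascend=on.
kubectl -n kube-system get pods -o wide | grep hami-ascend-device-plugin
```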
- Check that the NPU resources are reported as allocatable (the resource name varies by chip; for example `huawei.com/Ascend910B` or `huawei.com/Ascend310P`):
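For example (replace `<npu-node-name>` with an NPU node's name and the grep pattern with the resource name your chip reports):

```shell
# Print the node's allocatable resources and filter for Ascend entries.
kubectl get node <npu-node-name> -o jsonpath='{.status.allocatable}' | tr ',' '\n' | grep -i ascend
```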
- Run a sample workload to request a vNPU slice. The example below requests one Ascend910B4 chip and limits device memory to 4096 MiB; HAMi picks the smallest virtualization template that satisfies the memory request. For more device-side examples, see the HAMi ascend-device-plugin examples.
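A minimal manifest matching that description might look as follows. The resource names follow the HAMi Ascend convention of `huawei.com/<chip>` plus a paired `-memory` resource; substitute whatever your nodes actually report, and replace the image placeholder with any image that includes the Ascend CANN toolkit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ascend-vnpu-demo
spec:
  containers:
    - name: worker
      image: <your-ascend-enabled-image>      # placeholder: any CANN-capable image
      command: ["sleep", "infinity"]
      resources:
        limits:
          huawei.com/Ascend910B4: 1             # one chip (or vNPU slice of one chip)
          huawei.com/Ascend910B4-memory: "4096" # MiB; HAMi rounds up to the nearest template
```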
For a deeper dive into how vNPU slicing works (templates, memory rounding, and per-card budget accounting), see How Ascend vNPU slicing works.
Troubleshooting
- Pod stays in `Pending` with no Ascend resources reported: confirm the node carries the `ascend=on` label and that the `hami-ascend-device-plugin` pod on that node is `Running`.
- `hami-scheduler-device` ConfigMap is missing the `vnpus` section: re-open the HAMi deploy form and confirm Enable Ascend is on; the section is only rendered when the toggle is enabled.
- Conflicts with Alauda Build of NPU Operator: make sure Ascend Device Plugin was disabled in the operator deploy form. If it was enabled by mistake, redeploy the operator with the option turned off, then reapply the HAMi `ascend-device-plugin.yaml`.
- Need to customize Ascend virtualization templates: edit the `vnpus` section in the `hami-scheduler-device` ConfigMap (`kubectl -n kube-system edit configmap hami-scheduler-device`) and restart the `hami-ascend-device-plugin` pods. Be aware that the change will be lost on the next chart upgrade.
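For orientation when editing, a `vnpus` entry has roughly the following shape (illustrative values only; the templates shipped by your chart version are authoritative):

```yaml
vnpus:
  - chipName: 910B4
    commonWord: Ascend910B4
    resourceName: huawei.com/Ascend910B4          # whole-chip resource
    resourceMemoryName: huawei.com/Ascend910B4-memory
    memoryAllocatable: 32768                      # MiB usable on one card
    memoryCapacity: 32768                         # MiB physically present
    aiCore: 20
    templates:                                    # the slices HAMi may carve from a card
      - name: vir05_1c_16g
        memory: 16384
        aiCore: 5
```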