Install for Huawei Ascend NPU

This chapter covers the end-to-end installation steps for clusters with Huawei Ascend NPUs. For NVIDIA GPUs, see Install for NVIDIA GPU.

Starting from Alauda Build of Hami v2.8, the deploy form exposes an Enable Ascend toggle that lets hami-scheduler schedule Huawei Ascend NPUs alongside (or instead of) NVIDIA GPUs.

Unlike the NVIDIA toggle, which deploys both hami-scheduler and the NVIDIA device plugin, turning Enable Ascend on only configures hami-scheduler with Ascend awareness and ships the hami-scheduler-device ConfigMap that contains the Ascend virtualization templates. The Ascend device plugin DaemonSet is not bundled in the HAMi chart and must be installed separately (step 3).
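
You can see this split on the cluster after deployment: the chart installs the scheduler and the ConfigMap, but no Ascend DaemonSet. A quick way to check (exact resource names may vary with the chart release):

kubectl -n kube-system get deployments,daemonsets,configmaps | grep -i hami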

Prerequisites

  • Cluster administrator access to your ACP cluster
  • Kubernetes version: v1.16+
  • ACP version: v4.0+
  • One or more nodes with a supported Huawei Ascend NPU (Ascend 910B, Ascend 310P, etc.)

Step 1: Install Ascend driver and Ascend Docker Runtime

Alauda Build of NPU Operator is the recommended way to install the Ascend driver, firmware, and Ascend Docker Runtime on the NPU nodes. Follow Installing Alauda Build of NPU Operator and pay attention to the deploy form options:

Component             | Setting  | Reason
--------------------- | -------- | ------
Driver                | Enabled  | Installs the Ascend driver and firmware on each NPU node.
Ascend Docker Runtime | Enabled  | Installs the runtime so containers can mount Ascend devices.
Ascend Device Plugin  | Disabled | HAMi ships its own ascend-device-plugin (installed in step 3). Enabling the operator's plugin will conflict with HAMi's plugin and the hami-scheduler-device ConfigMap.
NPU exporter          | Optional | Enable if you need NPU metrics.

After the operator finishes, verify the driver on each NPU node:

npu-smi info
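
If you prefer not to SSH into each node, a node debug pod can run the same check from the cluster; a sketch, assuming kubectl debug is permitted in your environment and the driver installs npu-smi on the host:

kubectl debug node/{ascend-node} -it --image=busybox -- chroot /host npu-smi info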

Step 2: Enable Ascend in the HAMi deploy form

  1. Open Administrator -> Marketplace -> Cluster Plugin, switch to the target cluster, then deploy or update Alauda Build of Hami.
  2. In the deploy form, turn on Enable Ascend.
  3. Enable NVIDIA can be left on (mixed cluster) or turned off (NPU-only cluster).
  4. Submit the form. After the rollout finishes, hami-scheduler is started with --enable-ascend=true and the hami-scheduler-device ConfigMap includes the Ascend vnpus templates that the device plugin in step 3 consumes.
TIP

Enable NVIDIA and Enable Ascend are independent. You can turn either of them off, but you should keep at least one device type enabled.
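
To confirm the rollout, inspect the scheduler arguments and the ConfigMap; a quick check, assuming the chart labels the scheduler pods with app.kubernetes.io/component=hami-scheduler (mirroring the device plugin label used in step 4):

kubectl -n kube-system get pods -l app.kubernetes.io/component=hami-scheduler -o yaml | grep enable-ascend
kubectl -n kube-system get configmap hami-scheduler-device -o yaml | grep -A 3 vnpus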

Step 3: Deploy the HAMi ascend-device-plugin

  1. Label every NPU node so the DaemonSet only runs where it is needed:

    kubectl label node {ascend-node} ascend=on
  2. Download the DaemonSet manifest ascend-device-plugin.yaml and apply it to the business cluster:

    kubectl apply -f ascend-device-plugin.yaml

    The manifest creates:

    • A ServiceAccount named hami-ascend in the kube-system namespace, plus a cluster-scoped ClusterRole and ClusterRoleBinding of the same name.
    • A DaemonSet named hami-ascend-device-plugin scheduled to nodes labelled ascend=on.
    • A volume mount of the hami-scheduler-device ConfigMap, which was created by the HAMi chart in step 2 and carries the Ascend virtualization templates (vnpus).
  3. To use a custom plugin image (for example, an internal mirror), edit the image field in ascend-device-plugin.yaml before applying.
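
With the manifest applied, confirm the plugin landed on every labelled node; both the node label and the DaemonSet name come from the steps above:

kubectl get nodes -l ascend=on
kubectl -n kube-system get daemonset hami-ascend-device-plugin

The DaemonSet's DESIRED and READY counts should equal the number of nodes returned by the first command.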

Step 4: Verify

  1. Check that the device plugin pods are running on every NPU node:

    kubectl -n kube-system get pods -l app.kubernetes.io/component=hami-ascend-device-plugin -o wide
  2. Check that the NPU resources are reported as allocatable (the resource name varies by chip — for example huawei.com/Ascend910B or huawei.com/Ascend310P):

    kubectl get node {ascend-node} -o=jsonpath='{.status.allocatable}'
  3. Run a sample workload to request a vNPU slice. The example below requests one Ascend910B4 chip and limits memory to 4096 MiB; HAMi picks the smallest virtualization template that satisfies the memory request:

    apiVersion: v1
    kind: Pod
    metadata:
      name: ascend-pod
    spec:
      containers:
        - name: npu-pod
          image: ascendai/pytorch:ubuntu-python3.8-cann8.0.rc1.beta1-pytorch2.1.0
          command: ["sleep", "infinity"]
          resources:
            limits:
              huawei.com/Ascend910B4: "1"
              # If you omit Ascend910B4-memory, the whole NPU is allocated.
              huawei.com/Ascend910B4-memory: "4096"

    For more device-side examples, see the HAMi ascend-device-plugin examples.
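
To exercise the example end to end, save the manifest as ascend-pod.yaml, apply it, and inspect the device from inside the container; a quick smoke test, assuming the Ascend Docker Runtime mounts npu-smi into the container:

kubectl apply -f ascend-pod.yaml
kubectl exec -it ascend-pod -- npu-smi info

The memory reported inside the pod should reflect the virtualization template HAMi selected rather than the full card.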

For a deeper dive into how vNPU slicing works (templates, memory rounding, and per-card budget accounting), see How Ascend vNPU slicing works.

Troubleshooting

  • Pod stays in Pending with no Ascend resources reported: confirm the node carries the ascend=on label and that the hami-ascend-device-plugin pod on that node is Running.
  • hami-scheduler-device ConfigMap is missing the vnpus section: re-open the HAMi deploy form and confirm Enable Ascend is on; the section is only rendered when the toggle is enabled.
  • Conflicts with Alauda Build of NPU Operator: make sure Ascend Device Plugin was disabled in the operator deploy form. If it was enabled by mistake, redeploy the operator with the option turned off, then reapply the HAMi ascend-device-plugin.yaml.
  • Need to customize Ascend virtualization templates: edit the vnpus section in the hami-scheduler-device ConfigMap (kubectl -n kube-system edit configmap hami-scheduler-device) and restart the hami-ascend-device-plugin pods. Be aware that the change will be lost on the next chart upgrade.
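
For orientation when editing, each vnpus entry maps a chip model to its resource names and the templates the scheduler can pick from. The excerpt below is an illustrative sketch modelled on the upstream HAMi device config; field names and template values may differ in your chart version:

vnpus:
  - chipName: 910B                                    # chip model as reported by the driver
    commonWord: Ascend910B                            # short name used in resource names
    resourceName: huawei.com/Ascend910B               # whole-card resource
    resourceMemoryName: huawei.com/Ascend910B-memory  # per-container memory resource (MiB)
    memoryAllocatable: 32768
    memoryCapacity: 32768
    aiCore: 30
    templates:                                        # slices HAMi can carve from one card
      - name: vir02                                   # values here are illustrative
        memory: 2184
        aiCore: 2
      - name: vir04
        memory: 4369
        aiCore: 4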