Upgrade

This article describes how to upgrade from GPU-manager or an older Hami release (version 2.5) to the newest Hami version.


GPU-manager to Hami

Note

  1. GPU-manager and Hami cannot be deployed on the same node, but they can be deployed in the same cluster.
  2. During the upgrade, applications must be modified one by one, and each modification causes the business pod to restart.
  3. If you have only one GPU node, you must uninstall GPU-manager from it and then install Hami. When both plugins are deployed in the cluster, you can do this by changing the node labels. For example, remove the nvidia-device-enable=vgpu label to delete the GPU-manager instance on the node, and then add the gpu=on label to deploy the Hami plugin on that node.

Procedure

Modify your applications one by one. For example:

Your old GPU-manager instance:

spec:
  containers:
    - image: your-image
      imagePullPolicy: IfNotPresent
      name: gpu
      resources:
        limits:
          cpu: '2'
          memory: 4Gi
          tencent.com/vcuda-core: "50"
          tencent.com/vcuda-memory: "8000"

Migrate to Hami:

spec:
  containers:
    - image: your-image
      imagePullPolicy: IfNotPresent
      name: gpu
      resources:
        limits:
          cpu: '2'
          memory: 4Gi
          nvidia.com/gpualloc: 1     # Request 1 physical GPU (required)
          nvidia.com/gpucores: "50"  # Request 50% of the compute resources per GPU (optional)
          nvidia.com/gpumem: 8000    # Request 8000MB of video memory per GPU (optional)
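
When many workloads need the same change, the key renaming shown above can be scripted. The following is a minimal sketch, not part of the Hami or GPU-manager tooling: the helper name migrate_limits is hypothetical, and it assumes that values carry over unchanged (as in the example above) and that each container uses a single physical GPU. Check the resulting values against your own workloads before applying them.

```python
# Hypothetical helper: rewrites a GPU-manager "limits" mapping into Hami form.
# Assumptions: values carry over unchanged, as in the example above, and the
# container requests a single physical GPU.

GPU_MANAGER_TO_HAMI = {
    "tencent.com/vcuda-core": "nvidia.com/gpucores",
    "tencent.com/vcuda-memory": "nvidia.com/gpumem",
}


def migrate_limits(limits):
    """Return a new limits dict with GPU-manager keys replaced by Hami keys."""
    migrated = {}
    uses_gpu = False
    for key, value in limits.items():
        if key in GPU_MANAGER_TO_HAMI:
            migrated[GPU_MANAGER_TO_HAMI[key]] = value
            uses_gpu = True
        else:
            migrated[key] = value  # cpu, memory, etc. are left untouched
    if uses_gpu:
        # nvidia.com/gpualloc is required by Hami; 1 physical GPU is assumed.
        migrated["nvidia.com/gpualloc"] = 1
    return migrated


old_limits = {
    "cpu": "2",
    "memory": "4Gi",
    "tencent.com/vcuda-core": "50",
    "tencent.com/vcuda-memory": "8000",
}
new_limits = migrate_limits(old_limits)
print(new_limits)
```

The function returns a new dictionary and leaves the input untouched, so the old and new limits can be compared before a manifest is rewritten.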

Hami to Hami

Important Changes (v2.5 → v2.6)

| Version | Parameter Availability | Required Action After Upgrade |
| --- | --- | --- |
| Hami v2.5 | Nvidia Runtime Class Name and Create Nvidia Runtime Class not included in the pop-up form. | N/A |
| Hami v2.6 | These parameters must be configured when deploying a plugin instance on a new node. | Update plugin deployment params: Nvidia Runtime Class Name: hami-nvidia; Create Nvidia Runtime Class: true (enable switch) |

⚠️ Upgrading from v2.5 to v2.6 should not affect existing applications.

✅ It is recommended to restart applications with a rolling update to avoid unexpected issues.


Procedure

  1. Upgrade the ACP version if needed.
  2. Upload the Hami v2.6 plugin package to ACP.
  3. Go to the Administrator -> Clusters -> Target Cluster -> Functional Components page, then click the Upgrade button. Alauda Build of HAMi is shown as available for upgrade.