Configuring HPA

For stateless applications, HPA (Horizontal Pod Autoscaler) automatically scales the number of pods up or down based on your business needs and preset policies, in order to handle sudden spikes in business load or to save resources for other applications.

Prerequisites

Please ensure that the monitoring components are deployed in the current cluster and are functioning properly. You can check the deployment and health status of the monitoring components by expanding the menu in the top-right corner of the platform and clicking Platform Health Status.

Operation Steps

  1. Enter Container Platform.

  2. In the left navigation bar, click Workloads > Deployments.

  3. Click on Deployment Name.

  4. Scroll down to the Elastic Scaling area and click on Update on the right.

  5. Select Horizontal Scaling and complete the policy configuration (a sketch of the equivalent HPA resource fields follows these steps).

    • Pod Count: After a deployment is successfully created, evaluate the Minimum Pod Count that covers known, regular business volume, as well as the Maximum Pod Count that the namespace quota can support under high business pressure. Both values can be changed later; it is recommended to derive reasonably accurate values through performance testing first and then keep adjusting them during usage to meet business needs.
    • Trigger Policy: List the metrics that are sensitive to business changes and their Target Thresholds, which trigger scale-up or scale-down actions.
      For example, if you set CPU Utilization = 60%, once the actual CPU utilization deviates from 60%, the platform automatically adjusts the number of pods according to whether the current deployment has insufficient or excessive resources.
      Note: Metric types include built-in metrics and custom metrics. Custom metrics only apply to deployments in native applications, and you must first add custom metrics.
    • Scale Up/Down Step (Alpha): For businesses with specific scaling-rate requirements, you can adapt to changes in business volume gradually by specifying a Scale-Up Step or Scale-Down Step.
      For the scale-down step, you can customize the Stability Window, which defaults to 300 seconds, meaning the platform waits 300 seconds before executing a scale-down action.
  6. Click Update.
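The settings above roughly correspond to the fields of a standard Kubernetes autoscaling/v2 HorizontalPodAutoscaler. The sketch below builds such a manifest as a plain Python dictionary for reference only; the deployment name, namespace, and all numeric values are placeholders, and the platform may create or name the underlying resource differently.

```python
import json

# Sketch of an autoscaling/v2 HPA mirroring the policy configured above.
# "my-deployment", the namespace, and all numeric values are placeholders.
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "my-deployment-hpa", "namespace": "demo"},
    "spec": {
        "scaleTargetRef": {                      # the workload to scale
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "my-deployment",
        },
        "minReplicas": 2,                        # Minimum Pod Count
        "maxReplicas": 10,                       # Maximum Pod Count
        "metrics": [{                            # Trigger Policy: CPU Utilization = 60%
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": 60},
            },
        }],
        "behavior": {                            # Scale Up/Down Step and stability window
            "scaleUp": {
                "policies": [{"type": "Pods", "value": 2, "periodSeconds": 60}],
            },
            "scaleDown": {
                "stabilizationWindowSeconds": 300,   # default 300-second stability window
                "policies": [{"type": "Pods", "value": 1, "periodSeconds": 60}],
            },
        },
    },
}

print(json.dumps(hpa, indent=2))  # e.g. pipe the output into `kubectl apply -f -`
```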

Notes:

  • For CPU Utilization and Memory Utilization metrics, auto-scaling will only be triggered when the actual value fluctuates outside the range of ±10% of the target threshold.

  • Scale-down may impact ongoing business operations; please proceed with caution.

Calculation Rules

When business metrics change, the platform automatically calculates the target pod count that matches the business volume according to the following rules and adjusts the deployment accordingly. The result is always kept within the configured Minimum Pod Count and Maximum Pod Count; if business metrics keep moving in one direction, the pod count will eventually settle at one of these bounds.

  • Single Policy Target Pod Count: ceil[ sum(actual metric values) / metric threshold ], i.e. the sum of the actual metric values of all pods divided by the metric threshold, rounded up to the nearest integer. For example: there are currently 3 pods with CPU utilizations of 80%, 80%, and 90%, and the CPU utilization threshold is set to 60%. According to the formula, the number of pods will be automatically adjusted to ceil[(80% + 80% + 90%) / 60%] = ceil(4.17) = 5 pods (see the code sketch after this list).

    Note:

    • If the calculated target pod count exceeds the set Maximum Pod Count (for example, 4), the platform will only scale up to 4 pods. If the metrics remain persistently high even after you raise the maximum pod count, you may need other scaling methods, such as increasing the namespace pod quota or adding hardware resources.

    • If the calculated target pod count (5 in the previous example) is less than the pod count that would result from applying the Scale-Up Step (for example, 10), the platform will only scale up to 5 pods.

  • Multiple Policy Target Pod Count: Take the maximum value among the results of each policy calculation.
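The calculation rules above can be summarized in a few lines of code. The sketch below is a simplified illustration, not the platform's actual controller logic; the function names are hypothetical, and the tolerance handling is an assumption based on the ±10% note earlier on this page.

```python
import math

def target_pods_single(actual_values, threshold, tolerance=0.10):
    """Single-policy rule: ceil(sum(actual metric values) / threshold).

    If the average utilization is within ±tolerance of the threshold
    (the ±10% band mentioned in the Notes), keep the current pod count.
    """
    current = len(actual_values)
    average = sum(actual_values) / current
    if abs(average / threshold - 1.0) <= tolerance:
        return current
    return math.ceil(sum(actual_values) / threshold)

def target_pods(policies, min_pods, max_pods):
    """Multiple-policy rule: take the maximum across policies,
    then clamp to the configured Minimum/Maximum Pod Count."""
    desired = max(target_pods_single(values, threshold)
                  for values, threshold in policies)
    return min(max(desired, min_pods), max_pods)

# Example from the text: 3 pods at 80%, 80%, 90% CPU with a 60% threshold.
print(target_pods([([80, 80, 90], 60)], min_pods=1, max_pods=10))  # -> 5
print(target_pods([([80, 80, 90], 60)], min_pods=1, max_pods=4))   # -> 4 (capped at the maximum)
```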