For both stateless and stateful applications, VerticalPodAutoscaler (VPA) automatically recommends and optionally applies more appropriate CPU and memory resource limits based on your business needs, ensuring that pods have sufficient resources while improving cluster resource utilization.
You can create a VerticalPodAutoscaler to recommend or automatically update the CPU and memory resource requests and limits for your pods based on their historical usage patterns.
After you create a VerticalPodAutoscaler, the platform begins to monitor the CPU and memory resource usage of the pods. When sufficient data is available, the VerticalPodAutoscaler calculates recommended resource values based on the observed usage patterns. Depending on the configured update mode, VPA can either automatically apply these recommendations or simply make them available for manual application.
The VPA works by analyzing the resource usage of your pods over time and making recommendations based on this analysis. It can help ensure that your pods have the resources they need without over-provisioning, which can lead to more efficient resource utilization across your cluster.
The VerticalPodAutoscaler (VPA) monitors the resource usage of your pods and provides recommendations for CPU and memory requests based on the observed usage patterns.
The VPA updates its recommendations continuously as new usage data becomes available. It can operate in one of four update modes: Off, Auto, Recreate, and Initial.
Important: Elastic scaling can scale Pods horizontally or vertically. It works well when the cluster has sufficient resources, but when cluster resources are insufficient, scaled Pods may remain in a Pending state. Ensure that the cluster has sufficient resources or reasonable quotas, or configure alerts to monitor scaling conditions.
The VerticalPodAutoscaler provides resource recommendations based on historical usage patterns, allowing you to optimize your pod's CPU and memory configurations.
Important: When manually applying VPA recommendations, pod recreation will occur, which can cause temporary disruption to your application. Consider applying recommendations during maintenance windows for production workloads.
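Applying a recommendation by hand generally means editing the resources of the target container in the workload. Because this changes the pod template, the pods are recreated, which is the disruption described above. The fragment below is a minimal sketch of such an edit on a Deployment; the container name and every value are placeholders chosen for illustration, not actual VPA recommendations.

```yaml
# Fragment of a Deployment spec. All names and values are hypothetical
# placeholders; substitute the values that VPA recommends for your workload.
spec:
  template:
    spec:
      containers:
        - name: app              # hypothetical container name
          resources:
            requests:
              cpu: 250m          # placeholder for the recommended CPU request
              memory: 256Mi      # placeholder for the recommended memory request
            limits:
              cpu: 500m          # placeholder value
              memory: 512Mi      # placeholder value
```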
Before using VPA, you need to install the Vertical Pod Autoscaler cluster plugin:
Log in and navigate to the Administrators page.
Click Marketplace > Cluster Plugins to access the Cluster Plugins list page.
Locate the Alauda Container Platform Vertical Pod Autoscaler cluster plugin, click Install, then proceed to the installation page.
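You can create a VerticalPodAutoscaler from the command line by defining it in a YAML file and running the kubectl create command. The following example configures vertical pod autoscaling for a Deployment object.
Create a file named vpa.yaml that contains the VerticalPodAutoscaler definition, then pass it to kubectl create with the -f flag. Once the VPA has gathered enough usage data, its recommendations appear in the status of the VerticalPodAutoscaler object.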
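As a minimal sketch of what vpa.yaml could contain, the manifest below targets a Deployment in recommendation-only mode. It assumes the autoscaling.k8s.io/v1 API served by the Vertical Pod Autoscaler plugin. The object name, namespace, and Deployment name are hypothetical and should be replaced with your own.

```yaml
# vpa.yaml: illustrative sketch only, not the platform's official example.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa              # hypothetical VPA name
  namespace: default             # hypothetical namespace
spec:
  targetRef:                     # the workload whose pods VPA observes
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment     # hypothetical Deployment name
  updatePolicy:
    updateMode: "Off"            # recommend only; never evict or modify pods
```

Starting with updateMode "Off" lets you review the recommendations before allowing VPA to change running pods.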
Enter Container Platform.
In the left navigation bar, click Workloads > Deployments.
Click the name of the target Deployment.
Scroll down to the Elastic Scaling area and click Update on the right.
Select Vertical Scaling and configure the scaling rules.
Parameter | Description |
---|---|
Scaling Mode | Currently supports Manual Scaling mode, which provides recommended resource configurations by analyzing past resource usage. You can manually adjust according to the recommended values. Adjustments will cause pods to be recreated and restarted, so please choose an appropriate time to avoid impacting running applications. Typically, after pods have been running for more than 8 days, the recommended values will become accurate. Note that when cluster resources are insufficient, scaling may cause Pods to be in a Pending state. Please ensure that the cluster has sufficient resources or reasonable quotas, or configure alerts to monitor scaling conditions. |
Target Container | Defaults to the first container of the workload. You can choose to enable resource limit recommendations for one or more containers as needed. |
Click Update.
updateMode: "Off"
- VPA only provides recommendations without automatically applying them. You can manually apply these recommendations as needed.
updateMode: "Auto"
- Automatically sets resource requests when creating pods and updates current pods to recommended values. Currently equivalent to "Recreate".
updateMode: "Recreate"
- Automatically sets resource requests when creating pods and evicts current pods to update to recommended values.
updateMode: "Initial"
- Only sets resource requests when creating pods, no modifications afterward.
minReplicas: <number>
- Minimum number of replicas. Ensures this minimum number of pods remain available when the Updater evicts pods. Must be greater than 0.
containerName: "*"
- Apply policy to all containers in the pod.
mode: "Auto"
- Automatically generate recommendations for the container.
mode: "Off"
- Do not generate recommendations for the container.
Notes:
After configuring VPA, the recommended values for CPU and memory resource limits of the target container can be viewed in the Elastic Scaling area. In the Containers area, select the target container tab and click the icon on the right side of Resource Limits to update the resource limits according to the recommended values.
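For command-line users, the updateMode, minReplicas, containerName, and mode fields described earlier sit under spec.updatePolicy and spec.resourcePolicy of the VerticalPodAutoscaler. The fragment below is a sketch of how they might be combined; the replica count and the sidecar container name are hypothetical.

```yaml
# Fragment of a VerticalPodAutoscaler spec; values are illustrative assumptions.
spec:
  updatePolicy:
    updateMode: "Recreate"       # evict pods so they restart with recommended values
    minReplicas: 2               # hypothetical: the Updater keeps at least 2 pods available
  resourcePolicy:
    containerPolicies:
      - containerName: "*"       # default policy for every container in the pod
        mode: "Auto"             # generate recommendations
      - containerName: sidecar   # hypothetical container excluded from recommendations
        mode: "Off"
```

A policy that names a specific container takes precedence over the "*" entry for that container.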