This topic provides recommended practices and resource evaluation guidelines for sizing Global Cluster nodes in a Multi-Cluster environment.
Proper node sizing lets the Global Cluster efficiently manage all registered clusters, handle synchronization traffic, and serve user API and Web Console requests without performance degradation.
The Global Cluster is responsible for:
Because the Global Cluster must handle both management operations and data aggregation from all connected clusters, resource allocation should be planned according to the expected scale and workload intensity.
The production-scale sizing depends primarily on:
The following table provides reference configurations validated through internal performance testing.
Scale Tier | Managed Clusters | Number of Nodes | CPU per Node | Memory per Node | Notes |
---|---|---|---|---|---|
Small | ≤ 10 | 3 | 8 cores | 16 GB | Suitable for small-scale environments |
Medium | ≤ 50 | 3 | 16 cores | 32 GB | Default production setup |
Large | ≤ 100 | 3 | 24 cores | 48 GB | Supports heavy Web Console usage and frequent sync cycles |
Extra Large | ≤ 500 | 6 | 32 cores | 64 GB | Requires horizontal scaling and dedicated infra nodes |
These recommendations are general guidelines. Actual requirements depend on your cluster topology, user concurrency, and installed plugins.
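
As an illustration, the reference table can be encoded as a small lookup that returns the smallest tier covering a given number of managed clusters. This is a minimal sketch: the `Tier` class and `pick_tier` function are hypothetical names introduced here, not part of any product API, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_clusters: int  # upper bound on managed clusters for this tier
    nodes: int
    cpu_cores: int     # per node
    memory_gb: int     # per node


# Values copied from the reference table; adjust them to your own test results.
TIERS = (
    Tier("Small", 10, 3, 8, 16),
    Tier("Medium", 50, 3, 16, 32),
    Tier("Large", 100, 3, 24, 48),
    Tier("Extra Large", 500, 6, 32, 64),
)


def pick_tier(managed_clusters: int) -> Tier:
    """Return the smallest tier whose cluster limit covers the given count."""
    if managed_clusters < 0:
        raise ValueError("managed_clusters must be non-negative")
    for tier in TIERS:
        if managed_clusters <= tier.max_clusters:
            return tier
    raise ValueError("cluster count is outside the reference table (> 500)")
```

For example, `pick_tier(60)` returns the Large tier (3 nodes, 24 cores and 48 GB per node). Treat the result as a starting point, since topology, user concurrency, and plugins can shift the actual requirement.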
When load per node increases (for example, twice as many clusters or higher user concurrency), adjust resources as follows:
Parameter | Scaling Recommendation |
---|---|
CPU | +50% for every 50 additional managed clusters |
Memory | +50% for every 50 additional managed clusters |
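
The scaling rule can be worked through with a short calculation. The sketch below rests on two assumptions that the table does not state: each increment is applied additively to the baseline (not compounded), and a partial block of 50 clusters counts as a full block. The function name `scaled_resources` and its parameters are hypothetical.

```python
import math


def scaled_resources(base_cpu_cores, base_memory_gb, additional_clusters,
                     step=50, increase=0.5):
    """Estimate per-node CPU and memory after adding managed clusters.

    Assumption: +50% of the baseline per started block of 50 additional
    clusters, applied additively (1.5x, 2.0x, 2.5x, ...).
    """
    if additional_clusters < 0:
        raise ValueError("additional_clusters must be non-negative")
    blocks = math.ceil(additional_clusters / step)
    factor = 1 + increase * blocks
    # Round up so the estimate never falls below the computed requirement.
    return math.ceil(base_cpu_cores * factor), math.ceil(base_memory_gb * factor)
```

Starting from 16 cores and 32 GB, `scaled_resources(16, 32, 50)` gives `(24, 48)`, which matches the Large tier in the reference table.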
When you exceed 100 managed clusters or encounter persistent API latency above 500 ms, add nodes to distribute request handling and controller workloads.
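
The horizontal scaling trigger above can be sketched as a simple check. The guideline does not define "persistent", so this example assumes it means that the most recent three latency samples all exceed the limit; the function name, parameters, and sample window are hypothetical and should be tuned to your monitoring interval.

```python
def should_add_nodes(managed_clusters, latency_samples_ms,
                     cluster_limit=100, latency_limit_ms=500.0,
                     persistent_samples=3):
    """Return True when either horizontal scaling condition is met.

    Assumption: latency is "persistent" when the last `persistent_samples`
    measurements are all above `latency_limit_ms`.
    """
    if persistent_samples < 1:
        raise ValueError("persistent_samples must be at least 1")
    if managed_clusters > cluster_limit:
        return True
    recent = list(latency_samples_ms)[-persistent_samples:]
    # Require a full window so a single spike does not trigger scaling.
    return (len(recent) == persistent_samples
            and all(sample > latency_limit_ms for sample in recent))
```

With these defaults, `should_add_nodes(40, [600, 400, 650])` returns `False` because the latency dipped below the limit inside the window, while `should_add_nodes(120, [])` returns `True` on cluster count alone.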
After deployment, continuously monitor the following metrics to validate node sizing:
Metric | Recommended Range |
---|---|
Node CPU utilization | 60–75% under peak load |
Node memory utilization | ≤ 80% sustained |
API request latency | P90 < 500 ms |
etcd commit latency | P99 < 50 ms |
If resource usage consistently exceeds the recommended thresholds, scale vertically (add CPU or memory) or horizontally (add nodes) before user-facing performance degradation occurs.
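
A periodic check against these ranges might look like the following sketch. It assumes the four values have already been collected from your monitoring system; the `Metrics` class, its field names, and `sizing_findings` are hypothetical and not tied to any specific monitoring API. Only the upper bounds from the table are checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metrics:
    cpu_peak_pct: float        # node CPU utilization under peak load
    mem_sustained_pct: float   # sustained node memory utilization
    api_p90_ms: float          # API request latency, 90th percentile
    etcd_commit_p99_ms: float  # etcd commit latency, 99th percentile


def sizing_findings(m: Metrics) -> List[str]:
    """List every metric that is outside the recommended range."""
    findings = []
    if m.cpu_peak_pct > 75:
        findings.append("CPU utilization above 75% under peak load")
    if m.mem_sustained_pct > 80:
        findings.append("memory utilization above 80% sustained")
    if m.api_p90_ms >= 500:
        findings.append("API P90 latency at or above 500 ms")
    if m.etcd_commit_p99_ms >= 50:
        findings.append("etcd P99 commit latency at or above 50 ms")
    return findings
```

An empty list means all four metrics are within range. Any entry is a signal to plan vertical or horizontal scaling before users notice degradation.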
When sizing the Global Cluster:
Following these guidelines helps maintain predictable performance and operational stability as your Multi-Cluster environment grows.