Service Mesh is an Istio-based PaaS that provides cloud-native solutions for applications and microservices.
The platform offers one-click deployment and upgrade of infrastructure, visualized service governance, efficient application performance management, and highly available, high-performance service gateway management. For enterprises undergoing digital transformation, it provides a comprehensive, stable, reliable, and open microservice solution that improves service governance efficiency, reduces the maintenance cost of the microservice framework, and minimizes developers' dependence on the framework, so that enterprises can focus on business development and continuously enhance their core competitiveness.
Managing a service mesh becomes particularly complex in multi-cluster environments. Full lifecycle management of a service mesh involves not only deploying, updating, and deleting the mesh but also monitoring its runtime status in real time, so that operations personnel can detect anomalies and troubleshoot issues promptly and keep support for business operations consistent and stable.
Feature | Description |
---|---|
Multi-Primary Cluster Service Mesh | Service mesh management for the multi-primary cluster architecture provides a visual workflow for deploying the microservice governance platform components. |
Visual Deployment and Management of Service Mesh | - Administrators can easily deploy all microservice governance platform components, including Istio, for each Kubernetes cluster requiring service governance capabilities using visual forms to configure key parameters. External components can also be integrated as needed to enable containerized microservice governance. - Administrators can customize the resource configuration of mesh components based on the actual business scale. |
Service Mesh Monitoring | Includes multiple built-in monitoring dashboards that display control plane and data plane monitoring data across different dimensions and levels of granularity, so that administrators can query or view it in real time. |
Feature | Description |
---|---|
Service Topology | The service topology graph intuitively presents the relationships between services within the service mesh and between services inside and outside the mesh (e.g., middleware, virtual machine services injected with OpenTelemetry Java Agent). The topology graph helps you understand which components communicate and quickly pinpoint the exact location of errors. - Focus from global to local and view detailed information of a particular service, including service status, request conditions, traffic information, etc. - When services are configured with security, circuit breaker, and alert strategies, these are also displayed with special markers in the topology graph. - Determine the health status of service pods by node color. - Determine the status of traffic between services by connection color. |
Tracing | The platform's tracing feature supports querying the call chains between services in the current namespace by service name or TraceID, and monitoring key indicators such as the call status and duration of each service in the call chain. Combined with detailed log data, this helps developers understand how each node or specific API in the business request chain responds, so that problems can be located and resolved quickly. - Facilitates in-depth analysis of calls between services by development and operations personnel. - Displays a scatter plot derived from the start time and duration of each call chain to quickly identify slow call chains. - Supports full-link tracing (services in Service Mesh architecture, Java 8+ services). - Quickly locate call chains by TraceID. - Query call chain information by service or TraceID. |
Service Gateway | The service gateway, including egress and ingress gateways, operates at the edge of the service mesh, managing the traffic entering and leaving the service mesh, allowing you to specify permitted traffic. - Configure multiple ingress or egress gateways for the service mesh, with each gateway allowing access through multiple customizable ports. - The ingress gateway supports routing external traffic to different backend resources (API groups, services) through various routing scenarios; the egress gateway allows services within the mesh to access external services without affecting the development experience and monitors traffic to them. - Supports a variety of usage scenarios, such as network isolation between business areas, dedicated gateways for specific businesses, and port isolation. - Easily decommission gateways that are no longer needed without loss. - Authenticate and authorize client requests based on JWT. - Configure gateway routing rules based on request path, request header, and URI rewriting. - Supports pod-level gateway monitoring, with multiple built-in alert strategies to help users detect and resolve issues promptly. Monitoring data includes CPU, memory, QPS, connection count, inbound and outbound traffic metrics. |
Native Resource List | Allows users familiar with Kubernetes to view Istio native resources (YAML files) categorized by resource type. |
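
As a rough illustration of the gateway routing rules described in the Service Gateway row (matching on request path and request header, plus URI rewriting), the following sketch builds an Istio `VirtualService` manifest as a Python dictionary. The field names follow the Istio `networking.istio.io/v1beta1` API. The gateway name, host, header, and service names are hypothetical placeholders, and the resources the platform actually generates may differ.

```python
import json


def build_virtual_service(name, namespace, gateway, host, path_prefix,
                          header_name, header_value, rewrite_to,
                          dest_host, dest_port):
    """Return an Istio VirtualService manifest as a plain dict.

    The single HTTP route matches requests whose path starts with
    `path_prefix` AND whose `header_name` header equals `header_value`,
    rewrites the matched URI prefix to `rewrite_to`, and forwards the
    request to `dest_host:dest_port`.
    """
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "hosts": [host],
            "gateways": [gateway],
            "http": [{
                "match": [{
                    "uri": {"prefix": path_prefix},
                    "headers": {header_name: {"exact": header_value}},
                }],
                "rewrite": {"uri": rewrite_to},
                "route": [{
                    "destination": {
                        "host": dest_host,
                        "port": {"number": dest_port},
                    },
                }],
            }],
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical and only serve the example.
    manifest = build_virtual_service(
        name="orders-route", namespace="shop", gateway="demo-ingress",
        host="shop.example.com", path_prefix="/api/orders/",
        header_name="x-api-version", header_value="v2",
        rewrite_to="/", dest_host="orders", dest_port=8080,
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be reviewed alongside the resources shown in the native resource list, which makes it easier to relate a rule configured in the UI to its underlying Istio resource.
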
Integrated with Flagger, the platform provides an automated service release experience and supports multiple deployment strategies, such as canary, A/B testing, and blue/green deployments.
Feature | Description |
---|---|
Release Configuration | Define service release configurations to provide templated execution parameters for future changes, enabling automated deployment and rollback of canary version Pods. |
Traffic Routing Strategy | Offers flexible traffic routing strategies. Services can automatically route traffic based on release configuration templates or customize traffic routing rules for specific release tasks. |
Release Management | Supports both manual and automated release options. Users can delegate release decisions to automated systems or choose manual release. The release process allows for manual intervention and the ability to roll back tasks at any time. |
Observability | Provides metrics comparison and tracing between the main version and the canary version, enabling quality analysis and fault localization for canary releases. |
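
To make the automated release and rollback behavior in the table above more concrete, here is a minimal simulation of a canary rollout, loosely modeled on the step-weight approach used by progressive delivery tools such as Flagger. It is a simplified sketch, not the platform's implementation: the function name, the parameters (`step_weight`, `max_weight`, `min_success_rate`, `failure_threshold`), and the single success-rate metric are all assumptions made for illustration.

```python
def run_canary(success_rates, step_weight=10, max_weight=50,
               min_success_rate=99.0, failure_threshold=2):
    """Simulate a canary release driven by one metric per analysis interval.

    success_rates: iterable of request success rates (percent), one per
        analysis interval, as observed on the canary version.
    Returns (outcome, history), where outcome is "promoted",
    "rolled_back", or "in_progress", and history lists the canary
    traffic weight (percent) after each interval.
    """
    weight = 0
    failures = 0
    history = []
    for rate in success_rates:
        if rate >= min_success_rate:
            # Healthy interval: shift more traffic to the canary.
            weight = min(weight + step_weight, max_weight)
        else:
            # Failed check: hold the weight and count the failure.
            failures += 1
            if failures >= failure_threshold:
                history.append(0)  # all traffic returns to the main version
                return "rolled_back", history
        history.append(weight)
        if weight >= max_weight:
            return "promoted", history
    return "in_progress", history


if __name__ == "__main__":
    print(run_canary([99.5, 99.8, 99.9, 99.7, 99.6]))
    print(run_canary([99.5, 90.0, 99.9, 80.0]))
```

In this sketch a manual release would correspond to an operator deciding when to call the next step, while the automated option lets the metric checks decide; either way, reaching the failure threshold sends all traffic back to the main version.
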
In microservice governance scenarios, service calls shift from local calls to calls over network protocol interfaces, which introduces security risks.
Feature | Description |
---|---|
Security Strategy | Provides traffic encryption between services. By setting security strategies for services, traffic can be encrypted using mTLS. |
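
For readers who want to see what an mTLS security strategy can look like at the Istio level, the sketch below builds a `PeerAuthentication` manifest as a Python dictionary. The resource kind and fields come from the Istio `security.istio.io/v1beta1` API; the resource name, namespace, and `app` label are hypothetical, and whether the platform creates exactly this resource for a security strategy is an assumption.

```python
import json

# Modes accepted by this sketch. Istio's PeerAuthentication also defines
# UNSET, which inherits the setting from the parent scope.
ALLOWED_MODES = {"STRICT", "PERMISSIVE", "DISABLE"}


def build_peer_authentication(name, namespace, app_label, mode="STRICT"):
    """Return an Istio PeerAuthentication manifest as a plain dict.

    STRICT means the selected workloads only accept mTLS traffic;
    PERMISSIVE means they accept both mTLS and plaintext traffic.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mTLS mode: {mode}")
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Apply the policy only to pods carrying this label.
            "selector": {"matchLabels": {"app": app_label}},
            "mtls": {"mode": mode},
        },
    }


if __name__ == "__main__":
    # Hypothetical service "orders" in namespace "shop".
    policy = build_peer_authentication("orders-mtls", "shop", "orders")
    print(json.dumps(policy, indent=2))
```

Starting with `PERMISSIVE` and switching to `STRICT` once all callers are inside the mesh is a common way to introduce mTLS without interrupting existing traffic.
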
When a service failure occurs, the usual troubleshooting process is: track abnormal traffic > pinpoint the fault location > narrow down the scope > analyze the specific cause. Based on this process, the platform provides a complete troubleshooting path for service issues, helping you quickly locate suspect points based on abnormal traffic and resolve issues promptly to minimize business impact.
Feature | Description |
---|---|
Traffic Monitoring and JVM Monitoring | - Provides multi-dimensional traffic monitoring panels for client-side and server-side services, with granularity down to the API level. - JVM monitoring supports collecting and displaying monitoring data for various indicators such as CPU/memory, threads, and classes. You can view the data of a single pod or compare the data of two pods to help developers quickly locate the indicators and time points at which specific issues occurred. |
Alerts | - Provides alerting based on common monitoring indicators, supporting indicator collection, strategy management, alert triggering, and alert notification. Combined with traffic monitoring and JVM monitoring, it aims to provide a complete monitoring, alerting, and notification experience for intelligent operations. - Provides alert strategy configuration templates, supporting configuration of a single alert strategy or quick creation of alert strategies from templates. - The real-time alert panel gives an overview of the resources currently under alert, lets you filter data by alert level, and shows alert details to help users understand the scope of an issue and quickly locate the root cause. |
Real-time Logs | Provides a real-time log panel at the service level, allowing you to quickly locate and troubleshoot issues using logs (including access logs) when a service anomaly occurs. |
Quick Troubleshooting | When a service failure occurs, you can combine rich visual data (service topology, tracing) and monitoring and operations data (monitoring, alerts, logs) to track and locate abnormal traffic and quickly resolve issues. |
In multi-region, multi-data center scenarios, when a service pod in a data center fails and is isolated by circuit breaking, the platform automatically distributes the client's traffic to service pods in other data centers to ensure stable service provision.
Feature | Description |
---|---|
Proximity Routing | Service calls use proximity routing by default in all data centers. When a call chain is long, this ensures that disaster recovery traffic is preferentially handled by service endpoints in the same cluster. |
Custom Disaster Recovery Priority | Allows configuring regional load settings to customize the priority of disaster recovery traffic across multiple clusters, based on the regions where the clusters are located. |
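
The selection logic behind proximity routing and custom disaster recovery priority can be sketched as follows. This is an illustrative model only, under stated assumptions: the `Endpoint` type, the `pick_endpoints` function, and the ranking order (same cluster, then same region, then regions in a configured priority list, then everything else) are hypothetical and are not taken from the platform or from Istio's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    region: str
    cluster: str
    healthy: bool = True  # False once circuit breaking has isolated it


def pick_endpoints(endpoints, client_region, client_cluster,
                   region_priority=()):
    """Return the healthy endpoints in the most preferred locality.

    Preference order: same cluster, same region, then other regions in
    the order given by `region_priority`, then any remaining region.
    """
    def rank(ep):
        if ep.region == client_region and ep.cluster == client_cluster:
            return 0
        if ep.region == client_region:
            return 1
        if ep.region in region_priority:
            return 2 + region_priority.index(ep.region)
        return 2 + len(region_priority)

    healthy = [ep for ep in endpoints if ep.healthy]
    if not healthy:
        return []
    best = min(rank(ep) for ep in healthy)
    return [ep for ep in healthy if rank(ep) == best]


if __name__ == "__main__":
    # Hypothetical endpoints; the local one has been isolated.
    endpoints = [
        Endpoint("10.0.0.1", "east", "c1", healthy=False),
        Endpoint("10.0.1.1", "west", "c2"),
        Endpoint("10.0.2.1", "north", "c3"),
    ]
    chosen = pick_endpoints(endpoints, "east", "c1", ["north", "west"])
    print([ep.address for ep in chosen])
```

With the local endpoint isolated, traffic in this example goes to the `north` region because it comes first in the configured priority list; once the local endpoint is healthy again, it is selected ahead of any remote region.
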