This document provides a step-by-step guide on how to configure external access for your inference services, including checking external access addresses, creating domains, setting up load balancers, and verifying the configuration.
You can find the inference service's external access address in its status.url field.

In the Administrator Console, go to Network > Domains, and then click Create Domain.
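If your platform exposes the service as a KServe/Knative InferenceService, one way to read status.url is from the command line. The service name and namespace below are hypothetical, inferred from this guide's example domain; substitute your own:

```shell
# Hypothetical names — replace with your actual service and namespace.
kubectl get inferenceservice qwen2-0b5 \
  -n kubeflow-admin-cpaas-io \
  -o jsonpath='{.status.url}'
```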
(One load balancer can be shared by multiple projects; create a new one only if necessary.)
In the Administrator Console, go to Network > Load Balancers, and then click Create Load Balancer. For details, refer to the platform's load balancer documentation.
In the Alauda Container Platform Console, navigate to Network > Load Balancers, then click the name of the load balancer you just created to enter its configuration page.
Add listening ports for your service. You can add more ports as needed.
Step 1: Add a Port
Select a protocol and enter a port number. For HTTP, the standard port is 80; for HTTPS, the standard port is 443.

Step 2: Configure HTTPS (if applicable)

If you choose the HTTPS protocol, you must select a default certificate.
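Before selecting it, you can confirm that the certificate secret this guide refers to actually exists in your cluster. This sketch assumes the certificate is stored as a Kubernetes secret named knative-serving-cert in the istio-system namespace, as described below:

```shell
# Verify the certificate secret exists; adjust the name or namespace
# if your installation differs.
kubectl get secret knative-serving-cert -n istio-system
```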
The knative-serving-cert certificate is located in the istio-system namespace; this is crucial for the next step. Select knative-serving-cert as the Default Certificate.

Configure forwarding rules for the ports you added in the previous step.
Step 1: Add a Rule
Enter the domain you created earlier, for example qwen2-0b5-kubeflow-admin-cpaas-io.my-company.com.

Step 2: Configure the Service Group
Make sure the current project namespace is istio-system; if not, switch your project's namespace to istio-system first. Then select knative-ingressgateway from the dropdown list.

Note: The process for configuring rules for the HTTPS protocol (port 443) is the same as described above.
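To double-check the gateway you selected, you can list it directly. This assumes knative-ingressgateway is a Service in the istio-system namespace, as this guide describes:

```shell
# Confirm the gateway service exists and note its ports.
kubectl get service knative-ingressgateway -n istio-system
```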
For more detailed parameter configurations, refer to the platform's help documentation.
To verify that your inference service is accessible externally, use the curl command below. Remember to replace the placeholders with your actual load balancer IP address, port, and inference service address.
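A minimal check could look like the following. The /v1/models path assumes an OpenAI-compatible model server behind the inference service; substitute whichever endpoint your service actually exposes:

```shell
# Replace the placeholders with your actual values. Sending the domain in the
# Host header lets the request match the forwarding rule configured above,
# without editing /etc/hosts or DNS.
curl -H "Host: your-inference-service-domain.com" \
  "http://your-load-balancer-ip:your-port/v1/models"
```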
Here's what each part of the command means and what you need to replace:
- your-inference-service-domain.com: the domain name you created for your inference service (e.g., qwen2-0b5-kubeflow-admin-cpaas-io.my-company.com).
- your-port: the port your load balancer is listening on for HTTP traffic (commonly 80).
- your-load-balancer-ip: the actual IP address of your load balancer (e.g., 192.168.137.21).

If the request successfully returns the model list, your configuration is complete! If it fails, double-check your load balancer settings or review the inference service logs to pinpoint the problem.