In Kubernetes, health checks, also known as probes, are a critical mechanism to ensure the high availability and resilience of your applications. Kubernetes uses these probes to determine the health and readiness of your Pods, allowing the system to take appropriate actions, such as restarting containers or routing traffic. Without proper health checks, Kubernetes cannot reliably manage your application's lifecycle, potentially leading to service degradation or outages.
Kubernetes offers three types of probes:
livenessProbe: Detects whether the container is still running. If a liveness probe fails, the kubelet kills the container and restarts it according to the Pod's restart policy.
readinessProbe: Detects whether the container is ready to serve traffic. If a readiness probe fails, the Endpoint Controller removes the Pod from the Service's Endpoint list until the probe succeeds.
startupProbe: Checks whether the application has successfully started. Liveness and readiness probes do not run until the startup probe succeeds, which is useful for applications with long startup times.
Properly configuring these probes is essential for building robust and self-healing applications on Kubernetes.
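As a rough illustration of how the three probe types can sit together on one container, here is a minimal sketch. The Pod name, image, port 8080, and the /ready path are placeholders chosen for this example, not values taken from any particular application.

```yaml
# Hypothetical Pod manifest; name, image, port, and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # placeholder image
      ports:
        - containerPort: 8080
      # Runs first; liveness and readiness checks wait until it succeeds.
      # Allows up to 30 x 10s = 300s for the application to start.
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 30
      # Repeated failures cause the container to be restarted.
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      # Failures remove the Pod from Service endpoints without restarting it.
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
```

Using a separate readiness path is a design choice in this sketch: it lets the application report "alive but not ready" (for example, while warming a cache) without triggering a restart.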
Kubernetes supports three mechanisms for implementing probes:
HTTP GET Action
Executes an HTTP GET request against the Pod's IP address on a specified port and path. The probe is considered successful if the response code is between 200 and 399.
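A minimal sketch of an HTTP probe, assuming the application exposes a /healthz endpoint on port 8080 (both are placeholders for this example):

```yaml
# Fragment of a container spec; path, port, and timings are illustrative.
livenessProbe:
  httpGet:
    path: /healthz      # assumed health endpoint
    port: 8080          # assumed container port
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 2
```

Any response status from 200 through 399 counts as success, so a redirect also passes the check.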
exec Action
Executes a specified command inside the container. The probe is successful if the command exits with status code 0.
Use Cases: Applications without HTTP endpoints, checking internal application state, or performing complex health checks that require specific tools.
Example:
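A minimal sketch of an exec probe, assuming the application creates the file /tmp/healthy while it is healthy (the file path is a hypothetical convention for this example):

```yaml
# Fragment of a container spec; the marker file is an assumption.
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy    # cat exits 0 if the file exists, non-zero otherwise
  periodSeconds: 10
```

The command runs inside the container, so any tool it uses (here, cat) must be present in the container image.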
TCP Socket Action
Attempts to open a TCP socket on the container's IP address and a specified port. The probe is successful if the TCP connection can be established.
Example:
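A minimal sketch of a TCP probe, assuming a service that listens on port 5432 (the port is a placeholder for this example):

```yaml
# Fragment of a container spec; the port is illustrative.
readinessProbe:
  tcpSocket:
    port: 5432          # assumed listening port
  periodSeconds: 10
```

A TCP probe only confirms that the port accepts connections; it does not verify that the application behind it can serve requests correctly.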
Parameters | Description |
---|---|
Initial Delay | initialDelaySeconds: Grace period (seconds) before starting probes. Default: 300. |
Period | periodSeconds: Probe interval (1-120s). Default: 60. |
Timeout | timeoutSeconds: Probe timeout duration (1-300s). Default: 30. |
Success Threshold | successThreshold: Minimum consecutive successes to mark healthy. Default: 0. |
Failure Threshold | failureThreshold: Number of consecutive failures that triggers action. 0 disables failure-based actions. Default: 5 failures, after which the container is restarted. |
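The following sketch shows where these five fields sit in a probe definition. The values are illustrative only. Note that when a field is omitted from a raw manifest, the Kubernetes API applies its own defaults (initialDelaySeconds 0, periodSeconds 10, timeoutSeconds 1, successThreshold 1, failureThreshold 3), which may differ from the defaults listed in the table.

```yaml
# Fragment of a container spec; endpoint and timings are assumptions.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15   # wait 15s after container start before probing
  periodSeconds: 10         # probe every 10s
  timeoutSeconds: 5         # each attempt fails if no response within 5s
  successThreshold: 1       # one success marks the container healthy again
  failureThreshold: 3       # restart after 3 consecutive failures
```

With these values, a container that stops responding is restarted after roughly failureThreshold × periodSeconds = 30 seconds of consecutive failures.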
Parameter | Applicable Protocols | Description |
---|---|---|
Protocol | HTTP/HTTPS | Health check protocol |
Port | HTTP/HTTPS/TCP | Target container port for probing. |
Path | HTTP/HTTPS | Endpoint path (e.g., /healthz ). |
HTTP Headers | HTTP/HTTPS | Custom headers (Add key-value pairs). |
Command | EXEC | Container-executable check command (e.g., sh -c "curl -I localhost:8080 \| grep OK"). Note: Escape special characters and test command viability. |
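As a sketch of how the protocol-specific parameters map onto a manifest, the fragment below sets the protocol, port, path, and a custom header for an HTTPS probe. The port and the X-Health-Check header are hypothetical values for this example.

```yaml
# Fragment of a container spec; port and header are assumptions.
readinessProbe:
  httpGet:
    scheme: HTTPS           # Protocol
    port: 8443              # Port
    path: /healthz          # Path
    httpHeaders:            # HTTP Headers, as name/value pairs
      - name: X-Health-Check
        value: kubelet
  periodSeconds: 10
```

For HTTPS probes the kubelet does not verify the server certificate, so a self-signed certificate does not cause the probe to fail.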
When a Pod's status indicates issues related to probes, here's how to troubleshoot:
Check the Pod's events, for example with kubectl describe pod <pod-name>. Look for events reporting Liveness probe failed, Readiness probe failed, or Startup probe failed. These events often include a specific error message (e.g., connection refused, HTTP 500 error, command exit code).
Examine application logs to see if there are errors or warnings around the time the probe failed. Your application might be logging why its health endpoint isn't responding correctly.
HTTP: Run kubectl exec -it <pod-name> -- curl -v http://localhost:<probe-port><probe-path> (or the wget equivalent) from within the container to see the actual response.
Exec: Run kubectl exec -it <pod-name> -- <command-from-probe> and check its exit code and output.
TCP: Use nc (netcat) or telnet from another Pod in the same network, or from the host if allowed, to test TCP connectivity: kubectl exec -it <another-pod> -- nc -vz <pod-ip> <probe-port>.
Resources: A container that is starved of CPU or memory may respond too slowly and fail its probes. Check resource usage (kubectl top pod <pod-name>) and consider adjusting resource limits/requests.