Troubleshooting Istio CNI Plugin
Log
The Istio CNI plugin log provides information about how the plugin configures application pod traffic redirection based on PodSpec.
The plugin runs in the container runtime process space, so you can see CNI log entries in the kubelet log.
To make debugging easier, the CNI plugin also sends its log to the istio-cni-node DaemonSet pod on the same node.
The default log level for the CNI plugin is info. To get more detailed log output, change the level by
editing cni.values.logLevel in the ServiceMesh component config for istioCNI.
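For example, a minimal sketch, assuming the ServiceMesh component config exposes the field at this path (verify against your distribution's schema):

```yaml
istioCNI:
  cni:
    values:
      logLevel: debug   # default: info
```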
The Istio CNI DaemonSet pod log also provides information about CNI plugin installation and race condition repair.
DaemonSet readiness
Readiness of the CNI DaemonSet indicates that the Istio CNI plugin is properly installed and configured.
If the Istio CNI DaemonSet is not ready, something is likely wrong. Look at the istio-cni-node DaemonSet logs to diagnose.
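A quick way to check, using standard kubectl commands (the istio-system namespace and the k8s-app=istio-cni-node label selector are assumptions; adjust them to your installation):

```
# Verify every istio-cni-node pod reports READY
$ kubectl get pods -n istio-system -l k8s-app=istio-cni-node

# Inspect the logs of an unready pod
$ kubectl logs -n istio-system <istio-cni-node-pod>
```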
Race condition repair
By default, the Istio CNI DaemonSet has race condition mitigation enabled, which will evict a pod that was started before the CNI plugin was ready. To understand which pods were evicted, look for log lines like the following:
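(The output below is illustrative only; the exact message wording and format depend on your Istio version. The namespace and label selector are assumptions.)

```
$ kubectl logs -n istio-system -l k8s-app=istio-cni-node | grep -i repair
2024-01-01T00:00:00.000000Z	info	repair	Deleting broken pod: default/httpbin-5f6d8b7c9d-abcde
```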
Diagnose pod start-up failure
A common issue with the CNI plugin is that a pod fails to start due to container network set-up failure. Typically the failure reason is written to the pod events, and is visible via pod description:
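(Illustrative output; the exact event text depends on your container runtime and Istio version.)

```
$ kubectl describe pod <pod-name> -n <namespace>
...
Events:
  Type     Reason                  Message
  ----     ------                  -------
  Warning  FailedCreatePodSandBox  Failed to create pod sandbox: rpc error: ... failed to setup network for sandbox ...
```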
If a pod repeatedly fails with an init error, check the log of the istio-validation init container for “connection refused” errors like the following:
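(Illustrative output; the addresses, ports, and exact wording vary by version and configuration.)

```
$ kubectl logs <pod-name> -n <namespace> -c istio-validation
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
```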
The istio-validation init container starts a local dummy server that listens on the inbound and outbound
traffic redirection target ports, and checks whether test traffic can be redirected to that server.
When the CNI plugin has not set up pod traffic redirection correctly, the istio-validation init container blocks pod
startup to prevent traffic bypass.
To see if there were any errors or unexpected network setup behaviors, search the istio-cni-node logs for the pod ID.
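For example (the istio-system namespace and label selector are assumptions; adjust them to your installation):

```
# Search every istio-cni-node pod's log for entries mentioning the affected pod
$ kubectl logs -n istio-system -l k8s-app=istio-cni-node --prefix | grep <pod-name>
```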
Another symptom of a malfunctioning CNI plugin is that the application pod is continuously evicted at start-up. This typically happens because the plugin is not properly installed, so pod traffic redirection cannot be set up; the CNI race repair logic then considers the pod broken due to the race condition and evicts it continuously. When running into this issue, check the CNI DaemonSet log for information on why the plugin could not be installed properly.
When using Multus to chain multiple CNI providers, for example Kube-OVN, the Istio CNI plugin may install its
configuration multiple times (in both the Multus and Kube-OVN configurations), resulting in a circular CNI chain that is
unable to configure the network for the pod. To mitigate this issue, explicitly set cni.values.cniConfFileName to the Multus configuration file name,
and manually review all CNI configurations to make sure there are no duplicate entries.
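A minimal sketch of the mitigation, assuming the same ServiceMesh component config structure as above; the file name 00-multus.conf is a hypothetical placeholder for your actual Multus configuration file under /etc/cni/net.d:

```yaml
istioCNI:
  cni:
    values:
      cniConfFileName: 00-multus.conf   # hypothetical; use your real Multus config file name
```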