ImagePullBackOff: one of those Kubernetes messages where you either instantly know the solution or you go mental trying to find it.
In this article, I want to walk through some common causes of the ImagePullBackOff status and offer some possible solutions, so you can get back to your actual task rather than spending too much time and energy debugging your Kubernetes cluster(s).
ImagePullBackOff means that a Pod couldn’t start on the assigned node because Kubernetes couldn’t pull the container image, hence the ImagePull part of the status.
The second part, BackOff, means that Kubernetes will keep trying to pull the image, but with an increasing delay (back-off). The back-off delay is exponential (10s, 20s, 40s, …) and is capped at five minutes. The good news for the related restart back-off: once a container has executed for 10 minutes without any problems, the kubelet resets the restart back-off timer for that container.
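To get a feeling for how quickly the waiting time grows, here is a minimal sketch of the schedule described above. It only illustrates the arithmetic (start at 10 seconds, double each time, cap at 300 seconds) and is not the kubelet's actual code; the function name backoff_delays is made up for this example.

```shell
# Print the first N back-off delays in seconds.
# Assumption: 10s initial delay, doubling per attempt, capped at 300s (five minutes).
backoff_delays() {
  attempts=$1
  delay=10
  cap=300
  out=""
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Append the current delay to the space-separated result.
    out="${out:+$out }$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  echo "$out"
}

backoff_delays 7   # prints: 10 20 40 80 160 300 300
```

So after only five failed attempts, you are already waiting the full five minutes between retries.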
So if the image can’t be pulled, the kubelet will report ImagePullBackOff. Because a Pod can be scheduled onto any node, every node in the cluster needs to be able to get that image. If something prevents the container runtime from pulling an image onto the node, the kubelet will first report ErrImagePull, and then ImagePullBackOff while it keeps trying.
Here are the most common causes:
- Typo in the image name or tag. (Happens to me all the time)
- Image or tag doesn’t exist. (Simple, but often true)
- Your image registry requires authentication. (Gets me every time on Azure ACR)
- You hit a rate or download limit on your registry. (A real possibility since Docker Hub introduced rate limits)
How can you troubleshoot the situation?
Make sure that you can pull the image. Below are some of the steps I take to verify that the image is pullable:
- I try to pull the image from my local machine. Back to the good old "works on my machine" setup!
- Sometimes I ssh into a node in the cluster and try to pull the image from there.
- In cases where I am not using a registry (e.g. the image is built in the cluster), I verify that the image exists on the particular node.
- And of course I check that there are any pods running on that particular node at all. It could be an issue with the node or the whole cluster.
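The checks above can be sketched as a few small shell helpers. This is only a sketch under assumptions: the function names are made up for this example, I assume crictl is available on nodes that run containerd and docker on my local machine, and worker-1 stands in for one of your node names.

```shell
# Try to pull an image on the current machine (workstation or a node you ssh'd into).
# Uses crictl if it is installed, otherwise falls back to docker.
try_pull() {
  if command -v crictl >/dev/null 2>&1; then
    crictl pull "$1"
  else
    docker pull "$1"
  fi
}

# Check whether an image is already present on this node.
# Pass the repository name without the tag, because crictl prints
# the image name and the tag in separate columns.
image_on_node() {
  crictl images | grep -F "$1"
}

# List all pods scheduled on a given node, across all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}

# Usage (not executed here):
#   try_pull nginx:1.25
#   image_on_node library/nginx
#   pods_on_node worker-1
```

If pods_on_node shows nothing running on a node, I look at the node itself before blaming the image.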
The next thing to check is kubectl describe pod <pod name>. I am particularly interested in the events at the end of the output:
kubectl describe pod/nginx
In this example, I get the following error message:
Warning Failed 6s kubelet Failed to pull image "notexits/image:0.1.0": rpc error: code = Unknown desc = Error response from daemon: pull access denied for notexits/image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Clearly, I misspelled the image name, since the image is not from a private registry.
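When a pod has several containers, the describe output gets long. A small filter can help to show only the image names the kubelet failed to pull, so typos stand out. This is a minimal sketch that relies on the wording of the event message shown above; the function name failed_images is made up for this example.

```shell
# Read `kubectl describe` output on stdin and print each image name
# that appears in a "Failed to pull image" event, without duplicates.
failed_images() {
  sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p' | sort -u
}

# Usage (not executed here):
#   kubectl describe pod/nginx | failed_images
```

For the event above, this would print notexits/image:0.1.0, which makes the misspelling easy to spot.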
Speaking of private registries:
Another good thing to verify is that the image pull secret is present and correct, and that your deployment or pod manifest references this secret.
apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regcred
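To check the secret side of this manifest, something like the following helpers can be used. It is a sketch, not the only way to do it: the function names are made up for this example, regcred is just the default name taken from the manifest above, and the registry server, user name and password are placeholders you have to fill in.

```shell
# Create a registry pull secret (default name: regcred).
# Arguments: registry server, user name, password, optional secret name.
create_pull_secret() {
  kubectl create secret docker-registry "${4:-regcred}" \
    --docker-server="$1" \
    --docker-username="$2" \
    --docker-password="$3"
}

# Decode the stored registry credentials to verify that they are what you expect.
show_pull_secret() {
  kubectl get secret "${1:-regcred}" -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
}

# Print the names of the pull secrets a pod actually references.
pod_pull_secrets() {
  kubectl get pod "$1" -o jsonpath='{.spec.imagePullSecrets[*].name}'
}

# Usage (not executed here):
#   create_pull_secret <your-registry-server> <your-name> <your-password>
#   show_pull_secret
#   pod_pull_secrets private-reg
```

Keep in mind that secrets are namespaced, so the secret has to live in the same namespace as the pod that references it.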
The last thing I do during my troubleshooting sessions is check the kubelet logs, increasing the log level if I need more output.
Reminder: verbosity levels 0-4 produce debug-level logs and 5-8 produce trace-level logs.
Important to mention: this is not always possible, especially when you are using a managed service.
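If you do have access to the node, a sketch like this can narrow the kubelet logs down to image-related lines. It assumes the kubelet runs as a systemd unit named kubelet, which is common but not guaranteed on your distribution; the function name kubelet_pull_logs is made up for this example.

```shell
# Show recent kubelet log lines that mention pulls or images.
# Optional argument: a journalctl time expression (default: last 15 minutes).
# Assumption: the kubelet runs as the systemd unit "kubelet" on this node.
kubelet_pull_logs() {
  journalctl -u kubelet --since "${1:-15 minutes ago}" --no-pager \
    | grep -i -E 'pull|image'
}

# Usage on a node (not executed here):
#   kubelet_pull_logs
#   kubelet_pull_logs "1 hour ago"
```

The verbosity itself is controlled by the kubelet's --v flag; where that flag is configured depends on how your nodes were set up.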
Your image must be accessible from every node in your Kubernetes cluster.
ImagePullBackOff can be caused by typos, wrong tag names or missing/wrong/expired private registry credentials.
If no images can be pulled, there might be a problem with your network setup.
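For the network case, a quick reachability check from a node can rule out the basics. This sketch assumes the registry speaks the standard registry HTTP API, where the /v2/ endpoint answers with 200 or, if authentication is required, 401; the function names are made up for this example.

```shell
# Print the HTTP status code of the registry's /v2/ endpoint.
registry_status() {
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "https://$1/v2/"
}

# Succeed if the registry answered at all:
# 200 means open access, 401 means reachable but authentication is required.
registry_reachable() {
  code=$(registry_status "$1") || return 1
  case "$code" in
    200|401) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage from a node (not executed here):
#   registry_reachable registry-1.docker.io && echo "registry reachable"
```

If this fails on every node, I would look at DNS, proxies and firewall rules before touching any manifests.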