Engin Diri
_CLOUD

Kubernetes: ImagePullBackOff!

How to keep your calm and fix this like a pro!

Engin Diri
·Jul 5, 2022·

Table of contents

  • Introduction
  • Potential causes
  • How can you troubleshoot the situation?
  • Wrap up

Introduction

ImagePullBackOff: one of those messages in Kubernetes where you either instantly know the solution or you go mental trying to find it.

In this article, I want to walk through some common sources of the ImagePullBackOff status and provide some possible solutions, so you can continue to work on your actual task rather than spending too much time and energy debugging your Kubernetes cluster(s).

The status ImagePullBackOff means that a Pod couldn’t start on the assigned node because Kubernetes couldn’t pull the container image, hence the ImagePull part of the status.

The second part, BackOff, means that Kubernetes will keep trying to pull the image, but with an increasing delay (back-off) between attempts. The back-off delay is exponential (10s, 20s, 40s, …) and is capped at five minutes. Good thing is: once a container has executed for 10 minutes without any problems, the kubelet resets the restart back-off timer for that container.
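To make the back-off schedule concrete, here is a minimal Python sketch that doubles the delay on every attempt and caps it at five minutes. The function name and its parameters are my own and not part of Kubernetes; only the 10s start, the doubling and the 300s cap come from the description above.

```python
def pull_backoff_delays(attempts, initial=10, cap=300):
    """Return the wait time in seconds before each retry.

    Illustrative only: the delay doubles after every attempt and
    never exceeds the cap (five minutes = 300 seconds).
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


print(pull_backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

So after roughly five failed attempts, you are already waiting the full five minutes between retries.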

So if the image can’t be pulled, the kubelet will report ImagePullBackOff. This also means that every node in the cluster needs to be able to get that image. If something prevents the container runtime from pulling an image onto the node, the kubelet will first report ErrImagePull and then ImagePullBackOff while it keeps trying.

Potential causes

Here are the most common ones:

  • Typo in the image name or tag. (Happens to me all the time)
  • Image or tag doesn’t exist. (Simple, but often true)
  • Your image registry requires authentication. (Gets me every time on Azure Container Registry)
  • You hit a rate or download limit on your registry. (Docker Hub, for example, has introduced pull rate limits)
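For the first two causes, it helps to see what a short image name actually expands to. The sketch below splits an image reference into registry, repository and tag, using the usual Docker-style defaults (docker.io, library/ and latest). The function is my own simplified helper, not a Kubernetes API, and myregistry.azurecr.io is just a made-up registry name.

```python
def parse_image_ref(ref):
    """Split an image reference into (registry, repository, tag, digest).

    Simplified sketch of the Docker-style defaulting rules.
    """
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)

    # A colon only marks a tag if it comes after the last slash;
    # otherwise it belongs to a registry port such as localhost:5000.
    tag = None
    colon = ref.rfind(":")
    if colon > ref.rfind("/"):
        ref, tag = ref[:colon], ref[colon + 1:]

    # The first path component is a registry host only if it looks like one.
    first, _, rest = ref.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry = "docker.io"
        repository = ref if "/" in ref else "library/" + ref

    if tag is None and digest is None:
        tag = "latest"
    return registry, repository, tag, digest


print(parse_image_ref("nginx"))
print(parse_image_ref("notexits/image:0.1.0"))
print(parse_image_ref("myregistry.azurecr.io/team/app:1.2"))
```

Seeing the expanded form often makes a typo, or a missing registry host, obvious at first glance.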

How can you troubleshoot the situation?

Make sure that you can pull the image. Below are some of the steps I take to verify that the image is pullable:

  • I try to pull the image from my local machine. Back to the good old "works on my machine" setup!

  • Sometimes I ssh into a node in the cluster and try to pull the image from there.

  • In cases where I am not using a registry (e.g. the image is built in the cluster), I verify that the image exists on the particular node.

  • And of course I check that there are any pods running at all on that particular node. It could be an issue with the node or with the whole cluster.

The next thing to check is kubectl describe pod <pod name>. I am particularly interested in the events output.

 kubectl describe pod/nginx

So in this example, I get the following error message:

 Warning  Failed     6s    kubelet            Failed to pull image "notexits/image:0.1.0": rpc error: code = Unknown desc = Error response from daemon: pull access denied for notexits/image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

I can clearly see that I misspelled the image name, since the image is not supposed to come from a private registry.
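If you look at these event messages often, a small lookup table can speed up the first guess. The Python sketch below maps a few substrings that registries commonly return to the causes listed earlier. The table is my own and far from complete, and the exact wording differs between registries and container runtimes, so treat the result as a hint only.

```python
# Substring -> likely cause. Illustrative and incomplete by design.
HINTS = [
    ("toomanyrequests", "rate limit reached on the registry"),
    ("manifest unknown", "the tag does not exist in the repository"),
    ("unauthorized", "missing, wrong or expired registry credentials"),
    ("pull access denied", "typo in the image name, or a private repository without a pull secret"),
]


def likely_cause(event_message):
    """Return a first guess for a 'Failed to pull image' event message."""
    lowered = event_message.lower()
    for needle, hint in HINTS:
        if needle in lowered:
            return hint
    return "no hint, read the full message and check the kubelet logs"


message = (
    'Failed to pull image "notexits/image:0.1.0": rpc error: code = Unknown '
    "desc = Error response from daemon: pull access denied for notexits/image, "
    "repository does not exist or may require 'docker login'"
)
print(likely_cause(message))
```

For the message from my example, this points at a typo or a missing pull secret, which matches what I found by reading it.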

Speaking about private registries:

Another good thing to verify is that the image pull secret is present and correct, and that your deployment or pod manifest references it.

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regcred
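To check both halves (secret content and manifest reference), you can decode the secret and compare it with the Pod spec. The Python sketch below assumes you already have both objects as dictionaries, for example loaded from the output of kubectl get ... -o json. The helper names, the registry host and the credentials are made up for illustration; only regcred and the Pod layout come from the manifest above.

```python
import base64
import json


def registries_in_secret(secret):
    """List the registry hosts a dockerconfigjson pull secret has credentials for."""
    if secret.get("type") != "kubernetes.io/dockerconfigjson":
        raise ValueError("not a kubernetes.io/dockerconfigjson secret")
    raw = base64.b64decode(secret["data"][".dockerconfigjson"])
    return sorted(json.loads(raw).get("auths", {}))


def pod_references_secret(pod, secret_name):
    """Check that the Pod spec lists the secret under imagePullSecrets."""
    refs = pod.get("spec", {}).get("imagePullSecrets", [])
    return any(ref.get("name") == secret_name for ref in refs)


# Hypothetical sample data, shaped like the JSON that kubectl returns.
docker_config = {"auths": {"myregistry.azurecr.io": {"username": "user", "password": "pass"}}}
secret = {
    "type": "kubernetes.io/dockerconfigjson",
    "metadata": {"name": "regcred"},
    "data": {".dockerconfigjson": base64.b64encode(json.dumps(docker_config).encode()).decode()},
}
pod = {"spec": {"imagePullSecrets": [{"name": "regcred"}]}}

print(registries_in_secret(secret))           # ['myregistry.azurecr.io']
print(pod_references_secret(pod, "regcred"))  # True
```

If the registry host of your image is not in that list, or the Pod does not reference the secret, you have most likely found your problem.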

The last thing I do during my troubleshooting sessions is to check the kubelet logs and, if I need more output, increase the log level.

Environment="KUBELET_LOG_LEVEL=2"

Reminder: log levels 0-4 give you debug-level logs, and levels 5-8 give you trace-level logs.

Important to mention: this may not always be possible, especially when you are using a managed service.

Wrap up

  • Your image must be accessible from every node in your Kubernetes cluster

  • Reasons for ImagePullBackOff can be typos, wrong tag names or missing/wrong/expired private registry credentials.

  • If no images can be pulled, there might be a problem with your network setup.
