A pod in my Kubernetes cluster has been stuck in "ContainerCreating" since I ran kubectl create. How do I see logs for this operation in order to diagnose why it is stuck? kubectl logs doesn't seem to work, since the container needs to be in a non-pending state.
6 Answers
kubectl describe pods will list some (probably most, but not all) of the events associated with the pod, including the pulling of images and the starting of containers.
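For example, to see the events for a single stuck pod (the pod name and namespace here are placeholders; substitute your own):
$ kubectl describe pod my-pod -n my-namespace
The Events section at the end of the output usually names the step that is blocking, such as an image pull or a volume mount.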
More info may be available in the cluster's events:
kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp'
However, do note that sorting events might not work correctly due to this bug: https://github.com/kubernetes/kubernetes/issues/29838
Alternatively:
As of Kubernetes 1.18 all new objects have metadata for server-side apply, which gives us a new way to sort events:
kubectl get events --sort-by=".metadata.managedFields[0].time"
From: https://github.com/kubernetes/kubernetes/issues/29838#issuecomment-789660546
In my case I had an event relating to a pod:
default 13s Warning FailedMount Pod Unable to mount volumes for pod "restore-db-123-1-5f24s_default(9b7df264-2976-11ea-bb8f-42010a9a002c)": timeout expired waiting for volumes to attach or mount for pod "default"/"restore-db-123-1-5f24s". list of unmounted volumes=[nfsv]. list of unattached volumes=[nfsv default-token-hxrng]
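If you see a FailedMount event like this one, a reasonable next step is to inspect the volume objects it names (the claim name below is a placeholder):
$ kubectl get pv,pvc
$ kubectl describe pvc my-claim
The describe output shows whether the claim is bound and lists any events blocking attachment.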
In my case, docker's access to the internet was blocked. It was solved by using a proxy (per sandylss's comment):
minikube stop
minikube delete
export http_proxy=http://user:pass@ip:port
export https_proxy=http://user:pass@ip:port
export no_proxy=192.168.99.0/24
minikube start --logtostderr --v=0 --bootstrapper=localkube --vm-driver hyperv \
  --hyperv-virtual-switch "Primary Virtual Switch" \
  --docker-env HTTP_PROXY=$http_proxy \
  --docker-env HTTPS_PROXY=$https_proxy \
  --docker-env NO_PROXY=$no_proxy
export no_proxy=$no_proxy,$(minikube ip)
export NO_PROXY=$no_proxy,$(minikube ip)
Then, to check whether docker has access to the internet, run:
$ docker pull tutum/hello-world
in the cluster (connect to the cluster using minikube ssh); stop the process if it starts downloading.
My second problem was a slow internet connection. Since the required docker images are on the order of 100 MB, both the docker containers and the Kubernetes pods remained in the pause and ContainerCreating states for 30 minutes.
To check if docker is downloading the images, run:
$ ls -l /var/lib/docker/tmp
in the cluster, which shows the temporary image files that are being downloaded, and is empty otherwise.
If you are developing in minikube and using a VPN, docker can use your VPN via Fiddler. That is, docker will be connected to Fiddler's ip:port, and Fiddler is connected to the VPN. Otherwise, the VPN is not shared between your host and the minikube VM.
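As a minimal sketch of that setup, assuming Fiddler is listening on its default port 8888 and <host-ip> is your host's address as seen from the minikube VM:
$ minikube start --docker-env HTTP_PROXY=http://<host-ip>:8888 \
    --docker-env HTTPS_PROXY=http://<host-ip>:8888
This routes the docker daemon's image pulls through Fiddler, which in turn goes through the VPN on the host.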
In my case, a pod was stuck at 'ContainerCreating' because a docker image pull was hung (some layers were downloaded, some were stuck at "downloading").
$ kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp'
showed an event "Pulling image"
I tried to pull that image using docker image pull ... and saw that it was hanging.
It turned out that there was a bug in concurrent pulls of layers; changing the docker config to limit concurrency solved the problem.
I added this to the docker config (on Windows: Docker Desktop UI, Settings, Docker Engine):
"max-concurrent-downloads": 1,
"max-concurrent-uploads": 1
The one time I hit this, it was because my resource declarations were accidentally very, very small.
resources:
  limits:
    cpu: 1000m
    memory: 1024M
  requests:
    cpu: 1000m
    memory: 1024M
vs
resources:
  limits:
    cpu: 1000m
    memory: 1024m
  requests:
    cpu: 1000m
    memory: 1024m
Capitalizing that m makes a very large difference in resource use: in Kubernetes resource quantities, a lowercase m is the "milli" suffix, so 1024m is roughly one byte of memory, whereas 1024M is 1024 megabytes. I was stuck on ContainerCreating because I had not given enough memory to my container.
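A quick way to check what the API server actually stored for the pod's first container (the pod name is a placeholder):
$ kubectl get pod my-pod -o jsonpath='{.spec.containers[0].resources}'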
I've been there! I had a similar issue where my StatefulSet pod was stuck in the ContainerCreating status. After spending about three hours troubleshooting, I decided to cordon the node where the NFS server was located and redeploy it to another node.
This did the trick for me! My service started running smoothly once the volume was correctly bound to the new NFS server location.
Hope this helps!
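For reference, a sketch of that sequence (the node and pod names are placeholders; cordoning only prevents new pods from being scheduled onto the node):
$ kubectl cordon my-node
$ kubectl delete pod my-nfs-server-0
Deleting the StatefulSet pod lets its controller recreate it, and the cordon forces the scheduler to place the replacement on a different node.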