Introduction
I have an issue with a pod in a StatefulSet: it was terminated, stays in the Completed state, and is not restarted.
I will describe the situation with a concrete example and provide status and log data to help analyze the issue.
Background: Our Installation
We run MongoDB 7.0.7 on an AKS (Azure Kubernetes Service) cluster, deployed as a StatefulSet.
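For context, here is a minimal sketch of the relevant parts of our manifest (names, replica count, and resource values are placeholders, not our actual configuration):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb          # placeholder headless service name
  replicas: 1                   # placeholder replica count
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      restartPolicy: Always     # the default for pod templates; shown explicitly
      containers:
      - name: mongodb
        image: mongodb:7.0.7
        resources:
          requests:
            memory: "2Gi"       # placeholder value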
Analysis
In the status section of the pod's YAML (see below) I can see that the container was terminated with exit code 0, but I can't see a reason why the pod isn't restarted. We do have restartPolicy: Always set.
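To double-check this, I queried the pod's phase and restart policy directly (standard kubectl jsonpath queries; the pod name is from our setup):

kubectl get pod mongodb-0 -o jsonpath='{.status.phase}'
# a pod displayed as Completed reports the phase Succeeded here
kubectl get pod mongodb-0 -o jsonpath='{.spec.restartPolicy}'
# prints: Always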
In the kubelet logs (also below) I can see that the pod was evicted (because of NodeHasInsufficientMemory).
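The eviction should also show up as events; this is how I would pull them (namespace omitted, and note that events expire after about an hour by default):

kubectl get events --field-selector involvedObject.name=mongodb-0 --sort-by=.lastTimestamp
kubectl describe pod mongodb-0   # the Events section at the end also lists the eviction, if still retained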
My understanding of Kubernetes is that an evicted pod
- should always be rescheduled, so that the StatefulSet keeps its desired count of healthy pods,
- remains in the Pending state as long as not enough resources are available, and
- is created again as soon as enough resources are available (see the replica check below).
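To compare the controller's desired count with what is actually ready, this is the check I would run (StatefulSet name assumed to be mongodb):

kubectl get statefulset mongodb -o jsonpath='{.spec.replicas} {.status.readyReplicas}'
# prints desired vs. ready replicas; a gap means the controller still owes us a pod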
Question
- Is my understanding correct?
- If so, why isn't the pod rescheduled?
- Which Kubernetes component is responsible for the rescheduling?
- Can I find more logs about the rescheduling? (see the sketch below)
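Regarding the last question, a sketch of where I would look (on AKS the control plane is managed, so kube-scheduler and kube-controller-manager logs are only available via Azure diagnostic settings, not on the nodes):

# cluster-level events mentioning the pod or the StatefulSet
kubectl get events --sort-by=.lastTimestamp | grep -i mongodb
# on AKS: enable the kube-scheduler and kube-controller-manager log categories
# in the cluster's diagnostic settings and query them in Log Analytics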
Status YAML
status:
  conditions:
  ...
  - lastProbeTime: null
    lastTransitionTime: "2024-04-30T23:48:10Z"
    reason: PodCompleted
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-04-30T23:48:10Z"
    reason: PodCompleted
    status: "False"
    type: ContainersReady
  ...
  containerStatuses:
  - containerID: containerd://12345
    image: mongodb:7.0.7
    imageID: mongodb@sha256:12345
    lastState: {}
    name: mongodb
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://123345
        exitCode: 0
        finishedAt: "2024-04-30T23:48:09Z"
        reason: Completed
        startedAt: "2024-04-24T17:55:38Z"
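For reference, the snippet above is an excerpt of the full pod YAML; the terminated state alone can be extracted like this:

kubectl get pod mongodb-0 -o yaml
kubectl get pod mongodb-0 -o jsonpath='{.status.containerStatuses[0].state.terminated}'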
Kubelet Log
kubectl debug node/mynode -it --image=busybox
chroot /host
journalctl -u kubelet -o cat | grep mongodb
...
... kuberuntime_container.go ... : "Killing container with a grace period" pod="mongodb-0" ...
... eviction_manager.go:427] "Eviction manager: pods successfully cleaned up" pods=[mongodb-0]
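Widening the grep beyond the pod name shows more of the eviction manager's output (a sketch; the exact messages vary by kubelet version):

journalctl -u kubelet -o cat | grep -iE 'evict|memory'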