
We just moved to Kubernetes, but the engineer who helped launch it went on paternity leave sooner than we had hoped (never trust a baby not to be eager!).

Now we're trying to do maintenance tasks and one-off work, but nodes are getting killed in the middle of things.

I've looked at using Kubernetes Jobs for this, but that's overkill. We don't want to write manifest files for everything.

We just need long-lived shell access to do this and that.

What's the pattern for this so your maintenance task doesn't get killed?

mlissner

1 Answer


We were eventually able to answer this by reading up on the rules for when a node gets terminated. According to the cluster autoscaler FAQ, there are a number of kinds of pods that can prevent the cluster autoscaler from removing a node. One of them is:

Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc).
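
For reference, the same FAQ also lists a per-pod annotation that blocks scale-down even for controller-backed pods. A minimal sketch of that alternative (the annotation key comes from the FAQ; the snippet itself is ours):

metadata:
  annotations:
    # tell the cluster autoscaler not to evict this pod when scaling down
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"

We didn't go that route, since a pod with no controller behind it gets the same protection without any annotation.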

So our solution is to create exactly that kind of pod via a manifest file: one with no controller behind it. This lets us have a pod named maintenance that sticks around and isn't killed by the cluster autoscaler:

---
apiVersion: v1
kind: Pod
metadata:
  name: maintenance
  namespace: blah
  labels:
    type: maintenance
spec:
  containers:
    - name: web
      image: whatever
      imagePullPolicy: IfNotPresent
      # Run an interactive shell and keep stdin/tty open so the
      # container stays alive and we can attach to it at any time.
      command: [bash]
      stdin: true
      tty: true
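
Assuming the manifest is saved as maintenance.yaml (a filename we picked for this example), getting a shell is then just:

kubectl apply -f maintenance.yaml
kubectl exec -it maintenance --namespace blah -- bash

When the work is done, kubectl delete pod maintenance --namespace blah removes the pod so the autoscaler can reclaim the node.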
mlissner