
I have installed Prometheus to monitor my installation and it is frequently raising alerts about CPU throttling.

The Prometheus alert rule that raises this alert is:

alert: CPUThrottlingHigh
expr: 100
  * sum by(container_name, pod_name, namespace) (increase(container_cpu_cfs_throttled_periods_total{container_name!=""}[5m]))
  / sum by(container_name, pod_name, namespace) (increase(container_cpu_cfs_periods_total[5m]))
  > 25
for: 15m
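
As an illustration of what this expression measures, here is a minimal Python sketch of the same ratio. The function name and the sample numbers are hypothetical; the two arguments stand in for the 5-minute increases of container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total.

```python
def throttled_percent(throttled_periods_delta, periods_delta):
    """Share of CFS periods in which the container was throttled, in percent.

    Returns None when no periods were counted (nothing to divide by).
    """
    if periods_delta == 0:
        return None
    return 100 * throttled_periods_delta / periods_delta


if __name__ == "__main__":
    # Hypothetical sample: 30 of 100 counted periods were throttled.
    pct = throttled_percent(30, 100)
    print(pct, "fires" if pct is not None and pct > 25 else "does not fire")
```

Note that the ratio counts throttled periods, not CPU time, so it says nothing about average CPU usage.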

If I look at one of the pods flagged by this alert, it does not seem to have any reason to be throttled:

$ kubectl top pod -n monitoring my-pod
NAME            CPU(cores)   MEMORY(bytes)   
my-pod          0m           6Mi

This pod has one container with the following resource settings:

Limits:
  cpu:     100m
  memory:  128Mi
Requests:
  cpu:     25m
  memory:  64Mi

And the node hosting this pod is not under heavy CPU load:

$ kubectl -n monitoring top node aks-agentpool-node-1
NAME                       CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
aks-agentpool-node-1       853m         21%       11668Mi         101%

In Grafana, the chart for this pod never goes above 0.000022 of CPU usage.

Why is it being throttled?

jobou

2 Answers


CPUThrottlingHigh is an alert defined by the kubernetes-mixin project. There is an open issue (#108) discussing this alert; I suggest reading all of its comments to better understand the problem.

In short: with low CPU limits, a spiky workload can have a low average CPU usage and still be throttled. The alert measures the share of CFS periods in which the container hit its quota, not its average usage.
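
To make this concrete, here is a rough Python sketch of a single-threaded container under a CFS quota. It is a simplified model, not the kernel's actual algorithm: it assumes a 100 ms period, that the quota per period is the CPU limit times the period, and that only periods with pending work are counted. The function name, the burst size (25 ms of CPU every 5 s) and the one-minute window are hypothetical.

```python
def simulate(limit_millicores, burst_cpu_ms, interval_ms, duration_ms, period_ms=100):
    """Return (periods, throttled_periods, average_millicores).

    interval_ms is assumed to be a multiple of period_ms.
    """
    # CPU time the container may use in each period (100m -> 10 ms per 100 ms).
    quota_ms = limit_millicores * period_ms / 1000
    pending = 0.0    # CPU work still waiting to run, in ms
    used = 0.0       # CPU time actually consumed, in ms
    periods = 0
    throttled = 0
    for start in range(0, duration_ms, period_ms):
        if start % interval_ms == 0:
            pending += burst_cpu_ms      # a new burst of work arrives
        if pending <= 0:
            continue                     # idle period: not counted (assumption)
        periods += 1
        run = min(pending, quota_ms)
        pending -= run
        used += run
        if pending > 0:
            throttled += 1               # quota exhausted with work left over
    return periods, throttled, used * 1000 / duration_ms


if __name__ == "__main__":
    periods, throttled, avg = simulate(100, 25, 5000, 60000)
    print(f"periods={periods} throttled={throttled} "
          f"throttled%={100 * throttled / periods:.1f} avg={avg}m")
```

In this model the container averages 5m of CPU, far below its 100m limit, yet two out of every three counted periods are throttled, which is well above the 25% alert threshold.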

Also take a look at issue #67577 in the Kubernetes project, which covers a kernel bug in CFS quotas that can cause unnecessary CPU throttling. The discussion is still open, and the Kubernetes project is even considering disabling CFS quotas for pods in the Guaranteed QoS class (see #70585 for reference).

Consider the following options:

  • Increase (or even remove) your container's CPU limits
  • Disable Kubernetes CFS quotas entirely (kubelet flag --cpu-cfs-quota=false)
  • Use a kernel version that contains the fix (torvalds/linux 512ac99)
Eduardo Baitello

To piggyback on @eduardo-baitello's answer, another option is to increase the CPUThrottlingPercent config here.