
We are using an NLB in AWS connected to our EKS cluster via an nginx ingress controller. Some of our requests randomly fail with a 504 Gateway Timeout.

We think we have traced the problem to our nginx ingress. Based on some Stack Overflow recommendations, we experimented with the Connection header:

  1. We set Connection "close"; this had no effect.
  2. We set Connection "keep-alive"; again, no effect (a sketch of how we set the header is below).
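
For reference, this is roughly how we have been setting the header, using the ingress-nginx connection-proxy-header annotation on one of our Ingress resources (the names, host, and backend below are placeholders, not our real manifests):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  annotations:
    # attempt 1: force the upstream Connection header to "close"
    # (attempt 2 was the same annotation with "keep-alive")
    nginx.ingress.kubernetes.io/connection-proxy-header: "close"
spec:
  ingressClassName: nginx
  rules:
  - host: example.internal
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-app
            port:
              number: 80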

We also noticed another behavior with proxy_read_timeout: when it was 60 seconds, the affected requests from the browser would complete at 60.xx seconds. When we reduced it to 30 they took 30.xx seconds, and at 20 they took 20.xx seconds. We went all the way down to 1 but still get random 504 Gateway Timeouts, and we do not understand why proxy_read_timeout behaves this way in our environment.

We want to understand what the effect of proxy_read_timeout is and why we get the behavior above. Also, is there a way to set Connection to "" on our nginx ingress? (We are not able to do this via nginx.ingress.kubernetes.io/connection-proxy-header: "".)
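
For context, proxy_read_timeout can be tuned either per Ingress via the nginx.ingress.kubernetes.io/proxy-read-timeout annotation or globally via the controller ConfigMap. A minimal sketch of the ConfigMap route, with placeholder names rather than our exact setup:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller    # placeholder; use the ConfigMap your controller actually reads
  namespace: ingress-nginx
data:
  proxy-read-timeout: "30"          # seconds; we have tried 60, 30, 20, and even 1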

Thanks in advance!


3 Answers


We think our issue was related to this:

https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-troubleshooting.html#loopback-timeout

We're using an internal NLB with our nginx ingress controller, with targets registered by instance ID. We found that the 504 timeouts and the X-second waits were only occurring on applications that were sharing a node with one of our ingress controller replicas. We used a combination of nodeSelectors, labels, taints, and tolerations to force the ingress controllers onto their own node, and it appears to have eliminated the timeouts.
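
Roughly what that looked like, as a sketch (the label key, taint key, and values are placeholders, not our exact names): taint the node reserved for ingress, then give the ingress controller Deployment a matching nodeSelector and toleration.

# 1) Label and taint the dedicated node:
#      kubectl label node <node-name> role=ingress
#      kubectl taint node <node-name> dedicated=ingress:NoSchedule
#
# 2) In the nginx ingress controller Deployment's pod template spec:
spec:
  nodeSelector:
    role: ingress
  tolerations:
  - key: dedicated
    operator: Equal
    value: ingress
    effect: NoSchedule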

We also changed our externalTrafficPolicy setting to Local.
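
That setting lives on the controller's LoadBalancer Service; a minimal sketch of what it roughly looks like (the name, namespace, selector, and NLB annotation are placeholders for the real Service):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # only route to ingress pods on the node that received the traffic
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443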


I had the same issue as J. Koncel: the applications sharing a node with the nginx ingress controller were the only ones that got the 504 timeouts.

Instead of using nodeSelectors and taints/tolerations, I used Pod anti-affinity: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity.

I added a label to the spec for my nginx-ingress-controller:

podType: ingress
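
Concretely, the label goes on the controller's pod template so the anti-affinity rule below has something to match. A sketch showing only the relevant lines (the Deployment name and other labels are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
spec:
  template:
    metadata:
      labels:
        app: nginx-ingress-controller
        podType: ingress        # the label the podAntiAffinity rule keys on
    spec:
      containers:
      - name: nginx-ingress-controller
        # ... existing container spec unchanged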

Then I updated the YAML files for the applications that should not be scheduled on the same instance as the nginx-ingress-controller to include this:

affinity:
  podAntiAffinity:
    # require that this pod is not scheduled onto a node that is already
    # running a pod labeled podType=ingress (i.e. the ingress controller)
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: podType
          operator: In
          values:
          - ingress
      topologyKey: "kubernetes.io/hostname"

I am not able to comment yet, but the following command should help you set externalTrafficPolicy (where nodeport is the name of the Service being patched):

kubectl patch svc nodeport -p '{"spec":{"externalTrafficPolicy":"Local"}}'