72

I have a server which was working OK until 3rd Oct 2013 at 10:50 am, when it began to intermittently return "502 Bad Gateway" errors to the client.

Approximately 4 out of 5 browser requests succeed but about 1 in 5 fail with a 502.

The nginx error log contains many hundreds of these errors:

2013/10/05 06:28:17 [error] 3111#0: *54528 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 66.249.66.75, server: www.bec-components.co.uk, request: "GET /?_n=Fridgefreezer/Hotpoint/8591P;_i=x8078 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.bec-components.co.uk"

However, the PHP error log does not contain any matching errors.

Is there a way to get PHP to give me more info about why it is resetting the connection?

This is nginx.conf:

user              www-data;
worker_processes  4;
error_log         /var/log/nginx/error.log;
pid               /var/run/nginx.pid;

events {
   worker_connections  1024;
}

http {
  include          /etc/nginx/mime.types;
  access_log       /var/log/nginx/access.log;

  sendfile               on;
  keepalive_timeout      30;
  tcp_nodelay            on;
  client_max_body_size   100m;

  gzip         on;
  gzip_types   text/plain application/xml text/javascript application/x-javascript text/css;
  gzip_disable "MSIE [1-6]\.(?!.*SV1)";

  include /gvol/sites/*/nginx.conf;

}

And this is the .conf for this site:

server {

  server_name   www.bec-components.co.uk bec3.uk.to bec4.uk.to bec.home;
  root          /gvol/sites/bec/www/;
  index         index.php index.html;

  location ~ \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires        2592000;   # 30 days
    log_not_found  off;
  }

  ## Trigger client to download instead of display '.xml' files.
  location ~ \.xml$ {
    add_header Content-disposition "attachment; filename=$1";
  }

   location ~ \.php$ {
      fastcgi_read_timeout  3600;
      include               /etc/nginx/fastcgi_params;
      keepalive_timeout     0;
      fastcgi_param         SCRIPT_FILENAME  $document_root$fastcgi_script_name;
      fastcgi_pass          127.0.0.1:9000;
      fastcgi_index         index.php;
   }
}

## bec-components.co.uk ##
server {
   server_name   bec-components.co.uk;
   rewrite       ^/(.*) http://www.bec-components.co.uk$1 permanent;
}
ivanleoncz

13 Answers

31

I'd always trust my web server when it tells me 502 Bad Gateway.

  • What is the uptime of your FastCGI/NGINX process?
  • Do you monitor network connections?
  • Can you confirm or rule out a change in visitor traffic around that day?

What the error means

Your FastCGI process is not accessible to NGINX; it is either too slow or not responding at all. Bad Gateway means that NGINX cannot complete the fastcgi_pass step to the resource listening on 127.0.0.1:9000 at that very specific moment.

Your initial error log tells it all:

recv() failed 
    -> nginx failed

(104: Connection reset by peer) while reading response header from upstream,
    -> no complete answer, or no answer at all

upstream: "fastcgi://127.0.0.1:9000",
    -> this is who failed

From my limited point of view, I'd suggest:

  • To restart your FastCGI process or server
  • To check your access.log
  • To enable the debug log
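
A rough sketch of those steps, assuming a Debian-style setup and the log paths from the question (commands and paths may differ on your distribution):

sudo service php5-fpm restart                 # restart the FastCGI process
tail -f /var/log/nginx/access.log             # watch incoming requests while reproducing the 502s
# For more detail from nginx, raise the log level in nginx.conf, e.g.:
#   error_log  /var/log/nginx/error.log debug;   (debug level needs nginx built with --with-debug)
sudo nginx -t && sudo service nginx reload    # validate the config and reload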
Eddie C.
20

I know this topic is old, but it still continues to pop up occasionally, so, looking for answers on the web, I came up with the following three possibilities:

  1. A programming error is sometimes segfaulting php-fpm, which in turn means that the connection with nginx will be severed. This will usually leave at least some logs around and/or core dumps, which can be analysed further.
  2. For some reason, PHP is not able to write a session file (usually: session.save_path = "/var/lib/php/sessions"). This can be bad permissions, bad ownership, the wrong user/group, or more esoteric/obscure issues like running out of inodes on that directory (or even a full disk!). This will usually not leave many core dumps around and possibly nothing in the PHP error logs either (see the quick checks after this list).
  3. Even more tricky to debug: an extension is misbehaving (occasionally hitting some kind of inner limit, or a bug which is not triggered all the time), segfaulting, and bringing the php-fpm process down with it — thus closing the connection with nginx. The usual culprits are APC, memcache/d, etc. (in my case it was the New Relic extension), so the idea here is to turn each extension off until the error disappears.
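
A few quick checks for possibilities 1 and 2, assuming the default Debian session path mentioned above (adjust paths to your setup):

dmesg | grep -i segfault               # possibility 1: look for php-fpm segfaults in the kernel log
ls -ld /var/lib/php/sessions           # possibility 2: ownership and permissions of the session directory
df -h /var/lib/php/sessions            # free disk space on that filesystem
df -i /var/lib/php/sessions            # free inodes on that filesystem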
11

I kept getting this as well. I solved it by increasing the opcache memory limit, if you use opcache (the replacement for APC). It seems PHP-FPM dropped connections whenever the cache got too full. This is also why shgnInc's answer fixes it for a short time.

So find the file /etc/php5/fpm/php.ini (or its equivalent in your distribution) and increase opcache.memory_consumption to whatever level your site needs. Disabling opcache may also work.

[opcache]
opcache.memory_consumption = 196 
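
To verify the value the running PHP actually sees, a quick check (assuming the CLI and FPM read the same setting, which is not always the case):

php -i | grep opcache.memory_consumption   # value is in megabytes
sudo service php5-fpm restart               # opcache settings only take effect after a restart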
Manu
6

In my case of the same problem, I just restarted the php-fpm service and that solved it:

sudo service php5-fpm restart

Or sometimes this problem happens because of a huge number of requests. By default, pm.max_requests in php5-fpm may be 100 or lower.

To solve it, increase its value depending on your site's request volume, for example to 500.

And after that you have to restart the service.
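
As a sketch, assuming the default Debian pool file (the path differs between PHP versions):

sudo grep -n 'pm.max_requests' /etc/php5/fpm/pool.d/www.conf
# set, for example:  pm.max_requests = 500
sudo service php5-fpm restart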

shgnInc
3

In my case, disabling the xdebug extension helped.
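
One way to do that on Debian/Ubuntu-style installs (a sketch; module names and paths vary):

sudo php5dismod xdebug        # or comment out the zend_extension=...xdebug.so line in its ini file
sudo service php5-fpm restart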

Vasily
2

You may want to consider this gist on GitHub: https://gist.github.com/amichaelgrant/90d99d7d5d48bf8fd209

I encountered a similar situation. When I checked the error logs for my upstream servers, they were reporting a ulimit error, so I increased that limit to 1000000 (on both the upstream and nginx boxes) and everything worked fine.
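
For reference, one way to inspect and persistently raise the open-file limit (a sketch; the 1000000 figure is the value mentioned above, not a recommendation):

ulimit -n                                 # current per-process open-file limit in this shell
cat /proc/$(pgrep -o php5-fpm)/limits     # limits of the running php-fpm master, if present
# To raise it persistently, e.g. in /etc/security/limits.conf:
#   www-data  soft  nofile  1000000
#   www-data  hard  nofile  1000000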

Michael Hampton
2

This issue may also arise if a PHP-FPM process exceeds its allocated memory limit. When this happens, the connection between NGINX and PHP-FPM is severed and NGINX returns a 502 Bad Gateway. The PHP-FPM process memory limit is controlled by the memory_limit variable. This can be set with php_admin_value[memory_limit] in the PHP-FPM configuration file.

It is important to note that the memory limit applies on a per-script basis. With n PHP-FPM processes, the total memory usage can be up to memory_limit * n. Be sure to check that your machine has sufficient memory headroom!
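
A minimal sketch of such an override, assuming an FPM pool file like www.conf (the 256M value is illustrative):

# in the pool configuration, e.g. /etc/php5/fpm/pool.d/www.conf:
#   php_admin_value[memory_limit] = 256M
sudo service php5-fpm restart      # restart FPM so the new limit applies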

Francis
1

If you are using multiple reverse proxies, you should be aware that nginx will send a connection reset in some situations. If, for instance, you are getting "n worker_connections are not enough" in your logs, that is the source of the connection reset. Each request passing through a reverse proxy requires two worker_connections, so if you don't account for that, your margin of safety may not be a margin at all.
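
A sketch of the corresponding nginx change (the 4096 figure is illustrative; size it against your expected concurrency):

# in nginx.conf:
#   events { worker_connections 4096; }
sudo nginx -t && sudo service nginx reload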

Jason
1

I just had a similar problem:

You connect to php-fpm on port 9000 (fastcgi://127.0.0.1:9000).

The standard configuration on my Ubuntu server is:

/etc/php/7.0/fpm/pool.d/www.conf:

listen = /run/php/php7.0-fpm.sock

You have to change this to:

listen = 0.0.0.0:9000

In my case, I had updated my server 1½ months ago, which overwrote my custom configuration with the default. Because php-fpm was only restarted later, the error took effect with a delay.
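
To confirm what php-fpm is actually listening on before and after the change, something like this can help (ss is assumed to be available; the service name is inferred from the paths above):

sudo ss -lntp | grep :9000             # is anything listening on TCP port 9000?
ls -l /run/php/php7.0-fpm.sock         # or is FPM still on the default unix socket?
sudo service php7.0-fpm restart        # apply the change to the pool's listen directive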

Martin K.
1

For me it was the server running out of memory and php-fpm getting killed by the OOM killer. The solution was to increase the amount of server memory.
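
A quick way to confirm OOM kills (a sketch; exact kernel messages vary):

dmesg | grep -i -E 'out of memory|killed process'   # look for the OOM killer hitting php-fpm
free -m                                             # current memory and swap headroom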

1

For me it was because php-fpm was hitting the pm.max_children limit. The php-fpm log for the pool in question pointed me in the right direction.
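
For reference, the warning usually looks like the line below, and the fix is to raise the pool setting (log path and value are assumptions):

sudo grep -i 'max_children' /var/log/php5-fpm.log
#   "[pool www] server reached pm.max_children setting (5), consider raising it"
# then raise it in the pool config, e.g.:  pm.max_children = 20
sudo service php5-fpm restart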

0

I had a similar issue: random "Connection reset by peer" errors when the server was under load. I eventually found it was due to a difference in keepalive_timeout values between nginx and the upstream (gunicorn in my case). nginx was at 75 s and the upstream was just a few seconds, so sometimes the upstream dropped the connection and nginx didn't understand why.

Increasing the upstream value to match nginx's solved the issue.
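
A sketch of aligning the two timeouts for a gunicorn upstream (app:app is a placeholder for your WSGI entry point; the 75 s value is the one mentioned above):

# nginx side:  keepalive_timeout 75s;
gunicorn app:app --keep-alive 75       # raise gunicorn's keep-alive from its 2 s default to match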

0

The culprit for me in 2023 was Tideways on PHP 8.2. Removing it made the PHP crash go away.

I'm looking for an alternative... probably back to xhprof (via PECL: https://pecl.php.net/package/xhprof).
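
If you go that route, a sketch of installing it from PECL (the ini path is an assumption for a Debian-style PHP 8.2 install):

sudo pecl install xhprof                                       # build the extension
echo "extension=xhprof.so" | sudo tee /etc/php/8.2/mods-available/xhprof.ini
sudo phpenmod xhprof && sudo service php8.2-fpm restart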

Yvan