32

I am trying to serve a single robots.txt for all virtual hosts under the nginx HTTP server. I was able to do this in Apache by putting the following in the main httpd.conf:

<Location "/robots.txt">
    SetHandler None
</Location>
Alias /robots.txt /var/www/html/robots.txt

I tried something similar with nginx by adding the lines below (a) directly in nginx.conf and (b) via include conf.d/robots.conf:

location ^~ /robots.txt {
        alias /var/www/html/robots.txt;
}

I have also tried the '=' modifier, and even put the block in one of the virtual hosts to test it. Nothing seemed to work.

What am I missing here? Is there another way to achieve this?

masegaloeh
  • 18,498
anup
  • 747

5 Answers

103

You can set the contents of the robots.txt file directly in the nginx config:

location = /robots.txt { return 200 "User-agent: *\nDisallow: /\n"; }

It is also possible to add the correct Content-Type:

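location = /robots.txt {
   # return takes its Content-Type from default_type; add_header would
   # send a second Content-Type header instead of replacing the first
   default_type text/plain;
   return 200 "User-agent: *\nDisallow: /\n";
}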
13

Are there other rules defined? Maybe common.conf or another included conf file is overriding your config. One of the following should work:

location /robots.txt { alias /home/www/html/robots.txt; }
location /robots.txt { root /home/www/html/;  }
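Keep in mind how nginx chooses among locations:
  1. Nginx checks all "regexp" locations in order of their appearance. If any "regexp" location matches, Nginx uses this first match. If no "regexp" location matches, Nginx uses the longest matching ordinary (prefix) location found in the previous step.
  2. "regexp" locations have precedence over "prefix" locations, unless the prefix location uses the ^~ modifier. An exact "=" match beats both.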
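To illustrate the matching order described above, here is a minimal sketch. It assumes a vhost (the name example.com and the path /var/www/example are made up) that already has a regex location catching .txt files, which is one way a robots.txt rule can end up overridden:

```nginx
server {
    listen 80;
    server_name example.com;          # hypothetical host name

    # Regex location: this would win over a plain prefix location
    # such as "location /robots.txt { ... }".
    location ~* \.txt$ {
        root /var/www/example;        # hypothetical per-site document root
    }

    # Exact match: checked before any regex location and ends the search,
    # so /robots.txt is always served from the shared file.
    location = /robots.txt {
        alias /var/www/html/robots.txt;
    }
}
```

If the shared rule is written as a plain prefix location instead, the regex block above takes the request, which would look like "nothing works" from the outside.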
user79644
  • 656
7

location cannot be used inside the http block, and nginx does not have global aliases (i.e., aliases that apply to all vhosts). Save your global definitions in a folder and include them in each server block:

server {
  listen 80;
  root /var/www/html;
  include /etc/nginx/global.d/*.conf;
}
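As a sketch of what one of the included files might contain, assuming a file named robots.conf (the file name is my choice, not something nginx requires) placed in the /etc/nginx/global.d/ directory used above:

```nginx
# /etc/nginx/global.d/robots.conf  (hypothetical file name)
# Included inside each server block, so the location directive
# ends up in a context where it is allowed.
location = /robots.txt {
    alias /var/www/html/robots.txt;
}
```

Every server block that carries the include line then serves the same file. Vhosts without the include are unaffected.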
user79644
  • 656
1

You could also just serve it directly:

location /robots.txt {
   return 200 "User-agent: *\nDisallow: /\n";
}
-1

I had the same issue with ACME challenges, and the same principle applies to your case.

To solve it, I moved all my sites to a non-standard port (I picked 8081) and created a virtual server listening on port 80. It proxies all requests to 127.0.0.1:8081, except those to .well-known. This acts almost like a global alias, with one extra hop, but that shouldn't cause a significant drop in performance thanks to the async nature of nginx.

upstream nonacme {
  server 127.0.0.1:8081;
}

server {
  listen 80;

  access_log  /var/log/nginx/acme-access.log;
  error_log   /var/log/nginx/acme-error.log;

  location /.well-known {
    root /var/www/acme;
  }

  location / {
    proxy_set_header    Host                $http_host;
    proxy_set_header    X-Real-IP           $remote_addr;
    proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
    proxy_set_header    X-Forwarded-Proto   $scheme;
    proxy_set_header    X-Frame-Options     SAMEORIGIN;

    # WebSocket support (nginx 1.4)
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    proxy_pass http://nonacme;
  }
}
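Adapted to the robots.txt question, the same front-server idea might look like the sketch below. It assumes the real vhosts already listen on 127.0.0.1:8081 as in the config above, and that the shared file lives at /var/www/html/robots.txt as in the question:

```nginx
server {
    listen 80 default_server;

    # Answered here for every host name, never reaches the backend vhosts.
    location = /robots.txt {
        alias /var/www/html/robots.txt;
    }

    # Everything else goes to the real vhosts on the internal port.
    location / {
        proxy_set_header Host            $http_host;
        proxy_set_header X-Real-IP       $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://127.0.0.1:8081;
    }
}
```

The Host header is passed through so that the backend can still pick the right vhost by name.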