I have multiple physical sub-domains, and I don't want to change the robots.txt file of any of those sub-domains.

Is there any way to disallow all the sub-domains from my main domain's physical robots.txt file, without touching any sub-domain's own physical file?

Is there any common server (Apache) file that covers all the sub-domains as well as the main domain?

unor

2 Answers

It's impossible to say anything about subdomain.example.com in example.com/robots.txt.

robots.txt has a very limited syntax, e.g.

User-agent: Google
Disallow: /administrator

User-agent: *
Disallow: /

Here User-agent: identifies the crawler and Disallow: gives a path relative to the server root. In this example Google is allowed to crawl everything except /administrator, while everything is disallowed for the rest. As always with robots.txt, it doesn't enforce anything; it's merely a polite request not to go there.

The syntax simply has no place for a subdomain, and a web robot only looks for /robots.txt on the host it is crawling, i.e. subdomain.example.com/robots.txt, not example.com/robots.txt.
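To illustrate the per-host behaviour, here is a small sketch using Python's standard urllib.robotparser (the hostnames are illustrative). The parser is bound to one robots.txt URL; rules loaded for subdomain.example.com say nothing about any other host:

```python
from urllib.robotparser import RobotFileParser

# A crawler resolves robots.txt per host: rules served by example.com
# never apply to subdomain.example.com (hostnames are illustrative).
rp = RobotFileParser()
rp.set_url("https://subdomain.example.com/robots.txt")

# rp.read() would fetch that exact URL over the network;
# parse() lets us feed the same rules in directly for the demo.
rp.parse([
    "User-agent: *",
    "Disallow: /",
])

print(rp.can_fetch("*", "https://subdomain.example.com/page"))  # False
```

Note that set_url() only records where the file lives; to govern another sub-domain you would need a second parser pointed at that sub-domain's own /robots.txt.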

Esa Jokinen

Assuming that by 'domain' you mean something like example.com, and by 'sub-domain' you mean blerf.example.com, then I believe the answer is 'you can't do that'.

The problem is that when a crawler tries to crawl blerf.example.com, it looks at blerf.example.com/robots.txt to see what it shouldn't crawl. It doesn't look at example.com/robots.txt, because that's a different host.
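The mapping the crawler applies can be sketched in a few lines: the robots.txt location is derived purely from the scheme and host of the page being fetched, so the parent domain never enters into it (hostnames are illustrative):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    # Crawlers derive the robots.txt location from the scheme and
    # host of the URL being fetched; the parent domain is irrelevant.
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://blerf.example.com/some/page"))
# https://blerf.example.com/robots.txt
```

So a rule placed at example.com/robots.txt is simply never consulted for pages under blerf.example.com.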

One explanation of robots.txt operation can be found at http://www.robotstxt.org/robotstxt.html.