Is it a good idea to ban amazonaws.com?

Question

My site is being crawled by an anonymous bot hosted on Amazon EC2. The bot doesn't respect robots.txt and puts a high load on the web server, so I added a check: if the reverse DNS name for the request's IP ends with "amazonaws.com", the server immediately returns a 403 page.

This solved the problem, but could it cause other problems? EC2 may be used by some "good" bots, and this would block their access too. Can you give examples of such problems?
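For reference, the check described above could be sketched roughly like this in Python. This is only an illustration, not the code actually running on the server: the function names `is_blocked_host` and `status_for_ip` are made up for the example, and it assumes a failed reverse lookup should let the request through.

```python
import socket


def is_blocked_host(hostname, suffix="amazonaws.com"):
    """Return True if hostname is the suffix domain or a subdomain of it."""
    host = hostname.rstrip(".").lower()
    # Match on a label boundary so "notamazonaws.com" is not caught.
    return host == suffix or host.endswith("." + suffix)


def status_for_ip(ip, suffix="amazonaws.com"):
    """Return 403 if the reverse DNS name of ip falls under suffix, else 200."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # No PTR record or the lookup failed: assumed here to mean "allow".
        return 200
    return 403 if is_blocked_host(hostname, suffix) else 200
```

Matching on ".amazonaws.com" rather than a bare "ends with" avoids blocking unrelated domains whose names merely end in the same letters. Note that the reverse lookup runs on every request unless the result is cached.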

score 5 · Accepted Answer · answered Sep 15 '11 at 19:26


Amazon EC2 is a hosting platform; Amazon doesn't directly control what its customers host there. If you block the whole *.amazonaws.com domain, you will block access for every service hosted on EC2, and there are quite a lot of those these days.


George Hewitt


score 1 · Answer 2 · edited Apr 13 '17 at 12:14


Check out this similar question, which shows how to block by user agent directly in the .htaccess file. This is useful for robots that ignore your robots.txt rules:

Blocking by user-agent string in httpd.conf not effective

You would put that rule in either the httpd.conf file or a .htaccess file.
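As a rough illustration of the idea (written in Python rather than Apache configuration, so it is not what the linked question uses), blocking by user agent comes down to matching the request's User-Agent header against a list of patterns. The names `BLOCKED_AGENTS` and `status_for_user_agent` and the patterns "badbot" and "evilscraper" are placeholders, not real crawler names.

```python
import re

# Hypothetical patterns; replace with the strings the offending bot actually sends.
BLOCKED_AGENTS = re.compile(r"badbot|evilscraper", re.IGNORECASE)


def status_for_user_agent(user_agent):
    """Return 403 if the User-Agent matches a blocked pattern, else 200."""
    if user_agent and BLOCKED_AGENTS.search(user_agent):
        return 403
    # Missing or non-matching User-Agent headers are allowed through.
    return 200
```

This only helps if the crawler sends a distinctive User-Agent string; a bot that ignores robots.txt may also send a generic or forged one.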

Good luck.


answered Sep 15 '11 at 19:26

U4iK_HaZe

