5

My blog is a custom Ruby/Rack application that has been crashing randomly every couple of weeks. I sometimes don't notice for days, and I'd like to be notified immediately when it happens.

What's the best way to do it? I'm running CentOS 5.3, Nginx, Passenger, Rack, etc.

I've considered figuring out some way to email myself the tail of my error log, as that would help me catch EVERYTHING, not just that one app (it would tell me of missing links, etc). Is there an easy way to do that?

Thanks!

8 Answers

6

If you need an alert when your site goes down, you should consider an online notification service: it checks your site from the outside, the way your visitors see it.

If you monitor from "inside your own box" you will never get an email if the box crashes completely or loses its network connectivity, because your script will no longer be able to run or alert you.

Bello or Pingdom both offer free accounts that are great to get you started.

More services are listed in "Can anyone recommend a website monitoring service?"

4

I'm surprised nobody's mentioned Nagios. It's incredibly powerful, does uptime percentages, notification via email/IM, can run scripts on downtime, etc. It's probably the best out there.

Josh
  • 9,398
2

Check out AreMySitesUp (http://aremysitesup.com) and Pingdom. Both have free options, and will send an email and SMS when your site is down. AreMySitesUp has an iPhone app as well.

ctb
  • 306
1
  • You can use God: god.rubyforge.org

  • Do you have a server in another location where you could run scripts?

  • These guys will monitor your page (max 2 URLs) for free, every 30 minutes: host-tracker.com (order page)

1

You can get a basic connectivity test by writing a shell script that fetches the page with wget and decides whether the site responded based on wget's exit status.

#!/bin/bash
WGET='/usr/bin/wget'
URL='http://url.to.check'

# wget exits with a non-zero status when the request fails,
# so its exit status can drive the if directly.
if "${WGET}" -O /dev/null --tries=1 "${URL}"; then
    echo "Success!"
    # You could write a log file or something here
else
    echo "Fail! :("
    # Run something to mail you that your site isn't responding
fi

This is a very basic example that could be expanded, but if you are just looking for something quick, this will work. You can run it from cron every minute, so you know within a minute if the site has crashed.
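
One way the "mail you" branch might be filled in is sketched below. It is only a sketch under assumptions: the address in ALERT_TO is hypothetical, the 10 second timeout is an arbitrary choice, and it assumes a working local mail command that accepts "mail -s subject address" with the body on standard input.

```shell
#!/bin/bash
# Sketch only: ALERT_TO is a hypothetical address, and MAILER assumes a
# working local "mail" command (mailx-style: mail -s SUBJECT ADDRESS).
ALERT_TO="${ALERT_TO:-you@example.com}"
WGET="${WGET:-/usr/bin/wget}"
MAILER="${MAILER:-mail}"

# Succeed if the URL can be fetched. The timeout keeps a hung server
# from stalling the check; 10 seconds is an arbitrary choice.
site_is_up() {
    "$WGET" -q -O /dev/null --tries=1 --timeout=10 "$1"
}

# Mail a one-line alert naming the URL and the time of the failed check.
alert() {
    echo "$1 did not respond at $(date)" | "$MAILER" -s "Site down: $1" "$ALERT_TO"
}

# Print "up" or "down"; send an alert in the second case.
check_site() {
    if site_is_up "$1"; then
        echo "up"
    else
        echo "down"
        alert "$1"
    fi
}

# Run the check only when a URL is passed on the command line.
if [ "$#" -gt 0 ]; then
    check_site "$1"
fi
```

Saved as, say, check_site.sh (a made-up name), it would be called as "check_site.sh http://url.to.check". Note that it mails on every failed run, so a long outage checked every minute produces a mail every minute.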

Alex
  • 6,723
1

Nagios is great if you have a large number of servers. I suggest starting with Munin: it is simple to set up, and plugins are literally a 5 minute time investment. It is great for collecting statistics and for alerting on a smaller scale than Nagios. The best part is that, should you grow large enough to warrant the investment Nagios requires, Munin integrates with Nagios well.

Munin: http://munin.projects.linpro.no/

Development has also started picking up again!

ScottZ
  • 467
0

You can use something like Puppet or Cfengine for process monitoring.

With these tools it is quite easy to check whether a certain process is still running and, if it is not, to restart it and report the event. You can even extend this so that it runs a check such as opening a port and expecting a reply to a request.

This does not work if your entire server is dying, but that doesn't seem to be the case here.

I'm not familiar with the Ruby/Rack set of options, but I know Django can also mail you on server errors (a page that causes an error while rendering) and on 404s from your own site. Maybe you can find a similar option or hook in what you're building.

Combining the two of these means I'm notified in case a page fails to render and if the entire daemon dies.

0

You really should focus on debugging and fixing the problem instead :)

That said, there are two ways to do what you want. If your server is always up (and you trust it to be up), you can easily monitor any running service via a cron job; any monitoring software would simply be overkill. But if the problem is in your web application, and it fails without actually bringing down any services running on your server, and there is no simple way to test that it failed (the process itself still runs, check results are inconsistent, etc.), then you probably want to use one of the services recommended here that check your site from the outside.
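
For the first case, a cron-driven check could look roughly like the sketch below. It makes several assumptions that are not from this thread: the service writes a PID file (for Nginx, /var/run/nginx.pid is a common location, but check the pid directive in your nginx.conf), a local mail command works, and the address and script path are made up.

```shell
#!/bin/bash
# Sketch only: ALERT_TO is a hypothetical address, and MAILER assumes a
# working local "mail" command (mailx-style: mail -s SUBJECT ADDRESS).
ALERT_TO="${ALERT_TO:-you@example.com}"
MAILER="${MAILER:-mail}"

# Succeed only if the PID file is readable and names a live process.
# kill -0 sends no signal; it only tests whether the process could be
# signalled, so run this as root (or as the service's own user) to
# avoid a false "stopped".
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Print "running" or "stopped"; mail an alert in the second case.
check_service() {
    if pid_alive "$1"; then
        echo "running"
    else
        echo "stopped"
        echo "No live process for $1 at $(date)" | "$MAILER" -s "Service down: $1" "$ALERT_TO"
    fi
}

# Run only when given a PID file, e.g. from root's crontab
# (script path and PID file location are assumptions):
#   */5 * * * * /usr/local/bin/check_service.sh /var/run/nginx.pid >/dev/null
if [ "$#" -gt 0 ]; then
    check_service "$1"
fi
```

This only tells you the process exists. It says nothing about whether the application behind it is answering requests correctly, which is the second case described above.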

monomyth
  • 971