
So I set up a simple script to send an email alert when a certain web service stops running.

It has a simple flow of:

# Count the response lines that contain the expected string
test=$(curl [address] | grep [a certain string in response] | wc -l)
if [ "$test" -ne 1 ]; then
  echo "there has been an error" | mail -s "Error" -t "[my-mail-address]"
fi

and in crontab it is set to do the check once every five minutes:

*/5 * * * * sh /path/to/script/

It was working well for a couple of days, but about ten minutes ago almost a hundred e-mails from the server arrived simultaneously. That doesn't seem possible, since there are no loops in the script.

Syslog:

Jan 26 01:05:01 sv1 CRON[23310]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan 26 01:10:01 sv1 CRON[23815]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan 26 01:12:12 sv1 kernel: [5962667.417178] [ 1106]     0  1106     5914      168      17        0             0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417250] [27493]     0 27493    14949      224      34        0             0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417252] [27939]     0 27939    14949      224      34        0             0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417254] [28436]     0 28436    14948      224      34        0             0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417256] [28943]     0 28943    14949      224      34        0             0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417258] [29408]     0 29408    14949      224      34        0             0 cron
...

* This continues for 800+ lines with similar timestamps (until 01:12:24). The timestamps of these 800+ lines coincide with the simultaneous mails. This is odd because cron is scheduled to run every 5 minutes, which accounts for the first two lines. The lines starting at 01:12:12 are the suspicious ones.

Update:

Just brought the service down again and let cron and the script do their job. A single mail was sent.

As the test is a very simple true/false, I am struggling to figure out what kind of special circumstances would result in multiple mails being sent simultaneously.

Reuben L.

1 Answer


Are you sure about "It was working well for a couple of days"? This means that a mail was sent every 5 minutes.

It is possible that the mails could not be sent for some reason and were queued, and that all of them were delivered at once when the connectivity issue was resolved. To find the problem, check the mail log.

Cron should be debugged as well. Check the syslog and the cron log:

sudo less /var/log/cron

Some information about cron should appear there around the time the 124 mails were sent.
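To spot the minute in which the burst happened, the log lines can be counted per minute. This is only a sketch: it assumes the traditional syslog timestamp layout shown in the question (month, day, HH:MM:SS), and the function name count_per_minute is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog-style lines on stdin and prints
# how many lines were logged in each minute.
# Assumes lines start with "Mon DD HH:MM:SS", as in the question's excerpt.
count_per_minute() {
    # Keep month, day and HH:MM, then count identical keys.
    awk '{ print $1, $2, substr($3, 1, 5) }' | sort | uniq -c
}
```

It could be fed from a log file, for example count_per_minute < /var/log/syslog (the path is an assumption and differs between distributions). A minute with hundreds of entries stands out immediately.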

Also check this Q&A. If a system is too busy, cron jobs can pile up, in which case a daemon should be considered.
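If piled-up runs turn out to be the cause, one possible guard (short of writing a daemon) is a lock, so that a new run exits immediately while an older one is still busy. The sketch below uses mkdir, which fails if the directory already exists; the function names and the lock path are hypothetical.

```shell
#!/bin/sh
# Hypothetical lock location; any directory path writable by the cron user works.
LOCKDIR="${TMPDIR:-/tmp}/service-check.lock"

# mkdir either creates the directory or fails, so only one caller can win.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1"
}

# Run a command only if no other run holds the lock.
# Returns 0 without running anything when the lock is already taken.
run_exclusive() {
    lockdir=$1
    shift
    acquire_lock "$lockdir" || return 0
    "$@"
    status=$?
    release_lock "$lockdir"
    return $status
}
```

The check would then be started as run_exclusive "$LOCKDIR" sh /path/to/script/. Note that a run killed before release_lock leaves a stale lock behind, which has to be removed by hand. On Linux, the flock utility from util-linux offers a similar guard.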

Check the output of curl [address] | grep [a certain string in response] | wc -l. Does the command take a long time to complete? Why do you grep for all matches? The first hit should be sufficient; | head -1 could be used.
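As a rough sketch of the "first hit is sufficient" idea: grep -q stops at the first match and reports the result through its exit status, so nothing has to be counted. Be aware that this changes the test from "exactly one matching line" to "at least one". The function names are invented for illustration, and the 10-second --max-time is an assumed limit so that a hanging request cannot keep the job running.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the text on stdin contains the given string.
response_ok() {
    grep -q -- "$1"
}

# Hypothetical wrapper: $1 is the address, $2 the expected string.
# --max-time 10 is an assumed timeout, not a value from the question.
check_service() {
    curl --silent --max-time 10 "$1" | response_ok "$2"
}
```

In the script, the if line would then become something like: if ! check_service [address] [a certain string in response]; then ... fi, with the mail command unchanged.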

030