
Certain processes monitored by Monit are generating empty PID files. Monit does not handle empty PID files well: it tries to restart the process even though it is already running, and keeps logging errors in the Monit log.

I am thinking of handling this with a custom script: when Monit fails to restart a process because of its PID file, it should run the script, which repopulates the PID file with the PID of the already running process.

I cannot work out how to write the "if failed" part that runs this custom script. For a server process with a port and protocol I could write one, but for a plain background process I am not sure how to handle this case.

This is my intended Monit config, which fails the syntax check when I run "monit -t" (shown after the next two lines):

Please suggest the right config to handle Monit restart failures.

Thank you.

# Check for cmaeventd process
check process cmaeventd with pidfile /var/run/cmaeventd.pid
group snmp-agents
start program = "/opt/hp/hp-snmp-agents/storage/etc/cmaeventd start"
stop program = "/opt/hp/hp-snmp-agents/storage/etc/cmaeventd stop"
if failed (restart|start) then exec "/tmp/pidchk.sh cmaeventd"
if 2 restarts within 3 cycles then timeout

Monit logfile:


[PST Feb  3 18:18:20] error    : monit: Error reading pid from file '/var/run/cmaidad.pid'
[PST Feb  3 18:18:21] error    : monit: Error reading pid from file '/var/run/cmaidad.pid'
[PST Feb  3 18:18:22] error    : 'cmaidad' failed to start

[PST Feb  3 18:19:22] error    : 'cmaidad' service restarted 2 times within 2 cycles(s) - unmonitor


Empty PID file:
logbash-3.1# ps -ef|grep cmaidad|grep -v grep
root     32298     1  0 18:14 ?        00:00:01 cmaidad -p 15 -s OK -l /var/log/hp-snmp-agents/cma.log
logbash-3.1# cat /var/run/cmaidad.pid

logbash-3.1# ls -l /var/run/cmaidad.pid
-rw-r--r-- 1 root root 1 Feb  3 18:14 /var/run/cmaidad.pid

The script I wrote to repopulate the PID file if the given process is running:

#!/bin/bash
# To re-populate the empty PID files which were NOT populated by the hp-snmp scripts
AGNTFILEPATH=/var/run

#different distros put pidof in different places
if [ -f /sbin/pidof ]; then
  PIDOF=/sbin/pidof
elif [ -f /bin/pidof ]; then
  PIDOF=/bin/pidof
fi

#add pid into agent file
addpidintofile() {
                PIDOFAGNT=`$PIDOF -o $$ -o $PPID -o %PPID -x $PNAME > /dev/stdout | cut -d " " -f1` 2> /dev/null
                if [ -f $AGNTFILEPATH/$PNAME.pid ]; then
                        echo "$PIDOFAGNT" > $AGNTFILEPATH/$PNAME.pid
                fi
}

PNAME=$1
# Count matching processes; quote the name and compare numerically
cnt=`ps -ef|grep "$PNAME"|grep -v grep|wc -l`
if [ "$cnt" -eq 0 ]
    then
    exit 1;
else 
    addpidintofile
    exit 0;
fi
gowin09

1 Answer


This is all a pretty bad approach to the problem you're trying to solve. You really want your HP monitoring agents/drivers to be stable and not crash...

Either way, if you aren't going to fix the root cause, you can tell Monit to match on the process name instead of reading a PID file.

check process cmaeventd
        matching "cmaeventd"
        start program = "/etc/init.d/cmaeventd start"
        stop program = "/etc/init.d/cmaeventd stop"
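
As a rough sketch only, the same check could also keep the group and restart limit from the question's config. This assumes the init script path above is correct for your system and that the pattern "cmaeventd" matches only the intended process. If your Monit version provides it, running `monit procmatch "cmaeventd"` first should show which running processes the pattern would match.

```
# Sketch: match by process name, so no PID file is read
check process cmaeventd
        matching "cmaeventd"
        group snmp-agents
        start program = "/etc/init.d/cmaeventd start"
        stop program = "/etc/init.d/cmaeventd stop"
        # restart limit carried over from the question's config
        if 2 restarts within 3 cycles then timeout
```

With `matching`, the empty PID file no longer matters to Monit, so the custom repopulation script should not be needed for this check.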
ewwhite
  • 201,205