60

I need to send network messages when one of my systemd services crashes or hangs (i.e., enters the failed state; I detect hangs with WatchdogSec=). I noticed that newer systemd versions have FailureAction=, but it doesn't allow arbitrary commands, only actions such as reboot or shutdown.

Specifically, I need a way to have one network message sent when systemd detects the program has crashed, and another when it detects it has hung.

I'm hoping for a better answer than "parse the logs", and I need something that has a near-instant response time, so I don't think a polling approach is good; it should be something triggered by the event occurring.

4 Answers

57

systemd units support OnFailure=, which activates one or more units when the unit enters the failed state. In the [Unit] section you can put something like

 OnFailure=notify-failed@%n

Then create the template unit notify-failed@.service, in which you can use whichever specifiers you need (you will probably want at least %i, the instance name, which here carries the name of the failed unit) to launch the script or command that sends the notification.

You can see a practical example in http://northernlightlabs.se/systemd.status.mail.on.unit.failure
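As a rough sketch of what notify-failed@.service could launch, the script below takes the failed unit's name as its first argument and sends one UDP datagram. The script path, the NOTIFY_HOST and NOTIFY_PORT variables, the default address, and the message format are all assumptions made for illustration, and the /dev/udp redirection only works in a bash built with network redirections.

```shell
#!/usr/bin/env bash
# Hypothetical helper, e.g. installed as /usr/local/bin/notify-failed (path is an assumption).
# Intended to be called from the template unit as:
#   ExecStart=/usr/local/bin/notify-failed %i

# Build a one-line message naming the failed unit and this host.
build_message() {
    local unit="$1"
    printf 'unit=%s host=%s state=failed' "$unit" "$(uname -n)"
}

# Send the message as a single UDP datagram via bash's /dev/udp redirection.
# Host and port are placeholders for wherever your listener runs.
send_message() {
    local host="$1" port="$2" msg="$3"
    printf '%s\n' "$msg" > "/dev/udp/${host}/${port}"
}

# Only act when a unit name was passed on the command line.
if [ "$#" -ge 1 ]; then
    send_message "${NOTIFY_HOST:-127.0.0.1}" "${NOTIFY_PORT:-5140}" "$(build_message "$1")"
fi
```

Because the unit is started by systemd the moment the monitored service fails, nothing here polls or parses logs.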

37

Here is how I send notifications:

/etc/systemd/system/notify-email@.service

[Unit]
Description=Send email

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c '/usr/bin/systemctl status %i | /usr/bin/mailx -Ssendwait -s "[SYSTEMD_%i] Fail" your_admin@company.blablabla'

[Install]
WantedBy=multi-user.target

Add it to systemd:

systemctl enable /etc/systemd/system/notify-email@.service

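In each service you want to monitor, add the following. %n expands to the full name of the failing unit; %i would be empty in a unit that is not itself a template instance.

[Unit]
OnFailure=notify-email@%n.service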

Reload the configuration:

systemctl daemon-reload
tjmcewan
ceinmart
0

I think the OnFailure= approach is the most idiomatic "systemd way". Other options:

ExecStopPost= runs after the service has stopped, and there you can check $SERVICE_RESULT with bash:

ExecStopPost=bash -c "[ $SERVICE_RESULT = 'success' ] && echo 'success' || echo 'fail'"
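Since the question asks for different messages for a crash and a hang, here is a minimal sketch of an ExecStopPost= script that branches on $SERVICE_RESULT. systemd sets that variable to values such as success, exit-code, signal, core-dump, and watchdog; the script path and the grouping of those values into "crashed" and "hung" are my own assumptions, and the echo stands in for whatever command actually sends your network message.

```shell
#!/usr/bin/env bash
# Hypothetical script, e.g. ExecStopPost=/usr/local/bin/report-result (path is an assumption).
# systemd exports SERVICE_RESULT to ExecStopPost= commands.

# Map a SERVICE_RESULT value to a coarse category.
# Counting exit-code as a crash is a judgment call; adjust to taste.
classify_result() {
    case "$1" in
        success)                    echo "ok" ;;
        watchdog)                   echo "hung" ;;
        signal|core-dump|exit-code) echo "crashed" ;;
        *)                          echo "failed" ;;
    esac
}

# Only report when systemd actually provided a result.
if [ -n "${SERVICE_RESULT:-}" ]; then
    kind="$(classify_result "$SERVICE_RESULT")"
    if [ "$kind" != "ok" ]; then
        # Placeholder: replace this echo with the command that sends your message.
        echo "service ${kind} (result=${SERVICE_RESULT})"
    fi
fi
```

A watchdog timeout from WatchdogSec= shows up as watchdog, which is what lets the hang case be told apart from an ordinary crash.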

If ExecStart= runs a shell script, you can do the error handling inside the script itself; systemd reports success as long as the script exits successfully.

I've tested these with non-zero exit codes, but not with the process being killed by a signal or otherwise terminated abnormally.
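The ExecStart= wrapper idea could look roughly like the sketch below. The wrapper path and the daemon path in the comment are hypothetical, and the stderr message is a placeholder for a real notification. Unlike a wrapper that swallows the error, this one passes the exit status on, so systemd still marks the unit as failed and OnFailure= keeps working.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, e.g. ExecStart=/usr/local/bin/run-wrapped /usr/bin/mydaemon
# (both paths are assumptions).

# Run the given command; on a non-zero exit, report it and return the same status.
run_and_report() {
    local status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        # Placeholder: replace with the command that sends your notification.
        echo "command '$1' exited with status ${status}" >&2
    fi
    return "$status"
}

# Only run when a command was passed on the command line.
if [ "$#" -ge 1 ]; then
    run_and_report "$@"
    exit $?
fi
```

A child killed by a signal is reported by bash as a status above 128, so that case is caught as well, but a hang is not; that still needs WatchdogSec=.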

qwr
0

I came across this utility which seems to provide this: https://github.com/joonty/systemd_mon