I have built a system which does various types of time-series analysis and now I would like to feed it data from a monitoring tool. Since I have Nagios set up already in my test environment, I prefer to get it from there. But as a second choice I could get access to a test Zenoss instance, and would appreciate answers for Zenoss as well.

What I want

I want time-series for multiple KPIs on multiple devices.

Ideally I would be able to specify the data format, but as long as it contains the information I need, I am happy to transform it upon receipt. The information I need is just:

  • The device identifier e.g. 10.2.42.2 or Ubuntu-42A
  • The component being monitored e.g. CPU or Memory
  • The KPI e.g. %Usage, KBytes Available
  • The value of the KPI
  • The timestamp

Finally, I would like to send the data via HTTP (for now, later via HTTPS).

I can already do this in the case of an alert - for example when a threshold is breached I know how to configure Nagios to call a simple script of mine with the device IP etc. as parameters - and my script executes the HTTP request. But I haven't seen how this can be set up to fire on every poll.

What I don't want

I don't want alert data, I want the raw time-series.

I don't want to poll Nagios to get this data - the polling intervals would vary and I would like to avoid unnecessary network traffic.

I checked this question but that seemed to send data from slave Nagios nodes to a master Nagios node.

2 Answers

You can do this with the pieces that are intended for distributed monitoring.

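For example, set an ocsp_command (the "obsessive compulsive service processor" command) so that the result of every service check is handed to a command of your choosing. The command definition can point to a script that pushes the performance data elsewhere via curl or similar.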
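As a rough illustration of the kind of script such a command definition could point to, here is a minimal Python sketch that wraps one check result in a JSON body and POSTs it over HTTP. It assumes the command line passes the standard Nagios macros $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$ as the four arguments; the collector URL, the JSON field names and the function names are hypothetical and not part of Nagios.

```python
import json
import urllib.request

# Hypothetical receiver; replace with the endpoint of the analysis system.
COLLECTOR_URL = "http://collector.example/ingest"


def build_request(url, host, service, state, output):
    """Build (but do not send) an HTTP POST carrying one check result as JSON."""
    payload = json.dumps({
        "host": host,
        "service": service,
        "state": state,
        "output": output,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request with a short timeout so a slow receiver
    does not hold up the command for long."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def main(argv):
    """Entry point: argv[1:5] are host, service, state and plugin output."""
    host, service, state, output = argv[1:5]
    return send(build_request(COLLECTOR_URL, host, service, state, output))
```

To use it as the command's script, call main(sys.argv) from the script's entry point. Building the request separately from sending it makes the payload easy to inspect without a live receiver.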

Keith
In nagios.cfg we added

  • obsess_over_services=1
  • ocsp_command=OUR_COMMAND_NAME

Then we defined the new command in commands.cfg:

  • command_name OUR_COMMAND_NAME
  • command_line /path/to/our/script

The script receives the following parameters:

  1. Host name
  2. Service Description
  3. Service State
  4. Message from the relevant plugin

Relating this back to my question: the device identifier I wanted is the host name, and the component and KPI can be extracted from the service description and the plugin message.
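A minimal Python sketch of that mapping, assuming the four parameters arrive in the order listed above. The field names are made up for illustration, and the timestamp here is simply the time at which the script runs, which is an assumption; passing a time macro such as $LASTSERVICECHECK$ as an extra argument would likely be closer to the actual check time.

```python
import time


def build_record(host, service_description, state, message, now=None):
    """Map the four script parameters onto the fields asked for in the question.

    `now` lets the caller supply a Unix timestamp; otherwise the current
    time is used (an approximation of when the check actually ran).
    """
    return {
        "device": host,
        "service": service_description,  # component and KPI are derived from this
        "state": state,
        "message": message,              # human-oriented text, still needs parsing
        "timestamp": int(now if now is not None else time.time()),
    }
```

For example, build_record("Ubuntu-42A", "Memory", "OK", "OK - 1.05 GB used", now=1700000000) yields a dictionary whose device is "Ubuntu-42A" and whose timestamp is 1700000000.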

I do, however, need to do a little parsing to get these values, since the plugin message is written more for humans than for machines, for example:

OK - 1.05 GB used (1.05 GB RAM + 0.00 GB SWAP, this is 32.4% of 3.24 GB RAM)

but at least the format is consistent so I'm not complaining.
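Because the format is consistent, a regular expression is enough for this particular message. The following Python sketch handles only the memory message shown above; the pattern, the group names and the function name are my own choices, and messages from other plugins would need their own patterns.

```python
import re

# Matches messages shaped like:
# OK - 1.05 GB used (1.05 GB RAM + 0.00 GB SWAP, this is 32.4% of 3.24 GB RAM)
MEMORY_PATTERN = re.compile(
    r"^(?P<state>\w+) - (?P<used_gb>[\d.]+) GB used "
    r"\((?P<ram_gb>[\d.]+) GB RAM \+ (?P<swap_gb>[\d.]+) GB SWAP, "
    r"this is (?P<percent>[\d.]+)% of (?P<total_gb>[\d.]+) GB RAM\)$"
)


def parse_memory_message(message):
    """Return the numeric values from the memory plugin message,
    or None when the message does not have the expected shape."""
    match = MEMORY_PATTERN.match(message.strip())
    if match is None:
        return None
    fields = match.groupdict()
    state = fields.pop("state")
    # Every remaining group is a decimal number.
    values = {name: float(value) for name, value in fields.items()}
    values["state"] = state
    return values
```

Returning None for an unexpected message, rather than raising, lets the calling script skip or log that result and carry on with the next one.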