Passive Service Checks

Introduction

One of the features of Nagios is that it can process service check results that are submitted by external applications. Service checks that are performed and submitted to Nagios by external apps are called passive checks. Passive checks can be contrasted with active checks, which are service checks that have been initiated by Nagios.

Why The Need For Passive Checks?

Passive checks are useful for monitoring services that are:

  • located behind a firewall, and can therefore not be checked actively from the host running Nagios
  • asynchronous in nature and can therefore not be actively checked in a reliable manner (e.g. SNMP traps, security alerts, etc.)

How Do Passive Checks Work?

The only real difference between active and passive checks is that active checks are initiated by Nagios, while passive checks are performed by external applications. Once an external application has performed a service check (either actively or by having received an asynchronous event like an SNMP trap or security alert), it submits the results of the service "check" to Nagios through the external command file.

The next time Nagios processes the contents of the external command file, it will place the results of all passive service checks into a queue for later processing. The same queue that is used for storing results from active checks is also used to store the results from passive checks.

Nagios will periodically execute a service reaper event and scan the service check result queue. Each service check result, regardless of whether the check was active or passive, is processed in the same manner. The service check logic is exactly the same for both types of checks. This provides a seamless method for handling both active and passive service check results.
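The shared queue and the reaper event described above can be pictured with a small toy model. This is only an illustrative sketch, not Nagios' actual implementation: the names CheckResult, enqueue, reap and current_state are hypothetical, and the "check logic" is reduced to recording the latest return code.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CheckResult:
    host_name: str
    description: str
    return_code: int
    plugin_output: str
    passive: bool  # kept for bookkeeping only; processing never looks at it


result_queue = deque()  # one queue shared by active and passive results
current_state = {}      # (host_name, description) -> latest return code


def enqueue(result):
    """Store a result for later processing, whatever its origin."""
    result_queue.append(result)


def reap():
    """Drain the queue, applying the same logic to every result."""
    processed = 0
    while result_queue:
        result = result_queue.popleft()
        key = (result.host_name, result.description)
        current_state[key] = result.return_code
        processed += 1
    return processed
```

Because reap() never inspects the passive flag, an active and a passive result for the same service are handled identically, which is the point the paragraph above makes.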

How Do External Apps Submit Service Check Results?

External applications can submit service check results to Nagios by writing a PROCESS_SERVICE_CHECK_RESULT external command to the external command file.

The format of the command is as follows:

[<timestamp>] PROCESS_SERVICE_CHECK_RESULT;<host_name>;<description>;<return_code>;<plugin_output>

where...

  • timestamp is the time in time_t format (seconds since the UNIX epoch) that the service check was performed (or submitted). Please note the single space after the right bracket.
  • host_name is the short name of the host associated with the service in the service definition
  • description is the description of the service as specified in the service definition
  • return_code is the return code of the check (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN)
  • plugin_output is the text output of the service check (i.e. the plugin output)
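As a rough illustration of the format above, the following Python sketch builds one command line and appends it to the external command file. The function names and the validation rules are assumptions made for this example, and the location of the command file depends on your installation, so it is passed in as a parameter.

```python
import os
import time

# Return codes listed in the field descriptions above.
RETURN_CODES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def format_passive_result(host_name, description, return_code,
                          plugin_output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line, newline included."""
    if return_code not in RETURN_CODES:
        raise ValueError("return_code must be 0, 1, 2 or 3")
    # Semicolons separate the leading fields, so they cannot appear in them.
    for field in (host_name, description):
        if ";" in field or "\n" in field:
            raise ValueError("host_name/description contain ';' or a newline")
    # A newline ends the command, so the output must fit on one line.
    if "\n" in plugin_output:
        raise ValueError("plugin_output must be a single line")
    if timestamp is None:
        timestamp = time.time()
    # Note the single space after the right bracket.
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}\n".format(
        int(timestamp), host_name, description, return_code, plugin_output)


def submit_passive_result(command_file, line):
    """Append one formatted line to the external command file.

    The file is deliberately not created if it is missing, since a
    missing command file usually means Nagios is not running.
    """
    fd = os.open(command_file, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
```

A call such as format_passive_result("web01", "HTTP", 2, "Connection refused") uses a hypothetical host and service; both must match an existing service definition for Nagios to accept the result.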

Note that in order to submit service checks to Nagios, a service must have already been defined in the object configuration file! Nagios will ignore all check results for services that had not been configured before it was last (re)started.

If you only want passive results to be provided for a specific service (i.e. active checks should not be performed), simply set the active_checks_enabled member of the service definition to 0. This will prevent Nagios from ever actively performing a check of the service. Make sure that the passive_checks_enabled member of the service definition is set to 1. If it isn't, Nagios won't process passive checks for the service!
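A passive-only service might be configured along the lines of the fragment below. This is a partial sketch: the host and service names are hypothetical, and the other directives that your version of Nagios requires in a service definition are left out.

```
# Partial service definition; required directives not related to
# passive checks are omitted.
define service{
	host_name               web01          ; hypothetical host
	service_description     Backup Job     ; hypothetical service
	active_checks_enabled   0              ; never check this service actively
	passive_checks_enabled  1              ; accept submitted check results
	}
```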

An example shell script of how to submit passive service check results to Nagios can be found in the documentation on volatile services.

Submitting Passive Service Check Results From Remote Hosts

If an application that resides on the same host as Nagios is sending passive service check results, it can simply write the results directly to the external command file as outlined above. However, applications on remote hosts can't do this so easily. In order to allow remote hosts to send passive service check results to the host that runs Nagios, I've developed the nsca addon. The addon consists of a daemon that runs on the Nagios host and a client that is executed from remote hosts. The daemon will listen for connections from remote clients, perform some basic validation on the results being submitted, and then write the check results directly into the external command file (as described above). More information on the nsca addon can be found here...
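On the remote side, a result has to be handed to the nsca client before it can be sent. The sketch below assumes the input format described in the client's usage text, namely one result per line on standard input with the host name, service description, return code and plugin output separated by tabs. The function name is hypothetical; check the documentation shipped with your copy of the addon before relying on this.

```python
def format_nsca_line(host_name, description, return_code, plugin_output,
                     delimiter="\t"):
    """Build one result line for the nsca client's standard input.

    Assumption: fields are separated by `delimiter` (a tab by default)
    and each result ends with a newline.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    fields = [host_name, description, str(return_code), plugin_output]
    for field in fields:
        # The delimiter and newline carry structure, so keep them out of fields.
        if delimiter in field or "\n" in field:
            raise ValueError("fields must not contain the delimiter or a newline")
    return delimiter.join(fields) + "\n"
```

The resulting line would then be piped to the client program on the remote host, which forwards it to the daemon on the Nagios host.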

Using Both Active And Passive Service Checks

Unless you're implementing a distributed monitoring environment with the central server accepting only passive service checks (and not performing any active checks), you'll probably be using both types of checks in your setup. Active checks are better suited to services that lend themselves to periodic checks (availability of an FTP or web server, etc.), whereas passive checks are better at handling asynchronous events that occur at variable intervals (security alerts, etc.).

The image below gives a visual representation of how active and passive service checks can both be used to monitor network resources.

The orange bubbles on the right side of the image are third-party applications that submit passive check results to Nagios' external command file. One of the applications resides on the same host as Nagios, so it can write directly to the command file. The other application resides on a remote host and makes use of the nsca client program and daemon to transfer the passive check results to Nagios.

The items on the left side of the image represent active service checks that Nagios is performing. I've shown how the checks can be made for local resources (disk usage, etc.), "exposed" resources on remote hosts (web server, FTP server, etc.), and "private" resources on remote hosts (remote host disk usage, processor load, etc.). In this example, the private resources on the remote hosts are actually checked by making use of the nrpe addon, which facilitates the execution of plugins on remote hosts.

