#128 Linux  03.06.2007

The monitoring server tutorial with Nagios 2.8, NRPE 2.7.1 and Fruity


There are so many requests concerning Nagios and NRPE reaching me that I decided to write a tutorial for the setup of a complete Nagios monitoring server. Here it is; I hope you like it and that it helps you get everything done as easily as possible. I am using Gentoo, so the tutorial describes the setup on a Gentoo system, but it should be possible to follow the instructions on any other distribution.

Nagios in combination with Fruity and NRPE offers a great deal of usability and flexibility. Usually there is no need to touch a Nagios configuration file directly any more. With NRPE we can monitor almost anything imaginable, because NRPE accepts self-written check scripts.

The result is a Nagios monitoring service with Fruity as its frontend. NRPE is used to execute remote checks on the monitored hosts. Postfix is used to send notifications. MySQL is also needed because Fruity stores its data in it.

Before we start I would like to briefly explain the basic configuration workflow once everything is set up. The first example shows the procedure for setting up a normal Nagios check. The second one shows the procedure for a Nagios check with NRPE. This is just to give an overview and make things clearer.
What does 'normal Nagios check' mean? It is a check which can be run solely by the Nagios daemon and the standard plugins. An NRPE check, on the other hand, can only be performed with the help of the NRPE daemon running on the monitored host. This is necessary if we want to check parameters which cannot be accessed from outside the monitored host. A normal Nagios check is used, for example, to test whether a website is available. An example of a parameter which needs an NRPE check is the CPU load.

These are the steps to set up a new check:

Normal Nagios check
On the monitoring host:
1) Opening the Fruity page
2) Adding a new service to a host
3) Adding a check to this service
4) Configuring this check by adding parameters if necessary
5) Exporting the data to Nagios
6) Opening the Nagios page and verifying that the new service check is set up correctly

Nagios check with NRPE
On the monitored host:
1) Creating your own check script or using one of the standard plugins
2) Making an entry for this check in /etc/nrpe.cfg
3) Restarting NRPE
On the monitoring host:
1) Opening the Fruity page
2) Adding a new service to a host
3) Adding an NRPE check to this service
4) Configuring this check by adding parameters
5) Exporting the data to Nagios
6) Opening the Nagios page and verifying that the new service check is set up correctly

Once the system is completely set up, adding new checks is extremely easy and takes only a few seconds or minutes.
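
For step 1 on the monitored host, a self-written check only has to follow the plugin convention: it prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The following is a minimal sketch of that convention. The function name check_threshold and the USERS label are my own illustration and not part of the standard plugins.

```shell
#!/bin/bash
# Hypothetical helper: compare an integer value against a warning and a
# critical threshold and report the result the way a Nagios plugin does.
# Arguments: $1 = label, $2 = measured value, $3 = warning, $4 = critical
# Return codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
check_threshold() {
    local label=$1 value=$2 warn=$3 crit=$4 n

    # Anything that is not a plain non-negative integer counts as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "$label UNKNOWN - value and thresholds must be integers"
                return 3
                ;;
        esac
    done

    if [ "$value" -ge "$crit" ]; then
        echo "$label CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "$label WARNING - value is $value"
        return 1
    fi
    echo "$label OK - value is $value"
    return 0
}
```

In a real check script the last lines would call the function with a measured value, for example the number of logged-in users from who | wc -l, and then end with exit $? so that NRPE sees the return code.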



For this example we use the two physical hosts 'nagios1' and 'c1' in a LAN:

The monitoring host which has the Nagios daemon running.
nagios1: 192.168.1.33

The host which is being monitored and has the NRPE daemon running.
c1: 192.168.1.56


Ok, here we go. These are the steps to set up the whole server.
'M' means on the monitoring host (Nagios)
'C' means on the monitored host (NRPE)
'MC' means on both

Installation
1) M Setting the date and time
2) M Installing Apache2, PHP, MySQL, bind-tools, net-snmp and sudo
3) M Installing Nagios and the standard plugins
4) M Installing Fruity and Postfix
5) MC Installing NRPE
Configuration / Test
6) C Making sure that port 5666 is open
7) M Setting up a normal Nagios TCP check
8) MC Setting up a Nagios NRPE load check
9) M Making sure that alerts occur as expected and notifications arrive
10) MC Troubleshooting




My experience on Gentoo is that it's better to install Nagios and NRPE manually. I ran into several problems when I installed these packages with portage. One example is that emerging the standard plugins fails on systems which have no localhost or use an IP other than 127.0.0.1 for it (e.g. vservers). This is because 127.0.0.1 is hardcoded in the configure script. So we install Nagios, the Nagios standard plugins and NRPE manually and the rest via portage. Fruity is not yet available in portage and is therefore also a candidate for a manual installation.



1) M Setting the date and time
First we install rdate, which makes it easy to set the correct date and time.
emerge rdate

It's best to choose a time server which is as close as possible to the location of our Nagios server. This command sets the correct time now.
/usr/bin/rdate -s time.fu-berlin.de

For a daily synchronisation we put this line in /etc/crontab.
03 01 * * * root /usr/bin/rdate -s time.fu-berlin.de > /dev/null



2) M Installing Apache2, PHP, MySQL, bind-tools, net-snmp and sudo
Installing and starting Apache2.
emerge apache
/etc/init.d/apache2 start

Adding Apache2 to the default runlevel.
rc-update add apache2 default

http://nagios1/ should show the standard page for a successful Apache installation.


Installing PHP.
USE="apache2 calendar cgi gd mysql mysqli sockets xml zip" emerge php


Installing MySQL.
emerge dev-db/mysql

Installing the MySQL database.
/usr/bin/mysql_install_db

Starting MySQL.
/etc/init.d/mysql start

Setting the root password for the MySQL server.
/usr/bin/mysqladmin -u root password 'NEW-PASSWORD'

Adding MySQL to the default runlevel.
rc-update add mysql default


Installing bind-tools.
emerge bind-tools


Installing net-snmp.
USE="-lm_sensors" emerge net-snmp


Installing sudo.
emerge app-admin/sudo



3) M Installing Nagios and the standard plugins
As the system has all additional packages we proceed with the actual Nagios installation.

Creating the Nagios user and group.
useradd -m -d /usr/local/nagios -s /bin/bash nagios

Identifying the user the Apache daemon runs as.
grep "^User" /etc/apache2/httpd.conf
User apache

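We need a new group whose members are the Apache user and the Nagios user. The -a option appends the group; without it usermod would replace any supplementary groups these users already have.
groupadd nagcmd
usermod -a -G nagcmd apache
usermod -a -G nagcmd nagios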
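
To confirm that both users really ended up in the new group, the output of id can be checked. This is only a small sketch; the function name in_group is made up for illustration.

```shell
# Hypothetical helper: succeed if user $1 is a member of group $2.
# "id -nG" prints all group names of the user on one line.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}
```

On nagios1 it could be used like this: in_group apache nagcmd && in_group nagios nagcmd && echo "group membership ok"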

Downloading and compiling Nagios.
cd /tmp
wget http://puzzle.dl.sourceforge.net/sourceforge/nagios/nagios-2.8.tar.gz
tar xfvzp nagios-2.8.tar.gz
cd nagios-2.8
For the tutorial we use the standard options here, so it's just ./configure.
./configure
make all
make install
Setting permissions on the directory which holds the external command file.
make install-commandmode
Installing the sample configuration files.
make install-config
Installing the initscript for the Nagios daemon.
make install-init


Downloading and compiling the standard plugins. Nagios on its own does not execute any checks; the standard plugins do this job. They must be installed, otherwise Nagios is useless.
cd /tmp
wget http://ovh.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
tar xfvzp nagios-plugins-1.4.7.tar.gz
cd nagios-plugins-1.4.7
./configure
make
make install


Setting up the Nagios web interface. First we create a new virtual host for Nagios.
cd /etc/apache2/vhosts.d
Opening a new virtual host configuration file.
vi nagios1.conf
Adding the following lines:
NameVirtualHost 192.168.1.33:80
<VirtualHost 192.168.1.33:80>
	ServerAdmin webmaster@nagios1
	DocumentRoot /var/www/nagios1/htdocs
	ServerName nagios1

	ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin

	<Directory "/usr/local/nagios/sbin">
		Options ExecCGI
		AllowOverride None
		Order allow,deny
		Allow from all
		AuthName "Nagios Access"
		AuthType Basic
		AuthUserFile /usr/local/nagios/etc/htpasswd.users
		Require valid-user
	</Directory>

	Alias /nagios /usr/local/nagios/share

	<Directory "/usr/local/nagios/share">
		Options None
		AllowOverride None
		Order allow,deny
		Allow from all
		AuthName "Nagios Access"
		AuthType Basic
		AuthUserFile /usr/local/nagios/etc/htpasswd.users
		Require valid-user
	</Directory>

  <Directory "/var/www/nagios1/htdocs">
    AllowOverride AuthConfig
    Order allow,deny
    Allow from all
  </Directory>

	CustomLog /var/www/nagios1/logs/nagios1-access_log combined
	ErrorLog /var/www/nagios1/logs/nagios1-error_log
</VirtualHost>

Creating the directories for the DocumentRoot and logs.
mkdir -p /var/www/nagios1/htdocs
mkdir -p /var/www/nagios1/logs


Now we create the accounts for all users who need access to the Nagios web interface and CGIs. In this tutorial we use nagiosadmin for the Nagios administrator and mike, who has limited permissions. The -c option creates the password file, so it is used only for the first account; repeating it for mike would overwrite the file and delete nagiosadmin.
htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password: 
Re-type new password: 

htpasswd2 /usr/local/nagios/etc/htpasswd.users mike
New password: 
Re-type new password: 


The next step is to tell Nagios which of the authenticated users have access to which functionality. This is done inside the file cgi.cfg.
cd /usr/local/nagios/etc

Here we find all the Nagios configuration files. At this point there are only sample files. As we want to set the permissions we create cgi.cfg from a sample file.
cp cgi.cfg-sample cgi.cfg
vi cgi.cfg
Make sure that the parameter use_authentication is set to 1. The user nagiosadmin needs access to all functions of Nagios, whereas mike is allowed access to only some of them. Please change the following lines
#authorized_for_system_information=nagiosadmin,theboss,jdoe
#authorized_for_configuration_information=nagiosadmin,jdoe
#authorized_for_system_commands=nagiosadmin
#authorized_for_all_services=nagiosadmin,guest
#authorized_for_all_hosts=nagiosadmin,guest
#authorized_for_all_service_commands=nagiosadmin
#authorized_for_all_host_commands=nagiosadmin
to
authorized_for_system_information=nagiosadmin,mike
authorized_for_configuration_information=nagiosadmin,mike
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin,mike
authorized_for_all_hosts=nagiosadmin,mike
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
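
To double-check the result of this edit, the active authorization lines can be listed in a more readable form. The following is a small sketch using awk; the function name show_cgi_permissions is hypothetical.

```shell
# Hypothetical helper: print every active authorized_for_* line of a cgi.cfg
# as "permission: user1 user2 ...". Lines that are still commented out
# start with "#" and are therefore skipped.
show_cgi_permissions() {
    awk -F= '/^authorized_for_/ {
        gsub(/,/, " ", $2)      # turn the comma separated user list into words
        print $1 ": " $2
    }' "$1"
}
```

Called as show_cgi_permissions /usr/local/nagios/etc/cgi.cfg, it should print seven lines, with mike missing from the three command permissions.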

The Nagios daemon needs a main configuration file to run:
cp nagios.cfg-sample nagios.cfg
vi nagios.cfg
Here we set check_external_commands to 1. Otherwise Nagios ignores commands submitted through the CGI interface.

Later on we want to import nagios.cfg with Fruity. Apparently Fruity does not know the buffer slot parameters, so these lines must be commented out. Otherwise Fruity refuses to import the nagios.cfg file.
#check_result_buffer_slots=4096
#external_command_buffer_slots=4096
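
Instead of editing the file by hand, the two parameters can also be commented out with sed. This is a sketch under the assumption that both parameters start at the beginning of a line, as in the sample file; the function name disable_buffer_slots is made up.

```shell
# Hypothetical helper: comment out the two buffer slot parameters in a
# nagios.cfg so that Fruity can import the file. sed keeps a backup of
# the original next to it with the suffix .bak.
disable_buffer_slots() {
    sed -i.bak -E \
        's/^(check_result_buffer_slots|external_command_buffer_slots)=/#&/' "$1"
}
```

Usage: disable_buffer_slots /usr/local/nagios/etc/nagios.cfg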


Setting the correct permissions for the external command file.
chown nagios:nagcmd /usr/local/nagios/var/rw
chmod u+rwx /usr/local/nagios/var/rw
chmod g+rwx /usr/local/nagios/var/rw
chmod g+s /usr/local/nagios/var/rw


If we make a test now and start Nagios in the foreground we still get some errors.
/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
...

The errors can be fixed by copying the necessary sample files.
cp commands.cfg-sample commands.cfg
cp resource.cfg-sample resource.cfg
cp localhost.cfg-sample localhost.cfg
chown nagios:nagios /usr/local/nagios/var/nagios.log

Now Nagios can be started in the foreground without errors.
/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
Ctrl-c


We use Fruity later on to modify the previously changed configuration files. Therefore we make some more adjustments to the permissions.
chown nagios:apache *.cfg
chgrp apache .
chmod g+w *.cfg  

Also we want to restart Nagios from Fruity, thus more modifications have to be done: The nagios binary needs a different group.
chgrp apache /usr/local/nagios/bin/nagios

/etc/init.d/nagios must be executed as root, otherwise the kill command inside it fails when the Nagios daemon is to be stopped. An easy way to do this is to create a wrapper for /etc/init.d/nagios which calls it with sudo. Use visudo to add the following line to /etc/sudoers.
apache  ALL=(ALL) NOPASSWD: /etc/init.d/nagios
This allows the apache user to run /etc/init.d/nagios as root with sudo.

Now create the wrapper with vi /usr/local/nagios/sbin/nagios_wrapper and add these lines:
#!/bin/bash
# Pass the requested action (start, stop, ...) to the initscript as root
sudo /etc/init.d/nagios "$1"

Making this wrapper executable.
chmod 775 /usr/local/nagios/sbin/nagios_wrapper


For the final test we restart Apache and start Nagios as a daemon.
/etc/init.d/apache2 restart
/etc/init.d/nagios start

Adding Nagios to the default runlevel.
rc-update add nagios default


It's time to log in to Nagios with either nagiosadmin or mike:
http://nagios1/nagios/

We configured Nagios so that mike has access to all host and service information as well as to all system and configuration information. However, he is not allowed to execute service commands, for example.

What can be seen are some basic localhost checks which already exist by default. If you are logged in as mike and try to disable the notifications for a service, you get an error message after the commit button is pressed. It indicates that the user mike is not allowed to do this, because it's a service command. To change this, the parameter authorized_for_all_service_commands in cgi.cfg must be set like this
authorized_for_all_service_commands=nagiosadmin,mike
and Nagios must be restarted.

Of course the user nagiosadmin is allowed to perform any operation from the web interface.

Screenshot
Nagios1


4) M Installing Fruity and Postfix
Ok so far, the biggest part is done. Nagios is up and running. Now we add Fruity to have a nice graphical frontend and Postfix which sends the notifications.
Let's go...
cd /var/www/nagios1/htdocs

Protecting the whole DocumentRoot with .htaccess, so that users have to authenticate to access Fruity:
vi ./.htaccess
Inserting these lines.
AuthType Basic
AuthName "Authentication required"
AuthUserFile /var/www/nagios1/.htpasswd
require valid-user

Creating an account for Fruity
htpasswd2 -c /var/www/nagios1/.htpasswd fruity
New password: 


Now that the directory is protected we start the installation of Fruity
wget http://puzzle.dl.sourceforge.net/sourceforge/fruity/fruity-1.0-rc2.tar.gz
tar xfvzp fruity-1.0-rc2.tar.gz
mv fruity-1.0-rc2 fruity
cd fruity
ln -s  /usr/local/nagios/share/images logos


Creating the MySQL database for Fruity.
mysql -uroot -p
Enter password:
mysql> create database fruity;
Query OK, 1 row affected (0.02 sec)

mysql> grant all privileges on fruity.* to 'fruity'@'localhost' identified by 'PASSWORD';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> exit

Importing the Fruity SQL data into the new database
mysql fruity -u root -p < fruity-mysql.sql


The installation of Fruity is completed, just a little bit of configuration is still necessary.
vi includes/config.inc
Change
$sys_config['logos_path'] = '/usr/local/groundwork/fruity/logos';
$sys_config['nagios_preflight'] = false;
$sitedb_config['username'] = 'root';
$sitedb_config['password'] = '';
$sys_config['nagios_start'] = '/etc/init.d/nagios start';
$sys_config['nagios_stop'] = '/etc/init.d/nagios stop';        
to
$sys_config['logos_path'] = '/var/www/nagios1/htdocs/fruity/logos';
$sys_config['nagios_preflight'] = true;
$sitedb_config['username'] = 'fruity';
$sitedb_config['password'] = 'PASSWORD';
$sys_config['nagios_start'] = '/usr/local/nagios/sbin/nagios_wrapper start';
$sys_config['nagios_stop'] = '/usr/local/nagios/sbin/nagios_wrapper stop';        


Fruity is installed and configured and we can enjoy the result.
http://nagios1/fruity/

Screenshot
Fruity1
Isn't that great? I love this software!


At this point you can import Nagios 1.x data. I did that once and it worked almost flawlessly. Only minor corrections were necessary. But here in the tutorial we set up a minimal configuration from scratch. As we have already set up Nagios we can tell Fruity to import its current configuration data into the Fruity database. To do so, click 'Import' in the main menu. Enter the following paths and press 'Begin Import'.
/usr/local/nagios/etc/nagios.cfg
/usr/local/nagios/etc/cgi.cfg
/usr/local/nagios/etc/resource.cfg

If no errors occurred, click 'Export' in the main menu to write the data back to Nagios. You may ask why we did this step when the configuration of Nagios was already done. The answer is that Fruity needs the whole Nagios configuration in its database. Before we start to make changes to the Nagios configuration with Fruity we must test that exporting and restarting Nagios from Fruity works.


The next step is to install Postfix. Note that this configuration of Postfix here is quick and dirty, just to send out notifications!
ssmtp is blocking Postfix so we remove it first.
emerge -C ssmtp
emerge postfix
vi /etc/postfix/main.cf
Set the following parameters:
myhostname = nagios1
mydomain = mymonitoringdomain.com
myorigin = $mydomain
mydestination = $myhostname, localhost.$mydomain, localhost
mydomain must be an existing internet domain, otherwise mail servers will refuse to accept mails from the monitoring server.

Setting the root alias to nagios.
vi /etc/mail/aliases
root:               nagios

Creating the alias map.
postalias hash:/etc/mail/aliases

Starting Postfix.
/etc/init.d/postfix start

As with all other services, we add Postfix to the default runlevel because we want it to be started automatically when the machine is rebooted.
rc-update add postfix default



5) MC Installing NRPE
The installation of NRPE has to be done on all involved servers. But only the monitored servers are running the NRPE daemon. The Nagios server needs only the check_nrpe program which communicates with the NRPE daemons on the monitored servers.

First we install NRPE on the Nagios server:
cd /tmp	
wget http://mesh.dl.sourceforge.net/sourceforge/nagios/nrpe-2.7.1.tar.gz
tar xfvzp nrpe-2.7.1.tar.gz
cd nrpe-2.7.1
./configure --enable-ssl --enable-command-args
make all

Copying the check_nrpe program into the Nagios plugin directory.
cp ./src/check_nrpe /usr/local/nagios/libexec


Second, we install the NRPE daemon on the monitored host. But before this step we install the Nagios standard plugins as we did on the Nagios server. NRPE can execute these plugins or self-written scripts.
cd /tmp
wget http://ovh.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
tar xfvzp nagios-plugins-1.4.7.tar.gz
cd nagios-plugins-1.4.7
./configure
make
make install

Now we install the NRPE daemon.
cd /tmp	
wget http://mesh.dl.sourceforge.net/sourceforge/nagios/nrpe-2.7.1.tar.gz
tar xfvzp nrpe-2.7.1.tar.gz
cd nrpe-2.7.1
./configure --enable-ssl --enable-command-args
make all

Copying the NRPE daemon to /usr/sbin:
cp ./src/nrpe /usr/sbin/

Copying the sample configuration file to /etc.
cp ./sample-config/nrpe.cfg /etc

Because this was a manual installation we have to create an initscript for the NRPE daemon.
vi /etc/init.d/nrpe
Put the following lines into this file.
#!/sbin/runscript
# Minimal Gentoo initscript (newer OpenRC systems use #!/sbin/openrc-run)
start() {
        /usr/sbin/nrpe -c /etc/nrpe.cfg -d
}
stop() {
        kill "$(cat /var/run/nrpe.pid)"
        rm -f /var/run/nrpe.pid
}

Adding execution permissions to the initscript.
chmod a+x /etc/init.d/nrpe

Creating the user and group for the NRPE daemon:
useradd -d /usr/local/nagios -s /usr/sbin/nologin nagios

We are almost done with NRPE, just some minor configuration to do.
vi /etc/nrpe.cfg
# Allow access only from our Nagios server
allowed_hosts=127.0.0.1,192.168.1.33
# Allow arguments to be passed to commands which are sent to NRPE
dont_blame_nrpe=1

Starting the NRPE daemon.
/etc/init.d/nrpe start

Adding it to the default runlevel.
rc-update add nrpe default



6) C Making sure that port 5666 is open
Port 5666 is where NRPE on the monitored hosts is listening. On the Nagios server use telnet to check this port on the monitored host.
telnet 192.168.1.56 5666
Trying 192.168.1.56...
Connected to 192.168.1.56.
Escape character is '^]'.

This indicates that the NRPE daemon can be reached from the monitoring host, which is exactly what we want. Otherwise check the firewall settings on the monitored host. Use 'netstat -tan' there to make sure that the NRPE daemon is listening on port 5666.
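
If telnet is not installed on the Nagios server, the same test can be sketched with bash alone. This assumes a bash built with /dev/tcp support and the coreutils timeout command; the function name port_open is made up.

```shell
# Hypothetical helper: succeed if a TCP connection to host $1, port $2
# can be opened within 3 seconds. The connection is closed again as soon
# as the inner bash process exits.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

Usage on nagios1: port_open 192.168.1.56 5666 && echo "NRPE port is reachable"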



7) M Setting up a normal Nagios TCP check
The installation orgy is completed, now we come to the fun part :-)

First we have to make some important modifications to the configuration in Fruity.
Opening the Fruity page.
http://nagios1/fruity

Select 'Main Config', then 'External Commands' and enable 'Check External Commands' here. After that press 'Update External Command Configuration'. It is unclear why Fruity does not import this parameter correctly, since we already set it manually in nagios.cfg, but now it is correct. Without it, it is not possible to restart the Nagios process from the Nagios page, for example. Another issue concerns the two commands host-notify-by-email and notify-by-email. Please change them so that each command is on one single line; they are wrapped below only for readability.
host-notify-by-email:
/usr/bin/printf "%b" "Subject:Host $HOSTSTATE$ alert for 
$HOSTNAME$!\n\n***** Nagios 2.8 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n
Host: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\n
Info: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" |  sendmail $CONTACTEMAIL$
notify-by-email:
/usr/bin/printf "%b" "Subject:** $NOTIFICATIONTYPE$ alert - 
$HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **\n\n
***** Nagios 2.8 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\n
Service: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\n
State: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n
$SERVICEOUTPUT$" | sendmail  $CONTACTEMAIL$
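
The commands use printf with "%b" because this format interprets the \n sequences inside the single-line command as line breaks. To get a feeling for what the resulting mail looks like, the printf part can be run by hand. The sketch below is a shortened version of host-notify-by-email with sample values in place of the Nagios macros; the function name and the values are made up.

```shell
# Hypothetical helper: render a shortened host notification text.
# $1 = host name, $2 = host state, $3 = host address
preview_host_notification() {
    printf "%b" "Subject:Host $2 alert for $1!\n\n***** Nagios 2.8 *****\n\nHost: $1\nState: $2\nAddress: $3\n"
}

# Sample values standing in for $HOSTNAME$, $HOSTSTATE$ and $HOSTADDRESS$.
preview_host_notification c1 DOWN 192.168.1.56
```

The first line of the output becomes the mail subject, and the empty line after it separates the header from the body.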


Now it's time to set up the first Nagios check. For that we have to create host and service objects in Fruity. First create a new host for our monitored host. Click 'Hosts' then 'Add A New Child Host'. Fill out the three fields:
c1
Our first monitored host.
192.168.1.56

Select the template 'generic-host' and click 'Add Host'.

Second we create a new service for this host. Click on the newly created c1 host, then 'Services'. Here we add a service with 'Create A New Service For This Host'. Use 'TCP' for the description and the template 'generic-service'. Then click 'Add Service'. A service acts like a container; it does not check anything by itself. Therefore we have to add a check to this service. Click 'Checks', then 'Edit'. Select the box 'Include in Definition' for 'Check Command' and select the 'check_tcp' check command. Accordingly set 1 for 'Maximum Check Attempts', 60 for 'Normal Check Interval In Time-Units', 60 for 'Retry Check Interval In Time-Units' and activate 'Notification Period'. Finally click 'Update Checks'. To set the port number we want to be checked, click 'Check Command Parameters' and simply add a parameter with the value 22.

Then click 'Hosts' on the main menu, then 'c1', then 'Checks' and 'Edit'. Here set also 1 for 'Maximum Check Attempts' and press 'Update Checks'.

Another thing to do is to set the notification parameters for the host and the service: Select 'Hosts' from the main menu, then 'c1', then 'Notifications', then 'Edit' and set 'Notification Interval in Time-Units' to 60. Also select the notification options: Down, Unreachable, Recovery. Then click 'Update Notifications'.

For the service notification parameters set 'Notification Interval in Time-Units' to 60, activate 'Notification Period' and select the notification options: Warning, Unknown, Critical, Recovery. Then click 'Update Notifications'.

Add a default contact group for the host 'c1': Click on 'Hosts' from the main menu, then 'c1', then 'Contact Groups' and add the contact group admins. Do the same for the service 'TCP'.

In order to actually receive notifications you should set a correct email address for the standard contact: Click 'Contacts' in the main menu, then 'nagios-admin' and 'Edit'. Change the email address to something valid, then press 'Modify Contact'.

All these steps are necessary. Otherwise Nagios will not accept the configuration. Now click 'Export' from the main menu to pass all the data to Nagios.

When you go to the Nagios page you will see the new host 'c1' and its service 'TCP'. All fields are grey because Nagios has just been restarted by Fruity. After a while everything turns green, indicating that all hosts and services are up (provided SSH is running on c1 and you allow SSH connections from nagios1).

Screenshot
Nagios2

This check simply makes sure that SSH is accessible on c1. Nagios executes the program check_tcp from the standard plugins and processes the result. This is a check that can be made from outside the monitored server. For more sophisticated checks NRPE comes into play.



8) MC Setting up a Nagios NRPE load check
The CPU load of c1 cannot be checked from outside like SSH in the previous example. It is possible to run SNMP on c1 and ask this daemon for the information. Another way is to ask our NRPE daemon on c1, which is already up and running. The advantage of NRPE is that, unlike with SNMP for example, you are not limited to what the daemon itself offers. NRPE itself does no checks at all. It can use the standard plugins which are installed locally on c1, or it can use self-written scripts. In this example we can use the check_load program from the standard plugins, which is already configured in nrpe.cfg by default:
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

Before NRPE checks can be executed from the Nagios server we must create the command check_nrpe in Fruity: Click 'Commands' in the main menu then click 'Add A New Command'. 'Command Name' is check_nrpe and 'Command Line' is
$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
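
Before Nagios executes this command line it replaces the macros with real values. The following sketch imitates that substitution for the three macros used here, so that you can see which command ends up being run. It assumes that $USER1$ points to /usr/local/nagios/libexec, as in the sample resource.cfg; the function name expand_check_nrpe is made up.

```shell
# Hypothetical helper: expand the macros of the check_nrpe command line.
# $1 = plugin directory ($USER1$), $2 = host address ($HOSTADDRESS$),
# $3 = first check command parameter ($ARG1$)
expand_check_nrpe() {
    # Single quotes keep the macro names literal.
    local line='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
    line=${line//'$USER1$'/$1}
    line=${line//'$HOSTADDRESS$'/$2}
    line=${line//'$ARG1$'/$3}
    echo "$line"
}

expand_check_nrpe /usr/local/nagios/libexec 192.168.1.56 check_load
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.1.56 -c check_load
```

Running the printed command by hand on nagios1 is a quick way to test the NRPE connection before anything is exported from Fruity.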

Now we can use the command check_nrpe to establish a check for the CPU load on c1. Just add another service in Fruity to the host c1, as you did already in the previous chapter. Then add a check to this service and use 'check_nrpe' for the 'Check Command'. But Nagios needs to know which check we actually want to be executed by NRPE on c1. This is done again by clicking 'Check Command Parameters'. Now add a parameter with the value 'check_load'. Note that the value here must match exactly what is written in square brackets in nrpe.cfg on c1.

In this example the thresholds are set in nrpe.cfg on c1, so we don't need to pass additional parameters. For more in-depth information on how NRPE works and how to use it with parameters, see my previous article about Nagios and NRPE or the NRPE documentation. Links can be found at the bottom of this page. The advantage of passing thresholds via Nagios to NRPE instead of setting them in nrpe.cfg is that the configuration is centralized in Nagios. Otherwise you must go to every host and make the changes there. But this is a tutorial and I want to make things as clear as possible, so we stick with the first configuration.

Now it's time to export the data to Nagios and enjoy the result:

Screenshot
Nagios3


9) M Making sure that alerts occur as expected and notifications arrive
We come to the last step. To make sure we actually get informed by Nagios if something goes wrong, you should shut down the monitored services if possible, or at least simulate an outage. For services which depend on thresholds, like the load check, you can lower the threshold until the check returns an alert. Make sure that you get notifications for all these services on all configured email accounts, for the alert as well as for the recovery once you have switched them back on.

That's it. Congratulations! At this point you have a very nice and flexible monitoring solution which is easy to configure.

There is much more to learn, for example how to monitor the Nagios daemon itself, how to have services restarted by Nagios, dependencies, escalations, failover monitoring ...
There are many more fascinating things to explore in the Nagios documentation.



10) MC Troubleshooting
In case of problems with certain checks of the standard plugins you can run make check from the source directory to get more information:
cd /tmp/nagios-plugins-1.4.7
make check


If NRPE is not working correctly, make sure that

NRPE is compiled on both sides with
./configure --enable-command-args --enable-ssl 
make all

Port 5666 on the monitored servers is not blocked by a firewall

/etc/nrpe.cfg is configured properly:
server_address=local IP
allowed_hosts=IP of the Nagios server
dont_blame_nrpe=1

The daemon is running: ps aux | grep nrpe should display something like this
/usr/sbin/nrpe -c /etc/nrpe.cfg -d
The daemon is actually running with the user you configured in nrpe.cfg
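
The two nrpe.cfg settings which most often cause trouble can also be checked with a small script. This is only a sketch; the function name check_nrpe_cfg is made up, and it assumes that allowed_hosts is a plain comma separated list of IP addresses as in this tutorial.

```shell
# Hypothetical helper: check that an nrpe.cfg allows the given Nagios server
# and permits command arguments. Prints one line per finding and returns
# non-zero if something looks wrong.
# $1 = path to nrpe.cfg, $2 = IP address of the Nagios server
check_nrpe_cfg() {
    local cfg=$1 server=$2 rc=0 hosts

    # Extract the value of allowed_hosts and drop any blanks.
    hosts=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tr -d ' ')
    case ",$hosts," in
        *",$server,"*) echo "OK: $server is in allowed_hosts" ;;
        *) echo "PROBLEM: $server is missing from allowed_hosts"; rc=1 ;;
    esac

    if grep -q '^dont_blame_nrpe=1' "$cfg"; then
        echo "OK: command arguments are enabled"
    else
        echo "PROBLEM: dont_blame_nrpe is not set to 1"
        rc=1
    fi
    return $rc
}
```

Usage on c1: check_nrpe_cfg /etc/nrpe.cfg 192.168.1.33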



Links
Home Nagios
Home Nagios plugins
Documentation NRPE

Nagios 3 with failover on SLES 11 with NConf, NRPE, NSCA, PNP4Nagios and NagVis
Nagios and Cacti
Nagios Training