#128 Linux 03.06.2007
The monitoring server tutorial with Nagios 2.8, NRPE 2.7.1 and Fruity

So many requests concerning Nagios and NRPE reach me that I decided to write a tutorial for setting up a complete Nagios monitoring server. Here it is; I hope you like it and that it helps you get everything done as easily as possible. I am using Gentoo, so the tutorial describes the setup on a Gentoo system, but it should be possible to follow the instructions on any other distribution.

Nagios in combination with Fruity and NRPE gives a maximum of usability and flexibility. Usually there is no longer any need to touch a Nagios configuration file directly. With NRPE we can monitor nearly anything, because NRPE accepts self-written check scripts. The result is a Nagios monitoring service with Fruity as its frontend. NRPE is used to execute remote checks on the monitored hosts. Postfix is used to send notifications. MySQL is also needed, because Fruity stores its data in it.

Before we start I would like to briefly explain the basic configuration workflow once everything is set up. The first example shows the procedure for setting up a normal Nagios check. The second one shows the procedure for a Nagios check with NRPE. This is just to give an overview and make things clearer.

What does "normal Nagios check" mean? It is a check which can be run solely by the Nagios daemon and the standard plugins. An NRPE check, on the other hand, can only be performed with the help of the NRPE daemon running on the monitored host. This is necessary if we want to check parameters which cannot be accessed from outside the monitored host. A normal Nagios check is used, for example, to test whether a website is available. An example of a parameter which needs an NRPE check is the CPU load.
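Since NRPE accepts self-written check scripts, it may help to see the shape of one. The following is only a minimal sketch and not part of the setup described here. It assumes nothing beyond the usual plugin convention that a check prints one status line and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name check_threshold and the thresholds in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: compare an integer value against a warning and a
# critical threshold and report the result the way a Nagios plugin does:
# one status line on stdout, the state in the return code.
check_threshold() {
    local value=$1 warn=$2 crit=$3

    # Anything that is not a non-negative integer is reported as UNKNOWN.
    case "$value" in
        ''|*[!0-9]*) echo "UNKNOWN - invalid value '$value'"; return 3 ;;
    esac

    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi
    echo "OK - value is $value"
    return 0
}

# Possible use inside a check script (thresholds chosen only for illustration):
# count the regular files directly below /tmp, warn at 1000, alert at 5000.
# check_threshold "$(find /tmp -maxdepth 1 -type f | wc -l | tr -d ' ')" 1000 5000
```

A script built around such a function would end with `exit $?` right after the call, so that NRPE can pass the state on to Nagios.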
These are the steps to set up a new check:

Normal Nagios check

On the monitoring host:
1) Opening the Fruity page
2) Adding a new service to a host
3) Adding a check to this service
4) Configuring this check by adding parameters if necessary
5) Exporting the data to Nagios
6) Opening the Nagios page and verifying that the new service check is set up correctly

Nagios check with NRPE

On the monitored host:
1) Creating your own check script or using one of the standard plugins
2) Making an entry for this check in /etc/nrpe.cfg
3) Restarting NRPE

On the monitoring host:
1) Opening the Fruity page
2) Adding a new service to a host
3) Adding an NRPE check to this service
4) Configuring this check by adding parameters
5) Exporting the data to Nagios
6) Opening the Nagios page and verifying that the new service check is set up correctly

Once the system is completely set up, adding new checks is extremely easy and takes only a few seconds or minutes.

For this example we use the two physical hosts 'nagios1' and 'c1' in a LAN:

The monitoring host, which has the Nagios daemon running.
nagios1: 192.168.1.33

The host which is being monitored and has the NRPE daemon running.
c1: 192.168.1.56

OK, here we go. These are the steps to set up the whole server.

'M' means on the monitoring host (Nagios)
'C' means on the monitored host (NRPE)
'MC' means on both

Installation
1) M Setting the date and time
2) M Installing Apache2, PHP, MySQL, bind-tools, net-snmp and sudo
3) M Installing Nagios and the standard plugins
4) M Installing Fruity and Postfix
5) MC Installing NRPE

Configuration / Test
6) C Making sure that port 5666 is open
7) M Setting up a normal Nagios TCP check
8) MC Setting up a Nagios NRPE load check
9) M Making sure that alerts occur as expected and notifications arrive
10) MC Troubleshooting

My experience on Gentoo is that it is better to install Nagios and NRPE manually. I ran into several problems when I installed these packages with portage.
One example is that emerging the standard plugins fails on systems which have no localhost or an IP other than 127.0.0.1 (e.g. vservers). This is because 127.0.0.1 is hardcoded in the configure script. So we install Nagios, the Nagios standard plugins and NRPE manually and the rest via portage. Fruity is not yet available in portage, so it is also a candidate for a manual installation.

1) M Setting the date and time

First we install rdate, which can easily be used to set the correct date and time.

    emerge rdate

It is best to choose a time server which is as close as possible to the location of our Nagios server. This command sets the correct time now.

    /usr/bin/rdate -s time.fu-berlin.de

For a daily synchronisation we put this line in /etc/crontab.

    03 01 * * * root /usr/bin/rdate -s time.fu-berlin.de > /dev/null

2) M Installing Apache2, PHP, MySQL, bind-tools, net-snmp and sudo

Installing and starting Apache2.

    emerge apache
    /etc/init.d/apache2 start

Adding Apache2 to the default runlevel.

    rc-update add apache2 default

http://nagios1/ should show the standard page for a successful Apache installation.

Installing PHP.

    USE="apache2 calendar cgi gd mysql mysqli sockets xml zip" emerge php

Installing MySQL.

    emerge dev-db/mysql

Installing the MySQL database.

    /usr/bin/mysql_install_db

Starting MySQL.

    /etc/init.d/mysql start

Setting the root password for the MySQL server.

    /usr/bin/mysqladmin -u root password 'NEW-PASSWORD'

Adding MySQL to the default runlevel.

    rc-update add mysql default

Installing bind-tools.

    emerge bind-tools

Installing net-snmp.

    USE="-lm_sensors" emerge net-snmp

Installing sudo.

    emerge app-admin/sudo

3) M Installing Nagios and the standard plugins

Now that the system has all additional packages, we proceed with the actual Nagios installation.

Creating the Nagios user and group.

    useradd -m -d /usr/local/nagios -s /bin/bash nagios

Identifying the user the Apache daemon runs as.
    grep "^User" /etc/apache2/httpd.conf
    User apache

We need a new group whose members are the Apache user and the Nagios user. The -a option appends the group instead of replacing the supplementary groups a user already has.

    groupadd nagcmd
    usermod -a -G nagcmd apache
    usermod -a -G nagcmd nagios

Downloading and compiling Nagios.

    cd /tmp
    wget http://puzzle.dl.sourceforge.net/sourceforge/nagios/nagios-2.8.tar.gz
    tar xfvzp nagios-2.8.tar.gz
    cd nagios-2.8

For the tutorial we use the standard options here, so it's just ./configure.

    ./configure
    make all
    make install

Setting permissions on the directory which holds the external command file.

    make install-commandmode

Installing the sample configuration files.

    make install-config

Installing the initscript for the Nagios daemon.

    make install-init

Downloading and compiling the standard plugins. Nagios on its own does not execute any checks; the standard plugins do this job. They must be installed, otherwise Nagios is useless.

    cd /tmp
    wget http://ovh.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
    tar xfvzp nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
    ./configure
    make
    make install

Setting up the Nagios web interface. First we create a new virtual host for Nagios.

    cd /etc/apache2/vhosts.d

Opening a new virtual host configuration file.

    vi nagios1.conf

Adding the following lines:

NameVirtualHost 192.168.1.33:80
<VirtualHost 192.168.1.33:80>
ServerAdmin webmaster@nagios1
DocumentRoot /var/www/nagios1/htdocs
ServerName nagios1
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>
Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>
<Directory "/var/www/nagios1/htdocs">
AllowOverride AuthConfig
Order allow,deny
Allow from all
</Directory>
CustomLog /var/www/nagios1/logs/nagios1-access_log combined
ErrorLog /var/www/nagios1/logs/nagios1-error_log
</VirtualHost>
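The virtual host refers to a DocumentRoot and to log files whose directories may not exist yet. As a rough sketch, the following function lists the directories that are still missing for a given vhost file. The name missing_vhost_paths is hypothetical, and the parsing is deliberately simple: it assumes one directive per line and unquoted paths, as in the configuration shown here.

```shell
#!/bin/bash
# Hypothetical helper: print every directory that a vhost file needs for
# DocumentRoot, CustomLog and ErrorLog but that does not exist yet.
# For the two log directives the parent directory of the log file is checked.
missing_vhost_paths() {
    local conf=$1 directive path rest

    # Each line is split into directive, path and any remaining fields
    # (for example the log format name "combined").
    while read -r directive path rest || [ -n "$directive" ]; do
        case "$directive" in
            DocumentRoot) ;;
            CustomLog|ErrorLog) path=$(dirname "$path") ;;
            *) continue ;;
        esac
        [ -d "$path" ] || echo "$path"
    done < "$conf" | sort -u
}
```

Run as `missing_vhost_paths /etc/apache2/vhosts.d/nagios1.conf`, it prints nothing once all directories are in place.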
Creating the directories for the DocumentRoot and logs.

    mkdir -p /var/www/nagios1/htdocs
    mkdir -p /var/www/nagios1/logs

Now we create the accounts for all users who need access to the Nagios web interface and CGIs. In this tutorial we use nagiosadmin for the Nagios administrator and mike, who has limited permissions. The -c option creates the password file, so it is used only for the first account; repeating it for mike would overwrite the file and delete nagiosadmin.

    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
    New password:
    Re-type new password:
    htpasswd2 /usr/local/nagios/etc/htpasswd.users mike
    New password:
    Re-type new password:

The next step is to tell Nagios which of the authenticated users have access to which functionality. This is done in the file cgi.cfg.

    cd /usr/local/nagios/etc

Here we find all the Nagios configuration files. At the moment there are only sample files. As we want to set the permissions, we create cgi.cfg from a sample file.

    cp cgi.cfg-sample cgi.cfg
    vi cgi.cfg

Make sure that the parameter use_authentication is set to 1. The user nagiosadmin needs access to all functions of Nagios, whereas mike is allowed to use only some of them. Please change the following lines

    #authorized_for_system_information=nagiosadmin,theboss,jdoe
    #authorized_for_configuration_information=nagiosadmin,jdoe
    #authorized_for_system_commands=nagiosadmin
    #authorized_for_all_services=nagiosadmin,guest
    #authorized_for_all_hosts=nagiosadmin,guest
    #authorized_for_all_service_commands=nagiosadmin
    #authorized_for_all_host_commands=nagiosadmin

to

    authorized_for_system_information=nagiosadmin,mike
    authorized_for_configuration_information=nagiosadmin,mike
    authorized_for_system_commands=nagiosadmin
    authorized_for_all_services=nagiosadmin,mike
    authorized_for_all_hosts=nagiosadmin,mike
    authorized_for_all_service_commands=nagiosadmin
    authorized_for_all_host_commands=nagiosadmin

The Nagios daemon needs a main configuration file to run:

    cp nagios.cfg-sample nagios.cfg
    vi nagios.cfg

Here we set check_external_commands to 1. Otherwise it is not possible to submit commands through the CGI interface. Later on we want to import nagios.cfg with Fruity.
Apparently Fruity does not know the buffer slot parameters, so these lines must be commented out. Otherwise Fruity refuses to import the nagios.cfg file.

    #check_result_buffer_slots=4096
    #external_command_buffer_slots=4096

Setting the correct permissions for the external command file.

    chown nagios:nagcmd /usr/local/nagios/var/rw
    chmod u+rwx /usr/local/nagios/var/rw
    chmod g+rwx /usr/local/nagios/var/rw
    chmod g+s /usr/local/nagios/var/rw

If we make a test now and start Nagios in the foreground we still get some errors.

    /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
    ...

The errors can be fixed by copying the necessary sample files.

    cp commands.cfg-sample commands.cfg
    cp resource.cfg-sample resource.cfg
    cp localhost.cfg-sample localhost.cfg
    chown nagios:nagios /usr/local/nagios/var/nagios.log

Now Nagios can be started in the foreground without errors.

    /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
    Ctrl-c

We use Fruity later on to modify the previously changed configuration files. Therefore we make some more adjustments to the permissions.

    chown nagios:apache *.cfg
    chgrp apache .
    chmod g+w *.cfg

We also want to restart Nagios from Fruity, so more modifications have to be made. The nagios binary needs a different group.

    chgrp apache /usr/local/nagios/bin/nagios

/etc/init.d/nagios must be executed as root, otherwise the kill command inside it fails when the Nagios daemon is to be stopped. An easy way to do this is to create a wrapper for /etc/init.d/nagios which calls it with sudo. Use visudo to add the following line to /etc/sudoers.

    apache ALL=(ALL) NOPASSWD: /etc/init.d/nagios

This allows the apache user to run /etc/init.d/nagios as root with sudo. Now create the wrapper with

    vi /usr/local/nagios/sbin/nagios_wrapper

and add these lines:

    #!/bin/bash
    sudo /etc/init.d/nagios $1

Making this wrapper executable.
    chmod 775 /usr/local/nagios/sbin/nagios_wrapper

For the final test we restart Apache and start Nagios as a daemon.

    /etc/init.d/apache2 restart
    /etc/init.d/nagios start

Adding Nagios to the default runlevel.

    rc-update add nagios default

It's time to log in to Nagios as either nagiosadmin or mike:

http://nagios1/nagios/

We configured Nagios so that mike has access to all host and service information as well as to all system and configuration information, but he is not allowed to execute a service command, for example. What can be seen are some basic localhost checks which already exist by default. If you are logged in as mike and try to disable the notifications for a service, you get an error message after the commit button is pressed. It indicates that the user mike is not allowed to do this, because it's a service command. To change this, the parameter authorized_for_all_service_commands in cgi.cfg must be set like this

    authorized_for_all_service_commands=nagiosadmin,mike

and Nagios must be restarted. Of course the user nagiosadmin is allowed to perform any operation from the web interface.

[Screenshot]

4) M Installing Fruity and Postfix

OK so far, the biggest part is done. Nagios is up and running. Now we add Fruity to have a nice graphical frontend, and Postfix, which sends the notifications. Let's go...

    cd /var/www/nagios1/htdocs

Protecting the whole DocumentRoot with .htaccess, so that users have to authenticate to access Fruity:

    vi ./.htaccess

Inserting these lines.

    AuthType Basic
    AuthName "Authentication required"
    AuthUserFile /var/www/nagios1/.htpasswd
    require valid-user

Creating an account for Fruity.

    htpasswd2 -c /var/www/nagios1/.htpasswd fruity
    New password:

Now that the directory is protected we start the installation of Fruity.

    wget http://puzzle.dl.sourceforge.net/sourceforge/fruity/fruity-1.0-rc2.tar.gz
    tar xfvzp fruity-1.0-rc2.tar.gz
    mv fruity-1.0-rc2 fruity
    cd fruity
    ln -s /usr/local/nagios/share/images logos

Creating the MySQL database for Fruity.
    mysql -uroot -p
    Enter password:
    mysql> create database fruity;
    Query OK, 1 row affected (0.02 sec)
    mysql> grant all privileges on fruity.* to 'fruity'@'localhost' identified by 'PASSWORD';
    Query OK, 0 rows affected (0.00 sec)
    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
    mysql> exit

Importing the Fruity SQL data into the new database.

    mysql fruity -u root -p < fruity-mysql.sql

The installation of Fruity is complete; only a little configuration is still necessary.

    vi includes/config.inc

Change

    $sys_config['logos_path'] = '/usr/local/groundwork/fruity/logos';
    $sys_config['nagios_preflight'] = false;
    $sitedb_config['username'] = 'root';
    $sitedb_config['password'] = '';
    $sys_config['nagios_start'] = '/etc/init.d/nagios start';
    $sys_config['nagios_stop'] = '/etc/init.d/nagios stop';

to

    $sys_config['logos_path'] = '/var/www/nagios1/htdocs/fruity/logos';
    $sys_config['nagios_preflight'] = true;
    $sitedb_config['username'] = 'fruity';
    $sitedb_config['password'] = 'PASSWORD';
    $sys_config['nagios_start'] = '/usr/local/nagios/sbin/nagios_wrapper start';
    $sys_config['nagios_stop'] = '/usr/local/nagios/sbin/nagios_wrapper stop';

Fruity is installed and configured and we can enjoy the result.

http://nagios1/fruity/

[Screenshot]

Isn't that great? I love this software! At this point you can import Nagios 1.x data. I did that once and it worked almost flawlessly; only minor corrections were necessary. But here in the tutorial we set up a minimal configuration from scratch.

As we have already set up Nagios, we can tell Fruity to import its current configuration data into the Fruity database. To do so, click 'Import' in the main menu. Enter the following paths and press 'Begin Import'.

    /usr/local/nagios/etc/nagios.cfg
    /usr/local/nagios/etc/cgi.cfg
    /usr/local/nagios/etc/resource.cfg

If no errors occurred, click 'Export' in the main menu to write the data back to Nagios. You may ask why we did this step, as the configuration of Nagios was already done?
The answer is that Fruity needs the whole Nagios configuration in its database. Before we start to make changes to the Nagios configuration with Fruity, we must test that exporting and restarting Nagios from Fruity works.

The next step is to install Postfix. Note that this configuration of Postfix is quick and dirty, just enough to send out notifications! ssmtp blocks Postfix, so we remove it first.

    emerge -C ssmtp
    emerge postfix
    vi /etc/postfix/main.cf

Set the following parameters:

    myhostname = nagios1
    mydomain = mymonitoringdomain.com
    myorigin = $mydomain
    mydestination = $myhostname, localhost.$mydomain, localhost

mydomain must be an existing internet domain, otherwise mail servers will refuse to accept mail from the monitoring server.

Setting the root alias to nagios.

    vi /etc/mail/aliases
    root: nagios

Creating the alias map.

    postalias hash:/etc/mail/aliases

Starting Postfix.

    /etc/init.d/postfix start

As with all other services, we add Postfix to the default runlevel, because we want it to be started automatically when the machine is rebooted.

    rc-update add postfix default

5) MC Installing NRPE

NRPE has to be installed on all servers involved, but only the monitored servers run the NRPE daemon. The Nagios server needs only the check_nrpe program, which communicates with the NRPE daemons on the monitored servers.

First we install NRPE on the Nagios server:

    cd /tmp
    wget http://mesh.dl.sourceforge.net/sourceforge/nagios/nrpe-2.7.1.tar.gz
    tar xfvzp nrpe-2.7.1.tar.gz
    cd nrpe-2.7.1
    ./configure --enable-ssl --enable-command-args
    make all

Copying the check_nrpe program into the Nagios plugin directory.

    cp ./src/check_nrpe /usr/local/nagios/libexec

Second, we install the NRPE daemon on the monitored host. But before this step we install the Nagios standard plugins as we did on the Nagios server. NRPE can execute these plugins or self-written scripts.
    cd /tmp
    wget http://ovh.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
    tar xfvzp nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
    ./configure
    make
    make install

Now we install the NRPE daemon.

    cd /tmp
    wget http://mesh.dl.sourceforge.net/sourceforge/nagios/nrpe-2.7.1.tar.gz
    tar xfvzp nrpe-2.7.1.tar.gz
    cd nrpe-2.7.1
    ./configure --enable-ssl --enable-command-args
    make all

Copying the NRPE daemon to /usr/sbin:

    cp ./src/nrpe /usr/sbin/

Copying the sample configuration file to /etc.

    cp ./sample-config/nrpe.cfg /etc

Because this was a manual installation we have to create an initscript for the NRPE daemon.

    vi /etc/init.d/nrpe

Put the following lines into this file. Gentoo initscripts are interpreted by /sbin/runscript, so the file starts with that interpreter line.

#!/sbin/runscript
start() {
/usr/sbin/nrpe -c /etc/nrpe.cfg -d
}
stop() {
kill `cat /var/run/nrpe.pid`
rm /var/run/nrpe.pid
}
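The stop() function above assumes that /var/run/nrpe.pid exists and points at a running process. A slightly more defensive variant might look like the following sketch. The function name stop_by_pidfile is hypothetical and is not something NRPE or Gentoo provides; the PID file path is passed in as a parameter so that the logic can be tried out on any file.

```shell
#!/bin/bash
# Hypothetical helper: stop the process whose PID is stored in a PID file.
# Returns 1 if there is no PID file; otherwise signals the process if it
# is still alive and removes the PID file.
stop_by_pidfile() {
    local pidfile=$1 pid

    [ -f "$pidfile" ] || { echo "no PID file at $pidfile"; return 1; }
    pid=$(cat "$pidfile")

    # kill -0 sends no signal; it only tests whether the process exists
    # and may be signalled by the current user.
    if kill -0 "$pid" 2>/dev/null; then
        kill "$pid"
    else
        echo "process $pid is not running, removing stale PID file"
    fi
    rm -f "$pidfile"
}
```

Inside the initscript, stop() could then consist of the single call `stop_by_pidfile /var/run/nrpe.pid`.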
Adding execution permissions to the initscript.

    chmod a+x /etc/init.d/nrpe

Creating the user and group for the NRPE daemon:

    useradd -d /usr/local/nagios -s /usr/sbin/nologin nagios

We are almost done with NRPE; only some minor configuration remains.

    vi /etc/nrpe.cfg

    # Allow access only from our Nagios server
    allowed_hosts=127.0.0.1,192.168.1.33
    # Allow arguments to be passed to commands which are sent to NRPE
    dont_blame_nrpe=1

Starting the NRPE daemon.

    /etc/init.d/nrpe start

Adding it to the default runlevel.

    rc-update add nrpe default

6) C Making sure that port 5666 is open

Port 5666 is where NRPE on the monitored hosts is listening. On the Nagios server use telnet to check this port on the monitored host.

    telnet 192.168.1.56 5666
    Trying 192.168.1.56...
    Connected to 192.168.1.56.
    Escape character is '^]'.

This indicates that the NRPE daemon can be reached from the monitoring host, which is exactly what we want. Otherwise check the firewall settings on the monitored host. Use 'netstat -tan' there to make sure that the NRPE daemon is listening on port 5666.

7) M Setting up a normal Nagios TCP check

The installation orgy is completed, now we come to the fun part :-)

First we have to make some important modifications to the configuration in Fruity. Opening the Fruity page.

http://nagios1/fruity

Select 'Main Config', then 'External Commands' and enable 'Check External Commands' here. After that press 'Update External Command Configuration'. It is unclear why Fruity does not import this parameter correctly, because we already set it manually in nagios.cfg. But this is no problem; now it is correct. Otherwise it would not be possible, for example, to restart the Nagios process from the Nagios page.

Another issue concerns the two commands host-notify-by-email and notify-by-email. Please change them as follows, and put each command on one single line, unlike here.
host-notify-by-email:

    /usr/bin/printf "%b" "Subject:Host $HOSTSTATE$ alert for $HOSTNAME$!\n\n***** Nagios 2.8 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n
    Host: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\n
    Info: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | sendmail $CONTACTEMAIL$

notify-by-email:

    /usr/bin/printf "%b" "Subject:** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **\n\n
    ***** Nagios 2.8 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\n
    Service: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\n
    State: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n
    $SERVICEOUTPUT$" | sendmail $CONTACTEMAIL$

Now it's time to set up the first Nagios check. For that we have to create host and service objects in Fruity.

First create a new host for our monitored host. Click 'Hosts', then 'Add A New Child Host'. Fill out the three fields:

    c1
    Our first monitored host.
    192.168.1.56

Select the template 'generic-host' and click 'Add Host'.

Second, we create a new service for this host. Click on the newly created c1 host, then 'Services'. Here we add a service with 'Create A New Service For This Host'. Use 'TCP' for the description and the template 'generic-service'. Then click 'Add Service'.

A service acts like a container; it does not check anything. Therefore we have to add a check to this service. Click 'Checks', then 'Edit'. Select the box 'Include in Definition' for 'Check Command' and select the 'check_tcp' check command. Accordingly set 1 for 'Maximum Check Attempts', 60 for 'Normal Check Interval In Time-Units', 60 for 'Retry Check Interval In Time-Units' and activate 'Notification Period'. Finally click 'Update Checks'. To set the port number we want to be checked, click 'Check Command Parameters' and simply add a parameter with the value 22.

Then click 'Hosts' on the main menu, then 'c1', then 'Checks' and 'Edit'. Here also set 1 for 'Maximum Check Attempts' and press 'Update Checks'.
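To see what the parameter 22 is used for, it can help to expand the command line by hand. The sketch below assumes that check_tcp is defined roughly as in the sample commands.cfg, that is as $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$, and that $USER1$ points to /usr/local/nagios/libexec. The function expand_command is a hypothetical illustration of the macro substitution, not a Nagios tool, and it only handles these three macros.

```shell
#!/bin/bash
# Hypothetical helper: expand three Nagios macros in a command line.
#   $1 = command line template
#   $2 = value for $USER1$ (the plugin directory)
#   $3 = value for $HOSTADDRESS$
#   $4 = value for $ARG1$ (the first check command parameter)
# The values must not contain the characters | or &, because they are
# inserted into a sed replacement.
expand_command() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|'"$2"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$3"'|g' \
        -e 's|\$ARG1\$|'"$4"'|g'
}

expand_command '$USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$' \
    /usr/local/nagios/libexec 192.168.1.56 22
# prints: /usr/local/nagios/libexec/check_tcp -H 192.168.1.56 -p 22
```

The printed line can be run by hand on nagios1 to try the check before it is exported.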
Another thing to do is to set the notification parameters for the host and the service: Select 'Hosts' from the main menu, then 'c1', then 'Notifications', then 'Edit' and set 'Notification Interval in Time-Units' to 60. Also select the notification options: Down, Unreachable, Recovery. Then click 'Update Notifications'. For the service notification parameters set 'Notification Interval in Time-Units' to 60, activate 'Notification Period' and select the notification options: Warning, Unknown, Critical, Recovery. Then click 'Update Notifications'.

Add a default contact group for the host 'c1': Click on 'Hosts' from the main menu, then 'c1', then 'Contact Groups' and add the contact group admins. Do the same for the service 'TCP'.

In order to actually receive notifications you should set a correct email address for the standard contact: Click 'Contacts' in the main menu, then 'nagios-admin' and 'Edit'. Change the email address to something valid, then press 'Modify Contact'.

All these steps are necessary. Otherwise Nagios will not accept the configuration.

Now click 'Export' from the main menu to pass all the data to Nagios. When you go to the Nagios page you will see the new host 'c1' and its service 'TCP'. All fields are grey because Nagios has just been restarted by Fruity. After a while everything turns green, indicating that all hosts and services are up (provided you allow SSH connections on c1 from nagios1 and SSH is running).

[Screenshot]

This check simply makes sure that SSH is accessible on c1. Nagios executes the program check_tcp from the standard plugins and processes the result. This is a check that can be made from outside the monitored server. For more sophisticated checks NRPE comes into play.

8) MC Setting up a Nagios NRPE load check

The CPU load of c1 cannot be checked from outside like SSH in the previous example. It is possible to run SNMP on c1 and ask this daemon for the information.
Another way is to ask our NRPE daemon on c1, which is already up and running. The advantage of NRPE is that you are not limited to the capabilities of the daemon itself, as you are with SNMP for example. NRPE itself does no checks at all. It can use the standard plugins installed locally on c1, or it can use self-written scripts. In this example we can use the check_load program from the standard plugins, which is already configured in nrpe.cfg by default:

    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

Before NRPE checks can be executed from the Nagios server we must create the command check_nrpe in Fruity: Click 'Commands' in the main menu, then click 'Add A New Command'. 'Command Name' is check_nrpe and 'Command Line' is

    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

Now we can use the command check_nrpe to establish a check for the CPU load on c1. Just add another service in Fruity to the host c1, as you did in the previous chapter. Then add a check to this service and use 'check_nrpe' for the 'Check Command'. But Nagios needs to know which check we actually want NRPE to execute on c1. This is again done by clicking 'Check Command Parameters'. Now add a parameter with the value 'check_load'. Note that the value here must match exactly what is written in square brackets in nrpe.cfg on c1. In this example the thresholds are set in nrpe.cfg on c1, so we don't need to pass additional parameters.

For more in-depth information on how NRPE works and how to use it with parameters, see my previous article about Nagios and NRPE or the NRPE documentation. Links can be found at the bottom of this page. The advantage of passing thresholds via Nagios to NRPE instead of setting them in nrpe.cfg is that you have the configuration centralized in Nagios. Otherwise you must go to every host and make the changes there. But this is a tutorial and I want to make things as clear as possible, so we stick with the first configuration.
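Because the parameter given in Fruity has to match the name in square brackets exactly, a quick look at the names defined on c1 can save some guessing. The following sketch reads them from an NRPE configuration file. The function names nrpe_command_names and nrpe_has_command are hypothetical, and only lines that start with command[ are considered, so commented-out entries are ignored.

```shell
#!/bin/bash
# Hypothetical helper: print the names of all commands defined in an NRPE
# configuration file, i.e. the text between the square brackets of lines
# such as  command[check_load]=/usr/local/nagios/libexec/check_load ...
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Hypothetical helper: succeed only if a command with exactly this name
# is defined in the given file.
nrpe_has_command() {
    nrpe_command_names "$1" | grep -Fqx -- "$2"
}
```

On c1, `nrpe_has_command /etc/nrpe.cfg check_load && echo defined` would confirm the entry used in this example. The check itself can also be tried by hand from nagios1 with /usr/local/nagios/libexec/check_nrpe -H 192.168.1.56 -c check_load.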
Now it's time to export the data to Nagios and enjoy the result:

[Screenshot]

9) M Making sure that alerts occur as expected and notifications arrive

We come to the last step. To make sure we actually get informed by Nagios if something goes wrong, you should shut down the monitored services if possible, or at least simulate a failure. For services which depend on thresholds, like the load check, you can reduce the threshold until the check returns an alert. Make sure that you get notifications for all these services on all configured email accounts, for the alert as well as for the recovery when you have switched them back on.

That's it. Congratulations! At this point you have a very nice and flexible monitoring solution which is easy to configure. There is much more to learn, for example how to monitor the Nagios daemon itself, how to let Nagios restart services, dependencies, escalations, failover monitoring ... There are many more fascinating things to explore in the Nagios documentation.

10) MC Troubleshooting

In case of problems with certain checks of the standard plugins you can run make check from the source directory to get more information:

    cd /tmp/nagios-plugins-1.4.7
    make check

If NRPE is not working correctly, make sure that

NRPE is compiled on both sides with

    ./configure --enable-command-args --enable-ssl
    make all

Port 5666 on the monitored servers is not blocked by a firewall

/etc/nrpe.cfg is configured properly:

    server_address=local IP
    allowed_hosts=IP of the Nagios server
    dont_blame_nrpe=1

The daemon is running: ps aux | grep nrpe should display something like this

    /usr/sbin/nrpe -c /etc/nrpe.cfg -d

The daemon is actually running as the user you configured in nrpe.cfg

Links

Home Nagios
Home Nagios plugins
Documentation NRPE
Nagios 3 with failover on SLES 11 with NConf, NRPE, NSCA, PNP4Nagios and NagVis
Nagios and Cacti
Nagios Training