#162 Linux 31.08.2008
NaNs in Cacti/Spine

The first step is to go through the NaN debugging guide. If that does not resolve the NaNs and you are monitoring a cluster, have a look at the following situation. It happened to me when I had to monitor an Isilon storage cluster.

Spine checks the uptime of the SNMP daemon on the monitored machine. As long as the service is up, the value increases constantly, of course. As soon as snmpd is restarted, the uptime is lower than the last value because it starts from zero again. In that case Spine drops all the values it has just received for all graphs of this host and puts a NaN into them! OK, this should not happen too often on a single machine.

I had to monitor the disks on a storage cluster, which are mounted on all nodes. I used the DNS name of the cluster, which was resolved by round-robin DNS to the different cluster nodes. Because the disks were mounted on all nodes, I thought being handed off to different physical machines would not be a problem. Well, this was absolutely wrong. With every jump to the next machine Spine received a different snmpd uptime, because it was another physical host and thus another snmpd! Spine dropped the values each time and put a NaN into the graphs. The result was NaNs all over the graphs.

The solution was obvious. Instead of the DNS name I used the IP addresses of the individual nodes, and all NaNs were gone.

To check the value of the uptime, run the following command several times:

    snmpwalk -v2c -c public IP/host sysUpTimeInstance
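
Comparing the uptime values by eye gets tedious, so here is a minimal sketch of how the check could be scripted. It is not part of Cacti or Spine. The function name find_uptime_drops, the HOST placeholder, the sample count and the sleep interval are all made up for illustration. The function reads one uptime value per line (raw timeticks) and reports every sample that is lower than the one before it, which is the situation that makes Spine write NaNs.

```shell
# find_uptime_drops (hypothetical helper): reads one uptime value per line
# on stdin and prints a line for every sample that is lower than the
# previous one.
find_uptime_drops() {
  awk 'NR > 1 && ($1 + 0) < prev { printf "drop at sample %d: %d -> %d\n", NR, prev, $1 }
       { prev = $1 + 0 }'
}

# Possible usage, assuming net-snmp's snmpget is installed and HOST is the
# name or address you want to test (-Oqvt prints only the raw timeticks):
#
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#     snmpget -v2c -c public -Oqvt HOST sysUpTimeInstance
#     sleep 5
#   done | find_uptime_drops
```

If this prints nothing for a node's IP address but reports drops for the round-robin DNS name, the polls are very likely landing on different snmpd instances.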