| Author: | Trond Hasle Amundsen |
|---|---|
| Contact: | t.h.amundsen@usit.uio.no |
| Date: | 2010-06-30 |
check_hp_bladechassis is a plugin for the Nagios monitoring software which checks the hardware health of HP blade enclosures via SNMP. The plugin is only tested with the c7000 enclosure.

This plugin is designed to be a companion plugin to check_dell_bladechassis in terms of supported options and functionality. The information that can be gathered via SNMP from HP enclosures differs from what Dell enclosures provide, so the output of the two plugins will differ.
check_hp_bladechassis is written in Perl, and needs a perl interpreter. Nagios' embedded perl interpreter (ePN) can be used, but be aware that the plugin is not well tested against ePN. The plugin assumes that perl is available as /usr/bin/perl, but you can easily change this as you wish by editing the first line in the script.
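If perl lives somewhere else on your system, the first line can be changed by hand or with a small helper like the one below. This is only a sketch: the function name `set_perl_path` is made up for this example, and `/usr/local/bin/perl` in the usage note is an assumed path, not something the plugin requires.

```shell
# set_perl_path FILE INTERPRETER
# Rewrites the interpreter path on the first line of FILE to INTERPRETER.
# Any flags after the path (e.g. "-w") are kept. Lines 2..n are untouched.
# A temporary file is used instead of "sed -i" so that it behaves the same
# with GNU and BSD sed.
set_perl_path() {
    file=$1
    interp=$2
    tmp=$(mktemp) || return 1
    # Only line 1 is edited, and only if it starts with "#!".
    sed '1s|^#![^ ]*|#!'"$interp"'|' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write the contents back in place so the file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Example: `set_perl_path /path/to/check_hp_bladechassis /usr/local/bin/perl`.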
Since this plugin uses SNMP, you'll also need the perl module Net::SNMP on the Nagios server (or the server running the queries). This module is not part of perl itself, but is available in all modern Linux distributions. Installing Net::SNMP is quite easy:
For RHEL/CentOS 5.x the best way is to use EPEL:
rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
yum install perl-Net-SNMP
For Fedora:
yum install perl-Net-SNMP
For SuSE:
rug install perl-Net-SNMP
For Debian and Ubuntu:
aptitude install libnet-snmp-perl
If this does not apply to your server, consult your OS repository to find Net::SNMP. If all else fails, try installing from CPAN.
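Whichever way you install it, you can check that perl is able to load the module before you configure Nagios. The helper below is a small sketch; `have_perl_module` is a name chosen for this example. It relies only on perl's standard `-M` switch, which loads a module before running the given code.

```shell
# have_perl_module MODULE
# Succeeds (exit status 0) if perl can load MODULE, fails otherwise.
# It also fails if perl itself is not found in PATH.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}
```

Example: `have_perl_module Net::SNMP && echo "Net::SNMP is installed"`. Run it as the user that runs the Nagios checks, since that is the environment that matters.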
Attention!
This is a short HOWTO that describes how to get started with using check_hp_bladechassis. This HOWTO assumes that the prerequisites are met, and that you have a Nagios server up and running. Nagios version 3.x is assumed.
The examples below are simple examples with very basic usage of check_hp_bladechassis. There are many more or less advanced options that you might consider useful. See the usage section for info.
The first thing you want to do is create a hostgroup that contains your blade enclosures. If you have very few enclosures you can skip this step and use hosts in the service definition instead, but I think hostgroups are always better:
# hostgroup for HP blade enclosures
define hostgroup {
hostgroup_name hp-bladecenters
alias HP bladecenters
}
You'll need a host definition for each of the enclosures. If you are an experienced Nagios admin you already know this, of course:
define host {
host_name my-bladecenter1.foo.org
alias my-bladecenter1
address 192.168.10.12
use generic-host
hostgroups hp-bladecenters
contact_groups example@foo.org
}
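If you have more than a handful of enclosures, the host definitions can be generated instead of typed. The function below is a minimal sketch that prints one definition in the same shape as the example above. The name `gen_host` is made up for this example, and the `contact_groups` line is left out on purpose; add your own, or inherit it from the template.

```shell
# gen_host FQDN ADDRESS
# Prints a Nagios host definition for one enclosure on standard output.
# The alias is the host name up to the first dot.
gen_host() {
    fqdn=$1
    addr=$2
    cat <<EOF
define host {
    host_name   $fqdn
    alias       ${fqdn%%.*}
    address     $addr
    use         generic-host
    hostgroups  hp-bladecenters
}
EOF
}
```

Example: `gen_host my-bladecenter2.foo.org 192.168.10.13 >> /path/to/hp-enclosures.cfg`, where the host name, address and file name are placeholders.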
Next you want to create a servicegroup for this service. This is not required, but it makes things easier when you want to inspect your HP enclosures via Nagios' web interface. Creating a servicegroup is simple:
# Servicegroup for HP blade enclosures
define servicegroup {
servicegroup_name hp-bladechassis
alias HP server health status
}
The servicegroup is used later in the service definition.
The next step is to define a command for check_hp_bladechassis:
# HP blade enclosure check
define command {
command_name check_hp_bladechassis
command_line /path/to/check_hp_bladechassis -H $HOSTADDRESS$
}
Note that this is a very basic example of check_hp_bladechassis usage. Refer to the usage section for info about the different options that alter the behaviour of check_hp_bladechassis.
Finally, you define the service:
define service {
use generic-service
hostgroup_name hp-bladecenters
service_description HP blade enclosure health
servicegroups hp-bladechassis
check_command check_hp_bladechassis
action_url https://$HOSTNAME$/
notes_url http://folk.uio.no/trondham/software/check_hp_bladechassis.html
}
The action_url and notes_url directives are optional.
The plugin queries the monitored host remotely via SNMP. Prerequisites for this are that the monitored host is running SNMP, and that the Nagios server is allowed to communicate with the enclosure over SNMP. The -H|--hostname option is needed for the hostname/IP you want to check.
$ check_hp_bladechassis -H my-bladecenter1
OK - System: 'BladeSystem c7000 Enclosure', SN: 'XXXXXXXXXX', Firmware: '2.41', hardware working fine, 14 blades, 6 i/o modules
You can specify the SNMP community string (for SNMP version 1 and 2c) with the -C|--community option. Default community is set to "public" if the option is not present:
$ check_hp_bladechassis -H my-bladecenter2 -C mycommunity
OK - System: 'BladeSystem c7000 Enclosure G2', SN: 'XXXXXXXXXX', Firmware: '2.52', hardware working fine, 2 blades, 6 i/o modules
For other SNMP options, refer to the manual page.
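If the plugin cannot reach an enclosure, it can help to test plain SNMP connectivity first. The sketch below assumes that the Net-SNMP command line tools (`snmpget`) are installed, which is separate from the Net::SNMP perl module, and that the enclosure answers the standard MIB-II sysDescr.0 object. The function name `snmp_probe` and the `DRY_RUN` variable are made up for this example.

```shell
# snmp_probe HOST [COMMUNITY]
# Queries sysDescr.0 (OID 1.3.6.1.2.1.1.1.0) over SNMP version 2c.
# COMMUNITY defaults to "public", like the plugin's default.
# With DRY_RUN=1 the command is printed instead of executed.
snmp_probe() {
    host=$1
    community=${2:-public}
    # Build the command as the function's own argument list.
    set -- snmpget -v2c -c "$community" "$host" 1.3.6.1.2.1.1.1.0
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Example: `snmp_probe my-bladecenter2 mycommunity`. If this times out, check the SNMP settings on the enclosure and any firewall between it and the Nagios server before looking at the plugin.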
The default behaviour of the plugin is to print all alerts on separate lines with no extra fuss:
$ check_hp_bladechassis -H my-bladecenter1
Fan 2 condition is Failed
There are several options that allow you to alter this, as listed below.
The -s|--state option will prefix each alert with the full service state:
$ check_hp_bladechassis -H my-bladecenter1 -s
CRITICAL: Fan 2 condition is Failed
The --short-state option does the same, except that the service state is abbreviated to a single letter, i.e. C for CRITICAL, W for WARNING etc.:
$ check_hp_bladechassis -H my-bladecenter1 --short-state
C: Fan 2 condition is Failed
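The three prefix styles shown so far can be summarised in a few lines of shell. This only illustrates the output formats; it is not the plugin's own code, which is written in Perl. The function name `format_alert` and its mode names are made up for this sketch.

```shell
# format_alert MODE STATE MESSAGE
# MODE "full"  prefixes the message with the whole state (like -s)
# MODE "short" prefixes it with the first letter only (like --short-state)
# any other MODE prints the message as-is (the default behaviour)
format_alert() {
    mode=$1 state=$2 msg=$3
    case $mode in
        full)  printf '%s: %s\n' "$state" "$msg" ;;
        short) printf '%s: %s\n' "$(printf '%s' "$state" | cut -c1)" "$msg" ;;
        *)     printf '%s\n' "$msg" ;;
    esac
}
```

Example: `format_alert short CRITICAL "Fan 2 condition is Failed"`.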
The option -i|--info will prefix all alerts with the serial number:
$ check_hp_bladechassis -H my-bladecenter1 -i
[XXXXXXXXXX] Fan 2 condition is Failed
The option -v|--verbose will append the part number, spare part number and serial number of the failed component:
$ check_hp_bladechassis -H my-bladecenter1 -v
Fan 2 condition is Failed [part: n/a, spare: n/a, sn: n/a]
In the above example the fan is missing, so the information is not available.
The option -e|--extinfo will print the server model, serial number and firmware revision on a separate line at the end of the alert:
$ check_hp_bladechassis -H my-bladecenter1 -e
Fan 2 condition is Failed
------ SYSTEM: BladeSystem c7000 Enclosure G2, SN: XXXXXXXXXX, FW: 2.52
You can combine any of these options. Example:
$ check_hp_bladechassis -H my-bladecenter1 -s -e
CRITICAL: Fan 2 condition is Failed
------ SYSTEM: BladeSystem c7000 Enclosure G2, SN: XXXXXXXXXX, FW: 2.52
Which (combination) of these options you choose to use, if any, depends on how you use Nagios and your personal preference.
If the -d or --debug option is given, check_hp_bladechassis will output messages about all the checked components, along with their respective part numbers and alert states. If supported by the enclosure, the plugin will also output total power usage. An example debug output from a c7000 is given below.
Warning
The option -d|--debug is intended for diagnostics and debugging purposes only. Do not use this option from within Nagios, i.e. in your Nagios config.
The output from this plugin contains multiple lines separated by HTML linebreaks (<br/>) if run as a command within Nagios, via NRPE etc. If run from a console which has a TTY, i.e. if you log in via SSH or similar and run the plugin manually, the linebreaks will be regular linebreaks.
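The TTY-dependent behaviour can be illustrated with the shell's `-t` test, which is true when a file descriptor is attached to a terminal. This is a sketch of the idea only, not the plugin's actual implementation; `join_alerts` is a name made up for this example.

```shell
# join_alerts ALERT...
# Prints the alerts separated by real newlines when standard output is a
# terminal, and by HTML linebreaks (<br/>) when it is a pipe or a file,
# which is the situation when Nagios or NRPE runs a plugin.
join_alerts() {
    if [ -t 1 ]; then
        sep='
'
    else
        sep='<br/>'
    fi
    first=1
    for alert in "$@"; do
        if [ "$first" = 1 ]; then
            first=0
        else
            printf '%s' "$sep"
        fi
        printf '%s' "$alert"
    done
    printf '\n'
}
```

Running `join_alerts "first alert" "second alert"` in a terminal prints two lines, while `join_alerts "first alert" "second alert" | cat` prints one line with `<br/>` in the middle. The same trick (piping the plugin through `cat`) lets you see from a console what Nagios will receive.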
Nagios 3.x allows the following option in cgi.cfg:
# ESCAPE HTML TAGS
# This option determines whether HTML tags in host and service
# status output is escaped in the web interface. If enabled,
# your plugin output will not be able to contain clickable links.
escape_html_tags=1
The default, as seen above in the sample cgi.cfg from the distribution, is that HTML tags are escaped. My advice is to turn this off. If not, you will see output like this in your Nagios console:
CRITICAL: example error message 1<br/>WARNING: example error message 2
instead of this:
CRITICAL: example error message 1
WARNING: example error message 2
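Turning the option off means setting `escape_html_tags=0` in cgi.cfg. You can do that in an editor, or with a helper like the sketch below. The function name `disable_html_escaping` is made up for this example, and the location of cgi.cfg depends on how Nagios was installed, so the path is left for you to fill in.

```shell
# disable_html_escaping CGI_CFG
# Sets escape_html_tags=0 in the given cgi.cfg. Only a line that starts
# with "escape_html_tags=" is changed; comments and other options are
# left alone. If no such line exists, the file is left unchanged.
disable_html_escaping() {
    cfg=$1
    tmp=$(mktemp) || return 1
    sed 's/^escape_html_tags=.*/escape_html_tags=0/' "$cfg" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write the contents back in place so ownership and permissions are kept.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

Example: `disable_html_escaping /path/to/cgi.cfg`. Keep a copy of the original file before you change it.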
With Nagios 3.x, plugins are allowed to output multiple lines with regular linebreaks, but only the first line is shown in the web interface (status.cgi).
Usage information gathered with check_hp_bladechassis -h:
Usage: check_hp_bladechassis -H <HOSTNAME> [OPTION]...
OPTIONS:
   -H, --hostname      Hostname or IP of the enclosure
   -C, --community     SNMP community string
   -P, --protocol      SNMP protocol version
   --port              SNMP port number
   -p, --perfdata      Ouput performance data
   -t, --timeout       Plugin timeout in seconds
   -i, --info          Prefix alerts with the enclosure's serial number
   -v, --verbose       Append extra info to alerts (part no. etc.)
   -e, --extinfo       Append system info to alerts
   -s, --state         Prefix alerts with alert state
   --short-state       Prefix alerts with alert state (abbreviated)
   -d, --debug         Debug output, reports everything
   -h, --help          Display this help text
   -V, --version       Display version info
For more information and advanced options, see the manual page or URL:
  http://folk.uio.no/trondham/software/check_hp_bladechassis.html
check_hp_bladechassis will output performance data (power consumption in Watts) if the --perfdata or -p option is used. An example graph using PNP4Nagios is given below.

The template used to generate these graphs is available as check_hp_bladechassis.php in the downloadable ZIP archive and tarball.
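To look at the performance data by hand, remember that Nagios plugins put it after a `|` character on the output line. The helper below is a small sketch that extracts that part. The name `perfdata_of` is made up for this example, and the label `power` in the usage note is a placeholder, not necessarily the label the plugin uses.

```shell
# perfdata_of LINE
# Prints the performance data part of a plugin output line, i.e.
# everything after the first "|", with leading spaces removed.
# Fails (exit status 1) if the line contains no "|".
perfdata_of() {
    case $1 in
        *'|'*) printf '%s\n' "${1#*|}" | sed 's/^ *//' ;;
        *)     return 1 ;;
    esac
}
```

Example: `perfdata_of "OK - hardware working fine | power=1234W"` prints the part after the pipe. With the real plugin you would pass it the output of `check_hp_bladechassis -H my-bladecenter1 -p`.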
You can also download the plugin and the manpage separately.
| Version | Date | Changes |
|---|---|---|
| 1.0.1 | 2010-01-22 | |
| 1.0.0 | 2009-08-04 | |
Please let me know if you are experiencing bugs, have feature requests, or suggestions on how to improve check_hp_bladechassis. We use this plugin in production at the University of Oslo, but we don't use all the different features of the plugin. While the plugin is bug-free for us, it might not be for you, so let me know if you have problems.
You can also send bug reports or feature requests to the Nagios users mailing list. I read postings to this list frequently:
nagios-users@lists.sourceforge.net
You can email me directly, but then other users won't benefit from the discussion. Unless security issues or other concerns prevent you from using the mailing list, it is better to discuss problems in a public forum.
This is free software. Use at your own risk.