Friday, October 27, 2017

Monitoring My Pi Cluster With Nagios, Part III - Checking the Other Pis Using check_by_ssh


It's been too long since I wrote my other Nagios posts. The server I had running died, and I didn't recover the nagios files from it. Fortunately, I had an AWS machine to compare to, so I didn't lose everything.

Setting up the check_by_ssh checks is, like all of Nagios (it seems), a multi-config process. You'll need to modify $NAGIOS_HOME/etc/objects/commands.cfg and $NAGIOS_HOME/etc/nagios.cfg, and add a per-server file at $NAGIOS_HOME/etc/servers/<server_name>.

In my last Nagios post, I showed how to add servers for Nagios monitoring. Uncommenting the line #cfg_dir=/usr/local/nagios/etc/servers in $NAGIOS_HOME/etc/nagios.cfg causes Nagios to read every file ending in .cfg in that directory during start-up. This is where I put my Pi config files.
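
If you'd rather script that change than edit the file by hand, something like the following should work. This is only a sketch: the helper name enable_servers_dir is my own invention, and it assumes the commented line in your nagios.cfg starts with exactly the default path shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): uncomment the cfg_dir line for
# the servers directory in the given nagios.cfg.
# Assumes the line starts with "#cfg_dir=/usr/local/nagios/etc/servers".
enable_servers_dir() {
    cfg="$1"
    tmp="$(mktemp)" || return 1
    # Strip the leading '#' from that one line; leave everything else as-is.
    if sed 's|^#cfg_dir=/usr/local/nagios/etc/servers|cfg_dir=/usr/local/nagios/etc/servers|' "$cfg" > "$tmp"; then
        # Write back with cat so the original file keeps its owner and mode.
        cat "$tmp" > "$cfg"
    fi
    rm -f "$tmp"
}

# Example (run as a user who can write to the file):
#   enable_servers_dir /usr/local/nagios/etc/nagios.cfg
```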

Next, I added remote commands to $NAGIOS_HOME/etc/objects/commands.cfg. Here is an example of how to write those:

# 'check_remote_disk' command definition
define command{
        command_name    check_remote_disk
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$"
        }

# 'check_remote_load' command definition
define command{
        command_name    check_remote_load
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "$USER1$/check_load -w $ARG1$ -c $ARG2$"
        }

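That may not format too well: each command_line must be one continuous line. Note that $USER1$ is expanded on the Nagios server before the command is sent over SSH, so the plugins need to be installed at the same path on each Pi.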
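
To see what Nagios will actually run, it can help to expand the macros by hand. The sketch below does that for check_remote_disk. It assumes $USER1$ points at /usr/local/nagios/libexec (the usual value in resource.cfg for a source install, but check yours), and expand_remote_disk is just a name I made up for illustration. The address is one of my Pis; substitute your own.

```shell
#!/bin/sh
# Assumption: $USER1$ resolves to this path (it is set in resource.cfg).
USER1="/usr/local/nagios/libexec"

# Hypothetical helper: print the command line Nagios would build for
#   check_remote_disk!<warn>!<crit>!<partition>
# on the given host ($HOSTADDRESS$, $ARG1$, $ARG2$, $ARG3$ in that order).
expand_remote_disk() {
    host="$1"; warn="$2"; crit="$3"; part="$4"
    printf '%s/check_by_ssh -H %s -C "%s/check_disk -w %s -c %s -p %s"\n' \
        "$USER1" "$host" "$USER1" "$warn" "$crit" "$part"
}

# Expands a check like check_remote_disk!20%!10%!/ for one host.
expand_remote_disk 192.168.22.101 20% 10% /
```

Running the printed command as the nagios user is a quick way to confirm that key-based SSH login to the Pi works without a password prompt, which check_by_ssh needs since nobody is there to type one.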

Next, add a file for the server you want to check to the $NAGIOS_HOME/etc/servers/ directory. Here is a trimmed-down example that uses the check_by_ssh commands I added to $NAGIOS_HOME/etc/objects/commands.cfg (above):

# including the define host section

define host{
    use             linux-server ; Name of host template to use
                    ; This host definition will inherit all variables that are defined
                    ; in (or inherited by) the linux-server host template definition.
    host_name       Raspberry_Pi_1
    alias           Raspberry_Pi_1
    address         192.168.22.101
    }

# including the host group as well
define hostgroup{
    hostgroup_name  pi-servers ; The name of the hostgroup
    alias              Raspberry pi servers infrastructure Servers
                       ; Long name of the group
    members            Raspberry_Pi_1
                       ; Comma separated list of hosts that belong to this group
    }

# and the service definitions:
define service{
    use                 local-service ; Name of service template to use
    host_name           Raspberry_Pi_1
    service_description Root Partition
    check_command       check_remote_disk!20%!10%!/
    }

define service{
    use                 local-service  ; Name of service template to use
    host_name           Raspberry_Pi_1
    service_description Current Load
    check_command       check_remote_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

With all of that added, restart Nagios and it will show the new Raspberry Pi that you just added. Hopefully this is enough to get you going. Feel free to copy and modify the other commands in the $NAGIOS_HOME/etc/objects/commands.cfg file for broader coverage.
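
Before restarting, it's worth letting Nagios check the configuration, since a typo in any of these files will keep it from starting. Here is a rough sketch: the binary and config paths assume a default source install under /usr/local/nagios, the restart command assumes systemd, and check_then_restart is a hypothetical helper name. The -v flag is the Nagios pre-flight config check.

```shell
#!/bin/sh
# Assumed locations for a source install; override via the environment.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
# Assumed restart command; use whatever your system needs.
RESTART_CMD="${RESTART_CMD:-sudo systemctl restart nagios}"

# Hypothetical helper: restart only if "nagios -v" accepts the config.
check_then_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        $RESTART_CMD
    else
        echo "Config check failed; not restarting." >&2
        return 1
    fi
}

# Example:
#   check_then_restart
```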

As always, please let me know if you have any questions or if any of this needs clarification.
     

