Category5.TV Community Forum

Full Version: check_nrpe
Hello,

I'm new to NEMS and Nagios in general. I have three Raspberry Pi 3Bs and two Raspberry Pi 2Bs that do various things, and I'm trying to set up monitoring on them. Most of the monitoring features are working fine, but I'm having issues with things like checking disk space.

I read the documentation on setting up NRPE on my other Pis and set them up per the instructions. I added the Advanced Service "/ Disk Space" to one of my hosts in Nconf and deployed. When it runs, it fails with the following error: "NRPE: Command 'check_disk -w 80 -c 90 -p /' not defined".

I started looking at the NRPE documentation on the Nagios site and realized that I needed to go to the host machine and modify the /etc/nagios/nrpe.cfg file.

I saw the lines that talk about arguments and set dont_blame_nrpe to 1.


Code:
dont_blame_nrpe=1



I also noticed that the command for check_disk was commented out.


Code:
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$


I uncommented the line above and restarted the nagios-nrpe-server. Same error "NRPE: Command 'check_disk -w 80 -c 90 -p /' not defined". 
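
For anyone retracing these steps, one way to confirm that both settings are really active in the file is to grep for them. The following is only a sketch: the function name check_nrpe_cfg and the sample file are made up for illustration, and it inspects the config file only. It cannot tell whether the installed daemon was built with argument support.

```shell
#!/bin/sh
# Sketch: report whether an nrpe.cfg enables arguments and defines a command.
# check_nrpe_cfg is a hypothetical helper, not part of NRPE or NEMS.
check_nrpe_cfg() {
    cfg="$1"
    name="$2"
    # Commented lines start with '#', so anchor each match at the line start.
    if grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1' "$cfg"; then
        echo "dont_blame_nrpe: enabled"
    else
        echo "dont_blame_nrpe: not enabled"
    fi
    if grep -Eq "^[[:space:]]*command\[$name\][[:space:]]*=" "$cfg"; then
        echo "command[$name]: defined"
    else
        echo "command[$name]: missing or commented out"
    fi
}

# Demo against a throwaway sample file (on a real host: /etc/nagios/nrpe.cfg).
sample=$(mktemp)
cat > "$sample" <<'EOF'
dont_blame_nrpe=1
#command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
EOF
check_nrpe_cfg "$sample" check_disk
check_nrpe_cfg "$sample" check_hda1
rm -f "$sample"
```

In the sample file check_disk is still commented out, so it is reported as missing, while check_hda1 is reported as defined.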

I modified the line in the host's nrpe.cfg file that defines check_hda1 and changed the path from /dev/hda1, which doesn't exist on the Pi, to the following.

Code:
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /


Then I modified the args of the Advanced Service "/ Disk Space" in Nconf to call check_hda1, and it returns as expected.

I have also tried the following from the terminal on my NEMS server.

Test 1:
Call check_disk on the remote host (before I uncommented the check_disk command from above).

./check_nrpe -H 192.168.1.120 -c check_disk

NRPE: Command 'check_disk' not defined

Test 2:
Call check_disk on the remote host (after I uncommented the check_disk command from above). Starting to get somewhere.

./check_nrpe -H 192.168.1.120 -c check_disk
Unknown argument
Usage:
 check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
[-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
[-t timeout] [-u unit] [-v] [-X type] [-N type]


Test 3:
Call check_disk on the remote host with args.

./check_nrpe -H 192.168.1.120 -c check_disk -a 20 10 /

NRPE: Command 'check_disk!20!10!/' not defined
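
The '!' characters in that last error hint at what is sent to the host: judging by the message, check_nrpe appears to join the command name and each -a value with '!' into one string, and a daemon that does not split out arguments then looks up the whole string as a command name. A small sketch of that joining follows. nrpe_query is a hypothetical name, and this mimics only what the error message shows, not the real client code.

```shell
#!/bin/sh
# Hypothetical illustration: join a command name and its arguments with '!',
# the way the string in the error message above appears to be built.
nrpe_query() {
    query="$1"
    shift
    for arg in "$@"; do
        query="$query"'!'"$arg"
    done
    printf '%s\n' "$query"
}

nrpe_query check_disk 20 10 /
# prints: check_disk!20!10!/
```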



So I guess I have two questions. 

First one, any idea what I may be doing wrong with the remote commands?

Second, I noticed that in Advanced Services there is no way to create a check command of type check_nrpe from the dropdown. I had to modify one of the existing ones to get the check_hda1 command to fire. Is there a way to make check_nrpe available in the dropdown, or a way to clone one of the existing ones and then change the clone?

Any help would be greatly appreciated.

Thanks,
Kris
Wow. This really boils down to the need for NEMS to have more documentation on setting up check commands. The December issue of ODROID Magazine will touch on this, and better documentation is definitely on the roadmap for 2019.

Essentially, the service checks you've edited are the samples that are included for the local disk, i.e., the disk of the NEMS Server itself. By changing them, you've now changed the included samples. You'd probably want to have created new ones for your purpose instead.

You should not manually edit the config files - you will lose all those settings. NEMS is not Nagios. You cannot manually edit config files in NEMS.

By editing the NRPE config on your NEMS server, you are doing just that. You are modifying NRPE on your NEMS server. You should instead be configuring it on the systems you're hoping to monitor (i.e., the hosts). Does that make sense? The NRPE configuration tells your system how to *respond* to NRPE requests. So if you are editing it on the NEMS server, that's the wrong place: you're then changing how NEMS responds. NEMS does not have an HDA1, but it'd respond as if it does if you create a command called check_hda1. But you're checking the disk of the NEMS server, not your host. So the result is completely inaccurate to what you're really hoping to achieve (which is to check the host's hard drive space, not your NEMS Server's).

NRPE is confuzzling to be sure. It's one of the reasons I've also included WMI. But it takes understanding how everything talks to each other. E.g., the check_disk service you're using is checking the NEMS server's disk, not your host's.

I think it boils down to being confused by the samples since you're new to Nagios/NRPE.

I will be improving the documentation heavily... in the meantime perhaps you'd be best to express what you're actually hoping to do (rather than what you have done) and see if I can be of more help.

As it is, I imagine your setup is a bit 'broken'  :)

Is all fun though, right?  :)

Robbie
Clearing up a few things. 

I have not changed any config files on my Server. All the changes have been made on the host machine. The change I did on my Server was only to the ARGS param in the Advanced Service for the one called / Disk Space in Nconf.


[Image: screenshot-nems-local-2018-11-27-14-36-53-1.png]

[Image: screenshot-nems-local-2018-11-27-14-36-53.png]

The mail host is using the modified service above. My MQTT host is using the default one and is failing.

[Image: Capture.png]


Mail host /etc/nagios/nrpe.cfg changes.


[Image: mailhost.png]

Kris
Further findings:

I've been doing some research and experimentation and have more information. I'm sure that the version of nagios-nrpe-server that's in the Raspbian stretch repo is not compiled with the --enable-command-args flag.

3.0.1 from repo deb http://raspbian.raspberrypi.org/raspbian/ stretch main contrib non-free rpi using sudo apt-get install nagios-nrpe-server
Nov 27 06:01:59 www systemd[1]: Started Nagios Remote Plugin Executor.
Nov 27 06:01:59 www nrpe[14780]: Starting up daemon
Nov 27 06:01:59 www nrpe[14780]: Server listening on 0.0.0.0 port 5666.
Nov 27 06:01:59 www nrpe[14780]: Server listening on :: port 5666.
Nov 27 06:01:59 www nrpe[14780]: Listening for connections on port 5666


3.0.1 from the GitHub archive source, compiled with the flag --enable-command-args. I think this would be the way to do it with the current version of NEMS if I were running on jessie. May try this.
It fails at "make all" and throws a bunch of SSL errors. I think it's due to the OpenSSL version changes in stretch.

3.2.1 from the stretch-backports repo deb http://ftp.debian.org/debian stretch-backports main, using sudo apt-get install nagios-nrpe-server -t stretch-backports -y --force-yes
Nov 27 06:01:59 www systemd[1]: Started Nagios Remote Plugin Executor.
Nov 27 06:01:59 www nrpe[14780]: Starting up daemon
Nov 27 06:01:59 www nrpe[14780]: Server listening on 0.0.0.0 port 5666.
Nov 27 06:01:59 www nrpe[14780]: Server listening on :: port 5666.
Nov 27 06:01:59 www nrpe[14780]: Listening for connections on port 5666


3.2.1 using the instructions located here https://support.nagios.com/kb/article.ph...2#Raspbian from GitHub source, compiled with the flag --enable-command-args
Nov 28 04:39:57 mail systemd[1]: Started Nagios Remote Plugin Executor.
Nov 28 04:39:57 mail nrpe[8020]: Starting up daemon
Nov 28 04:39:57 mail nrpe[8020]: Server listening on 0.0.0.0 port 5666.
Nov 28 04:39:57 mail nrpe[8020]: Server listening on :: port 5666.
Nov 28 04:39:57 mail nrpe[8020]: Warning: Daemon is configured to accept command arguments from clients!
Nov 28 04:39:57 mail nrpe[8020]: Listening for connections on port 5666

Still not working :(

When I try to connect to the host machine running 3.2.1, it throws the following:
CHECK_NRPE: Receive header underflow - only -1 bytes received (4 expected).

I'm sure that is because of the version difference between NEMS using 3.0.1 and the 3.2.1 installed on my host.

Are there any plans to upgrade to 3.2.1 in NEMS 1.5?

Thanks, 
Kris
Yes, NEMS 1.5 already has check_nrpe 2.2.1 as per the changelogs.

However, I held back check_snmp to version 2.1.1 due to the bugs in output, which they promise to fix in 2.2.2 but have been taking forever to release.
Sorry for replying to an old thread but I have the same issue. 
I configured check_nrpe as an advanced service, and when NEMS executes the command it looks like this:
 
/usr/local/nagios/libexec/check_nrpe -H 192.168.10.15 -c "check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'"
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).
 
When I tried the command from the CLI, I found that it has to be executed without the double quotes around the arguments, like this:
/usr/local/nagios/libexec/check_nrpe -H 192.168.10.15 -c check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'
OK - load average: 0.00, 0.00, 0.00|load1=0.000;5.000;10.000;0; load5=0.000;4.000;6.000;0; load15=0.000;3.000;4.000;0;
 
Is this a bug or have I configured it incorrectly?
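
The difference between the two invocations comes down to how the shell splits words: with the outer double quotes, everything after -c reaches check_nrpe as a single argument, so presumably the whole string is taken as the command name. The sketch below uses a made-up show_args function as a stand-in for check_nrpe and simply prints what it receives.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for check_nrpe: it prints each
# argument it receives on its own line, prefixed with its position.
show_args() {
    i=1
    for arg in "$@"; do
        printf '%d: %s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

echo "Quoted as NEMS ran it:"
show_args -H 192.168.10.15 -c "check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'"

echo "Quoted as on the CLI:"
show_args -H 192.168.10.15 -c check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'
```

The first call receives four arguments, the fourth being the entire "check_load -a '...'" string. The second receives six, with check_load, -a, and the thresholds arriving separately.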
 
NEMS Platform....: RPi 3 Model B+
NEMS Version.....: 1.5 (Current Version is 1.5)
I solved it by defining the arguments in /usr/local/nagios/etc/nrpe.cfg on each server. That way I only need to pass check_load as the argument from NEMS.
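
For readers wanting to do the same, the line on each monitored server would look something like the output of the sketch below. The thresholds are the ones from the command earlier in this post. The plugin directory is an assumption (it mirrors the check_nrpe path above) and may differ on your system, and nrpe_command_line is a made-up helper name.

```shell
#!/bin/sh
# Print an nrpe.cfg command definition with the thresholds hard-coded,
# so the NEMS side only has to send the bare command name.
plugin_dir=/usr/local/nagios/libexec   # assumption: adjust to your install

nrpe_command_line() {
    # $1 = command name, remaining arguments = fixed plugin options
    name="$1"
    shift
    printf 'command[%s]=%s/%s %s\n' "$name" "$plugin_dir" "$name" "$*"
}

nrpe_command_line check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0
# prints: command[check_load]=/usr/local/nagios/libexec/check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0
```

With a definition like that in place, the check from NEMS needs only -c check_load and no -a.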