Category5.TV Community Forum

Full Version: check_nrpe
Hello,

I'm new to NEMS and Nagios in general. I have three Raspberry Pi 3Bs and two Raspberry Pi 2Bs that do various things, and I'm trying to set up monitoring on them. Most of the monitoring features are working fine, but I'm having issues with things like checking disk space.

I read the documentation on setting up NRPE on my other Pis and set them up per the instructions. I added the Advanced Service "/ Disk Space" to one of my hosts in NConf and deployed. When it runs, it fails with the following error: "NRPE: Command 'check_disk -w 80 -c 90 -p /' not defined".

I started looking at the documentation for NRPE on the Nagios site and realized that I needed to go to the host machine and modify the /etc/nagios/nrpe.cfg file.

I saw the line that talks about command arguments and set the "blame" option to 1:


Code:
dont_blame_nrpe=1



I also noticed that the command for check_disk was commented out.


Code:
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$


I uncommented the line above and restarted the nagios-nrpe-server. Same error "NRPE: Command 'check_disk -w 80 -c 90 -p /' not defined". 

I modified the line in the host's nrpe.cfg file that defines check_hda1 and changed the path from /dev/hda1, which doesn't exist on the Pi, to the following:

Code:
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /


Then I modified the args of the Advanced Service "/ Disk Space" in NConf to call check_hda1, and it returns as expected.

I have also tried the following from the terminal on my NEMS Server.

Test 1:
Call check_disk on the remote host (before I uncommented the check_disk command from above).

./check_nrpe -H 192.168.1.120 -c check_disk

NRPE: Command 'check_disk' not defined

Test 2:
Call check_disk on the remote host (after I uncommented the check_disk command from above). Starting to get somewhere.

./check_nrpe -H 192.168.1.120 -c check_disk
Unknown argument
Usage:
 check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
[-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
[-t timeout] [-u unit] [-v] [-X type] [-N type]


Test 3:
Call check_disk on the remote host with args.

./check_nrpe -H 192.168.1.120 -c check_disk -a 20 10 /

NRPE: Command 'check_disk!20!10!/' not defined
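
(Note on Test 3: the "check_disk!20!10!/" in the error is the command name and the three -a arguments joined with "!", which is the form check_nrpe sends to the remote daemon. The sketch below only imitates that joining in shell; nrpe_query is a hypothetical helper and not part of NRPE.)

```shell
# Hypothetical helper: imitate how the command name and the -a arguments
# end up in one "!"-separated string, as seen in the Test 3 error message.
nrpe_query() {
    sep='!'
    query=$1
    shift
    for arg in "$@"; do
        query="$query$sep$arg"
    done
    printf '%s\n' "$query"
}

nrpe_query check_disk 20 10 /   # prints: check_disk!20!10!/
```

On the remote side, those pieces are what would fill $ARG1$, $ARG2$ and $ARG3$ in the command[check_disk] line, provided the daemon accepts arguments at all.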



So I guess I have two questions. 

First one, any idea what I may be doing wrong with the remote commands?

Second, I noticed that in Advanced Services there is no way to create a service with the check command type check_nrpe from the dropdown. I had to modify one of the existing ones to get the check_hda1 command to fire. Is there a way to make check_nrpe available in the dropdown, or a way to clone one of the existing ones and then change the clone?

Any help would be greatly appreciated.

Thanks,
Kris
Wow. This really boils down to the need for NEMS to have more documentation on setting up check commands. The December issue of ODROID Magazine will touch on this, and better documentation is definitely on the roadmap for 2019.

Essentially, the service checks you've edited are the samples that are included for the local disk, i.e. the NEMS Server's own disk. By changing them, you've now changed the included samples. You'd probably want to have created new ones for your purpose instead.

You should not manually edit the config files - you will lose all those settings. NEMS is not Nagios. You cannot manually edit config files in NEMS.

By editing the NRPE config on your NEMS server, you are doing just that. You are modifying NRPE on your NEMS server. You should instead be configuring it on the systems you're hoping to monitor (i.e., the hosts). Does that make sense? The NRPE configuration tells your system how to *respond* to NRPE requests. So if you are editing it on the NEMS server, that's the wrong place: you're then changing how NEMS responds. NEMS does not have an HDA1, but it'd respond as if it does if you create a command called check_hda1. But you're checking the disk of the NEMS server, not your host. So the result is completely inaccurate for what you're really hoping to achieve (which is to check the host's hard drive space, not your NEMS Server's).

NRPE is confuzzling to be sure. It's one of the reasons I've also included WMI. But it takes understanding how everything talks to each other. E.g., the check_disk service you're using is checking the NEMS server's disk, not your host's.

I think it boils down to being confused by the samples since you're new to Nagios/NRPE.

I will be improving the documentation heavily... in the meantime perhaps you'd be best to express what you're actually hoping to do (rather than what you have done) and see if I can be of more help.

As it is, I imagine your setup is a bit 'broken'  :)

Is all fun though, right?  :)

Robbie
Clearing up a few things. 

I have not changed any config files on my Server. All the changes have been made on the host machine. The change I did on my Server was only to the ARGS param in the Advanced Service for the one called / Disk Space in Nconf.


[Image: screenshot-nems-local-2018-11-27-14-36-53-1.png]

[Image: screenshot-nems-local-2018-11-27-14-36-53.png]

The mail host is using the modified service above. My MQTT host is using the default one and is failing.

[Image: Capture.png]


Mail host /etc/nagios/nrpe.cfg changes.


[Image: mailhost.png]

Kris
Further findings:

I've been doing some research and experimentation and I have more information. I'm sure that the version of nagios-nrpe-server that's in the Raspbian stretch repo is not compiled with the --enable-command-args flag.

3.0.1 from the repo "deb http://raspbian.raspberrypi.org/raspbian/ stretch main contrib non-free rpi", installed using sudo apt-get install nagios-nrpe-server:
Nov 27 06:01:59 www systemd[1]: Started Nagios Remote Plugin Executor.
Nov 27 06:01:59 www nrpe[14780]: Starting up daemon
Nov 27 06:01:59 www nrpe[14780]: Server listening on 0.0.0.0 port 5666.
Nov 27 06:01:59 www nrpe[14780]: Server listening on :: port 5666.
Nov 27 06:01:59 www nrpe[14780]: Listening for connections on port 5666


3.0.1 from the GitHub archive source, compiled with the flag --enable-command-args. I think this would be the way to do it with the current version of NEMS if I were running on jessie. May try this.
On stretch, make all fails and throws a bunch of SSL errors. I think it's due to the OpenSSL version changes in stretch.

3.2.1 from the stretch-backports repo "deb http://ftp.debian.org/debian stretch-backports main", installed using sudo apt-get install nagios-nrpe-server -t stretch-backports -y --force-yes:
Nov 27 06:01:59 www systemd[1]: Started Nagios Remote Plugin Executor.
Nov 27 06:01:59 www nrpe[14780]: Starting up daemon
Nov 27 06:01:59 www nrpe[14780]: Server listening on 0.0.0.0 port 5666.
Nov 27 06:01:59 www nrpe[14780]: Server listening on :: port 5666.
Nov 27 06:01:59 www nrpe[14780]: Listening for connections on port 5666


3.2.1 built from the GitHub source using the instructions located here https://support.nagios.com/kb/article.ph...2#Raspbian, compiled with the flag --enable-command-args:
Nov 28 04:39:57 mail systemd[1]: Started Nagios Remote Plugin Executor.
Nov 28 04:39:57 mail nrpe[8020]: Starting up daemon
Nov 28 04:39:57 mail nrpe[8020]: Server listening on 0.0.0.0 port 5666.
Nov 28 04:39:57 mail nrpe[8020]: Server listening on :: port 5666.
Nov 28 04:39:57 mail nrpe[8020]: Warning: Daemon is configured to accept command arguments from clients!
Nov 28 04:39:57 mail nrpe[8020]: Listening for connections on port 5666
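
(Of the startup logs above, only the build compiled with --enable-command-args prints the "configured to accept command arguments" warning, so that line is a quick way to tell whether a given daemon will honour -a arguments. A minimal sketch of that check follows; nrpe_args_enabled is a hypothetical helper that just greps log text on stdin.)

```shell
# Hypothetical helper: read NRPE startup log lines on stdin and report
# whether the "accept command arguments" warning is present.
nrpe_args_enabled() {
    if grep -q 'configured to accept command arguments'; then
        echo "arguments enabled"
    else
        echo "no arguments warning found"
    fi
}

# Example using two lines copied from the logs above.
printf '%s\n' \
    'Nov 28 04:39:57 mail nrpe[8020]: Starting up daemon' \
    'Nov 28 04:39:57 mail nrpe[8020]: Warning: Daemon is configured to accept command arguments from clients!' \
    | nrpe_args_enabled
```

On a systemd host the input could come from something like journalctl -u nagios-nrpe-server, assuming that is the unit name on the machine in question.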

Still not working. :(

When I try to connect to the Host machine running 3.2.1 it throws the following:
CHECK_NRPE: Receive header underflow - only -1 bytes received (4 expected).

I'm sure that is because of the version difference between NEMS using 3.0.1 and the 3.2.1 installed on my host.

Are there any plans to upgrade to 3.2.1 in NEMS 1.5?

Thanks, 
Kris
Yes, NEMS 1.5 already has check_nrpe 2.2.1 as per the changelogs.

However, I held back check_snmp to version 2.1.1 due to the bugs in output, which they promise to fix in 2.2.2 but have been taking forever to release.
Sorry for replying to an old thread but I have the same issue. 
I configured check_nrpe as an Advanced Service, and when NEMS executes the command it looks like this:
 
/usr/local/nagios/libexec/check_nrpe -H 192.168.10.15 -c "check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'"
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).
 
When I tried the command from the CLI, I found that it has to be executed without the double quotes around the argument, like this:
/usr/local/nagios/libexec/check_nrpe -H 192.168.10.15 -c check_load -a '-w 5.0,4.0,3.0 -c 10.0,6.0,4.0'
OK - load average: 0.00, 0.00, 0.00|load1=0.000;5.000;10.000;0; load5=0.000;4.000;6.000;0; load15=0.000;3.000;4.000;0;
 
Is this a bug or have I configured it incorrectly?
 
NEMS Platform....: RPi 3 Model B+
NEMS Version.....: 1.5 (Current Version is 1.5)
I solved it by defining the arguments in /usr/local/nagios/etc/nrpe.cfg on each server. That way I only need to pass check_load as the argument from NEMS.
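
(A minimal sketch of what such a hard-coded definition could look like, using the thresholds from the command above. The plugin path is an assumption based on where check_nrpe lives in this install; adjust it to wherever check_load actually is.)

```
# /usr/local/nagios/etc/nrpe.cfg on each monitored server
# Thresholds are fixed here, so NEMS only has to send the command name.
# The check_load path below is an assumption, not taken from the thread.
command[check_load]=/usr/local/nagios/libexec/check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0
```

After restarting the NRPE daemon on that server, ARG1 in NEMS would be just check_load, with no -a part and no quoting to go wrong.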
I'm a total newcomer to this whole universe, but having a ton of fun learning stuff. I was having a similar issue to what is being described above trying to use the check_disk plugin.

I've got a RPi 3B+ running the latest version of NEMS 1.5. Currently setting NEMS up to monitor 4 other RPis (3 more 3B+ and a 4B), plus some various and sundry home network equipment (router, NAS, etc.). Got the easy stuff working in NEMS (PING and SSH). I was trying to get NEMS to use NRPE to check the filespace on the root of one of the RPis. Installed NRPE on the remote RPi without issue, and it seemed to work. When I ran the following in the CLI on my NEMS server, I got a good response back from the remote client (Disk OK, blah blah blah, with all of the right values showing it's talking to the remote client).
Code:
/usr/local/nagios/libexec/check_nrpe -H <ipaddress> -c check_disk -a '-w 20% -c 10% -p/'


I noticed that the pre-defined Advanced Service for "/ Disk Space" in NConf isn't what I need because it lacks the -a switch and the single quotes. It's also setting the warning level to 80% free space and the critical level to 90% free space (might want to check those values; I don't mind having 80% free and I'd be THRILLED to have 90%). So I added a new Advanced Service that uses all of the same values as that original one, except the ARG1 field is set to:
Code:
check_disk -a '-w 20% -c 10% -p /'
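
(For anyone unsure how to read those thresholds: with the percent form, check_disk warns when less than 20% of the disk is free and goes critical when less than 10% is free. The sketch below only illustrates that ordering; disk_state is a hypothetical helper, not part of check_disk, and it does not model the plugin's exact boundary behaviour.)

```shell
# Hypothetical helper, not part of check_disk: classify a free-space
# percentage against "percent free" warning and critical thresholds.
disk_state() {
    free_pct=$1
    warn_pct=$2
    crit_pct=$3
    if [ "$free_pct" -lt "$crit_pct" ]; then
        echo CRITICAL
    elif [ "$free_pct" -lt "$warn_pct" ]; then
        echo WARNING
    else
        echo OK
    fi
}

disk_state 35 20 10   # prints: OK
disk_state 15 20 10   # prints: WARNING
disk_state 5 20 10    # prints: CRITICAL
```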

Generated config and deployed. When the check is made, it errors out with the message "CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected)."

Checked versions of NRPE on both the NEMS server and the client. Both versions are the same (3.2.1).

I made a workaround by defining a new hard-coded command in the nrpe.cfg file on the remote client (check_disk_root) and just calling that from the Advanced Service. But let's face it, that's cheating. I'd rather be able to define everything in NEMS than have to edit config files on individual clients.

Poking around while simultaneously working on this reply, I think I stumbled across the issue. It appears to be an issue with the checkcommand in NConf. The command by default is set to

Code:
$USER1$/check_nrpe -H $HOSTADDRESS$ -c "$ARG1$"

I'm not sure why $ARG1$ is surrounded by quotes in that line. Maybe other NRPE checks need them? But as soon as I removed the quotes, generated the config, and deployed, boom! Everything started working right with my check_disk check.
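
(A small sketch of why the quotes matter, assuming the expanded command line is split into words the way a POSIX shell would split it. show_args is a hypothetical helper; this imitates only the word splitting, not NRPE itself.)

```shell
# Hypothetical helper: print each argument on its own line in brackets
# so the word boundaries are visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# With the quotes, as in: -c "$ARG1$"
# The whole ARG1 value arrives as one word after -c (2 words in total).
show_args -c "check_disk -a '-w 20% -c 10% -p /'"

# Without the quotes, as in: -c $ARG1$
# -c gets check_disk, and -a gets the thresholds as its own word (4 words).
show_args -c check_disk -a '-w 20% -c 10% -p /'
```

In the quoted case check_nrpe would be asked for a command literally named "check_disk -a '-w 20% -c 10% -p /'", which no nrpe.cfg defines. In the unquoted case it is asked for check_disk with one argument string.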