Current Load - Warning / Critical - experimenter - 10-10-2018

Any suggestions on why my NEMS server is generating warnings and criticals for current load?  This is a Raspberry Pi 3 Model B, with only NEMS running on it.  Since I see lots of people running R-Pi3s and even R-Pi Zeros, it seems as if I should not be having load issues.

So far, I have noticed that the load spikes after a reboot, as it looked like NEMS was trying to query all the hosts at once at 15 minute intervals.  Since there are only six hosts being monitored, I went in and manually changed the check intervals so that they are staggered [host1 = 12 min, host2 = 13 min, etc].

The thresholds for the 1, 5, and 15 minute load averages are set as follows:
Warning:  5.0, 4.0, 3.0
Critical: 10.0, 6.0, 4.0

I haven't saved all the emails it generated, but samples are:
WARNING - load average: 6.07, 5.50, 3.27
WARNING - load average: 3.79, 3.87, 3.64
WARNING - load average: 4.46, 4.26, 3.45

After some time it will settle down, and the post-recovery numbers look like this:
OK - load average: 1.32, 3.09, 2.88
OK - load average: 2.10, 2.31, 2.74
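
To see why each of the samples above lands where it does, here is a minimal Python sketch of how a load check could evaluate the thresholds. It assumes each of the three averages is compared against its own threshold and that exceeding any one of them is enough to raise the state; the function name classify_load is made up for illustration and is not part of NEMS or Nagios.

```python
# Hypothetical re-creation of a load check; not the actual NEMS/Nagios code.
# Assumption: each average (1, 5, 15 min) is compared with its own threshold,
# and exceeding any single threshold raises the state.

WARN = (5.0, 4.0, 3.0)    # warning thresholds from the post
CRIT = (10.0, 6.0, 4.0)   # critical thresholds from the post


def classify_load(load, warn=WARN, crit=CRIT):
    """Return 'CRITICAL', 'WARNING' or 'OK' for a (1, 5, 15) minute load tuple."""
    if any(value > limit for value, limit in zip(load, crit)):
        return "CRITICAL"
    if any(value > limit for value, limit in zip(load, warn)):
        return "WARNING"
    return "OK"


# The load averages quoted in the alert emails above.
samples = [
    (6.07, 5.50, 3.27),
    (3.79, 3.87, 3.64),
    (4.46, 4.26, 3.45),
    (1.32, 3.09, 2.88),
    (2.10, 2.31, 2.74),
]

for sample in samples:
    print(classify_load(sample), "- load average:", sample)
```

Under that assumption, the second sample is a warning only because its 15 minute value (3.64) is above 3.0, even though its 1 and 5 minute values are under their limits.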

Obviously, I COULD raise the thresholds for triggering the warnings, but if I can determine that there is some way to lower the load, that would be preferable.


RE: Current Load - Warning / Critical - experimenter - 10-14-2018


RE: Current Load - Warning / Critical - Robbie Ferguson - 10-16-2018

Hi experimenter,
Can you provide a support.nems file so I can point you in the right direction?

It's not abnormal for a Raspberry Pi to have a pretty high load when working hard... it is what it is... but hitting a load average of 6 isn't something to be concerned about... just may want to adjust your alert thresholds for the NEMS server itself.

If your i7 was hitting a load average of 6, you'd wanna know. But a Pi hitting 6 is not overly abnormal (as long as it's happening in spikes, not 24/7).
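
As a rough way to put a load number in context, it can be read relative to the number of CPU cores. This is only a sketch: the helper name per_core_load is hypothetical, and the core count of 4 assumes the quad-core CPU of the Raspberry Pi 3 Model B.

```python
# Hypothetical helper: express load averages per CPU core.
# Assumption: 4 cores, as on a Raspberry Pi 3 Model B.

PI3_CORES = 4


def per_core_load(load_averages, cores=PI3_CORES):
    """Divide each load average by the core count, rounded to two decimals."""
    return tuple(round(value / cores, 2) for value in load_averages)


# One of the warning samples from earlier in the thread.
print(per_core_load((6.07, 5.50, 3.27)))
```

A per-core value above 1.0 means there was more runnable work than cores at that moment, which is fine in short spikes but worth a look if it stays that way.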

Lemme know.

RE: Current Load - Warning / Critical - experimenter - 10-22-2018

OK, I'll get you the file.  Can you remind me where to locate it?

Thanks in advance,

RE: Current Load - Warning / Critical - Robbie Ferguson - 10-24-2018

You don't locate it, you create it.


RE: Current Load - Warning / Critical - experimenter - 10-24-2018

Thanks.  It's on its way to you.

For completeness, and as a reference to others who might read this thread: I did raise the thresholds and still got alerts.  After some correspondence with Robbie about relative safe levels for a Raspberry Pi to operate, I turned off monitoring the load.  If we can determine why this particular Pi is running so hard, I will gladly turn it back on.

FWIW, I have not yet turned on load monitoring for any other Pi servers running here, though that would be useful.

RE: Current Load - Warning / Critical - Robbie Ferguson - 10-24-2018

Yeah, the Pi is great, but not very powerful. It doesn't take much to bring it to a full load. Turning off the monitors is okay, though you may just need to find a more reasonable threshold that suits such a low-powered chip.
NEMS 1.5 will have new sample checks with much higher thresholds.
Unfortunately, since you have disabled the load monitor, I cannot see the issue in your support.nems file (as it doesn't exist).
I do however see your weekly load average is just 1.5, so that's perfectly fine.
Perhaps sharing the notification output would help since it shows the load your system was hitting.