[RESOLVED] My NEMS 1.3 locking up

#1
Hi
I am just a home user (a retired IT person, 14 years now; I know some Unix, but it is not my strong point).
I have been using NEMS 1.1 since I found it about a year or so ago. I finally got the email going, and ever since it has been stable and has sent me emails whenever one of my 21 devices went down. I love this little tool. I monitor my security cameras, weather station, Ethernet switches, NAS and so on.
I have just recently upgraded to NEMS 1.3 and was more or less successful in using the backup.nems file from 1.1, once I realised that I had to run NConf and generate the new file to deploy, plus a few other little things that I had missed.

I played with it considerably for about a week and did all sorts and then it started locking up.
I decided that it was probably all my fiddling that caused it, so I took a backup of NEMS and rebuilt it on a new 16GB SD card.
This time I followed the instructions, with the exception of the NConf generate and deploy step, which I did not see in the instructions.
This rebuild took about 30 minutes which I thought was excellent.

This time I downloaded and burnt the image, booted, ran sudo nems-init, rebooted, ran nems-restore, ran raspi-config to expand the root partition, and ran nems-upgrade. I did not install the 135 or so Raspbian updates that it said were available, as I was not sure whether they were the cause of the problem. I then ran nems-restore, used NConf to generate & deploy, and let it run.
For nems-restore I mounted a USB stick on which I had saved the backup.nems file.

Things that I have in my NConf that I did not have in NEMS 1.1 are:
NRPE checks of another Raspberry Pi
SNMP checks on my Synology NAS
Other than that I really want to know if the devices are up or down.

I could not get NRPE to work until I updated the RPi that I was checking from jessie to stretch, so that the SSL and packet sizes were the same on both ends.

I am using SNMP to check the mains voltage (high and low) as well as the battery run time in seconds of the UPS that is connected to the NAS via a USB port. Synology uses Network UPS Tools (NUT) for this.
It took a bit to get the OID but when I got it NEMS checked it nicely.
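For readers setting up a similar check, the logic of a high/low voltage check can be sketched roughly as below. This is only an illustration in Python, not the plugin that NEMS or Nagios actually runs: the function name, the threshold values (which assume a nominal 230 V supply) and the sample readings are all assumptions. The return codes 0, 1 and 2 follow the usual Nagios plugin convention of OK, WARNING and CRITICAL.

```python
# Hypothetical sketch: classify a UPS input-voltage reading the way a
# Nagios-style check would. Thresholds are illustrative assumptions.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_voltage(volts, warn_low=220.0, warn_high=250.0,
                     crit_low=207.0, crit_high=253.0):
    """Return a Nagios-style status code for a mains voltage reading."""
    if volts < crit_low or volts > crit_high:
        return CRITICAL
    if volts < warn_low or volts > warn_high:
        return WARNING
    return OK


if __name__ == "__main__":
    for reading in (240.0, 215.0, 260.0):  # made-up sample readings
        print(reading, classify_voltage(reading))
```

In a real check the reading would come from the SNMP query against the UPS OID; the thresholds would need adjusting to the local mains standard.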

So I am rapt with what it's doing, with the exception of it going down. It has only gone down once since the last rebuild, but as I said, NEMS 1.1 never did, so I am wondering whether I have made some stupid mistake or whether there is a bug.

Thanks for a great system and hope you can give me some insight into this issue.

Thanks in advance.
Regards
Ron


#2
Hey Ron,
First of all, thanks so much for being such a loyal NEMS fan! Means a lot!

Instability issues are never fun, and certainly not something I'm used to seeing on NEMS. It should be pretty rock-solid. Therefore, I'd be quite interested in looking at your log files.

If you could, go into NEMS SST and remove your personal info (e.g., SMTP settings) and then send me a copy of your backup.nems file via email - [email protected] - I'd like to see if I can help you track down the cause.

It's definitely an anomaly, and my first guess was going to be the SD card, but then you said you'd already changed to a different card, so that's less likely. Do you have a second Pi handy, just in case it's something with the board? It could also be the power supply... Raspberry Pi systems can get finicky if they're not getting enough amperage.

Drawing straws until I see the logs :)

Thanks man! Great looking setup!

You're gonna love NEMS 1.3 once we get this solved! It's a massive improvement over NEMS 1.1!

Robbie
Robbie Ferguson // The Bald Nerd

Did I help you out? Appreciate what I do? Please consider saying thanks:
#3
Hi Robbie,
Have sent requested info as well as /var/log/syslog.1

Hope it helps.

Ron
#4
Thanks Ron.

March 27 at 1:25am your Pi hung for no apparent reason. You power cycled it a bit more than 5 hours later.

Your system load is good, averaging around 0.08, and thermals also look good at around 44 degrees C.
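As a rough way to spot-check the same two figures on a running Pi, a Python sketch along these lines could be used. It is not how NEMS gathers its statistics. It assumes a Linux system that exposes the CPU temperature in millidegrees at `/sys/class/thermal/thermal_zone0/temp`, which is the usual location on Raspbian; the function names are made up for this example.

```python
import os


def millidegrees_to_celsius(raw):
    """Convert the kernel's millidegree string (e.g. '44008\\n') to deg C."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(path="/sys/class/thermal/thermal_zone0/temp"):
    """Return the CPU temperature in deg C, or None if it cannot be read."""
    try:
        with open(path) as f:
            return millidegrees_to_celsius(f.read())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one, five, fifteen = os.getloadavg()
    print(f"load: {one:.2f} {five:.2f} {fifteen:.2f}")
    print("temp:", read_cpu_temp())
```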

There's literally nothing going on of any consequence here...

Because you set an encryption password in NEMS SST, I don't have access to any confidential info like your config files, so I really can only go by the logs... which show everything running just fine, up until that one blip for no apparent reason.

It could be a power issue (e.g., if power dropped for a moment and locked up the Pi) but I understand you have a UPS.

Is it the same Pi and power supply as previously?

You upgraded the SD card, so that could be a variable.

Sorry I can't pinpoint this. Literally, it appears your Pi froze up--not NEMS. I mean, the logs just STOP. There is no crash. There is nothing happening leading up to it. It just STOPS. So that leads me to suspect hardware.
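The kind of silence described here can be located in a syslog file with a small script. The Python sketch below is one possible approach, not the method actually used on Ron's logs: it assumes the traditional syslog timestamp format ("Mar 27 01:25:01" in the first 15 characters), which carries no year, so the caller has to supply one. The function name, the 30 minute threshold and the sample lines are invented for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year, threshold=timedelta(minutes=30)):
    """Yield (last_seen, next_seen) pairs where the log went silent."""
    previous = None
    for line in lines:
        try:
            # Traditional syslog lines start with e.g. "Mar 27 01:25:01".
            stamp = datetime.strptime(f"{year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        if previous is not None and stamp - previous > threshold:
            yield previous, stamp
        previous = stamp


if __name__ == "__main__":
    # Made-up sample entries; a real run would read /var/log/syslog.1.
    sample = [
        "Mar 27 01:20:01 nems CRON[1111]: example entry",
        "Mar 27 01:25:01 nems CRON[1112]: example entry",
        "Mar 27 06:31:12 nems kernel: example boot entry",
    ]
    for last_seen, next_seen in find_gaps(sample, year=2018):
        print(f"silent from {last_seen} to {next_seen}")
```

A gap that ends with boot messages and has no errors before it is consistent with a hard freeze followed by a power cycle.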

Lemme know if you find anything.

Cheers,
Robbie
Robbie Ferguson // The Bald Nerd

#5
Thank you Robbie.
I have just sent an email with further details for your perusal.

The device has not fallen over since, so it could have been a power spike. I will just have to monitor it.
I have rebooted it once, just to check on RPi-Monitor, which also seems to be having issues now, but the reboot did not change the behaviour.
I have attached a small video of what happens for you to see.
I have also sent a PDF document with screenshots of the menu options that I cannot access. This is a user configuration, but I wanted to consult with you before making any changes, as they might affect other things.

Thanks again
Ron
#6
Hi Ron,
Regarding things not showing correctly immediately at boot, this is because I use RAM heavily to offload the wear and tear on your SD card. So when you first turn on your NEMS server, it's as if it's starting with a clean hard drive for all temporary assets like graphs (for example). Because NEMS is meant to be left running 24/7, you'd generally never even notice this - but if you start opening things immediately after rebooting, you'll find some broken images and so on that simply have not been generated in RAM yet.
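To see which paths are RAM-backed on a given system, the mount table can be inspected. The Python sketch below is a generic Linux example rather than anything NEMS ships: it assumes the standard `/proc/mounts` layout (device, mount point, filesystem type, options, ...) and simply lists mount points of type `tmpfs`. The function name is an assumption made for this example.

```python
def tmpfs_mount_points(mounts_text):
    """Return the mount points whose filesystem type is tmpfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for point in tmpfs_mount_points(f.read()):
            print(point)
```

Anything listed there lives in RAM and starts out empty after every boot.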

Cheers!

Robbie
Robbie Ferguson // The Bald Nerd

#7
Hi Robbie,
Thanks for your help.
New image is cooking well!


So Far So Good.
Appreciate your efforts and response here.

:) :)

Cheers
Ron
#8
My pleasure, Ron! Glad to hear it.

For those seeing this after the fact, Ron upgraded to NEMS 1.3.1 which appears to have resolved any lingering issues he was experiencing.

All the best, Ron. Keep me posted on how things are going.

Robbie
Robbie Ferguson // The Bald Nerd

#9
I have ordered a new RPi B+. 
This should arrive on April 9. I am hopeful that I will be able to just swap the SD card over, connect Ethernet and power, and it should all work.
Hopefully this will cure my stability issues.
Looking at the benchmark results below (I have deleted my board ID from the dump), they seem to be OK to my novice eye.

benchmark.log
---------------------------------
SD Card READ:
/dev/mmcblk0p2:
 Timing buffered disk reads:  70 MB in  3.10 seconds =  22.59 MB/sec
SD Card WRITE:
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 2.99038 s, 35.1 MB/s
---------------------------------
Memory WRITE:
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.284994 s, 368 MB/s
---------------------------------
Filesystem:
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        15G  4.2G   11G  30% /
devtmpfs        484M     0  484M   0% /dev
tmpfs           489M     0  489M   0% /dev/shm
tmpfs           489M   13M  476M   3% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           489M     0  489M   0% /sys/fs/cgroup
tmpfs           489M  100K  489M   1% /tmp
tmpfs           489M     0  489M   0% /var/tmp
tmpfs           489M     0  489M   0% /var/www/nconf/temp
tmpfs           489M  2.7M  486M   1% /var/www/html/backup/snapshot
tmpfs           489M  3.0M  486M   1% /var/lib/monitorix/www/imgs
/dev/mmcblk0p1   60M   22M   39M  37% /boot
---------------------------------
Memory:
              total        used        free      shared  buff/cache   available
Mem:           976M        297M        181M         28M        497M        595M
Swap:          976M        196K        976M
---------------------------------
Internet Speed:
---------------------------------
Benchmark of this benchmark: 37 seconds
________________________________________________
The last figure is interesting.  What does 37 seconds mean?

My Internet connection is via a Fibre To The Node (FTTN) connection and I am 800m from the node.
Average speed is about 32Mb/s Down
However at the moment I am getting warning emails from NEMS that my Load stats are too high??

I would be interested to know if anyone thinks my stats are abnormal.
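As a sanity check on the figures in the log, the write speeds can be recomputed from the byte counts and times that are printed alongside them. The short Python sketch below does that arithmetic; the function name is made up for this example, and it assumes the "MB/s" in the log means decimal megabytes (1,000,000 bytes) per second.

```python
def throughput_mb_per_s(num_bytes, seconds):
    """Return decimal megabytes per second for a timed transfer."""
    return num_bytes / seconds / 1_000_000


if __name__ == "__main__":
    # Byte counts and durations copied from the benchmark log above.
    sd_write = throughput_mb_per_s(104857600, 2.99038)
    mem_write = throughput_mb_per_s(104857600, 0.284994)
    print(f"SD card write: {sd_write:.1f} MB/s")
    print(f"Memory write:  {mem_write:.0f} MB/s")
```

Both results agree with the 35.1 MB/s and 368 MB/s reported in the log.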


RT
#10
37 seconds = the time it took for the benchmark to run.

Hopefully the Pi 3 B+ will correct the issue for you. Because I haven't received reports of this issue from others, I suspect it is isolated to your setup. It could be something other than the SBC, but that's an easy one to rule out. The SD card would be another thing to try... if the B+ hangs, try a different SD card. Another thing to check is the Pi's power supply. Make sure it's 2.5A or higher.
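One way to check whether a Pi has seen an under-voltage condition is the firmware's `vcgencmd get_throttled` command, which prints a hex bitmask such as `throttled=0x50005`. The Python sketch below decodes that value. It is offered as a possible diagnostic, not something used in this thread: it assumes `vcgencmd` is installed (as it is on stock Raspbian), and the function name and flag wording are this example's own. The bit positions follow the Raspberry Pi documentation for `get_throttled`.

```python
import subprocess

# Bit positions as described in the Raspberry Pi vcgencmd documentation.
FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Decode a string such as 'throttled=0x50005' into readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    result = subprocess.run(["vcgencmd", "get_throttled"],
                            capture_output=True, text=True, check=True)
    flags = decode_throttled(result.stdout)
    print("\n".join(flags) if flags else "no throttling flags set")
```

Any "under-voltage" flag would point towards the power supply or cable rather than the board or SD card.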

So many variables :) Fingers crossed for you.

I'll have to have a look at the Internet speed benchmark... not sure why it's blank. Thanks!
Robbie Ferguson // The Bald Nerd
