Not all Services sending emails

#1
I've got three hosts/services running check_ping: the Google DNS server, my web host's email server, and my router.  They are all configured identically.  If I power the router off and back on, I would expect an email from all three services showing they are back up and running.  However, I never get an email from the router's service.

The router's service alert history shows the event, I just never get the email.

Could this be a timing issue, where the router is back up and running the local network so NEMS fires off the email even though there is not yet an actual Internet connection?  If so, how do I fix this?
#2
From what you're describing, it sounds to me as if the router comes back online before the notification time is reached, so nothing gets sent for it. Once the router is back up, though, it still takes some time before your Internet service comes online, so the remote pings stay down long enough to get caught within the time window for notifications, while the router doesn't.

Does that make sense?

You can change your notification settings if you want, for example by adding notifications when flapping or by shortening your notification time. This is admittedly something I need to add to the documentation... let me know if you're stuck.

Cheers,
Robbie
Robbie Ferguson // The Bald Nerd

#3
Ok, I ran a test.  I powered off the router and waited 6 mins.  Then I powered it back up and monitored both the LAN and WAN sides.  I was able to access my NAS on the LAN less than 1 minute after powering up the router.  I was able to get to Google about 2.5 mins after powering up the router.

This time I got 4 emails.

One from the service pinging my web host's email server and one from the service pinging the Google DNS server.  Both of those emails were RECOVERY emails saying the host was UP.

I also got one email for each of those two hosts, marked as a FLAPPINGSTART alert and reporting the host as up.

As before, I got NO emails about the router.

Here's a snippet of the Service Alert History and you can see that it definitely sees the router has gone down.


[2018-03-26 15:57:24] HOST FLAPPING ALERT: Google DNS;STOPPED; Host appears to have stopped flapping (4.7% change < 5.0% threshold)
[2018-03-26 15:49:44] SERVICE FLAPPING ALERT: NEMS;HTTP;STARTED; Service appears to have started flapping (24.2% change >= 20.0% threshold)
[2018-03-26 15:49:44] SERVICE ALERT: NEMS;HTTP;OK;HARD;4;HTTP OK: HTTP/1.1 200 OK - 15796 bytes in 0.944 second response time
[2018-03-26 15:45:54] SERVICE ALERT: RR Email;check_ping;OK;HARD;2;PING OK - Packet loss = 0%, RTA = 80.25 ms
[2018-03-26 15:45:54] SERVICE FLAPPING ALERT: Google DNS;check_ping;STARTED; Service appears to have started flapping (20.5% change >= 20.0% threshold)
[2018-03-26 15:45:54] SERVICE ALERT: Google DNS;check_ping;OK;HARD;2;PING OK - Packet loss = 0%, RTA = 12.66 ms
[2018-03-26 15:45:45] HOST FLAPPING ALERT: Google DNS;STARTED; Host appears to have started flapping (20.1% change > 20.0% threshold)
[2018-03-26 15:45:44] HOST ALERT: Google DNS;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 36.28 ms
[2018-03-26 15:45:35] HOST FLAPPING ALERT: RR Email;STARTED; Host appears to have started flapping (20.5% change > 20.0% threshold)
[2018-03-26 15:45:34] HOST ALERT: RR Email;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 78.58 ms
[2018-03-26 15:45:34] SERVICE FLAPPING ALERT: RR Email;check_ping;STARTED; Service appears to have started flapping (21.1% change >= 20.0% threshold)
[2018-03-26 15:45:34] SERVICE ALERT: RR Email;check_ping;CRITICAL;HARD;2;PING CRITICAL - Packet loss = 100%
[2018-03-26 15:44:54] SERVICE ALERT: NEMS;HTTP;CRITICAL;HARD;4;CRITICAL - Socket timeout after 10 seconds
[2018-03-26 15:44:09] HOST FLAPPING ALERT: Router;STARTED; Host appears to have started flapping (20.0% change > 20.0% threshold)
[2018-03-26 15:44:04] HOST ALERT: Router;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.70 ms
[2018-03-26 15:43:54] SERVICE FLAPPING ALERT: Router;check_ping;STARTED; Service appears to have started flapping (21.2% change >= 20.0% threshold)
[2018-03-26 15:43:54] SERVICE ALERT: Router;check_ping;OK;HARD;2;PING OK - Packet loss = 16%, RTA = 0.88 ms
[2018-03-26 15:43:54] SERVICE ALERT: NEMS;HTTP;CRITICAL;SOFT;3;CRITICAL - Socket timeout after 10 seconds
[2018-03-26 15:42:54] SERVICE ALERT: NEMS;HTTP;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
[2018-03-26 15:41:54] SERVICE ALERT: NEMS;HTTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
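
The timestamps in the history above can be compared directly. Here is a minimal Python sketch that measures how long after the router each remote host was marked UP. The three lines are copied from the log above; the function name `host_up_times` is made up for illustration and is not part of NEMS or Nagios.

```python
from datetime import datetime

# Three HOST ALERT lines copied from the alert history above.
LOG = """\
[2018-03-26 15:45:44] HOST ALERT: Google DNS;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 36.28 ms
[2018-03-26 15:45:34] HOST ALERT: RR Email;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 78.58 ms
[2018-03-26 15:44:04] HOST ALERT: Router;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.70 ms
"""


def host_up_times(log_text):
    """Return {host name: datetime of its 'HOST ALERT ... UP' entry}."""
    times = {}
    for line in log_text.splitlines():
        stamp, _, rest = line.partition("] ")
        kind, _, detail = rest.partition(": ")
        if kind != "HOST ALERT":
            continue  # skip service and flapping entries
        fields = detail.split(";")
        if fields[1] == "UP":
            times[fields[0]] = datetime.strptime(
                stamp.lstrip("["), "%Y-%m-%d %H:%M:%S"
            )
    return times


up = host_up_times(LOG)
gap_dns = (up["Google DNS"] - up["Router"]).total_seconds()
gap_mail = (up["RR Email"] - up["Router"]).total_seconds()
print(f"Google DNS marked UP {gap_dns:.0f} s after the router")
print(f"RR Email marked UP {gap_mail:.0f} s after the router")
```

By these entries the router was marked UP at 15:44:04, 90 seconds before RR Email and 100 seconds before Google DNS. That is consistent with the idea that a router recovery email attempted at that moment had no Internet path yet, although the log by itself does not show whether an email was attempted or why it failed.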


Attaching the Service page for the router.  The other two services have the exact same settings.


#4
Ahh, right after I submitted the last post I got three emails, one from each service with the alert being FLAPPINGSTOP
#5
Hey tonydi, so are you all good at this point?

Thanks,
Robbie
Robbie Ferguson // The Bald Nerd

#6
No, that was the only time I've gotten an email from the router being down.
#7
Hi tonydi,
I am not 100% sure I follow where the problem is; I don't quite understand what you're trying to describe. Everything I see here is by design... unless I'm just not understanding.

For example, I believe you're saying you're not getting the Flapping emails. But then your screenshot shows you only have notifications enabled for W, U, C. So if your service entered the Flapping state, as detailed in the log you posted above, you'll never get an email (because it hasn't reached C -- critical -- yet). Since you're not getting them when they go critical, this tells me probably that your host notification settings are incorrect. See https://docs.nemslinux.com/usage/notific...efinitions

I mean, I'm only going by the service definition. I'd have to see the Host notification settings to confirm.
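
To make the W, U, C point concrete, here is a rough Python sketch of how the enabled option letters decide which events produce an email. It assumes the standard Nagios Core letters for `notification_options` (w, u, c, r, f for services; d, u, r, f for hosts). The table names, the `would_notify` helper, and the host setting `"d,r"` are hypothetical, since the host settings have not been posted in this thread. It also ignores contact-level options and time periods, which can filter notifications too.

```python
# Event -> option letter, as used in Nagios Core notification_options.
SERVICE_OPTION_FOR = {
    "WARNING": "w",
    "UNKNOWN": "u",
    "CRITICAL": "c",
    "RECOVERY": "r",
    "FLAPPINGSTART": "f",
    "FLAPPINGSTOP": "f",
}
HOST_OPTION_FOR = {
    "DOWN": "d",
    "UNREACHABLE": "u",
    "RECOVERY": "r",
    "FLAPPINGSTART": "f",
    "FLAPPINGSTOP": "f",
}


def would_notify(enabled, event, table):
    """True if the event's option letter is among the enabled letters."""
    letters = {opt.strip() for opt in enabled.split(",")}
    return table[event] in letters


# Service settings as described from the screenshot: W, U, C only.
service_opts = "w,u,c"
for event in ("CRITICAL", "RECOVERY", "FLAPPINGSTART"):
    print("service", event, would_notify(service_opts, event, SERVICE_OPTION_FOR))

# Hypothetical host settings, for comparison only.
host_opts = "d,r"
for event in ("DOWN", "RECOVERY", "FLAPPINGSTART"):
    print("host", event, would_notify(host_opts, event, HOST_OPTION_FOR))
```

With only w, u, c enabled on a service, this model sends nothing for recovery or flapping events on that service, so any such emails would have to come from the host definition instead.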

Am I on the right track?  :D
Robbie Ferguson // The Bald Nerd

#8
LOL, sorry about confusing you but that's probably a by-product of the fact that I have absolutely no clue what I'm doing!

Let's back up to the beginning.  I started out wanting a device that would email me when one of three things happened: my router went down (I do a lot of beta testing for a major network equipment company), my web host's email server went down, or the Internet connection went down (different from the router because sometimes wifi goes down but the Internet is still available wired).  So I probably don't even need all of the "wuc" settings because I just want an idea if these are UP or DOWN.

I had a Pi running the free version of Netbeez but they discontinued the service.  NEMS seemed to be the best alternative, although running it locally makes the email situation less than ideal: obviously, if the router or Internet goes down, the initial email can't go out.

The purpose of the original post was that while I found that I was at least getting the emails for the Internet and email server coming back up, I was never getting any emails telling me that the router was back up (except for that one time).

So it's not about flapping or flipping or clapping or clipping or anything else.


All I want is some sort of notification from NEMS when the router has a problem.  


Is the attached what you're asking to see?  If not, please point me to the path to get to the Host notification settings.


#9
For me to truly get my head around what you are attempting, can you email me a copy of your backup.nems file? Before you do, change your encryption password to something you can share with me in NEMS SST, and then wait 5 minutes (for the new backup to generate) and email it to [email protected] - then let me know the password. :)
You also raise an interesting flaw if SMTP is not accessible... I'll look into this.
Robbie Ferguson // The Bald Nerd
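
On the SMTP concern above, one general approach is to hold notifications while the mail server is unreachable and retry them once it is back. The Python sketch below only illustrates that idea. It is not how NEMS or Nagios handles mail, and the `DeferredNotifier` class, its `send` callable, and the `fake_send` stand-in are all hypothetical names invented for this example.

```python
from collections import deque


class DeferredNotifier:
    """Queue messages while sending fails and retry them later, oldest first."""

    def __init__(self, send):
        # send: callable(message) that raises OSError when delivery fails.
        self._send = send
        self._pending = deque()

    def notify(self, message):
        """Queue a message and try to deliver everything queued so far."""
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Attempt delivery in order; return how many messages went out."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except OSError:
                break  # still offline: keep this and later messages queued
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self):
        return list(self._pending)


# Stand-in for a real mail sender so the sketch runs without a network.
link = {"online": False}
sent = []


def fake_send(message):
    if not link["online"]:
        raise OSError("SMTP server unreachable")
    sent.append(message)


notifier = DeferredNotifier(fake_send)
notifier.notify("Router DOWN")  # Internet is down: message stays queued
notifier.notify("Router UP")    # WAN not back yet: also queued
link["online"] = True
notifier.flush()                # both messages go out, oldest first
print(sent)
```

A real sender would need something to call `flush()` periodically, for example on a timer, and would have to decide how long a stale alert is still worth delivering.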

#10
Emailed.    tdi499