Announcing another quick-and-dirty Perl script today: surfboard-metrics.
I have been having a lot of trouble with my ISP over the last few weeks: constant disconnects triggered by modem reboots, each taking us offline for 3-5 minutes at a time (or longer). This would be a mere annoyance, except that I work from home pretty frequently, and a reboot takes the connection out long enough to drop VPN and ssh sessions. The longest outage to date is 4 hours. I finally decided to start collecting data to see if I needed to add a powered amplifier to my cable system.
I’m using a Motorola Surfboard 6121, and while the spec sheet lists SNMP v2 and v3, the modem appears to allow SNMP access only over the coax interface. It’s for the ISP to use, not for the end user. This means screen-scraping the web interface, whose signal page by default lives at http://192.168.100.1/cmSignal.htm. There is also a log event page at http://192.168.100.1/cmLogs.htm but I’m not dealing with that yet.
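To give a rough idea of what the scraping involves, here is a minimal sketch in Python (not the Perl of the actual script, and not its real code). It assumes the signal page is a plain HTML table with one row per metric and one column per channel. The sample markup, the row labels, and the names TableRows, extract_rows, and fetch_signal_page are all made up for illustration; the real page may be laid out differently, for example with nested tables.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRows(HTMLParser):
    """Collect every table row as a list of cell strings (flat tables only)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_signal_page(url="http://192.168.100.1/cmSignal.htm"):
    """Download the signal page. The latin-1 decoding is an assumption."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("latin-1")


# Hypothetical markup, only loosely modeled on a modem status page.
SAMPLE = """
<table>
<tr><th>Downstream</th><th>Channel 1</th><th>Channel 2</th></tr>
<tr><td>Signal to Noise Ratio</td><td>37 dB</td><td>36 dB</td></tr>
<tr><td>Power Level</td><td>3 dBmV</td><td>-1 dBmV</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

Because each row keeps one cell per channel, a modem with more bonded channels simply produces longer rows, and nothing in the parser has to change.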
The important things to watch are the power levels and the signal-to-noise ratios for the upstream and downstream channels. If upstream goes above 55 dBmV, or if downstream gets much above 40 dB, performance will go to crap, and you will start seeing log entries like
No Ranging Response received - T3 time-out
or
Unicast Ranging Received Abort Response - initializing MAC
and the modem will eventually reboot itself.
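Those two checks are easy to automate. The sketch below, again in Python and not part of the actual script, flags channels that exceed the limits quoted above and counts the two warning messages in a chunk of log text. The function names and the sample log format (timestamp, then message) are assumptions for illustration; only the threshold values and the message strings come from this post.

```python
UPSTREAM_MAX = 55.0    # upstream limit quoted in the post (dBmV)
DOWNSTREAM_MAX = 40.0  # downstream limit quoted in the post (dB)

WARNING_SIGNS = (
    "No Ranging Response received - T3 time-out",
    "Unicast Ranging Received Abort Response - initializing MAC",
)


def level_alerts(upstream, downstream):
    """Return one message per channel whose reading exceeds its limit."""
    alerts = []
    checks = (
        ("upstream", upstream, UPSTREAM_MAX),
        ("downstream", downstream, DOWNSTREAM_MAX),
    )
    for direction, readings, limit in checks:
        for index, value in enumerate(readings, start=1):
            if value > limit:
                alerts.append(f"{direction} channel {index}: {value} above {limit}")
    return alerts


def count_warning_signs(log_text):
    """Count how often each known warning message appears in log_text."""
    return {message: log_text.count(message) for message in WARNING_SIGNS}


# Hypothetical log excerpt; the real cmLogs.htm layout is not shown in the post.
SAMPLE_LOG = """\
Jul 11 14:28 No Ranging Response received - T3 time-out
Jul 11 14:29 No Ranging Response received - T3 time-out
Jul 11 14:30 Unicast Ranging Received Abort Response - initializing MAC
"""

if __name__ == "__main__":
    print(level_alerts([45.0, 56.5], [38.0, 41.0]))
    print(count_warning_signs(SAMPLE_LOG))
```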
The script I wrote scrapes the signal page and sends all the various metrics to a Graphite/Carbon server. I’ve tried to make it reasonably flexible: if you have a higher speed connection with multiple bonded upstream or downstream channels, the script should be able to create a metric for each channel, but as I only have my modem for reference, I can’t verify some of that. As it might be helpful, I’ve put it on GitHub. If you have a Motorola Surfboard, give it a spin. I welcome pull requests that add support for other modems or for output to other metric engines.
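For anyone who wants to write their own output, Carbon's plaintext protocol is simple: one line per data point in the form "path value timestamp", sent over TCP (port 2003 by default). Here is a minimal Python sketch of the one-metric-per-channel idea. The metric naming scheme, the function names, and the shape of the readings dictionary are my own assumptions for illustration and do not necessarily match what surfboard-metrics emits.

```python
import socket
import time


def format_metrics(prefix, readings, timestamp):
    """Build Carbon plaintext lines, one per channel and metric.

    readings maps direction -> metric name -> list of per-channel values,
    e.g. {"downstream": {"snr_db": [37, 36]}} (a hypothetical layout).
    """
    lines = []
    for direction, metrics in sorted(readings.items()):
        for name, per_channel in sorted(metrics.items()):
            for index, value in enumerate(per_channel, start=1):
                path = f"{prefix}.{direction}.channel{index}.{name}"
                lines.append(f"{path} {value} {timestamp}")
    return lines


def send_to_carbon(lines, host="localhost", port=2003):
    """Send the lines to a Carbon plaintext listener over TCP."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    sample = {
        "downstream": {"snr_db": [37, 36], "power_dbmv": [3, -1]},
        "upstream": {"power_dbmv": [45]},
    }
    for line in format_metrics("modem", sample, int(time.time())):
        print(line)
    # send_to_carbon(...) is left uncalled here; it needs a reachable Carbon host.
```

Putting the channel number in the metric path means extra bonded channels show up as new series in Graphite without any configuration changes.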
Here’s a screen capture of my local Grafana instance, looking at recent data:
You can see a reboot on 7/11 around 14:30, and another on 7/12 at 11:30. I think the next step is to correlate temperature and humidity readings, as well as internet traffic levels, and see if any patterns emerge.