Problems like this can be really tedious to pin down: unless you have a keyboard and monitor you can plug in, there's no way to check what has gone wrong on the live system once ssh stops working.
Here's a simple starting point:
#!/bin/bash
# Set these to whatever you want.
router_ip=192.168.0.1
log_file=/tmp/mystery.log
# Make sure we can write to the log.
if ! touch "$log_file"; then
    echo "Cannot use $log_file."
    exit 1
fi
# Redirect output: discard stdout, append stderr to the log.
exec 1> /dev/null
exec 2>> "$log_file"
# A function for logging: prefixes the message with a date and time stamp.
print2log () {
    echo "$(date +'%D %R') $*" >> "$log_file"
}
# Loop infinitely.
while true; do
    sleep 900 # 15 minutes
    # Ping router.
    if ping -c 1 "$router_ip"; then
        print2log "Ping OK."
    else
        print2log "Ping $router_ip failed."
    fi
    # Check sshd (the PID list is left unquoted so it collapses onto one line).
    print2log "sshd PIDs:" $(ps -o pid= -C sshd)
done
Call this check.sh or whatever you want, run chmod 755 check.sh to make it executable, and start it from within an ssh login:
setsid ./check.sh &
It does not have to be run with sudo. You can now log out and it should keep running. Every 15 minutes it will print something like this to /tmp/mystery.log:
03/27/15 10:59 Ping OK.
03/27/15 10:59 sshd PIDs: 4261 14262
The first line indicates there is a working network connection and the second one indicates sshd is running. Regarding those PIDs: there should be at least one, and while the exact values don't matter, they should be reasonably consistent (i.e., not change every 15 minutes).
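To judge whether the PIDs are "reasonably consistent", it can help to collapse the log into the distinct PID lists it contains. The following is only a sketch: the function name pid_history is made up for illustration, and it assumes the log lines look exactly like the sample above.

```shell
#!/bin/bash
# Print each distinct sshd PID list found in the log, prefixed by how many
# times it occurred. Assumes lines of the form "date time sshd PIDs: ...".
pid_history () {
    local log=$1
    # Keep only the text after "sshd PIDs:", then count identical lists.
    sed -n 's/^.*sshd PIDs:[[:space:]]*//p' "$log" | sort | uniq -c
}
```

Running pid_history /tmp/mystery.log should then show a handful of lists with high counts if sshd is stable. A long run of lists that each appear only once would suggest the daemon keeps restarting, although extra PIDs also come and go with ordinary ssh logins.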
If there are no PIDs at a certain point, you have at least confirmed that sshd has died for some reason. Running
grep sshd /var/log/syslog
should help you find the reason.
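Once the log has been collecting entries for a while, a quick count of the bad entries is easier to read than the raw file. This is a rough sketch rather than part of the script above: summarize_log is a hypothetical helper, and the patterns assume the message wording used by check.sh.

```shell
#!/bin/bash
# Count failed pings and checks where no sshd process was found.
# Assumes the message wording written by check.sh.
summarize_log () {
    local log=$1
    local ping_failures no_sshd
    # Lines such as "Ping 192.168.0.1 failed."
    ping_failures=$(grep -c 'Ping .* failed\.' "$log")
    # "sshd PIDs:" lines with nothing after the colon.
    no_sshd=$(grep -c 'sshd PIDs:[[:space:]]*$' "$log")
    echo "ping failures: $ping_failures"
    echo "checks with no sshd: $no_sshd"
}
```

Call it as summarize_log /tmp/mystery.log. If both counts are zero but ssh was still unreachable at some point, the cause is probably something this script does not watch.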
WICD-CURSES
- It usually works well for me with WiFi reconnect, but you need a good power supply, like a regulated one, not just a cheap USB power supply, to completely solve the issue. Cheap USB supplies have a lot of noise on the DC line, which messes up WiFi a lot of the time! The rest of the time there just aren't enough amps to keep it running stably. – Piotr Kula Jul 26 '15 at 18:21