2

When my Raspberry Pi is under heavy load (e.g. copying terabytes of data over SSH for a few hours), it has recently started to die randomly and stop responding. The syslogs don't tell me much.

Not overclocked, fan attached.

The LED blinks like this: https://dl.dropboxusercontent.com/u/44131220/2015-08-25%2016.46.37.mp4

Here is the kern.log from the last minutes of the Pi's life: https://dl.dropboxusercontent.com/u/44131220/rasplog.txt

It always looks nearly the same: flooded with error -110.
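A note on the error code: kernel log messages report failures as negated errno values, so -110 corresponds to errno 110, which on Linux is ETIMEDOUT (a timeout). Below is a minimal Python sketch for looking such codes up. The helper name `describe_errno` is only illustrative, and the number-to-name mapping is platform-dependent, so it is best run on the Pi itself.

```python
import errno
import os


def describe_errno(code):
    """Return (number, symbolic name, message) for a kernel-style error code.

    Kernel logs print errors as negative numbers, so the sign is dropped first.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


if __name__ == "__main__":
    # The value flooding the kern.log in the question.
    print(describe_errno(-110))
```

This only names the error. It does not say which device timed out, so the surrounding log lines still matter.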

Jacobm001
Lapsio
  • I think a continually blinking green LED means the SD card isn't inserted (or rather, is not responding to the once-a-second "are you there" message). – joan Aug 25 '15 at 17:26
  • "copying tb of data over...a few hours" -> It would take several days to pass a single terabyte of data over the pi's 100BASE-TX ethernet. – goldilocks Aug 25 '15 at 17:38
  • yes, but it dies after a few hours :D – Lapsio Aug 25 '15 at 17:40
  • @user2111737 Does the Pi fail electrically, or does the OS stop responding? – Chetan Bhargava Feb 21 '16 at 22:17
  • @ChetanBhargava The OS. I think it might be a kernel panic or something similar. This rasp had a drive with a btrfs filesystem, and the data was being transferred to it. It ran an old Raspbian kernel, and btrfs used to cause panics. It's similar to what I'm experiencing on a desktop PC with a faulty SSD: the OS just freezes and nothing happens afterwards. And there's nothing in the logs because, well... the system SSD is dead. I've replaced this RasPi with an Intel NUC. Now I'm using the rasp as an HTPC, where it's not as critical as it was as a server... :< – Lapsio Feb 21 '16 at 23:34
  • @user2111737 Can you describe what kind of power supply you were using? Did you see btrfs errors in the logs? FS errors are logged to syslog, so you should be able to see them. Did you use the same power supply with the Intel NUC? I think not, as you can use almost ANY power supply with a Pi, but there is a standard for power supplies that can be used with a NUC. Please elaborate. Sorry for the late reply; I was not that motivated to read all the comments on my posts in this section of SE. – Chetan Bhargava Mar 23 '16 at 05:35

4 Answers

1

I believe that your Pi crashed due to a lack of power.

I believe that your hard drive's "sustained read" needed more power than your power supply could provide. The constant draw depleted the charge in the output capacitors of the power supply. This doesn't happen with occasional reads, where the output caps get time to replenish their charge.

You can test this using a USB power meter (picture below). When the Raspberry Pi's current draw increases, a cheap power adapter starts to let the voltage dip below 5V, and then the Pi stops working.

USB Meter

The Raspberry Pi B+ has an under-voltage indicator. Here is a helpful post.
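If no USB meter is at hand, the firmware's own under-voltage flags can be read in software. The sketch below decodes the output of `vcgencmd get_throttled`, assuming the Pi runs firmware recent enough to support that command (the setup in the question may well predate it). The function names and the sample value `0x50005` are illustrative, not taken from the question.

```python
import subprocess

# Bit meanings of the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Turn a line such as 'throttled=0x50005' into a list of set flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def read_throttled():
    """Query the firmware; only works on a Pi with vcgencmd installed."""
    result = subprocess.run(["vcgencmd", "get_throttled"],
                            capture_output=True, text=True, check=True)
    return decode_throttled(result.stdout)


if __name__ == "__main__":
    # Decode a made-up sample value instead of calling the hardware.
    for flag in decode_throttled("throttled=0x50005"):
        print(flag)
```

On the Pi itself, `read_throttled()` returns the same kind of list for the live value. A set "has occurred since boot" flag after a long transfer would point towards the power supply.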

EDIT (as explained to Chris):

All power supplies (including linear supplies and switchers) have one or more output capacitors. This capacitor stores energy in case there is a transient demand in output current. The cap charges to its full capacity some time after power-up. In use, the power draw may become marginally (by a fraction) higher than what the power supply can deliver. The energy in the output capacitor then gradually decreases with time. The depletion depends on factors like:

  • Capacity of the output capacitor. Transient capacity ∝ charge-carrying capacity, i.e. the value of the capacitor in use, measured in farads (F).
  • Output capability of the regulator circuit.
  • Sustained power draw from the target.

When the energy in the cap is fully depleted, the power draw is higher than what the power supply circuit (excluding the output capacitor) can deliver. Hence the load does not get enough power (P=V.I). This is my understanding of the situation.

As I mentioned, a better analysis can be done by measuring current and voltage (output power of the power supply). OP has not provided any specifications about the power supply used.

I have personally seen these dropouts happen when using cheap 500 mA power supplies that are labelled as 1A. I would encourage the OP to use a 2A power supply with the V, I monitor, and report back if the issue still persists.

Chetan Bhargava
  • it's a possible answer, but according to the question this happens after a few hours of continuous ssh data transfer. You seem to suggest that this won't happen on occasional reads, but would happen under heavier load. So, shouldn't this happen fairly quickly and not take hours to become a problem? – Chris Mar 23 '16 at 04:24
  • @Chris All power supplies (including linear supplies and switchers) have an output capacitor. This capacitor stores energy in case there is a transient demand in output current. The cap charges to its full capacity some time after power-up. Once the power draw becomes marginally (by a fraction) higher than what the power supply can deliver, the energy in the output capacitor gradually decreases. When the energy in the cap is fully depleted, the power draw is higher than what the power supply can deliver. Hence the load does not get enough power (P=V.I). This is my understanding of the situation. – Chetan Bhargava Mar 23 '16 at 05:16
  • @Chris As I mentioned, a better analysis can be done by measuring current and voltage. The OP has not provided any specifications for the power supply used. I have personally seen this happen when using cheap 500 mA power supplies that are labelled as 1A. I would encourage the OP to use a 2A power supply with the V, I monitor, and report back if the issue still persists. If the issue persists, people can freely downvote this answer. – Chetan Bhargava Mar 23 '16 at 05:19
  • Great explanation - I honestly don't agree with the approach of downvoting answers because they didn't work in a specific situation. Downvotes should refer to inferior quality or similar arguments, but a valid answer and possible solution for the next person looking through this question shouldn't receive any downvotes in my understanding of stackexchange. – Chris Mar 23 '16 at 05:43
  • @Chris Thanks. There are a few people that would downvote for any reason including formatting. This is a new SE site so users are not aware about the SE culture. – Chetan Bhargava Mar 23 '16 at 06:07
  • I think it could be a kernel panic or other kernel failure after all. I finally replaced this rasp with a NUC, and after hours of I/O it drops the drive on an I/O error. As the drive uses btrfs and the rasp was running quite an old kernel, maybe this drive failure caused a serious system crash instead of just dropping the drive. But I don't think it was a power issue. I was powering this rasp from the strongest powered hub I could find. I don't remember the exact values now, but it was much more than the standard USB 1A, and the drive uses its own power supply. – Lapsio Mar 28 '16 at 19:20
0

The only easily visible blinking green LED in that video is on the router (?) in the background, so it is hard to tell what you are talking about. The green ACT light will blink in a very regular pattern, as in a number of evenly spaced blinks, then a pause, then repeat, if the pi has rebooted and encountered a problem -- however, the fact that the ethernet link is up makes that extremely unlikely, in which case the blinking is probably more irregular. That's what it does when it is busy accessing the SD card.

So the pi does not in fact appear dead. It is possibly bogged down with something and very unresponsive. I'd suggest you attach a screen so you can leave top or something running and see if that provides a clue as to what's up.
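For a headless pi, or one whose screen is usually switched to another source, a rough alternative to watching `top` is to append a sample to a file every few seconds and force it to disk, so the last lines survive a hard freeze. This is only a sketch: the log path, the interval and the function names are arbitrary choices, and it records just the load average and available memory read from `/proc`.

```python
import os
import time


def format_sample(loadavg_text, meminfo_text):
    """Build one log line from the contents of /proc/loadavg and /proc/meminfo."""
    load1, load5, load15 = loadavg_text.split()[:3]
    mem = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            mem[key] = int(fields[0])  # /proc/meminfo reports values in kB
    # Older kernels have no MemAvailable entry, so fall back to MemFree.
    available = mem.get("MemAvailable", mem.get("MemFree", 0))
    return f"load={load1}/{load5}/{load15} mem_available_kB={available}"


def log_forever(path="/var/tmp/pi-health.log", interval=10.0):
    """Append a timestamped sample every `interval` seconds until interrupted."""
    with open(path, "a") as log:
        while True:
            with open("/proc/loadavg") as f:
                loadavg = f.read()
            with open("/proc/meminfo") as f:
                meminfo = f.read()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {format_sample(loadavg, meminfo)}\n")
            log.flush()
            os.fsync(log.fileno())  # push the line to storage before sleeping
            time.sleep(interval)
```

Calling `log_forever()` in the background before a long transfer and reading the file after the next freeze should at least show whether load or memory was climbing beforehand.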

goldilocks
  • it stops responding to pings - router finds it dead, i can't connect to it, keyboard doesn't work, there's nothing on display. – Lapsio Aug 25 '15 at 17:36
  • this is log from last minutes of Pi's life: https://dl.dropboxusercontent.com/u/44131220/rasplog.txt it looks always similar before death – Lapsio Aug 25 '15 at 17:37
  • The ethernet has obviously failed, that's all those errors. If the system has gotten bogged down, it will not be responsive if you plug in a screen at that point. You have to 1) have the screen plugged in and whatever monitor running, 2) keep the display active if that is a problem. We probably get a few posts a week like this one ("my headless pi is dead", where "dead" really means, not responsive). 95% of them are never solved (or if they were, no indication is left by the poster). You need to make further effort to diagnose this or there's not much anyone is likely to be able to tell you. – goldilocks Aug 25 '15 at 17:44
  • Doing a quick search for one of those errors ("FIQ reported NYET"), pretty much all the hits are pi specific (i.e., they aren't scattered amongst similar reports from other linux systems). That implies it has something to do with the smsc95xx driver; as to whether it or the hardware is the cause I dunno. – goldilocks Aug 25 '15 at 17:50
  • it's quite hard to diagnose as it happens once every few weeks, really rarely, sometimes at night - i only know it happened once it's already "dead", because my router beeps when the rasp link goes down and enables a fallback redirect. The monitor attached to the rasp is always switched to a different source anyway, so I guess it'd be fine to disable screen auto poweroff and keep it always active so I could eventually see what's going on - how can i do that? Hmm... If it's ethernet, and the rasp has this fancy USB+ethernet duo - can it be that when ethernet is down, so is USB, and that's why the keyboard doesn't respond? – Lapsio Aug 25 '15 at 17:51
  • What are the timestamps in the log indicative of? You could try a trick like this with some USB device to see if you can force it to shutdown/reboot, and then check the log to see if it really happened. Not that that will solve the problem, of course. The thing is if it gets busy looped, it will go very sluggish and may never complete setting up a keyboard or whatever when you plug it in. If that's the case then you would want to confirm it further by noting the CPU usage and trying to spot a guilty process. – goldilocks Aug 25 '15 at 18:00
  • I've already tried a ctrl+alt+del reboot (i was not logged in, so it should trigger a reboot) using the keyboard, but it didn't do anything even though I waited quite a long time (like 10 min). I couldn't check alt+sysrq+B because unfortunately my keyboard has neither sysrq nor prt sc. One more note - i think it's stuck, not busy, because the CPU and the air around the pi are quite cold compared to full load temps – Lapsio Aug 25 '15 at 18:09
0

According to the log, you get a failed-to-write error. Are you exceeding the file system capacity? Have you expanded the rootfs? If either of those questions seems strange, then most likely you are running out of room on your device.

Check out expanding the file system on the Foundation's site for the usual fix.
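To see quickly whether the root file system is close to full, something like the following can be run on the pi. It is a minimal sketch using Python's standard `shutil.disk_usage`; the function name `usage_report` is illustrative, and checking `/` assumes the data lands on the root file system rather than on a separately mounted drive.

```python
import shutil


def usage_report(path="/"):
    """Summarise free and total space for the file system containing `path`."""
    total, used, free = shutil.disk_usage(path)
    percent = 100.0 * used / total if total else 0.0
    return (f"{path}: {free / 1e9:.2f} GB free of "
            f"{total / 1e9:.2f} GB ({percent:.0f}% used)")


if __name__ == "__main__":
    print(usage_report("/"))
```

Passing the mount point of the external drive instead of `/` checks the target of the copy.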

Jacobm001
0

It is possible that the SD card is corrupt. Try using a different card and re-downloading the OS. If that doesn't work, try an older version. This may help.