I have a Pi 3 B running Raspbian Stretch. Recently, I started experiencing lag logging in with SSH, and sporadically during use. It really became apparent when I tried zipping my 2GB Plex library files. It would hang a few times per minute on a small file. Monitoring in htop, the process will change to state D and drop to 0% CPU. Typically when it's hung, I can't execute a command from another window. It resumes after 1-2 seconds.
I suspected power issues, but running this command...
$ /opt/vc/bin/vcgencmd get_throttled
...indicates nothing is throttling the system when the freezes happen. I also don't get the lighting bolt on the screen.
I ran this command as root:
# echo w > /proc/sysrq-trigger
When no processes are hung, it puts this in /var/log/messages:
Dec 8 21:00:57 gax-pi-2 kernel: [ 549.369634] sysrq: SysRq : Show Blocked State
Dec 8 21:00:57 gax-pi-2 kernel: [ 549.369662] task PC stack pid father
When the zip process is hung, it puts this in /var/log/messages:
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077763] sysrq: SysRq : Show Blocked State
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077802] task PC stack pid father
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077856] jbd2/mmcblk0p2- D 0 69 2 0x00000000
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077903] [<8079f530>] (__schedule) from [<8079fb98>] (schedule+0x50/0xa8)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077924] [<8079fb98>] (schedule) from [<8014b690>] (io_schedule+0x20/0x40)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077943] [<8014b690>] (io_schedule) from [<8021d718>] (wait_on_page_bit+0x110/0x130)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077963] [<8021d718>] (wait_on_page_bit) from [<8021d818>] (__filemap_fdatawait_range+0xe0/0x114)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.077979] [<8021d818>] (__filemap_fdatawait_range) from [<8021d8b8>] (filemap_fdatawait_keep_errors+0x34/0x60)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078000] [<8021d8b8>] (filemap_fdatawait_keep_errors) from [<8037b7bc>] (jbd2_journal_commit_transaction+0x79c/0x17f0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078020] [<8037b7bc>] (jbd2_journal_commit_transaction) from [<80381920>] (kjournald2+0x110/0x2c0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078037] [<80381920>] (kjournald2) from [<8013dd98>] (kthread+0x13c/0x16c)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078058] [<8013dd98>] (kthread) from [<801080ac>] (ret_from_fork+0x14/0x28)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078174] zip D 0 1693 1689 0x00000000
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078200] [<8079f530>] (__schedule) from [<8079fb98>] (schedule+0x50/0xa8)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078215] [<8079fb98>] (schedule) from [<8014b690>] (io_schedule+0x20/0x40)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078230] [<8014b690>] (io_schedule) from [<807a0508>] (bit_wait_io+0x1c/0x70)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078247] [<807a0508>] (bit_wait_io) from [<807a0198>] (__wait_on_bit+0x94/0xd0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078264] [<807a0198>] (__wait_on_bit) from [<807a025c>] (out_of_line_wait_on_bit+0x88/0x90)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078281] [<807a025c>] (out_of_line_wait_on_bit) from [<80379544>] (do_get_write_access+0x254/0x4f4)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078299] [<80379544>] (do_get_write_access) from [<8037982c>] (jbd2_journal_get_write_access+0x48/0x74)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078319] [<8037982c>] (jbd2_journal_get_write_access) from [<8031c68c>] (__ext4_journal_get_write_access+0x44/0x84)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078340] [<8031c68c>] (__ext4_journal_get_write_access) from [<8033a618>] (ext4_reserve_inode_write+0x8c/0xc0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078356] [<8033a618>] (ext4_reserve_inode_write) from [<8033a830>] (ext4_mark_inode_dirty+0x48/0x1e0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078371] [<8033a830>] (ext4_mark_inode_dirty) from [<8033f8cc>] (ext4_dirty_inode+0x54/0x70)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078389] [<8033f8cc>] (ext4_dirty_inode) from [<802bbe9c>] (__mark_inode_dirty+0x54/0x408)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078414] [<802bbe9c>] (__mark_inode_dirty) from [<802a7e78>] (generic_update_time+0x84/0xc0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078432] [<802a7e78>] (generic_update_time) from [<802a81c0>] (file_update_time+0xc8/0x11c)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078451] [<802a81c0>] (file_update_time) from [<802217fc>] (__generic_file_write_iter+0xa0/0x1f8)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078472] [<802217fc>] (__generic_file_write_iter) from [<80328dd0>] (ext4_file_write_iter+0x188/0x454)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078491] [<80328dd0>] (ext4_file_write_iter) from [<8028b8c0>] (__vfs_write+0xe0/0x138)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078511] [<8028b8c0>] (__vfs_write) from [<8028bae0>] (vfs_write+0xb4/0x1bc)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078528] [<8028bae0>] (vfs_write) from [<8028bd30>] (SyS_write+0x54/0xb0)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078545] [<8028bd30>] (SyS_write) from [<80108000>] (ret_fast_syscall+0x0/0x28)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078553] kworker/0:0 D 0 1717 2 0x00000000
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078577] Workqueue: events_freezable mmc_rescan
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078598] [<8079f530>] (__schedule) from [<8079fb98>] (schedule+0x50/0xa8)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078615] [<8079fb98>] (schedule) from [<8061a79c>] (__mmc_claim_host+0xb8/0x1cc)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078632] [<8061a79c>] (__mmc_claim_host) from [<8061a8e0>] (mmc_get_card+0x30/0x34)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078650] [<8061a8e0>] (mmc_get_card) from [<806234dc>] (mmc_sd_detect+0x20/0x74)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078665] [<806234dc>] (mmc_sd_detect) from [<8061d1a8>] (mmc_rescan+0x1c8/0x394)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078681] [<8061d1a8>] (mmc_rescan) from [<801379b4>] (process_one_work+0x158/0x454)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078695] [<801379b4>] (process_one_work) from [<80137d14>] (worker_thread+0x64/0x5b8)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078710] [<80137d14>] (worker_thread) from [<8013dd98>] (kthread+0x13c/0x16c)
Dec 8 21:01:41 gax-pi-2 kernel: [ 594.078726] [<8013dd98>] (kthread) from [<801080ac>] (ret_from_fork+0x14/0x28)
I ran this command as well:
# time badblocks -sv /dev/mmcblk0p2 -o mmcblk0p2.log
And this was the result. Which seems pretty good to me.
Checking blocks 0 to 15511039
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
0.52user 13.93system 13:24.49elapsed 1%CPU (0avgtext+0avgdata 2260maxresident)k
31022080inputs+0outputs (0major+139minor)pagefaults 0swaps
I added fsck.mode=force to cmdline.txt and rebooted, with this result:
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: Pass 1: Checking inodes, blocks, and sizes
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: Pass 2: Checking directory structure
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: Pass 3: Checking directory connectivity
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: Pass 4: Checking reference counts
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: Pass 5: Checking group summary information
Dec 8 22:11:13 gax-pi-2 systemd-fsck[122]: /dev/mmcblk0p2: 335467/955808 files (0.3% non-contiguous), 3338340/3877760 blocks
I feel like the stack trace from /var/log/messages is telling me there are FS or SD Card problems, but it isn't showing up in the tests. Do I need to replace my card, or is there something else I should be trying?
Edit:
Before running the Zip, $ free -m
says:
total used free shared buff/cache available
Mem: 927 224 33 12 669 633
Swap: 99 0 99
During a hang, it says:
total used free shared buff/cache available
Mem: 927 348 31 5 546 509
Swap: 99 7 92
It seems to hang whether the swap used is changing or not.
$ free -m
. – Dustin Lewis Dec 09 '18 at 18:47