* Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-05 11:48 UTC
To: linux-btrfs
Hello,
I had been running kernel 3.19 for a few days when, yesterday, I
started getting hard lock-ups all the time. These seemed to be
correlated with some program activity, most likely running Firefox. I
downgraded to 3.18.6, but the problem persisted, though now I only
got soft lock-ups, which I could (after SysRq+REISUB) inspect in dmesg.
The dmesg output pointed me toward btrfs.
For what it’s worth, I uploaded the full kernel logs to:
http://a.pomf.se/zsywtk.xz
These entries seem noteworthy:
> kernel BUG at fs/btrfs/ctree.h:2565!
> kernel BUG at fs/btrfs/ctree.h:2501!
> NMI watchdog: BUG: soft lockup + btrfs-related stack trace
> WARNING: CPU: 2 PID: 7051 at fs/btrfs/scrub.c:2461 scrub_stripe+0xcb8/0x1080 [btrfs]()
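
For readers who want to pull the same kinds of entries out of a saved kernel log, here is a minimal sketch in Python. The patterns are derived only from the excerpts quoted above; the function name `summarize` and the grouping by file and line are illustrative assumptions, not part of any btrfs or kernel tooling.

```python
import re

# Patterns modelled on the log excerpts quoted in this message.
# They are assumptions about the line format, not an official grammar.
PATTERNS = {
    "kernel_bug": re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!"),
    "soft_lockup": re.compile(r"NMI watchdog: BUG: soft lockup"),
    "warning": re.compile(
        r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+)"
    ),
}


def summarize(log_text):
    """Count matching log lines, keyed by (kind, "file:line")."""
    counts = {}
    for line in log_text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if not match:
                continue
            # Soft-lockup lines carry no source location in this sketch.
            if "file" in pattern.groupindex:
                where = "{}:{}".format(match.group("file"), match.group("line"))
            else:
                where = ""
            key = (kind, where)
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Feeding it the decompressed log text would show, for instance, how often each `ctree.h` BUG location recurs, which may help when comparing logs from different kernel versions.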
Since the lock-ups appeared predictably a few minutes after boot &
launching Firefox, I tried running btrfsck (without --repair) to see
whether something is wrong with the home volume (maybe due to the
unclean shutdowns from earlier lock-ups). However, when running
btrfsck, all I get is:
> checking extents
> cmds-check.c:4943: process_extent_item: Assertion `item_size != sizeof(*ei0)` failed.
> btrfs check[0x41a7dc]
> btrfs check[0x41d9af]
> btrfs check[0x423751]
> btrfs check[0x4241e9]
> btrfs check[0x424de1]
> btrfs check(cmd_check+0x14b5)[0x427f0b]
> btrfs check(main+0x15d)[0x40997d]
> /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f8ce834d800]
> btrfs check(_start+0x29)[0x409539]
Does this indicate any particular problems with the file system?
In addition, I ran a full scrub. This finished with «scrubbed … with 0
errors», but it also reported:
> error details: super=4
> corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
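
The two summary lines above can be turned into numbers with a short parser. This is only a sketch, assuming the output has exactly the shape quoted here; `parse_scrub_summary` is a hypothetical helper, and the field names are copied verbatim from the output rather than from any documented interface. The `super` counter presumably refers to superblock errors, but that reading is an assumption.

```python
import re


def parse_scrub_summary(text):
    """Extract counters from scrub summary lines shaped like the ones quoted above."""
    result = {"details": {}, "totals": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("error details:"):
            # e.g. "error details: super=4"
            for name, value in re.findall(r"(\w+)=(\d+)", line):
                result["details"][name] = int(value)
        elif "errors:" in line:
            # e.g. "corrected errors: 0, uncorrectable errors: 0, ..."
            for name, value in re.findall(r"(\w+) errors: (\d+)", line):
                result["totals"][name] = int(value)
    return result
```

A non-empty `details` dictionary alongside all-zero `totals`, as in this report, is exactly the combination that prompted the question.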
I booted back into the graphical system, and when not running Firefox,
I did not get any immediate lock-ups anymore.
I’d welcome any advice on how to proceed, i.e., in how to resolve the
lock-ups, and, if possible, in fixing potential problems with the
file-system.
Best regards,
Tobias
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Duncan @ 2015-03-06 0:10 UTC
To: linux-btrfs
Tobias Getzner posted on Thu, 05 Mar 2015 12:48:00 +0100 as excerpted:
> I booted back into the graphical system, and when not running Firefox, I
> did not get any immediate lock-ups anymore.
>
> I’d welcome any advice on how to proceed, i.e., in how to resolve the
> lock-ups, and, if possible, in fixing potential problems with the
> file-system.
I'll let a dev answer that side of things but a couple comments, for what
they are worth...
1) The firefox issue is likely related to the sqlite database files it
uses. Database random-rewrite-pattern files are always a challenge for
cow-based filesystems such as btrfs, tho with small ones like those
firefox typically uses, the btrfs autodefrag mount option can help. Be
aware, however, that it can trigger performance issues with larger
(typically half-gig plus) random-rewrite-pattern files such as VM images
and large databases. But if you're not running anything like that and
don't have autodefrag in your btrfs mount options, I'd suggest trying
it. If you /are/ running VMs and the like, it's worth doing a bit more
research on the topic, both on the btrfs wiki and in this list's
archives. There are workarounds, but they can get a bit complex...
Meanwhile, the problem file is likely in your firefox profile. You could
try starting with a clean firefox profile and see if the problem
disappears, and if so, bisect the profile to see what file it is and
delete it or restore it from backup.
2) Just noting, I'm running kernel 3.19 here without issues, but I run
multiple btrfs filesystems that are smaller (largest is <50 GiB) than
what some people run, and I'm on fast SSD, so I don't tend to see the
issues that people with TB-sized btrfs on spinning rust see. And I mount
with autodefrag...
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
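
Whether a btrfs filesystem is currently mounted with autodefrag can be checked from the mount table. Below is a minimal sketch, assuming the usual `/proc/mounts` layout (device, mount point, filesystem type, comma-separated options); the function name `btrfs_mounts_missing_autodefrag` is made up for illustration.

```python
def btrfs_mounts_missing_autodefrag(mounts_text):
    """Return mount points of btrfs filesystems mounted without autodefrag.

    mounts_text is expected to look like the contents of /proc/mounts:
    "<device> <mount point> <fs type> <options> <dump> <pass>" per line.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type != "btrfs":
            continue
        # Compare whole option names so a substring cannot match by accident.
        if "autodefrag" not in options.split(","):
            missing.append(mount_point)
    return missing
```

On a Linux system, the text read from `/proc/mounts` can be passed in directly. An empty result means every mounted btrfs filesystem already has the option set.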
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-09 12:45 UTC
To: Duncan; +Cc: linux-btrfs
On Fri, 2015-03-06 at 00:10 +0000, Duncan wrote:
> Tobias Getzner posted on Thu, 05 Mar 2015 12:48:00 +0100 as excerpted:
>
> > I booted back into the graphical system, and when not running Firefox, I
> > did not get any immediate lock-ups anymore.
> >
> > I’d welcome any advice on how to proceed, i.e., in how to resolve the
> > lock-ups, and, if possible, in fixing potential problems with the
> > file-system.
>
> I'll let a dev answer that side of things but a couple comments, for what
> they are worth...
>
> 1) The firefox issue is likely related to the sqlite database files it
> uses. Database random-rewrite-pattern files are always a challenge for
> cow-based filesystems such as btrfs, tho with small ones like those
> firefox typically uses, the btrfs autodefrag mount option can help.
Thanks, I usually mount with autodefrag as well. Also, I had the Firefox
sqlite databases set to NOCOW, which might or might not be involved in
triggering this bug.
> Meanwhile, the problem file is likely in your firefox profile. You could
> try starting with a clean firefox profile and see if the problem
> disappears, and if so, bisect the profile to see what file it is and
> delete it or restore it from backup.
I guessed the same, so I moved my old profile aside and made a fresh
copy (no reflinking) of the old one. This was indeed fruitful: the
machine no longer predictably locked up after starting Firefox.
However, after a while I figured I would «rm -r» the old (somehow
corrupted) profile folder, and this command then immediately froze the
machine.
To my dismay, when I rebooted, the lock-up would now trigger not only
when starting Firefox with the old profile; a soft lock-up would also
predictably trigger when launching zsh, most likely when it sourced some
rc file (the zsh binary itself is on another, uncorrupted partition).
Again, here are today’s kernel logs with some back-traces from btrfs.
While the logs I posted in the previous message were from kernels 3.19
and 3.18.6, these logs are from 3.19.1.
http://a.pomf.se/xmmpgw.xz
Since I need the machine for work, I decided to create a new btrfs FS on
a spare partition and copy over all the data (since the scrub had
indicated no problems with the data, except super=4). I noticed that
even just mounting the old partition would cause a «kernel bug at
ctree.h:2498». This is not in the logs because I had booted into a
rescue system to copy the files (3.16.3 kernel). The back-trace,
however, seems to be the same as the one that follows «kernel BUG at
fs/btrfs/ctree.h:2501» in my logs. Luckily, I could mount the old
partition read-only, and all the files could be rsynced to the new FS
just fine.
For now I still have the corrupted file-system lying around, so if some
additional information from there could be helpful in fixing this issue,
let me know. I won’t be able to keep it around for too long though,
since the spare partition I’m using now is a bit restricted in space.
Apart from the corruption issue as such, it would be helpful if the
btrfsck assertion failure I posted gave some more informative output
about what is happening.
Best regards,
Tobias
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Chris Murphy @ 2015-03-09 16:44 UTC
To: Btrfs BTRFS
Maybe try btrfs-progs-3.19rc3? You have the data backed up, and progs
3.18.2 fails so it seems worth a shot.
Chris Murphy
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-10 7:53 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS
On Mon, 2015-03-09 at 10:44 -0600, Chris Murphy wrote:
> Maybe try btrfs-progs-3.19rc3? You have the data backed up, and progs
> 3.18.2 fails so it seems worth a shot.
This was worth a shot; 3.19-rc3 throws the same assertion failure,
though.