* Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-05 11:48 UTC (permalink / raw)
To: linux-btrfs
Hello,
I had been running kernel 3.19 for a few days now, when yesterday I
started getting hard lock-ups all the time. These seemed to be
correlated to some program activity, most likely to running Firefox. I
downgraded back to 3.18.6, but the problem persisted, though now I only
got soft lock-ups which I could (after SysRQ+REISUB) inspect in dmesg.
dmesg info pointed me toward btrfs.
For what it’s worth, I uploaded the full kernel logs to:
http://a.pomf.se/zsywtk.xz
These might seem noteworthy:
> kernel BUG at fs/btrfs/ctree.h:2565!
> kernel BUG at fs/btrfs/ctree.h:2501!
> NMI watchdog: BUG: soft lockup + btrfs-related stack trace
> WARNING: CPU: 2 PID: 7051 at fs/btrfs/scrub.c:2461 scrub_stripe+0xcb8/0x1080 [btrfs]()
Since the lock-ups appeared predictably a few minutes after boot &
launching Firefox, I tried running btrfsck (without --repair) to see
whether something is wrong with the home volume (maybe due to the
unclean shutdowns from earlier lock-ups). However, when running
btrfsck, all I get is:
> checking extents
> cmds-check.c:4943: process_extent_item: Assertion `item_size != sizeof(*ei0)` failed.
> btrfs check[0x41a7dc]
> btrfs check[0x41d9af]
> btrfs check[0x423751]
> btrfs check[0x4241e9]
> btrfs check[0x424de1]
> btrfs check(cmd_check+0x14b5)[0x427f0b]
> btrfs check(main+0x15d)[0x40997d]
> /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f8ce834d800]
> btrfs check(_start+0x29)[0x409539]
Does this indicate any particular problems with the file system?
In addition, I ran a full scrub. This finished with «scrubbed … with 0
errors», but it also reported:
> error details: super=4
> corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
I booted back into the graphical system, and when not running Firefox,
I did not get any immediate lock-ups anymore.
I’d welcome any advice on how to proceed, i.e., in how to resolve the
lock-ups, and, if possible, in fixing potential problems with the
file-system.
Best regards,
Tobias
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Duncan @ 2015-03-06 0:10 UTC (permalink / raw)
To: linux-btrfs
Tobias Getzner posted on Thu, 05 Mar 2015 12:48:00 +0100 as excerpted:
> I booted back into the graphical system, and when not running Firefox, I
> did not get any immediate lock-ups anymore.
>
> I’d welcome any advice on how to proceed, i.e., in how to resolve the
> lock-ups, and, if possible, in fixing potential problems with the
> file-system.
I'll let a dev answer that side of things but a couple comments, for what
they are worth...
1) The firefox issue is likely related to the sqlite database files it
uses. Database random-rewrite-pattern files are always a challenge for
cow-based filesystems such as btrfs, tho with small ones like those
firefox typically uses, the btrfs autodefrag mount option can help. Be
aware, however, that it can trigger performance issues with larger
(typically half-gig plus) random-rewrite-pattern files such as VM images
and large databases. But if you're not running anything like that
and don't have autodefrag in your btrfs mount options, I'd suggest trying
it. If you /are/ running VMs and the like, it's worth doing a bit more
research on the topic both on the btrfs wiki, and on the backlist, here.
There are workarounds, but they can get a bit complex...
Meanwhile, the problem file is likely in your firefox profile. You could
try starting with a clean firefox profile and see if the problem
disappears, and if so, bisect the profile to see what file it is and
delete it or restore it from backup.
2) Just noting, I'm running kernel 3.19 here without issues, but my btrfs
filesystems are smaller than what some people run (I have several, the
largest <50 GiB), and I'm on fast SSD, so I don't tend to see the issues
that people with TB-sized btrfs on spinning rust see. And I mount with
autodefrag...
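As an illustrative sketch of the autodefrag check suggested above, the snippet below scans text in the format of /proc/mounts and lists btrfs mount points whose options do not include autodefrag. The function name and the sample mount lines are hypothetical, made up for illustration; only the autodefrag option name comes from the message.

```python
def btrfs_mounts_missing_autodefrag(mounts_text):
    """Return mount points of btrfs filesystems mounted without 'autodefrag'.

    mounts_text is expected to look like /proc/mounts:
    device, mount point, filesystem type, comma-separated options, ...
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type != "btrfs":
            continue
        if "autodefrag" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample data; on a real system, pass the contents of /proc/mounts.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sda2 /home btrfs rw,relatime,ssd,space_cache 0 0\n"
    "/dev/sda3 /data btrfs rw,relatime,autodefrag 0 0\n"
)

print(btrfs_mounts_missing_autodefrag(SAMPLE_MOUNTS))
```

On a live system the same function could be fed the text read from /proc/mounts; any mount point it reports would be a candidate for adding autodefrag to its mount options, subject to the caveat about large random-rewrite files.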
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-09 12:45 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Fri, 2015-03-06 at 00:10 +0000, Duncan wrote:
> Tobias Getzner posted on Thu, 05 Mar 2015 12:48:00 +0100 as excerpted:
>
> > I booted back into the graphical system, and when not running Firefox, I
> > did not get any immediate lock-ups anymore.
> >
> > I’d welcome any advice on how to proceed, i.e., in how to resolve the
> > lock-ups, and, if possible, in fixing potential problems with the
> > file-system.
>
> I'll let a dev answer that side of things but a couple comments, for what
> they are worth...
>
> 1) The firefox issue is likely related to the sqlite database files it
> uses. Database random-rewrite-pattern files are always a challenge for
> cow-based filesystems such as btrfs, tho with small ones like those
> firefox typically uses, the btrfs autodefrag mount option can help.
Thanks, I usually mount with autodefrag as well. Also, I had the
Firefox sqlite databases set to NOCOW, which might or might not be
involved in triggering this bug.
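For reference, a minimal sketch of how the NOCOW setting on a file can be confirmed: lsattr prints a flags field followed by the path, and the No_COW attribute shows up as the letter C in that field. The helper name and the sample lines below are hypothetical; the exact width of the flags field varies between e2fsprogs versions, so the sketch only looks for the letter.

```python
def has_nocow_flag(lsattr_line):
    """Return True if one line of lsattr output shows the 'C' (No_COW) flag."""
    # The flags field comes first, separated from the path by a space.
    flags, _sep, _path = lsattr_line.strip().partition(" ")
    return "C" in flags


# Hypothetical lsattr output lines, for illustration only.
print(has_nocow_flag("---------------C-- places.sqlite"))
print(has_nocow_flag("------------------ cookies.sqlite"))
```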
> Meanwhile, the problem file is likely in your firefox profile. You could
> try starting with a clean firefox profile and see if the problem
> disappears, and if so, bisect the profile to see what file it is and
> delete it or restore it from backup.
I guessed the same, so I moved my old profile aside and made a fresh
copy (no reflinking) of the old one. Indeed this was fruitful, because
the machine would no longer predictably lock up after starting Firefox.
However, after a while I figured I would «rm -r» the old (somehow
corrupted) profile folder, and this command then immediately froze the
machine. To my dismay, when I rebooted, the lock-up would now not only
trigger when starting Firefox with the old profile, but instead a
soft-lockup would predictably trigger when launching zsh, most likely
when it sourced some rc file (the zsh binary itself is on another,
uncorrupted partition).
Again, here are today’s kernel logs with some back-traces from btrfs.
While the logs I posted in the previous message were from kernels 3.19
and 3.18.6, these logs are from 3.19.1.
http://a.pomf.se/xmmpgw.xz
Since I need the machine for work, I decided to create a new btrfs FS
on a spare partition and copy over all the data (since the scrub had
indicated no problems with the data, except super=4). I noticed that
even just mounting the old partition would cause a «kernel bug at
ctree.h:2498». This is not in the logs because I had booted into a
rescue system to copy the files (3.16.3 kernel). The back-trace, however,
seems to be the same as the one that follows «kernel BUG at
fs/btrfs/ctree.h:2501» in my logs.
Luckily, I could mount the old partition read-only and all the files
could be rsynced to the new FS just fine.
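A rough sketch of the copy procedure described above, written as a helper that only builds the two command lines (read-only mount, then rsync) without running them. The device path and mount points are placeholders, and the plain -a archive flag is an assumption; the message does not say which rsync options were actually used.

```python
import shlex


def rescue_copy_commands(old_dev, old_mnt, new_mnt):
    """Build (but do not run) the commands for a read-only rescue copy."""
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    src = old_mnt.rstrip("/") + "/"
    dst = new_mnt.rstrip("/") + "/"
    return [
        ["mount", "-o", "ro", old_dev, old_mnt],  # mount the damaged FS read-only
        ["rsync", "-a", src, dst],                # archive-mode copy to the new FS
    ]


# Placeholder paths, for illustration only.
for cmd in rescue_copy_commands("/dev/sdX1", "/mnt/old", "/mnt/new"):
    print(shlex.join(cmd))
```

Printing the commands first rather than executing them leaves room to double-check the device name before anything touches the damaged filesystem.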
For now I still have the corrupted file-system lying around, so if some
additional information from there could be helpful in fixing this
issue, let me know. I won’t be able to keep it around for too long
though, since the spare partition I’m using now is a bit restricted in
space.
Apart from the corruption issue as such, it might be helpful if the
assertion failure in btrfsck I posted could give some informative
output as to what’s happening.
Best regards,
Tobias
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Chris Murphy @ 2015-03-09 16:44 UTC (permalink / raw)
To: Btrfs BTRFS
Maybe try btrfs-progs-3.19rc3? You have the data backed up, and progs
3.18.2 fails so it seems worth a shot.
Chris Murphy
* Re: Lock-ups, assertion failure in btrfsck, scrub reporting super=4
From: Tobias Getzner @ 2015-03-10 7:53 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Mon, 2015-03-09 at 10:44 -0600, Chris Murphy wrote:
> Maybe try btrfs-progs-3.19rc3? You have the data backed up, and progs
> 3.18.2 fails so it seems worth a shot.
This was worth a shot; 3.19-rc3 throws the same assertion failure,
though.