linux-btrfs.vger.kernel.org archive mirror
From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
Date: Wed, 16 Sep 2015 05:02:26 +0000 (UTC)
Message-ID: <pan$e885$1290ceef$c5017ead$b38037f1@cox.net>
In-Reply-To: 532aadf0f92d08d3d2b274173548aee1@all.all

Stéphane Lesimple posted on Tue, 15 Sep 2015 23:47:01 +0200 as excerpted:

> On 2015-09-15 16:56, Josef Bacik wrote:
>> On 09/15/2015 10:47 AM, Stéphane Lesimple wrote:
>>>> I've been experiencing repetitive "kernel BUG" occurrences in the past
>>>> few days trying to balance a raid5 filesystem after adding a new
>>>> drive.
>>>> It occurs on both 4.2.0 and 4.1.7, using 4.2 userspace tools.
>>> 
>>> I've run a scrub on this filesystem after the crash happened twice,
>>> and it found no errors.
>>> 
>>> The BUG_ON() condition that my filesystem triggers is the following :
>>> 
>>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
>>> // in insert_inline_extent_backref() of extent-tree.c.
>>> 
>> Does btrfsck complain at all?

Just to elucidate a bit...

Scrub is designed to detect exactly one kind of problem: corruption where 
the checksum stored when the data was written doesn't match the checksum 
computed when the data is read back from storage.  Where a good copy is 
available (a second copy in dup or raid1/10 modes, or reconstruction from 
parity in raid5/6 modes), it corrects the problem as well.

As such, it detects/corrects media errors and (perhaps more commonly) 
data corrupted by a crash in the middle of a write.  But if the data was 
already bad when it was written, the checksum simply covers the bad 
data.  Scrub will then be none the wiser and will happily validate it, 
since the checksum is entirely valid for data that was wrong before the 
checksum was ever created.
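
To make that distinction concrete, here's a minimal Python sketch.  It 
is only an illustration, not btrfs code: zlib.crc32 stands in for the 
crc32c that btrfs uses by default, and write_block/scrub_block are names 
made up for this example.

```python
import zlib

def write_block(data: bytes) -> dict:
    """Store data together with a checksum computed at write time."""
    return {"data": bytearray(data), "csum": zlib.crc32(data)}

def scrub_block(block: dict) -> bool:
    """Scrub-style check: does the stored checksum still match the data?"""
    return zlib.crc32(bytes(block["data"])) == block["csum"]

# Case 1: good data that gets damaged on the media after the write.
damaged = write_block(b"valid metadata")
damaged["data"][0] ^= 0xFF

# Case 2: data that was already logically wrong when it was written.
wrong = write_block(b"logically invalid metadata")

print(scrub_block(damaged))  # False: checksum mismatch, scrub notices
print(scrub_block(wrong))    # True: checksum matches, scrub sees nothing
```

The second case is the one a checksum pass can never flag, because the 
checksum was computed over the already-wrong contents.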

That is where btrfs check comes in, and why Josef asked you to run it: 
unlike scrub, check is designed to catch filesystem logic errors.

> Thanks for your suggestion.
> You're right, even if btrfs scrub didn't complain, btrfsck does :
> 
> checking extents
> bad metadata [4179166806016, 4179166822400) crossing stripe boundary
> bad metadata [4179166871552, 4179166887936) crossing stripe boundary
> bad metadata [4179166937088, 4179166953472) crossing stripe boundary

This bug is under active investigation at the moment.  I'm not a dev and 
can't tell you for sure that it's behind the specific balance-related 
crash and traces you posted (though I believe it is), but it certainly 
has the potential to be that serious, yes.
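
For what it's worth, the arithmetic behind the reported ranges can be 
sketched as below.  This assumes a 64 KiB stripe length, which isn't 
stated anywhere in this thread, and stripe_span is a name made up for 
illustration; it is not the btrfs check code.

```python
STRIPE_LEN = 64 * 1024  # ASSUMPTION: 64 KiB stripe length, not from the thread

def stripe_span(start: int, end: int, stripe_len: int = STRIPE_LEN):
    """Describe the half-open byte range [start, end) relative to stripes.

    Returns (offset_in_stripe, length, first_stripe, last_stripe), where
    last_stripe is the stripe holding the final byte, end - 1.
    """
    first_stripe = start // stripe_len
    last_stripe = (end - 1) // stripe_len
    return start % stripe_len, end - start, first_stripe, last_stripe

# The three ranges from the btrfsck output quoted above.
reported = [
    (4179166806016, 4179166822400),
    (4179166871552, 4179166887936),
    (4179166937088, 4179166953472),
]

for start, end in reported:
    offset, length, first, last = stripe_span(start, end)
    print(f"[{start}, {end}): offset {offset}, length {length}, "
          f"stripes {first}..{last}")
```

Under that assumption, each reported extent is 16 KiB long and starts 
48 KiB into a stripe, so it ends exactly on the next 64 KiB boundary.  
Whether that counts as crossing depends on how the end of the half-open 
range is treated.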

The most common cause is a buggy btrfs-convert that, at one point, was 
creating invalid btrfs filesystems when converting from ext*.  AFAIK 
they've hotfixed the immediate convert issue, but are still actively 
working on a longer-term proper fix.  Meanwhile, although btrfs check 
does now detect the issue (and even that is quite new code, added in 4.2 
I believe), there's still no real fix for what was, after all, a 
defective btrfs from the moment the convert was done.

So where that's the cause, that is, where the filesystem was created 
from an ext* fs using a buggy btrfs-convert and is thus actually invalid 
due to this cross-stripe metadata, the current fix is to back up the 
files you want to keep, then blow away and recreate the filesystem 
properly using mkfs.btrfs, and of course then restore to the new 
filesystem.  (FWIW, as any good sysadmin will tell you, a backup that 
hasn't been tested restorable isn't yet a backup, as the job isn't 
complete.)

If, however, you created the filesystem using mkfs.btrfs, then the 
problem must have occurred some other way.  Whether there's any cause 
beyond the known one, a buggy btrfs-convert, has in fact been an open 
question, so the devs are likely to be quite interested in your case 
and in the filesystem history that brought you to this point.  The 
ultimate fix is likely to be the same (unless the devs have you test new 
fix code for btrfs check --repair), but I'd strongly urge you to delay 
blowing away the filesystem, if possible, until the devs have had a 
chance to ask you to run other diagnostics and perhaps even to get a 
btrfs-image from you, since you may well have accidentally found a 
corner case they'll have trouble reproducing without your information.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

