From: "Stéphane Lesimple" <stephane_btrfs@lesimple.fr>
To: Josef Bacik <jbacik@fb.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
Date: Tue, 15 Sep 2015 23:47:01 +0200
Message-ID: <532aadf0f92d08d3d2b274173548aee1@all.all>
In-Reply-To: <55F83181.9010201@fb.com>
On 2015-09-15 16:56, Josef Bacik wrote:
> On 09/15/2015 10:47 AM, Stéphane Lesimple wrote:
>>> I've been experiencing repetitive "kernel BUG" occurrences in the past
>>> few days trying to balance a raid5 filesystem after adding a new
>>> drive.
>>> It occurs on both 4.2.0 and 4.1.7, using 4.2 userspace tools.
>>
>> I've run a scrub on this filesystem after the crash happened twice,
>> and it found no errors.
>>
>> The BUG_ON() condition that my filesystem triggers is the following :
>>
>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
>> // in insert_inline_extent_backref() of extent-tree.c.
>>
>> I've compiled a fresh 4.3.0-rc1 with a couple of added printk's just
>> before the BUG_ON(), to dump the parameters passed to
>> insert_inline_extent_backref() when the problem occurs.
>> Here is an excerpt of the resulting dmesg:
>>
>> {btrfs} in insert_inline_extent_backref, got owner <
>> BTRFS_FIRST_FREE_OBJECTID
>> {btrfs} with bytenr=4557830635520 num_bytes=16384 parent=4558111506432
>> root_objectid=3339 owner=1 offset=0 refs_to_add=1
>> BTRFS_FIRST_FREE_OBJECTID=256
>> ------------[ cut here ]------------
>> kernel BUG at fs/btrfs/extent-tree.c:1837!
>>
>> I'll retry with the exact same kernel once I get the machine back up,
>> and see if the bug happens again at the same filesystem spot or a
>> different one.
>> The time that elapses between starting a balance and hitting the bug
>> varies, which suggests that it would be a different one.
>>
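For reference, the condition quoted above can be replayed in userspace against the values from the dmesg excerpt. This is only a minimal sketch: the struct and the helper name are hypothetical, introduced here for illustration, and it is not the kernel's code. The constant 256 is the value printed by the added printk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value printed by the added printk in the dmesg excerpt. */
#define BTRFS_FIRST_FREE_OBJECTID 256ULL

/* Hypothetical container for the dumped parameters (illustration only). */
struct backref_args {
    uint64_t bytenr;
    uint64_t num_bytes;
    uint64_t parent;
    uint64_t root_objectid;
    uint64_t owner;
    uint64_t offset;
    int refs_to_add;
};

/* Same comparison as the quoted BUG_ON(): true means the BUG would fire. */
static bool would_trigger_bug(const struct backref_args *a)
{
    return a->owner < BTRFS_FIRST_FREE_OBJECTID;
}

/* Values copied verbatim from the dmesg excerpt. */
static const struct backref_args dumped = {
    .bytenr = 4557830635520ULL,
    .num_bytes = 16384ULL,
    .parent = 4558111506432ULL,
    .root_objectid = 3339ULL,
    .owner = 1ULL,
    .offset = 0ULL,
    .refs_to_add = 1,
};
```

With owner=1 the comparison is true, which matches the BUG reported in the log; any owner of 256 or more would pass the check.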
>
> Does btrfsck complain at all?
Thanks for your suggestion.
You're right: even though btrfs scrub didn't complain, btrfsck does:
checking extents
bad metadata [4179166806016, 4179166822400) crossing stripe boundary
bad metadata [4179166871552, 4179166887936) crossing stripe boundary
bad metadata [4179166937088, 4179166953472) crossing stripe boundary
[... some more ...]
extent buffer leak: start 4561066901504 len 16384
extent buffer leak: start 4561078812672 len 16384
extent buffer leak: start 4561078861824 len 16384
[... some more ...]
followed by some complaints about mismatched qgroup counts.
I can see from the btrfsck source code that --repair will not work
here, so I didn't try it.
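The "crossing stripe boundary" ranges listed above can be sanity-checked with plain arithmetic. The sketch below assumes a 64 KiB stripe length, which is not stated anywhere in this message, and uses hypothetical helper names; it is not btrfsck's actual code. Under that assumption, each of the three listed ranges starts 48 KiB into a stripe and ends exactly on the next 64 KiB boundary, so whether it counts as "crossing" depends on whether the exclusive end or the last byte of the range is compared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64 KiB stripe length (not given in this message). */
#define ASSUMED_STRIPE_LEN (64ULL * 1024ULL)

/* Offset of a byte address inside its (assumed) stripe. */
static uint64_t offset_in_stripe(uint64_t bytenr)
{
    return bytenr % ASSUMED_STRIPE_LEN;
}

/* Compares the stripe of the first byte with the stripe of the LAST byte
 * of the half-open range [start, end). Requires end > start. */
static bool last_byte_in_other_stripe(uint64_t start, uint64_t end)
{
    return (start / ASSUMED_STRIPE_LEN) != ((end - 1) / ASSUMED_STRIPE_LEN);
}

/* Compares the stripe of the first byte with the stripe of the EXCLUSIVE
 * end, which also flags ranges that merely end on a boundary. */
static bool exclusive_end_in_other_stripe(uint64_t start, uint64_t end)
{
    return (start / ASSUMED_STRIPE_LEN) != (end / ASSUMED_STRIPE_LEN);
}
```

For the first reported range, [4179166806016, 4179166822400), the start offset is 49152 and the two helpers disagree: the last byte stays in the same stripe, while the exclusive end lands in the next one.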
I'm not sure if those errors would be a cause or a consequence of the
bug. As the filesystem was only a few days old and as there was always a
balance running during the crashes, I would be tempted to think it might
actually be a consequence, but I can't be sure.
In your experience, could these inconsistencies cause the crash?
If you think so, then I'll btrfs dev del the 3rd device, remount the
array degraded with just 1 disk, create a new btrfs filesystem from
scratch on the second, copy the data in single redundancy, then
re-add the 2 disks and balance-convert to raid5.
If you think not, then this array could still help you debug a corner
case, and I can keep it that way for a couple days if more testing/debug
is needed.
Thanks,
--
Stéphane