linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "Stéphane Lesimple" <stephane_btrfs@lesimple.fr>,
	"Qu Wenruo" <quwenruo@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
Date: Wed, 23 Sep 2015 18:13:49 +0800
Message-ID: <56027B5D.1070404@gmx.com>
In-Reply-To: <eb465f0cf1fe18eee3f0a0627c2ec0e2@all.all>



On 2015-09-23 17:40, Stéphane Lesimple wrote:
> On 2015-09-23 09:03, Qu Wenruo wrote:
>> Stéphane Lesimple wrote on 2015/09/22 16:31 +0200:
>>> On 2015-09-22 10:51, Qu Wenruo wrote:
>>>>>>>> [92098.842261] Call Trace:
>>>>>>>> [92098.842277]  [<ffffffffc035a5d8>] ? read_extent_buffer+0xb8/0x110 [btrfs]
>>>>>>>> [92098.842304]  [<ffffffffc0396d00>] ? btrfs_find_all_roots+0x60/0x70 [btrfs]
>>>>>>>> [92098.842329]  [<ffffffffc039af3d>] btrfs_qgroup_rescan_worker+0x28d/0x5a0 [btrfs]
>>>>>>>
>>>>>>> Would you please show the code of it?
>>>>>>> This one seems to be another stupid bug I made when rewriting the
>>>>>>> framework.
>>>>>>> Maybe I forgot to reinit some variables or I'm corrupting memory...
>>>>>>
>>>>>> (gdb) list *(btrfs_qgroup_rescan_worker+0x28d)
>>>>>> 0x97f6d is in btrfs_qgroup_rescan_worker (fs/btrfs/ctree.h:2760).
>>>>>> 2755
>>>>>> 2756    static inline void btrfs_disk_key_to_cpu(struct btrfs_key *cpu,
>>>>>> 2757                                             struct btrfs_disk_key *disk)
>>>>>> 2758    {
>>>>>> 2759            cpu->offset = le64_to_cpu(disk->offset);
>>>>>> 2760            cpu->type = disk->type;
>>>>>> 2761            cpu->objectid = le64_to_cpu(disk->objectid);
>>>>>> 2762    }
>>>>>> 2763
>>>>>> 2764    static inline void btrfs_cpu_key_to_disk(struct btrfs_disk_key *disk,
>>>>>> (gdb)
>>>>>>
>>>>>>
>>>>>> Does it make sense?
>>>>> So it seems that the memory of cpu key is being screwed up...
>>>>>
>>>>> That code is a very thin inline function, so what about the rest of
>>>>> the stack?
>>>>> Like btrfs_qgroup_rescan_helper+0x12?
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>> Oh, I forgot that you can just change the offset in
>>>> btrfs_qgroup_rescan_worker+0x28d to a smaller value.
>>>> Try +0x280 for example, which steps back 13 bytes of machine code,
>>>> which may land outside the inline function's range and give you a
>>>> good hint.
>>>>
>>>> Or gdb may have a better mode for inline functions, but I don't know...
>>>
>>> Actually, "list -" is our friend here (it shows the 10 lines before
>>> the last source output)
>> No, that's not the case.
>>
>> "list -" will only show the surrounding lines of the same source file.
>>
>> What I need is the caller higher up the stack.
>> When debugging a running program, it's quite easy to just use the frame
>> command.
>>
>> But in this situation we don't have a call stack, so I'd like to move
>> the +0x28d back by several bytes, until we leave the inlined function
>> and see meaningful code.
>
> Ah, you're right.
> I had a hard time finding a value where I wouldn't end up in another inline
> function or entirely somewhere else in the kernel code, but here it is :
>
> (gdb) list *(btrfs_qgroup_rescan_worker+0x26e)
> 0x97f4e is in btrfs_qgroup_rescan_worker (fs/btrfs/qgroup.c:2237).
> 2232            memcpy(scratch_leaf, path->nodes[0], sizeof(*scratch_leaf));
> 2233            slot = path->slots[0];
> 2234            btrfs_release_path(path);
> 2235            mutex_unlock(&fs_info->qgroup_rescan_lock);
> 2236
> 2237            for (; slot < btrfs_header_nritems(scratch_leaf); ++slot) {
> 2238                    btrfs_item_key_to_cpu(scratch_leaf, &found, slot); <== here
> 2239                    if (found.type != BTRFS_EXTENT_ITEM_KEY &&
> 2240                        found.type != BTRFS_METADATA_ITEM_KEY)
> 2241                            continue;
>
> the btrfs_item_key_to_cpu() inline func calls 2 other inline funcs:
>
> static inline void btrfs_item_key_to_cpu(struct extent_buffer *eb,
>                                    struct btrfs_key *key, int nr)
> {
>          struct btrfs_disk_key disk_key;
>          btrfs_item_key(eb, &disk_key, nr);
>          btrfs_disk_key_to_cpu(key, &disk_key); <== this is 0x28d
> }
>
> btrfs_disk_key_to_cpu() is the inline referenced by 0x28d and this is where
> the GPF happens.

Thanks, now things are much clearer.
I'm not completely sure, but scratch_leaf seems to be invalid and to cause
the bug.
(found is in stack memory, so I don't think it's the cause.)
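
To illustrate why the destination key is an unlikely culprit: the
conversion at the faulting line only reads a small on-disk key and writes
three fields of a key that lives on the caller's stack. The following is a
minimal userspace sketch of that conversion, not the kernel code itself.
All toy_* names are hypothetical, and the byte-wise decoding stands in for
the kernel's le64_to_cpu().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct btrfs_disk_key: packed, with
 * little-endian integers stored as raw bytes. */
struct toy_disk_key {
	uint8_t objectid[8];
	uint8_t type;
	uint8_t offset[8];
};

/* Hypothetical stand-in for struct btrfs_key (CPU byte order). */
struct toy_cpu_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Decode 8 little-endian bytes into a host-order 64-bit value. */
static uint64_t toy_le64_to_cpu(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Same three assignments as the inline function in the gdb listing. */
static void toy_disk_key_to_cpu(struct toy_cpu_key *cpu,
				const struct toy_disk_key *disk)
{
	cpu->offset = toy_le64_to_cpu(disk->offset);
	cpu->type = disk->type;
	cpu->objectid = toy_le64_to_cpu(disk->objectid);
}
```

Nothing here dereferences anything beyond the two key structures, which is
consistent with suspecting the buffer the disk key was read from rather
than the conversion itself.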

But this is less related to the qgroup rework, as that is existing code.

A quick glance already shows a dirty and maybe deadly hack: copying the
whole extent buffer, which includes pages and all kinds of locks.
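
The hazard of copying such a structure wholesale can be sketched in plain
userspace C. This is only an illustration under stated assumptions:
struct toy_buffer and the toy_* helpers are hypothetical and far simpler
than the real struct extent_buffer. The point is that a memcpy() of the
struct duplicates the pointers but not the memory behind them, so the copy
keeps referring to memory it does not own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an extent buffer: a small header plus a
 * pointer to separately owned backing memory. */
struct toy_buffer {
	unsigned long start;
	unsigned char *data;	/* owned by whoever set up the original */
};

/* Shallow copy: the same kind of struct-sized memcpy() as in the
 * rescan worker. The data pointer is copied, the data is not. */
static void toy_shallow_copy(struct toy_buffer *dst,
			     const struct toy_buffer *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Deep copy: duplicate the backing memory as well.
 * Returns 0 on success, -1 if allocation fails. */
static int toy_deep_copy(struct toy_buffer *dst,
			 const struct toy_buffer *src, size_t len)
{
	unsigned char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, src->data, len);
	dst->start = src->start;
	dst->data = copy;
	return 0;
}

/* Returns 1 if both buffers point at the same backing memory. */
static int toy_shares_backing(const struct toy_buffer *a,
			      const struct toy_buffer *b)
{
	return a->data == b->data;
}
```

After toy_shallow_copy(), releasing the original leaves the copy with a
dangling data pointer; after toy_deep_copy(), the copy stays valid on its
own. Whether this is what actually goes wrong in the rescan worker is
still unconfirmed.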

I'm not 100% sure that's the problem, but I'll create a patch for you to
test in the next few days.

>
>
>> BTW, did you try the following patch?
>> https://patchwork.kernel.org/patch/7114321/
>> btrfs: qgroup: exit the rescan worker during umount
>>
>> The problem seems somewhat related to the bug you encountered, so I'd
>> recommend giving it a try.
>
> Not yet, but I've come across this bug too during my tests: starting a
> rescan and unmounting gets you a crash. I didn't mention it because I was
> sure this was an already known bug. Nice to see it has been fixed though!
> I'll certainly give it a try, but I'm not really sure it'll fix the
> specific bug we're talking about.
> However, the group of patches posted by Mark should fix the qgroup count
> discrepancies as I understand it, right? It might be of interest to try
> them all at once.

Yes, his patches should fix the qgroup count mismatch problem for
subvolume removal.

If I read the code correctly, after remove and sync, the accounting
numbers for the qgroup of the deleted subvolume should be:
rfer = 0 and excl = 0.

Thanks,
Qu
>
> Thanks,
>
