From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mout.gmx.net ([212.227.17.22]:58060 "EHLO mout.gmx.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754036AbbIWKOC (ORCPT ); Wed, 23 Sep 2015 06:14:02 -0400
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
To: Stéphane Lesimple, Qu Wenruo
References: <9c864637fe7676a8b7badc5ddd7a4e0c@all.all>
	<55FA2D9A.1060405@cn.fujitsu.com>
	<55FA60C5.5090002@cn.fujitsu.com>
	<7a6f2d794fb6cbf7d598b92e3470201c@all.all>
	<55FA759E.6030707@cn.fujitsu.com>
	<3386a8bfa1a5796460306a53a668e47e@all.all>
	<55FA98D8.5010301@gmx.com>
	<53a5553a9c5301789e246144bb264e43@all.all>
	<55FB61E9.4000300@cn.fujitsu.com>
	<2ce9b35f73732b145e0f80b18f230a52@all.all>
	<762ec73d5389b5057be4d3c17f74e1f9@all.all>
	<55FE0A50.9060607@gmx.com>
	<3ba27cf5afd82cf4e3bde718386b7cc3@all.all>
	<55FE8FB6.4070509@gmx.com>
	<72b4368e7180a4d703ef3ea1112d7358@all.all>
	<4749d42363070fcd228af172781750df@all.all>
	<5600B0BF.604@cn.fujitsu.com>
	<0a4be8fab4876a245900e4833e8139e0@all.all>
	<560113EF.2090209@gmx.com>
	<5601169B.4060600@gmx.com>
	<69919cd93d3942e8210fb45a53fc42ac@all.all>
	<56024EDA.5090406@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org
From: Qu Wenruo
Message-ID: <56027B5D.1070404@gmx.com>
Date: Wed, 23 Sep 2015 18:13:49 +0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 2015-09-23 17:40, Stéphane Lesimple wrote:
> On 2015-09-23 09:03, Qu Wenruo wrote:
>> Stéphane Lesimple wrote on 2015/09/22 16:31 +0200:
>>> On 2015-09-22 10:51, Qu Wenruo wrote:
>>>>>>>> [92098.842261] Call Trace:
>>>>>>>> [92098.842277]  [] ? read_extent_buffer+0xb8/0x110 [btrfs]
>>>>>>>> [92098.842304]  [] ? btrfs_find_all_roots+0x60/0x70 [btrfs]
>>>>>>>> [92098.842329]  [] btrfs_qgroup_rescan_worker+0x28d/0x5a0 [btrfs]
>>>>>>>
>>>>>>> Would you please show the code of it?
>>>>>>> This one seems to be another stupid bug I made when rewriting the
>>>>>>> framework.
>>>>>>> Maybe I forgot to reinit some variables or I'm corrupting memory...
>>>>>>
>>>>>> (gdb) list *(btrfs_qgroup_rescan_worker+0x28d)
>>>>>> 0x97f6d is in btrfs_qgroup_rescan_worker (fs/btrfs/ctree.h:2760).
>>>>>> 2755
>>>>>> 2756    static inline void btrfs_disk_key_to_cpu(struct btrfs_key *cpu,
>>>>>> 2757                                             struct btrfs_disk_key *disk)
>>>>>> 2758    {
>>>>>> 2759            cpu->offset = le64_to_cpu(disk->offset);
>>>>>> 2760            cpu->type = disk->type;
>>>>>> 2761            cpu->objectid = le64_to_cpu(disk->objectid);
>>>>>> 2762    }
>>>>>> 2763
>>>>>> 2764    static inline void btrfs_cpu_key_to_disk(struct btrfs_disk_key *disk,
>>>>>> (gdb)
>>>>>>
>>>>>>
>>>>>> Does it make sense?
>>>>> So it seems that the memory of the cpu key is being corrupted...
>>>>>
>>>>> That code is just a thin inline function, so what about the other
>>>>> stack entries?
>>>>> Like btrfs_qgroup_rescan_helper+0x12?
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>> Oh, I forgot that you can just change the offset in
>>>> btrfs_qgroup_rescan_worker+0x28d to a smaller value.
>>>> Try +0x280 for example, which steps back 13 bytes of asm code
>>>> (0x28d - 0x280 = 0xd). That may land outside the inline function's
>>>> range and may give you a good hint.
>>>>
>>>> Or gdb may have a better mode for inline functions, but I don't know...
>>>
>>> Actually, "list -" is our friend here (it shows the 10 lines before the
>>> last src output)
>> No, that's not the case.
>>
>> "list -" will only show lines around the source code.
>>
>> What I need is to get the higher caller stack.
>> If debugging a running program, it's quite easy to just use the frame
>> command.
>>
>> But in this situation, we don't have a call stack, so I'd like to move
>> the +0x28d several bytes backward, until we jump out of the inline
>> function call and see the meaningful code.
>
> Ah, you're right.
> I had a hard time finding a value where I wouldn't end up in another inline
> function or somewhere else entirely in the kernel code, but here it is:
>
> (gdb) list *(btrfs_qgroup_rescan_worker+0x26e)
> 0x97f4e is in btrfs_qgroup_rescan_worker (fs/btrfs/qgroup.c:2237).
> 2232            memcpy(scratch_leaf, path->nodes[0], sizeof(*scratch_leaf));
> 2233            slot = path->slots[0];
> 2234            btrfs_release_path(path);
> 2235            mutex_unlock(&fs_info->qgroup_rescan_lock);
> 2236
> 2237            for (; slot < btrfs_header_nritems(scratch_leaf); ++slot) {
> 2238                    btrfs_item_key_to_cpu(scratch_leaf, &found, slot); <== here
> 2239                    if (found.type != BTRFS_EXTENT_ITEM_KEY &&
> 2240                        found.type != BTRFS_METADATA_ITEM_KEY)
> 2241                            continue;
>
> The btrfs_item_key_to_cpu() inline func calls 2 other inline funcs:
>
> static inline void btrfs_item_key_to_cpu(struct extent_buffer *eb,
>                                          struct btrfs_key *key, int nr)
> {
>         struct btrfs_disk_key disk_key;
>         btrfs_item_key(eb, &disk_key, nr);
>         btrfs_disk_key_to_cpu(key, &disk_key); <== this is 0x28d
> }
>
> btrfs_disk_key_to_cpu() is the inline referenced by 0x28d and this is where
> the GPF happens.

Thanks, now things are much clearer.

I'm not completely sure, but scratch_leaf seems to be invalid and to cause
the bug.
(found is in stack memory, so I don't think it's the cause.)

This is less related to the qgroup rework, though, as that's existing code.
A quick glance already shows some dirty and maybe deadly hacks, like copying
the whole extent buffer, which includes pages and all kinds of locks.

I'm not 100% sure that's the problem, but I'll create a patch for you to
test in the next few days.

>
>
>> BTW, did you try the following patch?
>> https://patchwork.kernel.org/patch/7114321/
>> btrfs: qgroup: exit the rescan worker during umount
>>
>> The problem seems a little related to the bug you encountered, so I'd
>> recommend giving it a try.
>
> Not yet, but I've come across this bug too during my tests: starting a
> rescan and umounting gets you a crash.
> I didn't mention it because I was sure this
> was an already known bug. Nice to see it has been fixed though!
> I'll certainly give it a try but I'm not really sure it'll fix the specific
> bug we're talking about.
> However the group of patches posted by Mark should fix the qgroup count
> discrepancies as I understand it, right? It might be of interest to try
> them all at once for sure.

Yes, his patches should fix the qgroup count mismatch problem for subvolume
removal.

If I read the code correctly, after remove and sync, the accounting numbers
for the qgroup of the deleted subvolume should be:
rfer = 0 and excl = 0.

Thanks,
Qu

>
> Thanks,
>
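
To illustrate the concern about memcpy()ing a whole extent buffer, here is
a minimal userspace sketch. It is not the btrfs code: struct toy_buffer,
toy_shallow_copy() and toy_deep_copy() are hypothetical names invented for
this example, and whether this pattern is the actual cause of the GPF is
still unconfirmed. It only shows that copying a structure that holds
pointers duplicates the pointers, not the memory behind them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object: a small header plus a
 * pointer to backing memory that lives outside the struct itself. */
struct toy_buffer {
	unsigned int nritems;   /* number of valid bytes in data[] */
	unsigned char *data;    /* backing storage, not owned by value */
};

/* Shallow copy: copies only the struct bytes, so dst->data ends up
 * pointing at the very same backing memory as src->data. If the
 * original's storage is later released or reused, dst dangles. */
static void toy_shallow_copy(struct toy_buffer *dst,
			     const struct toy_buffer *src)
{
	assert(dst && src);
	memcpy(dst, src, sizeof(*dst));
}

/* Deep copy: duplicates the backing storage, so dst stays valid
 * independently of src. Assumes src->data points to at least
 * src->nritems bytes. Returns 0 on success, -1 on allocation failure.
 * The caller frees dst->data. */
static int toy_deep_copy(struct toy_buffer *dst,
			 const struct toy_buffer *src)
{
	assert(dst && src && src->data);
	dst->data = malloc(src->nritems ? src->nritems : 1);
	if (!dst->data)
		return -1;
	memcpy(dst->data, src->data, src->nritems);
	dst->nritems = src->nritems;
	return 0;
}
```

After toy_shallow_copy(), a later change to (or release of) the original's
backing memory is visible through the copy as well, while a copy made with
toy_deep_copy() keeps its own snapshot of the data.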