From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mout.gmx.net ([212.227.15.18]:62689 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752873AbbIVIvs (ORCPT ); Tue, 22 Sep 2015 04:51:48 -0400 Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance To: =?UTF-8?Q?St=c3=a9phane_Lesimple?= , Qu Wenruo References: <9c864637fe7676a8b7badc5ddd7a4e0c@all.all> <55F9486F.4040302@googlemail.com> <0973de930ee87e102c533c719807b748@all.all> <55FA2D9A.1060405@cn.fujitsu.com> <55FA60C5.5090002@cn.fujitsu.com> <7a6f2d794fb6cbf7d598b92e3470201c@all.all> <55FA759E.6030707@cn.fujitsu.com> <3386a8bfa1a5796460306a53a668e47e@all.all> <55FA98D8.5010301@gmx.com> <53a5553a9c5301789e246144bb264e43@all.all> <55FB61E9.4000300@cn.fujitsu.com> <2ce9b35f73732b145e0f80b18f230a52@all.all> <762ec73d5389b5057be4d3c17f74e1f9@all.all> <55FE0A50.9060607@gmx.com> <3ba27cf5afd82cf4e3bde718386b7cc3@all.all> <55FE8FB6.4070509@gmx.com> <72b4368e7180a4d703ef3ea1112d7358@all.all> <4749d42363070fcd228af172781750df@all.all> <5600B0BF.604@cn.fujitsu.com> <0a4be8fab4876a245900e4833e8139e0@all.all> <560113EF.2090209@gmx.com> Cc: linux-btrfs@vger.kernel.org From: Qu Wenruo Message-ID: <5601169B.4060600@gmx.com> Date: Tue, 22 Sep 2015 16:51:39 +0800 MIME-Version: 1.0 In-Reply-To: <560113EF.2090209@gmx.com> Content-Type: text/plain; charset=utf-8; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: 在 2015年09月22日 16:40, Qu Wenruo 写道: > > > 在 2015年09月22日 15:34, Stéphane Lesimple 写道: >> Le 2015-09-22 03:37, Qu Wenruo a écrit : >>> Stéphane Lesimple wrote on 2015/09/22 03:30 +0200: >>>> Le 2015-09-20 13:14, Stéphane Lesimple a écrit : >>>>> Le 2015-09-20 12:51, Qu Wenruo a écrit : >>>>>>>> Would you please use gdb to show the codes of >>>>>>>> "btrfs_qgroup_rescan_worker+0x388" ? >>>>>>>> (Need kernel debuginfo) >>>>>>>> >>>>>>>> My guess is the following line:(pretty sure, but not 100% sure) >>>>>>>> ------ >>>>>>>> /* >>>>>>>> * only update status, since the previous part has alreay >>>>>>>> updated the >>>>>>>> * qgroup info. >>>>>>>> */ >>>>>>>> trans =trfs_start_transaction(fs_info->quota_root, 1); >>>>>>>> <<<<< >>>>>>>> if (IS_ERR(trans)) { >>>>>>>> err =TR_ERR(trans); >>>>>>>> btrfs_err(fs_info, >>>>>>>> "fail to start transaction for status >>>>>>>> update: %d\n", >>>>>>>> err); >>>>>>>> goto done; >>>>>>>> } >>>>>>>> ------ >>>>>>> >>>>>>> The kernel and modules were already compiled with debuginfo. >>>>>>> However for some reason, I couldn't get gdb disassembly of >>>>>>> /proc/kcore >>>>>>> properly >>>>>>> aligned with the source I compiled: the asm code doesn't match the C >>>>>>> code shown >>>>>>> by gdb. In any case, watching the source of this function, this is >>>>>>> the >>>>>>> only place >>>>>>> btrfs_start_transaction is called, so we can be 100% sure it's where >>>>>>> the >>>>>>> crash >>>>>>> happens indeed. >>>>>> >>>>>> Yep, that's the only caller. >>>>>> >>>>>> Here is some useful small hint to locate the code, if you are >>>>>> interestied in kernel development. >>>>>> >>>>>> # Not sure about whether ubuntu gzipped modules, at least Arch does >>>>>> # compress it >>>>>> $ cp /kernel/fs/btrfs/btrfs.ko.gz /tmp/ >>>>>> $ gunzip /tmp/btrfs.ko.gz >>>>>> $ gdb /tmp/btrfs.ko >>>>>> # Make sure gdb read all the needed debuginfo >>>>>> $ gdb list *(btrfs_qgroup_rescan_worker+0x388) >>>>>> >>>>>> And gdb will find the code position for you. >>>>>> Quite easy one, only backtrace info is needed. >>>>> >>>>> Ah, thanks for the tips, I was loading whole vmlinux and using >>>>> /proc/kcore >>>>> as the core info, then adding the module with "add-symbol-file". >>>>> But as >>>>> we're just looking for the code and not the variables, it was indeed >>>>> completely overkill. >>>>> >>>>> (gdb) list *(btrfs_qgroup_rescan_worker+0x388) >>>>> 0x98068 is in btrfs_qgroup_rescan_worker (fs/btrfs/qgroup.c:2328). >>>>> 2323 >>>>> 2324 /* >>>>> 2325 * only update status, since the previous part has >>>>> alreay updated the >>>>> 2326 * qgroup info. >>>>> 2327 */ >>>>> 2328 trans =trfs_start_transaction(fs_info->quota_root, >>>>> 1); >>>>> 2329 if (IS_ERR(trans)) { >>>>> 2330 err =TR_ERR(trans); >>>>> 2331 btrfs_err(fs_info, >>>>> 2332 "fail to start transaction for >>>>> status update: %d\n", >>>>> >>>>> So this just confirms what we were already 99% sure of. >>>>> >>>>>> Another hint is about how to collect the kernel crash info. >>>>>> Your netconsole setup would be definitely one good practice. >>>>>> >>>>>> Another one I use to collect crash info is kdump. >>>>>> Ubuntu should have a good wiki on it. >>>>> >>>>> I've already come across kdump a few times, but never really look into >>>>> it. >>>>> To debug the other complicated extend backref bug, it could be of some >>>>> use. >>>>> >>>>>>>>>> So, as a quick summary of this big thread, it seems I've been >>>>>>>>>> hitting >>>>>>>>>> 3 bugs, all reproductible : >>>>>>>>>> - kernel BUG on balance (this original thread) >>>>>>>> >>>>>>>> For this, I can't provide much help, as extent backref bug is quite >>>>>>>> hard to debug, unless a developer is interested in it and find a >>>>>>>> stable way to reproduce it. >>>>>>> >>>>>>> Yes, unfortunately as it looks so much like a race condition, I know >>>>>>> I can >>>>>>> reproduce it with my worflow, but it can take between 1 minute >>>>>>> and 12 >>>>>>> hours, >>>>>>> so I wouldn't call it a "stable way" to reproduce it >>>>>>> unfortunately :( >>>>>>> >>>>>>> Still if any dev is interested in it, I can reproduce it, with a >>>>>>> patched >>>>>>> kernel if needed. >>>>>> >>>>>> Maybe you are already doing it, you can only compile the btrfs >>>>>> modules, which will be far more faster than compile the whole kernel, >>>>>> if and only if the compiled module can be loaded. >>>>> >>>>> Yes, I've compiled this 4.3.0-rc1 in a completely modular form, so >>>>> I'll try to >>>>> load the modified module and see if the running kernel accepts it. I >>>>> have to rmmod >>>>> the loaded module first, hence umounting any btrfs fs before that. >>>>> Should be able >>>>> to do it in a couple hours. >>>>> >>>>> I'll delete again all my snapshots and run my script. Should be easy >>>>> to trigger >>>>> the (hopefully worked-around) bug again. >>>> >>>> Well, I didn't trigger this exact bug, but another one, not less severe >>>> though, as it also crashed the system: >>>> >>>> [92098.841309] general protection fault: 0000 [#1] SMP >>>> [92098.841338] Modules linked in: ... >>>> [92098.841814] CPU: 1 PID: 24655 Comm: kworker/u4:12 Not tainted >>>> 4.3.0-rc1 #1 >>>> [92098.841834] Hardware name: ASUS All Series/H87I-PLUS, BIOS 1005 >>>> 01/06/2014 >>>> [92098.841868] Workqueue: btrfs-qgroup-rescan >>>> btrfs_qgroup_rescan_helper >>>> [btrfs] >>>> [92098.841889] task: ffff8800b6cc4100 ti: ffff8800a3dc8000 task.ti: >>>> ffff8800a3dc8000 >>>> [92098.841910] RIP: 0010:[] [] >>>> memcpy_erms+0x6/0x10 >>>> [92098.841935] RSP: 0018:ffff8800a3dcbcc8 EFLAGS: 00010207 >>>> [92098.841950] RAX: ffff8800a3dcbd67 RBX: 0000000000000009 RCX: >>>> 0000000000000009 >>>> [92098.841970] RDX: 0000000000000009 RSI: 0005080000000000 RDI: >>>> ffff8800a3dcbd67 >>>> [92098.841989] RBP: ffff8800a3dcbd00 R08: 0000000000019c60 R09: >>>> ffff88011fb19c60 >>>> [92098.842009] R10: ffffea0003006480 R11: 0000000001000000 R12: >>>> ffff8800b76c32c0 >>>> [92098.842028] R13: 0000160000000000 R14: ffff8800a3dcbd70 R15: >>>> 0000000000000009 >>>> [92098.842048] FS: 0000000000000000(0000) GS:ffff88011fb00000(0000) >>>> knlGS:0000000000000000 >>>> [92098.842070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> [92098.842086] CR2: 00007fe1f2bd8000 CR3: 0000000001c10000 CR4: >>>> 00000000000406e0 >>>> [92098.842105] Stack: >>>> [92098.842111] ffffffffc035a5d8 ffffffffc0396d00 000000000000028b >>>> 0000000000000000 >>>> [92098.842212] 0000cc6c00000000 ffff8800b76c3200 0000160000000000 >>>> ffff8800a3dcbdc0 >>>> [92098.842237] ffffffffc039af3d ffff8800c7196dc8 ffff8800c7196e08 >>>> ffff8800c7196da0 >>>> [92098.842261] Call Trace: >>>> [92098.842277] [] ? read_extent_buffer+0xb8/0x110 >>>> [btrfs] >>>> [92098.842304] [] ? btrfs_find_all_roots+0x60/0x70 >>>> [btrfs] >>>> [92098.842329] [] >>>> btrfs_qgroup_rescan_worker+0x28d/0x5a0 [btrfs] >>> >>> Would you please show the code of it? >>> This one seems to be another stupid bug I made when rewriting the >>> framework. >>> Maybe I forgot to reinit some variants or I'm screwing memory... >> >> (gdb) list *(btrfs_qgroup_rescan_worker+0x28d) >> 0x97f6d is in btrfs_qgroup_rescan_worker (fs/btrfs/ctree.h:2760). >> 2755 >> 2756 static inline void btrfs_disk_key_to_cpu(struct btrfs_key *cpu, >> 2757 struct btrfs_disk_key >> *disk) >> 2758 { >> 2759 cpu->offset =e64_to_cpu(disk->offset); >> 2760 cpu->type =isk->type; >> 2761 cpu->objectid =e64_to_cpu(disk->objectid); >> 2762 } >> 2763 >> 2764 static inline void btrfs_cpu_key_to_disk(struct btrfs_disk_key >> *disk, >> (gdb) >> >> >> Does it makes sense ? > So it seems that the memory of cpu key is being screwed up... > > The code is be specific thin inline function, so what about other stack? > Like btrfs_qgroup_rescan_helper+0x12? > > Thanks, > Qu Oh, I forgot that you can just change the number of btrfs_qgroup_rescan_worker+0x28d to smaller value. Try +0x280 for example, which will revert to 14 bytes asm code back, which may jump out of the inline function range, and may give you a good hint. Or gdb may have a better mode for inline function, but I don't know... Thanks, Qu >> >> >>>> [92098.842351] [] ? >>>> ttwu_do_activate.constprop.90+0x5d/0x70 >>>> [92098.842377] [] normal_work_helper+0xc0/0x270 >>>> [btrfs] >>>> [92098.842401] [] >>>> btrfs_qgroup_rescan_helper+0x12/0x20 [btrfs] >>>> [92098.842421] [] process_one_work+0x14e/0x3d0 >>>> [92098.842438] [] worker_thread+0x11a/0x470 >>>> [92098.842454] [] ? rescuer_thread+0x310/0x310 >>>> [92098.842471] [] kthread+0xc9/0xe0 >>>> [92098.842485] [] ? kthread_park+0x60/0x60 >>>> [92098.842502] [] ret_from_fork+0x3f/0x70 >>>> [92098.842517] [] ? kthread_park+0x60/0x60 >>>> [92098.842532] Code: ff eb eb 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 >>>> c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 >>>> 89 d1 a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 >>>> [92098.842658] RIP [] memcpy_erms+0x6/0x10 >>>> [92098.842675] RSP >>>> [92098.849594] ---[ end trace 9d5fb7931a3ec713 ]--- >>>> >>>> I would definitely say that rescans should be avoided on current >>>> kernels >>>> as the possibility that it'll bring the system down shouldn't be >>>> ignored. >>>> It confirms that this code really needs a rewrite ! >>>> >>>> Regards, >>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe >>> linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html