From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from fgwmail5.fujitsu.co.jp ([192.51.44.35]:59275 "EHLO fgwmail5.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751871AbbAPHoF (ORCPT ); Fri, 16 Jan 2015 02:44:05 -0500
Received: from kw-mxoi2.gw.nic.fujitsu.com (unknown [10.0.237.143]) by fgwmail5.fujitsu.co.jp (Postfix) with ESMTP id DC1113EE170 for ; Fri, 16 Jan 2015 16:44:03 +0900 (JST)
Received: from s1.gw.fujitsu.co.jp (s1.gw.fujitsu.co.jp [10.0.50.91]) by kw-mxoi2.gw.nic.fujitsu.com (Postfix) with ESMTP id 23C9CAC0738 for ; Fri, 16 Jan 2015 16:44:03 +0900 (JST)
Received: from g01jpfmpwkw03.exch.g01.fujitsu.local (g01jpfmpwkw03.exch.g01.fujitsu.local [10.0.193.57]) by s1.gw.fujitsu.co.jp (Postfix) with ESMTP id B9F2EE08005 for ; Fri, 16 Jan 2015 16:44:02 +0900 (JST)
Message-ID: <54B8C13B.7010204@jp.fujitsu.com>
Date: Fri, 16 Jan 2015 16:43:55 +0900
From: Satoru Takeuchi
MIME-Version: 1.0
To: Marcel Ritter ,
Subject: Re: Kernel bug in 3.19-rc4
References:
In-Reply-To:
Content-Type: text/plain; charset="utf-8"; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi Marcel,

On 2015/01/16 4:46, Marcel Ritter wrote:
> Hi,
>
> I just started some btrfs stress testing on latest linux kernel 3.19-rc4:
> A few hours later, filesystem stopped working - the kernel bug report
> can be found below.
>
> The test consists of one massive IO thread (writing 100GB files with dd),
> and 2 tar instances extracting kernel sources and deleting them afterwards
> (I can provide the simple bash script doing this, if needed).

Could you give me this script?
Thanks,
Satoru

>
> System information (Ubuntu 14.04.1, latest kernel):
>
> root@thunder # uname -a
> Linux thunder 3.19.0-rc4-custom #1 SMP Mon Jan 12 16:13:44 CET 2015
> x86_64 x86_64 x86_64 GNU/Linux
>
> root@thunder # /root/btrfs-progs/btrfs --version
> Btrfs v3.18-36-g0173148
>
> Tests are done on 14 SCSI disks, using raid6 for data and metadata:
>
> root@thunder # /root/btrfs-progs/btrfs fi show
> Label: 'raid6' uuid: cbe34d2b-5f75-46cf-9263-9813028ebc19
> Total devices 14 FS bytes used 674.62GiB
> devid 1 size 279.39GiB used 59.24GiB path /dev/cciss/c1d0
> devid 2 size 279.39GiB used 59.22GiB path /dev/cciss/c1d1
> devid 3 size 279.39GiB used 59.22GiB path /dev/cciss/c1d10
> devid 4 size 279.39GiB used 59.22GiB path /dev/cciss/c1d11
> devid 5 size 279.39GiB used 59.22GiB path /dev/cciss/c1d12
> devid 6 size 279.39GiB used 59.22GiB path /dev/cciss/c1d13
> devid 7 size 279.39GiB used 59.22GiB path /dev/cciss/c1d2
> devid 8 size 279.39GiB used 59.22GiB path /dev/cciss/c1d3
> devid 9 size 279.39GiB used 59.22GiB path /dev/cciss/c1d4
> devid 10 size 279.39GiB used 59.22GiB path /dev/cciss/c1d5
> devid 11 size 279.39GiB used 59.22GiB path /dev/cciss/c1d6
> devid 12 size 279.39GiB used 59.22GiB path /dev/cciss/c1d7
> devid 13 size 279.39GiB used 59.22GiB path /dev/cciss/c1d8
> devid 14 size 279.39GiB used 59.22GiB path /dev/cciss/c1d9
>
> Btrfs v3.18-36-g0173148
>
> # This is provided for completeness only, and is taken
> # somewhen *before* the kernel crash occured, so basic
> # setup is the same, but allocated/free sizes won't match
> root@thunder # /root/btrfs-progs/btrfs fi df /tmp/m
> Data, single: total=8.00MiB, used=0.00B
> Data, RAID6: total=727.45GiB, used=697.84GiB
> System, single: total=4.00MiB, used=0.00B
> System, RAID6: total=13.50MiB, used=64.00KiB
> Metadata, single: total=8.00MiB, used=0.00B
> Metadata, RAID6: total=3.43GiB, used=805.91MiB
> GlobalReserve, single: total=272.00MiB, used=0.00B
>
>
> Here's what happens after some hours of stress testing:
>
> [85162.472989] ------------[ cut here ]------------
> [85162.473071] kernel BUG at fs/btrfs/inode.c:3142!
> [85162.473139] invalid opcode: 0000 [#1] SMP
> [85162.473212] Modules linked in: btrfs(E) xor(E) raid6_pq(E)
> radeon(E) ttm(E) drm_kms_helper(E) drm(E) hpwdt(E) amd64_edac_mod(E)
> kvm(E) edac_core(E) shpchp(E) k8temp(E) serio_raw(E) hpilo(E)
> edac_mce_amd(E) mac_hid(E) i2c_algo_bit(E) ipmi_si(E) nfsd(E)
> auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) lp(E)
> fscache(E) parport(E) hid_generic(E) usbhid(E) hid(E) hpsa(E)
> psmouse(E) bnx2(E) cciss(E) pata_acpi(E) pata_amd(E)
> [85162.473911] CPU: 4 PID: 3039 Comm: btrfs-cleaner Tainted: G
> E 3.19.0-rc4-custom #1
> [85162.474028] Hardware name: HP ProLiant DL585 G2 , BIOS A07 05/02/2011
> [85162.474122] task: ffff88085b054aa0 ti: ffff88205ad4c000 task.ti:
> ffff88205ad4c000
> [85162.474230] RIP: 0010:[] []
> btrfs_orphan_add+0x1d2/0x1e0 [btrfs]
> [85162.474422] RSP: 0018:ffff88205ad4fc48 EFLAGS: 00010286
> [85162.474497] RAX: 00000000ffffffe4 RBX: ffff8810a35d42f8 RCX: ffff88185b896000
> [85162.474595] RDX: 0000000000006a54 RSI: 0000000000040000 RDI: ffff88185b896138
> [85162.474694] RBP: ffff88205ad4fc88 R08: 000000000001e670 R09: ffff88016194b240
> [85162.474793] R10: ffffffffa06bd797 R11: ffffea0004f71800 R12: ffff88185baa2000
> [85162.474892] R13: ffff88085f6d7630 R14: ffff88185baa2458 R15: 0000000000000001
> [85162.474992] FS: 00007fb3f27fb740(0000) GS:ffff88085fd00000(0000)
> knlGS:0000000000000000
> [85162.475105] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [85162.475184] CR2: 00007f896c02c220 CR3: 000000085b328000 CR4: 00000000000007e0
> [85162.475286] Stack:
> [85162.475318] ffff88205ad4fc88 ffffffffa06e6a14 ffff88185b896b04
> ffff88105b03e800
> [85162.475442] ffff88016194b240 ffff8810a35d42f8 ffff881e8ffe9a00
> ffff88133dc48ea0
> [85162.475561] ffff88205ad4fd18 ffffffffa0691a57 ffff88016194b244
> ffff88016194b240
> [85162.475680] Call Trace:
> [85162.475738] [] ?
> lookup_free_space_inode+0x44/0x100 [btrfs]
> [85162.475849] []
> btrfs_remove_block_group+0x137/0x740 [btrfs]
> [85162.475964] [] btrfs_remove_chunk+0x672/0x780 [btrfs]
> [85162.476065] [] btrfs_delete_unused_bgs+0x25f/0x280 [btrfs]
> [85162.476172] [] cleaner_kthread+0x12c/0x190 [btrfs]
> [85162.476269] [] ? check_leaf+0x350/0x350 [btrfs]
> [85162.476355] [] kthread+0xd2/0xf0
> [85162.476424] [] ? kthread_create_on_node+0x180/0x180
> [85162.476519] [] ret_from_fork+0x7c/0xb0
> [85162.476592] [] ? kthread_create_on_node+0x180/0x180
> [85162.476648] Code: ff ff 0f 1f 80 00 00 00 00 89 45 c8 f0 80 63 80
> fd 48 89 df e8 d0 23 fe ff 8b 45 c8 e9 14 ff ff ff b8 f4 ff ff ff e9
> 12 ff ff ff <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
> 55 48
> [85162.476648] RIP [] btrfs_orphan_add+0x1d2/0x1e0 [btrfs]
> [85162.476648] RSP
> [85162.640076] ---[ end trace 396c6a6abc5a7fce ]---
>
> One reboot later, creating a new, clean filesystem and running the
> same tests again:
>
> [30204.556282] ------------[ cut here ]------------
> [30204.556358] kernel BUG at fs/btrfs/inode.c:3142!
> [30204.556422] invalid opcode: 0000 [#1] SMP
> [30204.556492] Modules linked in: btrfs(E) xor(E) radeon(E) ttm(E)
> drm_kms_helper(E) raid6_pq(E) drm(E) kvm(E) amd64_edac_mod(E)
> edac_core(E) i2c_algo_bit(E) edac_mce_amd(E) mac_hid(E) shpchp(E)
> serio_raw(E) k8temp(E) hpwdt(E) ipmi_si(E) hpilo(E) nfsd(E)
> auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E)
> fscache(E) lp(E) parport(E) hpsa(E) hid_generic(E) usbhid(E) hid(E)
> pata_acpi(E) psmouse(E) bnx2(E) cciss(E) pata_amd(E)
> [30204.557194] CPU: 2 PID: 2186 Comm: btrfs-cleaner Tainted: G
> E 3.19.0-rc4-custom #1
> [30204.557313] Hardware name: HP ProLiant DL585 G2 , BIOS A07 05/02/2011
> [30204.557407] task: ffff88105b644aa0 ti: ffff88185c2b8000 task.ti:
> ffff88185c2b8000
> [30204.557510] RIP: 0010:[] []
> btrfs_orphan_add+0x1d2/0x1e0 [btrfs]
> [30204.557687] RSP: 0018:ffff88185c2bbc48 EFLAGS: 00010286
> [30204.557762] RAX: 00000000ffffffe4 RBX: ffff881091e9fca0 RCX: ffff88205bb15000
> [30204.557860] RDX: 000000000000bd74 RSI: 0000000000040000 RDI: ffff88205bb15138
> [30204.557959] RBP: ffff88185c2bbc88 R08: 000000000001e670 R09: ffff8810a3c963f0
> [30204.558058] R10: ffffffffa0dc0797 R11: ffffea004255ba00 R12: ffff882059bc5000
> [30204.558157] R13: ffff8818588526e0 R14: ffff882059bc5458 R15: 0000000000000001
> [30204.558256] FS: 00007f34ad4b3840(0000) GS:ffff88185fc00000(0000)
> knlGS:0000000000000000
> [30204.558374] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [30204.558453] CR2: 00007fab99cbe700 CR3: 0000000fbd16c000 CR4: 00000000000007e0
> [30204.558552] Stack:
> [30204.558582] ffff88185c2bbc88 ffffffffa0de9a14 ffff88205bb15b04
> ffff88205c62e000
> [30204.558707] ffff8810a3c963f0 ffff881091e9fca0 ffff88105b02c000
> ffff8808b9130480
> [30204.558826] ffff88185c2bbd18 ffffffffa0d94a57 ffff8810a3c963f4
> ffff8810a3c963f0
> [30204.558945] Call Trace:
> [30204.558996] [] ?
> lookup_free_space_inode+0x44/0x100 [btrfs]
> [30204.559102] []
> btrfs_remove_block_group+0x137/0x740 [btrfs]
> [30204.559210] [] btrfs_remove_chunk+0x672/0x780 [btrfs]
> [30204.559306] [] btrfs_delete_unused_bgs+0x25f/0x280 [btrfs]
> [30204.559408] [] cleaner_kthread+0x12c/0x190 [btrfs]
> [30204.559501] [] ? check_leaf+0x350/0x350 [btrfs]
> [30204.559583] [] kthread+0xd2/0xf0
> [30204.559649] [] ? kthread_create_on_node+0x180/0x180
> [30204.559743] [] ret_from_fork+0x7c/0xb0
> [30204.559816] [] ? kthread_create_on_node+0x180/0x180
> [30204.559907] Code: ff ff 0f 1f 80 00 00 00 00 89 45 c8 f0 80 63 80
> fd 48 89 df e8 d0 23 fe ff 8b 45 c8 e9 14 ff ff ff b8 f4 ff ff ff e9
> 12 ff ff ff <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
> 55 48
> [30204.560138] RIP [] btrfs_orphan_add+0x1d2/0x1e0 [btrfs]
> [30204.560138] RSP
> [30204.719832] ---[ end trace bbc20b459964e0ed ]---
>
> Maybe this helps to locate the error. If I can do more tests, or
> provide more necessary information to diagnose this, please let me
> know.
>
> Bye,
> Marcel
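
A side note on reading the two traces: both show RAX: 00000000ffffffe4 at the BUG in btrfs_orphan_add. Below is a minimal sketch of decoding that value. It assumes (this is not confirmed anywhere in the thread) that the low 32 bits of RAX hold the int return code that the failed check was looking at. The helper name decode_errno is hypothetical and only for illustration.

```python
import errno


def decode_errno(reg_hex):
    """Treat the low 32 bits of a register dump value as a signed int
    and look up the matching errno name, if there is one."""
    low32 = int(reg_hex, 16) & 0xFFFFFFFF
    # Two's-complement conversion from unsigned 32-bit to signed.
    signed = low32 - (1 << 32) if low32 & 0x80000000 else low32
    if signed >= 0:
        return signed, "not an error"
    # Kernel functions conventionally return -errno on failure.
    return signed, errno.errorcode.get(-signed, "unknown")


if __name__ == "__main__":
    # RAX value copied from both traces in the report.
    print(decode_errno("00000000ffffffe4"))
```

On Linux this decodes to -28, i.e. ENOSPC. If the assumption about RAX holds, that would point at an out-of-space condition on the path into btrfs_orphan_add, but it would need to be checked against fs/btrfs/inode.c:3142 in the 3.19-rc4 source.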