* 4.1-rc6 - kernel crash after doing chattr +C
From: Tomasz Chmielewski @ 2015-06-06 6:07 UTC
To: linux-btrfs

4.1-rc6, busy filesystem.

I was running mongo import which made quite a lot of IO.
During the import, I did "chattr +C /var/lib/mongodb" - shortly after I
saw this in dmesg and server died:

[57860.149839] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[57860.149877] IP: [<ffffffffc0158b8e>] btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
[57860.149923] PGD 5d1ac6067 PUD 5d40fc067 PMD 0
[57860.149943] Oops: 0002 [#1] SMP
[57860.149960] Modules linked in: xt_conntrack veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul eeepc_wmi asus_wmi crc32_pclmul ghash_clmulni_intel sparse_keymap aesni_intel aes_x86_64 ie31200_edac lpc_ich lrw gf128mul edac_core glue_helper ablk_helper shpchp cryptd serio_raw wmi video tpm_infineon 8250_fintek mac_hid btrfs lp parport raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq e1000e raid1 ahci raid0 ptp libahci pps_core multipath linear
[57860.150203] CPU: 4 PID: 14111 Comm: mongod Not tainted 4.1.0-040100rc6-generic #201506010235
[57860.150237] Hardware name: System manufacturer System Product Name/P8B WS, BIOS 0904 10/24/2011
[57860.150271] task: ffff88007901bc60 ti: ffff8805d5c38000 task.ti: ffff8805d5c38000
[57860.150303] RIP: 0010:[<ffffffffc0158b8e>] [<ffffffffc0158b8e>] btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
[57860.150346] RSP: 0018:ffff8805d5c3bd18 EFLAGS: 00010206
[57860.150364] RAX: 0000000000000000 RBX: ffff880103c9d950 RCX: 0000000000003d44
[57860.150386] RDX: 0000000000000000 RSI: 0000000000003d44 RDI: ffff880806a74838
[57860.150407] RBP: ffff8805d5c3bd88 R08: 0000000000000000 R09: 0000000000000000
[57860.150428] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880806bcb800
[57860.150450] R13: ffff880806a74838 R14: ffff880103c9d8d8 R15: ffff88080a7e3518
[57860.150471] FS: 00007f5f4e6dc700(0000) GS:ffff88082fb00000(0000) knlGS:0000000000000000
[57860.150504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[57860.150523] CR2: 0000000000000008 CR3: 000000062a584000 CR4: 00000000000407e0
[57860.150544] Stack:
[57860.150558] ffff8805d5c3bd48 ffff88080a7e35c8 ffff880806bcb000 ffff880806bcb800
[57860.150592] ffff8800070da638 ffffffffd5c3bdb0 0000000000000287 ffff88080a72a4d0
[57860.150626] ffff880806bcb800 ffff88080a72a4d0 ffff880806bcb800 0000000000000000
[57860.150659] Call Trace:
[57860.150682] [<ffffffffc015addb>] btrfs_commit_transaction+0x40b/0xb60 [btrfs]
[57860.150717] [<ffffffff810c0700>] ? prepare_to_wait_event+0x100/0x100
[57860.150745] [<ffffffffc0171973>] btrfs_sync_file+0x313/0x380 [btrfs]
[57860.150768] [<ffffffff81236bf6>] vfs_fsync_range+0x46/0xc0
[57860.150788] [<ffffffff81236c8c>] vfs_fsync+0x1c/0x20
[57860.150806] [<ffffffff81236cc8>] do_fsync+0x38/0x70
[57860.150825] [<ffffffff812370c3>] SyS_fdatasync+0x13/0x20
[57860.150846] [<ffffffff8180cb32>] system_call_fastpath+0x16/0x75
[57860.150866] Code: 45 98 48 39 d8 0f 84 ad 00 00 00 48 8d 45 a8 48 83 c0 18 48 89 45 90 66 0f 1f 44 00 00 48 8b 13 48 8b 43 08 4c 89 ef 4c 8d 73 88 <48> 89 42 08 48 89 10 48 89 1b 48 89 5b 08 e8 bf 3a 6b c1 e8 aa
[57860.150959] RIP [<ffffffffc0158b8e>] btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
[57860.150998] RSP <ffff8805d5c3bd18>
[57860.151014] CR2: 0000000000000008
[57860.151186] ---[ end trace f41cd52aa31494ac ]---

--
Tomasz Chmielewski
http://wpkg.org
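As a reading aid for the "Oops: 0002" line in the report above: on x86 that number is the page-fault error code in hex, where bit 0 says whether the page was present, bit 1 whether the access was a write, and bit 2 whether it came from user mode. The following is a minimal C sketch of that decoding, not kernel code; the names pf_error, decode_pf_error and check_reported_oops are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: the three low bits of an x86 page-fault error code. */
struct pf_error {
    bool present;   /* bit 0: 1 = protection fault, 0 = page not present */
    bool write;     /* bit 1: 1 = write access, 0 = read access */
    bool user;      /* bit 2: 1 = fault taken in user mode, 0 = kernel mode */
};

/* Decode the hex value printed after "Oops:" (illustrative name, not a kernel API). */
static struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    e.present = (code & 0x1UL) != 0;
    e.write   = (code & 0x2UL) != 0;
    e.user    = (code & 0x4UL) != 0;
    return e;
}

/* Usage example with the value from this report, "Oops: 0002". */
static void check_reported_oops(void)
{
    struct pf_error e = decode_pf_error(0x0002);

    /* Kernel-mode write to a page that is not present. */
    assert(!e.present && e.write && !e.user);
}
```

Read this way, 0002 describes a kernel-mode write to an unmapped page, which fits a store to the address reported in CR2 (0000000000000008).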
* Re: 4.1-rc6 - kernel crash after doing chattr +C
From: Chris Mason @ 2015-06-08 15:48 UTC
To: Tomasz Chmielewski, linux-btrfs

On 06/06/2015 02:07 AM, Tomasz Chmielewski wrote:
> 4.1-rc6, busy filesystem.
>
> I was running mongo import which made quite a lot of IO.
> During the import, I did "chattr +C /var/lib/mongodb" - shortly after I
> saw this in dmesg and server died:
>
> [57860.149839] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
> [57860.149877] IP: [<ffffffffc0158b8e>]
> btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]

Sorry, it's not obvious where the 0000000000000008 is coming from, can
you turn btrfs_wait_pending_ordered+0x5e/0x110 into a line number?

Use list *btrfs_wait_pending_ordered+0x5e at the gdb prompt, after you
gdb btrfs.ko

-chris
* Re: 4.1-rc6 - kernel crash after doing chattr +C
From: David Sterba @ 2015-06-09 19:08 UTC
To: Chris Mason; +Cc: Tomasz Chmielewski, linux-btrfs

On Mon, Jun 08, 2015 at 11:48:54AM -0400, Chris Mason wrote:
> On 06/06/2015 02:07 AM, Tomasz Chmielewski wrote:
> > 4.1-rc6, busy filesystem.
> >
> > I was running mongo import which made quite a lot of IO.
> > During the import, I did "chattr +C /var/lib/mongodb" - shortly after I
> > saw this in dmesg and server died:
> >
> > [57860.149839] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [57860.149877] IP: [<ffffffffc0158b8e>]
> > btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
>
> Sorry, it's not obvious where the 0000000000000008 is coming from, can
> you turn btrfs_wait_pending_ordered+0x5e/0x110 into a line number?
>
> Use list *btrfs_wait_pending_ordered+0x5e at the gdb prompt, after you
> gdb btrfs.ko

Guesswork, but doing that on my sources points to __list_del:

(gdb) l *(btrfs_wait_pending_ordered+0x5e)
0x333fe is in btrfs_wait_pending_ordered (include/linux/list.h:89).
84       * This is only for internal list manipulation where we know
85       * the prev/next entries already!
86       */
87      static inline void __list_del(struct list_head * prev, struct list_head * next)
88      {
89              next->prev = prev;
90              prev->next = next;
91      }

which is called from btrfs_wait_pending_ordered. The offset 8 corresponds
to 'prev' in struct list_head, so the 'next' pointer is NULL. Starting from
list_del_init(&ordered->trans_list), we find that it ends up calling
__list_del(entry->prev, entry->next) (i.e. entry == &ordered->trans_list).

        while (!list_empty(&cur_trans->pending_ordered)) {
                ordered = list_first_entry(&cur_trans->pending_ordered,
                                           struct btrfs_ordered_extent,
                                           trans_list);
                list_del_init(&ordered->trans_list);
                spin_unlock(&fs_info->trans_lock);

                wait_event(ordered->wait, test_bit(BTRFS_ORDERED_COMPLETE,
                                                   &ordered->flags));
                btrfs_put_ordered_extent(ordered);
                spin_lock(&fs_info->trans_lock);
        }

So we probably got bogus data from cur_trans->pending_ordered. I don't
know whether ordered is zeroed or just the list_head got corrupted. The
way the list_head pointer magic works, it's possible to get there both
ways (I think).
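The offset argument above can be checked outside the kernel. What follows is a minimal userspace sketch, assuming a plain two-pointer struct list_head as in include/linux/list.h. The helpers first_store_address and zeroed_entry_fault_address are hypothetical names, not kernel functions, and the zeroed entry is an assumed state rather than something the thread confirms. The sketch computes where the first store in __list_del would land without dereferencing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Address of the first store done by __list_del(entry->prev, entry->next):
 * "next->prev = prev" writes to next + offsetof(struct list_head, prev).
 * The address is computed arithmetically; nothing is dereferenced via next.
 * (Hypothetical helper, not a kernel function.)
 */
static uintptr_t first_store_address(const struct list_head *entry)
{
    assert(entry != NULL);
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* Usage example: an entry whose pointers were zeroed (assumed state). */
static uintptr_t zeroed_entry_fault_address(void)
{
    struct list_head entry = { NULL, NULL };

    return first_store_address(&entry);
}
```

On a 64-bit build, where pointers are 8 bytes, the computed address is 8, the same value the oops reports in CR2. The register dump shows RAX and RDX as zero at the time of the fault, which would be consistent with both pointers of the entry being NULL, though that is a reading of the dump rather than something established in the thread.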
* Re: 4.1-rc6 - kernel crash after doing chattr +C
From: Filipe David Manana @ 2015-07-03 20:25 UTC
To: Tomasz Chmielewski; +Cc: linux-btrfs, Chris Mason

On Sat, Jun 6, 2015 at 7:07 AM, Tomasz Chmielewski <tch@virtall.com> wrote:
> 4.1-rc6, busy filesystem.
>
> I was running mongo import which made quite a lot of IO.
> During the import, I did "chattr +C /var/lib/mongodb" - shortly after I saw
> this in dmesg and server died:
>
> [57860.149839] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
> [57860.149877] IP: [<ffffffffc0158b8e>]
> btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
> [...]

Hi,

Managed to reproduce it and the following patch should fix the problem:

https://patchwork.kernel.org/patch/6716871/

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."