* xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) @ 2013-05-07 13:37 Dave Jones 2013-05-07 19:04 ` Mark Tinguely 2013-05-11 12:09 ` Dmitry Monakhov 0 siblings, 2 replies; 18+ messages in thread From: Dave Jones @ 2013-05-07 13:37 UTC (permalink / raw) To: Linux Kernel; +Cc: xfs started compiling a kernel, and then... [ 172.233200] ============================================================================= [ 172.233205] BUG xfs_efi_item (Not tainted): Poison overwritten [ 172.233207] ----------------------------------------------------------------------------- [ 172.233210] Disabling lock debugging due to kernel taint [ 172.233213] INFO: 0xffff8800aaac4ea8-0xffff8800aaac4ea8. First byte 0x6a instead of 0x6b [ 172.233235] INFO: Allocated in kmem_zone_alloc+0x67/0xf0 [xfs] age=29290 cpu=1 pid=1357 [ 172.233239] __slab_alloc+0x468/0x52c [ 172.233243] kmem_cache_alloc+0x2b4/0x320 [ 172.233256] kmem_zone_alloc+0x67/0xf0 [xfs] [ 172.233269] kmem_zone_zalloc+0x14/0x40 [xfs] [ 172.233287] xfs_efi_init+0x41/0xa0 [xfs] [ 172.233305] xfs_trans_get_efi+0x58/0x90 [xfs] [ 172.233320] xfs_bmap_finish+0x76/0x1b0 [xfs] [ 172.233338] xfs_itruncate_extents+0x2cd/0x610 [xfs] [ 172.233353] xfs_inactive+0x401/0x530 [xfs] [ 172.233367] xfs_fs_evict_inode+0x8c/0x1b0 [xfs] [ 172.233370] evict+0xa3/0x1a0 [ 172.233372] iput+0xf5/0x180 [ 172.233374] dput+0x208/0x2f0 [ 172.233378] SYSC_renameat+0x3be/0x430 [ 172.233380] SyS_renameat+0xe/0x10 [ 172.233383] SyS_rename+0x1b/0x20 [ 172.233399] INFO: Freed in xfs_efi_item_free+0x21/0x40 [xfs] age=27303 cpu=1 pid=177 [ 172.233402] __slab_free+0x41/0x39f [ 172.233405] kmem_cache_free+0x326/0x370 [ 172.233420] xfs_efi_item_free+0x21/0x40 [xfs] [ 172.233435] __xfs_efi_release+0x4e/0x60 [xfs] [ 172.233449] xfs_efi_release+0x50/0x70 [xfs] [ 172.233463] xfs_efd_item_committed+0x22/0x40 [xfs] [ 172.233479] xfs_trans_committed_bulk+0xcf/0x290 [xfs] [ 172.233494] xlog_cil_committed+0x37/0x110 [xfs] [ 172.233510] xlog_state_do_callback+0x1b8/0x3f0 [xfs] 
[ 172.233525] xlog_state_done_syncing+0xf2/0x110 [xfs] [ 172.233539] xlog_iodone+0x87/0x110 [xfs] [ 172.233550] xfs_buf_iodone_work+0x5e/0xd0 [xfs] [ 172.233554] process_one_work+0x211/0x700 [ 172.233556] worker_thread+0x11d/0x3a0 [ 172.233559] kthread+0xed/0x100 [ 172.233562] ret_from_fork+0x7c/0xb0 [ 172.233565] INFO: Slab 0xffffea0002aab100 objects=22 used=22 fp=0x (null) flags=0x20000000004080 [ 172.233567] INFO: Object 0xffff8800aaac4e38 @offset=3640 fp=0xffff8800aaac7058 [ 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk [ 172.233590] Object ffff8800aaac4eb8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233592] Object ffff8800aaac4ec8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233594] Object ffff8800aaac4ed8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233597] Object ffff8800aaac4ee8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233599] Object ffff8800aaac4ef8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233601] Object ffff8800aaac4f08: 6b 6b 6b 6b 6b 6b 6b 
6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233603] Object ffff8800aaac4f18: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233605] Object ffff8800aaac4f28: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233608] Object ffff8800aaac4f38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233610] Object ffff8800aaac4f48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233612] Object ffff8800aaac4f58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233614] Object ffff8800aaac4f68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233616] Object ffff8800aaac4f78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233618] Object ffff8800aaac4f88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233620] Object ffff8800aaac4f98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233622] Object ffff8800aaac4fa8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [ 172.233625] Object ffff8800aaac4fb8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. [ 172.233627] Redzone ffff8800aaac4fc8: bb bb bb bb bb bb bb bb ........ [ 172.233629] Padding ffff8800aaac5108: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ [ 172.233632] CPU: 1 PID: 1658 Comm: gcc Tainted: G B 3.9.0+ #1 [loadavg: 0.68 0.43 0.17 4/333 1660] [ 172.233636] Hardware name: LENOVO 2356JK8/2356JK8, BIOS G7ET92WW (2.52 ) 02/18/2013 [ 172.233638] ffffea0002aab100 ffff8800a4b03918 ffffffff815f5588 ffff8800a4b03968 [ 172.233644] ffffffff81185944 0000000000000008 ffff880000000001 ffffffff81078016 [ 172.233648] ffff8800aaac4ea8 ffff8800aaac4ea9 ffff880115937cc0 000000000000006b [ 172.233653] Call Trace: [ 172.233658] [<ffffffff815f5588>] dump_stack+0x19/0x1b [ 172.233662] [<ffffffff81185944>] print_trailer+0x154/0x210 [ 172.233666] [<ffffffff81078016>] ? 
perf_trace_sched_process_template+0xe6/0x100 [ 172.233669] [<ffffffff81185b3f>] check_bytes_and_report+0xcf/0x110 [ 172.233673] [<ffffffff81187037>] check_object+0x1e7/0x260 [ 172.233676] [<ffffffff811864f7>] ? check_slab+0x87/0x130 [ 172.233691] [<ffffffffa0543fa7>] ? kmem_zone_alloc+0x67/0xf0 [xfs] [ 172.233694] [<ffffffff815f2ea3>] alloc_debug_processing+0x76/0x118 [ 172.233697] [<ffffffff815f3b8d>] __slab_alloc+0x468/0x52c [ 172.233716] [<ffffffffa0592787>] ? xfs_iext_bno_to_ext+0x97/0x180 [xfs] [ 172.233731] [<ffffffffa0543fa7>] ? kmem_zone_alloc+0x67/0xf0 [xfs] [ 172.233735] [<ffffffff8118a754>] kmem_cache_alloc+0x2b4/0x320 [ 172.233748] [<ffffffffa0543fa7>] ? kmem_zone_alloc+0x67/0xf0 [xfs] [ 172.233762] [<ffffffffa0543fa7>] kmem_zone_alloc+0x67/0xf0 [xfs] [ 172.233776] [<ffffffffa0544044>] kmem_zone_zalloc+0x14/0x40 [xfs] [ 172.233792] [<ffffffffa05ae031>] xfs_efi_init+0x41/0xa0 [xfs] [ 172.233809] [<ffffffffa05b32b8>] xfs_trans_get_efi+0x58/0x90 [xfs] [ 172.233825] [<ffffffffa055a0a6>] xfs_bmap_finish+0x76/0x1b0 [xfs] [ 172.233844] [<ffffffffa058ef0d>] xfs_itruncate_extents+0x2cd/0x610 [xfs] [ 172.233859] [<ffffffffa0540e41>] xfs_inactive+0x401/0x530 [xfs] [ 172.233874] [<ffffffffa053dbec>] xfs_fs_evict_inode+0x8c/0x1b0 [xfs] [ 172.233877] [<ffffffff811c1033>] evict+0xa3/0x1a0 [ 172.233880] [<ffffffff811c19c5>] iput+0xf5/0x180 [ 172.233883] [<ffffffff811bcf18>] dput+0x208/0x2f0 [ 172.233887] [<ffffffff811b43ae>] SYSC_renameat+0x3be/0x430 [ 172.233892] [<ffffffff810ad32e>] ? lock_release_holdtime.part.30+0xee/0x170 [ 172.233895] [<ffffffff8107c18d>] ? get_parent_ip+0xd/0x50 [ 172.233899] [<ffffffff810afe7d>] ? trace_hardirqs_on+0xd/0x10 [ 172.233904] [<ffffffff8100e358>] ? 
syscall_trace_enter+0x18/0x230 [ 172.233907] [<ffffffff811b67fe>] SyS_renameat+0xe/0x10 [ 172.233911] [<ffffffff811b681b>] SyS_rename+0x1b/0x20 [ 172.233914] [<ffffffff816051d4>] tracesys+0xdd/0xe2 [ 172.233917] FIX xfs_efi_item: Restoring 0xffff8800aaac4ea8-0xffff8800aaac4ea8=0x6b [ 172.233919] FIX xfs_efi_item: Marking all objects used _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 13:37 xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) Dave Jones
@ 2013-05-07 19:04 ` Mark Tinguely
  2013-05-07 19:07 ` Dave Jones
  2013-05-11 12:09 ` Dmitry Monakhov
  1 sibling, 1 reply; 18+ messages in thread
From: Mark Tinguely @ 2013-05-07 19:04 UTC
To: Dave Jones, xfs, CAI Qian

On 05/07/13 08:37, Dave Jones wrote:
> 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
> [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  jkkkkkkkkkkkkkkk
                                         ^^

xfs_efi_log_item.efi_refcount being decremented on the xfs_efi_release()
CAI Qian had the same thing in his May 6 "3.9.0: XFS rootfs corruption"
email.

I have not reproduced it yet.

--Mark.
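The check that produced the "Poison overwritten" report can be mimicked in userspace. The following is a minimal sketch, not the kernel's own code: it assumes the usual slab poison bytes (0x6b for freed memory, 0xa5 for the last byte of the object) and takes the object size (0x190 = 400 bytes) and the damaged offset (0x4ea8 - 0x4e38 = 0x70) from the addresses in the dump above. The function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b /* fill byte for a freed object */
#define POISON_END  0xa5 /* marker in the final byte of the object */

/* Fill a buffer the way the dump shows a freed, untouched object. */
void poison_object(unsigned char *obj, size_t size)
{
	if (size == 0)
		return;
	memset(obj, POISON_FREE, size);
	obj[size - 1] = POISON_END;
}

/*
 * Return the offset of the first byte that no longer matches the
 * poison pattern, or -1 if the object is still intact.
 */
long first_poison_mismatch(const unsigned char *obj, size_t size)
{
	for (size_t i = 0; i < size; i++) {
		unsigned char expect =
			(i == size - 1) ? POISON_END : POISON_FREE;

		if (obj[i] != expect)
			return (long)i;
	}
	return -1;
}
```

Poisoning a 400-byte buffer and then writing 0x6a at offset 0x70 makes first_poison_mismatch() return 0x70, which corresponds to the single byte flagged with ^^ in the quoted dump. That this offset is efi_refcount is Mark's reading of the structure layout; the sketch does not verify it.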
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 19:04 ` Mark Tinguely
@ 2013-05-07 19:07 ` Dave Jones
  2013-05-07 19:24 ` Mark Tinguely
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Jones @ 2013-05-07 19:07 UTC
To: Mark Tinguely; +Cc: CAI Qian, xfs

On Tue, May 07, 2013 at 02:04:05PM -0500, Mark Tinguely wrote:
> On 05/07/13 08:37, Dave Jones wrote:
> > 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
> > [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
> > [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  jkkkkkkkkkkkkkkk
>                                          ^^
>
> xfs_efi_log_item.efi_refcount being decremented on the xfs_efi_release()
> CAI Qian had the same thing in his May 6 "3.9.0: XFS rootfs corruption"
> email.
>
> I have not reproduced it yet.

I've hit it on two different machines today. The good news is that the
corruption never makes it onto disk. xfs_repair doesn't pick up anything.

	Dave
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 19:07 ` Dave Jones @ 2013-05-07 19:24 ` Mark Tinguely 2013-05-07 19:31 ` Dave Jones 0 siblings, 1 reply; 18+ messages in thread From: Mark Tinguely @ 2013-05-07 19:24 UTC (permalink / raw) To: Dave Jones; +Cc: CAI Qian, xfs On 05/07/13 14:07, Dave Jones wrote: > On Tue, May 07, 2013 at 02:04:05PM -0500, Mark Tinguely wrote: > > On 05/07/13 08:37, Dave Jones wrote: > > > 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ > > > [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk > > ^^ > > > > xfs_efi_log_item.efi_refcount being decremented on the xfs_efi_release() > > CAI Qian had the same thing in his May 6 "3.9.0: XFS rootfs corruption" > > email. > > > > I have not reproduced it yet. > > I've hit it on two different machines today. The good news is that the > corruption never makes it onto disk. xfs_repair doesn't pick up anything. > > Dave > There was a new patch in the efi/efd code that must be misbehaving. You are correct, this is not an on-disk value. I now have poisoning on and I can see this doing a compile like you suggested. 
I will ASSERT to see who is doing the decrement after free.

--Mark.
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 19:24 ` Mark Tinguely @ 2013-05-07 19:31 ` Dave Jones 2013-05-07 19:58 ` Mark Tinguely 0 siblings, 1 reply; 18+ messages in thread From: Dave Jones @ 2013-05-07 19:31 UTC (permalink / raw) To: Mark Tinguely; +Cc: CAI Qian, xfs On Tue, May 07, 2013 at 02:24:14PM -0500, Mark Tinguely wrote: > On 05/07/13 14:07, Dave Jones wrote: > > On Tue, May 07, 2013 at 02:04:05PM -0500, Mark Tinguely wrote: > > > On 05/07/13 08:37, Dave Jones wrote: > > > > 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ > > > > [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk > > > ^^ > > > > > > xfs_efi_log_item.efi_refcount being decremented on the xfs_efi_release() > > > CAI Qian had the same thing in his May 6 "3.9.0: XFS rootfs corruption" > > > email. > > > > > > I have not reproduced it yet. > > > > I've hit it on two different machines today. The good news is that the > > corruption never makes it onto disk. xfs_repair doesn't pick up anything. > > > > Dave > > > There was a new patch in the efi/efd code that must be misbehaving. 
> You are correct, this is not an on-disk value.
>
> I now have poisoning on and I can see this doing a compile like you
> suggested. I will ASSERT to see who is doing the decrement after free.

I can hit this almost instantly with fsx. I'll do a bisect, though
it sounds like you already have a suspect.

	Dave
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 19:31 ` Dave Jones @ 2013-05-07 19:58 ` Mark Tinguely 2013-05-07 19:59 ` Dave Jones 0 siblings, 1 reply; 18+ messages in thread From: Mark Tinguely @ 2013-05-07 19:58 UTC (permalink / raw) To: Dave Jones; +Cc: CAI Qian, xfs On 05/07/13 14:31, Dave Jones wrote: > On Tue, May 07, 2013 at 02:24:14PM -0500, Mark Tinguely wrote: > > On 05/07/13 14:07, Dave Jones wrote: > > > On Tue, May 07, 2013 at 02:04:05PM -0500, Mark Tinguely wrote: > > > > On 05/07/13 08:37, Dave Jones wrote: > > > > > 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ > > > > > [ 172.233573] Object ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > > > > [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk > > > > ^^ > > > > > > > > xfs_efi_log_item.efi_refcount being decremented on the xfs_efi_release() > > > > CAI Qian had the same thing in his May 6 "3.9.0: XFS rootfs corruption" > > > > email. > > > > > > > > I have not reproduced it yet. > > > > > > I've hit it on two different machines today. The good news is that the > > > corruption never makes it onto disk. xfs_repair doesn't pick up anything. 
> > >
> > > Dave
> > >
> > There was a new patch in the efi/efd code that must be misbehaving.
> > You are correct, this is not an on-disk value.
> >
> > I now have poisoning on and I can see this doing a compile like you
> > suggested. I will ASSERT to see who is doing the decrement after free.
>
> I can hit this almost instantly with fsx. I'll do a bisect, though
> it sounds like you already have a suspect.
>
> Dave
>

If you want to try kmem debug of Linux 3.8 that would help.

--Mark.
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 19:58 ` Mark Tinguely
@ 2013-05-07 19:59 ` Dave Jones
  2013-05-07 20:04 ` Mark Tinguely
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Jones @ 2013-05-07 19:59 UTC
To: Mark Tinguely; +Cc: CAI Qian, xfs

On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote:
> > I can hit this almost instantly with fsx. I'll do a bisect, though
> > it sounds like you already have a suspect.
> >
>
> If you want to try kmem debug of Linux 3.8 that would help.

I'm not sure what that is.

	Dave
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 19:59 ` Dave Jones
@ 2013-05-07 20:04 ` Mark Tinguely
  2013-05-07 20:22 ` Dave Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Tinguely @ 2013-05-07 20:04 UTC
To: Dave Jones; +Cc: CAI Qian, xfs

On 05/07/13 14:59, Dave Jones wrote:
> On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote:
>
> > > I can hit this almost instantly with fsx. I'll do a bisect, though
> > > it sounds like you already have a suspect.
> > >
> >
> > If you want to try kmem debug of Linux 3.8 that would help.
>
> I'm not sure what that is.
>
> Dave
>

Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y".

--Mark.
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 20:04 ` Mark Tinguely
@ 2013-05-07 20:22 ` Dave Jones
  2013-05-07 20:24 ` Mark Tinguely
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Jones @ 2013-05-07 20:22 UTC
To: Mark Tinguely; +Cc: CAI Qian, xfs

On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote:
> On 05/07/13 14:59, Dave Jones wrote:
> > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote:
> >
> > > > I can hit this almost instantly with fsx. I'll do a bisect, though
> > > > it sounds like you already have a suspect.
> > > >
> > >
> > > If you want to try kmem debug of Linux 3.8 that would help.
> >
> > I'm not sure what that is.
>
> Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y".

Ah, done that. (I pretty much always run with it).

This is something new. Even 3.9 was fine. It's only since
the recent xfs merge.

	Dave
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 20:22 ` Dave Jones
@ 2013-05-07 20:24 ` Mark Tinguely
  2013-05-07 22:22 ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Tinguely @ 2013-05-07 20:24 UTC
To: Dave Jones; +Cc: CAI Qian, xfs

On 05/07/13 15:22, Dave Jones wrote:
> On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote:
> > On 05/07/13 14:59, Dave Jones wrote:
> > > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote:
> > >
> > > > > I can hit this almost instantly with fsx. I'll do a bisect, though
> > > > > it sounds like you already have a suspect.
> > > > >
> > > >
> > > > If you want to try kmem debug of Linux 3.8 that would help.
> > >
> > > I'm not sure what that is.
> >
> > Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y".
>
> Ah, done that. (I pretty much always run with it).
>
> This is something new. Even 3.9 was fine. It's only since
> the recent xfs merge.
>
> Dave
>

git revert 666d644cd72a9ec58b353209ff191d7430f3b357

until it gets fixed.

--Mark.
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 20:24 ` Mark Tinguely @ 2013-05-07 22:22 ` Dave Chinner 2013-05-07 22:45 ` Mark Tinguely 0 siblings, 1 reply; 18+ messages in thread From: Dave Chinner @ 2013-05-07 22:22 UTC (permalink / raw) To: Mark Tinguely; +Cc: Dave Jones, CAI Qian, xfs On Tue, May 07, 2013 at 03:24:28PM -0500, Mark Tinguely wrote: > On 05/07/13 15:22, Dave Jones wrote: > >On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote: > > > On 05/07/13 14:59, Dave Jones wrote: > > > > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote: > > > > > > > > > > I can hit this almost instantly with fsx. I'll do a bisect, though > > > > > > it sounds like you already have a suspect. > > > > > > > > > > > > > > > > If you want to try kmem debug of Linux 3.8 that would help. > > > > > > > > I'm not sure what that is. > > > > > > Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y". > > > >Ah, done that. (I pretty much always run with it). > > > >This is something new. Even 3.9 was fine. It's only since > >the recent xfs merge. > > > > Dave > > > > git revert 666d644cd72a9ec58b353209ff191d7430f3b357 That won't prevent the use after free. That commit fixed a problem that could lead to a use after free, but what we are seeing here is that it has ultimately exposed a previously unknown issue that causes the use after free. Basically what is happening is that there are two commits for the EFD being processed, when there should only be one. 
I'm not sure how this is happening yet, but these three traces came out
from my debug sequentially when running generic/006:

__xfs_efi_release 0xffff88006c4d7058 2
Pid: 4756, comm: kworker/1:1H Tainted: G B 3.9.0-rc8-dgc+ #592
Call Trace:
 [<ffffffff814e43a5>] __xfs_efi_release+0x35/0x80
 [<ffffffff814e444d>] xfs_efi_item_unpin+0x5d/0x70
 [<ffffffff814dbb88>] xfs_trans_committed_bulk+0x248/0x2e0
 [<ffffffff814e18db>] xlog_cil_committed+0x3b/0x140
 [<ffffffff814ddfae>] xlog_state_do_callback+0x19e/0x3a0
.....

That's the EFI being unpinned and its reference going away during
transaction commit completion processing.

__xfs_efi_release 0xffff88006c4d7058 1
Pid: 4756, comm: kworker/1:1H Tainted: G B 3.9.0-rc8-dgc+ #592
Call Trace:
 [<ffffffff814e43a5>] __xfs_efi_release+0x35/0x80
 [<ffffffff814e4620>] xfs_efi_release+0x60/0x80
 [<ffffffff814e46a6>] xfs_efd_item_committed+0x26/0x40
 [<ffffffff814dba21>] xfs_trans_committed_bulk+0xe1/0x2e0
 [<ffffffff814e18db>] xlog_cil_committed+0x3b/0x140
 [<ffffffff814ddfae>] xlog_state_do_callback+0x19e/0x3a0
.....

That's the EFD releasing the EFI, which drops its reference count and
frees both the EFI and the EFD.

__xfs_efi_release 0xffff88006c4d7058 1802201963
Pid: 4756, comm: kworker/1:1H Tainted: G B 3.9.0-rc8-dgc+ #592
Call Trace:
 [<ffffffff814e43a5>] __xfs_efi_release+0x35/0x80
 [<ffffffff814e4630>] xfs_efi_release+0x70/0x80
 [<ffffffff814e46a6>] xfs_efd_item_committed+0x26/0x40
 [<ffffffff814dba21>] xfs_trans_committed_bulk+0xe1/0x2e0
 [<ffffffff814e18db>] xlog_cil_committed+0x3b/0x140
.....

And there's the EFI being released by an EFD after it has been
freed. It's definitely been freed because 1802201963 = 0x6b6b6b6b....

Like I said, I don't understand how this is happening yet, but just
seeing this behaviour explains an awful lot about why we've had long
term issues with this code....

FWIW, I used to run with slab debugging turned on for all my debug
builds. I switched my .config from SLAB to SLUB a few months ago,
and didn't realise this silently turned off memory poisoning. That's
how it slipped through my testing....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
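The 1802201963 in the third trace and the lone 0x6a byte in the original report are two views of the same event. A small userspace sketch, assuming a 32-bit counter and a little-endian machine such as the x86-64 laptop in the report (the helper names are invented for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Value a 32-bit counter appears to hold when read from poisoned memory. */
uint32_t poisoned_u32(void)
{
	uint32_t v;

	memset(&v, 0x6b, sizeof(v)); /* same fill byte as the slab poison */
	return v;
}

/*
 * Lowest-addressed byte of the counter after one decrement.
 * On a little-endian machine this is the least significant byte.
 */
unsigned char first_byte_after_decrement(void)
{
	uint32_t v = poisoned_u32();
	unsigned char bytes[sizeof(v)];

	v--; /* the reference drop applied to freed memory */
	memcpy(bytes, &v, sizeof(v));
	return bytes[0];
}
```

poisoned_u32() evaluates to 0x6b6b6b6b, which is 1802201963 in decimal, the count printed on entry to the third __xfs_efi_release call. One decrement later the word is 0x6b6b6b6a, stored as 6a 6b 6b 6b on little-endian hardware, which matches the "First byte 0x6a instead of 0x6b" line in the slab report.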
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 22:22 ` Dave Chinner @ 2013-05-07 22:45 ` Mark Tinguely 2013-05-07 23:54 ` Dave Chinner 0 siblings, 1 reply; 18+ messages in thread From: Mark Tinguely @ 2013-05-07 22:45 UTC (permalink / raw) To: Dave Chinner; +Cc: Dave Jones, CAI Qian, xfs On 05/07/13 17:22, Dave Chinner wrote: > On Tue, May 07, 2013 at 03:24:28PM -0500, Mark Tinguely wrote: >> On 05/07/13 15:22, Dave Jones wrote: >>> On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote: >>> > On 05/07/13 14:59, Dave Jones wrote: >>> > > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote: >>> > > >>> > > > > I can hit this almost instantly with fsx. I'll do a bisect, though >>> > > > > it sounds like you already have a suspect. >>> > > > > >>> > > > >>> > > > If you want to try kmem debug of Linux 3.8 that would help. >>> > > >>> > > I'm not sure what that is. >>> > >>> > Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y". >>> >>> Ah, done that. (I pretty much always run with it). >>> >>> This is something new. Even 3.9 was fine. It's only since >>> the recent xfs merge. >>> >>> Dave >>> >> >> git revert 666d644cd72a9ec58b353209ff191d7430f3b357 > > That won't prevent the use after free. That commit fixed a problem > that could lead to a use after free, but what we are seeing here is > that it has ultimately exposed a previously unknown issue that > causes the use after free. > > Basically what is happening is that there are two commits for the > EFD being processed, when there should only be one. I'm not sure how > this is happening yet, but these three traces came out from my debug > sequentially when running generic/006: > Sorry for the misleading statement. Yes, I agree that patch is a good thing. I meant that Dave and only Dave revert it and only to test if that patch was the change that caused the new symptom - which we know now that it is. 
I added some asserts and did not learn anything new except where the
efi item was already freed.

--Mark.
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 22:45 ` Mark Tinguely @ 2013-05-07 23:54 ` Dave Chinner 2013-05-08 0:43 ` Dave Jones 2013-05-08 13:24 ` Mark Tinguely 0 siblings, 2 replies; 18+ messages in thread From: Dave Chinner @ 2013-05-07 23:54 UTC (permalink / raw) To: Mark Tinguely; +Cc: Dave Jones, CAI Qian, xfs On Tue, May 07, 2013 at 05:45:20PM -0500, Mark Tinguely wrote: > On 05/07/13 17:22, Dave Chinner wrote: > >On Tue, May 07, 2013 at 03:24:28PM -0500, Mark Tinguely wrote: > >>On 05/07/13 15:22, Dave Jones wrote: > >>>On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote: > >>> > On 05/07/13 14:59, Dave Jones wrote: > >>> > > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote: > >>> > > > >>> > > > > I can hit this almost instantly with fsx. I'll do a bisect, though > >>> > > > > it sounds like you already have a suspect. > >>> > > > > > >>> > > > > >>> > > > If you want to try kmem debug of Linux 3.8 that would help. > >>> > > > >>> > > I'm not sure what that is. > >>> > > >>> > Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y". > >>> > >>>Ah, done that. (I pretty much always run with it). > >>> > >>>This is something new. Even 3.9 was fine. It's only since > >>>the recent xfs merge. > >>> > >>> Dave > >>> > >> > >>git revert 666d644cd72a9ec58b353209ff191d7430f3b357 > > > >That won't prevent the use after free. That commit fixed a problem > >that could lead to a use after free, but what we are seeing here is > >that it has ultimately exposed a previously unknown issue that > >causes the use after free. > > > >Basically what is happening is that there are two commits for the > >EFD being processed, when there should only be one. I'm not sure how > >this is happening yet, but these three traces came out from my debug > >sequentially when running generic/006: > > Sorry for the misleading statement. Yes, I agree that patch is a > good thing. 
I meant that Dave and only Dave revert it and only to
> test if that patch was the change that caused the new symptom -
> which we know now that it is.

Sure, I realise that, and it turns out I'm wrong - it is a bug in
commit 666d644cd. Poisoning turns a "will probably never occur"
problem into an instant reproducer, because it sets a bit in the efi
structure that is normally zero when the EFI is freed and hence
triggers a second free of the EFI when reading it after the first
free....

Dave, the patch below should fix the problem.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

xfs: Don't reference the EFI after it is freed

From: Dave Chinner <dchinner@redhat.com>

Checking the EFI for whether it is being released from recovery
after we've already released the known active reference is a mistake
worthy of a brown paper bag. Fix the (now) obvious use after free
that it can cause.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_extfree_item.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index c0f3750..98c437d 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -305,10 +305,22 @@ xfs_efi_release(xfs_efi_log_item_t	*efip,
 {
 	ASSERT(atomic_read(&efip->efi_next_extent) >= nextents);
 	if (atomic_sub_and_test(nextents, &efip->efi_next_extent)) {
+		int recovered;
+
+		/*
+		 * __xfs_efi_release() can release the last reference to the EFI
+		 * and free it, so it is unsafe to reference it after we've
+		 * released the reference. The only case this is safe to do is
+		 * if we are in recovery and the XFS_EFI_RECOVERED bit is set,
+		 * meaning that we have two references to release. Check the
+		 * recovered bit before the initial release, as we cannot
+		 * reliably check it afterwards.
+		 */
+		recovered = test_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
 		__xfs_efi_release(efip);
 
 		/* recovery needs us to drop the EFI reference, too */
-		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
+		if (recovered)
 			__xfs_efi_release(efip);
 	}
 }
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 23:54 ` Dave Chinner
@ 2013-05-08  0:43 ` Dave Jones
  2013-05-08 13:24 ` Mark Tinguely
  1 sibling, 0 replies; 18+ messages in thread
From: Dave Jones @ 2013-05-08 0:43 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Mark Tinguely, CAI Qian

On Wed, May 08, 2013 at 09:54:58AM +1000, Dave Chinner wrote:

 > Dave, the patch below should fix the problem.
 >
 > xfs: Don't reference the EFI after it is freed
 >
 > From: Dave Chinner <dchinner@redhat.com>
 >
 > Checking the EFI for whether it is being released from recovery
 > after we've already released the known active reference is a mistake
 > worthy of a brown paper bag. Fix the (now) obvious use after free
 > that it can cause.

Yup, that works for me.

thanks,

	Dave
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-07 23:54 ` Dave Chinner
  2013-05-08  0:43 ` Dave Jones
@ 2013-05-08 13:24 ` Mark Tinguely
  2013-05-10  1:38 ` Dave Chinner
  1 sibling, 1 reply; 18+ messages in thread
From: Mark Tinguely @ 2013-05-08 13:24 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Dave Jones, CAI Qian, xfs

On 05/07/13 18:54, Dave Chinner wrote:


Checking the EFI for whether it is being released from recovery
after we've already released the known active reference is a mistake
worthy of a brown paper bag. Fix the (now) obvious use after free
that it can cause.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_extfree_item.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

Looks good.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-08 13:24 ` Mark Tinguely
@ 2013-05-10  1:38 ` Dave Chinner
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Chinner @ 2013-05-10 1:38 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: Dave Jones, CAI Qian, xfs

On Wed, May 08, 2013 at 08:24:35AM -0500, Mark Tinguely wrote:
> On 05/07/13 18:54, Dave Chinner wrote:
>
>
> Checking the EFI for whether it is being released from recovery
> after we've already released the known active reference is a mistake
> worthy of a brown paper bag. Fix the (now) obvious use after free
> that it can cause.
>
> Reported-by: Dave Jones <davej@redhat.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_extfree_item.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> Looks good.
>
> Reviewed-by: Mark Tinguely <tinguely@sgi.com>

Zach pointed out that the fix is much more complex than it needs to
be. I'll respin the patch and resend it later today.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) 2013-05-07 13:37 xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) Dave Jones 2013-05-07 19:04 ` Mark Tinguely @ 2013-05-11 12:09 ` Dmitry Monakhov 2013-05-12 0:40 ` Dave Chinner 1 sibling, 1 reply; 18+ messages in thread From: Dmitry Monakhov @ 2013-05-11 12:09 UTC (permalink / raw) To: Dave Jones, Linux Kernel; +Cc: xfs On Tue, 7 May 2013 09:37:07 -0400, Dave Jones <davej@redhat.com> wrote: > started compiling a kernel, and then... > I've bisected this one. commit 666d644cd72a9ec58b353209ff191d7430f3b357 Author: Dave Chinner <dchinner@redhat.com> Date: Wed Apr 3 14:09:21 2013 +1100 xfs: don't free EFIs before the EFDs are committed #xfstests ./check generic/007 generic/007 generic/007 > [ 172.233200] ============================================================================= > [ 172.233205] BUG xfs_efi_item (Not tainted): Poison overwritten > [ 172.233207] ----------------------------------------------------------------------------- > > [ 172.233210] Disabling lock debugging due to kernel taint > [ 172.233213] INFO: 0xffff8800aaac4ea8-0xffff8800aaac4ea8. 
First byte 0x6a instead of 0x6b > [ 172.233235] INFO: Allocated in kmem_zone_alloc+0x67/0xf0 [xfs] age=29290 cpu=1 pid=1357 > [ 172.233239] __slab_alloc+0x468/0x52c > [ 172.233243] kmem_cache_alloc+0x2b4/0x320 > [ 172.233256] kmem_zone_alloc+0x67/0xf0 [xfs] > [ 172.233269] kmem_zone_zalloc+0x14/0x40 [xfs] > [ 172.233287] xfs_efi_init+0x41/0xa0 [xfs] > [ 172.233305] xfs_trans_get_efi+0x58/0x90 [xfs] > [ 172.233320] xfs_bmap_finish+0x76/0x1b0 [xfs] > [ 172.233338] xfs_itruncate_extents+0x2cd/0x610 [xfs] > [ 172.233353] xfs_inactive+0x401/0x530 [xfs] > [ 172.233367] xfs_fs_evict_inode+0x8c/0x1b0 [xfs] > [ 172.233370] evict+0xa3/0x1a0 > [ 172.233372] iput+0xf5/0x180 > [ 172.233374] dput+0x208/0x2f0 > [ 172.233378] SYSC_renameat+0x3be/0x430 > [ 172.233380] SyS_renameat+0xe/0x10 > [ 172.233383] SyS_rename+0x1b/0x20 > [ 172.233399] INFO: Freed in xfs_efi_item_free+0x21/0x40 [xfs] age=27303 cpu=1 pid=177 > [ 172.233402] __slab_free+0x41/0x39f > [ 172.233405] kmem_cache_free+0x326/0x370 > [ 172.233420] xfs_efi_item_free+0x21/0x40 [xfs] > [ 172.233435] __xfs_efi_release+0x4e/0x60 [xfs] > [ 172.233449] xfs_efi_release+0x50/0x70 [xfs] > [ 172.233463] xfs_efd_item_committed+0x22/0x40 [xfs] > [ 172.233479] xfs_trans_committed_bulk+0xcf/0x290 [xfs] > [ 172.233494] xlog_cil_committed+0x37/0x110 [xfs] > [ 172.233510] xlog_state_do_callback+0x1b8/0x3f0 [xfs] > [ 172.233525] xlog_state_done_syncing+0xf2/0x110 [xfs] > [ 172.233539] xlog_iodone+0x87/0x110 [xfs] > [ 172.233550] xfs_buf_iodone_work+0x5e/0xd0 [xfs] > [ 172.233554] process_one_work+0x211/0x700 > [ 172.233556] worker_thread+0x11d/0x3a0 > [ 172.233559] kthread+0xed/0x100 > [ 172.233562] ret_from_fork+0x7c/0xb0 > [ 172.233565] INFO: Slab 0xffffea0002aab100 objects=22 used=22 fp=0x (null) flags=0x20000000004080 > [ 172.233567] INFO: Object 0xffff8800aaac4e38 @offset=3640 fp=0xffff8800aaac7058 > > [ 172.233570] Bytes b4 ffff8800aaac4e28: 07 a2 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ > [ 172.233573] Object 
ffff8800aaac4e38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233575] Object ffff8800aaac4e48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233577] Object ffff8800aaac4e58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233579] Object ffff8800aaac4e68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233581] Object ffff8800aaac4e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233583] Object ffff8800aaac4e88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233586] Object ffff8800aaac4e98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233588] Object ffff8800aaac4ea8: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk > [ 172.233590] Object ffff8800aaac4eb8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233592] Object ffff8800aaac4ec8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233594] Object ffff8800aaac4ed8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233597] Object ffff8800aaac4ee8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233599] Object ffff8800aaac4ef8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233601] Object ffff8800aaac4f08: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233603] Object ffff8800aaac4f18: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233605] Object ffff8800aaac4f28: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233608] Object ffff8800aaac4f38: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233610] Object ffff8800aaac4f48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233612] Object ffff8800aaac4f58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 
172.233614] Object ffff8800aaac4f68: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233616] Object ffff8800aaac4f78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233618] Object ffff8800aaac4f88: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233620] Object ffff8800aaac4f98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233622] Object ffff8800aaac4fa8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [ 172.233625] Object ffff8800aaac4fb8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. > [ 172.233627] Redzone ffff8800aaac4fc8: bb bb bb bb bb bb bb bb ........ > [ 172.233629] Padding ffff8800aaac5108: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ > [ 172.233632] CPU: 1 PID: 1658 Comm: gcc Tainted: G B 3.9.0+ #1 [loadavg: 0.68 0.43 0.17 4/333 1660] > [ 172.233636] Hardware name: LENOVO 2356JK8/2356JK8, BIOS G7ET92WW (2.52 ) 02/18/2013 > [ 172.233638] ffffea0002aab100 ffff8800a4b03918 ffffffff815f5588 ffff8800a4b03968 > [ 172.233644] ffffffff81185944 0000000000000008 ffff880000000001 ffffffff81078016 > [ 172.233648] ffff8800aaac4ea8 ffff8800aaac4ea9 ffff880115937cc0 000000000000006b > [ 172.233653] Call Trace: > [ 172.233658] [<ffffffff815f5588>] dump_stack+0x19/0x1b > [ 172.233662] [<ffffffff81185944>] print_trailer+0x154/0x210 > [ 172.233666] [<ffffffff81078016>] ? perf_trace_sched_process_template+0xe6/0x100 > [ 172.233669] [<ffffffff81185b3f>] check_bytes_and_report+0xcf/0x110 > [ 172.233673] [<ffffffff81187037>] check_object+0x1e7/0x260 > [ 172.233676] [<ffffffff811864f7>] ? check_slab+0x87/0x130 > [ 172.233691] [<ffffffffa0543fa7>] ? kmem_zone_alloc+0x67/0xf0 [xfs] > [ 172.233694] [<ffffffff815f2ea3>] alloc_debug_processing+0x76/0x118 > [ 172.233697] [<ffffffff815f3b8d>] __slab_alloc+0x468/0x52c > [ 172.233716] [<ffffffffa0592787>] ? xfs_iext_bno_to_ext+0x97/0x180 [xfs] > [ 172.233731] [<ffffffffa0543fa7>] ? 
kmem_zone_alloc+0x67/0xf0 [xfs] > [ 172.233735] [<ffffffff8118a754>] kmem_cache_alloc+0x2b4/0x320 > [ 172.233748] [<ffffffffa0543fa7>] ? kmem_zone_alloc+0x67/0xf0 [xfs] > [ 172.233762] [<ffffffffa0543fa7>] kmem_zone_alloc+0x67/0xf0 [xfs] > [ 172.233776] [<ffffffffa0544044>] kmem_zone_zalloc+0x14/0x40 [xfs] > [ 172.233792] [<ffffffffa05ae031>] xfs_efi_init+0x41/0xa0 [xfs] > [ 172.233809] [<ffffffffa05b32b8>] xfs_trans_get_efi+0x58/0x90 [xfs] > [ 172.233825] [<ffffffffa055a0a6>] xfs_bmap_finish+0x76/0x1b0 [xfs] > [ 172.233844] [<ffffffffa058ef0d>] xfs_itruncate_extents+0x2cd/0x610 [xfs] > [ 172.233859] [<ffffffffa0540e41>] xfs_inactive+0x401/0x530 [xfs] > [ 172.233874] [<ffffffffa053dbec>] xfs_fs_evict_inode+0x8c/0x1b0 [xfs] > [ 172.233877] [<ffffffff811c1033>] evict+0xa3/0x1a0 > [ 172.233880] [<ffffffff811c19c5>] iput+0xf5/0x180 > [ 172.233883] [<ffffffff811bcf18>] dput+0x208/0x2f0 > [ 172.233887] [<ffffffff811b43ae>] SYSC_renameat+0x3be/0x430 > [ 172.233892] [<ffffffff810ad32e>] ? lock_release_holdtime.part.30+0xee/0x170 > [ 172.233895] [<ffffffff8107c18d>] ? get_parent_ip+0xd/0x50 > [ 172.233899] [<ffffffff810afe7d>] ? trace_hardirqs_on+0xd/0x10 > [ 172.233904] [<ffffffff8100e358>] ? syscall_trace_enter+0x18/0x230 > [ 172.233907] [<ffffffff811b67fe>] SyS_renameat+0xe/0x10 > [ 172.233911] [<ffffffff811b681b>] SyS_rename+0x1b/0x20 > [ 172.233914] [<ffffffff816051d4>] tracesys+0xdd/0xe2 > [ 172.233917] FIX xfs_efi_item: Restoring 0xffff8800aaac4ea8-0xffff8800aaac4ea8=0x6b > [ 172.233919] FIX xfs_efi_item: Marking all objects used > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
  2013-05-11 12:09 ` Dmitry Monakhov
@ 2013-05-12  0:40 ` Dave Chinner
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Chinner @ 2013-05-12 0:40 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: Dave Jones, Linux Kernel, xfs

On Sat, May 11, 2013 at 04:09:40PM +0400, Dmitry Monakhov wrote:
> On Tue, 7 May 2013 09:37:07 -0400, Dave Jones <davej@redhat.com> wrote:
> > started compiling a kernel, and then...
> >
> I've bisected this one.
> commit 666d644cd72a9ec58b353209ff191d7430f3b357
> Author: Dave Chinner <dchinner@redhat.com>
> Date: Wed Apr 3 14:09:21 2013 +1100
>
> xfs: don't free EFIs before the EFDs are committed
>
> #xfstests ./check generic/007 generic/007 generic/007

Yup, we already have a candidate fix for it. Thanks for another
independent confirmation of the cause, though.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
end of thread

Thread overview: 18+ messages
2013-05-07 13:37 xfs_efi_item slab corruption. (v3.9-10936-g51a26ae) Dave Jones
2013-05-07 19:04 ` Mark Tinguely
2013-05-07 19:07 ` Dave Jones
2013-05-07 19:24 ` Mark Tinguely
2013-05-07 19:31 ` Dave Jones
2013-05-07 19:58 ` Mark Tinguely
2013-05-07 19:59 ` Dave Jones
2013-05-07 20:04 ` Mark Tinguely
2013-05-07 20:22 ` Dave Jones
2013-05-07 20:24 ` Mark Tinguely
2013-05-07 22:22 ` Dave Chinner
2013-05-07 22:45 ` Mark Tinguely
2013-05-07 23:54 ` Dave Chinner
2013-05-08  0:43 ` Dave Jones
2013-05-08 13:24 ` Mark Tinguely
2013-05-10  1:38 ` Dave Chinner
2013-05-11 12:09 ` Dmitry Monakhov
2013-05-12  0:40 ` Dave Chinner