* XFS reclaim lock order bug
@ 2010-11-23 12:18 Nick Piggin
2010-11-23 21:12 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-23 12:18 UTC (permalink / raw)
To: xfs
Hi,
IIRC I've reported this before. Perhaps it is a false positive, but even
so it is still annoying that it triggers and turns off lockdep for
subsequent debugging.
Any chance it can get fixed or properly annotated?
Thanks,
Nick
[ 286.895008]
[ 286.895010] =================================
[ 286.895020] [ INFO: inconsistent lock state ]
[ 286.895020] 2.6.37-rc3+ #28
[ 286.895020] ---------------------------------
[ 286.895020] inconsistent {RECLAIM_FS-ON-R} -> {IN-RECLAIM_FS-W} usage.
[ 286.895020] rm/1844 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 286.895020] (&(&ip->i_iolock)->mr_lock#2){++++-+}, at: [<ffffffffa0067e58>] xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] {RECLAIM_FS-ON-R} state was registered at:
[ 286.895020] [<ffffffff8108380b>] mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff810838d1>] lockdep_trace_alloc+0x91/0xd0
[ 286.895020] [<ffffffff810d1851>] __alloc_pages_nodemask+0x91/0x780
[ 286.895020] [<ffffffff8110a043>] alloc_page_vma+0x93/0x150
[ 286.895020] [<ffffffff810ed909>] handle_mm_fault+0x719/0x9a0
[ 286.895020] [<ffffffff816068e3>] do_page_fault+0x133/0x4f0
[ 286.895020] [<ffffffff816039df>] page_fault+0x1f/0x30
[ 286.895020] [<ffffffff810cb52a>] generic_file_aio_read+0x2fa/0x730
[ 286.895020] [<ffffffffa009a29b>] xfs_file_aio_read+0x15b/0x390 [xfs]
[ 286.895020] [<ffffffff81117812>] do_sync_read+0xd2/0x110
[ 286.895020] [<ffffffff81117b55>] vfs_read+0xc5/0x190
[ 286.895020] [<ffffffff8111846c>] sys_read+0x4c/0x80
[ 286.895020] [<ffffffff8100312b>] system_call_fastpath+0x16/0x1b
[ 286.895020] irq event stamp: 1095103
[ 286.895020] hardirqs last enabled at (1095103): [<ffffffff81603215>] _raw_spin_unlock_irqrestore+0x65/0x80
[ 286.895020] hardirqs last disabled at (1095102): [<ffffffff81602be7>] _raw_spin_lock_irqsave+0x17/0x60
[ 286.895020] softirqs last enabled at (1093048): [<ffffffff81050c4e>] __do_softirq+0x16e/0x360
[ 286.895020] softirqs last disabled at (1093009): [<ffffffff81003fcc>] call_softirq+0x1c/0x50
[ 286.895020]
[ 286.895020] other info that might help us debug this:
[ 286.895020] 3 locks held by rm/1844:
[ 286.895020] #0: (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [<ffffffff81124cdc>] do_lookup+0xfc/0x170
[ 286.895020] #1: (shrinker_rwsem){++++..}, at: [<ffffffff810d8f68>] shrink_slab+0x38/0x190
[ 286.895020] #2: (&pag->pag_ici_reclaim_lock){+.+...}, at: [<ffffffffa00a2944>] xfs_reclaim_inodes_ag+0xa4/0x370 [xfs]
[ 286.895020]
[ 286.895020] stack backtrace:
[ 286.895020] Pid: 1844, comm: rm Not tainted 2.6.37-rc3+ #28
[ 286.895020] Call Trace:
[ 286.895020] [<ffffffff810828f0>] print_usage_bug+0x170/0x180
[ 286.895020] [<ffffffff810835b1>] mark_lock+0x211/0x400
[ 286.895020] [<ffffffff810841ae>] __lock_acquire+0x40e/0x1490
[ 286.895020] [<ffffffff810852c5>] lock_acquire+0x95/0x1b0
[ 286.895020] [<ffffffffa0067e58>] ? xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa00a2774>] ? xfs_reclaim_inode+0x174/0x2a0 [xfs]
[ 286.895020] [<ffffffff81073c3a>] down_write_nested+0x4a/0x70
[ 286.895020] [<ffffffffa0067e58>] ? xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa0067e58>] xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa00a27c0>] xfs_reclaim_inode+0x1c0/0x2a0 [xfs]
[ 286.895020] [<ffffffffa00a2aaf>] xfs_reclaim_inodes_ag+0x20f/0x370 [xfs]
[ 286.895020] [<ffffffffa00a2c88>] xfs_reclaim_inode_shrink+0x78/0x80 [xfs]
[ 286.895020] [<ffffffff810d9057>] shrink_slab+0x127/0x190
[ 286.895020] [<ffffffff810dbf09>] zone_reclaim+0x349/0x420
[ 286.895020] [<ffffffff810cf815>] ? zone_watermark_ok+0x25/0xe0
[ 286.895020] [<ffffffff810d1603>] get_page_from_freelist+0x673/0x830
[ 286.895020] [<ffffffff8110c013>] ? init_object+0x43/0x80
[ 286.895020] [<ffffffffa00929ec>] ? kmem_zone_alloc+0x8c/0xd0 [xfs]
[ 286.895020] [<ffffffff8108380b>] ? mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff8108380b>] ? mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff810d18d0>] __alloc_pages_nodemask+0x110/0x780
[ 286.895020] [<ffffffff8110eb3a>] ? unfreeze_slab+0x11a/0x160
[ 286.895020] [<ffffffff811089d6>] alloc_pages_current+0x76/0xf0
[ 286.895020] [<ffffffff8110ce45>] new_slab+0x205/0x2b0
[ 286.895020] [<ffffffff8110ef7c>] __slab_alloc+0x30c/0x480
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8110fe18>] kmem_cache_alloc+0xf8/0x1a0
[ 286.895020] [<ffffffff8112f1e0>] ? __d_lookup+0x1c0/0x1f0
[ 286.895020] [<ffffffff8112f020>] ? __d_lookup+0x0/0x1f0
[ 286.895020] [<ffffffff8112f372>] d_alloc+0x22/0x200
[ 286.895020] [<ffffffff811236fb>] d_alloc_and_lookup+0x2b/0x90
[ 286.895020] [<ffffffff8112f24c>] ? d_lookup+0x3c/0x60
[ 286.895020] [<ffffffff81124cfa>] do_lookup+0x11a/0x170
[ 286.895020] [<ffffffff81125a4a>] link_path_walk+0x31a/0xa50
[ 286.895020] [<ffffffff81126292>] path_walk+0x62/0xe0
[ 286.895020] [<ffffffff8112636b>] do_path_lookup+0x5b/0x60
[ 286.895020] [<ffffffff81126fe2>] user_path_at+0x52/0xa0
[ 286.895020] [<ffffffff8110e8d5>] ? kmem_cache_free+0xe5/0x190
[ 286.895020] [<ffffffff81083b4d>] ? trace_hardirqs_on+0xd/0x10
[ 286.895020] [<ffffffff81126760>] ? do_unlinkat+0x60/0x1d0
[ 286.895020] [<ffffffff8111d077>] vfs_fstatat+0x37/0x70
[ 286.895020] [<ffffffff8111d21f>] sys_newfstatat+0x1f/0x50
[ 286.895020] [<ffffffff81083afd>] ? trace_hardirqs_on_caller+0x13d/0x180
[ 286.895020] [<ffffffff816028c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 286.895020] [<ffffffff8100312b>] system_call_fastpath+0x16/0x1b
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS reclaim lock order bug
  2010-11-23 12:18 XFS reclaim lock order bug Nick Piggin
@ 2010-11-23 21:12 ` Dave Chinner
  2010-11-24  0:58   ` Nick Piggin
  2010-11-24 20:03   ` Christoph Hellwig
  0 siblings, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-23 21:12 UTC (permalink / raw)
To: Nick Piggin; +Cc: xfs

On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> Hi,
>
> IIRC I've reported this before. Perhaps it is a false positive, but even
> so it is still annoying that it triggers and turns off lockdep for
> subsequent debugging.
>
> Any chance it can get fixed or properly annotated?

It is supposed to be handled by the re-initialisation of the
ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
in the reclaim state must have passed through this reinitialisation,
so from a lockdep perspective the iolock in the vfs path is a
different context to the iolock in the reclaim path. That fixed all
the non-reclaim state related lockdep false positives, so perhaps
there is an issue with the lockdep reclaim state checking that does
not interact well with re-initialised lock contexts?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
  2010-11-23 21:12 ` Dave Chinner
@ 2010-11-24  0:58   ` Nick Piggin
  2010-11-24  2:26     ` Dave Chinner
  1 sibling, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-24 0:58 UTC (permalink / raw)
To: Dave Chinner; +Cc: Nick Piggin, xfs

On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> > Hi,
> >
> > IIRC I've reported this before. Perhaps it is a false positive, but even
> > so it is still annoying that it triggers and turns off lockdep for
> > subsequent debugging.
> >
> > Any chance it can get fixed or properly annotated?
>
> It is supposed to be handled by the re-initialisation of the
> ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> in the reclaim state must have passed through this reinitialisation,
> so from a lockdep perspective the iolock in the vfs path is a
> different context to the iolock in the reclaim path. That fixed all
> the non-reclaim state related lockdep false positives, so perhaps
> there is an issue with the lockdep reclaim state checking that does
> not interact well with re-initialised lock contexts?

Hmm. I suppose that should work.

So xfs_reclaim_inode can only call xfs_ilock _after_ the Linux inode
has gone through ->evict_inode call?

If so, then let's ask the lockdep people.

Thanks,
Nick
* Re: XFS reclaim lock order bug
  2010-11-24  0:58 ` Nick Piggin
@ 2010-11-24  2:26   ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-24 2:26 UTC (permalink / raw)
To: Nick Piggin; +Cc: xfs

On Wed, Nov 24, 2010 at 11:58:11AM +1100, Nick Piggin wrote:
> On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> > > Hi,
> > >
> > > IIRC I've reported this before. Perhaps it is a false positive, but even
> > > so it is still annoying that it triggers and turns off lockdep for
> > > subsequent debugging.
> > >
> > > Any chance it can get fixed or properly annotated?
> >
> > It is supposed to be handled by the re-initialisation of the
> > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > in the reclaim state must have passed through this reinitialisation,
> > so from a lockdep perspective the iolock in the vfs path is a
> > different context to the iolock in the reclaim path. That fixed all
> > the non-reclaim state related lockdep false positives, so perhaps
> > there is an issue with the lockdep reclaim state checking that does
> > not interact well with re-initialised lock contexts?
>
> Hmm. I suppose that should work.
>
> So xfs_reclaim_inode can only call xfs_ilock _after_ the Linux inode
> has gone through ->evict_inode call?

I think so - it's only found by xfs_reclaim_inode() if the radix tree
reclaim tag is set for the inode, which is only set in two places:
->destroy_inode(), and in a failed lookup (which doesn't touch the
iolock). I can't see any other way we can get to an inode in reclaim...

> If so, then let's ask the lockdep people.

Sure.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
  2010-11-23 21:12 ` Dave Chinner
  2010-11-24  0:58   ` Nick Piggin
@ 2010-11-24 20:03   ` Christoph Hellwig
  2010-11-25  3:48     ` Nick Piggin
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2010-11-24 20:03 UTC (permalink / raw)
To: Dave Chinner; +Cc: Nick Piggin, xfs

On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> It is supposed to be handled by the re-initialisation of the
> ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> in the reclaim state must have passed through this reinitialisation,
> so from a lockdep perspective the iolock in the vfs path is a
> different context to the iolock in the reclaim path. That fixed all
> the non-reclaim state related lockdep false positives, so perhaps
> there is an issue with the lockdep reclaim state checking that does
> not interact well with re-initialised lock contexts?

I've been looking through this again, and I think it's indeed not
enough. We don't just need to re-initialize it, but also set a
different lockclass for it.
* Re: XFS reclaim lock order bug
  2010-11-24 20:03 ` Christoph Hellwig
@ 2010-11-25  3:48   ` Nick Piggin
  2010-11-25  6:25     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-25 3:48 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Peter Zijlstra, Ingo Molnar, Nick Piggin, xfs

On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > It is supposed to be handled by the re-initialisation of the
> > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > in the reclaim state must have passed through this reinitialisation,
> > so from a lockdep perspective the iolock in the vfs path is a
> > different context to the iolock in the reclaim path. That fixed all
> > the non-reclaim state related lockdep false positives, so perhaps
> > there is an issue with the lockdep reclaim state checking that does
> > not interact well with re-initialised lock contexts?
>
> I've been looking through this again, and I think it's indeed not
> enough. We don't just need to re-initialize it, but also set a
> different lockclass for it.

Doesn't init_rwsem give it a new class?

Guys, can you take a quick look at the code Dave is referring to
(xfs_fs_evict_inode), and check that it actually does what he
intends? We're getting what seems to be false positives in reclaim
inversion of lockings. Backtraces here

http://oss.sgi.com/pipermail/xfs/2010-November/048092.html

Thanks,
Nick
* Re: XFS reclaim lock order bug
  2010-11-25  3:48 ` Nick Piggin
@ 2010-11-25  6:25   ` Peter Zijlstra
  2010-11-25  7:08     ` Nick Piggin
  2010-11-25 10:29     ` Dave Chinner
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 6:25 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Hellwig, Ingo Molnar, xfs

On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > It is supposed to be handled by the re-initialisation of the
> > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > in the reclaim state must have passed through this reinitialisation,
> > > so from a lockdep perspective the iolock in the vfs path is a
> > > different context to the iolock in the reclaim path. That fixed all
> > > the non-reclaim state related lockdep false positives, so perhaps
> > > there is an issue with the lockdep reclaim state checking that does
> > > not interact well with re-initialised lock contexts?
> >
> > I've been looking through this again, and I think it's indeed not
> > enough. We don't just need to re-initialize it, but also set a
> > different lockclass for it.
>
> Doesn't init_rwsem give it a new class?

Per call-site, yes it should.

> Guys, can you take a quick look at the code Dave is referring to
> (xfs_fs_evict_inode), and check that it actually does what he
> intends?

Right, so this is trying to set a different class from the regular init
site, which (/me applies grep) lives in xfs_inode_alloc(), right?

Ought to work.. assuming the inode will be fully destroyed and new
inodes are always obtained through xfs_inode_alloc() and not reused.

> We're getting what seems to be false positives in reclaim inversion
> of lockings. Backtraces here
> http://oss.sgi.com/pipermail/xfs/2010-November/048092.html

Right, so there it's holding the inode in the read path while taking a
page-fault which does an allocation.

vs

acquiring the inode in the xfs_reclaim_inode_shrink() path.

Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
possible to end up in that read path?

Something like the below would give the lock-class an explicit name;
because both sites now use the exact same init, they're called:

 (&(&ip->i_iolock)->mr_lock)
 (&(&ip->i_iolock)->mr_lock#2)

which is hard to tell apart, but I suspect #2 is the dead one, since
they get numbered in order of appearance and it's hard to have a dead
inode before having a live one ;-)

In that case though, it would suggest the inode got re-used instead of
destroyed and re-created using xfs_inode_alloc(), is that at all
possible?

---
 fs/xfs/linux-2.6/xfs_super.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 064f964..721c1c5 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1091,6 +1091,8 @@ xfs_fs_write_inode(
 	return -error;
 }
 
+static struct lock_class_key xfs_dead_inode;
+
 STATIC void
 xfs_fs_evict_inode(
 	struct inode	*inode)
@@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
 	 */
 	ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
 	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
+	lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
+			"xfs_dead_inode");
 
 	xfs_inactive(ip);
 }
* Re: XFS reclaim lock order bug
  2010-11-25  6:25 ` Peter Zijlstra
@ 2010-11-25  7:08   ` Nick Piggin
  2010-11-25  7:28     ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-25 7:08 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs

On Thu, Nov 25, 2010 at 07:25:25AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> > On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > > It is supposed to be handled by the re-initialisation of the
> > > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > > in the reclaim state must have passed through this reinitialisation,
> > > > so from a lockdep perspective the iolock in the vfs path is a
> > > > different context to the iolock in the reclaim path. That fixed all
> > > > the non-reclaim state related lockdep false positives, so perhaps
> > > > there is an issue with the lockdep reclaim state checking that does
> > > > not interact well with re-initialised lock contexts?
> > >
> > > I've been looking through this again, and I think it's indeed not
> > > enough. We don't just need to re-initialize it, but also set a
> > > different lockclass for it.
> >
> > Doesn't init_rwsem give it a new class?
>
> Per call-site, yes it should.
>
> > Guys, can you take a quick look at the code Dave is referring to
> > (xfs_fs_evict_inode), and check that it actually does what he
> > intends?
>
> Right, so this is trying to set a different class from the regular init
> site, which (/me applies grep) lives in xfs_inode_alloc(), right?
>
> Ought to work.. assuming the inode will be fully destroyed and new
> inodes are always obtained through xfs_inode_alloc() and not reused.
>
> > We're getting what seems to be false positives in reclaim inversion
> > of lockings. Backtraces here
> > http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
>
> Right, so there it's holding the inode in the read path while taking a
> page-fault which does an allocation.
>
> vs
>
> acquiring the inode in the xfs_reclaim_inode_shrink() path.
>
> Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
> possible to end up in that read path?

I think that's the idea.

> Something like the below would give the lock-class an explicit name;
> because both sites now use the exact same init, they're called:
>
>  (&(&ip->i_iolock)->mr_lock)
>  (&(&ip->i_iolock)->mr_lock#2)
>
> which is hard to tell apart, but I suspect #2 is the dead one, since
> they get numbered in order of appearance and it's hard to have a dead
> inode before having a live one ;-)
>
> In that case though, it would suggest the inode got re-used instead of
> destroyed and re-created using xfs_inode_alloc(), is that at all
> possible?

Ah, I see what you mean. An inode that has been through the
evict_inode path is now found to be locked for read(2). The rwsem is
in the init_always path of the allocator, so it seems like it's
getting reused after evict_inode. Dave?

> ---
>  fs/xfs/linux-2.6/xfs_super.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
> index 064f964..721c1c5 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1091,6 +1091,8 @@ xfs_fs_write_inode(
>  	return -error;
>  }
>  
> +static struct lock_class_key xfs_dead_inode;
> +
>  STATIC void
>  xfs_fs_evict_inode(
>  	struct inode	*inode)
> @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
>  	 */
>  	ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
>  	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> +	lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> +			"xfs_dead_inode");
>  
>  	xfs_inactive(ip);
>  }

With this change, I assume the mrlock_init can go? (it would be nice
to have a wrapper to allocate the class by itself)
* Re: XFS reclaim lock order bug
  2010-11-25  7:08 ` Nick Piggin
@ 2010-11-25  7:28   ` Peter Zijlstra
  2010-11-25 10:32     ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 7:28 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Hellwig, Ingo Molnar, xfs

On Thu, 2010-11-25 at 18:08 +1100, Nick Piggin wrote:
> > +static struct lock_class_key xfs_dead_inode;
> > +
> >  STATIC void
> >  xfs_fs_evict_inode(
> >  	struct inode	*inode)
> > @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
> >  	 */
> >  	ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> >  	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> > +	lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> > +			"xfs_dead_inode");
> >
> >  	xfs_inactive(ip);
> >  }
>
> With this change, I assume the mrlock_init can go? (it would be nice
> to have a wrapper to allocate the class by itself)

mrlock_init() does allocate a class (well, init_rwsem does), but sets
the name to a stringified version of the lock argument.

The lockdep_set_class*() interface is only guaranteed to work on a
freshly initialized lock structure -- which in this case is a bit of a
waste, but for debugging purposes would allow setting a clearer name.

Alternatively, you can write the code like:

	xfs_inode_t *dead_ip = XFS_I(inode);

	mrlock_init(&dead_ip->i_iolock, ...);

In which case it's also obvious, as that would result in:

 (&(&dead_ip->i_iolock)->mr_lock)

as opposed to:

 (&(&ip->i_iolock)->mr_lock)
* Re: XFS reclaim lock order bug
  2010-11-25  7:28 ` Peter Zijlstra
@ 2010-11-25 10:32   ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-25 10:32 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs

On Thu, Nov 25, 2010 at 08:28:17AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 18:08 +1100, Nick Piggin wrote:
> > > +static struct lock_class_key xfs_dead_inode;
> > > +
> > >  STATIC void
> > >  xfs_fs_evict_inode(
> > >  	struct inode	*inode)
> > > @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
> > >  	 */
> > >  	ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> > >  	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> > > +	lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> > > +			"xfs_dead_inode");
> > >
> > >  	xfs_inactive(ip);
> > >  }
> >
> > With this change, I assume the mrlock_init can go? (it would be nice
> > to have a wrapper to allocate the class by itself)
>
> mrlock_init() does allocate a class (well, init_rwsem does), but sets
> the name to a stringified version of the lock argument.
>
> The lockdep_set_class*() interface is only guaranteed to work on a
> freshly initialized lock structure -- which in this case is a bit of a
> waste, but for debugging purposes would allow setting a clearer name.
>
> Alternatively, you can write the code like:
>
> 	xfs_inode_t *dead_ip = XFS_I(inode);
>
> 	mrlock_init(&dead_ip->i_iolock, ...);
>
> In which case it's also obvious, as that would result in:
>
>  (&(&dead_ip->i_iolock)->mr_lock)
>
> as opposed to:
>
>  (&(&ip->i_iolock)->mr_lock)

Ok, that's a handy trick to know. I'll try and sort this out tomorrow
and make use of this trick to help identify the different lock
classes.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
  2010-11-25  6:25 ` Peter Zijlstra
  2010-11-25  7:08   ` Nick Piggin
@ 2010-11-25 10:29   ` Dave Chinner
  2010-11-25 10:36     ` Peter Zijlstra
  2010-11-25 11:25     ` Christoph Hellwig
  1 sibling, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-25 10:29 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs

On Thu, Nov 25, 2010 at 07:25:25AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> > On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > > It is supposed to be handled by the re-initialisation of the
> > > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > > in the reclaim state must have passed through this reinitialisation,
> > > > so from a lockdep perspective the iolock in the vfs path is a
> > > > different context to the iolock in the reclaim path. That fixed all
> > > > the non-reclaim state related lockdep false positives, so perhaps
> > > > there is an issue with the lockdep reclaim state checking that does
> > > > not interact well with re-initialised lock contexts?
> > >
> > > I've been looking through this again, and I think it's indeed not
> > > enough. We don't just need to re-initialize it, but also set a
> > > different lockclass for it.
> >
> > Doesn't init_rwsem give it a new class?
>
> Per call-site, yes it should.
>
> > Guys, can you take a quick look at the code Dave is referring to
> > (xfs_fs_evict_inode), and check that it actually does what he
> > intends?
>
> Right, so this is trying to set a different class from the regular init
> site, which (/me applies grep) lives in xfs_inode_alloc(), right?
>
> Ought to work.. assuming the inode will be fully destroyed and new
> inodes are always obtained through xfs_inode_alloc() and not reused.
>
> > We're getting what seems to be false positives in reclaim inversion
> > of lockings. Backtraces here
> > http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
>
> Right, so there it's holding the inode in the read path while taking a
> page-fault which does an allocation.
>
> vs
>
> acquiring the inode in the xfs_reclaim_inode_shrink() path.
>
> Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
> possible to end up in that read path?
>
> Something like the below would give the lock-class an explicit name;
> because both sites now use the exact same init, they're called:
>
>  (&(&ip->i_iolock)->mr_lock)
>  (&(&ip->i_iolock)->mr_lock#2)
>
> which is hard to tell apart, but I suspect #2 is the dead one, since
> they get numbered in order of appearance and it's hard to have a dead
> inode before having a live one ;-)
>
> In that case though, it would suggest the inode got re-used instead of
> destroyed and re-created using xfs_inode_alloc(), is that at all
> possible?

Yes, actually it is - see the XFS_IRECLAIMABLE case in
xfs_iget_cache_hit(). I guess we haven't seen the original lock
inversion false positives that this was supposed to fix because the
reclaim warnings trip first...

I think that means we also need to reinitialise the lock when we recycle
the inode out of the XFS_IRECLAIMABLE state.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
  2010-11-25 10:29 ` Dave Chinner
@ 2010-11-25 10:36   ` Peter Zijlstra
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 10:36 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs

On Thu, 2010-11-25 at 21:29 +1100, Dave Chinner wrote:
> > In that case though, it would suggest the inode got re-used instead of
> > destroyed and re-created using xfs_inode_alloc(), is that at all
> > possible?
>
> Yes, actually it is - see the XFS_IRECLAIMABLE case in
> xfs_iget_cache_hit(). I guess we haven't seen the original lock
> inversion false positives that this was supposed to fix because the
> reclaim warnings trip first...
>
> I think that means we also need to reinitialise the lock when we recycle
> the inode out of the XFS_IRECLAIMABLE state.

Right, in which case you probably want an explicit lock class and use
that one class in both xfs_inode_alloc() and the reclaim case. See my
earlier suggestion wrt lockdep_set_class*() on how to do that.
* Re: XFS reclaim lock order bug
  2010-11-25 10:29 ` Dave Chinner
  2010-11-25 10:36   ` Peter Zijlstra
@ 2010-11-25 11:25   ` Christoph Hellwig
  2010-11-25 11:37     ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2010-11-25 11:25 UTC (permalink / raw)
To: Dave Chinner
Cc: Peter Zijlstra, Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs

On Thu, Nov 25, 2010 at 09:29:40PM +1100, Dave Chinner wrote:
> Yes, actually it is - see the XFS_IRECLAIMABLE case in
> xfs_iget_cache_hit(). I guess we haven't seen the original lock
> inversion false positives that this was supposed to fix because the
> reclaim warnings trip first...
>
> I think that means we also need to reinitialise the lock when we recycle
> the inode out of the XFS_IRECLAIMABLE state.

I came up with the patch below when we had a previous report of the
warning, but I couldn't convince myself that it really helps:

Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c	2010-09-20 12:10:28.227444173 -0300
+++ linux-2.6/fs/xfs/xfs_iget.c	2010-09-20 12:11:25.631444190 -0300
@@ -207,6 +207,10 @@ xfs_iget_cache_hit(
 
 		ip->i_flags &= ~XFS_INEW;
 		ip->i_flags |= XFS_IRECLAIMABLE;
+
+		ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
+		mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
+
 		__xfs_inode_set_reclaim_tag(pag, ip);
 		trace_xfs_iget_reclaim_fail(ip);
 		goto out_error;
* Re: XFS reclaim lock order bug
  2010-11-25 11:25 ` Christoph Hellwig
@ 2010-11-25 11:37   ` Peter Zijlstra
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 11:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Ingo Molnar, Nick Piggin, xfs

On Thu, 2010-11-25 at 06:25 -0500, Christoph Hellwig wrote:
> On Thu, Nov 25, 2010 at 09:29:40PM +1100, Dave Chinner wrote:
> > Yes, actually it is - see the XFS_IRECLAIMABLE case in
> > xfs_iget_cache_hit(). I guess we haven't seen the original lock
> > inversion false positives that this was supposed to fix because the
> > reclaim warnings trip first...
> >
> > I think that means we also need to reinitialise the lock when we recycle
> > the inode out of the XFS_IRECLAIMABLE state.
>
> I came up with the patch below when we had a previous report of the
> warning, but I couldn't convince myself that it really helps:
>
> Index: linux-2.6/fs/xfs/xfs_iget.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_iget.c	2010-09-20 12:10:28.227444173 -0300
> +++ linux-2.6/fs/xfs/xfs_iget.c	2010-09-20 12:11:25.631444190 -0300
> @@ -207,6 +207,10 @@ xfs_iget_cache_hit(
>  
>  		ip->i_flags &= ~XFS_INEW;
>  		ip->i_flags |= XFS_IRECLAIMABLE;
> +
> +		ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> +		mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> +
>  		__xfs_inode_set_reclaim_tag(pag, ip);
>  		trace_xfs_iget_reclaim_fail(ip);
>  		goto out_error;

That adds a 3rd class, which should work, but doesn't validate that the
first -- xfs_inode_alloc() -- and this one are in fact similar.
end of thread, other threads:[~2010-11-25 11:35 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-23 12:18 XFS reclaim lock order bug Nick Piggin
2010-11-23 21:12 ` Dave Chinner
2010-11-24  0:58   ` Nick Piggin
2010-11-24  2:26     ` Dave Chinner
2010-11-24 20:03   ` Christoph Hellwig
2010-11-25  3:48     ` Nick Piggin
2010-11-25  6:25       ` Peter Zijlstra
2010-11-25  7:08         ` Nick Piggin
2010-11-25  7:28           ` Peter Zijlstra
2010-11-25 10:32             ` Dave Chinner
2010-11-25 10:29         ` Dave Chinner
2010-11-25 10:36           ` Peter Zijlstra
2010-11-25 11:25           ` Christoph Hellwig
2010-11-25 11:37             ` Peter Zijlstra