* XFS reclaim lock order bug
@ 2010-11-23 12:18 Nick Piggin
2010-11-23 21:12 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-23 12:18 UTC (permalink / raw)
To: xfs
Hi,
IIRC I've reported this before. Perhaps it is a false positive, but even
so it is still annoying that it triggers and turns off lockdep for
subsequent debugging.
Any chance it can get fixed or properly annotated?
Thanks,
Nick
[ 286.895008]
[ 286.895010] =================================
[ 286.895020] [ INFO: inconsistent lock state ]
[ 286.895020] 2.6.37-rc3+ #28
[ 286.895020] ---------------------------------
[ 286.895020] inconsistent {RECLAIM_FS-ON-R} -> {IN-RECLAIM_FS-W} usage.
[ 286.895020] rm/1844 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 286.895020] (&(&ip->i_iolock)->mr_lock#2){++++-+}, at: [<ffffffffa0067e58>] xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] {RECLAIM_FS-ON-R} state was registered at:
[ 286.895020] [<ffffffff8108380b>] mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff810838d1>] lockdep_trace_alloc+0x91/0xd0
[ 286.895020] [<ffffffff810d1851>] __alloc_pages_nodemask+0x91/0x780
[ 286.895020] [<ffffffff8110a043>] alloc_page_vma+0x93/0x150
[ 286.895020] [<ffffffff810ed909>] handle_mm_fault+0x719/0x9a0
[ 286.895020] [<ffffffff816068e3>] do_page_fault+0x133/0x4f0
[ 286.895020] [<ffffffff816039df>] page_fault+0x1f/0x30
[ 286.895020] [<ffffffff810cb52a>] generic_file_aio_read+0x2fa/0x730
[ 286.895020] [<ffffffffa009a29b>] xfs_file_aio_read+0x15b/0x390 [xfs]
[ 286.895020] [<ffffffff81117812>] do_sync_read+0xd2/0x110
[ 286.895020] [<ffffffff81117b55>] vfs_read+0xc5/0x190
[ 286.895020] [<ffffffff8111846c>] sys_read+0x4c/0x80
[ 286.895020] [<ffffffff8100312b>] system_call_fastpath+0x16/0x1b
[ 286.895020] irq event stamp: 1095103
[ 286.895020] hardirqs last enabled at (1095103): [<ffffffff81603215>] _raw_spin_unlock_irqrestore+0x65/0x80
[ 286.895020] hardirqs last disabled at (1095102): [<ffffffff81602be7>] _raw_spin_lock_irqsave+0x17/0x60
[ 286.895020] softirqs last enabled at (1093048): [<ffffffff81050c4e>] __do_softirq+0x16e/0x360
[ 286.895020] softirqs last disabled at (1093009): [<ffffffff81003fcc>] call_softirq+0x1c/0x50
[ 286.895020]
[ 286.895020] other info that might help us debug this:
[ 286.895020] 3 locks held by rm/1844:
[ 286.895020] #0: (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [<ffffffff81124cdc>] do_lookup+0xfc/0x170
[ 286.895020] #1: (shrinker_rwsem){++++..}, at: [<ffffffff810d8f68>] shrink_slab+0x38/0x190
[ 286.895020] #2: (&pag->pag_ici_reclaim_lock){+.+...}, at: [<ffffffffa00a2944>] xfs_reclaim_inodes_ag+0xa4/0x370 [xfs]
[ 286.895020]
[ 286.895020] stack backtrace:
[ 286.895020] Pid: 1844, comm: rm Not tainted 2.6.37-rc3+ #28
[ 286.895020] Call Trace:
[ 286.895020] [<ffffffff810828f0>] print_usage_bug+0x170/0x180
[ 286.895020] [<ffffffff810835b1>] mark_lock+0x211/0x400
[ 286.895020] [<ffffffff810841ae>] __lock_acquire+0x40e/0x1490
[ 286.895020] [<ffffffff810852c5>] lock_acquire+0x95/0x1b0
[ 286.895020] [<ffffffffa0067e58>] ? xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa00a2774>] ? xfs_reclaim_inode+0x174/0x2a0 [xfs]
[ 286.895020] [<ffffffff81073c3a>] down_write_nested+0x4a/0x70
[ 286.895020] [<ffffffffa0067e58>] ? xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa0067e58>] xfs_ilock+0xe8/0x1e0 [xfs]
[ 286.895020] [<ffffffffa00a27c0>] xfs_reclaim_inode+0x1c0/0x2a0 [xfs]
[ 286.895020] [<ffffffffa00a2aaf>] xfs_reclaim_inodes_ag+0x20f/0x370 [xfs]
[ 286.895020] [<ffffffffa00a2c88>] xfs_reclaim_inode_shrink+0x78/0x80 [xfs]
[ 286.895020] [<ffffffff810d9057>] shrink_slab+0x127/0x190
[ 286.895020] [<ffffffff810dbf09>] zone_reclaim+0x349/0x420
[ 286.895020] [<ffffffff810cf815>] ? zone_watermark_ok+0x25/0xe0
[ 286.895020] [<ffffffff810d1603>] get_page_from_freelist+0x673/0x830
[ 286.895020] [<ffffffff8110c013>] ? init_object+0x43/0x80
[ 286.895020] [<ffffffffa00929ec>] ? kmem_zone_alloc+0x8c/0xd0 [xfs]
[ 286.895020] [<ffffffff8108380b>] ? mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff8108380b>] ? mark_held_locks+0x6b/0xa0
[ 286.895020] [<ffffffff810d18d0>] __alloc_pages_nodemask+0x110/0x780
[ 286.895020] [<ffffffff8110eb3a>] ? unfreeze_slab+0x11a/0x160
[ 286.895020] [<ffffffff811089d6>] alloc_pages_current+0x76/0xf0
[ 286.895020] [<ffffffff8110ce45>] new_slab+0x205/0x2b0
[ 286.895020] [<ffffffff8110ef7c>] __slab_alloc+0x30c/0x480
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8112f372>] ? d_alloc+0x22/0x200
[ 286.895020] [<ffffffff8110fe18>] kmem_cache_alloc+0xf8/0x1a0
[ 286.895020] [<ffffffff8112f1e0>] ? __d_lookup+0x1c0/0x1f0
[ 286.895020] [<ffffffff8112f020>] ? __d_lookup+0x0/0x1f0
[ 286.895020] [<ffffffff8112f372>] d_alloc+0x22/0x200
[ 286.895020] [<ffffffff811236fb>] d_alloc_and_lookup+0x2b/0x90
[ 286.895020] [<ffffffff8112f24c>] ? d_lookup+0x3c/0x60
[ 286.895020] [<ffffffff81124cfa>] do_lookup+0x11a/0x170
[ 286.895020] [<ffffffff81125a4a>] link_path_walk+0x31a/0xa50
[ 286.895020] [<ffffffff81126292>] path_walk+0x62/0xe0
[ 286.895020] [<ffffffff8112636b>] do_path_lookup+0x5b/0x60
[ 286.895020] [<ffffffff81126fe2>] user_path_at+0x52/0xa0
[ 286.895020] [<ffffffff8110e8d5>] ? kmem_cache_free+0xe5/0x190
[ 286.895020] [<ffffffff81083b4d>] ? trace_hardirqs_on+0xd/0x10
[ 286.895020] [<ffffffff81126760>] ? do_unlinkat+0x60/0x1d0
[ 286.895020] [<ffffffff8111d077>] vfs_fstatat+0x37/0x70
[ 286.895020] [<ffffffff8111d21f>] sys_newfstatat+0x1f/0x50
[ 286.895020] [<ffffffff81083afd>] ? trace_hardirqs_on_caller+0x13d/0x180
[ 286.895020] [<ffffffff816028c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 286.895020] [<ffffffff8100312b>] system_call_fastpath+0x16/0x1b
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS reclaim lock order bug
2010-11-23 12:18 XFS reclaim lock order bug Nick Piggin
@ 2010-11-23 21:12 ` Dave Chinner
2010-11-24 0:58 ` Nick Piggin
2010-11-24 20:03 ` Christoph Hellwig
0 siblings, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-23 21:12 UTC (permalink / raw)
To: Nick Piggin; +Cc: xfs
On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> Hi,
>
> IIRC I've reported this before. Perhaps it is a false positive, but even
> so it is still annoying that it triggers and turns off lockdep for
> subsequent debugging.
>
> Any chance it can get fixed or properly annotated?
It is supposed to be handled by the re-initialisation of the
ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
in the reclaim state must have passed through this reinitialisation,
so from a lockdep perspective the iolock in the vfs path is a
different context to the iolock in the reclaim path. That fixed all
the non-reclaim state related lockdep false positives, so perhaps
there is an issue with the lockdep reclaim state checking that does
not interact well with re-initialised lock contexts?
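The per-call-site class behaviour this relies on can be sketched in plain userspace C (hypothetical names throughout, not the kernel API): in the kernel, init_rwsem() is a macro that declares a static struct lock_class_key at each call site, and lockdep keys the lock's class off that key's address, so re-initialising a lock from a different call site moves it into a different class:

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for lockdep's per-call-site keying. */
struct fake_lock_class_key { char dummy; };

struct fake_rwsem { const struct fake_lock_class_key *key; };

/* Like init_rwsem(): each expansion declares its own static key,
 * so each call site yields a distinct class. */
#define fake_init_rwsem(sem) do {                           \
        static struct fake_lock_class_key __key;            \
        (sem)->key = &__key;  /* class = this call site */  \
} while (0)

/* Stand-ins for xfs_inode_alloc() and xfs_fs_evict_inode(). */
static void init_in_alloc_path(struct fake_rwsem *sem) { fake_init_rwsem(sem); }
static void init_in_evict_path(struct fake_rwsem *sem) { fake_init_rwsem(sem); }

/* Returns true iff re-initialising in the evict path changed the class. */
static bool reinit_changes_class(void)
{
        struct fake_rwsem sem;
        const struct fake_lock_class_key *live, *dead;

        init_in_alloc_path(&sem);
        live = sem.key;
        init_in_evict_path(&sem);   /* re-init from a different site */
        dead = sem.key;
        return live != dead;
}
```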
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
2010-11-23 21:12 ` Dave Chinner
@ 2010-11-24 0:58 ` Nick Piggin
2010-11-24 2:26 ` Dave Chinner
2010-11-24 20:03 ` Christoph Hellwig
1 sibling, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-24 0:58 UTC (permalink / raw)
To: Dave Chinner; +Cc: Nick Piggin, xfs
On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> > Hi,
> >
> > IIRC I've reported this before. Perhaps it is a false positive, but even
> > so it is still annoying that it triggers and turns off lockdep for
> > subsequent debugging.
> >
> > Any chance it can get fixed or properly annotated?
>
> It is supposed to be handled by the re-initialisation of the
> ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> in the reclaim state must have passed through this reinitialisation,
> so from a lockdep perspective the iolock in the vfs path is a
> different context to the iolock in the reclaim path. That fixed all
> the non-reclaim state related lockdep false positives, so perhaps
> there is an issue with the lockdep reclaim state checking that does
> not interact well with re-initialised lock contexts?
Hmm. I suppose that should work.
So xfs_reclaim_inode can only call xfs_ilock _after_ the Linux inode
has gone through the ->evict_inode call? If so, then let's ask the lockdep
people.
Thanks,
Nick
* Re: XFS reclaim lock order bug
2010-11-24 0:58 ` Nick Piggin
@ 2010-11-24 2:26 ` Dave Chinner
0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-24 2:26 UTC (permalink / raw)
To: Nick Piggin; +Cc: xfs
On Wed, Nov 24, 2010 at 11:58:11AM +1100, Nick Piggin wrote:
> On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > On Tue, Nov 23, 2010 at 11:18:02PM +1100, Nick Piggin wrote:
> > > Hi,
> > >
> > > IIRC I've reported this before. Perhaps it is a false positive, but even
> > > so it is still annoying that it triggers and turns off lockdep for
> > > subsequent debugging.
> > >
> > > Any chance it can get fixed or properly annotated?
> >
> > It is supposed to be handled by the re-initialisation of the
> > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > in the reclaim state must have passed through this reinitialisation,
> > so from a lockdep perspective the iolock in the vfs path is a
> > different context to the iolock in the reclaim path. That fixed all
> > the non-reclaim state related lockdep false positives, so perhaps
> > there is an issue with the lockdep reclaim state checking that does
> > not interact well with re-initialised lock contexts?
>
> Hmm. I suppose that should work.
>
> So xfs_reclaim_inode can only call xfs_ilock _after_ the Linux inode
> has gone through the ->evict_inode call?
I think so - it's only found by xfs_reclaim_inode() if the radix
tree reclaim tag is set for the inode, which is only set in two
places: ->destroy_inode(), and in a failed lookup (which doesn't
touch the iolock). I can't see any other way we can get to an inode
in reclaim...
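That lifecycle can be sketched as follows (a hypothetical simplification, not the real XFS structures): the reclaim tag is only set after the iolock has been re-initialised, so reclaim should only ever lock the "dead" class:

```c
#include <stdbool.h>

/* Hypothetical model of the lifecycle under discussion. */
enum lock_class { CLASS_NONE, CLASS_LIVE, CLASS_DEAD };

struct fake_inode {
        bool reclaim_tagged;          /* radix-tree reclaim tag analogue */
        enum lock_class iolock_class;
};

static void fake_inode_alloc(struct fake_inode *ip)
{
        ip->reclaim_tagged = false;
        ip->iolock_class = CLASS_LIVE;   /* init_rwsem in xfs_inode_alloc */
}

static void fake_evict_inode(struct fake_inode *ip)
{
        ip->iolock_class = CLASS_DEAD;   /* mrlock_init in xfs_fs_evict_inode */
}

static void fake_destroy_inode(struct fake_inode *ip)
{
        ip->reclaim_tagged = true;       /* reclaim tag set after evict */
}

/* Reclaim only visits tagged inodes; returns the class it would lock. */
static enum lock_class reclaim_locks_class(const struct fake_inode *ip)
{
        return ip->reclaim_tagged ? ip->iolock_class : CLASS_NONE;
}
```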
> If so, then let's ask the lockdep people.
Sure.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
2010-11-23 21:12 ` Dave Chinner
2010-11-24 0:58 ` Nick Piggin
@ 2010-11-24 20:03 ` Christoph Hellwig
2010-11-25 3:48 ` Nick Piggin
1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2010-11-24 20:03 UTC (permalink / raw)
To: Dave Chinner; +Cc: Nick Piggin, xfs
On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> It is supposed to be handled by the re-initialisation of the
> ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> in the reclaim state must have passed through this reinitialisation,
> so from a lockdep perspective the iolock in the vfs path is a
> different context to the iolock in the reclaim path. That fixed all
> the non-reclaim state related lockdep false positives, so perhaps
> there is an issue with the lockdep reclaim state checking that does
> not interact well with re-initialised lock contexts?
I've been looking through this again, and I think it's indeed not
enough. We don't just need to re-initialize it, but also set a
different lockclass for it.
* Re: XFS reclaim lock order bug
2010-11-24 20:03 ` Christoph Hellwig
@ 2010-11-25 3:48 ` Nick Piggin
2010-11-25 6:25 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-25 3:48 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Peter Zijlstra, Ingo Molnar, Nick Piggin, xfs
On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > It is supposed to be handled by the re-initialisation of the
> > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > in the reclaim state must have passed through this reinitialisation,
> > so from a lockdep perspective the iolock in the vfs path is a
> > different context to the iolock in the reclaim path. That fixed all
> > the non-reclaim state related lockdep false positives, so perhaps
> > there is an issue with the lockdep reclaim state checking that does
> > not interact well with re-initialised lock contexts?
>
> I've been looking through this again, and I think it's indeed not
> enough. We don't just need to re-initialize it, but also set a
> different lockclass for it.
Doesn't init_rwsem give it a new class?
Guys, can you take a quick look at the code Dave is referring to
(xfs_fs_evict_inode), and check that it actually does what he
intends?
We're getting what seems to be false positives in reclaim inversion
of lockings. Backtraces here
http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
Thanks,
Nick
* Re: XFS reclaim lock order bug
2010-11-25 3:48 ` Nick Piggin
@ 2010-11-25 6:25 ` Peter Zijlstra
2010-11-25 7:08 ` Nick Piggin
2010-11-25 10:29 ` Dave Chinner
0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 6:25 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Hellwig, Ingo Molnar, xfs
On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > It is supposed to be handled by the re-initialisation of the
> > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > in the reclaim state must have passed through this reinitialisation,
> > > so from a lockdep perspective the iolock in the vfs path is a
> > > different context to the iolock in the reclaim path. That fixed all
> > > the non-reclaim state related lockdep false positives, so perhaps
> > > there is an issue with the lockdep reclaim state checking that does
> > > not interact well with re-initialised lock contexts?
> >
> > I've been looking through this again, and I think it's indeed not
> > enough. We don't just need to re-initialize it, but also set a
> > different lockclass for it.
>
> Doesn't init_rwsem give it a new class?
Per call-site, yes it should.
> Guys, can you take a quick look at the code Dave is referring to
> (xfs_fs_evict_inode), and check that it actually does what he
> intends?
Right, so this is trying to set a different class from the regular init
site, which (/me applies grep) lives in xfs_inode_alloc(), right?
Ought to work.. assuming the inode will be fully destroyed and new
inodes are always obtained through xfs_inode_alloc() and not reused.
> We're getting what seems to be false positives in reclaim inversion
> of lockings. Backtraces here
> http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
Right, so there it's holding the inode in the read path while taking a
page-fault which does an allocation.
vs
acquiring the inode in the xfs_reclaim_inode_shrink() path.
Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
possible to end up in that read path?
Something like the below would give the lock-class an explicit name,
because both sites now use the exact same init thing, they're called:
(&(&ip->i_iolock)->mr_lock)
(&(&ip->i_iolock)->mr_lock#2)
Which is hard to tell apart, but I suspect #2 is the dead one, since
they get numbered in order of appearance and it's hard to have a dead
inode before having a live one ;-)
In that case though, it would suggest the inode got re-used instead of
destroyed and re-created using xfs_inode_alloc(), is that at all
possible?
---
fs/xfs/linux-2.6/xfs_super.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 064f964..721c1c5 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1091,6 +1091,8 @@ xfs_fs_write_inode(
return -error;
}
+static struct lock_class_key xfs_dead_inode;
+
STATIC void
xfs_fs_evict_inode(
struct inode *inode)
@@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
*/
ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
+ lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
+ "xfs_dead_inode");
xfs_inactive(ip);
}
* Re: XFS reclaim lock order bug
2010-11-25 6:25 ` Peter Zijlstra
@ 2010-11-25 7:08 ` Nick Piggin
2010-11-25 7:28 ` Peter Zijlstra
2010-11-25 10:29 ` Dave Chinner
1 sibling, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2010-11-25 7:08 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs
On Thu, Nov 25, 2010 at 07:25:25AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> > On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > > It is supposed to be handled by the re-initialisation of the
> > > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > > in the reclaim state must have passed through this reinitialisation,
> > > > so from a lockdep perspective the iolock in the vfs path is a
> > > > different context to the iolock in the reclaim path. That fixed all
> > > > the non-reclaim state related lockdep false positives, so perhaps
> > > > there is an issue with the lockdep reclaim state checking that does
> > > > not interact well with re-initialised lock contexts?
> > >
> > > I've been looking through this again, and I think it's indeed not
> > > enough. We don't just need to re-initialize it, but also set a
> > > different lockclass for it.
> >
> > Doesn't init_rwsem give it a new class?
>
> Per call-site, yes it should.
>
> > Guys, can you take a quick look at the code Dave is referring to
> > (xfs_fs_evict_inode), and check that it actually does what he
> > intends?
>
> Right, so this is trying to set a different class from the regular init
> site, which (/me applies grep) lives in xfs_inode_alloc(), right?
>
> Ought to work.. assuming the inode will be fully destroyed and new
> inodes are always obtained through xfs_inode_alloc() and not reused.
>
> > We're getting what seems to be false positives in reclaim inversion
> > of lockings. Backtraces here
> > http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
>
> Right, so there it's holding the inode in the read path while taking a
> page-fault which does an allocation.
>
> vs
>
> acquiring the inode in the xfs_reclaim_inode_shrink() path.
>
>
> Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
> possible to end up in that read path?
I think that's the idea.
> Something like the below would give the lock-class an explicit name,
> because both sites now use the exact same init thing, they're called:
>
> (&(&ip->i_iolock)->mr_lock)
> (&(&ip->i_iolock)->mr_lock#2)
>
> Which is hard to tell apart, but I suspect #2 is the dead one, since
> they get numbered in order of appearance and it's hard to have a dead
> inode before having a live one ;-)
>
> In that case though, it would suggest the inode got re-used instead of
> destroyed and re-created using xfs_inode_alloc(), is that at all
> possible?
Ah, I see what you mean. An inode that has been through the evict_inode
path is now found to be locked for read(2). The rwsem is in the
init_always path of the allocator, so it seems like it's getting reused
after evict_inode. Dave?
>
> ---
> fs/xfs/linux-2.6/xfs_super.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
> index 064f964..721c1c5 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1091,6 +1091,8 @@ xfs_fs_write_inode(
> return -error;
> }
>
> +static struct lock_class_key xfs_dead_inode;
> +
> STATIC void
> xfs_fs_evict_inode(
> struct inode *inode)
> @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
> */
> ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> + lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> + "xfs_dead_inode");
>
> xfs_inactive(ip);
> }
With this change, I assume the mrlock_init can go? (it would be nice
to have a wrapper to allocate the class by itself)
* Re: XFS reclaim lock order bug
2010-11-25 7:08 ` Nick Piggin
@ 2010-11-25 7:28 ` Peter Zijlstra
2010-11-25 10:32 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 7:28 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Hellwig, Ingo Molnar, xfs
On Thu, 2010-11-25 at 18:08 +1100, Nick Piggin wrote:
> > +static struct lock_class_key xfs_dead_inode;
> > +
> > STATIC void
> > xfs_fs_evict_inode(
> > struct inode *inode)
> > @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
> > */
> > ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> > mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> > + lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> > + "xfs_dead_inode");
> >
> > xfs_inactive(ip);
> > }
>
> With this change, I assume the mrlock_init can go? (it would be nice
> to have a wrapper to allocate the class by itself)
mrlock_init() does allocate a class (well, init_rwsem, really), but sets
the name to a stringified version of the lock argument.
The lockdep_set_class*() interface is only guaranteed to work on a
freshly initialized lock structure -- which in this case is a bit of a
waste, but for debugging purposes would allow setting a clearer name.
Alternatively, you can write the code like:
xfs_inode_t *dead_ip = XFS_I(inode);
mrlock_init(&dead_ip->i_iolock, ...);
In which case its also obvious, as that would result in:
(&(&dead_ip->i_iolock)->mr_lock)
as opposed to:
(&(&ip->i_iolock)->mr_lock)
* Re: XFS reclaim lock order bug
2010-11-25 6:25 ` Peter Zijlstra
2010-11-25 7:08 ` Nick Piggin
@ 2010-11-25 10:29 ` Dave Chinner
2010-11-25 10:36 ` Peter Zijlstra
2010-11-25 11:25 ` Christoph Hellwig
1 sibling, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-25 10:29 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs
On Thu, Nov 25, 2010 at 07:25:25AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 14:48 +1100, Nick Piggin wrote:
> > On Wed, Nov 24, 2010 at 03:03:41PM -0500, Christoph Hellwig wrote:
> > > On Wed, Nov 24, 2010 at 08:12:58AM +1100, Dave Chinner wrote:
> > > > It is supposed to be handled by the re-initialisation of the
> > > > ip->i_iolock in ->evict_inode (xfs_fs_evict_inode). An inode found
> > > > in the reclaim state must have passed through this reinitialisation,
> > > > so from a lockdep perspective the iolock in the vfs path is a
> > > > different context to the iolock in the reclaim path. That fixed all
> > > > the non-reclaim state related lockdep false positives, so perhaps
> > > > there is an issue with the lockdep reclaim state checking that does
> > > > not interact well with re-initialised lock contexts?
> > >
> > > I've been looking through this again, and I think it's indeed not
> > > enough. We don't just need to re-initialize it, but also set a
> > > different lockclass for it.
> >
> > Doesn't init_rwsem give it a new class?
>
> Per call-site, yes it should.
>
> > Guys, can you take a quick look at the code Dave is referring to
> > (xfs_fs_evict_inode), and check that it actually does what he
> > intends?
>
> Right, so this is trying to set a different class from the regular init
> site, which (/me applies grep) lives in xfs_inode_alloc(), right?
>
> Ought to work.. assuming the inode will be fully destroyed and new
> inodes are always obtained through xfs_inode_alloc() and not reused.
>
> > We're getting what seems to be false positives in reclaim inversion
> > of lockings. Backtraces here
> > http://oss.sgi.com/pipermail/xfs/2010-November/048092.html
>
> Right, so there it's holding the inode in the read path while taking a
> page-fault which does an allocation.
>
> vs
>
> acquiring the inode in the xfs_reclaim_inode_shrink() path.
>
>
> Presumably the whole xfs_fs_evict_inode() stuff will happen _after_ it's
> possible to end up in that read path?
>
>
> Something like the below would give the lock-class an explicit name,
> because both sites now use the exact same init thing, they're called:
>
> (&(&ip->i_iolock)->mr_lock)
> (&(&ip->i_iolock)->mr_lock#2)
>
> Which is hard to tell apart, but I suspect #2 is the dead one, since
> they get numbered in order of appearance and it's hard to have a dead
> inode before having a live one ;-)
>
> In that case though, it would suggest the inode got re-used instead of
> destroyed and re-created using xfs_inode_alloc(), is that at all
> possible?
Yes, actually it is - see the XFS_IRECLAIMABLE case in
xfs_iget_cache_hit(). I guess we haven't seen the original lock
inversion false positives that this was supposed to fix because the
reclaim warnings trip first...
I think that means we also need to reinitialise the lock when we recycle
the inode out of the XFS_IRECLAIMABLE state.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
2010-11-25 7:28 ` Peter Zijlstra
@ 2010-11-25 10:32 ` Dave Chinner
0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-11-25 10:32 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs
On Thu, Nov 25, 2010 at 08:28:17AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-11-25 at 18:08 +1100, Nick Piggin wrote:
> > > +static struct lock_class_key xfs_dead_inode;
> > > +
> > > STATIC void
> > > xfs_fs_evict_inode(
> > > struct inode *inode)
> > > @@ -1118,6 +1120,8 @@ xfs_fs_evict_inode(
> > > */
> > > ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> > > mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> > > + lockdep_set_class_and_name(&ip->i_iolock.mr_lock, &xfs_dead_inode,
> > > + "xfs_dead_inode");
> > >
> > > xfs_inactive(ip);
> > > }
> >
> > With this change, I assume the mrlock_init can go? (it would be nice
> > to have a wrapper to allocate the class by itself)
>
>
> mrlock_init() does allocate a class (well, init_rwsem, really), but sets
> the name to a stringified version of the lock argument.
>
> The lockdep_set_class*() interface is only guaranteed to work on a
> freshly initialized lock structure -- which in this case is a bit of a
> waste, but for debugging purposes would allow setting a clearer name.
>
> Alternatively, you can write the code like:
>
> xfs_inode_t *dead_ip = XFS_I(inode);
>
> mrlock_init(&dead_ip->i_iolock, ...);
>
> In which case its also obvious, as that would result in:
>
> (&(&dead_ip->i_iolock)->mr_lock)
>
> as opposed to:
>
> (&(&ip->i_iolock)->mr_lock)
OK, that's a handy trick to know. I'll try to sort this out
tomorrow and use it to help identify the different
lock classes.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS reclaim lock order bug
2010-11-25 10:29 ` Dave Chinner
@ 2010-11-25 10:36 ` Peter Zijlstra
2010-11-25 11:25 ` Christoph Hellwig
1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 10:36 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs
On Thu, 2010-11-25 at 21:29 +1100, Dave Chinner wrote:
> > In that case though, it would suggest the inode got re-used instead of
> > destroyed and re-created using xfs_inode_alloc(), is that at all
> > possible?
>
> Yes, actually it is - see the XFS_IRECLAIMABLE case in
> xfs_iget_cache_hit(). I guess we haven't seen the original lock
> inversion false positives that this was supposed to fix because the
> reclaim warnings trip first...
>
> I think that means we also need to reinitialise the lock when we recycle
> the inode out of the XFS_IRECLAIMABLE state.
Right, in which case you probably want an explicit lock class and use
that one class in both xfs_alloc_inode() and the reclaim case.
See my earlier suggestion wrt lockdep_set_class*() on how to do that.
* Re: XFS reclaim lock order bug
2010-11-25 10:29 ` Dave Chinner
2010-11-25 10:36 ` Peter Zijlstra
@ 2010-11-25 11:25 ` Christoph Hellwig
2010-11-25 11:37 ` Peter Zijlstra
1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2010-11-25 11:25 UTC (permalink / raw)
To: Dave Chinner
Cc: Peter Zijlstra, Christoph Hellwig, Ingo Molnar, Nick Piggin, xfs
On Thu, Nov 25, 2010 at 09:29:40PM +1100, Dave Chinner wrote:
> Yes, actually it is - see the XFS_IRECLAIMABLE case in
> xfs_iget_cache_hit(). I guess we haven't seen the original lock
> inversion false positives that this was supposed to fix because the
> reclaim warnings trip first...
>
> I think that means we also need to reinitialise the lock when we recycle
> the inode out of the XFS_IRECLAIMABLE state.
I came up with the patch below when we had a previous report of the
warning, but I couldn't convince myself that it really helps:
Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c 2010-09-20 12:10:28.227444173 -0300
+++ linux-2.6/fs/xfs/xfs_iget.c 2010-09-20 12:11:25.631444190 -0300
@@ -207,6 +207,10 @@ xfs_iget_cache_hit(
ip->i_flags &= ~XFS_INEW;
ip->i_flags |= XFS_IRECLAIMABLE;
+
+ ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
+ mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
+
__xfs_inode_set_reclaim_tag(pag, ip);
trace_xfs_iget_reclaim_fail(ip);
goto out_error;
* Re: XFS reclaim lock order bug
2010-11-25 11:25 ` Christoph Hellwig
@ 2010-11-25 11:37 ` Peter Zijlstra
0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2010-11-25 11:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Ingo Molnar, Nick Piggin, xfs
On Thu, 2010-11-25 at 06:25 -0500, Christoph Hellwig wrote:
> On Thu, Nov 25, 2010 at 09:29:40PM +1100, Dave Chinner wrote:
> > Yes, actually it is - see the XFS_IRECLAIMABLE case in
> > xfs_iget_cache_hit(). I guess we haven't seen the original lock
> > inversion false positives that this was supposed to fix because the
> > reclaim warnings trip first...
> >
> > I think that means we also need to reinitialise the lock when we recycle
> > the inode out of the XFS_IRECLAIMABLE state.
>
> I came up with the patch below when we had a previous report of the
> warning, but I couldn't convince myself that it really helps:
>
> Index: linux-2.6/fs/xfs/xfs_iget.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_iget.c 2010-09-20 12:10:28.227444173 -0300
> +++ linux-2.6/fs/xfs/xfs_iget.c 2010-09-20 12:11:25.631444190 -0300
> @@ -207,6 +207,10 @@ xfs_iget_cache_hit(
>
> ip->i_flags &= ~XFS_INEW;
> ip->i_flags |= XFS_IRECLAIMABLE;
> +
> + ASSERT(!rwsem_is_locked(&ip->i_iolock.mr_lock));
> + mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
> +
> __xfs_inode_set_reclaim_tag(pag, ip);
> trace_xfs_iget_reclaim_fail(ip);
> goto out_error;
That adds a 3rd class which should work, but doesn't validate that the
first -- xfs_inode_alloc() and this one are in fact similar.