From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: Mark Tinguely <tinguely@sgi.com>, xfs@oss.sgi.com
Subject: Re: [PATCH RFC 0/2] fix spinlock recursion on xa_lock in xfs_buf_item_push
Date: Thu, 31 Jan 2013 08:59:34 +1100 [thread overview]
Message-ID: <20130130215934.GB32297@disturbed.disaster> (raw)
In-Reply-To: <51094423.8000703@redhat.com>
On Wed, Jan 30, 2013 at 11:02:43AM -0500, Brian Foster wrote:
> (added Dave and the list back on CC)
>
> On 01/30/2013 09:07 AM, Mark Tinguely wrote:
> > On 01/30/13 00:05, Dave Chinner wrote:
> >> On Tue, Jan 29, 2013 at 03:42:35PM -0500, Brian Foster wrote:
> ...
> >
> >> So essentially what is happening here is that we are trying to lock
> >> a stale buffer in the AIL to flush it. Well, we can't flush it from
> >> the AIL, and indeed the next line of code is this:
> >>
> >> if (!xfs_buf_trylock(bp))
> >> return XFS_ITEM_LOCKED;
> >>
> >>>>>>> ASSERT(!(bip->bli_flags& XFS_BLI_STALE));
> >>
> >> The only reason this ASSERT is not firing is that we are failing to
> >> lock stale buffers. Hence we are relying on the failed lock to force
> >> the log, instead of detecting that we need to force the log after we
> >> drop the AIL lock and letting the caller handle it.
> >>
> >> So, wouldn't a better solution be to do something like simply like:
> >>
> >> + if (bp->b_flags& XBF_STALE)
> >> + return XFS_ITEM_PINNED;
> >> +
> >> if (!xfs_buf_trylock(bp))
> >> return XFS_ITEM_LOCKED;
> ...
> >
>
> Thanks guys. This certainly looks nicer than messing with the lock
> wrapper, but is it susceptible to the same problem? In other words, does
> this fix the problem or just tighten the window?
That's what I need to think about more - the only difference here is
that we are checking the flag before the down_trylock() instead of
after....
> So if the buf lock covers the pinned state (e.g., buffer gets locked,
> added to a transaction, the transaction gets committed and pins and
> unlocks the buffer, IIUC) and the stale state (buf gets locked, added to
> a new transaction and inval'd before the original transaction was
> written ?), but we don't hold the buf lock in xfs_buf_item_push(), how
> can we guarantee the state of either doesn't change between the time we
> check the flags and the time the lock fails?
... but the order of them being set and checked may be significant
and hence checking the stale flag first might be sufficient to avoid
the pin count race and hence the log force. Hence this might just
need a pair of memory barriers - one here and one in xfs_buf_stale()
- to ensure that we always see the XBF_STALE flag without needing to
lock the buffer first.
> > Makes sense. It would prevent the lock recursion. The more that I think
> > about, we do not want to release xa_lock during an AIL scan.
> >
>
> FWIW, the other log item abstractions appear to use this model (e.g.,
> xfs_inode_item_push()), where it appears safe to drop xa_lock once the
> actual object lock/ref is acquired and reacquire xa_lock before
> returning.
Right. We can't hold the AIL while doing blocking lock operations
because the AIL lock is an innermost lock. That's fundamentally the
cause of this problem - the trylock is doing more work that we can
allow in the context that it is being called in...
We also do not hold the AIL lock for the entire
traversal as long holds of the AIL lock result in serialisation
across the entire transaction subsystem. A successful IOP_PUSH()
typically drops and reacquires the AIL lock, but a failure doesn't.
Hence we limit the number of failed pushes under a single traversal
so that we don't spend a long time holding the lock doing no work
and instead back off and try again later....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.come
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-01-30 21:59 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-29 20:42 [PATCH RFC 0/2] fix spinlock recursion on xa_lock in xfs_buf_item_push Brian Foster
2013-01-29 20:42 ` [PATCH RFC 1/2] xfs: conditionally force log on trylock failure of pinned/stale buf Brian Foster
2013-01-29 20:42 ` [PATCH RFC 2/2] xfs: drop xa_lock around log force in xfs_buf_item push Brian Foster
2013-01-29 22:41 ` [PATCH RFC 0/2] fix spinlock recursion on xa_lock in xfs_buf_item_push Mark Tinguely
2013-01-30 6:05 ` Dave Chinner
[not found] ` <5109291E.6090303@sgi.com>
2013-01-30 16:02 ` Brian Foster
2013-01-30 21:59 ` Dave Chinner [this message]
2013-01-31 17:45 ` Brian Foster
2013-02-01 1:26 ` Dave Chinner
2013-02-01 16:01 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130130215934.GB32297@disturbed.disaster \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=tinguely@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox