From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling
Date: Wed, 29 Aug 2018 09:55:50 -0400 [thread overview]
Message-ID: <20180829135550.GB56494@bfoster> (raw)
In-Reply-To: <20180828225909.GB5631@dastard>
On Wed, Aug 29, 2018 at 08:59:09AM +1000, Dave Chinner wrote:
> On Mon, Aug 27, 2018 at 10:25:48AM -0400, Brian Foster wrote:
> > The xfs_buf_log_item structure has a reference counter with slightly
> > tricky semantics. In the common case, a buffer is logged and
> > committed in a transaction, committed to the on-disk log (added to
> > the AIL) and then finally written back and removed from the AIL. The
> > bli refcount covers two potentially overlapping timeframes:
> >
> > 1. the bli is held in an active transaction
> > 2. the bli is pinned by the log
> >
> > The caveat to this approach is that the reference counter does not
> > purely dictate the lifetime of the bli. IOW, when a dirty buffer is
> > physically logged and unpinned, the bli refcount may go to zero as
> > the log item is inserted into the AIL. Only once the buffer is
> > written back can the bli finally be freed.
> >
> > The above semantics means that it is not enough for the various
> > refcount decrementing contexts to release the bli on decrement to
> > zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and
> > unpin (->iop_unpin()) must all drop the associated reference and
> > make additional checks to determine if the current context is
> > responsible for freeing the item.
> >
> > For example, if a transaction holds but does not dirty a particular
> > bli, the commit may drop the refcount to zero. If the bli itself is
> > clean, it is also not AIL resident and must be freed at this time.
> > The same is true for xfs_trans_brelse(). If the transaction dirties
> > a bli and then aborts or an unpin results in an abort due to a log
> > I/O error, the last reference count holder is expected to explicitly
> > remove the item from the AIL and release it (since an abort means
> > filesystem shutdown and metadata writeback will never occur).
> >
> > This leads to fairly complex checks being replicated in a few
> > different places. Since ->iop_unlock() and xfs_trans_brelse() are
> > nearly identical, refactor the logic into a common helper that
> > implements and documents the semantics in one place. This patch does
> > not change behavior.
>
> I think this improves the code further, because now we don't have to
> care about the buf item cleanup when dropping references.
>
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > fs/xfs/xfs_buf_item.c | 85 ++++++++++++++++++++++++------------------
> > fs/xfs/xfs_buf_item.h | 1 +
> > fs/xfs/xfs_trans_buf.c | 22 +----------
> > 3 files changed, 51 insertions(+), 57 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index b39d1b5320e7..d0191451fe18 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -556,13 +556,12 @@ xfs_buf_item_unlock(
> > {
> > struct xfs_buf_log_item *bip = BUF_ITEM(lip);
> > struct xfs_buf *bp = bip->bli_buf;
> > - bool freed;
> > - bool aborted;
> > + bool released;
> > bool hold = bip->bli_flags & XFS_BLI_HOLD;
> > - bool dirty = bip->bli_flags & XFS_BLI_DIRTY;
> > bool stale = bip->bli_flags & XFS_BLI_STALE;
> > #if defined(DEBUG) || defined(XFS_WARN)
> > bool ordered = bip->bli_flags & XFS_BLI_ORDERED;
> > + bool dirty = bip->bli_flags & XFS_BLI_DIRTY;
> > #endif
> > trace_xfs_buf_item_unlock(bip);
> >
> > @@ -574,8 +573,6 @@ xfs_buf_item_unlock(
> > (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> > ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
> >
> > - aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> > -
> > /*
> > * Clear the buffer's association with this transaction and
> > * per-transaction state from the bli, which has been copied above.
> > @@ -584,40 +581,16 @@ xfs_buf_item_unlock(
> > bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
> >
> > /*
> > - * Drop the transaction's bli reference and deal with the item if we had
> > - * the last one. We must free the item if clean or aborted since it
> > - * wasn't pinned by the log and this is the last chance to do so. If the
> > - * bli is freed and dirty (but non-aborted), the buffer was not dirty in
> > - * this transaction but modified by a previous one and still awaiting
> > - * writeback. In that case, the bli is freed on buffer writeback
> > - * completion.
> > + * Unref the item and unlock the buffer unless held or stale. Stale
> > + * buffers remain locked until final unpin unless the bli is freed by
> > + * the unref call. The latter implies shutdown because buffer
> > + * invalidation dirties the bli and transaction.
> > */
> > - freed = atomic_dec_and_test(&bip->bli_refcount);
> > - if (freed) {
> > - ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> > - /*
> > - * An aborted item may be in the AIL regardless of dirty state.
> > - * For example, consider an aborted transaction that invalidated
> > - * a dirty bli and cleared the dirty state.
> > - */
> > - if (aborted)
> > - xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> > - if (aborted || !dirty)
> > - xfs_buf_item_relse(bp);
> > - } else if (stale) {
> > - /*
> > - * Stale buffers remain locked until final unpin unless the bli
> > - * was freed in the branch above. A freed stale bli implies an
> > - * abort because buffer invalidation dirties the bli and
> > - * transaction.
> > - */
> > - ASSERT(!freed);
> > + released = xfs_buf_item_unref(bip);
> > + if ((stale && !released) || hold)
> > return;
>
> Personal preference, but I'd put the hold check first because it's
> not a compound logic statement.
>
Ok.
> > +bool
> > +xfs_buf_item_unref(
>
> "+xfs_buf_item_put()" to be consistent with dropping references on
> other reference counted objects (e.g. iput, dput, bio_put,
> xfs_perag_put, xfs_log_ticket_put, etc).
>
Works for me.
> > + struct xfs_buf_log_item *bip)
> > +{
> > + struct xfs_log_item *lip = &bip->bli_item;
> > + bool aborted;
> > + bool dirty;
> > + bool freed;
> > +
> > + /* drop the bli ref and return if it wasn't the last one */
> > + freed = atomic_dec_and_test(&bip->bli_refcount);
> > + if (!freed)
> > + return false;
>
> We don't really need the freed variable here. This
>
> if (!atomic_dec_and_test(refcount))
> return false
>
> is a common enough "do something only when we drop the last
> reference" pattern that everyone should understand it by
> now.
>
Yeah, I think I just missed that the associated asserts (where freed was
used) were made unnecessary by the simplified logic.
> > + * If the bli is dirty and non-aborted, the buffer was clean in the
> > + * transaction but still awaiting writeback from previous changes. In
> > + * that case, the bli is freed on buffer writeback completion.
> > + */
> > + aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
> > + XFS_FORCED_SHUTDOWN(lip->li_mountp);
> > + dirty = bip->bli_flags & XFS_BLI_DIRTY;
> > + if (!aborted && dirty)
> > + return false;
>
> Personal preference (again) is to check dirty first - same as the
> comment text mentions "dirty and not-aborted"...
>
Makes sense. The dirty/clean state is also the more common/primary
consideration, whereas aborted == true should obviously be a
rare/outlier case.
> These are all minor things - I think it's a good improvement
> overall.
>
Thanks. I'll send a v3 with the associated tweaks..
Brian
> Cheers,
>
> dave.
>
> --
> Dave Chinner
> david@fromorbit.com
prev parent reply other threads:[~2018-08-29 17:52 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-27 14:25 [PATCH v2 0/3] xfs: bli refcount fixups Brian Foster
2018-08-27 14:25 ` [PATCH 1/3] xfs: don't unlock invalidated buf on aborted tx commit Brian Foster
2018-08-27 14:25 ` [PATCH 2/3] xfs: clean up xfs_trans_brelse() Brian Foster
2018-08-28 22:36 ` Dave Chinner
2018-08-29 13:55 ` Brian Foster
2018-08-27 14:25 ` [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling Brian Foster
2018-08-28 22:59 ` Dave Chinner
2018-08-29 13:55 ` Brian Foster [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180829135550.GB56494@bfoster \
--to=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).