From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling
Date: Wed, 29 Aug 2018 08:59:09 +1000 [thread overview]
Message-ID: <20180828225909.GB5631@dastard> (raw)
In-Reply-To: <20180827142548.45029-4-bfoster@redhat.com>
On Mon, Aug 27, 2018 at 10:25:48AM -0400, Brian Foster wrote:
> The xfs_buf_log_item structure has a reference counter with slightly
> tricky semantics. In the common case, a buffer is logged and
> committed in a transaction, committed to the on-disk log (added to
> the AIL) and then finally written back and removed from the AIL. The
> bli refcount covers two potentially overlapping timeframes:
>
> 1. the bli is held in an active transaction
> 2. the bli is pinned by the log
>
> The caveat to this approach is that the reference counter does not
> purely dictate the lifetime of the bli. IOW, when a dirty buffer is
> physically logged and unpinned, the bli refcount may go to zero as
> the log item is inserted into the AIL. Only once the buffer is
> written back can the bli finally be freed.
>
> The above semantics means that it is not enough for the various
> refcount decrementing contexts to release the bli on decrement to
> zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and
> unpin (->iop_unpin()) must all drop the associated reference and
> make additional checks to determine if the current context is
> responsible for freeing the item.
>
> For example, if a transaction holds but does not dirty a particular
> bli, the commit may drop the refcount to zero. If the bli itself is
> clean, it is also not AIL resident and must be freed at this time.
> The same is true for xfs_trans_brelse(). If the transaction dirties
> a bli and then aborts or an unpin results in an abort due to a log
> I/O error, the last reference count holder is expected to explicitly
> remove the item from the AIL and release it (since an abort means
> filesystem shutdown and metadata writeback will never occur).
>
> This leads to fairly complex checks being replicated in a few
> different places. Since ->iop_unlock() and xfs_trans_brelse() are
> nearly identical, refactor the logic into a common helper that
> implements and documents the semantics in one place. This patch does
> not change behavior.
I think this improves the code further, because now we don't have to
care about the buf item cleanup when dropping references.
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> fs/xfs/xfs_buf_item.c | 85 ++++++++++++++++++++++++------------------
> fs/xfs/xfs_buf_item.h | 1 +
> fs/xfs/xfs_trans_buf.c | 22 +----------
> 3 files changed, 51 insertions(+), 57 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index b39d1b5320e7..d0191451fe18 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -556,13 +556,12 @@ xfs_buf_item_unlock(
> {
> struct xfs_buf_log_item *bip = BUF_ITEM(lip);
> struct xfs_buf *bp = bip->bli_buf;
> - bool freed;
> - bool aborted;
> + bool released;
> bool hold = bip->bli_flags & XFS_BLI_HOLD;
> - bool dirty = bip->bli_flags & XFS_BLI_DIRTY;
> bool stale = bip->bli_flags & XFS_BLI_STALE;
> #if defined(DEBUG) || defined(XFS_WARN)
> bool ordered = bip->bli_flags & XFS_BLI_ORDERED;
> + bool dirty = bip->bli_flags & XFS_BLI_DIRTY;
> #endif
> trace_xfs_buf_item_unlock(bip);
>
> @@ -574,8 +573,6 @@ xfs_buf_item_unlock(
> (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
>
> - aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> -
> /*
> * Clear the buffer's association with this transaction and
> * per-transaction state from the bli, which has been copied above.
> @@ -584,40 +581,16 @@ xfs_buf_item_unlock(
> bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
>
> /*
> - * Drop the transaction's bli reference and deal with the item if we had
> - * the last one. We must free the item if clean or aborted since it
> - * wasn't pinned by the log and this is the last chance to do so. If the
> - * bli is freed and dirty (but non-aborted), the buffer was not dirty in
> - * this transaction but modified by a previous one and still awaiting
> - * writeback. In that case, the bli is freed on buffer writeback
> - * completion.
> + * Unref the item and unlock the buffer unless held or stale. Stale
> + * buffers remain locked until final unpin unless the bli is freed by
> + * the unref call. The latter implies shutdown because buffer
> + * invalidation dirties the bli and transaction.
> */
> - freed = atomic_dec_and_test(&bip->bli_refcount);
> - if (freed) {
> - ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> - /*
> - * An aborted item may be in the AIL regardless of dirty state.
> - * For example, consider an aborted transaction that invalidated
> - * a dirty bli and cleared the dirty state.
> - */
> - if (aborted)
> - xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> - if (aborted || !dirty)
> - xfs_buf_item_relse(bp);
> - } else if (stale) {
> - /*
> - * Stale buffers remain locked until final unpin unless the bli
> - * was freed in the branch above. A freed stale bli implies an
> - * abort because buffer invalidation dirties the bli and
> - * transaction.
> - */
> - ASSERT(!freed);
> + released = xfs_buf_item_unref(bip);
> + if ((stale && !released) || hold)
> return;
Personal preference, but I'd put the hold check first because it's
not a compound logic statement.
> +bool
> +xfs_buf_item_unref(
"+xfs_buf_item_put()" to be consistent with dropping references on
other reference counted objects (e.g. iput, dput, bio_put,
xfs_perag_put, xfs_log_ticket_put, etc).
> + struct xfs_buf_log_item *bip)
> +{
> + struct xfs_log_item *lip = &bip->bli_item;
> + bool aborted;
> + bool dirty;
> + bool freed;
> +
> + /* drop the bli ref and return if it wasn't the last one */
> + freed = atomic_dec_and_test(&bip->bli_refcount);
> + if (!freed)
> + return false;
We don't really need the freed variable here. This
if (!atomic_dec_and_test(refcount))
return false
is a common enough "do something only when we drop the last
reference" pattern that everyone should understand it by
now.
> + * If the bli is dirty and non-aborted, the buffer was clean in the
> + * transaction but still awaiting writeback from previous changes. In
> + * that case, the bli is freed on buffer writeback completion.
> + */
> + aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
> + XFS_FORCED_SHUTDOWN(lip->li_mountp);
> + dirty = bip->bli_flags & XFS_BLI_DIRTY;
> + if (!aborted && dirty)
> + return false;
Personal preference (again) is to check dirty first - same as the
comment text mentions "dirty and not-aborted"...
These are all minor things - I think it's a good improvement
overall.
Cheers,
dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2018-08-29 2:53 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-27 14:25 [PATCH v2 0/3] xfs: bli refcount fixups Brian Foster
2018-08-27 14:25 ` [PATCH 1/3] xfs: don't unlock invalidated buf on aborted tx commit Brian Foster
2018-08-27 14:25 ` [PATCH 2/3] xfs: clean up xfs_trans_brelse() Brian Foster
2018-08-28 22:36 ` Dave Chinner
2018-08-29 13:55 ` Brian Foster
2018-08-27 14:25 ` [PATCH 3/3] xfs: refactor xfs_buf_log_item reference count handling Brian Foster
2018-08-28 22:59 ` Dave Chinner [this message]
2018-08-29 13:55 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180828225909.GB5631@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).