From: Bill O'Donnell <billodo@redhat.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't unlock invalidated buf on aborted tx commit
Date: Tue, 21 Aug 2018 14:56:01 -0500 [thread overview]
Message-ID: <20180821195601.GA27408@redhat.com> (raw)
In-Reply-To: <20180821140116.15900-1-bfoster@redhat.com>
On Tue, Aug 21, 2018 at 10:01:16AM -0400, Brian Foster wrote:
> xfstests generic/388,475 occasionally reproduce assertion failures
> in xfs_buf_item_unpin() when the final bli reference is dropped on
> an invalidated buffer and the buffer is not locked as it is expected
> to be. Invalidated buffers should remain locked on transaction
> commit until the final unpin, at which point the buffer is removed
> from the AIL and the bli is freed since stale buffers are not
> written back.
>
> The assert failures are associated with filesystem shutdown,
> typically due to log I/O errors injected by the test. The
> problematic situation can occur if the shutdown happens to cause a
> race between an active transaction that has invalidated a particular
> buffer and an I/O error on a log buffer that contains the bli
> associated with the same (now stale) buffer.
>
> Both transaction and log contexts acquire a bli reference. If the
> transaction has already invalidated the buffer by the time the I/O
> error occurs and ends up aborting due to shutdown, the transaction
> and log hold the last two references to a stale bli. If the
> transaction cancel occurs first, it treats the buffer as non-stale
> due to the aborted state: the bli reference is dropped and the
> buffer is released/unlocked. The log buffer I/O error handling
> eventually calls into xfs_buf_item_unpin(), drops the final
> reference to the bli and treats it as stale. The buffer wasn't left
> locked by xfs_buf_item_unlock(), however, so the assert fails and
> the buffer is double unlocked. The latter problem is mitigated by
> the fact that the fs is shutdown and no further damage is possible.
>
> ->iop_unlock() of an invalidated buffer should behave consistently
> with respect to the bli refcount, regardless of aborted state. If
> the refcount remains elevated on commit, we know the bli is awaiting
> an unpin (since it can't be in another transaction) and will be
> handled appropriately on log buffer completion. If the final bli
> reference of an invalidated buffer is dropped in ->iop_unlock(), we
> can assume the transaction has aborted because invalidation implies
> a dirty transaction. In the non-abort case, the log would have
> acquired a bli reference in ->iop_pin() and prevented bli release at
> ->iop_unlock() time. In the abort case the item must be freed and
> buffer unlocked because it wasn't pinned by the log.
>
> Rework xfs_buf_item_unlock() to simplify the currently circuitous
> and duplicate logic and leave invalidated buffers locked based on
> bli refcount, regardless of aborted state. This ensures that a
> pinned, stale buffer is always found locked when eventually
> unpinned.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> Hi all,
>
> This survives the assert reproducer, several xfstests runs with v4/v5
> superblock filesystems, and fsstress and fsx workloads over a few hours.
>
> Brian
So far, this patch resolves the problem I had seen looping g/388,475
on debug kernels. I never encountered the issue on non-debug builds.
Thanks!
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
>
> fs/xfs/xfs_buf_item.c | 85 +++++++++++++++++++------------------------
> 1 file changed, 37 insertions(+), 48 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 1c9d1398980b..5e8b91543d93 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -556,73 +556,62 @@ xfs_buf_item_unlock(
> {
> struct xfs_buf_log_item *bip = BUF_ITEM(lip);
> struct xfs_buf *bp = bip->bli_buf;
> + int freed;
> bool aborted;
> bool hold = !!(bip->bli_flags & XFS_BLI_HOLD);
> bool dirty = !!(bip->bli_flags & XFS_BLI_DIRTY);
> + bool stale = !!(bip->bli_flags & XFS_BLI_STALE);
> #if defined(DEBUG) || defined(XFS_WARN)
> bool ordered = !!(bip->bli_flags & XFS_BLI_ORDERED);
> #endif
> + /*
> + * The bli dirty state should match whether the blf has logged segments
> + * except for ordered buffers, where only the bli should be dirty.
> + */
> + ASSERT((!ordered && dirty == xfs_buf_item_dirty_format(bip)) ||
> + (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> + ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
>
> - aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> + trace_xfs_buf_item_unlock(bip);
>
> - /* Clear the buffer's association with this transaction. */
> - bp->b_transp = NULL;
> + aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
>
> /*
> - * The per-transaction state has been copied above so clear it from the
> - * bli.
> + * Clear the buffer's association with this transaction and
> + * per-transaction state from the bli, which has been copied above.
> */
> + bp->b_transp = NULL;
> bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
>
> /*
> - * If the buf item is marked stale, then don't do anything. We'll
> - * unlock the buffer and free the buf item when the buffer is unpinned
> - * for the last time.
> + * Drop the transaction bli reference and free the item if clean or
> + * aborted and we had the last reference. In either case this is the
> + * last opportunity to free the item since it won't be written back.
> + * Otherwise, the bli is still referenced or dirty and will be freed on
> + * final unpin of the buffer (if stale) or writeback completion.
> */
> - if (bip->bli_flags & XFS_BLI_STALE) {
> - trace_xfs_buf_item_unlock_stale(bip);
> - ASSERT(bip->__bli_format.blf_flags & XFS_BLF_CANCEL);
> - if (!aborted) {
> - atomic_dec(&bip->bli_refcount);
> - return;
> - }
> + freed = atomic_dec_and_test(&bip->bli_refcount);
> + if (freed && (aborted || !dirty)) {
> + ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> + ASSERT(!stale || aborted);
> + /* an aborted item may be in the AIL, remove it first */
> + if (aborted)
> + xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> + xfs_buf_item_relse(bp);
> }
>
> - trace_xfs_buf_item_unlock(bip);
> -
> - /*
> - * If the buf item isn't tracking any data, free it, otherwise drop the
> - * reference we hold to it. If we are aborting the transaction, this may
> - * be the only reference to the buf item, so we free it anyway
> - * regardless of whether it is dirty or not. A dirty abort implies a
> - * shutdown, anyway.
> - *
> - * The bli dirty state should match whether the blf has logged segments
> - * except for ordered buffers, where only the bli should be dirty.
> - */
> - ASSERT((!ordered && dirty == xfs_buf_item_dirty_format(bip)) ||
> - (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> -
> /*
> - * Clean buffers, by definition, cannot be in the AIL. However, aborted
> - * buffers may be in the AIL regardless of dirty state. An aborted
> - * transaction that invalidates a buffer already in the AIL may have
> - * marked it stale and cleared the dirty state, for example.
> - *
> - * Therefore if we are aborting a buffer and we've just taken the last
> - * reference away, we have to check if it is in the AIL before freeing
> - * it. We need to free it in this case, because an aborted transaction
> - * has already shut the filesystem down and this is the last chance we
> - * will have to do so.
> + * If the buffer was invalidated, leave it locked on transaction commit
> + * unless we just dropped the final reference. The latter case should
> + * only ever happen on abort because invalidation dirties the
> + * transaction and the log would have grabbed another bli reference when
> + * the buffer was pinned. In the non-abort case, the buffer is unlocked
> + * on final unpin and the bli freed since stale buffers are not written
> + * back.
> */
> - if (atomic_dec_and_test(&bip->bli_refcount)) {
> - if (aborted) {
> - ASSERT(XFS_FORCED_SHUTDOWN(lip->li_mountp));
> - xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> - xfs_buf_item_relse(bp);
> - } else if (!dirty)
> - xfs_buf_item_relse(bp);
> - }
> + if (stale && !freed)
> + return;
> + ASSERT(!stale || (aborted && freed));
>
> if (!hold)
> xfs_buf_relse(bp);
> --
> 2.17.1
>
next prev parent reply other threads:[~2018-08-21 23:17 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-21 14:01 [PATCH] xfs: don't unlock invalidated buf on aborted tx commit Brian Foster
2018-08-21 19:56 ` Bill O'Donnell [this message]
2018-08-22 0:28 ` Dave Chinner
2018-08-22 13:05 ` Brian Foster
2018-08-24 14:56 ` Brian Foster
2018-08-22 6:01 ` Christoph Hellwig
2018-08-22 13:05 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180821195601.GA27408@redhat.com \
--to=billodo@redhat.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).