From: Bill O'Donnell <billodo@redhat.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't unlock invalidated buf on aborted tx commit
Date: Tue, 21 Aug 2018 14:56:01 -0500 [thread overview]
Message-ID: <20180821195601.GA27408@redhat.com> (raw)
In-Reply-To: <20180821140116.15900-1-bfoster@redhat.com>
On Tue, Aug 21, 2018 at 10:01:16AM -0400, Brian Foster wrote:
> xfstests generic/388,475 occasionally reproduce assertion failures
> in xfs_buf_item_unpin() when the final bli reference is dropped on
> an invalidated buffer and the buffer is not locked as it is expected
> to be. Invalidated buffers should remain locked on transaction
> commit until the final unpin, at which point the buffer is removed
> from the AIL and the bli is freed since stale buffers are not
> written back.
>
> The assert failures are associated with filesystem shutdown,
> typically due to log I/O errors injected by the test. The
> problematic situation can occur if the shutdown happens to cause a
> race between an active transaction that has invalidated a particular
> buffer and an I/O error on a log buffer that contains the bli
> associated with the same (now stale) buffer.
>
> Both transaction and log contexts acquire a bli reference. If the
> transaction has already invalidated the buffer by the time the I/O
> error occurs and ends up aborting due to shutdown, the transaction
> and log hold the last two references to a stale bli. If the
> transaction cancel occurs first, it treats the buffer as non-stale
> due to the aborted state: the bli reference is dropped and the
> buffer is released/unlocked. The log buffer I/O error handling
> eventually calls into xfs_buf_item_unpin(), drops the final
> reference to the bli and treats it as stale. The buffer wasn't left
> locked by xfs_buf_item_unlock(), however, so the assert fails and
> the buffer is double unlocked. The latter problem is mitigated by
> the fact that the fs is shutdown and no further damage is possible.
>
> ->iop_unlock() of an invalidated buffer should behave consistently
> with respect to the bli refcount, regardless of aborted state. If
> the refcount remains elevated on commit, we know the bli is awaiting
> an unpin (since it can't be in another transaction) and will be
> handled appropriately on log buffer completion. If the final bli
> reference of an invalidated buffer is dropped in ->iop_unlock(), we
> can assume the transaction has aborted because invalidation implies
> a dirty transaction. In the non-abort case, the log would have
> acquired a bli reference in ->iop_pin() and prevented bli release at
> ->iop_unlock() time. In the abort case the item must be freed and
> buffer unlocked because it wasn't pinned by the log.
>
> Rework xfs_buf_item_unlock() to simplify the currently circuitous
> and duplicate logic and leave invalidated buffers locked based on
> bli refcount, regardless of aborted state. This ensures that a
> pinned, stale buffer is always found locked when eventually
> unpinned.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> Hi all,
>
> This survives the assert reproducer, several xfstests runs with v4/v5
> superblock filesystems, and fsstress and fsx workloads over a few hours.
>
> Brian
So far, this patch resolves the problem I had seen looping g/388,475
on debug kernels. I never encountered the issue on non-debug builds.
Thanks!
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
>
> fs/xfs/xfs_buf_item.c | 85 +++++++++++++++++++------------------------
> 1 file changed, 37 insertions(+), 48 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 1c9d1398980b..5e8b91543d93 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -556,73 +556,62 @@ xfs_buf_item_unlock(
> {
> struct xfs_buf_log_item *bip = BUF_ITEM(lip);
> struct xfs_buf *bp = bip->bli_buf;
> + int freed;
> bool aborted;
> bool hold = !!(bip->bli_flags & XFS_BLI_HOLD);
> bool dirty = !!(bip->bli_flags & XFS_BLI_DIRTY);
> + bool stale = !!(bip->bli_flags & XFS_BLI_STALE);
> #if defined(DEBUG) || defined(XFS_WARN)
> bool ordered = !!(bip->bli_flags & XFS_BLI_ORDERED);
> #endif
> + /*
> + * The bli dirty state should match whether the blf has logged segments
> + * except for ordered buffers, where only the bli should be dirty.
> + */
> + ASSERT((!ordered && dirty == xfs_buf_item_dirty_format(bip)) ||
> + (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> + ASSERT(!stale || (bip->__bli_format.blf_flags & XFS_BLF_CANCEL));
>
> - aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
> + trace_xfs_buf_item_unlock(bip);
>
> - /* Clear the buffer's association with this transaction. */
> - bp->b_transp = NULL;
> + aborted = test_bit(XFS_LI_ABORTED, &lip->li_flags);
>
> /*
> - * The per-transaction state has been copied above so clear it from the
> - * bli.
> + * Clear the buffer's association with this transaction and
> + * per-transaction state from the bli, which has been copied above.
> */
> + bp->b_transp = NULL;
> bip->bli_flags &= ~(XFS_BLI_LOGGED | XFS_BLI_HOLD | XFS_BLI_ORDERED);
>
> /*
> - * If the buf item is marked stale, then don't do anything. We'll
> - * unlock the buffer and free the buf item when the buffer is unpinned
> - * for the last time.
> + * Drop the transaction bli reference and free the item if clean or
> + * aborted and we had the last reference. In either case this is the
> + * last opportunity to free the item since it won't be written back.
> + * Otherwise, the bli is still referenced or dirty and will be freed on
> + * final unpin of the buffer (if stale) or writeback completion.
> */
> - if (bip->bli_flags & XFS_BLI_STALE) {
> - trace_xfs_buf_item_unlock_stale(bip);
> - ASSERT(bip->__bli_format.blf_flags & XFS_BLF_CANCEL);
> - if (!aborted) {
> - atomic_dec(&bip->bli_refcount);
> - return;
> - }
> + freed = atomic_dec_and_test(&bip->bli_refcount);
> + if (freed && (aborted || !dirty)) {
> + ASSERT(!aborted || XFS_FORCED_SHUTDOWN(lip->li_mountp));
> + ASSERT(!stale || aborted);
> + /* an aborted item may be in the AIL, remove it first */
> + if (aborted)
> + xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> + xfs_buf_item_relse(bp);
> }
>
> - trace_xfs_buf_item_unlock(bip);
> -
> - /*
> - * If the buf item isn't tracking any data, free it, otherwise drop the
> - * reference we hold to it. If we are aborting the transaction, this may
> - * be the only reference to the buf item, so we free it anyway
> - * regardless of whether it is dirty or not. A dirty abort implies a
> - * shutdown, anyway.
> - *
> - * The bli dirty state should match whether the blf has logged segments
> - * except for ordered buffers, where only the bli should be dirty.
> - */
> - ASSERT((!ordered && dirty == xfs_buf_item_dirty_format(bip)) ||
> - (ordered && dirty && !xfs_buf_item_dirty_format(bip)));
> -
> /*
> - * Clean buffers, by definition, cannot be in the AIL. However, aborted
> - * buffers may be in the AIL regardless of dirty state. An aborted
> - * transaction that invalidates a buffer already in the AIL may have
> - * marked it stale and cleared the dirty state, for example.
> - *
> - * Therefore if we are aborting a buffer and we've just taken the last
> - * reference away, we have to check if it is in the AIL before freeing
> - * it. We need to free it in this case, because an aborted transaction
> - * has already shut the filesystem down and this is the last chance we
> - * will have to do so.
> + * If the buffer was invalidated, leave it locked on transaction commit
> + * unless we just dropped the final reference. The latter case should
> + * only ever happen on abort because invalidation dirties the
> + * transaction and the log would have grabbed another bli reference when
> + * the buffer was pinned. In the non-abort case, the buffer is unlocked
> + * on final unpin and the bli freed since stale buffers are not written
> + * back.
> */
> - if (atomic_dec_and_test(&bip->bli_refcount)) {
> - if (aborted) {
> - ASSERT(XFS_FORCED_SHUTDOWN(lip->li_mountp));
> - xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
> - xfs_buf_item_relse(bp);
> - } else if (!dirty)
> - xfs_buf_item_relse(bp);
> - }
> + if (stale && !freed)
> + return;
> + ASSERT(!stale || (aborted && freed));
>
> if (!hold)
> xfs_buf_relse(bp);
> --
> 2.17.1
>
next prev parent reply other threads:[~2018-08-21 23:17 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-21 14:01 [PATCH] xfs: don't unlock invalidated buf on aborted tx commit Brian Foster
2018-08-21 19:56 ` Bill O'Donnell [this message]
2018-08-22 0:28 ` Dave Chinner
2018-08-22 13:05 ` Brian Foster
2018-08-24 14:56 ` Brian Foster
2018-08-22 6:01 ` Christoph Hellwig
2018-08-22 13:05 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180821195601.GA27408@redhat.com \
--to=billodo@redhat.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.