From: Chandan Babu R <chandan.babu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/8] xfs: don't reverse order of items in bulk AIL insertion
Date: Wed, 28 Jun 2023 15:25:40 +0530
Message-ID: <87edlu1moy.fsf@debian-BULLSEYE-live-builder-AMD64>
In-Reply-To: <20230627224412.2242198-2-david@fromorbit.com>
On Wed, Jun 28, 2023 at 08:44:05 AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> XFS has strict metadata ordering requirements. One of the things it
> does is maintain the commit order of items from transaction commit
> through the CIL and into the AIL. That is, if a transaction logs
> item A before item B in a modification, then they will be inserted
> into the CIL in the order {A, B}. These items are then written into
> the iclog during checkpointing in the order {A, B}. When the
> checkpoint commits, they are supposed to be inserted into the AIL in
> the order {A, B}, and when they are pushed from the AIL, they are
> pushed in the order {A, B}.
>
> If we crash, log recovery then replays the two items from the
> checkpoint in the order {A, B}, resulting in the objects the items
> apply to being queued for writeback at the end of the checkpoint
> in the order {A, B}. This means recovery behaves the same way as the
> runtime code.
>
> In places, we have subtle dependencies on this ordering being
> maintained. One of these places is performing intent recovery from the
> log. It assumes that recovering an intent will result in a
> non-intent object being the first thing that is modified in the
> recovery transaction, and so when the transaction commits and the
> journal flushes, the first object inserted into the AIL beyond the
> intent recovery range will be a non-intent item. It uses the
> transition from intent items to non-intent items to stop the
> recovery pass.
>
> A recent log recovery issue indicated that an intent was appearing
> as the first item in the AIL beyond the recovery range, hence
> breaking the existing end-of-recovery detection.
>
> Tracing indicated insertion of the items into the AIL was apparently
> occurring in the right order (the intent was last in the commit item
> list), but the intent was appearing first in the AIL. IOWs, the
> order of items in the AIL was {D,C,B,A}, not {A,B,C,D}, and bulk
> insertion was reversing the order of the items in the batch of items
> being inserted.
>
> Lucky for us, all the items fed to bulk insertion have the same LSN,
> so the reversal of order does not affect the log head/tail tracking
> that is based on the contents of the AIL. It only impacts code
> that has implicit, subtle dependencies on object order, and AFAICT
> only the intent recovery loop is impacted by it.
>
> Make sure bulk AIL insertion does not reorder items incorrectly.
Looks good to me.
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
>
> Fixes: 0e57f6a36f9b ("xfs: bulk AIL insertion during transaction commit")
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_trans_ail.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index 7d4109af193e..1098452e7f95 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -823,7 +823,7 @@ xfs_trans_ail_update_bulk(
>  			trace_xfs_ail_insert(lip, 0, lsn);
>  		}
>  		lip->li_lsn = lsn;
> -		list_add(&lip->li_ail, &tmp);
> +		list_add_tail(&lip->li_ail, &tmp);
>  	}
>
>  	if (!list_empty(&tmp))
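
For anyone following along, here is a minimal userspace sketch of why the
one-line change matters. It is not the kernel's <linux/list.h>: the list
helpers are re-implemented here for illustration only, and struct item,
item_of() and batch_order() are hypothetical names standing in for log
items and the batching loop. It only shows that head insertion reverses a
batch while tail insertion preserves commit order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link @new between two nodes that are currently adjacent. */
static void list_link(struct list_head *new, struct list_head *prev,
		      struct list_head *next)
{
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* Insert right after the head: stack (LIFO) behaviour. */
static void list_add(struct list_head *new, struct list_head *head)
{
	list_link(new, head, head->next);
}

/* Insert right before the head: queue (FIFO) behaviour. */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
	list_link(new, head->prev, head);
}

/* Hypothetical log item: just a name and its list linkage. */
struct item {
	char name;
	struct list_head link;
};

#define item_of(ptr) \
	((struct item *)((char *)(ptr) - offsetof(struct item, link)))

/*
 * Batch @n items onto a temporary list in commit order using @add, then
 * write the resulting list order into @out as a NUL-terminated string.
 * @out must have room for n + 1 bytes.
 */
static void batch_order(struct item *items, size_t n,
			void (*add)(struct list_head *, struct list_head *),
			char *out)
{
	struct list_head tmp, *pos;
	size_t i;

	init_list_head(&tmp);
	for (i = 0; i < n; i++)
		add(&items[i].link, &tmp);

	i = 0;
	for (pos = tmp.next; pos != &tmp; pos = pos->next)
		out[i++] = item_of(pos)->name;
	assert(i == n);
	out[i] = '\0';
}
```

With four items named A, B, C and D passed in that order, batch_order()
with list_add produces "DCBA", and with list_add_tail it produces "ABCD",
which matches the reversal described in the commit message.
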
--
chandan
Thread overview: 30+ messages
2023-06-27 22:44 [PATCH 0/8 v3] xfs: various fixes for 6.5 Dave Chinner
2023-06-27 22:44 ` [PATCH 1/8] xfs: don't reverse order of items in bulk AIL insertion Dave Chinner
2023-06-28 6:03 ` Christoph Hellwig
2023-06-28 9:55 ` Chandan Babu R [this message]
2023-06-28 17:46 ` Darrick J. Wong
2023-06-27 22:44 ` [PATCH 2/8] xfs: use deferred frees for btree block freeing Dave Chinner
2023-06-28 17:46 ` Darrick J. Wong
2023-06-28 22:55 ` Dave Chinner
2023-06-29 7:52 ` Chandan Babu R
2023-06-27 22:44 ` [PATCH 3/8] xfs: pass alloc flags through to xfs_extent_busy_flush() Dave Chinner
2023-06-29 9:44 ` Chandan Babu R
2023-06-27 22:44 ` [PATCH 4/8] xfs: allow extent free intents to be retried Dave Chinner
2023-06-28 17:48 ` Darrick J. Wong
2023-06-28 22:57 ` Dave Chinner
2023-06-29 9:50 ` Chandan Babu R
2023-06-27 22:44 ` [PATCH 5/8] xfs: don't block in busy flushing when freeing extents Dave Chinner
2023-06-27 22:44 ` [PATCH 6/8] xfs: journal geometry is not properly bounds checked Dave Chinner
2023-06-28 6:08 ` Christoph Hellwig
2023-06-28 6:38 ` Dave Chinner
2023-06-28 17:50 ` Darrick J. Wong
2023-06-27 22:44 ` [PATCH 7/8] xfs: AGF length has never been " Dave Chinner
2023-06-28 17:52 ` Darrick J. Wong
2023-06-29 2:09 ` [PATCH 7/8 V2] " Dave Chinner
2023-06-29 16:35 ` Darrick J. Wong
2023-06-29 22:33 ` Dave Chinner
2023-06-27 22:44 ` [PATCH 8/8] xfs: fix bounds check in xfs_defer_agfl_block() Dave Chinner
2023-06-28 6:09 ` Christoph Hellwig
2023-06-28 17:52 ` Darrick J. Wong
2023-06-29 19:42 ` [RFC PATCH 9/8] xfs: AGI length should be bounds checked Darrick J. Wong
2023-06-29 22:35 ` Dave Chinner