From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>,
Hans Holmberg <hans.holmberg@wdc.com>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: prevent gc from picking the same zone twice
Date: Wed, 22 Oct 2025 23:16:22 -0700
Message-ID: <20251023061622.GP3356773@frogsfrogsfrogs>
In-Reply-To: <20251017060710.696868-2-hch@lst.de>

On Fri, Oct 17, 2025 at 08:07:02AM +0200, Christoph Hellwig wrote:
> When we are picking a zone for gc it might already be in the pipeline,
> which can lead to us moving the same data twice, resulting in write
> amplification and a very unfortunate case where we keep on garbage
> collecting the zone we just filled with migrated data, stopping all
> forward progress.
>
> Fix this by introducing a count of ongoing GC operations on a zone,
> and skip any zone with ongoing GC when picking a new victim.
>
> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
> Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/libxfs/xfs_rtgroup.h | 6 ++++++
> fs/xfs/xfs_zone_gc.c | 27 +++++++++++++++++++++++++++
> 2 files changed, 33 insertions(+)
>
> diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h
> index d36a6ae0abe5..d4fcf591e63d 100644
> --- a/fs/xfs/libxfs/xfs_rtgroup.h
> +++ b/fs/xfs/libxfs/xfs_rtgroup.h
> @@ -50,6 +50,12 @@ struct xfs_rtgroup {
> uint8_t *rtg_rsum_cache;
> struct xfs_open_zone *rtg_open_zone;
> };
> +
> + /*
> + * Count of outstanding GC operations for zoned XFS. Any RTG with a
> + * non-zero rtg_gccount will not be picked as new GC victim.
> + */
> + atomic_t rtg_gccount;
> };
>
> /*
> diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c
> index 109877d9a6bf..efcb52796d05 100644
> --- a/fs/xfs/xfs_zone_gc.c
> +++ b/fs/xfs/xfs_zone_gc.c
> @@ -114,6 +114,8 @@ struct xfs_gc_bio {
> /* Open Zone being written to */
> struct xfs_open_zone *oz;
>
> + struct xfs_rtgroup *victim_rtg;
> +
> /* Bio used for reads and writes, including the bvec used by it */
> struct bio_vec bv;
> struct bio bio; /* must be last */
> @@ -264,6 +266,7 @@ xfs_zone_gc_iter_init(
> iter->rec_count = 0;
> iter->rec_idx = 0;
> iter->victim_rtg = victim_rtg;
> + atomic_inc(&victim_rtg->rtg_gccount);
> }
>
> /*
> @@ -362,6 +365,7 @@ xfs_zone_gc_query(
>
> return 0;
> done:
> + atomic_dec(&iter->victim_rtg->rtg_gccount);
> xfs_rtgroup_rele(iter->victim_rtg);
> iter->victim_rtg = NULL;
> return 0;
> @@ -451,6 +455,20 @@ xfs_zone_gc_pick_victim_from(
> if (!rtg)
> continue;
>
> + /*
> + * If the zone is already undergoing GC, don't pick it again.
> + *
> + * This prevents us from picking one of the zones for which we
> + * already submitted GC I/O, but for which the remapping hasn't
> + * concluded again. This won't cause data corruption, but

"...but that I/O hasn't yet finished."

With that and the other comments corrected and a Fixes tag applied,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> + * increases write amplification and slows down GC, so this is
> + * a bad thing.
> + */
> + if (atomic_read(&rtg->rtg_gccount)) {
> + xfs_rtgroup_rele(rtg);
> + continue;
> + }
> +
> /* skip zones that are just waiting for a reset */
> if (rtg_rmap(rtg)->i_used_blocks == 0 ||
> rtg_rmap(rtg)->i_used_blocks >= victim_used) {
> @@ -688,6 +706,9 @@ xfs_zone_gc_start_chunk(
> chunk->scratch = &data->scratch[data->scratch_idx];
> chunk->data = data;
> chunk->oz = oz;
> + chunk->victim_rtg = iter->victim_rtg;
> + atomic_inc(&chunk->victim_rtg->rtg_group.xg_active_ref);
> + atomic_inc(&chunk->victim_rtg->rtg_gccount);
>
> bio->bi_iter.bi_sector = xfs_rtb_to_daddr(mp, chunk->old_startblock);
> bio->bi_end_io = xfs_zone_gc_end_io;
> @@ -710,6 +731,8 @@ static void
> xfs_zone_gc_free_chunk(
> struct xfs_gc_bio *chunk)
> {
> + atomic_dec(&chunk->victim_rtg->rtg_gccount);
> + xfs_rtgroup_rele(chunk->victim_rtg);
> list_del(&chunk->entry);
> xfs_open_zone_put(chunk->oz);
> xfs_irele(chunk->ip);
> @@ -770,6 +793,10 @@ xfs_zone_gc_split_write(
> split_chunk->oz = chunk->oz;
> atomic_inc(&chunk->oz->oz_ref);
>
> + split_chunk->victim_rtg = chunk->victim_rtg;
> + atomic_inc(&chunk->victim_rtg->rtg_group.xg_active_ref);
> + atomic_inc(&chunk->victim_rtg->rtg_gccount);
> +
> chunk->offset += split_len;
> chunk->len -= split_len;
> chunk->old_startblock += XFS_B_TO_FSB(data->mp, split_len);
> --
> 2.47.3
>
>
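
For readers following along, the counting scheme in the patch can be modeled in userspace roughly as below. This is only a sketch under stated assumptions: struct zone, zone_gc_get(), zone_gc_put() and pick_victim() are hypothetical names made up for illustration, C11 atomics stand in for the kernel's atomic_t, and the rtgroup reference counting and locking of the real code are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an rtgroup; not the kernel structure. */
struct zone {
	unsigned int	used_blocks;	/* live blocks still in the zone */
	atomic_int	gc_count;	/* outstanding GC operations */
};

/* Account for one more GC operation (iterator or I/O chunk) on the zone. */
static void zone_gc_get(struct zone *z)
{
	atomic_fetch_add(&z->gc_count, 1);
}

/* Drop one GC operation; the count must never go negative. */
static void zone_gc_put(struct zone *z)
{
	int old = atomic_fetch_sub(&z->gc_count, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Pick the zone with the fewest used blocks, skipping empty zones and
 * zones that still have GC in flight.  Returns NULL if none qualifies.
 */
static struct zone *pick_victim(struct zone *zones, size_t nr)
{
	struct zone *victim = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		struct zone *z = &zones[i];

		/* already undergoing GC, don't pick it again */
		if (atomic_load(&z->gc_count))
			continue;
		/* nothing left to move */
		if (z->used_blocks == 0)
			continue;
		if (victim && z->used_blocks >= victim->used_blocks)
			continue;
		victim = z;
	}
	return victim;
}
```

In this model a caller would pair zone_gc_get() with picking a victim and with every chunk it starts or splits, and call zone_gc_put() once per completed chunk and once when the iterator is done, so the zone becomes eligible again only after the count drops back to zero.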