From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>,
	Hans Holmberg <hans.holmberg@wdc.com>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: prevent gc from picking the same zone twice
Date: Wed, 22 Oct 2025 23:16:22 -0700
Message-ID: <20251023061622.GP3356773@frogsfrogsfrogs>
In-Reply-To: <20251017060710.696868-2-hch@lst.de>

On Fri, Oct 17, 2025 at 08:07:02AM +0200, Christoph Hellwig wrote:
> When we are picking a zone for gc it might already be in the pipeline
> which can lead to us moving the same data twice resulting in write
> amplification and a very unfortunate case where we keep on garbage
> collecting the zone we just filled with migrated data stopping all
> forward progress.
> 
> Fix this by introducing a count of on-going GC operations on a zone, and
> skip any zone with ongoing GC when picking a new victim.
> 
> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
> Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_rtgroup.h |  6 ++++++
>  fs/xfs/xfs_zone_gc.c        | 27 +++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h
> index d36a6ae0abe5..d4fcf591e63d 100644
> --- a/fs/xfs/libxfs/xfs_rtgroup.h
> +++ b/fs/xfs/libxfs/xfs_rtgroup.h
> @@ -50,6 +50,12 @@ struct xfs_rtgroup {
>  		uint8_t			*rtg_rsum_cache;
>  		struct xfs_open_zone	*rtg_open_zone;
>  	};
> +
> +	/*
> +	 * Count of outstanding GC operations for zoned XFS.  Any RTG with a
> +	 * non-zero rtg_gccount will not be picked as new GC victim.
> +	 */
> +	atomic_t		rtg_gccount;
>  };
>  
>  /*
> diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c
> index 109877d9a6bf..efcb52796d05 100644
> --- a/fs/xfs/xfs_zone_gc.c
> +++ b/fs/xfs/xfs_zone_gc.c
> @@ -114,6 +114,8 @@ struct xfs_gc_bio {
>  	/* Open Zone being written to */
>  	struct xfs_open_zone		*oz;
>  
> +	struct xfs_rtgroup		*victim_rtg;
> +
>  	/* Bio used for reads and writes, including the bvec used by it */
>  	struct bio_vec			bv;
>  	struct bio			bio;	/* must be last */
> @@ -264,6 +266,7 @@ xfs_zone_gc_iter_init(
>  	iter->rec_count = 0;
>  	iter->rec_idx = 0;
>  	iter->victim_rtg = victim_rtg;
> +	atomic_inc(&victim_rtg->rtg_gccount);
>  }
>  
>  /*
> @@ -362,6 +365,7 @@ xfs_zone_gc_query(
>  
>  	return 0;
>  done:
> +	atomic_dec(&iter->victim_rtg->rtg_gccount);
>  	xfs_rtgroup_rele(iter->victim_rtg);
>  	iter->victim_rtg = NULL;
>  	return 0;
> @@ -451,6 +455,20 @@ xfs_zone_gc_pick_victim_from(
>  		if (!rtg)
>  			continue;
>  
> +		/*
> +		 * If the zone is already undergoing GC, don't pick it again.
> +		 *
> +		 * This prevents us from picking one of the zones for which we
> +		 * already submitted GC I/O, but for which the remapping hasn't
> +		 * concluded again.  This won't cause data corruption, but

"...but that I/O hasn't yet finished."

With that and the other comments corrected and a Fixes tag applied,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> +		 * increases write amplification and slows down GC, so this is
> +		 * a bad thing.
> +		 */
> +		if (atomic_read(&rtg->rtg_gccount)) {
> +			xfs_rtgroup_rele(rtg);
> +			continue;
> +		}
> +
>  		/* skip zones that are just waiting for a reset */
>  		if (rtg_rmap(rtg)->i_used_blocks == 0 ||
>  		    rtg_rmap(rtg)->i_used_blocks >= victim_used) {
> @@ -688,6 +706,9 @@ xfs_zone_gc_start_chunk(
>  	chunk->scratch = &data->scratch[data->scratch_idx];
>  	chunk->data = data;
>  	chunk->oz = oz;
> +	chunk->victim_rtg = iter->victim_rtg;
> +	atomic_inc(&chunk->victim_rtg->rtg_group.xg_active_ref);
> +	atomic_inc(&chunk->victim_rtg->rtg_gccount);
>  
>  	bio->bi_iter.bi_sector = xfs_rtb_to_daddr(mp, chunk->old_startblock);
>  	bio->bi_end_io = xfs_zone_gc_end_io;
> @@ -710,6 +731,8 @@ static void
>  xfs_zone_gc_free_chunk(
>  	struct xfs_gc_bio	*chunk)
>  {
> +	atomic_dec(&chunk->victim_rtg->rtg_gccount);
> +	xfs_rtgroup_rele(chunk->victim_rtg);
>  	list_del(&chunk->entry);
>  	xfs_open_zone_put(chunk->oz);
>  	xfs_irele(chunk->ip);
> @@ -770,6 +793,10 @@ xfs_zone_gc_split_write(
>  	split_chunk->oz = chunk->oz;
>  	atomic_inc(&chunk->oz->oz_ref);
>  
> +	split_chunk->victim_rtg = chunk->victim_rtg;
> +	atomic_inc(&chunk->victim_rtg->rtg_group.xg_active_ref);
> +	atomic_inc(&chunk->victim_rtg->rtg_gccount);
> +
>  	chunk->offset += split_len;
>  	chunk->len -= split_len;
>  	chunk->old_startblock += XFS_B_TO_FSB(data->mp, split_len);
> -- 
> 2.47.3
> 
> 
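
For readers following the thread, the counting scheme in the quoted patch
can be illustrated with a minimal userspace sketch. This is not the kernel
code: struct zone, zone_gc_get(), zone_gc_put() and pick_victim() are
hypothetical stand-ins for struct xfs_rtgroup, the atomic_inc/atomic_dec
calls on rtg_gccount, and xfs_zone_gc_pick_victim_from(), and the selection
policy is reduced to "fewest used blocks wins".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_rtgroup; only the fields used here. */
struct zone {
	unsigned int	used_blocks;	/* live blocks still in the zone */
	atomic_int	gccount;	/* outstanding GC operations */
};

/* Take a GC reference: once for the iterator, once per I/O chunk. */
static void zone_gc_get(struct zone *z)
{
	atomic_fetch_add(&z->gccount, 1);
}

/* Drop a GC reference when the iterator is done or a chunk is freed. */
static void zone_gc_put(struct zone *z)
{
	int old = atomic_fetch_sub(&z->gccount, 1);

	assert(old > 0);	/* every put must pair with an earlier get */
}

/*
 * Pick the zone with the fewest used blocks, skipping empty zones and
 * any zone that still has GC in flight.  Returns NULL if none qualifies.
 */
static struct zone *pick_victim(struct zone *zones, size_t nr)
{
	struct zone *victim = NULL;

	for (size_t i = 0; i < nr; i++) {
		struct zone *z = &zones[i];

		if (atomic_load(&z->gccount) != 0)
			continue;	/* already undergoing GC */
		if (z->used_blocks == 0)
			continue;	/* just waiting for a reset */
		if (victim && z->used_blocks >= victim->used_blocks)
			continue;	/* current candidate is no worse */
		victim = z;
	}
	return victim;
}
```

In this sketch a zone becomes eligible again only after the iterator
reference and every chunk reference have been dropped, which mirrors why the
patch takes a separate count for each chunk, including split chunks.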

Thread overview (17+ messages):
2025-10-17  6:07 fix for selecting a zone with active GC I/O for GC Christoph Hellwig
2025-10-17  6:07 ` [PATCH 1/2] xfs: prevent gc from picking the same zone twice Christoph Hellwig
2025-10-17 12:37   ` Carlos Maiolino
2025-10-20 11:49     ` Christoph Hellwig
2025-10-18  4:08   ` Damien Le Moal
2025-10-23  6:16   ` Darrick J. Wong [this message]
2025-10-23  6:28     ` Christoph Hellwig
2025-10-23 15:02       ` Darrick J. Wong
2025-10-23 15:04         ` Christoph Hellwig
2025-10-17  6:07 ` [PATCH 2/2] xfs: document another racy GC case in xfs_zoned_map_extent Christoph Hellwig
2025-10-17 12:40   ` Carlos Maiolino
2025-10-18  4:09   ` Damien Le Moal
2025-10-20  8:19   ` Hans Holmberg
2025-10-23  6:21   ` Darrick J. Wong
2025-10-23  6:30     ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2025-10-23 15:17 fix for selecting a zone with active GC I/O for GC v2 Christoph Hellwig
2025-10-23 15:17 ` [PATCH 1/2] xfs: prevent gc from picking the same zone twice Christoph Hellwig
2025-10-31 11:19   ` Carlos Maiolino
