public inbox for linux-xfs@vger.kernel.org
* [PATCH] xfs: validate that zoned RT devices are zone aligned
@ 2025-12-10 14:23 Christoph Hellwig
  2025-12-10 16:48 ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2025-12-10 14:23 UTC (permalink / raw)
  To: cem; +Cc: linux-xfs

Garbage collection assumes all zones contain the full amount of blocks.
Mkfs already ensures this happens, but make the kernel check it as well
to avoid getting into trouble due to fuzzers or mkfs bugs.

Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_sb.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index cdd16dd805d7..db5231f846ea 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -301,6 +301,19 @@ xfs_validate_rt_geometry(
 	    sbp->sb_rbmblocks != xfs_expected_rbmblocks(sbp))
 		return false;
 
+	if (xfs_sb_is_v5(sbp) &&
+	    (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_ZONED)) {
+		uint32_t		mod;
+
+		/*
+		 * Zoned RT devices must be aligned to the rtgroup size, because
+		 * garbage collection can't deal with rump RT groups.
+		 */
+		div_u64_rem(sbp->sb_rextents, sbp->sb_rgextents, &mod);
+		if (mod)
+			return false;
+	}
+
 	return true;
 }
 
-- 
2.47.3



* Re: [PATCH] xfs: validate that zoned RT devices are zone aligned
  2025-12-10 14:23 [PATCH] xfs: validate that zoned RT devices are zone aligned Christoph Hellwig
@ 2025-12-10 16:48 ` Darrick J. Wong
  2025-12-10 16:54   ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2025-12-10 16:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, linux-xfs

On Wed, Dec 10, 2025 at 03:23:05PM +0100, Christoph Hellwig wrote:
> Garbage collection assumes all zones contain the full amount of blocks.
> Mkfs already ensures this happens, but make the kernel check it as well

mkfs doesn't enforce that when you're creating a zoned filesystem on
non-zoned storage:

# mkfs.xfs -r rtdev=/dev/sda,zoned=1 -f /dev/sdf
meta-data=/dev/sdf               isize=512    agcount=4, agsize=1298176 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=0    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=1   metadir=1
data     =                       bsize=4096   blocks=5192704, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =/dev/sda               extsz=4096   blocks=5192704, rtextents=5192704
         =                       rgcount=80   rgsize=65536 extents
         =                       zoned=1      start=0 reserved=0

5192704 isn't a multiple of 65536 (79 * 65536 = 5177344, which leaves
15360 extents over), so we get a runt rtgroup at the end:

# mount /dev/sdf /opt -o rtdev=/dev/sda
# xfs_io -c 'rginfo' /opt | tail -n 20
RTG: 76
Length: 65536
Sick: 0x0
Checked: 0x0
Flags: 0x0
RTG: 77
Length: 65536
Sick: 0x0
Checked: 0x0
Flags: 0x0
RTG: 78
Length: 65536
Sick: 0x0
Checked: 0x0
Flags: 0x0
RTG: 79
Length: 15360
Sick: 0x0
Checked: 0x0
Flags: 0x0

rtgroup 79 is clearly a runt group.
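
The arithmetic behind that output can be reproduced in userspace.  The
following is only a minimal sketch, assuming a made-up struct rt_geometry
that stands in for the sb_rextents and sb_rgextents superblock fields; the
helper names are hypothetical and are not kernel or xfsprogs functions.
It applies the same remainder test as the proposed kernel check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the relevant superblock fields. */
struct rt_geometry {
	uint64_t rextents;	/* total RT extents, like sb_rextents */
	uint32_t rgextents;	/* extents per rtgroup, like sb_rgextents */
};

/* Length of the trailing runt rtgroup in extents, or 0 if aligned. */
static uint32_t rt_runt_extents(const struct rt_geometry *g)
{
	assert(g->rgextents != 0);
	return (uint32_t)(g->rextents % g->rgextents);
}

/* Number of rtgroups, counting a trailing runt as one group. */
static uint64_t rt_group_count(const struct rt_geometry *g)
{
	assert(g->rgextents != 0);
	return (g->rextents + g->rgextents - 1) / g->rgextents;
}

/* True if the RT device size is a whole multiple of the rtgroup size. */
static bool rt_is_zone_aligned(const struct rt_geometry *g)
{
	return rt_runt_extents(g) == 0;
}
```

With rextents=5192704 and rgextents=65536 from the mkfs output above, this
gives 80 groups and a 15360-extent runt, matching rgcount=80 and the
length reported for RTG 79.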

(The mkfs enforcement does work if you have an actual zoned storage
device since mkfs complains about changes in the zone sizes.)

> to avoid getting into trouble due to fuzzers or mkfs bugs.
>
> Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format")
> Signed-off-by: Christoph Hellwig <hch@lst.de>

How many filesystems are there in the wild with rump rtgroups?  My first
thought was "why not pretend the runt rtgroup doesn't exist?" but then
that creates all sorts of weirdness where you have a 778M rt volume on a
disk with 256M rtgroups, but then we ignore the 10M of space and you can
never get to it.
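
To put numbers on that, here is a small sketch of what ignoring the runt
would mean.  Both helpers are hypothetical illustrations and nothing like
them exists in the kernel; sizes can be in any unit as long as both
arguments use the same one:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: space still reachable if a trailing runt were ignored. */
static uint64_t rt_usable_size(uint64_t rt_size, uint64_t rg_size)
{
	assert(rg_size != 0);
	return rt_size - (rt_size % rg_size);
}

/* Hypothetical: space stranded in the ignored runt rtgroup. */
static uint64_t rt_stranded_size(uint64_t rt_size, uint64_t rg_size)
{
	assert(rg_size != 0);
	return rt_size % rg_size;
}
```

For the 778M volume with 256M rtgroups, rt_usable_size(778, 256) is 768
and rt_stranded_size(778, 256) is 10.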

Given that runt zoned rtgroups can exist in the wild, how hard would it
be to fix zonegc?

--D

> ---
>  fs/xfs/libxfs/xfs_sb.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index cdd16dd805d7..db5231f846ea 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -301,6 +301,19 @@ xfs_validate_rt_geometry(
>  	    sbp->sb_rbmblocks != xfs_expected_rbmblocks(sbp))
>  		return false;
>  
> +	if (xfs_sb_is_v5(sbp) &&
> +	    (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_ZONED)) {
> +		uint32_t		mod;
> +
> +		/*
> +		 * Zoned RT devices must be aligned to the rtgroup size, because
> +		 * garbage collection can't deal with rump RT groups.
> +		 */
> +		div_u64_rem(sbp->sb_rextents, sbp->sb_rgextents, &mod);
> +		if (mod)
> +			return false;
> +	}
> +
>  	return true;
>  }
>  
> -- 
> 2.47.3
> 
> 


* Re: [PATCH] xfs: validate that zoned RT devices are zone aligned
  2025-12-10 16:48 ` Darrick J. Wong
@ 2025-12-10 16:54   ` Christoph Hellwig
  2025-12-10 19:18     ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2025-12-10 16:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, cem, linux-xfs

On Wed, Dec 10, 2025 at 08:48:59AM -0800, Darrick J. Wong wrote:
> mkfs doesn't enforce that when you're creating a zoned filesystem on
> non-zoned storage:
...
> (The mkfs enforcement does work if you have an actual zoned storage
> device since mkfs complains about changes in the zone sizes.)

Ugh, and I thought only my horrible hacks caused that..

> 
> > to avoid getting into trouble due to fuzzers or mkfs bugs.
> >
> > Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format")
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> How many filesystems are there in the wild with rump rtgroups?

I suspect very, very few as zoned mode on non-zoned devices is not a
widely advertised feature, and then you'd also need a weirdly sized device
or a manual override to get it.  And then scrub would complain about it.

> Given that runt zoned rtgroups can exist in the wild, how hard would it
> be to fix zonegc?

Very nasty.  We can't ever GC into one.



* Re: [PATCH] xfs: validate that zoned RT devices are zone aligned
  2025-12-10 16:54   ` Christoph Hellwig
@ 2025-12-10 19:18     ` Darrick J. Wong
  2025-12-11  5:04       ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2025-12-10 19:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, linux-xfs

On Wed, Dec 10, 2025 at 05:54:38PM +0100, Christoph Hellwig wrote:
> On Wed, Dec 10, 2025 at 08:48:59AM -0800, Darrick J. Wong wrote:
> > mkfs doesn't enforce that when you're creating a zoned filesystem on
> > non-zoned storage:
> ...
> > (The mkfs enforcement does work if you have an actual zoned storage
> > device since mkfs complains about changes in the zone sizes.)
> 
> Ugh, and I thought only my horrible hacks caused that..
> 
> > 
> > > to avoid getting into trouble due to fuzzers or mkfs bugs.
> > >
> > > Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format")
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > How many filesystems are there in the wild with rump rtgroups?
> 
> I suspect very, very few as zoned mode on non-zoned devices is not a
> widely advertised feature, and then you'd also need a weirdly sized device
> or a manual override to get it.  And then scrub would complain about it.

<nod>

> > Given that runt zoned rtgroups can exist in the wild, how hard would it
> > be to fix zonegc?
> 
> Very nasty.  We can't ever GC into one.

How nasty is it, exactly?  AFAICT,

 * The zone targeting code (aka the zone we copy into) then has to
   know to avoid a runt endzone?

 * Thresholding gets weird because the thresholds don't apply correctly
   to the runt zone, which means the victim selection is also off.

 * The code that reserves zones for gc or other ENOSPC handling then has
   to ensure it never picks a runt zone to avoid corner case problems

Any other reasons?  Given that zoned is still experimental I think I'm
ok with adding this restriction, but only after some more thorough
understanding. :)

Also does growfs need patching so that it doesn't create a runt zone?

--D


* Re: [PATCH] xfs: validate that zoned RT devices are zone aligned
  2025-12-10 19:18     ` Darrick J. Wong
@ 2025-12-11  5:04       ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2025-12-11  5:04 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, cem, linux-xfs

On Wed, Dec 10, 2025 at 11:18:25AM -0800, Darrick J. Wong wrote:
> How nasty is it, exactly?  AFAICT,
> 
>  * The zone targeting code (aka the zone we copy into) then has to
>    know to avoid a runt endzone?

Yes.  And I don't really have a good idea how to do that.

>  * Thresholding gets weird because the thresholds don't apply correctly
>    to the runt zone, which means the victim selection is also off.

Yes.

>  * The code that reserves zones for gc or other ENOSPC handling then has
>    to ensure it never picks a runt zone to avoid corner case problems

This is what is needed for 1) above with all the same issues.

> Any other reasons?  Given that zoned is still experimental I think I'm
> ok with adding this restriction, but only after some more thorough
> understanding. :)

Yeah.

> Also does growfs need patching so that it doesn't create a runt zone?

I think we run the buffer verifier there, but a nicer check to abort
early would be helpful.  As would be a test case.


