linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
@ 2025-09-26 12:38 Lukas Herbolt
  2025-09-29  6:54 ` Christoph Hellwig
  2025-10-15  6:52 ` [PATCH] " Donald Douwsma
  0 siblings, 2 replies; 7+ messages in thread
From: Lukas Herbolt @ 2025-09-26 12:38 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: linux-xfs, Lukas Herbolt

Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
and so we set lsu to blocksize. But we do not check the the size if
lsunit can be bigger to fit the disk geometry.

Before:
modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
lsblk -tQ 'VENDOR == "XFS_TEST"'; \
mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
modprobe -r scsi_debug
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
sda          0 262144 262144    4096     512    0 bfq       256 512    0B
meta-data=/dev/sda               isize=512    agcount=32, agsize=8192000 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0
data     =                       bsize=4096   blocks=262144000, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=128000, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

After:
modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
lsblk -tQ 'VENDOR == "XFS_TEST"'; \
mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
modprobe -r scsi_debug
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
sda          0 262144 262144    4096     512    0 bfq       256 512    0B
meta-data=/dev/sda               isize=512    agcount=32, agsize=8192000 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0   metadir=0
data     =                       bsize=4096   blocks=262144000, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=128000, version=2
         =                       sectsz=4096  sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
         =                       zoned=0      start=0 reserved=0

Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
---
 mkfs/xfs_mkfs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..05268cd9 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3643,6 +3643,10 @@ check_lsunit:
 		lsu = getnum(cli->lsu, &lopts, L_SU);
 	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
 		lsu = cfg->blocksize; /* lsunit matches filesystem block size */
+		if (cfg->dsunit){
+			cfg->lsunit = cfg->dsunit;
+			lsu = 0;
+		}
 
 	if (lsu) {
 		/* verify if lsu is a multiple block size */
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
@ 2025-09-29  6:54 ` Christoph Hellwig
  2025-09-29  8:59   ` lukas
  2025-10-15  6:52 ` [PATCH] " Donald Douwsma
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-09-29  6:54 UTC (permalink / raw)
  To: Lukas Herbolt; +Cc: aalbersh, djwong, linux-xfs

On Fri, Sep 26, 2025 at 02:38:30PM +0200, Lukas Herbolt wrote:
> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
> and so we set lsu to blocksize. But we do not check the the size if
> lsunit can be bigger to fit the disk geometry.

As I had to walk the code to understand (again for the nth time :))
what the lsunit actually does:  it pads every log write up to that
size.  I.e. if you set a log stripe unit, that effectively becomes the
minimum I/O size for the log.  So yes, setting it to the minimum I/O
size of the device makes sense.  But maybe the commit log should be
a bit more clear about that?  (and of course our terminology should
be as well, ast least outside the user interface that we can't touch).

> Before:

You Before/after also contain changes for metadir/zoned, looks like you
upgraded to a new xfsprogs for your patch, but not the baseline.

> index 8cd4ccd7..05268cd9 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3643,6 +3643,10 @@ check_lsunit:
>  		lsu = getnum(cli->lsu, &lopts, L_SU);
>  	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
>  		lsu = cfg->blocksize; /* lsunit matches filesystem block size */
> +		if (cfg->dsunit){
> +			cfg->lsunit = cfg->dsunit;
> +			lsu = 0;
> +		}

I don't think just picking the data stripe unit is correct here, given
that the log can also be external and on a separate device.  Instead
we'll need to duplicate the calculation based on ft.log, preferably by
factoring it into a helper.

The lsu = 0 also drop the multiple of block size check.  If that is not
a hard requirement (and I'd have to do some research where it is coming
from) we should relax the check instead of silently disabling it like
this.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-09-29  6:54 ` Christoph Hellwig
@ 2025-09-29  8:59   ` lukas
  2025-10-03  7:46     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: lukas @ 2025-09-29  8:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On 2025-09-29 08:54, Christoph Hellwig wrote:
> On Fri, Sep 26, 2025 at 02:38:30PM +0200, Lukas Herbolt wrote:
>> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
>> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
>> and so we set lsu to blocksize. But we do not check the the size if
>> lsunit can be bigger to fit the disk geometry.
> 
> As I had to walk the code to understand (again for the nth time :))
> what the lsunit actually does:  it pads every log write up to that
> size.  I.e. if you set a log stripe unit, that effectively becomes the
> minimum I/O size for the log.  So yes, setting it to the minimum I/O
> size of the device makes sense.  But maybe the commit log should be
> a bit more clear about that?  (and of course our terminology should
> be as well, ast least outside the user interface that we can't touch).
> 
>> Before:
> 
> You Before/after also contain changes for metadir/zoned, looks like you
> upgraded to a new xfsprogs for your patch, but not the baseline.
> 
Yeah it was fedora rawhide, 6.12, did not notice the metadir/zoned.

>> index 8cd4ccd7..05268cd9 100644
>> --- a/mkfs/xfs_mkfs.c
>> +++ b/mkfs/xfs_mkfs.c
>> @@ -3643,6 +3643,10 @@ check_lsunit:
>>  		lsu = getnum(cli->lsu, &lopts, L_SU);
>>  	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
>>  		lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>> +		if (cfg->dsunit){
>> +			cfg->lsunit = cfg->dsunit;
>> +			lsu = 0;
>> +		}
> 
> I don't think just picking the data stripe unit is correct here, given
> that the log can also be external and on a separate device.  Instead
> we'll need to duplicate the calculation based on ft.log, preferably by
> factoring it into a helper.
Hmmm, aren't all the <data|rt|log>.sunit" set by blkid_get_topology()?
So as log is internal lsunit should be equal to dsunit and it can only
differ if the log is external?

Based on comment:

     /*
      * check that log sunit is modulo fsblksize; default it to dsunit 
for
      * an internal log; or the log device stripe unit if it's external.
      */

> The lsu = 0 also drop the multiple of block size check.  If that is not
> a hard requirement (and I'd have to do some research where it is coming
> from) we should relax the check instead of silently disabling it like
> this.

My understanding was the LSU check is there mostly if cli->lsu is set.
Actually if that's assumption is correct it can be done just like this.

---
  mkfs/xfs_mkfs.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..3aecacd3 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3644,7 +3644,7 @@ check_lsunit:
  	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
  		lsu = cfg->blocksize; /* lsunit matches filesystem block size */

-	if (lsu) {
+	if (cli->lsu) {
  		/* verify if lsu is a multiple block size */
  		if (lsu % cfg->blocksize != 0) {
  			fprintf(stderr,
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-09-29  8:59   ` lukas
@ 2025-10-03  7:46     ` Christoph Hellwig
  2025-10-07  7:13       ` [PATCH v2] " Lukas Herbolt
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-03  7:46 UTC (permalink / raw)
  To: lukas; +Cc: Christoph Hellwig, linux-xfs

On Mon, Sep 29, 2025 at 10:59:24AM +0200, lukas@herbolt.com wrote:
> > You Before/after also contain changes for metadir/zoned, looks like you
> > upgraded to a new xfsprogs for your patch, but not the baseline.
> > 
> Yeah it was fedora rawhide, 6.12, did not notice the metadir/zoned.

Np.  I just started at it to see what you changed.

> > I don't think just picking the data stripe unit is correct here, given
> > that the log can also be external and on a separate device.  Instead
> > we'll need to duplicate the calculation based on ft.log, preferably by
> > factoring it into a helper.
> Hmmm, aren't all the <data|rt|log>.sunit" set by blkid_get_topology()?

Yes.

> So as log is internal lsunit should be equal to dsunit and it can only
> differ if the log is external?

Yes.

> My understanding was the LSU check is there mostly if cli->lsu is set.
> Actually if that's assumption is correct it can be done just like this.

That might be a better idea, and should be done in a prep patch.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-10-03  7:46     ` Christoph Hellwig
@ 2025-10-07  7:13       ` Lukas Herbolt
  2025-10-08  6:06         ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Lukas Herbolt @ 2025-10-07  7:13 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, Lukas Herbolt

>> and should be done in a prep patch
Not sure if I got it right, but sending v2

Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
---
v2: rework check lsu only if LSU comes from CLI

---
 mkfs/xfs_mkfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..3aecacd3 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3644,7 +3644,7 @@ check_lsunit:
 	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
 		lsu = cfg->blocksize; /* lsunit matches filesystem block size */
 
-	if (lsu) {
+	if (cli->lsu) {
 		/* verify if lsu is a multiple block size */
 		if (lsu % cfg->blocksize != 0) {
 			fprintf(stderr,
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-10-07  7:13       ` [PATCH v2] " Lukas Herbolt
@ 2025-10-08  6:06         ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-08  6:06 UTC (permalink / raw)
  To: Lukas Herbolt; +Cc: djwong, linux-xfs

On Tue, Oct 07, 2025 at 09:13:00AM +0200, Lukas Herbolt wrote:
> >> and should be done in a prep patch
> Not sure if I got it right, but sending v2
> 
> Signed-off-by: Lukas Herbolt <lukas@herbolt.com>

The changes look good.  You'll want to send them out as a proper
Linux-style patch with a commit log explaining the changes in a separate
mail, although it would make sense to turn it into a series with your
other patch.  git send-email is neat to automate that.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
  2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
  2025-09-29  6:54 ` Christoph Hellwig
@ 2025-10-15  6:52 ` Donald Douwsma
  1 sibling, 0 replies; 7+ messages in thread
From: Donald Douwsma @ 2025-10-15  6:52 UTC (permalink / raw)
  To: Lukas Herbolt, aalbersh, djwong; +Cc: linux-xfs



On 26/9/25 22:38, Lukas Herbolt wrote:
> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
> and so we set lsu to blocksize. But we do not check the the size if
> lsunit can be bigger to fit the disk geometry.
> 
> Before:
> modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
> lsblk -tQ 'VENDOR == "XFS_TEST"'; \
> mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
> modprobe -r scsi_debug
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
> sda          0 262144 262144    4096     512    0 bfq       256 512    0B
> meta-data=/dev/sda               isize=512    agcount=32, agsize=8192000 blks
>           =                       sectsz=4096  attr=2, projid32bit=1
>           =                       crc=1        finobt=1, sparse=1, rmapbt=1
>           =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
>           =                       exchange=0
> data     =                       bsize=4096   blocks=262144000, imaxpct=25
>           =                       sunit=64     swidth=64 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
> log      =internal log           bsize=4096   blocks=128000, version=2
>           =                       sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> After:
> modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
> lsblk -tQ 'VENDOR == "XFS_TEST"'; \
> mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
> modprobe -r scsi_debug
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
> sda          0 262144 262144    4096     512    0 bfq       256 512    0B
> meta-data=/dev/sda               isize=512    agcount=32, agsize=8192000 blks
>           =                       sectsz=4096  attr=2, projid32bit=1
>           =                       crc=1        finobt=1, sparse=1, rmapbt=1
>           =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
>           =                       exchange=0   metadir=0
> data     =                       bsize=4096   blocks=262144000, imaxpct=25
>           =                       sunit=64     swidth=64 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
> log      =internal log           bsize=4096   blocks=128000, version=2
>           =                       sectsz=4096  sunit=64 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>           =                       rgcount=0    rgsize=0 extents
>           =                       zoned=0      start=0 reserved=0
> 

Has it always been like this?

"2f44b1b0 mkfs: rework stripe calculations" changed a lot in this area,
but its harder to follow the code before the change.

> Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
> ---
>   mkfs/xfs_mkfs.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index 8cd4ccd7..05268cd9 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3643,6 +3643,10 @@ check_lsunit:
>   		lsu = getnum(cli->lsu, &lopts, L_SU);
>   	else if (cfg->lsectorsize > XLOG_HEADER_SIZE)

Missing { ?                                           ^

>   		lsu = cfg->blocksize; /* lsunit matches filesystem block size */
> +		if (cfg->dsunit){
> +			cfg->lsunit = cfg->dsunit;
> +			lsu = 0;
> +		}

	}

>   
>   	if (lsu) {
>   		/* verify if lsu is a multiple block size */


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-10-15  6:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
2025-09-29  6:54 ` Christoph Hellwig
2025-09-29  8:59   ` lukas
2025-10-03  7:46     ` Christoph Hellwig
2025-10-07  7:13       ` [PATCH v2] " Lukas Herbolt
2025-10-08  6:06         ` Christoph Hellwig
2025-10-15  6:52 ` [PATCH] " Donald Douwsma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).