* [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
@ 2025-09-26 12:38 Lukas Herbolt
2025-09-29 6:54 ` Christoph Hellwig
2025-10-15 6:52 ` [PATCH] " Donald Douwsma
0 siblings, 2 replies; 7+ messages in thread
From: Lukas Herbolt @ 2025-09-26 12:38 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, Lukas Herbolt
Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
and so we set lsu to blocksize. But we do not check the the size if
lsunit can be bigger to fit the disk geometry.
Before:
modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
lsblk -tQ 'VENDOR == "XFS_TEST"'; \
mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
modprobe -r scsi_debug
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
sda 0 262144 262144 4096 512 0 bfq 256 512 0B
meta-data=/dev/sda isize=512 agcount=32, agsize=8192000 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
= exchange=0
data = bsize=4096 blocks=262144000, imaxpct=25
= sunit=64 swidth=64 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
log =internal log bsize=4096 blocks=128000, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
After:
modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
lsblk -tQ 'VENDOR == "XFS_TEST"'; \
mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
modprobe -r scsi_debug
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
sda 0 262144 262144 4096 512 0 bfq 256 512 0B
meta-data=/dev/sda isize=512 agcount=32, agsize=8192000 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
= exchange=0 metadir=0
data = bsize=4096 blocks=262144000, imaxpct=25
= sunit=64 swidth=64 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
log =internal log bsize=4096 blocks=128000, version=2
= sectsz=4096 sunit=64 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
= rgcount=0 rgsize=0 extents
= zoned=0 start=0 reserved=0
Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
---
mkfs/xfs_mkfs.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..05268cd9 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3643,6 +3643,10 @@ check_lsunit:
lsu = getnum(cli->lsu, &lopts, L_SU);
else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
lsu = cfg->blocksize; /* lsunit matches filesystem block size */
+ if (cfg->dsunit){
+ cfg->lsunit = cfg->dsunit;
+ lsu = 0;
+ }
if (lsu) {
/* verify if lsu is a multiple block size */
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
@ 2025-09-29 6:54 ` Christoph Hellwig
2025-09-29 8:59 ` lukas
2025-10-15 6:52 ` [PATCH] " Donald Douwsma
1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-09-29 6:54 UTC (permalink / raw)
To: Lukas Herbolt; +Cc: aalbersh, djwong, linux-xfs
On Fri, Sep 26, 2025 at 02:38:30PM +0200, Lukas Herbolt wrote:
> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
> and so we set lsu to blocksize. But we do not check the the size if
> lsunit can be bigger to fit the disk geometry.
As I had to walk the code to understand (again for the nth time :))
what the lsunit actually does: it pads every log write up to that
size. I.e. if you set a log stripe unit, that effectively becomes the
minimum I/O size for the log. So yes, setting it to the minimum I/O
size of the device makes sense. But maybe the commit log should be
a bit more clear about that? (and of course our terminology should
be as well, ast least outside the user interface that we can't touch).
> Before:
You Before/after also contain changes for metadir/zoned, looks like you
upgraded to a new xfsprogs for your patch, but not the baseline.
> index 8cd4ccd7..05268cd9 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3643,6 +3643,10 @@ check_lsunit:
> lsu = getnum(cli->lsu, &lopts, L_SU);
> else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
> lsu = cfg->blocksize; /* lsunit matches filesystem block size */
> + if (cfg->dsunit){
> + cfg->lsunit = cfg->dsunit;
> + lsu = 0;
> + }
I don't think just picking the data stripe unit is correct here, given
that the log can also be external and on a separate device. Instead
we'll need to duplicate the calculation based on ft.log, preferably by
factoring it into a helper.
The lsu = 0 also drop the multiple of block size check. If that is not
a hard requirement (and I'd have to do some research where it is coming
from) we should relax the check instead of silently disabling it like
this.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-09-29 6:54 ` Christoph Hellwig
@ 2025-09-29 8:59 ` lukas
2025-10-03 7:46 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: lukas @ 2025-09-29 8:59 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On 2025-09-29 08:54, Christoph Hellwig wrote:
> On Fri, Sep 26, 2025 at 02:38:30PM +0200, Lukas Herbolt wrote:
>> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
>> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
>> and so we set lsu to blocksize. But we do not check the the size if
>> lsunit can be bigger to fit the disk geometry.
>
> As I had to walk the code to understand (again for the nth time :))
> what the lsunit actually does: it pads every log write up to that
> size. I.e. if you set a log stripe unit, that effectively becomes the
> minimum I/O size for the log. So yes, setting it to the minimum I/O
> size of the device makes sense. But maybe the commit log should be
> a bit more clear about that? (and of course our terminology should
> be as well, ast least outside the user interface that we can't touch).
>
>> Before:
>
> You Before/after also contain changes for metadir/zoned, looks like you
> upgraded to a new xfsprogs for your patch, but not the baseline.
>
Yeah it was fedora rawhide, 6.12, did not notice the metadir/zoned.
>> index 8cd4ccd7..05268cd9 100644
>> --- a/mkfs/xfs_mkfs.c
>> +++ b/mkfs/xfs_mkfs.c
>> @@ -3643,6 +3643,10 @@ check_lsunit:
>> lsu = getnum(cli->lsu, &lopts, L_SU);
>> else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
>> lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>> + if (cfg->dsunit){
>> + cfg->lsunit = cfg->dsunit;
>> + lsu = 0;
>> + }
>
> I don't think just picking the data stripe unit is correct here, given
> that the log can also be external and on a separate device. Instead
> we'll need to duplicate the calculation based on ft.log, preferably by
> factoring it into a helper.
Hmmm, aren't all the <data|rt|log>.sunit" set by blkid_get_topology()?
So as log is internal lsunit should be equal to dsunit and it can only
differ if the log is external?
Based on comment:
/*
* check that log sunit is modulo fsblksize; default it to dsunit
for
* an internal log; or the log device stripe unit if it's external.
*/
> The lsu = 0 also drop the multiple of block size check. If that is not
> a hard requirement (and I'd have to do some research where it is coming
> from) we should relax the check instead of silently disabling it like
> this.
My understanding was the LSU check is there mostly if cli->lsu is set.
Actually if that's assumption is correct it can be done just like this.
---
mkfs/xfs_mkfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..3aecacd3 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3644,7 +3644,7 @@ check_lsunit:
else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
lsu = cfg->blocksize; /* lsunit matches filesystem block size */
- if (lsu) {
+ if (cli->lsu) {
/* verify if lsu is a multiple block size */
if (lsu % cfg->blocksize != 0) {
fprintf(stderr,
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-09-29 8:59 ` lukas
@ 2025-10-03 7:46 ` Christoph Hellwig
2025-10-07 7:13 ` [PATCH v2] " Lukas Herbolt
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-03 7:46 UTC (permalink / raw)
To: lukas; +Cc: Christoph Hellwig, linux-xfs
On Mon, Sep 29, 2025 at 10:59:24AM +0200, lukas@herbolt.com wrote:
> > You Before/after also contain changes for metadir/zoned, looks like you
> > upgraded to a new xfsprogs for your patch, but not the baseline.
> >
> Yeah it was fedora rawhide, 6.12, did not notice the metadir/zoned.
Np. I just started at it to see what you changed.
> > I don't think just picking the data stripe unit is correct here, given
> > that the log can also be external and on a separate device. Instead
> > we'll need to duplicate the calculation based on ft.log, preferably by
> > factoring it into a helper.
> Hmmm, aren't all the <data|rt|log>.sunit" set by blkid_get_topology()?
Yes.
> So as log is internal lsunit should be equal to dsunit and it can only
> differ if the log is external?
Yes.
> My understanding was the LSU check is there mostly if cli->lsu is set.
> Actually if that's assumption is correct it can be done just like this.
That might be a better idea, and should be done in a prep patch.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-10-03 7:46 ` Christoph Hellwig
@ 2025-10-07 7:13 ` Lukas Herbolt
2025-10-08 6:06 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Lukas Herbolt @ 2025-10-07 7:13 UTC (permalink / raw)
To: djwong; +Cc: linux-xfs, Lukas Herbolt
>> and should be done in a prep patch
Not sure if I got it right, but sending v2
Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
---
v2: rework check lsu only if LSU comes from CLI
---
mkfs/xfs_mkfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8cd4ccd7..3aecacd3 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3644,7 +3644,7 @@ check_lsunit:
else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
lsu = cfg->blocksize; /* lsunit matches filesystem block size */
- if (lsu) {
+ if (cli->lsu) {
/* verify if lsu is a multiple block size */
if (lsu % cfg->blocksize != 0) {
fprintf(stderr,
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-10-07 7:13 ` [PATCH v2] " Lukas Herbolt
@ 2025-10-08 6:06 ` Christoph Hellwig
0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-08 6:06 UTC (permalink / raw)
To: Lukas Herbolt; +Cc: djwong, linux-xfs
On Tue, Oct 07, 2025 at 09:13:00AM +0200, Lukas Herbolt wrote:
> >> and should be done in a prep patch
> Not sure if I got it right, but sending v2
>
> Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
The changes look good. You'll want to send them out as a proper
Linux-style patch with a commit log explaining the changes in a separate
mail, although it would make sense to turn it into a series with your
other patch. git send-email is neat to automate that.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks.
2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
2025-09-29 6:54 ` Christoph Hellwig
@ 2025-10-15 6:52 ` Donald Douwsma
1 sibling, 0 replies; 7+ messages in thread
From: Donald Douwsma @ 2025-10-15 6:52 UTC (permalink / raw)
To: Lukas Herbolt, aalbersh, djwong; +Cc: linux-xfs
On 26/9/25 22:38, Lukas Herbolt wrote:
> Creating of XFS on 4kN or 512e disk result in suboptimal LSU/LSUNIT.
> As of now we check if the sectorsize is bigger than XLOG_HEADER_SIZE
> and so we set lsu to blocksize. But we do not check the the size if
> lsunit can be bigger to fit the disk geometry.
>
> Before:
> modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
> lsblk -tQ 'VENDOR == "XFS_TEST"'; \
> mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
> modprobe -r scsi_debug
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
> sda 0 262144 262144 4096 512 0 bfq 256 512 0B
> meta-data=/dev/sda isize=512 agcount=32, agsize=8192000 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=1 inobtcount=1 nrext64=1
> = exchange=0
> data = bsize=4096 blocks=262144000, imaxpct=25
> = sunit=64 swidth=64 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
> log =internal log bsize=4096 blocks=128000, version=2
> = sectsz=4096 sunit=1 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
> After:
> modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000; \
> lsblk -tQ 'VENDOR == "XFS_TEST"'; \
> mkfs.xfs -f $(lsblk -Q 'VENDOR == "XFS_TEST"' -no path) 2>/dev/null; sleep 1; \
> modprobe -r scsi_debug
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
> sda 0 262144 262144 4096 512 0 bfq 256 512 0B
> meta-data=/dev/sda isize=512 agcount=32, agsize=8192000 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=1 inobtcount=1 nrext64=1
> = exchange=0 metadir=0
> data = bsize=4096 blocks=262144000, imaxpct=25
> = sunit=64 swidth=64 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
> log =internal log bsize=4096 blocks=128000, version=2
> = sectsz=4096 sunit=64 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> = rgcount=0 rgsize=0 extents
> = zoned=0 start=0 reserved=0
>
Has it always been like this?
"2f44b1b0 mkfs: rework stripe calculations" changed a lot in this area,
but its harder to follow the code before the change.
> Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
> ---
> mkfs/xfs_mkfs.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index 8cd4ccd7..05268cd9 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3643,6 +3643,10 @@ check_lsunit:
> lsu = getnum(cli->lsu, &lopts, L_SU);
> else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
Missing { ? ^
> lsu = cfg->blocksize; /* lsunit matches filesystem block size */
> + if (cfg->dsunit){
> + cfg->lsunit = cfg->dsunit;
> + lsu = 0;
> + }
}
>
> if (lsu) {
> /* verify if lsu is a multiple block size */
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-10-15 6:52 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-26 12:38 [PATCH] mkfs.xfs fix sunit size on 512e and 4kN disks Lukas Herbolt
2025-09-29 6:54 ` Christoph Hellwig
2025-09-29 8:59 ` lukas
2025-10-03 7:46 ` Christoph Hellwig
2025-10-07 7:13 ` [PATCH v2] " Lukas Herbolt
2025-10-08 6:06 ` Christoph Hellwig
2025-10-15 6:52 ` [PATCH] " Donald Douwsma
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).