From: "Darrick J. Wong" <djwong@kernel.org>
To: aalbersh@kernel.org
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/4] mkfs: fix log sunit automatic configuration
Date: Thu, 5 Mar 2026 14:56:54 -0800 [thread overview]
Message-ID: <20260305225654.GK57948@frogsfrogsfrogs> (raw)
In-Reply-To: <177268457083.1999857.7479249726865742847.stgit@frogsfrogsfrogs>
On Wed, Mar 04, 2026 at 08:24:55PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> This patch fixes ~96% failure rates on fstests on my test fleet, some
> of which now have 4k LBA disks with unexciting min/opt io geometry:
>
> # lsblk -t /dev/sda
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
> sda 0 4096 1048576 4096 512 1 bfq 256 2048 0B
> # mkfs.xfs -f -N /dev/sda3
> meta-data=/dev/sda3 isize=512 agcount=4, agsize=2183680 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=1 inobtcount=1 nrext64=1
> = exchange=1 metadir=0
> data = bsize=4096 blocks=8734720, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1
> log =internal log bsize=4096 blocks=16384, version=2
> = sectsz=4096 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> = rgcount=0 rgsize=0 extents
> = zoned=0 start=0 reserved=0
>
> Note that MIN-IO == PHY-SEC, so dsunit/dswidth are zero. With this
> change, we no longer set the lsunit to the fsblock size if the log
> sector size is greater than 512. Unfortunately, dsunit is also not set,
> so mkfs never sets the log sunit and it remains zero. I think
> this causes problems with the log roundoff computation in the kernel:
>
> if (xfs_has_logv2(mp) && mp->m_sb.sb_logsunit > 1)
> log->l_iclog_roundoff = mp->m_sb.sb_logsunit;
> else
> log->l_iclog_roundoff = BBSIZE;
>
> because now the roundoff factor is less than the log sector size. After
> a while, the filesystem cannot be mounted anymore because:
>
> XFS (sda3): Mounting V5 Filesystem 81b8ffa8-383b-4574-a68c-9b8202707a26
> XFS (sda3): Corruption warning: Metadata has LSN (4:2729) ahead of current LSN (4:2727). Please unmount and run xfs_repair (>= v4.3) to resolve.
> XFS (sda3): log mount/recovery failed: error -22
> XFS (sda3): log mount failed
>
> Reverting this patch makes the problem go away, but I think you're
> trying to make it so that mkfs will set lsunit = dsunit if dsunit>0 and
> the caller didn't specify any -lsunit= parameter, right?
>
> But there's something that just seems off with this whole function. If
> the user provided a -lsunit/-lsu option then we need to validate the
> value and either use it if it makes sense, or complain if not. If the
> user didn't specify any option, then we should figure it out
> automatically from the other data device geometry options (internal) or
> the external log device probing.
>
> But that's not what this function does. Why would you do this:
>
> else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
> lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>
> and then loudly validate that lsu (bytes) is congruent with the fsblock
> size? This is trivially true, but then it disables the "make lsunit use
> dsunit if set" logic below:
>
> } else if (cfg->sb_feat.log_version == 2 &&
> cfg->loginternal && cfg->dsunit) {
> /* lsunit and dsunit now in fs blocks */
> cfg->lsunit = cfg->dsunit;
> }
>
> AFAICT, the "lsunit matches fs block size" logic is buggy. This code
> was added with no justification as part of a "reworking" commit
> 2f44b1b0e5adc4 ("mkfs: rework stripe calculations") back in 2017. I
> think the correct logic is to move the "lsunit matches fs block size"
> logic to the no-lsunit-option code after the validation code.
>
> This seems to set sb_logsunit to 4096 on my test VM, to 0 on the even
> more boring VMs with 512 physical sectors, and to 262144 with the
> scsi_debug device that Lukas Herbolt created with:
>
> # modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000
>
> Cc: <linux-xfs@vger.kernel.org> # v4.15.0
> Fixes: 2f44b1b0e5adc4 ("mkfs: rework stripe calculations")
> Fixes: ca1eb448e116da ("mkfs.xfs fix sunit size on 512e and 4kN disks.")
> Link: https://lore.kernel.org/linux-xfs/20250926123829.2101207-2-lukas@herbolt.com/
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> mkfs/xfs_mkfs.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
>
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index ece20905b28313..a45859dd633e98 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3624,10 +3624,8 @@ _("%s: Stripe unit(%d) or stripe width(%d) is not a multiple of the block size(%
> lsunit = cli->lsunit;
> else if (cli_opt_set(&lopts, L_SU))
> lsu = getnum(cli->lsu, &lopts, L_SU);
> - else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
> - lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>
> - if (cli->lsu) {
> + if (lsu) {
> /* verify if lsu is a multiple block size */
> if (lsu % cfg->blocksize != 0) {
> fprintf(stderr,
> @@ -3651,10 +3649,14 @@ _("log stripe unit (%d) must be a multiple of the block size (%d)\n"),
> if (lsunit) {
> /* convert from 512 byte blocks to fs blocks */
> cfg->lsunit = DTOBT(lsunit, cfg->blocklog);
> - } else if (cfg->sb_feat.log_version == 2 &&
> - cfg->loginternal && cfg->dsunit) {
> - /* lsunit and dsunit now in fs blocks */
> - cfg->lsunit = cfg->dsunit;
> + } else if (cfg->sb_feat.log_version == 2 && cfg->loginternal) {
> + if (cfg->dsunit) {
> + /* lsunit and dsunit now in fs blocks */
> + cfg->lsunit = cfg->dsunit;
> + } else if (cfg->lsectorsize > XLOG_HEADER_SIZE) {
> + /* lsunit matches filesystem block size */
> + cfg->lsunit = 1;
> + }
> } else if (cfg->sb_feat.log_version == 2 &&
> !cfg->loginternal) {
> /* use the external log device properties */
There's a bug in this patch; the lsectorsize > XLOG_HEADER_SIZE also
needs to be copied downwards to the external log configuration part.
--D
next prev parent reply other threads:[~2026-03-05 22:56 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-05 4:24 [PATCHSET] xfsprogs: various bug fixes for 6.19 Darrick J. Wong
2026-03-05 4:24 ` [PATCH 1/4] misc: fix a few memory leaks Darrick J. Wong
2026-03-05 14:03 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 2/4] libxfs: fix data corruption bug in libxfs_file_write Darrick J. Wong
2026-03-05 7:46 ` Holger Hoffstätte
2026-03-05 15:56 ` Darrick J. Wong
2026-03-05 14:04 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 3/4] mkfs: fix protofile data corruption when in/out file block sizes don't match Darrick J. Wong
2026-03-05 14:04 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 4/4] mkfs: fix log sunit automatic configuration Darrick J. Wong
2026-03-05 14:05 ` Christoph Hellwig
2026-03-05 22:56 ` Darrick J. Wong [this message]
-- strict thread matches above, loose matches on Subject: below --
2026-03-10 3:24 [PATCHSET v2] xfsprogs: various bug fixes for 6.19 Darrick J. Wong
2026-03-10 3:25 ` [PATCH 4/4] mkfs: fix log sunit automatic configuration Darrick J. Wong
2026-03-10 6:43 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260305225654.GK57948@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=aalbersh@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.