From: "Darrick J. Wong" <djwong@kernel.org>
To: aalbersh@kernel.org
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/4] mkfs: fix log sunit automatic configuration
Date: Thu, 5 Mar 2026 14:56:54 -0800 [thread overview]
Message-ID: <20260305225654.GK57948@frogsfrogsfrogs> (raw)
In-Reply-To: <177268457083.1999857.7479249726865742847.stgit@frogsfrogsfrogs>
On Wed, Mar 04, 2026 at 08:24:55PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> This patch fixes ~96% failure rates on fstests on my test fleet, some
> of which now have 4k LBA disks with unexciting min/opt io geometry:
>
> # lsblk -t /dev/sda
> NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE RA WSAME
> sda 0 4096 1048576 4096 512 1 bfq 256 2048 0B
> # mkfs.xfs -f -N /dev/sda3
> meta-data=/dev/sda3 isize=512 agcount=4, agsize=2183680 blks
> = sectsz=4096 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=1 inobtcount=1 nrext64=1
> = exchange=1 metadir=0
> data = bsize=4096 blocks=8734720, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1
> log =internal log bsize=4096 blocks=16384, version=2
> = sectsz=4096 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> = rgcount=0 rgsize=0 extents
> = zoned=0 start=0 reserved=0
>
> Note that MIN-IO == PHY-SEC, so dsunit/dswidth are zero. With this
> change, we no longer set the lsunit to the fsblock size if the log
> sector size is greater than 512. Unfortunately, dsunit is also not set,
> so mkfs never sets the log sunit and it remains zero. I think
> this causes problems with the log roundoff computation in the kernel:
>
> if (xfs_has_logv2(mp) && mp->m_sb.sb_logsunit > 1)
> log->l_iclog_roundoff = mp->m_sb.sb_logsunit;
> else
> log->l_iclog_roundoff = BBSIZE;
>
> because now the roundoff factor is less than the log sector size. After
> a while, the filesystem cannot be mounted anymore because:
>
> XFS (sda3): Mounting V5 Filesystem 81b8ffa8-383b-4574-a68c-9b8202707a26
> XFS (sda3): Corruption warning: Metadata has LSN (4:2729) ahead of current LSN (4:2727). Please unmount and run xfs_repair (>= v4.3) to resolve.
> XFS (sda3): log mount/recovery failed: error -22
> XFS (sda3): log mount failed
>
> Reverting this patch makes the problem go away, but I think you're
> trying to make it so that mkfs will set lsunit = dsunit if dsunit>0 and
> the caller didn't specify any -lsunit= parameter, right?
>
> But there's something that just seems off with this whole function. If
> the user provided a -lsunit/-lsu option then we need to validate the
> value and either use it if it makes sense, or complain if not. If the
> user didn't specify any option, then we should figure it out
> automatically from the other data device geometry options (internal) or
> the external log device probing.
>
> But that's not what this function does. Why would you do this:
>
> else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
> lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>
> and then loudly validate that lsu (bytes) is congruent with the fsblock
> size? This is trivially true, but then it disables the "make lsunit use
> dsunit if set" logic below:
>
> } else if (cfg->sb_feat.log_version == 2 &&
> cfg->loginternal && cfg->dsunit) {
> /* lsunit and dsunit now in fs blocks */
> cfg->lsunit = cfg->dsunit;
> }
>
> AFAICT, the "lsunit matches fs block size" logic is buggy. This code
> was added with no justification as part of a "reworking" commit
> 2f44b1b0e5adc4 ("mkfs: rework stripe calculations") back in 2017. I
> think the correct logic is to move the "lsunit matches fs block size"
> logic to the no-lsunit-option code after the validation code.
>
> This seems to set sb_logsunit to 4096 on my test VM, to 0 on the even
> more boring VMs with 512 physical sectors, and to 262144 with the
> scsi_debug device that Lukas Herbolt created with:
>
> # modprobe scsi_debug inq_vendor=XFS_TEST physblk_exp=3 sector_size=512 \
> opt_xferlen_exp=9 opt_blks=512 dev_size_mb=100 virtual_gb=1000
>
> Cc: <linux-xfs@vger.kernel.org> # v4.15.0
> Fixes: 2f44b1b0e5adc4 ("mkfs: rework stripe calculations")
> Fixes: ca1eb448e116da ("mkfs.xfs fix sunit size on 512e and 4kN disks.")
> Link: https://lore.kernel.org/linux-xfs/20250926123829.2101207-2-lukas@herbolt.com/
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> mkfs/xfs_mkfs.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
>
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index ece20905b28313..a45859dd633e98 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -3624,10 +3624,8 @@ _("%s: Stripe unit(%d) or stripe width(%d) is not a multiple of the block size(%
> lsunit = cli->lsunit;
> else if (cli_opt_set(&lopts, L_SU))
> lsu = getnum(cli->lsu, &lopts, L_SU);
> - else if (cfg->lsectorsize > XLOG_HEADER_SIZE)
> - lsu = cfg->blocksize; /* lsunit matches filesystem block size */
>
> - if (cli->lsu) {
> + if (lsu) {
> /* verify if lsu is a multiple block size */
> if (lsu % cfg->blocksize != 0) {
> fprintf(stderr,
> @@ -3651,10 +3649,14 @@ _("log stripe unit (%d) must be a multiple of the block size (%d)\n"),
> if (lsunit) {
> /* convert from 512 byte blocks to fs blocks */
> cfg->lsunit = DTOBT(lsunit, cfg->blocklog);
> - } else if (cfg->sb_feat.log_version == 2 &&
> - cfg->loginternal && cfg->dsunit) {
> - /* lsunit and dsunit now in fs blocks */
> - cfg->lsunit = cfg->dsunit;
> + } else if (cfg->sb_feat.log_version == 2 && cfg->loginternal) {
> + if (cfg->dsunit) {
> + /* lsunit and dsunit now in fs blocks */
> + cfg->lsunit = cfg->dsunit;
> + } else if (cfg->lsectorsize > XLOG_HEADER_SIZE) {
> + /* lsunit matches filesystem block size */
> + cfg->lsunit = 1;
> + }
> } else if (cfg->sb_feat.log_version == 2 &&
> !cfg->loginternal) {
> /* use the external log device properties */
There's a bug in this patch; the lsectorsize > XLOG_HEADER_SIZE also
needs to be copied downwards to the external log configuration part.
--D
next prev parent reply other threads:[~2026-03-05 22:56 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-05 4:24 [PATCHSET] xfsprogs: various bug fixes for 6.19 Darrick J. Wong
2026-03-05 4:24 ` [PATCH 1/4] misc: fix a few memory leaks Darrick J. Wong
2026-03-05 14:03 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 2/4] libxfs: fix data corruption bug in libxfs_file_write Darrick J. Wong
2026-03-05 7:46 ` Holger Hoffstätte
2026-03-05 15:56 ` Darrick J. Wong
2026-03-05 14:04 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 3/4] mkfs: fix protofile data corruption when in/out file block sizes don't match Darrick J. Wong
2026-03-05 14:04 ` Christoph Hellwig
2026-03-05 4:24 ` [PATCH 4/4] mkfs: fix log sunit automatic configuration Darrick J. Wong
2026-03-05 14:05 ` Christoph Hellwig
2026-03-05 22:56 ` Darrick J. Wong [this message]
-- strict thread matches above, loose matches on Subject: below --
2026-03-10 3:24 [PATCHSET v2] xfsprogs: various bug fixes for 6.19 Darrick J. Wong
2026-03-10 3:25 ` [PATCH 4/4] mkfs: fix log sunit automatic configuration Darrick J. Wong
2026-03-10 6:43 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260305225654.GK57948@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=aalbersh@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox