From: "Darrick J. Wong" <djwong@kernel.org>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"fstests@vger.kernel.org" <fstests@vger.kernel.org>
Subject: Re: mkfs.xfs "concurrency" change concerns
Date: Fri, 10 Oct 2025 12:17:13 -0700 [thread overview]
Message-ID: <20251010191713.GE6188@frogsfrogsfrogs> (raw)
In-Reply-To: <84c8a5e5-938d-4745-996d-4237009c9cc5@sandeen.net>
On Thu, Oct 09, 2025 at 03:13:47PM -0500, Eric Sandeen wrote:
> Hey all -
>
> this got long, so tl;dr:
>
> 1) concurrency geometry breaks some xfstests for me
> 2) concurrency behavior is not consistent w/ loopback vs. imagefile
> 3) concurrency defaults to the mkfs machine not the mount machine
>
> In detail:
>
> So, I realize I'm late to the game here and didn't review the patches
> before they went in, but it looks like the "concurrency" mkfs.xfs
> arguments and defaults are breaking several xfstests.
>
> 4738ff0 mkfs: allow sizing realtime allocation groups for concurrency
> c02a1873 mkfs: allow sizing internal logs for concurrency
> 9338bc8b mkfs: allow sizing allocation groups for concurrency
>
> Specifically, xfs/078, xfs/216, and xfs/217 are failing for us
> on various machines with between 8 and 128 CPUS, due to the
> fundamental change in geometry that results from the new
> concurrency behavior, which makes any consistent golden
> output that involves geometry details quite difficult.
Uggggh. You're right, I see golden output changes in xfs/078 if I boost
the number of VM CPUs past four:
--- /run/fstests/bin/tests/xfs/078.out 2025-07-15 14:41:40.195202883 -0700
+++ /var/tmp/fstests/xfs/078.out.bad 2025-10-10 11:56:14.040263143 -0700
@@ -188,6 +188,6 @@
*** mount loop filesystem
*** grow loop filesystem
xfs_growfs --BlockSize=4096 --Blocks=268435456
-data blocks changed from 268435456 to 4194304001
+data blocks changed from 268435456 to 4194304000
*** unmount
*** all done
Can this happen if RAID stripe parameters also get involved and change
the AG count? This test could be improved by parsing the block counts
and using _within to get past those kinds of problems, if there's more
than one way to make the golden output wrong despite correct operation.
What do your xfs/21[67] failures look like?
> One option might be to detect whether the "concurrency" args
> exist in mkfs.xfs, and set that back to 4, which is probably likely
> to more or less behave the old way, and match the current golden
> output which was (usually) based on 4 AGs. But that might break
> the purpose of some of the tests, if we're only validating behavior
> when a specific set of arguments is applied.
I think you're really asking to force the old behavior from before the
concurrency options existed, but only if fstests is running. Or maybe
a little more than that; I'll get to that at the end.
> (for 078, adding -d concurrency=4 seems to fix it. For 216 and 217
> I think I needed -l concurrency=4, but this might depend on nr cpus.)
>
> So, we could probably fix xfstests to make mkfs.xfs behave the old way,
> with loss of coverage of behavior with current code defaults.
Well yes, you'd be losing test coverage either for configurations that
set concurrency options explicitly, or when the storage are
nonrotational.
> Other concerns, though - I see that we only do this if the storage
> is nonrotational. But in testing, if you set up a loop device, the
> loop dev is nonrotational, and gets the new concurrency behavior,
> while doing a mkfs.xfs directly on the backing file doesn't:
>
> # losetup /dev/loop4 testfile.img
>
> # mkfs.xfs -f /dev/loop4 2>&1 | grep agcount
> meta-data=/dev/loop4 isize=512 agcount=6, agsize=11184810 blks
>
> # mkfs.xfs -f testfile.img 2>&1 | grep agcount
> meta-data=testfile.img isize=512 agcount=4, agsize=16777216 blks
>
> so we get different behavior depending on how you access the image file.
What kernel is this? 6.17 sets ROTATIONAL by default and clears it if
the backing bdev (or the bdev backing the file) has ROTATIONAL set.
That might be why you see the discrepancy. I think that behavior has
been in the kernel since ~6.11 or so.
[Aside: Obviously, checking inode->i_sb->sb_bdev isn't sufficient for
files on a multi-disk filesystem, but it's probably close enough here.]
(But see below)
> And speaking of image files, it's a pretty common use case to use mkfs.xfs
> on image files for deployment elsewhere. Maybe the good news, even if
Yes, mkfs defaults to assuming rotational (and hence not computing a
concurrency factor) if the BLKROTATIONAL query fails. So that might
be why you get 4 AGs on a regular file but 6 on a loop device pointing
to the same file.
> accidental, is that if you mkfs the file directly, you don't get system-
> specific "concurrence" geometry. But I am concerned that there is no
> guarantee that the machine performing mkfs is the machine that will mount
> the filesystem, so this seems like a slightly dangerous assumption for
> default behavior.
What I tell our internal customers is:
1. Defer formatting until deployment whenever possible so that mkfs can
optimize the filesystem for the storage and machine it actually gets.
2. If you can't do that, then try to make the image creator machine
match the deployment hardware as much as possible in terms of
rotationality and CPU count.
3. We shouldn't have a #3 because that's leaving performance on the
table.
They were very happy to see performance gains after adjusting their
WOS generation scripts towards #1.
But I think it's this #3 here that's causing the most concern for you?
I suppose if you really don't know what the deployment hardware is going
to look like then ... falling back to the default calculations (which
are mostly for spinning-rust) at least is familiar.
Would the creation of -[dlr] concurrency=never options to mkfs.xfs
address all of your concerns?
> I understand the desire to DTRT by default, but I am concerned about
> test breakage, loopdev inconsistencies, and too-broad assumptions about
> where the resulting filesystem will actually be used.
That itself is a ... very broad statement considering that this code
landed in 6.8 and this is the first I've heard of complaints. ;)
--D
> Thoughts?
> Thanks,
> -Eric
>
>
next prev parent reply other threads:[~2025-10-10 19:17 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-09 20:13 mkfs.xfs "concurrency" change concerns Eric Sandeen
2025-10-10 5:17 ` Christoph Hellwig
2025-10-10 19:17 ` Darrick J. Wong [this message]
2025-10-10 19:34 ` Darrick J. Wong
2025-10-10 20:47 ` Eric Sandeen
2025-10-14 2:32 ` Darrick J. Wong
2025-10-14 15:36 ` Eric Sandeen
2025-10-17 22:46 ` Darrick J. Wong
2025-10-18 15:01 ` Eric Sandeen
2026-06-15 14:19 ` Christoph Hellwig
2026-06-15 15:24 ` Eric Sandeen
2026-06-15 15:40 ` Darrick J. Wong
2026-06-16 5:24 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251010191713.GE6188@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=fstests@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox