From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 547C62FBDE9; Tue, 14 Oct 2025 02:32:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760409149; cv=none; b=INm0dc7Spg1N0bNtD6jclmqM61YG5nmCwHH0Uh4FPwJ+lH4aH7PLK0jFRcLwnhLXE2ZPJFMzk2ha316232abxUSJ8DgneKaBy+sTJvaL6keTuuA4zYZircm73kI3NQfWn7BfI9zFRsr18IGqS2XpiBvUKeKa4vMhk+GgYVw72Ak= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760409149; c=relaxed/simple; bh=su72gBkImdKZBIdIjJ210Mp/DwXHGma4OX6PPsHc4UU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=iefFzI2ahAjXq8IGf37oautdiCTLTu7kDC7hZZT8umSSDzkFVk0X6y9qYz926KZIrq1HlRz4ueRSHFJ/v7PgvPOeL+qC1bZ4ck1dF//abe2tMYGoTNndefGsrqR5rKpsr2J8ISnf9HQXOr0JQjbVHP+TNoNxvZOx1hb0JTWmSbM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kudkPD6o; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kudkPD6o" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C1E92C4CEE7; Tue, 14 Oct 2025 02:32:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760409148; bh=su72gBkImdKZBIdIjJ210Mp/DwXHGma4OX6PPsHc4UU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=kudkPD6od2QGb2R91+vsJvdnwvdb6kTLgcqs6fKlwqao6sz/wFXP8iEkOsXZFz0Gb P6FCHtDbb+k/IDbklSjSMsikAJ+H+2tT6+QyjilPup0EKmKnWbw6ZfefOdGWgxKUk7 yprzsl7tqQpeTX3/+hljrFCZ2qQzuswtPmik6JRgCsGKf9u0wi85OSfVMaUAXif4q+ TxIaDNMZ/I5ehmJ1/czHqPf5rRS1hVd5+b/RzI/kK7ZJhK/nAPcGJ5dJPObtYaRS9k DcnHZwgB8OKWCEHLmrCeZ/kEGnRE5/+itFY+YCRU4rYVaa7apOdvuJgSY4L2T2CR0g 7feD+OH7I79EA== Date: Mon, 13 Oct 2025 19:32:28 -0700 From: "Darrick J. Wong" To: Eric Sandeen Cc: "linux-xfs@vger.kernel.org" , "fstests@vger.kernel.org" , Christoph Hellwig Subject: Re: mkfs.xfs "concurrency" change concerns Message-ID: <20251014023228.GU6188@frogsfrogsfrogs> References: <84c8a5e5-938d-4745-996d-4237009c9cc5@sandeen.net> <20251010191713.GE6188@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: fstests@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Fri, Oct 10, 2025 at 03:47:14PM -0500, Eric Sandeen wrote: > On 10/10/25 2:17 PM, Darrick J. Wong wrote: > > On Thu, Oct 09, 2025 at 03:13:47PM -0500, Eric Sandeen wrote: > >> Hey all - > >> > >> this got long, so tl;dr: > >> > >> 1) concurrency geometry breaks some xfstests for me > >> 2) concurrency behavior is not consistent w/ loopback vs. imagefile > >> 3) concurrency defaults to the mkfs machine not the mount machine > > ... > > > Uggggh. You're right, I see golden output changes in xfs/078 if I boost > > the number of VM CPUs past four: > > > > --- /run/fstests/bin/tests/xfs/078.out 2025-07-15 14:41:40.195202883 -0700 > > +++ /var/tmp/fstests/xfs/078.out.bad 2025-10-10 11:56:14.040263143 -0700 > > @@ -188,6 +188,6 @@ > > *** mount loop filesystem > > *** grow loop filesystem > > xfs_growfs --BlockSize=4096 --Blocks=268435456 > > -data blocks changed from 268435456 to 4194304001 > > +data blocks changed from 268435456 to 4194304000 > > *** unmount > > *** all done > > Yeah I saw exactly that. > > > Can this happen if RAID stripe parameters also get involved and change > > the AG count? This test could be improved by parsing the block counts > > and using _within to get past those kinds of problems, if there's more > > than one way to make the golden output wrong despite correct operation. > > Maybe? Not sure tbh. I hadn't seen it fail before. > > > What do your xfs/21[67] failures look like? > > On a VM w/ 8 CPUs, like this: > > xfs/216 4s ... - output mismatch (see /root/xfstests-dev/results//xfs/216.out.bad) > --- tests/xfs/216.out 2024-02-08 15:39:34.883667028 -0600 > +++ /root/xfstests-dev/results//xfs/216.out.bad 2025-10-10 15:28:24.055130959 -0500 > @@ -7,4 +7,4 @@ > fssize=32g log =internal log bsize=4096 blocks=16384, version=2 > fssize=64g log =internal log bsize=4096 blocks=16384, version=2 > fssize=128g log =internal log bsize=4096 blocks=16384, version=2 > -fssize=256g log =internal log bsize=4096 blocks=32768, version=2 > +fssize=256g log =internal log bsize=4096 blocks=32767, version=2 > ... > (Run 'diff -u /root/xfstests-dev/tests/xfs/216.out /root/xfstests-dev/results//xfs/216.out.bad' to see the entire diff) > Ran: xfs/216 > Failures: xfs/216 > Failed 1 of 1 tests > > 217 was similarly off by one. So maybe _within works ... > > >> One option might be to detect whether the "concurrency" args > >> exist in mkfs.xfs, and set that back to 4, which is probably likely > >> to more or less behave the old way, and match the current golden > >> output which was (usually) based on 4 AGs. But that might break > >> the purpose of some of the tests, if we're only validating behavior > >> when a specific set of arguments is applied. > > > > I think you're really asking to force the old behavior from before the > > concurrency options existed, but only if fstests is running. Or maybe > > a little more than that; I'll get to that at the end. > > Not so much asking for as musing about, and I don't think it's the > best idea ... > > ... > > >> # losetup /dev/loop4 testfile.img > >> > >> # mkfs.xfs -f /dev/loop4 2>&1 | grep agcount > >> meta-data=/dev/loop4 isize=512 agcount=6, agsize=11184810 blks > >> > >> # mkfs.xfs -f testfile.img 2>&1 | grep agcount > >> meta-data=testfile.img isize=512 agcount=4, agsize=16777216 blks > >> > >> so we get different behavior depending on how you access the image file. > > > > What kernel is this? 6.17 sets ROTATIONAL by default and clears it if > > the backing bdev (or the bdev backing the file) has ROTATIONAL set. > > That might be why you see the discrepancy. I think that behavior has > > been in the kernel since ~6.11 or so. > > This was 6.17. The backing file get the nonrotational/concurrency treatment > but the loop device does. This probably says more about the xfsprogs test > (ddev_is_solidstate) than the kernel. > > ioctl(3, BLKROTATIONAL, 0x7ffd9d48f696) = -1 ENOTTY (Inappropriate ioctl for device) > > fails, so ddev_is_solidstate returns false. For the loop dev, BLKROTATIONAL > says rotational == 0 so we get true for solidstate. > > But TBH this might be the right answer for mkfsing a file directly, as it is > likely an image destined for another machine. > > Perhaps ddev_is_solidstate() should default to "not solidstate" for regular > files /and/ loopback devices if possible? It's *possible*, but why would mkfs ignore what the kernel tells us? Suppose you're creating an image file on a filesystem sitting on a SSD with the intent of deploying the image in a mostly non-rotational environment. Now those people don't get any of the SSD optimizations even though the creator had an SSD > > [Aside: Obviously, checking inode->i_sb->sb_bdev isn't sufficient for > > files on a multi-disk filesystem, but it's probably close enough here.] > > > > (But see below) > > > >> And speaking of image files, it's a pretty common use case to use mkfs.xfs > >> on image files for deployment elsewhere. Maybe the good news, even if > > > > Yes, mkfs defaults to assuming rotational (and hence not computing a > > concurrency factor) if the BLKROTATIONAL query fails. So that might > > be why you get 4 AGs on a regular file but 6 on a loop device pointing > > to the same file. > > Yes, exactly. > > >> accidental, is that if you mkfs the file directly, you don't get system- > >> specific "concurrence" geometry. But I am concerned that there is no > >> guarantee that the machine performing mkfs is the machine that will mount > >> the filesystem, so this seems like a slightly dangerous assumption for > >> default behavior. > > > > What I tell our internal customers is: > > > > 1. Defer formatting until deployment whenever possible so that mkfs can > > optimize the filesystem for the storage and machine it actually gets. > > > > 2. If you can't do that, then try to make the image creator machine > > match the deployment hardware as much as possible in terms of > > rotationality and CPU count. > > I just don't think that's practical in real life when you're creating a > generic OS image for wide distribution into unknown environments. Uhhh well I exist in real life too. > > 3. We shouldn't have a #3 because that's leaving performance on the > > table. > > > > They were very happy to see performance gains after adjusting their > > WOS generation scripts towards #1. > > > > But I think it's this #3 here that's causing the most concern for you? > > I suppose if you really don't know what the deployment hardware is going > > to look like then ... falling back to the default calculations (which > > are mostly for spinning-rust) at least is familiar. > > > > Would the creation of -[dlr] concurrency=never options to mkfs.xfs > > address all of your concerns? > > Not quite, because *defaults* have changed, and it's almost impossible > to chase down every consumer who's going to happily update xfsprogs and > suddenly get wildly different geometry which might really help them! > Or might make no sense at all. But that horse has kind of left the barn, > I guess. I would say that the *hardware* has changed, and so should the defaults. The pre-concurrency= defaults, FWIW, are what past xfs developers worked out as a reasonable setting for spinning rust. > I'm tempted to say "do not do the concurrency thing for regular files or > for loopback files because that's a strong hint that this filesystem has > at least a good chance of being destined for another machine." Thoughts? > > (from your other email) > > > Now that I've read the manpage, I'm reminded that -[dlr] concurrency=0 > > already disables the automatic calculation. Does injecting that into > > dparam (in xfs/078) fix the problem for you? It does for me. > > I'm sure it does. Gets back to my question about whether that is too > detrimental to test coverage for the default behavior. I think probably > a _within is better. _within would take a little work since we're comparing > strings with numbers in them not just numbers but I think we really do > care only about one number in the string, so it's probably fine. Yeah. Could we turn the last two digits in a sequence into "XX" so that as long as it's not off by more than 100 fsblocks then everything is ok? > >> I understand the desire to DTRT by default, but I am concerned about > >> test breakage, loopdev inconsistencies, and too-broad assumptions about > >> where the resulting filesystem will actually be used. > > > > That itself is a ... very broad statement considering that this code > > landed in 6.8 and this is the first I've heard of complaints. ;) > > Yeah, I started off with acking that I am very late to this. Been a million > work distractions keeping me from this stuff, I'm sorry. :( > > But as you said, any test env. with > 4 CPUs probably started failing tests > since 6.8 so it's probably not all on me. ;) > > I think where I'll go from here is do a big run with -d and -l concurrency > set to something biggish in MKFS_OPTIONS, see what breaks, and see what sort > of _within changes might help. > And then I might also see about a patch to make ddev_is_solidstate explicitly > return false for loop devices and regular files, unless anyone sees issues > with that approach. As I said before, that might be surprising to anyone who saw that the loop device is not rotational and didn't get nonrotational tuning. --D > thanks, > -Eric >