* [PATCHSET V2] xfsprogs: enable new stable features for 6.18 @ 2025-12-09 16:16 Darrick J. Wong 2025-12-09 16:16 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong 2025-12-09 16:16 ` [PATCH 2/2] mkfs: add 2025 LTS config file Darrick J. Wong 0 siblings, 2 replies; 14+ messages in thread From: Darrick J. Wong @ 2025-12-09 16:16 UTC (permalink / raw) To: djwong, aalbersh; +Cc: linux-xfs Hi all, Enable by default some new features that seem stable now. v2: include performance implications If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=default-features --- Commits in this patchset: * mkfs: enable new features by default * mkfs: add 2025 LTS config file --- mkfs/Makefile | 3 ++- mkfs/lts_6.18.conf | 19 +++++++++++++++++++ mkfs/xfs_mkfs.c | 5 +++-- 3 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 mkfs/lts_6.18.conf ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] mkfs: enable new features by default 2025-12-09 16:16 [PATCHSET V2] xfsprogs: enable new stable features for 6.18 Darrick J. Wong @ 2025-12-09 16:16 ` Darrick J. Wong 2025-12-09 16:22 ` Christoph Hellwig 2025-12-09 22:25 ` Dave Chinner 2025-12-09 16:16 ` [PATCH 2/2] mkfs: add 2025 LTS config file Darrick J. Wong 1 sibling, 2 replies; 14+ messages in thread From: Darrick J. Wong @ 2025-12-09 16:16 UTC (permalink / raw) To: djwong, aalbersh; +Cc: linux-xfs From: Darrick J. Wong <djwong@kernel.org> Since the LTS is coming up, enable parent pointers and exchange-range by default for all users. Also fix up an out of date comment. I created a really stupid benchmarking script that does: #!/bin/bash # pptr overhead benchmark umount /opt /mnt rmmod xfs for i in 1 0; do umount /opt mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent= mount /dev/sdb /opt mkdir -p /opt/foo for ((i=0;i<5;i++)); do time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1 done done This is the result of creating an enormous number of empty files in a single directory: # ./dumb.sh naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 real 0m18.807s user 0m2.169s sys 0m54.013s naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 real 0m20.654s user 0m2.374s sys 1m4.441s As you can see, there's a 10% increase in runtime here. If I make the workload a bit more representative by changing the -f argument to include a directory tree workout: -f creat=1,mkdir=1,mknod=1,rmdir=1,unlink=1,link=1,rename=1 naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 real 0m12.742s user 0m28.074s sys 0m10.839s naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 real 0m12.782s user 0m28.892s sys 0m8.897s Almost no difference here. 
If I then actually write to the regular files by adding: -f write=1 naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 real 0m16.668s user 0m21.709s sys 0m15.425s naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 real 0m15.562s user 0m21.740s sys 0m12.927s So that's about a 2% difference. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> --- mkfs/xfs_mkfs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c index 8f5a6fa5676453..8db51217016eb0 100644 --- a/mkfs/xfs_mkfs.c +++ b/mkfs/xfs_mkfs.c @@ -1044,7 +1044,7 @@ struct sb_feat_args { bool inode_align; /* XFS_SB_VERSION_ALIGNBIT */ bool nci; /* XFS_SB_VERSION_BORGBIT */ bool lazy_sb_counters; /* XFS_SB_VERSION2_LAZYSBCOUNTBIT */ - bool parent_pointers; /* XFS_SB_VERSION2_PARENTBIT */ + bool parent_pointers; /* XFS_SB_FEAT_INCOMPAT_PARENT */ bool projid32bit; /* XFS_SB_VERSION2_PROJID32BIT */ bool crcs_enabled; /* XFS_SB_VERSION2_CRCBIT */ bool dirftype; /* XFS_SB_VERSION2_FTYPE */ @@ -5984,11 +5984,12 @@ main( .rmapbt = true, .reflink = true, .inobtcnt = true, - .parent_pointers = false, + .parent_pointers = true, .nodalign = false, .nortalign = false, .bigtime = true, .nrext64 = true, + .exchrange = true, /* * When we decide to enable a new feature by default, * please remember to update the mkfs conf files. ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-09 16:16 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong @ 2025-12-09 16:22 ` Christoph Hellwig 2025-12-09 22:25 ` Dave Chinner 1 sibling, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2025-12-09 16:22 UTC (permalink / raw) To: Darrick J. Wong; +Cc: aalbersh, linux-xfs On Tue, Dec 09, 2025 at 08:16:08AM -0800, Darrick J. Wong wrote: > Almost no difference here. If I then actually write to the regular > files by adding: > > -f write=1 .. > So that's about a 2% difference. Let's hope no one complains given that the parent pointers are useful: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-09 16:16 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong 2025-12-09 16:22 ` Christoph Hellwig @ 2025-12-09 22:25 ` Dave Chinner 2025-12-10 23:49 ` Darrick J. Wong 1 sibling, 1 reply; 14+ messages in thread From: Dave Chinner @ 2025-12-09 22:25 UTC (permalink / raw) To: Darrick J. Wong; +Cc: aalbersh, linux-xfs On Tue, Dec 09, 2025 at 08:16:08AM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Since the LTS is coming up, enable parent pointers and exchange-range by > default for all users. Also fix up an out of date comment. > > I created a really stupid benchmarking script that does: > > #!/bin/bash > > # pptr overhead benchmark > > umount /opt /mnt > rmmod xfs > for i in 1 0; do > umount /opt > mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent= > mount /dev/sdb /opt > mkdir -p /opt/foo > for ((i=0;i<5;i++)); do > time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1 > done > done Hmmm. fsstress is an interesting choice here... > This is the result of creating an enormous number of empty files in a > single directory: > > # ./dumb.sh > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 > real 0m18.807s > user 0m2.169s > sys 0m54.013s > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 > real 0m20.654s > user 0m2.374s > sys 1m4.441s Yeah, that's only creating 20,000 files/sec. That's a lot less than I'd expect a single thread to be able to do - why is the kernel burning all 4 CPUs on this workload? i.e. I'd expect a pure create workload to run at about 40,000 files/s with sleeping contention on the i_rwsem, but this is much slower than I'd expect and contention is on a spinning lock... Also, parent pointers add about 20% more system time overhead (54s sys time to 64.4s sys time). Where does this come from? Do you have kernel profiles? Is it PP overhead, a change in the contention point, or just worse contention on the same resource?
> As you can see, there's a 10% increase in runtime here. If I make the > workload a bit more representative by changing the -f argument to > include a directory tree workout: > > -f creat=1,mkdir=1,mknod=1,rmdir=1,unlink=1,link=1,rename=1 > > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 > real 0m12.742s > user 0m28.074s > sys 0m10.839s > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 > real 0m12.782s > user 0m28.892s > sys 0m8.897s Again, that's way slower than I'd expect a 4p metadata workload to run through 400k modification ops. i.e. it's running at about 35k ops/s, and I'd be expecting the baseline to be upwards of 100k ops/s. Ah, look at the amount of time spent in userspace - 28-29s vs 9-11s spent in the kernel filesystem code. Ok, performance is limited by the userspace code, not the kernel code. I would expect a decent fs benchmark to be at most 10% userspace CPU time, with >90% of the time being spent in the kernel doing filesystem operations. IOWs, there is way too much userspace overhead in this workload to draw useful conclusions about the impact of the kernel side changes. System time went up from 9s to 11s when parent pointers are turned on - a 20% increase in CPU overhead - but that additional overhead isn't reflected in the wall time results because the CPU overhead is dominated by the userspace program, not the kernel code that is being "measured". > Almost no difference here. Ah, no. Again, system time went up by ~20%, even though elapsed time was unchanged. That implies there is some amount of sleeping contention occurring between processes doing work, and the additional CPU overhead of the PP code simply resulted in less sleep time. Again, this is not noticeable because the workload is dominated by userspace CPU overhead, not the kernel/filesystem operation overhead... 
> If I then actually write to the regular > files by adding: > > -f write=1 > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1 > real 0m16.668s > user 0m21.709s > sys 0m15.425s > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 > real 0m15.562s > user 0m21.740s > sys 0m12.927s > > So that's about a 2% difference. Same here - system time went up by about 19%, even though wall time didn't change. Also, 15.5s to 16.6s increase in wall time is actually a 7% difference in runtime, not 2%. ---- Overall, I don't think the benchmarking documented here is sufficient to justify the conclusion that "parent pointers have little real world overhead so we can turn them on by default". I would at least like to see the "will-it-scale" impact on a 64p machine with a hundred GB of RAM and IO subsystem at least capable of a million IOPS and a filesystem optimised for max performance (e.g. highly parallel fsmark based workloads). This will push the filesystem and CPU usage to their actual limits and directly expose additional overhead and new contention points in the results. This is also much more representative of the sorts of high performance, high end deployments that we expect XFS to be deployed on, and where performance impact actually matters to users. i.e. we need to know what the impact of the change is on the high end as well as low end VM/desktop configs before any conclusion can be drawn w.r.t. changing the parent pointer default setting.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 14+ messages in thread
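The arithmetic behind the percentage correction above is easy to reproduce. A minimal awk sketch (the `pct_delta` helper name is invented here; the figures are copied from the fsstress `time` output quoted earlier in the thread):

```shell
#!/bin/bash
# Percent change between two wall-clock times (in seconds), parent=0
# first, parent=1 second, taken from the fsstress runs quoted above.
pct_delta() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

pct_delta 18.807 20.654   # creat-only real time: prints 9.8 (the ~10% figure)
pct_delta 15.562 16.668   # creat+write real time: prints 7.1, not 2
```

Running it confirms both numbers: the create-only run regresses about 10%, and the create+write run regresses about 7%, matching Dave's correction of the "2%" claim.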
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-09 22:25 ` Dave Chinner @ 2025-12-10 23:49 ` Darrick J. Wong 2025-12-15 23:59 ` Dave Chinner 0 siblings, 1 reply; 14+ messages in thread From: Darrick J. Wong @ 2025-12-10 23:49 UTC (permalink / raw) To: Dave Chinner; +Cc: aalbersh, linux-xfs On Wed, Dec 10, 2025 at 09:25:24AM +1100, Dave Chinner wrote: > On Tue, Dec 09, 2025 at 08:16:08AM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > Since the LTS is coming up, enable parent pointers and exchange-range by > > default for all users. Also fix up an out of date comment. > > > > I created a really stupid benchmarking script that does: > > > > #!/bin/bash > > > > # pptr overhead benchmark > > > > umount /opt /mnt > > rmmod xfs > > for i in 1 0; do > > umount /opt > > mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent= > > mount /dev/sdb /opt > > mkdir -p /opt/foo > > for ((i=0;i<5;i++)); do > > time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1 > > done > > done > > Hmmm. fsstress is an interesting choice here... <flush all the old benchmarks and conclusions> I have an old 40-core Xeon E5-2660V3 with a pair of 1.5T Intel nvme ssds and 128G of RAM running 6.18.0. For this sample, I tried to keep the memory usage well below the amount of DRAM so that I could measure the pure overhead of writing parent pointers out to disk and not anything else. I also omit ls'ing and chmod'ing the directory tree because neither of those operations touch parent pointers. I also left the logbsize at the defaults (32k) because that's what most users get. 
Here I'm using the following benchmark program, compiled from various suggestions from dchinner over the years: #!/bin/bash -x iter=8 feature="-n parent" filesz=0 subdirs=10000 files_per_iter=100000 writesz=16384 mkdirme() { set +x local i for ((i=0;i<agcount;i++)); do mkdir -p /nvme/$i dirs+=(-d /nvme/$i) done set -x } bulkme() { set +x local i for ((i=0;i<agcount;i++)); do xfs_io -c "bulkstat -a $i -q" /nvme & done wait set -x } rmdirme() { set +x local i for dir in "${dirs[@]}"; do rm -r -f "${dir}" & done wait set -x } benchme() { agcount="$(xfs_info /nvme/ | grep agcount= | sed -e 's/^.*agcount=//g' -e 's/,.*$//g')" dirs=() mkdirme #time ~djwong/cdev/work/fstests/build-x86_64/ltp/fsstress -n 400000 -p 40 -z -f creat=1,mkdir=1,rmdir=1,unlink=1 -d /nvme/ -s 1 time fs_mark -w "${writesz}" -D "${subdirs}" -S 0 -n "${files_per_iter}" -s "${filesz}" -L "${iter}" "${dirs[@]}" time bulkme time rmdirme } for p in 0 1; do umount /dev/nvme1n1 /nvme /mnt #mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 -n parent=$p || break mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 $feature=$p || break mount /dev/nvme1n1 /nvme/ -o logdev=/dev/nvme0n1 || break benchme umount /dev/nvme1n1 /nvme /mnt done I get this mkfs output: # mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 meta-data=/dev/nvme1n1 isize=512 agcount=40, agsize=9767586 blks = sectsz=4096 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 = exchange=0 metadir=0 data = bsize=4096 blocks=390703440, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 log =/dev/nvme0n1 bsize=4096 blocks=262144, version=2 = sectsz=4096 sunit=1 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 = rgcount=0 rgsize=0 extents = zoned=0 start=0 reserved=0 # grep nvme1n1 /proc/mounts /dev/nvme1n1 /nvme xfs rw,relatime,inode64,logbufs=8,logbsize=32k,logdev=/dev/nvme0n1,noquota 0 0 and this output from fsmark with
parent=0: # fs_mark -D 10000 -S 0 -n 100000 -s 0 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 14:22:07 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 10000 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 0 bytes, written with an IO size of 16384 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. FSUse% Count Size Files/sec App Overhead 2 4000000 0 566680.9 31398816 2 8000000 0 665535.6 30037368 2 12000000 0 537227.6 31726557 2 16000000 0 538133.9 32411165 2 20000000 0 619369.6 30790676 2 24000000 0 600018.2 31583349 2 28000000 0 607209.8 31193980 3 32000000 0 533240.7 32277102 real 0m57.573s user 3m53.578s sys 19m44.440s + bulkme + set +x real 0m1.122s user 0m0.955s sys 0m39.306s + rmdirme + set +x real 0m59.649s user 0m41.196s sys 13m9.566s I limited this to 8 iterations so I could post some preliminary results after a few minutes. 
Now let's try again with parent=1: + fs_mark -D 10000 -S 0 -n 100000 -s 0 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # fs_mark -D 10000 -S 0 -n 100000 -s 0 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 14:24:44 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 10000 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 0 bytes, written with an IO size of 16384 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. 
FSUse% Count Size Files/sec App Overhead 2 4000000 0 543929.1 31344175 2 8000000 0 523736.2 31180565 2 12000000 0 522184.1 31700380 2 16000000 0 513468.0 32112498 2 20000000 0 543993.1 31910496 2 24000000 0 562760.1 32061910 2 28000000 0 524039.8 31825520 3 32000000 0 526028.8 31889193 real 1m2.934s user 3m53.508s sys 25m14.810s + bulkme + set +x real 0m1.158s user 0m0.882s sys 0m39.847s + rmdirme + set +x real 1m12.505s user 0m47.489s sys 20m33.844s fs_mark itself shows a decrease in file creation/sec of about 9%, an increase in wall clock time of about 9%, and an increase in kernel time of about 28%. That's to be expected, since parent pointer updates cause directory entry creation and deletion to hold more ILOCKs and for longer. Parallel bulkstat (aka bulkme) shows an increase in wall time of 3% and system time of 1%, which is not surprising since that's just walking the inode btree and inode cores, no parent pointers involved. Similarly, deleting all the files created by fs_mark shows an increase in wall time of about 21% and an increase in system time of about 56%. I concede that parent pointers have a fair amount of overhead for the worst case of creating a large directory tree or deleting it. I reran this with logbsize=256k and while I saw a slight increase in performance across the board, the overhead of pptrs is about the same percentagewise. 
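The sys-time percentages quoted above follow directly from the `time` output; a small sketch that parses bash `time`-style values such as `25m14.810s` into seconds (the `to_secs` and `pct` helper names are invented here, and the inputs are copied from the two runs above):

```shell
#!/bin/bash
# Convert a bash `time` value like "25m14.810s" to seconds, then compute
# the percent change between a parent=0 and a parent=1 measurement.
to_secs() {
    awk -v t="$1" 'BEGIN {
        sub(/s$/, "", t)                      # strip trailing "s"
        n = split(t, p, "m")                  # split minutes from seconds
        s = (n == 2) ? p[1] * 60 + p[2] : p[1]
        print s
    }'
}

pct() {
    awk -v a="$(to_secs "$1")" -v b="$(to_secs "$2")" \
        'BEGIN { printf "+%.0f%%\n", (b - a) / a * 100 }'
}

pct 19m44.440s 25m14.810s   # create phase sys time: prints +28%
pct 13m9.566s  20m33.844s   # unlink (rmdirme) sys time: prints +56%
```

This reproduces the "about 28%" kernel-time increase for creation and the "about 56%" increase for deletion stated above.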
If I then re-run the benchmark with a file size of 1M and tell it to create fewer files, then I get the following for parent=0: # fs_mark -D 1000 -S 0 -n 200 -s 1048576 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:03:11 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 1048576 bytes, written with an IO size of 16384 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. 
FSUse% Count Size Files/sec App Overhead 2 8000 1048576 1493.4 198379 2 16000 1048576 1327.0 255655 3 24000 1048576 1355.8 255105 4 32000 1048576 1352.3 253094 4 40000 1048576 1836.9 262258 5 48000 1048576 1337.6 246991 5 56000 1048576 1328.4 240303 6 64000 1048576 1165.9 237211 real 0m50.384s user 0m7.640s sys 1m43.187s + bulkme + set +x real 0m0.023s user 0m0.061s sys 0m0.167s + rmdirme + set +x real 0m0.675s user 0m0.107s sys 0m15.644s and for parent=1: # fs_mark -D 1000 -S 0 -n 200 -s 1048576 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:04:41 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 1048576 bytes, written with an IO size of 16384 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. 
FSUse% Count Size Files/sec App Overhead 2 8000 1048576 1963.9 254007 2 16000 1048576 1716.4 227074 3 24000 1048576 1052.5 264987 4 32000 1048576 1793.6 242288 4 40000 1048576 1364.2 249738 5 48000 1048576 1081.2 250394 5 56000 1048576 1342.0 260667 6 64000 1048576 1356.9 242324 real 0m49.256s user 0m7.621s sys 1m44.847s + bulkme + set +x real 0m0.021s user 0m0.060s sys 0m0.176s + rmdirme + set +x real 0m0.537s user 0m0.108s sys 0m15.453s Here we see that the fs_mark creates/sec goes up by 4%, wall time decreases by 3%, and the kernel time increases by 2% or so. The rmdir wall time decreases by 2% and the kernel time by ~1%, which is quite small. So for a more common case of populating a directory tree full of big files with data in them, the overhead isn't all that noticeable. I then decided to simulate my maildir spool, which has 670,000 files consuming 12GB for an average file size of 17936 bytes. I reduced the file size to 16K, increased the number of files per iteration, and set the write buffer size to something not aligned to a block, and got this for parent=0: # fs_mark -w 778 -D 1000 -S 0 -n 6000 -s 16384 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:21:38 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory. 
# File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 16384 bytes, written with an IO size of 778 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. FSUse% Count Size Files/sec App Overhead 2 240000 16384 40085.3 2492281 2 480000 16384 37026.7 2780077 2 720000 16384 28445.5 2591461 3 960000 16384 28888.6 2595817 3 1200000 16384 25160.8 2903882 3 1440000 16384 29372.1 2600018 3 1680000 16384 26443.9 2732790 4 1920000 16384 26307.1 2758750 real 1m11.633s user 0m46.156s sys 3m24.543s + bulkme + set +x real 0m0.091s user 0m0.111s sys 0m2.461s + rmdirme + set +x real 0m9.364s user 0m2.245s sys 0m47.221s and this for parent=1 # fs_mark -w 778 -D 1000 -S 0 -n 6000 -s 16384 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:23:38 2025 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 16384 bytes, written with an IO size of 778 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. 
FSUse% Count Size Files/sec App Overhead 2 240000 16384 39340.1 2627066 2 480000 16384 27727.2 2925494 2 720000 16384 28305.4 2597191 2 960000 16384 24891.6 2834421 3 1200000 16384 27964.8 2810556 3 1440000 16384 27204.6 2776783 3 1680000 16384 25745.2 2779197 3 1920000 16384 24674.9 2752721 real 1m14.422s user 0m46.607s sys 3m38.777s + bulkme + set +x real 0m0.081s user 0m0.123s sys 0m2.408s + rmdirme + set +x real 0m9.306s user 0m2.570s sys 1m10.598s fs_mark shows a 7% decrease in creates/sec, a 4% increase in wall time, a 7% increase in kernel time. bulkstat is as usual not that different, and deletion shows an increase in kernel time of 50%. Conclusion: There are noticeable overheads to enabling parent pointers, but counterbalancing that, we can now repair an entire filesystem, directory tree and all. --D ^ permalink raw reply [flat|nested] 14+ messages in thread
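The ~7% creates/sec figure above can be checked by averaging the Files/sec columns of the two maildir-style runs. A throwaway sketch (the `avg` helper name is invented here; the numbers are copied from the two tables quoted above):

```shell
#!/bin/bash
# Average the Files/sec column from the two maildir-style fs_mark runs
# quoted above and report the relative change from parent=0 to parent=1.
avg() { awk '{ s += $1; n++ } END { printf "%.0f\n", s / n }'; }

p0=$(printf '%s\n' 40085.3 37026.7 28445.5 28888.6 25160.8 29372.1 26443.9 26307.1 | avg)
p1=$(printf '%s\n' 39340.1 27727.2 28305.4 24891.6 27964.8 27204.6 25745.2 24674.9 | avg)

echo "parent=0 avg: $p0 files/sec"   # 30216
echo "parent=1 avg: $p1 files/sec"   # 28232
awk -v a="$p0" -v b="$p1" 'BEGIN { printf "%.1f%% change\n", (b - a) / a * 100 }'
```

The last line prints a change of about -6.6%, consistent with the "7% decrease in creates/sec" stated in the conclusion.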
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-10 23:49 ` Darrick J. Wong @ 2025-12-15 23:59 ` Dave Chinner 2025-12-16 23:07 ` Darrick J. Wong 0 siblings, 1 reply; 14+ messages in thread From: Dave Chinner @ 2025-12-15 23:59 UTC (permalink / raw) To: Darrick J. Wong; +Cc: aalbersh, linux-xfs On Wed, Dec 10, 2025 at 03:49:28PM -0800, Darrick J. Wong wrote: > On Wed, Dec 10, 2025 at 09:25:24AM +1100, Dave Chinner wrote: > > On Tue, Dec 09, 2025 at 08:16:08AM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > Since the LTS is coming up, enable parent pointers and exchange-range by > > > default for all users. Also fix up an out of date comment. > > > > > > I created a really stupid benchmarking script that does: > > > > > > #!/bin/bash > > > > > > # pptr overhead benchmark > > > > > > umount /opt /mnt > > > rmmod xfs > > > for i in 1 0; do > > > umount /opt > > > mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent= > > > mount /dev/sdb /opt > > > mkdir -p /opt/foo > > > for ((i=0;i<5;i++)); do > > > time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1 > > > done > > > done > > > > Hmmm. fsstress is an interesting choice here... > > <flush all the old benchmarks and conclusions> > > I have an old 40-core Xeon E5-2660V3 with a pair of 1.5T Intel nvme ssds > and 128G of RAM running 6.18.0. For this sample, I tried to keep the > memory usage well below the amount of DRAM so that I could measure the > pure overhead of writing parent pointers out to disk and not anything > else. I also omit ls'ing and chmod'ing the directory tree because > neither of those operations touch parent pointers. I also left the > logbsize at the defaults (32k) because that's what most users get. ok. ..... 
> benchme() { > agcount="$(xfs_info /nvme/ | grep agcount= | sed -e 's/^.*agcount=//g' -e 's/,.*$//g')" > dirs=() > mkdirme > > #time ~djwong/cdev/work/fstests/build-x86_64/ltp/fsstress -n 400000 -p 40 -z -f creat=1,mkdir=1,rmdir=1,unlink=1 -d /nvme/ -s 1 > time fs_mark -w "${writesz}" -D "${subdirs}" -S 0 -n "${files_per_iter}" -s "${filesz}" -L "${iter}" "${dirs[@]}" > > time bulkme > time rmdirme Ok, so this is testing cache-hot bulkstat and rm, so it's not exercising the cold-read path and hence doesn't need to read and initialise parent pointers for unlinking. Can you drop caches between the bulkstat and the unlink phases so we exercise cold cache parent pointer instantiation overhead somewhere? > } > > for p in 0 1; do > umount /dev/nvme1n1 /nvme /mnt > #mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 -n parent=$p || break > mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 $feature=$p || break > mount /dev/nvme1n1 /nvme/ -o logdev=/dev/nvme0n1 || break > benchme > umount /dev/nvme1n1 /nvme /mnt > done > > I get this mkfs output: > # mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 > meta-data=/dev/nvme1n1 isize=512 agcount=40, agsize=9767586 blks > = sectsz=4096 attr=2, projid32bit=1 > = crc=1 finobt=1, sparse=1, rmapbt=1 > = reflink=1 bigtime=1 inobtcount=1 nrext64=1 > = exchange=0 metadir=0 > data = bsize=4096 blocks=390703440, imaxpct=5 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 > log =/dev/nvme0n1 bsize=4096 blocks=262144, version=2 > = sectsz=4096 sunit=1 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > = rgcount=0 rgsize=0 extents > = zoned=0 start=0 reserved=0 > # grep nvme1n1 /proc/mounts > /dev/nvme1n1 /nvme xfs rw,relatime,inode64,logbufs=8,logbsize=32k,logdev=/dev/nvme0n1,noquota 0 0 > > and this output from fsmark with parent=0: .... 
a table-based summary would have made this easier to read: parent real user sys create 0 0m57.573s 3m53.578s 19m44.440s create 1 1m2.934s 3m53.508s 25m14.810s bulk 0 0m1.122s 0m0.955s 0m39.306s bulk 1 0m1.158s 0m0.882s 0m39.847s unlink 0 0m59.649s 0m41.196s 13m9.566s unlink 1 1m12.505s 0m47.489s 20m33.844s > fs_mark itself shows a decrease in file creation/sec of about 9%, an > increase in wall clock time of about 9%, and an increase in kernel time > of about 28%. That's to be expected, since parent pointer updates cause > directory entry creation and deletion to hold more ILOCKs and for > longer. ILOCK isn't an issue with this test - the whole point of the segmented directory structure is that each thread operates in its own directory, so there is no ILOCK contention at all. i.e. the entire difference is the CPU overhead of adding the xattr fork and creating the parent pointer xattr. I suspect that the create side overhead is probably acceptable, because we also typically add security xattrs at create time and these will be slightly faster as the xattr fork is already prepared... > Parallel bulkstat (aka bulkme) shows an increase in wall time of 3% and > system time of 1%, which is not surprising since that's just walking the > inode btree and cores, no parent pointers involved. I was more interested in the cold cache behaviour - hot cache is generally uninteresting as the XFS inode cache scales pretty much perfectly in this case. Reading the inodes from disk, OTOH, adds a whole heap of instantiation and lock contention overhead and changes the picture significantly. I'm interested to know what the impact of having PPs is in that case.... > Similarly, deleting all the files created by fs_mark shows an increase > in wall time of about ~21% and an increase in system time of about 56%. > I concede that parent pointers has a fair amount of overhead for the > worst case of creating a large directory tree or deleting it. 
Ok, so an increase in unlink CPU overhead of 56% is pretty bad. On single threaded workloads, that's going to equate to a ~50% reduction in performance for operations that perform unlinks in CPU bound loops (e.g. rm -rf on hot caches). Note that the above test is not CPU bound - it's only running at about 50% CPU utilisation because of some other contention point in the fs (possibly log space or pinned/stale directory buffers requiring a log force to clear). However, results like this make me think that PP unlink hasn't been optimised for the common case: removing the last parent pointer (i.e. nlink 1 -> 0) when the inode is being placed on the unlinked list in syscall context. This is the common case in the absence of hard links, and it puts the PP xattr removal directly in application task context. In this case, it seems to me that we don't actually need to remove the parent pointer xattr. When the inode is inactivated by background inodegc after last close, the xattr fork is truncated and that will remove all xattrs including the stale remaining PP without needing to make a specific PP transaction. Doing this would remove the PP overhead completely from the final unlink syscall path. It would only add minimal extra overhead on the inodegc side as (in the common case) we have to remove security xattrs in inodegc. Hence I think we really need to try to mitigate this common case overhead before we make PP the default for everyone. > If I then re-run the benchmark with a file size of 1M and tell it to > create fewer files, then I get the following for parent=0: These are largely meaningless as the create benchmark is throttling hard on disk bandwidth (1.5-2GB/s) in the write() path, not limited by PP overhead. The variance in runtime comes from the data IO path behaviour, and the lack of sync() operations after the create means that writeback is likely still running when the unlink phase runs. 
Hence it's pretty difficult to conclude anything about parent
pointers themselves because of the other large variances in this
workload.

> I then decided to simulate my maildir spool, which has 670,000 files
> consuming 12GB for an average file size of 17936 bytes.  I reduced the
> file size to 16K, increase the number of files per iteration, and set
> the write buffer size to something not aligned to a block, and got this
> for parent=0:

Same again, but this time the writeback thread will be seeing
delalloc latencies w.r.t. AGF locks vs incoming directory and inode
chunk allocation operations. That can be seen by:

> # fs_mark -w 778 -D 1000 -S 0 -n 6000 -s 16384 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39
> # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:21:38 2025
> # Sync method: NO SYNC: Test does not issue sync() or fsync() calls.
> # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory.
> # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
> # Files info: size 16384 bytes, written with an IO size of 778 bytes per write
> # App overhead is time in microseconds spent in the test not doing file writing related system calls.
> FSUse%    Count     Size  Files/sec  App Overhead
>      2   240000    16384    40085.3       2492281
>      2   480000    16384    37026.7       2780077
>      2   720000    16384    28445.5       2591461
>      3   960000    16384    28888.6       2595817
>      3  1200000    16384    25160.8       2903882
>      3  1440000    16384    29372.1       2600018
>      3  1680000    16384    26443.9       2732790
>      4  1920000    16384    26307.1       2758750
>
> real	1m11.633s
> user	0m46.156s
> sys	3m24.543s

.. creates only managing ~270% CPU utilisation for a 40-way
operation.

IOWs, parent pointer overhead is noise compared to the losses caused
by data writeback locking/throttling interactions, so nothing can
really be concluded from here.

> Conclusion: There are noticeable overheads to enabling parent pointers,
> but counterbalancing that, we can now repair an entire filesystem,
> directory tree and all.

True, but I think that the unlink overhead is significant enough
that we need to address that before enabling PP by default for
everyone.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-15 23:59 ` Dave Chinner @ 2025-12-16 23:07 ` Darrick J. Wong 0 siblings, 0 replies; 14+ messages in thread From: Darrick J. Wong @ 2025-12-16 23:07 UTC (permalink / raw) To: Dave Chinner; +Cc: aalbersh, linux-xfs On Tue, Dec 16, 2025 at 10:59:42AM +1100, Dave Chinner wrote: > On Wed, Dec 10, 2025 at 03:49:28PM -0800, Darrick J. Wong wrote: > > On Wed, Dec 10, 2025 at 09:25:24AM +1100, Dave Chinner wrote: > > > On Tue, Dec 09, 2025 at 08:16:08AM -0800, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > Since the LTS is coming up, enable parent pointers and exchange-range by > > > > default for all users. Also fix up an out of date comment. > > > > > > > > I created a really stupid benchmarking script that does: > > > > > > > > #!/bin/bash > > > > > > > > # pptr overhead benchmark > > > > > > > > umount /opt /mnt > > > > rmmod xfs > > > > for i in 1 0; do > > > > umount /opt > > > > mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent= > > > > mount /dev/sdb /opt > > > > mkdir -p /opt/foo > > > > for ((i=0;i<5;i++)); do > > > > time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1 > > > > done > > > > done > > > > > > Hmmm. fsstress is an interesting choice here... > > > > <flush all the old benchmarks and conclusions> > > > > I have an old 40-core Xeon E5-2660V3 with a pair of 1.5T Intel nvme ssds > > and 128G of RAM running 6.18.0. For this sample, I tried to keep the > > memory usage well below the amount of DRAM so that I could measure the > > pure overhead of writing parent pointers out to disk and not anything > > else. I also omit ls'ing and chmod'ing the directory tree because > > neither of those operations touch parent pointers. I also left the > > logbsize at the defaults (32k) because that's what most users get. > > ok. > > ..... 
> > > benchme() { > > agcount="$(xfs_info /nvme/ | grep agcount= | sed -e 's/^.*agcount=//g' -e 's/,.*$//g')" > > dirs=() > > mkdirme > > > > #time ~djwong/cdev/work/fstests/build-x86_64/ltp/fsstress -n 400000 -p 40 -z -f creat=1,mkdir=1,rmdir=1,unlink=1 -d /nvme/ -s 1 > > time fs_mark -w "${writesz}" -D "${subdirs}" -S 0 -n "${files_per_iter}" -s "${filesz}" -L "${iter}" "${dirs[@]}" > > > > time bulkme > > time rmdirme > > Ok, so this is testing cache-hot bulkstat and rm, so it's not > exercising the cold-read path and hence is not needing to read and > initialising parent pointers for unlinking. Can you drop caches > between the bulkstat and the unlink phases so we exercise cold cache > parent pointer instantiation overhead somewhere? > > > } > > > > for p in 0 1; do > > umount /dev/nvme1n1 /nvme /mnt > > #mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 -n parent=$p || break > > mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 $feature=$p || break > > mount /dev/nvme1n1 /nvme/ -o logdev=/dev/nvme0n1 || break > > benchme > > umount /dev/nvme1n1 /nvme /mnt > > done > > > > I get this mkfs output: > > # mkfs.xfs -f -l logdev=/dev/nvme0n1,size=1g /dev/nvme1n1 > > meta-data=/dev/nvme1n1 isize=512 agcount=40, agsize=9767586 blks > > = sectsz=4096 attr=2, projid32bit=1 > > = crc=1 finobt=1, sparse=1, rmapbt=1 > > = reflink=1 bigtime=1 inobtcount=1 nrext64=1 > > = exchange=0 metadir=0 > > data = bsize=4096 blocks=390703440, imaxpct=5 > > = sunit=0 swidth=0 blks > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0 > > log =/dev/nvme0n1 bsize=4096 blocks=262144, version=2 > > = sectsz=4096 sunit=1 blks, lazy-count=1 > > realtime =none extsz=4096 blocks=0, rtextents=0 > > = rgcount=0 rgsize=0 extents > > = zoned=0 start=0 reserved=0 > > # grep nvme1n1 /proc/mounts > > /dev/nvme1n1 /nvme xfs rw,relatime,inode64,logbufs=8,logbsize=32k,logdev=/dev/nvme0n1,noquota 0 0 > > > > and this output from fsmark with parent=0: > > .... 
> > a table-based summary would have made this easier to read > > parent real user sys > create 0 0m57.573s 3m53.578s 19m44.440s > create 1 1m2.934s 3m53.508s 25m14.810s > > bulk 0 0m1.122s 0m0.955s 0m39.306s > bulk 1 0m1.158s 0m0.882s 0m39.847s > > unlink 0 0m59.649s 0m41.196s 13m9.566s > unlink 1 1m12.505s 0m47.489s 20m33.844s > > > fs_mark itself shows a decrease in file creation/sec of about 9%, an > > increase in wall clock time of about 9%, and an increase in kernel time > > of about 28%. That's to be expected, since parent pointer updates cause > > directory entry creation and deletion to hold more ILOCKs and for > > longer. > > ILOCK isn't an issue with this test - the whole point of the > segmented directory structure is that each thread operates in it's > own directory, so there is no ILOCK contention at all. i.e. the > entire difference is the CPU overhead of the adding the xattr fork > and creating the parent pointer xattr. > > I suspect that the create side overhead is probably acceptible, > because we also typically add security xattrs at create time and > these will be slightly faster as the xattr fork is already > prepared... > > > Parallel bulkstat (aka bulkme) shows an increase in wall time of 3% and > > system time of 1%, which is not surprising since that's just walking the > > inode btree and cores, no parent pointers involved. > > I was more interested in the cold cache behaviour - hot cache is > generally uninteresting as the XFS inode cache scales pretty much > perfectly in this case. Reading the inodes from disk, OTOH, adds a > whole heap of instantiation and lock contention overhead and changes > the picture significantly. I'm interested to know what the impact of > having PPs is in that case.... > > > Similarly, deleting all the files created by fs_mark shows an increase > > in wall time of about ~21% and an increase in system time of about 56%. 
> > I concede that parent pointers has a fair amount of overhead for the > > worst case of creating a large directory tree or deleting it. > > Ok, so an increase in unlink CPU overhead of 56% is pretty bad. On > single threaded workloads, that's going to equate to be a ~50% > reduction in performance for operations that perform unlinks in CPU > bound loops (e.g. rm -rf on hot caches). Note that the above test is > not CPU bound - it's only running at about 50% CPU utilisation > because of some other contention point in the fs (possibly log space > or pinned/stale directory buffers requiring a log force to clear). > > However, results like this make me think that PP unlink hasn't been > optimised for the common case: removing the last parent pointer > (i.e. nlink 1 -> 0) when the inode is being placed on the unlinked > list in syscall context. This is the common case in the absence of > hard links, and it puts the PP xattr removal directly in application > task context. > > In this case, it seems to me that we don't actually need > to remove the parent pointer xattr. When the inode is inactivated by > bakground inodegc after last close, the xattr fork is truncated and > that will remove all xattrs including the stale remaining PP without > needing to make a specific PP transaction. > > Doing this would remove the PP overhead completely from the final > unlink syscall path. It would only add minimal extra overhead on > the inodegc side as (in the common case) we have to remove security > xattrs in inodegc. At some point hch suggested that the parent pointer code could shortcut the entire xattr intent machinery if the child file has shortform xattrs. For this fsmark benchmark where we're creating a lot of empty files, doing so actually /does/ cut the creation overhead from ~30% to ~3%; and the deletion overhead to nearly zero. 
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
index 589f810eedc0d8..c59e5ef47ed95d 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.h
+++ b/fs/xfs/libxfs/xfs_attr_leaf.h
@@ -49,6 +49,7 @@ void xfs_attr_shortform_create(struct xfs_da_args *args);
 void xfs_attr_shortform_add(struct xfs_da_args *args, int forkoff);
 int xfs_attr_shortform_getvalue(struct xfs_da_args *args);
 int xfs_attr_shortform_to_leaf(struct xfs_da_args *args);
+int xfs_attr_try_sf_addname(struct xfs_inode *dp, struct xfs_da_args *args);
 int xfs_attr_sf_removename(struct xfs_da_args *args);
 struct xfs_attr_sf_entry *xfs_attr_sf_findname(struct xfs_da_args *args);
 int xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 8c04acd30d489c..89cc913a2b4345 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -349,7 +349,7 @@ xfs_attr_set_resv(
 * xfs_attr_shortform_addname() will convert to leaf format and return -ENOSPC.
 * to use.
 */
-STATIC int
+int
 xfs_attr_try_sf_addname(
	struct xfs_inode	*dp,
	struct xfs_da_args	*args)
diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
index 69366c44a70159..048f822951103c 100644
--- a/fs/xfs/libxfs/xfs_parent.c
+++ b/fs/xfs/libxfs/xfs_parent.c
@@ -29,6 +29,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_attr_item.h"
 #include "xfs_health.h"
+#include "xfs_attr_leaf.h"
 
 struct kmem_cache		*xfs_parent_args_cache;
 
@@ -202,6 +203,16 @@ xfs_parent_addname(
	xfs_inode_to_parent_rec(&ppargs->rec, dp);
	xfs_parent_da_args_init(&ppargs->args, tp, &ppargs->rec, child,
			child->i_ino, parent_name);
+
+	if (xfs_inode_has_attr_fork(child) &&
+	    xfs_attr_is_shortform(child)) {
+		ppargs->args.op_flags |= XFS_DA_OP_ADDNAME;
+
+		error = xfs_attr_try_sf_addname(child, &ppargs->args);
+		if (error != -ENOSPC)
+			return error;
+	}
+
	xfs_attr_defer_add(&ppargs->args, XFS_ATTR_DEFER_SET);
	return 0;
 }
@@ -224,6 +235,10 @@ xfs_parent_removename(
	xfs_inode_to_parent_rec(&ppargs->rec, dp);
	xfs_parent_da_args_init(&ppargs->args, tp, &ppargs->rec, child,
			child->i_ino, parent_name);
+
+	if (xfs_attr_is_shortform(child))
+		return xfs_attr_sf_removename(&ppargs->args);
+
	xfs_attr_defer_add(&ppargs->args, XFS_ATTR_DEFER_REMOVE);
	return 0;
 }
@@ -250,6 +265,27 @@
			child->i_ino, old_name);
	xfs_inode_to_parent_rec(&ppargs->new_rec, new_dp);
+
+	if (xfs_attr_is_shortform(child)) {
+		ppargs->args.op_flags |= XFS_DA_OP_ADDNAME | XFS_DA_OP_REPLACE;
+
+		error = xfs_attr_sf_removename(&ppargs->args);
+		if (error)
+			return error;
+
+		xfs_parent_da_args_init(&ppargs->args, tp, &ppargs->new_rec,
+				child, child->i_ino, new_name);
+		ppargs->args.op_flags |= XFS_DA_OP_ADDNAME;
+
+		error = xfs_attr_try_sf_addname(child, &ppargs->args);
+		if (error == -ENOSPC) {
+			xfs_attr_defer_add(&ppargs->args, XFS_ATTR_DEFER_SET);
+			return 0;
+		}
+
+		return error;
+	}
+
	ppargs->args.new_name = new_name->name;
	ppargs->args.new_namelen = new_name->len;
	ppargs->args.new_value =
		&ppargs->new_rec;

> Hence I think we really need to try to mitigate this common case
> overhead before we make PP the default for everyone. The perf
> decrease
>
> > If I then re-run the benchmark with a file size of 1M and tell it to
> > create fewer files, then I get the following for parent=0:
>
> These are largely meaningless as the create benchmark is throttling
> hard on disk bandwidth (1.5-2GB/s) in the write() path, not limited
> by PP overhead.
>
> The variance in runtime comes from the data IO path behaviour, and
> the lack of sync() operations after the create means that writeback
> is likely still running when the unlink phase runs. Hence it's
> pretty difficult to conclude anything about parent pointers
> themselves because of the other large variants in this workload.

They're not meaningless numbers, Dave. Writing data into user files
is always going to take up a large portion of the time spent creating
a real directory tree. Anyone unpacking a tarball onto a filesystem
can run into disk throttling on write bandwidth, which just reduces
the relative overhead of the pptr updates further. The only times it
becomes painful are in this microbenchmarking case where someone is
trying to create millions of empty files; and when deleting a
directory tree.

Anyway, we now have a patch, and I'll rerun the benchmark if this
survives overnight testing.

--D

> > I then decided to simulate my maildir spool, which has 670,000 files
> > consuming 12GB for an average file size of 17936 bytes.  I reduced the
> > file size to 16K, increase the number of files per iteration, and set
> > the write buffer size to something not aligned to a block, and got this
> > for parent=0:
>
> Same again, but this time the writeback thread will be seeing
> delalloc latencies w.r.t. AGF locks vs incoming directory and inode
> chunk allocation operations.
That can be seen by: > > > > > # fs_mark -w 778 -D 1000 -S 0 -n 6000 -s 16384 -L 8 -d /nvme/0 -d /nvme/1 -d /nvme/2 -d /nvme/3 -d /nvme/4 -d /nvme/5 -d /nvme/6 -d /nvme/7 -d /nvme/8 -d /nvme/9 -d /nvme/10 -d /nvme/11 -d /nvme/12 -d /nvme/13 -d /nvme/14 -d /nvme/15 -d /nvme/16 -d /nvme/17 -d /nvme/18 -d /nvme/19 -d /nvme/20 -d /nvme/21 -d /nvme/22 -d /nvme/23 -d /nvme/24 -d /nvme/25 -d /nvme/26 -d /nvme/27 -d /nvme/28 -d /nvme/29 -d /nvme/30 -d /nvme/31 -d /nvme/32 -d /nvme/33 -d /nvme/34 -d /nvme/35 -d /nvme/36 -d /nvme/37 -d /nvme/38 -d /nvme/39 > > # Version 3.3, 40 thread(s) starting at Wed Dec 10 15:21:38 2025 > > # Sync method: NO SYNC: Test does not issue sync() or fsync() calls. > > # Directories: Time based hash between directories across 1000 subdirectories with 180 seconds per subdirectory. > > # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) > > # Files info: size 16384 bytes, written with an IO size of 778 bytes per write > > # App overhead is time in microseconds spent in the test not doing file writing related system calls. > > > > FSUse% Count Size Files/sec App Overhead > > 2 240000 16384 40085.3 2492281 > > 2 480000 16384 37026.7 2780077 > > 2 720000 16384 28445.5 2591461 > > 3 960000 16384 28888.6 2595817 > > 3 1200000 16384 25160.8 2903882 > > 3 1440000 16384 29372.1 2600018 > > 3 1680000 16384 26443.9 2732790 > > 4 1920000 16384 26307.1 2758750 > > > > real 1m11.633s > > user 0m46.156s > > sys 3m24.543s > > .. creates only managing ~270% CPU utilisation for a 40-way > operation. > > IOWs, parent pointer overhead is noise compared to the losses caused > by data writeback locking/throttling interactions, so nothing can > really be concluded from there here. > > > Conclusion: There are noticeable overheads to enabling parent pointers, > > but counterbalancing that, we can now repair an entire filesystem, > > directory tree and all. 
> > True, but I think that the unlink overhead is significant enough > that we need to address that before enabling PP by default for > everyone. > > -Dave. > -- > Dave Chinner > david@fromorbit.com > ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/2] mkfs: add 2025 LTS config file
  2025-12-09 16:16 [PATCHSET V2] xfsprogs: enable new stable features for 6.18 Darrick J. Wong
  2025-12-09 16:16 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong
@ 2025-12-09 16:16 ` Darrick J. Wong
  2025-12-09 16:23   ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2025-12-09 16:16 UTC (permalink / raw)
  To: djwong, aalbersh; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add a new configuration file with the defaults as of 6.18 LTS.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 mkfs/Makefile      |  3 ++-
 mkfs/lts_6.18.conf | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)
 create mode 100644 mkfs/lts_6.18.conf

diff --git a/mkfs/Makefile b/mkfs/Makefile
index 04905bd5101ccb..fb1473324cde7c 100644
--- a/mkfs/Makefile
+++ b/mkfs/Makefile
@@ -18,7 +18,8 @@ CFGFILES = \
	lts_5.15.conf \
	lts_6.1.conf \
	lts_6.6.conf \
-	lts_6.12.conf
+	lts_6.12.conf \
+	lts_6.18.conf
 
 LLDLIBS += $(LIBXFS) $(LIBXCMD) $(LIBFROG) $(LIBRT) $(LIBBLKID) \
	$(LIBUUID) $(LIBINIH) $(LIBURCU) $(LIBPTHREAD)
diff --git a/mkfs/lts_6.18.conf b/mkfs/lts_6.18.conf
new file mode 100644
index 00000000000000..2dbec51e586fa1
--- /dev/null
+++ b/mkfs/lts_6.18.conf
@@ -0,0 +1,19 @@
+# V5 features that were the mkfs defaults when the upstream Linux 6.18 LTS
+# kernel was released at the end of 2025.
+
+[metadata]
+bigtime=1
+crc=1
+finobt=1
+inobtcount=1
+metadir=0
+reflink=1
+rmapbt=1
+
+[inode]
+sparse=1
+nrext64=1
+exchange=1
+
+[naming]
+parent=1

^ permalink raw reply related	[flat|nested] 14+ messages in thread
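Once a later xfsprogs changes the defaults again, a config file like the one in this patch lets you recreate the 6.18-era layout via mkfs.xfs's `-c options=` configuration-file support; a quick sketch (the /usr/share install path and the device name are assumptions, and the mkfs invocation itself is shown as a comment since it needs a real block device):

```shell
# Re-make a filesystem with the 6.18 LTS defaults, assuming the config
# file ships in /usr/share/xfsprogs/mkfs/ (path may vary by distro):
#
#   mkfs.xfs -c options=/usr/share/xfsprogs/mkfs/lts_6.18.conf -f /dev/sdb
#
# The file is plain INI, as the patch above shows; write a copy locally
# to inspect which features it pins:
cat > /tmp/lts_6.18.conf <<'EOF'
[metadata]
bigtime=1
crc=1
finobt=1
inobtcount=1
metadir=0
reflink=1
rmapbt=1

[inode]
sparse=1
nrext64=1
exchange=1

[naming]
parent=1
EOF
grep -A1 '^\[naming\]' /tmp/lts_6.18.conf
```

The last command confirms the file pins `parent=1`, i.e. parent pointers stay on even if some future mkfs turns them off by default.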
* Re: [PATCH 2/2] mkfs: add 2025 LTS config file 2025-12-09 16:16 ` [PATCH 2/2] mkfs: add 2025 LTS config file Darrick J. Wong @ 2025-12-09 16:23 ` Christoph Hellwig 0 siblings, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2025-12-09 16:23 UTC (permalink / raw) To: Darrick J. Wong; +Cc: aalbersh, linux-xfs Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCHSET 2/2] xfsprogs: enable new stable features for 6.18 @ 2025-12-02 1:27 Darrick J. Wong 2025-12-02 1:28 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong 0 siblings, 1 reply; 14+ messages in thread From: Darrick J. Wong @ 2025-12-02 1:27 UTC (permalink / raw) To: aalbersh, djwong; +Cc: linux-xfs Hi all, Enable by default some new features that seem stable now. If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=default-features --- Commits in this patchset: * mkfs: enable new features by default * mkfs: add 2025 LTS config file --- mkfs/Makefile | 3 ++- mkfs/lts_6.18.conf | 19 +++++++++++++++++++ mkfs/xfs_mkfs.c | 5 +++-- 3 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 mkfs/lts_6.18.conf ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] mkfs: enable new features by default
  2025-12-02  1:27 [PATCHSET 2/2] xfsprogs: enable new stable features for 6.18 Darrick J. Wong
@ 2025-12-02  1:28 ` Darrick J. Wong
  2025-12-02  7:38   ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2025-12-02  1:28 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Since the LTS is coming up, enable parent pointers and exchange-range by
default for all users. Also fix up an out of date comment.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 mkfs/xfs_mkfs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 8f5a6fa5676453..8db51217016eb0 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -1044,7 +1044,7 @@ struct sb_feat_args {
	bool	inode_align;		/* XFS_SB_VERSION_ALIGNBIT */
	bool	nci;			/* XFS_SB_VERSION_BORGBIT */
	bool	lazy_sb_counters;	/* XFS_SB_VERSION2_LAZYSBCOUNTBIT */
-	bool	parent_pointers;	/* XFS_SB_VERSION2_PARENTBIT */
+	bool	parent_pointers;	/* XFS_SB_FEAT_INCOMPAT_PARENT */
	bool	projid32bit;		/* XFS_SB_VERSION2_PROJID32BIT */
	bool	crcs_enabled;		/* XFS_SB_VERSION2_CRCBIT */
	bool	dirftype;		/* XFS_SB_VERSION2_FTYPE */
@@ -5984,11 +5984,12 @@ main(
		.rmapbt = true,
		.reflink = true,
		.inobtcnt = true,
-		.parent_pointers = false,
+		.parent_pointers = true,
		.nodalign = false,
		.nortalign = false,
		.bigtime = true,
		.nrext64 = true,
+		.exchrange = true,
		/*
		 * When we decide to enable a new feature by default,
		 * please remember to update the mkfs conf files.

^ permalink raw reply related	[flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-02 1:28 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong @ 2025-12-02 7:38 ` Christoph Hellwig 2025-12-03 0:53 ` Darrick J. Wong 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2025-12-02 7:38 UTC (permalink / raw) To: Darrick J. Wong; +Cc: aalbersh, linux-xfs On Mon, Dec 01, 2025 at 05:28:16PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Since the LTS is coming up, enable parent pointers and exchange-range by > default for all users. Also fix up an out of date comment. Do you have any numbers that show the overhead or non-overhead of enabling rmap? It will increase the amount of metadata written quite a bit. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default
  2025-12-02  7:38 ` Christoph Hellwig
@ 2025-12-03  0:53   ` Darrick J. Wong
  2025-12-03  6:31     ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2025-12-03  0:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, linux-xfs

On Mon, Dec 01, 2025 at 11:38:46PM -0800, Christoph Hellwig wrote:
> On Mon, Dec 01, 2025 at 05:28:16PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Since the LTS is coming up, enable parent pointers and exchange-range by
> > default for all users. Also fix up an out of date comment.
>
> Do you have any numbers that show the overhead or non-overhead of
> enabling rmap? It will increase the amount of metadata written quite
> a bit.

I'm assuming you're interested in the overhead of *parent pointers* and
not rmap since we turned on rmap by default back in 2023?

I created a really stupid benchmarking script that does:

#!/bin/bash

umount /opt
mkfs.xfs -f /dev/sdb -n parent=$1
mount /dev/sdb /opt
mkdir -p /opt/foo
for ((i=0;i<10;i++)); do
	time fsstress -n 400000 -p 4 -z -f creat=1,mkdir=1,mknod=1,rmdir=1,unlink=1,link=1,rename=1 -d /opt/foo -s 1
done

# ./dumb.sh 0
meta-data=/dev/sdb               isize=512    agcount=4, agsize=1298176 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=1   metadir=0
data     =                       bsize=4096   blocks=5192704, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
         =                       zoned=0      start=0 reserved=0
Discarding blocks...Done.
real 0m18.807s   user 0m2.169s   sys 0m54.013s
real 0m13.845s   user 0m2.005s   sys 0m34.048s
real 0m14.019s   user 0m1.931s   sys 0m36.086s
real 0m14.435s   user 0m2.105s   sys 0m35.845s
real 0m14.823s   user 0m1.920s   sys 0m35.528s
real 0m14.181s   user 0m2.013s   sys 0m35.775s
real 0m14.281s   user 0m1.865s   sys 0m36.240s
real 0m13.638s   user 0m1.933s   sys 0m35.642s
real 0m13.553s   user 0m1.904s   sys 0m35.084s
real 0m13.963s   user 0m1.979s   sys 0m35.724s

# ./dumb.sh 1
meta-data=/dev/sdb               isize=512    agcount=4, agsize=1298176 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=1   metadir=0
data     =                       bsize=4096   blocks=5192704, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
         =                       zoned=0      start=0 reserved=0
Discarding blocks...Done.

real 0m20.654s   user 0m2.374s   sys 1m4.441s
real 0m14.255s   user 0m1.990s   sys 0m36.749s
real 0m14.553s   user 0m1.931s   sys 0m36.606s
real 0m13.855s   user 0m1.767s   sys 0m36.467s
real 0m14.606s   user 0m2.073s   sys 0m37.255s
real 0m13.706s   user 0m1.942s   sys 0m36.294s
real 0m14.177s   user 0m2.017s   sys 0m36.528s
real 0m15.310s   user 0m2.164s   sys 0m37.720s
real 0m14.099s   user 0m2.013s   sys 0m37.062s
real 0m14.067s   user 0m2.068s   sys 0m36.552s

As you can see, there's a noticeable increase in the runtime of the
first fsstress invocation, but for the subsequent runs there's not
much of a difference. I think the parent pointer log items usually
complete in a single log checkpoint and are usually omitted from the
log. In the common case of a single parent and an inline xattr area,
the overhead is basically zero because we're just writing to the attr
fork's if_data and not messing with xattr blocks.
If I remove link=1 from the -f argument to fsstress so that parent
pointers are always running out of the immediate area then the first
parent=0 runtime is:

real 0m18.920s   user 0m2.559s   sys 1m0.991s

and the first parent=1 is:

real 0m20.458s   user 0m2.533s   sys 1m6.301s

I see more or less the same timings for the nine subsequent runs for
each parent= setting. I think it's safe to say the overhead ranges
between negligible and 10% on a cold new filesystem.

--D

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-03 0:53 ` Darrick J. Wong @ 2025-12-03 6:31 ` Christoph Hellwig 2025-12-04 18:48 ` Darrick J. Wong 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2025-12-03 6:31 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, linux-xfs On Tue, Dec 02, 2025 at 04:53:45PM -0800, Darrick J. Wong wrote: > On Mon, Dec 01, 2025 at 11:38:46PM -0800, Christoph Hellwig wrote: > > On Mon, Dec 01, 2025 at 05:28:16PM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > Since the LTS is coming up, enable parent pointers and exchange-range by > > > default for all users. Also fix up an out of date comment. > > > > Do you have any numbers that show the overhead or non-overhead of > > enabling rmap? It will increase the amount of metadata written quite > > a bit. > > I'm assuming you're interested in the overhead of *parent pointers* and > not rmap since we turned on rmap by default back in 2023? Yes, sorry. > I see more or less the same timings for the nine subsequent runs for > each parent= setting. I think it's safe to say the overhead ranges > between negligible and 10% on a cold new filesystem. Should we document this cleary? Because this means at least some workloads are going to see a performance decrease. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] mkfs: enable new features by default 2025-12-03 6:31 ` Christoph Hellwig @ 2025-12-04 18:48 ` Darrick J. Wong 0 siblings, 0 replies; 14+ messages in thread From: Darrick J. Wong @ 2025-12-04 18:48 UTC (permalink / raw) To: Christoph Hellwig; +Cc: aalbersh, linux-xfs On Tue, Dec 02, 2025 at 10:31:22PM -0800, Christoph Hellwig wrote: > On Tue, Dec 02, 2025 at 04:53:45PM -0800, Darrick J. Wong wrote: > > On Mon, Dec 01, 2025 at 11:38:46PM -0800, Christoph Hellwig wrote: > > > On Mon, Dec 01, 2025 at 05:28:16PM -0800, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > Since the LTS is coming up, enable parent pointers and exchange-range by > > > > default for all users. Also fix up an out of date comment. > > > > > > Do you have any numbers that show the overhead or non-overhead of > > > enabling rmap? It will increase the amount of metadata written quite > > > a bit. > > > > I'm assuming you're interested in the overhead of *parent pointers* and > > not rmap since we turned on rmap by default back in 2023? > > Yes, sorry. > > > I see more or less the same timings for the nine subsequent runs for > > each parent= setting. I think it's safe to say the overhead ranges > > between negligible and 10% on a cold new filesystem. > > Should we document this cleary? Because this means at least some > workloads are going to see a performance decrease. Yep. But first -- all those results are inaccurate because I forgot that fsstress quietly ignores everything after the first op=freq component of the optarg, so all that benchmark was doing was creating millions of files in a single directory and never deleting anything. That's why the subsequent runs were much faster -- most of those files were already created. So I'll send a patch to fstests to fix that behavior. 
With that, the benchmark that I alleged I was running produces these
numbers when creating a directory tree of only empty files:

naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=1

real 0m12.742s   user 0m28.074s   sys 0m10.839s
real 0m13.469s   user 0m25.827s   sys 0m11.816s
real 0m11.352s   user 0m22.602s   sys 0m11.275s

naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0

real 0m12.782s   user 0m28.892s   sys 0m8.897s
real 0m13.591s   user 0m25.371s   sys 0m9.601s
real 0m10.012s   user 0m20.849s   sys 0m9.018s

Almost no difference here! If I add in write=1 then there's a 5%
decrease going to parent=1:

naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=1

real 0m15.020s   user 0m22.358s   sys 0m14.827s
real 0m17.196s   user 0m22.888s   sys 0m15.586s
real 0m16.668s   user 0m21.709s   sys 0m15.425s

naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0

real 0m14.808s   user 0m22.266s   sys 0m12.843s
real 0m16.323s   user 0m22.409s   sys 0m13.695s
real 0m15.562s   user 0m21.740s   sys 0m12.927s

--D

^ permalink raw reply	[flat|nested] 14+ messages in thread
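The ~5% figure from the write=1 runs above checks out against the per-run wall clock times; a quick sketch (values transcribed from the timings above; with only three samples per configuration the run-to-run variance is obviously high):

```python
from statistics import mean

# wall-clock seconds for the three write=1 runs of each configuration
parent1 = [15.020, 17.196, 16.668]
parent0 = [14.808, 16.323, 15.562]

# relative slowdown of parent=1 over parent=0, by mean wall clock
slowdown = mean(parent1) / mean(parent0) - 1
print(f"{slowdown:.1%}")  # prints 4.7%
```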
end of thread, other threads: [~2025-12-16 23:07 UTC | newest]

Thread overview: 14+ messages
2025-12-09 16:16 [PATCHSET V2] xfsprogs: enable new stable features for 6.18 Darrick J. Wong
2025-12-09 16:16 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong
2025-12-09 16:22   ` Christoph Hellwig
2025-12-09 22:25   ` Dave Chinner
2025-12-10 23:49     ` Darrick J. Wong
2025-12-15 23:59       ` Dave Chinner
2025-12-16 23:07         ` Darrick J. Wong
2025-12-09 16:16 ` [PATCH 2/2] mkfs: add 2025 LTS config file Darrick J. Wong
2025-12-09 16:23   ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2025-12-02  1:27 [PATCHSET 2/2] xfsprogs: enable new stable features for 6.18 Darrick J. Wong
2025-12-02  1:28 ` [PATCH 1/2] mkfs: enable new features by default Darrick J. Wong
2025-12-02  7:38   ` Christoph Hellwig
2025-12-03  0:53     ` Darrick J. Wong
2025-12-03  6:31       ` Christoph Hellwig
2025-12-04 18:48         ` Darrick J. Wong