[PATCHBOMB] xfs/fstests: largeish pile of bug fixes

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCHBOMB] xfs/fstests: largeish pile of bug fixes
@ 2024-11-26  1:18 Darrick J. Wong
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
  0 siblings, 2 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:18 UTC (permalink / raw)
  To: Zorro Lang, Carlos Maiolino; +Cc: fstests, xfs, Christoph Hellwig

Hi everyone,

I have a rather large pile of bug fixes to send in for 6.12^H^H13-rc.
Most of the fstests fixes you've already seen before, but I've been
working on some new bug fixes for the kernel.  The code that's already
been reviewed is ready for merging, and I was going to ask for a merge
last week.  But.

But then I decided to look at a warning from the page allocator and that
unravelled a bunch more broken stuff.  So now we have some fixes for the
transaction code, a couple more fixes that got lost in the rtrmap
series, an unlock bug, and the quota code.

Ah, the quota code.  I'm removing what I hope is the last place where
reclaim and/or the AIL can have to do a disk read in order to write
dirty data back to disk.  Fixing that was pretty annoying until I got
the other bits settled.

Here's all the new stuff:

[PATCHSET v3] fstests: random fixes for v2024.11.17
  [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully
  [PATCH 15/16] generic/454: actually set attr value for llamapirate
  [PATCH 16/16] xfs/122: add tests for commitrange structures
[PATCHSET] xfs: bug fixes for 6.13
  [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an
  [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are
  [PATCH 13/21] xfs: unlock inodes when erroring out of
  [PATCH 14/21] xfs: only run precommits once per transaction object
  [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  [PATCH 16/21] xfs: don't lose solo superblock counter update
  [PATCH 17/21] xfs: don't lose solo dquot update transactions
  [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush
  [PATCH 19/21] xfs: clean up log item accesses in
  [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer
  [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCHSET v3] fstests: random fixes for v2024.11.17
  2024-11-26  1:18 [PATCHBOMB] xfs/fstests: largeish pile of bug fixes Darrick J. Wong
@ 2024-11-26  1:20 ` Darrick J. Wong
  2024-11-26  1:20   ` [PATCH 01/16] generic/757: fix various bugs in this test Darrick J. Wong
                     ` (16 more replies)
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
  1 sibling, 17 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:20 UTC (permalink / raw)
  To: djwong, zlang; +Cc: tytso, hch, zlang, fstests, fstests, linux-xfs

Hi all,

Here's the usual odd fixes for fstests.  Most of these are cleanups and
bug fixes that have been aging in my djwong-wtf branch forever.

v2: Now with more cleanups to the logwrites code and better loop control
    for 251, as discussed on the v1 patchset.
v3: Add more acks, kick out some of the logwrites stuff, add more
    bugfixes.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=random-fixes

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes
---
Commits in this patchset:
 * generic/757: fix various bugs in this test
 * generic/757: convert to thinp
 * xfs/113: fix failure to corrupt the entire directory
 * xfs/508: fix test for 64k blocksize
 * common/rc: capture dmesg when oom kills happen
 * generic/562: handle ENOSPC while cloning gracefully
 * xfs/163: skip test if we can't shrink due to enospc issues
 * xfs/009: allow logically contiguous preallocations
 * generic/251: use sentinel files to kill the fstrim loop
 * generic/251: constrain runtime via time/load/soak factors
 * generic/251: don't copy the fsstress source code
 * common/rc: _scratch_mkfs_sized supports extra arguments
 * xfs/157: do not drop necessary mkfs options
 * generic/366: fix directio requirements checking
 * generic/454: actually set attr value for llamapirate subtest
 * xfs/122: add tests for commitrange structures
---
 common/rc         |   35 +++++++++++++++++++----------------
 tests/generic/251 |   42 +++++++++++++++++++++---------------------
 tests/generic/366 |    2 +-
 tests/generic/454 |    8 ++++----
 tests/generic/562 |   10 ++++++++--
 tests/generic/757 |   26 ++++++++++++++++++++------
 tests/xfs/009     |   29 ++++++++++++++++++++++++++++-
 tests/xfs/113     |   33 +++++++++++++++++++++++++++------
 tests/xfs/122.out |    1 +
 tests/xfs/157     |    3 +--
 tests/xfs/163     |    9 ++++++++-
 tests/xfs/508     |    4 ++--
 12 files changed, 140 insertions(+), 62 deletions(-)


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCHSET] xfs: bug fixes for 6.13
  2024-11-26  1:18 [PATCHBOMB] xfs/fstests: largeish pile of bug fixes Darrick J. Wong
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
@ 2024-11-26  1:20 ` Darrick J. Wong
  2024-11-26  1:24   ` [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage Darrick J. Wong
                     ` (21 more replies)
  1 sibling, 22 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:20 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, wozizhi, dan.carpenter, linux-xfs

Hi all,

Bug fixes for 6.13.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=xfs-6.13-fixes

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=xfs-6.13-fixes
---
Commits in this patchset:
 * xfs: fix off-by-one error in fsmap's end_daddr usage
 * xfs: metapath scrubber should use the already loaded inodes
 * xfs: keep quota directory inode loaded
 * xfs: return a 64-bit block count from xfs_btree_count_blocks
 * xfs: don't drop errno values when we fail to ficlone the entire range
 * xfs: separate healthy clearing mask during repair
 * xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink
 * xfs: mark metadir repair tempfiles with IRECOVERY
 * xfs: fix null bno_hint handling in xfs_rtallocate_rtg
 * xfs: fix error bailout in xfs_rtginode_create
 * xfs: update btree keys correctly when _insrec splits an inode root block
 * xfs: fix scrub tracepoints when inode-rooted btrees are involved
 * xfs: unlock inodes when erroring out of xfs_trans_alloc_dir
 * xfs: only run precommits once per transaction object
 * xfs: remove recursion in __xfs_trans_commit
 * xfs: don't lose solo superblock counter update transactions
 * xfs: don't lose solo dquot update transactions
 * xfs: separate dquot buffer reads from xfs_dqflush
 * xfs: clean up log item accesses in xfs_qm_dqflush{,_done}
 * xfs: attach dquot buffer to dquot log item buffer
 * xfs: convert quotacheck to attach dquot buffers
---
 fs/xfs/libxfs/xfs_btree.c        |   33 +++++--
 fs/xfs/libxfs/xfs_btree.h        |    2 
 fs/xfs/libxfs/xfs_ialloc_btree.c |    4 +
 fs/xfs/libxfs/xfs_rtgroup.c      |    2 
 fs/xfs/scrub/agheader.c          |    6 +
 fs/xfs/scrub/agheader_repair.c   |    6 +
 fs/xfs/scrub/fscounters.c        |    2 
 fs/xfs/scrub/health.c            |   57 +++++++-----
 fs/xfs/scrub/ialloc.c            |    4 -
 fs/xfs/scrub/metapath.c          |   68 +++++---------
 fs/xfs/scrub/refcount.c          |    2 
 fs/xfs/scrub/scrub.h             |    6 +
 fs/xfs/scrub/symlink_repair.c    |    3 -
 fs/xfs/scrub/tempfile.c          |   10 ++
 fs/xfs/scrub/trace.h             |    2 
 fs/xfs/xfs_bmap_util.c           |    2 
 fs/xfs/xfs_dquot.c               |  185 ++++++++++++++++++++++++++++++++------
 fs/xfs/xfs_dquot.h               |    5 +
 fs/xfs/xfs_dquot_item.c          |   51 ++++++++--
 fs/xfs/xfs_dquot_item.h          |    7 +
 fs/xfs/xfs_file.c                |    8 ++
 fs/xfs/xfs_fsmap.c               |   38 +++++---
 fs/xfs/xfs_inode.h               |    2 
 fs/xfs/xfs_qm.c                  |   92 +++++++++++++------
 fs/xfs/xfs_qm.h                  |    1 
 fs/xfs/xfs_quota.h               |    7 +
 fs/xfs/xfs_rtalloc.c             |    2 
 fs/xfs/xfs_trans.c               |   58 ++++++------
 fs/xfs/xfs_trans_ail.c           |    2 
 fs/xfs/xfs_trans_dquot.c         |   31 +++++-
 30 files changed, 475 insertions(+), 223 deletions(-)


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 01/16] generic/757: fix various bugs in this test
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
@ 2024-11-26  1:20   ` Darrick J. Wong
  2024-11-28  7:56     ` Zorro Lang
  2024-11-26  1:21   ` [PATCH 02/16] generic/757: convert to thinp Darrick J. Wong
                     ` (15 subsequent siblings)
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:20 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Fix this test so the check doesn't fail on XFS, and restrict runtime to
100 loops because otherwise this test takes many hours.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/757 |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)


diff --git a/tests/generic/757 b/tests/generic/757
index 0ff5a8ac00182b..37cf49e6bc7fd9 100755
--- a/tests/generic/757
+++ b/tests/generic/757
@@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
 cur=$(_log_writes_find_next_fua $prev)
 [ -z "$cur" ] && _fail "failed to locate next FUA write"
 
-while [ ! -z "$cur" ]; do
+while _soak_loop_running $((100 * TIME_FACTOR)); do
 	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
 
+	# xfs_repair won't run if the log is dirty
+	if [ $FSTYP = "xfs" ]; then
+		_scratch_mount
+		_scratch_unmount
+	fi
 	_check_scratch_fs
 
 	prev=$cur


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 02/16] generic/757: convert to thinp
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
  2024-11-26  1:20   ` [PATCH 01/16] generic/757: fix various bugs in this test Darrick J. Wong
@ 2024-11-26  1:21   ` Darrick J. Wong
  2024-11-28  8:08     ` Zorro Lang
  2024-11-26  1:21   ` [PATCH 03/16] xfs/113: fix failure to corrupt the entire directory Darrick J. Wong
                     ` (14 subsequent siblings)
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:21 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Convert this test to use dm-thinp so that discards always zero the data.
This prevents weird replay problems if the scratch device doesn't
guarantee that read after discard returns zeroes.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/757 |   23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)


diff --git a/tests/generic/757 b/tests/generic/757
index 37cf49e6bc7fd9..6c13c6af41c57c 100755
--- a/tests/generic/757
+++ b/tests/generic/757
@@ -8,12 +8,13 @@
 # This can be seen on subpage FSes on Linux 6.4.
 #
 . ./common/preamble
-_begin_fstest auto quick metadata log recoveryloop aio
+_begin_fstest auto quick metadata log recoveryloop aio thin
 
 _cleanup()
 {
 	cd /
 	_log_writes_cleanup &> /dev/null
+	_dmthin_cleanup
 	rm -f $tmp.*
 }
 
@@ -23,11 +24,14 @@ _cleanup()
 
 fio_config=$tmp.fio
 
+. ./common/dmthin
 . ./common/dmlogwrites
 
-_require_scratch
+# Use thin device as replay device, which requires $SCRATCH_DEV
+_require_scratch_nocheck
 _require_aiodio
 _require_log_writes
+_require_dm_target thin-pool
 
 cat >$fio_config <<EOF
 [global]
@@ -47,7 +51,13 @@ _require_fio $fio_config
 
 cat $fio_config >> $seqres.full
 
-_log_writes_init $SCRATCH_DEV
+# Use a thin device to provide deterministic discard behavior. Discards are used
+# by the log replay tool for fast zeroing to prevent out-of-order replay issues.
+_test_unmount
+sectors=$(blockdev --getsz $SCRATCH_DEV)
+sectors=$((sectors * 90 / 100))
+_dmthin_init $sectors $sectors
+_log_writes_init $DMTHIN_VOL_DEV
 _log_writes_mkfs >> $seqres.full 2>&1
 _log_writes_mark mkfs
 
@@ -64,14 +74,13 @@ cur=$(_log_writes_find_next_fua $prev)
 [ -z "$cur" ] && _fail "failed to locate next FUA write"
 
 while _soak_loop_running $((100 * TIME_FACTOR)); do
-	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
+	_log_writes_replay_log_range $cur $DMTHIN_VOL_DEV >> $seqres.full
 
 	# xfs_repair won't run if the log is dirty
 	if [ $FSTYP = "xfs" ]; then
-		_scratch_mount
-		_scratch_unmount
+		_dmthin_mount
 	fi
-	_check_scratch_fs
+	_dmthin_check_fs
 
 	prev=$cur
 	cur=$(_log_writes_find_next_fua $(($cur + 1)))


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 03/16] xfs/113: fix failure to corrupt the entire directory
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
  2024-11-26  1:20   ` [PATCH 01/16] generic/757: fix various bugs in this test Darrick J. Wong
  2024-11-26  1:21   ` [PATCH 02/16] generic/757: convert to thinp Darrick J. Wong
@ 2024-11-26  1:21   ` Darrick J. Wong
  2024-11-26  1:21   ` [PATCH 04/16] xfs/508: fix test for 64k blocksize Darrick J. Wong
                     ` (13 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:21 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

This test tries to corrupt the data blocks of a directory, but it
doesn't take into account the fact that __populate_check_xfs_dir can
remove enough entries to cause sparse holes in the directory.  If that
happens, this "file data block is unmapped" logic will cause the
corruption loop to exit early.  Then we can add to the directory, which
causes the test to fail.

Instead, create a list of mappable dir block offsets, and run 100
corruptions at a time to reduce the amount of time we spend initializing
xfs_db.  This fixes the regressions that I see with 32k/64k block sizes.

Cc: <fstests@vger.kernel.org> # v2022.05.01
Fixes: c8e6dbc8812653 ("xfs: test directory metadata corruption checking and repair")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/113 |   33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)


diff --git a/tests/xfs/113 b/tests/xfs/113
index 094ab71f2aefec..22ac8c3fd51b80 100755
--- a/tests/xfs/113
+++ b/tests/xfs/113
@@ -52,13 +52,34 @@ _scratch_xfs_repair -n >> $seqres.full 2>&1 || _fail "xfs_repair should not fail
 echo "+ check dir"
 __populate_check_xfs_dir "${inode}" btree
 
+dir_data_offsets() {
+	_scratch_xfs_db -c "inode ${inode}" -c 'bmap' | \
+		awk -v leaf_lblk=$leaf_lblk \
+		'{
+			if ($3 >= leaf_lblk)
+				exit;
+			for (i = 0; i < $8; i++)
+				printf("%d\n", $3 + i);
+		}'
+}
+
 echo "+ corrupt dir"
-loff=0
-while true; do
-	_scratch_xfs_db -x -c "inode ${inode}" -c "dblock ${loff}" -c "stack" | grep -q 'file data block is unmapped' && break
-	_scratch_xfs_db -x -c "inode ${inode}" -c "dblock ${loff}" -c "stack" -c "blocktrash -x 32 -y $((blksz * 8)) -z ${FUZZ_ARGS}" >> $seqres.full
-	loff="$((loff + 1))"
-done
+subcommands=()
+while read loff; do
+	# run 100 commands at a time
+	if [ "${#subcommands[@]}" -lt 600 ]; then
+		subcommands+=(-c "inode ${inode}")
+		subcommands+=(-c "dblock ${loff}")
+		subcommands+=(-c "blocktrash -x 32 -y $((blksz * 8)) -z ${FUZZ_ARGS}")
+		continue
+	fi
+
+	_scratch_xfs_db -x "${subcommands[@]}" >> $seqres.full
+	subcommands=()
+done < <(dir_data_offsets)
+if [ "${#subcommands[@]}" -gt 0 ]; then
+	_scratch_xfs_db -x "${subcommands[@]}" >> $seqres.full
+fi
 
 echo "+ mount image && modify dir"
 if _try_scratch_mount >> $seqres.full 2>&1; then


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 04/16] xfs/508: fix test for 64k blocksize
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (2 preceding siblings ...)
  2024-11-26  1:21   ` [PATCH 03/16] xfs/113: fix failure to corrupt the entire directory Darrick J. Wong
@ 2024-11-26  1:21   ` Darrick J. Wong
  2024-11-26  1:21   ` [PATCH 05/16] common/rc: capture dmesg when oom kills happen Darrick J. Wong
                     ` (12 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:21 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

It turns out that icreate transactions will try to reserve quite a bit
of space on a 64k fsblock filesystem -- enough to handle the worst case
parent directory expansion, a new inode chunk, and these days a parent
pointer as well.  This can work out to quite a bit of space:

fsblock		reservation
1k		172K
4k		368K
16k		1136K
64k		3650K

Unfortunately, this test sets its block quota limits at 1-2MB, so we
can't even create a child file.  Bump the limits up by 10x so that this
test will pass even if there's more metadata size creep in the future.

Fixes: f769a923f576df ("xfs: project quota ineritance flag test")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/508 |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/tests/xfs/508 b/tests/xfs/508
index ee1a0371db7d6d..1bd13e98c9f641 100755
--- a/tests/xfs/508
+++ b/tests/xfs/508
@@ -44,7 +44,7 @@ do_quota_nospc()
 	local exp=$2
 
 	echo "Write $file, expect $exp:" | _filter_scratch
-	$XFS_IO_PROG -t -f -c "pwrite 0 5m" $file 2>&1 >/dev/null | \
+	$XFS_IO_PROG -t -f -c "pwrite 0 50m" $file 2>&1 >/dev/null | \
 		_filter_xfs_io_error
 	rm -f $file
 }
@@ -56,7 +56,7 @@ _require_prjquota $SCRATCH_DEV
 
 mkdir $SCRATCH_MNT/dir
 $QUOTA_CMD -x -c 'project -s test' $SCRATCH_MNT >>$seqres.full 2>&1
-$QUOTA_CMD -x -c 'limit -p bsoft=1m bhard=2m test' $SCRATCH_MNT
+$QUOTA_CMD -x -c 'limit -p bsoft=10m bhard=20m test' $SCRATCH_MNT
 
 # test the Project inheritance bit is a directory only flag, and it's set on
 # directory by default. Expect no complain about "project inheritance flag is


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 05/16] common/rc: capture dmesg when oom kills happen
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (3 preceding siblings ...)
  2024-11-26  1:21   ` [PATCH 04/16] xfs/508: fix test for 64k blocksize Darrick J. Wong
@ 2024-11-26  1:21   ` Darrick J. Wong
  2024-11-26  1:22   ` [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully Darrick J. Wong
                     ` (11 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:21 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Capture the dmesg output if the OOM killer is invoked during fstests.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 common/rc |    1 +
 1 file changed, 1 insertion(+)


diff --git a/common/rc b/common/rc
index 2ee46e5101e168..70a0f1d1c6acd9 100644
--- a/common/rc
+++ b/common/rc
@@ -4538,6 +4538,7 @@ _check_dmesg()
 	     -e "INFO: possible circular locking dependency detected" \
 	     -e "general protection fault:" \
 	     -e "BUG .* remaining" \
+	     -e "oom-kill" \
 	     -e "UBSAN:" \
 	     $seqres.dmesg
 	if [ $? -eq 0 ]; then


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (4 preceding siblings ...)
  2024-11-26  1:21   ` [PATCH 05/16] common/rc: capture dmesg when oom kills happen Darrick J. Wong
@ 2024-11-26  1:22   ` Darrick J. Wong
  2024-11-26  4:55     ` Christoph Hellwig
  2024-11-26  1:22   ` [PATCH 07/16] xfs/163: skip test if we can't shrink due to enospc issues Darrick J. Wong
                     ` (10 subsequent siblings)
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:22 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

This test creates a couple of patterned files on a tiny filesystem,
fragments the free space, clones one patterned file to the other, and
checks that the entire file was cloned.

However, this test doesn't work on a 64k fsblock filesystem because
we've used up all the free space reservation for the rmapbt, and that
causes the FICLONE to error out with ENOSPC partway through.  Hence we
need to detect the ENOSPC and _notrun the test.

That said, it turns out that XFS has been silently dropping error codes
if we managed to make some progress cloning extents.  That's ok if the
operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
FICLONE/FICLONERANGE do not permit partial results, so the dropped error
codes is actually an error.

Therefore, this testcase now becomes a regression test for the patch to
fix that.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/generic/562 |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


diff --git a/tests/generic/562 b/tests/generic/562
index 91360c4154a6a2..36bd02911c96b8 100755
--- a/tests/generic/562
+++ b/tests/generic/562
@@ -15,6 +15,9 @@ _begin_fstest auto clone punch
 . ./common/filter
 . ./common/reflink
 
+test "$FSTYP" = "xfs" && \
+	_fixed_by_kernel_commit XXXXXXXXXX "xfs: don't drop errno values when we fail to ficlone the entire range"
+
 _require_scratch_reflink
 _require_test_program "punch-alternating"
 _require_xfs_io_command "fpunch"
@@ -48,8 +51,11 @@ while true; do
 done
 
 # Now clone file bar into file foo. This is supposed to succeed and not fail
-# with ENOSPC for example.
-_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full
+# with ENOSPC for example.  However, XFS will sometimes run out of space.
+_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full 2> $tmp.err
+cat $tmp.err
+test "$FSTYP" = "xfs" && grep -q 'No space left on device' $tmp.err && \
+	_notrun "ran out of space while cloning"
 
 # Unmount and mount the filesystem again to verify the operation was durably
 # persisted.


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 07/16] xfs/163: skip test if we can't shrink due to enospc issues
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (5 preceding siblings ...)
  2024-11-26  1:22   ` [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully Darrick J. Wong
@ 2024-11-26  1:22   ` Darrick J. Wong
  2024-11-26  1:22   ` [PATCH 08/16] xfs/009: allow logically contiguous preallocations Darrick J. Wong
                     ` (9 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:22 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If this test fails due to insufficient space, skip this test.  This can
happen if a realtime volume is enabled on the filesystem and we cannot
shrink due to the rtbitmap.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/163 |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)


diff --git a/tests/xfs/163 b/tests/xfs/163
index 2bd94060222f96..75c3113dc2fd03 100755
--- a/tests/xfs/163
+++ b/tests/xfs/163
@@ -17,13 +17,20 @@ _begin_fstest auto quick growfs shrinkfs
 
 test_shrink()
 {
-	$XFS_GROWFS_PROG -D"$1" $SCRATCH_MNT >> $seqres.full 2>&1
+	$XFS_GROWFS_PROG -D"$1" $SCRATCH_MNT &> $tmp.growfs
 	ret=$?
 
 	_scratch_unmount
 	_check_scratch_fs
 	_scratch_mount
 
+	# If we couldn't shrink the filesystem due to lack of space, we're
+	# done with this test.
+	[ $1 -ne $dblocks ] && \
+		grep -q 'No space left on device' $tmp.growfs && \
+		_notrun "Could not shrink due to lack of space"
+	cat $tmp.growfs >> $seqres.full
+
 	$XFS_INFO_PROG $SCRATCH_MNT 2>&1 | _filter_mkfs 2>$tmp.growfs >/dev/null
 	. $tmp.growfs
 	[ $ret -eq 0 -a $1 -eq $dblocks ]


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 08/16] xfs/009: allow logically contiguous preallocations
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (6 preceding siblings ...)
  2024-11-26  1:22   ` [PATCH 07/16] xfs/163: skip test if we can't shrink due to enospc issues Darrick J. Wong
@ 2024-11-26  1:22   ` Darrick J. Wong
  2024-11-26  1:22   ` [PATCH 09/16] generic/251: use sentinel files to kill the fstrim loop Darrick J. Wong
                     ` (8 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:22 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

The new rtgroups feature implements a simplistic rotor to pick the
rtgroup for an initial allocation to a file.  This causes test failures
if the preallocations are spread across two rtgroups, which happens if
there are more subtests than rtgroups.

One way to fix this would be to reset the rotor then each subtest starts
allocating from rtgroup 0, but the only way to do that is to cycle the
scratch mount, which is a bit gross.

Instead, report logically contiguous mappings as a single mapping even
if the physical space is not contiguous.  Unfortunately, there's not
enough context in the comments to know if the test actually was checking
for physical contiguity?  Or if this is just an exerciser of the old
preallocation calls, and it's fine as long as the file ranges are mapped
(or unmapped) as desired.

Messing with some awk is a lot cheaper than umount/mount cycling.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/009 |   29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)


diff --git a/tests/xfs/009 b/tests/xfs/009
index dde505f079f4f8..bb42ce32490df5 100755
--- a/tests/xfs/009
+++ b/tests/xfs/009
@@ -49,13 +49,26 @@ _filesize()
 _block_filter()
 {
 	$AWK_PROG -v bsize="$bsize" '
+	BEGIN {
+		br_pos = 0
+		br_len = 0
+	}
+	function dump_blockrange() {
+		if (br_len == 0)
+			return
+		printf("        [%d,%d]: BLOCKRANGE\n", br_pos, br_len)
+		br_pos = 0
+		br_len = 0
+	}
 	/blocksize/ {
+		dump_blockrange()
 		printf("    blocksize BSIZE\n")
 
 		next
 	}
 
 	/CMD/ {
+		dump_blockrange()
 		split($3, off, "=")
 		offset = strtonum(off[2])
 		if (offset != -1)
@@ -72,6 +85,7 @@ _block_filter()
 	}
 
 	/MAP/ {
+		dump_blockrange()
 		split($2, off, "=")
 		offset = strtonum(off[2])
 		if (offset != -1)
@@ -90,6 +104,7 @@ _block_filter()
 	}
 
 	/TRUNCATE/ {
+		dump_blockrange()
 		split($2, off, "=")
 		offset = strtonum(off[2]) / bsize
 
@@ -99,16 +114,28 @@ _block_filter()
 	}
 
 	/\[[0-9]+,[0-9]+\]:/ {
-		printf("        %s BLOCKRANGE\n", $1)
+		rangestr = gensub(/\[([0-9]+),([0-9]+)\]:/, "\\1,\\2", "g", $1);
+		split(rangestr, off, ",")
+		if (br_pos + br_len == off[1]) {
+			br_len += off[2];
+		} else {
+			dump_blockrange()
+			br_pos = off[1];
+			br_len = off[2];
+		}
 
 		next
 	}
 
 	{
+		dump_blockrange()
 		print
 
 		next
 	}
+	END {
+		dump_blockrange()
+	}
 	'
 }
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 09/16] generic/251: use sentinel files to kill the fstrim loop
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (7 preceding siblings ...)
  2024-11-26  1:22   ` [PATCH 08/16] xfs/009: allow logically contiguous preallocations Darrick J. Wong
@ 2024-11-26  1:22   ` Darrick J. Wong
  2024-11-26  1:23   ` [PATCH 10/16] generic/251: constrain runtime via time/load/soak factors Darrick J. Wong
                     ` (7 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:22 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Apparently the subshell kill doesn't always take, and then the test runs
for hours and hours because nothing stops it.  Instead, use a sentinel
file to detect when fstrim_loop should stop execing background fstrims.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/251 |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


diff --git a/tests/generic/251 b/tests/generic/251
index b432fb11937911..d59e91c3e0a33a 100755
--- a/tests/generic/251
+++ b/tests/generic/251
@@ -125,12 +125,15 @@ fstrim_loop()
 			wait $fpid
 		fi
 		while [ $start -lt $fsize ] ; do
+			test -s $tmp.fstrim_loop || break
 			$FSTRIM_PROG -m ${minlen}k -o ${start}k -l ${step}k $SCRATCH_MNT &
 			fpid=$!
 			wait $fpid
 			start=$(( $start + $step ))
 		done
+		test -s $tmp.fstrim_loop || break
 	done
+	rm -f $tmp.fstrim_loop
 }
 
 function check_sums() {
@@ -188,6 +191,7 @@ find -P . -xdev -type f -print0 | xargs -0 md5sum | sort -o $tmp/content.sums
 
 echo -n "Running the test: "
 pids=""
+echo run > $tmp.fstrim_loop
 fstrim_loop &
 fstrim_pid=$!
 p=1
@@ -199,8 +203,10 @@ done
 echo "done."
 
 wait $pids
-kill $fstrim_pid
-wait $fstrim_pid
+truncate -s 0 $tmp.fstrim_loop
+while test -e $tmp.fstrim_loop; do
+	sleep 1
+done
 
 status=0
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 10/16] generic/251: constrain runtime via time/load/soak factors
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (8 preceding siblings ...)
  2024-11-26  1:22   ` [PATCH 09/16] generic/251: use sentinel files to kill the fstrim loop Darrick J. Wong
@ 2024-11-26  1:23   ` Darrick J. Wong
  2024-11-26  1:23   ` [PATCH 11/16] generic/251: don't copy the fsstress source code Darrick J. Wong
                     ` (6 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:23 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

On my test fleet, this test can run for well in excess of 20 minutes:

   613 generic/251
   616 generic/251
   624 generic/251
   630 generic/251
   634 generic/251
   652 generic/251
   675 generic/251
   749 generic/251
   777 generic/251
   808 generic/251
   832 generic/251
   946 generic/251
  1082 generic/251
  1221 generic/251
  1241 generic/251
  1254 generic/251
  1305 generic/251
  1366 generic/251
  1646 generic/251
  1936 generic/251
  1952 generic/251
  2358 generic/251
  4359 generic/251
  5325 generic/251
 34046 generic/251

because it hardcodes 20 threads and 10 copies.  It's not great to have a
test that results in a significant fraction of the total test runtime.
Fix the looping and load on this test to use LOAD and TIME_FACTOR to
scale up its operations, along with the usual SOAK_DURATION override.
That brings the default runtime down to less than a minute.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/251 |   24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)


diff --git a/tests/generic/251 b/tests/generic/251
index d59e91c3e0a33a..b4ddda10cef403 100755
--- a/tests/generic/251
+++ b/tests/generic/251
@@ -15,7 +15,6 @@ _begin_fstest ioctl trim auto
 tmp=`mktemp -d`
 trap "_cleanup; exit \$status" 0 1 3
 trap "_destroy; exit \$status" 2 15
-chpid=0
 mypid=$$
 
 # Import common functions.
@@ -151,29 +150,28 @@ function check_sums() {
 
 function run_process() {
 	local p=$1
-	repeat=10
+	if [ -n "$SOAK_DURATION" ]; then
+		local duration="$SOAK_DURATION"
+	else
+		local duration="$((30 * TIME_FACTOR))"
+	fi
+	local stopat="$(( $(date +%s) + duration))"
 
-	sleep $((5*$p))s &
-	export chpid=$! && wait $chpid &> /dev/null
-	chpid=0
-
-	while [ $repeat -gt 0 ]; do
+	sleep $((5*$p))s
 
+	while [ "$(date +%s)" -lt "$stopat" ]; do
 		# Remove old directories.
 		rm -rf $SCRATCH_MNT/$p
-		export chpid=$! && wait $chpid &> /dev/null
 
 		# Copy content -> partition.
 		mkdir $SCRATCH_MNT/$p
 		cp -axT $content/ $SCRATCH_MNT/$p/
-		export chpid=$! && wait $chpid &> /dev/null
 
 		check_sums
-		repeat=$(( $repeat - 1 ))
 	done
 }
 
-nproc=20
+nproc=$((4 * LOAD_FACTOR))
 
 # Copy $here to the scratch fs and make coipes of the replica.  The fstests
 # output (and hence $seqres.full) could be in $here, so we need to snapshot
@@ -194,11 +192,9 @@ pids=""
 echo run > $tmp.fstrim_loop
 fstrim_loop &
 fstrim_pid=$!
-p=1
-while [ $p -le $nproc ]; do
+for ((p = 1; p < nproc; p++)); do
 	run_process $p &
 	pids="$pids $!"
-	p=$(($p+1))
 done
 echo "done."
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 11/16] generic/251: don't copy the fsstress source code
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (9 preceding siblings ...)
  2024-11-26  1:23   ` [PATCH 10/16] generic/251: constrain runtime via time/load/soak factors Darrick J. Wong
@ 2024-11-26  1:23   ` Darrick J. Wong
  2024-11-26  1:23   ` [PATCH 12/16] common/rc: _scratch_mkfs_sized supports extra arguments Darrick J. Wong
                     ` (5 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:23 UTC (permalink / raw)
  To: djwong, zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Run fsstress for a short time to generate test data to replicate on the
scratch device so that we don't blow out the test runtimes on
unintentionally copying .git directories or large corefiles from the
developer's systems, etc.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/251 |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)


diff --git a/tests/generic/251 b/tests/generic/251
index b4ddda10cef403..ec486c277c6828 100755
--- a/tests/generic/251
+++ b/tests/generic/251
@@ -173,13 +173,11 @@ function run_process() {
 
 nproc=$((4 * LOAD_FACTOR))
 
-# Copy $here to the scratch fs and make coipes of the replica.  The fstests
-# output (and hence $seqres.full) could be in $here, so we need to snapshot
-# $here before computing file checksums.
+# Use fsstress to create a directory tree with some variability
 content=$SCRATCH_MNT/orig
 mkdir -p $content
-cp -axT $here/ $content/
-
+FSSTRESS_ARGS=$(_scale_fsstress_args -p 4 -d $content -n 1000 $FSSTRESS_AVOID)
+$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full
 mkdir -p $tmp
 
 (


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 12/16] common/rc: _scratch_mkfs_sized supports extra arguments
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (10 preceding siblings ...)
  2024-11-26  1:23   ` [PATCH 11/16] generic/251: don't copy the fsstress source code Darrick J. Wong
@ 2024-11-26  1:23   ` Darrick J. Wong
  2024-11-26  1:23   ` [PATCH 13/16] xfs/157: do not drop necessary mkfs options Darrick J. Wong
                     ` (4 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:23 UTC (permalink / raw)
  To: djwong, zlang; +Cc: zlang, fstests, linux-xfs

From: Zorro Lang <zlang@kernel.org>

To give more arguments to _scratch_mkfs_sized, we generally do as:

  MKFS_OPTIONS="-L oldlabel $MKFS_OPTIONS" _scratch_mkfs_sized $fs_size

to give "-L oldlabel" to it. But if _scratch_mkfs_sized fails, it
will get rid of the whole MKFS_OPTIONS and try to mkfs again.
Likes:

  ** mkfs failed with extra mkfs options added to "-L oldlabel -m rmapbt=1" by test 157 **
  ** attempting to mkfs using only test 157 options: -d size=524288000 -b size=4096 **

But that's not the fault of "-L oldlabel". So for keeping the mkfs
options ("-L oldlabel") we need, we'd better to let the
scratch_mkfs_sized to support extra arguments, rather than using
global MKFS_OPTIONS.

Signed-off-by: Zorro Lang <zlang@kernel.org>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
[djwong: fix string quoting issues]
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 common/rc |   34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)


diff --git a/common/rc b/common/rc
index 70a0f1d1c6acd9..6acacbe4c88eea 100644
--- a/common/rc
+++ b/common/rc
@@ -1026,11 +1026,13 @@ _small_fs_size_mb()
 }
 
 # Create fs of certain size on scratch device
-# _try_scratch_mkfs_sized <size in bytes> [optional blocksize]
+# _try_scratch_mkfs_sized <size in bytes> [optional blocksize] [other options]
 _try_scratch_mkfs_sized()
 {
 	local fssize=$1
-	local blocksize=$2
+	shift
+	local blocksize=$1
+	shift
 	local def_blksz
 	local blocksize_opt
 	local rt_ops
@@ -1094,10 +1096,10 @@ _try_scratch_mkfs_sized()
 		# don't override MKFS_OPTIONS that set a block size.
 		echo $MKFS_OPTIONS |grep -E -q "b\s*size="
 		if [ $? -eq 0 ]; then
-			_try_scratch_mkfs_xfs -d size=$fssize $rt_ops
+			_try_scratch_mkfs_xfs -d size=$fssize $rt_ops "$@"
 		else
 			_try_scratch_mkfs_xfs -d size=$fssize $rt_ops \
-				-b size=$blocksize
+				-b size=$blocksize "$@"
 		fi
 		;;
 	ext2|ext3|ext4)
@@ -1108,7 +1110,7 @@ _try_scratch_mkfs_sized()
 				_notrun "Could not make scratch logdev"
 			MKFS_OPTIONS="$MKFS_OPTIONS -J device=$SCRATCH_LOGDEV"
 		fi
-		${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+		${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize "$@" $SCRATCH_DEV $blocks
 		;;
 	gfs2)
 		# mkfs.gfs2 doesn't automatically shrink journal files on small
@@ -1123,13 +1125,13 @@ _try_scratch_mkfs_sized()
 			(( journal_size >= min_journal_size )) || journal_size=$min_journal_size
 			MKFS_OPTIONS="-J $journal_size $MKFS_OPTIONS"
 		fi
-		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS -O -b $blocksize $SCRATCH_DEV $blocks
+		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS -O -b $blocksize "$@" $SCRATCH_DEV $blocks
 		;;
 	ocfs2)
-		yes | ${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+		yes | ${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize "$@" $SCRATCH_DEV $blocks
 		;;
 	udf)
-		$MKFS_UDF_PROG $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+		$MKFS_UDF_PROG $MKFS_OPTIONS -b $blocksize "$@" $SCRATCH_DEV $blocks
 		;;
 	btrfs)
 		local mixed_opt=
@@ -1137,33 +1139,33 @@ _try_scratch_mkfs_sized()
 		# the device is not zoned. Ref: btrfs-progs: btrfs_min_dev_size()
 		(( fssize < $((256 * 1024 * 1024)) )) &&
 			! _scratch_btrfs_is_zoned && mixed_opt='--mixed'
-		$MKFS_BTRFS_PROG $MKFS_OPTIONS $mixed_opt -b $fssize $SCRATCH_DEV
+		$MKFS_BTRFS_PROG $MKFS_OPTIONS $mixed_opt -b $fssize "$@" $SCRATCH_DEV
 		;;
 	jfs)
-		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS $SCRATCH_DEV $blocks
+		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS "$@" $SCRATCH_DEV $blocks
 		;;
 	reiserfs)
-		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+		${MKFS_PROG} -t $FSTYP $MKFS_OPTIONS -b $blocksize "$@" $SCRATCH_DEV $blocks
 		;;
 	reiser4)
 		# mkfs.resier4 requires size in KB as input for creating filesystem
-		$MKFS_REISER4_PROG $MKFS_OPTIONS -y -b $blocksize $SCRATCH_DEV \
+		$MKFS_REISER4_PROG $MKFS_OPTIONS -y -b $blocksize "$@" $SCRATCH_DEV \
 				   `expr $fssize / 1024`
 		;;
 	f2fs)
 		# mkfs.f2fs requires # of sectors as an input for the size
 		local sector_size=`blockdev --getss $SCRATCH_DEV`
-		$MKFS_F2FS_PROG $MKFS_OPTIONS $SCRATCH_DEV `expr $fssize / $sector_size`
+		$MKFS_F2FS_PROG $MKFS_OPTIONS "$@" $SCRATCH_DEV `expr $fssize / $sector_size`
 		;;
 	tmpfs)
 		local free_mem=`_free_memory_bytes`
 		if [ "$free_mem" -lt "$fssize" ] ; then
 		   _notrun "Not enough memory ($free_mem) for tmpfs with $fssize bytes"
 		fi
-		export MOUNT_OPTIONS="-o size=$fssize $TMPFS_MOUNT_OPTIONS"
+		export MOUNT_OPTIONS="-o size=$fssize "$@" $TMPFS_MOUNT_OPTIONS"
 		;;
 	bcachefs)
-		$MKFS_BCACHEFS_PROG $MKFS_OPTIONS --fs_size=$fssize $blocksize_opt $SCRATCH_DEV
+		$MKFS_BCACHEFS_PROG $MKFS_OPTIONS --fs_size=$fssize $blocksize_opt "$@" $SCRATCH_DEV
 		;;
 	*)
 		_notrun "Filesystem $FSTYP not supported in _scratch_mkfs_sized"
@@ -1173,7 +1175,7 @@ _try_scratch_mkfs_sized()
 
 _scratch_mkfs_sized()
 {
-	_try_scratch_mkfs_sized $* || _notrun "_scratch_mkfs_sized failed with ($*)"
+	_try_scratch_mkfs_sized "$@" || _notrun "_scratch_mkfs_sized failed with ($*)"
 }
 
 # Emulate an N-data-disk stripe w/ various stripe units


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 13/16] xfs/157: do not drop necessary mkfs options
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (11 preceding siblings ...)
  2024-11-26  1:23   ` [PATCH 12/16] common/rc: _scratch_mkfs_sized supports extra arguments Darrick J. Wong
@ 2024-11-26  1:23   ` Darrick J. Wong
  2024-11-26  1:24   ` [PATCH 14/16] generic/366: fix directio requirements checking Darrick J. Wong
                     ` (3 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:23 UTC (permalink / raw)
  To: djwong, zlang; +Cc: zlang, fstests, linux-xfs

From: Zorro Lang <zlang@kernel.org>

To give the test option "-L oldlabel" to _scratch_mkfs_sized, xfs/157
does:

  MKFS_OPTIONS="-L oldlabel $MKFS_OPTIONS" _scratch_mkfs_sized $fs_size

but the _scratch_mkfs_sized trys to keep the $fs_size, when mkfs
fails with incompatible $MKFS_OPTIONS options, likes this:

  ** mkfs failed with extra mkfs options added to "-L oldlabel -m rmapbt=1" by test 157 **
  ** attempting to mkfs using only test 157 options: -d size=524288000 -b size=4096 **

but the "-L oldlabel" is necessary, we shouldn't drop it. To avoid
that, we give the "-L oldlabel" to _scratch_mkfs_sized through
function parameters, not through global MKFS_OPTIONS.

Signed-off-by: Zorro Lang <zlang@kernel.org>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
[djwong: fix more string quoting issues]
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/157 |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)


diff --git a/tests/xfs/157 b/tests/xfs/157
index 9b5badbaeb3c76..e102a5a10abe4b 100755
--- a/tests/xfs/157
+++ b/tests/xfs/157
@@ -66,8 +66,7 @@ scenario() {
 }
 
 check_label() {
-	MKFS_OPTIONS="-L oldlabel $MKFS_OPTIONS" _scratch_mkfs_sized $fs_size \
-		>> $seqres.full
+	_scratch_mkfs_sized "$fs_size" "" -L oldlabel >> $seqres.full 2>&1
 	_scratch_xfs_db -c label
 	_scratch_xfs_admin -L newlabel "$@" >> $seqres.full
 	_scratch_xfs_db -c label


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 14/16] generic/366: fix directio requirements checking
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (12 preceding siblings ...)
  2024-11-26  1:23   ` [PATCH 13/16] xfs/157: do not drop necessary mkfs options Darrick J. Wong
@ 2024-11-26  1:24   ` Darrick J. Wong
  2024-11-26  1:24   ` [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest Darrick J. Wong
                     ` (2 subsequent siblings)
  16 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:24 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

On a system with 4k-sector storage devices, this test fails with:

 --- /tmp/fstests/tests/generic/366.out	2024-11-17 09:04:53.161104479 -0800
 +++ /var/tmp/fstests/generic/366.out.bad	2024-11-20 21:02:30.948000000 -0800
 @@ -1,2 +1,34 @@
  QA output created by 366
 +fio: io_u error on file /opt/file1: Invalid argument: read offset=15360, buflen=512
 +fio: io_u error on file /opt/file1: Invalid argument: read offset=15360, buflen=512

The cause of this failure is that we cannot do 512byte directios to a
device with 4k LBAs.  Update the precondition checking to exclude this
scenario.

Cc: <fstests@vger.kernel.org> # v2024.11.17
Fixes: 4c1629ae3a3a56 ("generic: new test case to verify if certain fio load will hang the filesystem")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 tests/generic/366 |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/tests/generic/366 b/tests/generic/366
index 6e7dd7279218c4..b322bcca72fecc 100755
--- a/tests/generic/366
+++ b/tests/generic/366
@@ -20,7 +20,7 @@ _begin_fstest auto quick rw
 . ./common/filter
 
 _require_scratch
-_require_odirect
+_require_odirect 512	# see fio job1 config below
 _require_aio
 
 _fixed_by_kernel_commit xxxxxxxxxxxx \


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (13 preceding siblings ...)
  2024-11-26  1:24   ` [PATCH 14/16] generic/366: fix directio requirements checking Darrick J. Wong
@ 2024-11-26  1:24   ` Darrick J. Wong
  2024-11-26  4:56     ` Christoph Hellwig
  2024-11-26  1:24   ` [PATCH 16/16] xfs/122: add tests for commitrange structures Darrick J. Wong
  2024-11-26 20:27   ` [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool Darrick J. Wong
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:24 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, tytso, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Ted reported that this test fails on his setup, and I noticed that I
forgot to actually set a value for the xattr.  In theory filesystems
support zero-byte xattrs, but we might as well set and check the values
so that we can make sure nobody got confused.

The actual test failure comes from attr 2.4.47 refusing to set a
zero-legnth xattr, whereas 2.5 and newer will.  That was changed in the
attr commit 0550d2bc989d39 ("Properly set and report empty attribute
values") prior to 2.4.48:

https://git.savannah.nongnu.org/cgit/attr.git/commit/?id=0550d2bc989d390eb25f7004ee0fae2dbc693a0d

Cc: <fstests@vger.kernel.org> # v2024.10.28
Fixes: 9c3762ceafd430 ("misc: amend unicode confusing name tests to check for hidden tag characters")
Reported-and-tested-by: tytso@mit.edu
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/generic/454 |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)


diff --git a/tests/generic/454 b/tests/generic/454
index 2cc2d81ce4cc77..aec8beb8b43ca0 100755
--- a/tests/generic/454
+++ b/tests/generic/454
@@ -121,8 +121,8 @@ setf "combmark_\xe1\x80\x9c\xe1\x80\xad\xe1\x80\xaf.txt" "combining marks"
 setf "combmark_\xe1\x80\x9c\xe1\x80\xaf\xe1\x80\xad.txt" "combining marks"
 
 # encoding hidden tag characters in attrnames to create confusing xattrs
-setf "llamapirate\xf3\xa0\x80\x81\xf3\xa0\x81\x94\xf3\xa0\x81\xa8\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb3\xf3\xa0\x81\xa1\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x81\xb3\xf3\xa0\x80\xa0\xf3\xa0\x81\xa6\xf3\xa0\x81\xaf\xf3\xa0\x81\xb2\xf3\xa0\x80\xa0\xf3\xa0\x81\x93\xf3\xa0\x81\xa5\xf3\xa0\x81\xa1\xf3\xa0\x81\xb4\xf3\xa0\x81\xb4\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb7\xf3\xa0\x81\xa5\xf3\xa0\x81\xb2\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\x95\xf3\xa0\x81\x93\xf3\xa0\x81\x84\xf3\xa0\x80\xa0\xf3\xa0\x80\xb1\xf3\xa0\x80\xb2\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x81\xbf"
-setf "llamapirate"
+setf "llamapirate\xf3\xa0\x80\x81\xf3\xa0\x81\x94\xf3\xa0\x81\xa8\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb3\xf3\xa0\x81\xa1\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x81\xb3\xf3\xa0\x80\xa0\xf3\xa0\x81\xa6\xf3\xa0\x81\xaf\xf3\xa0\x81\xb2\xf3\xa0\x80\xa0\xf3\xa0\x81\x93\xf3\xa0\x81\xa5\xf3\xa0\x81\xa1\xf3\xa0\x81\xb4\xf3\xa0\x81\xb4\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb7\xf3\xa0\x81\xa5\xf3\xa0\x81\xb2\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\x95\xf3\xa0\x81\x93\xf3\xa0\x81\x84\xf3\xa0\x80\xa0\xf3\xa0\x80\xb1\xf3\xa0\x80\xb2\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x81\xbf" "secret instructions"
+setf "llamapirate" "no secret instructions"
 
 _getfattr --absolute-names -d "${testfile}" >> $seqres.full
 
@@ -171,8 +171,8 @@ testf "zerojoin_moo\xe2\x80\x8ccow.txt" "zero width joiners"
 testf "combmark_\xe1\x80\x9c\xe1\x80\xad\xe1\x80\xaf.txt" "combining marks"
 testf "combmark_\xe1\x80\x9c\xe1\x80\xaf\xe1\x80\xad.txt" "combining marks"
 
-testf "llamapirate\xf3\xa0\x80\x81\xf3\xa0\x81\x94\xf3\xa0\x81\xa8\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb3\xf3\xa0\x81\xa1\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x81\xb3\xf3\xa0\x80\xa0\xf3\xa0\x81\xa6\xf3\xa0\x81\xaf\xf3\xa0\x81\xb2\xf3\xa0\x80\xa0\xf3\xa0\x81\x93\xf3\xa0\x81\xa5\xf3\xa0\x81\xa1\xf3\xa0\x81\xb4\xf3\xa0\x81\xb4\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb7\xf3\xa0\x81\xa5\xf3\xa0\x81\xb2\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\x95\xf3\xa0\x81\x93\xf3\xa0\x81\x84\xf3\xa0\x80\xa0\xf3\xa0\x80\xb1\xf3\xa0\x80\xb2\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x81\xbf"
-testf "llamapirate"
+testf "llamapirate\xf3\xa0\x80\x81\xf3\xa0\x81\x94\xf3\xa0\x81\xa8\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb3\xf3\xa0\x81\xa1\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x81\xb3\xf3\xa0\x80\xa0\xf3\xa0\x81\xa6\xf3\xa0\x81\xaf\xf3\xa0\x81\xb2\xf3\xa0\x80\xa0\xf3\xa0\x81\x93\xf3\xa0\x81\xa5\xf3\xa0\x81\xa1\xf3\xa0\x81\xb4\xf3\xa0\x81\xb4\xf3\xa0\x81\xac\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\xb7\xf3\xa0\x81\xa5\xf3\xa0\x81\xb2\xf3\xa0\x81\xa5\xf3\xa0\x80\xa0\xf3\xa0\x81\x95\xf3\xa0\x81\x93\xf3\xa0\x81\x84\xf3\xa0\x80\xa0\xf3\xa0\x80\xb1\xf3\xa0\x80\xb2\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x80\xb0\xf3\xa0\x81\xbf" "secret instructions"
+testf "llamapirate" "no secret instructions"
 
 echo "Uniqueness of keys?"
 crazy_keys="$(_getfattr --absolute-names -d "${testfile}" | grep -E -c '(french_|chinese_|greek_|arabic_|urk)')"


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 16/16] xfs/122: add tests for commitrange structures
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (14 preceding siblings ...)
  2024-11-26  1:24   ` [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest Darrick J. Wong
@ 2024-11-26  1:24   ` Darrick J. Wong
  2024-11-26  4:57     ` Christoph Hellwig
  2024-11-26 20:27   ` [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool Darrick J. Wong
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:24 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Update this test to check the ioctl structure for XFS_IOC_COMMIT_RANGE,
which was added in 6.12.  This will be the last ever addition to
xfs/122, because in 6.13 we moved the ondisk structure checks to libxfs
after which we'll be able to _notrun this test on newer codebases.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/122.out |    1 +
 1 file changed, 1 insertion(+)


diff --git a/tests/xfs/122.out b/tests/xfs/122.out
index 60d8294551b1c8..4dc7d7d0a3602b 100644
--- a/tests/xfs/122.out
+++ b/tests/xfs/122.out
@@ -76,6 +76,7 @@ sizeof(struct xfs_bulk_ireq) = 64
 sizeof(struct xfs_bulkstat) = 192
 sizeof(struct xfs_bulkstat_req) = 64
 sizeof(struct xfs_clone_args) = 32
+sizeof(struct xfs_commit_range) = 88
 sizeof(struct xfs_cud_log_format) = 16
 sizeof(struct xfs_cui_log_format) = 16
 sizeof(struct xfs_da3_blkinfo) = 56


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
@ 2024-11-26  1:24   ` Darrick J. Wong
  2024-11-26  1:25   ` [PATCH 02/21] xfs: metapath scrubber should use the already loaded inodes Darrick J. Wong
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:24 UTC (permalink / raw)
  To: djwong; +Cc: stable, wozizhi, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

In commit ca6448aed4f10a, we created an "end_daddr" variable to fix
fsmap reporting when the end of the range requested falls in the middle
of an unknown (aka free on the rmapbt) region.  Unfortunately, I didn't
notice that the the code sets end_daddr to the last sector of the device
but then uses that quantity to compute the length of the synthesized
mapping.

Zizhi Wo later observed that when end_daddr isn't set, we still don't
report the last fsblock on a device because in that case (aka when
info->last is true), the info->high mapping that we pass to
xfs_getfsmap_group_helper has a startblock that points to the last
fsblock.  This is also wrong because the code uses startblock to
compute the length of the synthesized mapping.

Fix the second problem by setting end_daddr unconditionally, and fix the
first problem by setting start_daddr to one past the end of the range to
query.

Cc: <stable@vger.kernel.org> # v6.11
Fixes: ca6448aed4f10a ("xfs: Fix missing interval for missing_owner in xfs fsmap")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reported-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_fsmap.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)


diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index 82f2e0dd224997..3290dd8524a69a 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -163,7 +163,8 @@ struct xfs_getfsmap_info {
 	xfs_daddr_t		next_daddr;	/* next daddr we expect */
 	/* daddr of low fsmap key when we're using the rtbitmap */
 	xfs_daddr_t		low_daddr;
-	xfs_daddr_t		end_daddr;	/* daddr of high fsmap key */
+	/* daddr of high fsmap key, or the last daddr on the device */
+	xfs_daddr_t		end_daddr;
 	u64			missing_owner;	/* owner of holes */
 	u32			dev;		/* device id */
 	/*
@@ -387,8 +388,8 @@ xfs_getfsmap_group_helper(
 	 * we calculated from userspace's high key to synthesize the record.
 	 * Note that if the btree query found a mapping, there won't be a gap.
 	 */
-	if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL)
-		frec->start_daddr = info->end_daddr;
+	if (info->last)
+		frec->start_daddr = info->end_daddr + 1;
 	else
 		frec->start_daddr = xfs_gbno_to_daddr(xg, startblock);
 
@@ -736,11 +737,10 @@ xfs_getfsmap_rtdev_rtbitmap_helper(
 	 * we calculated from userspace's high key to synthesize the record.
 	 * Note that if the btree query found a mapping, there won't be a gap.
 	 */
-	if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL) {
-		frec.start_daddr = info->end_daddr;
-	} else {
+	if (info->last)
+		frec.start_daddr = info->end_daddr + 1;
+	else
 		frec.start_daddr = xfs_rtb_to_daddr(mp, start_rtb);
-	}
 
 	frec.len_daddr = XFS_FSB_TO_BB(mp, rtbcount);
 	return xfs_getfsmap_helper(tp, info, &frec);
@@ -933,7 +933,10 @@ xfs_getfsmap(
 	struct xfs_trans		*tp = NULL;
 	struct xfs_fsmap		dkeys[2];	/* per-dev keys */
 	struct xfs_getfsmap_dev		handlers[XFS_GETFSMAP_DEVS];
-	struct xfs_getfsmap_info	info = { NULL };
+	struct xfs_getfsmap_info	info = {
+		.fsmap_recs		= fsmap_recs,
+		.head			= head,
+	};
 	bool				use_rmap;
 	int				i;
 	int				error = 0;
@@ -998,9 +1001,6 @@ xfs_getfsmap(
 
 	info.next_daddr = head->fmh_keys[0].fmr_physical +
 			  head->fmh_keys[0].fmr_length;
-	info.end_daddr = XFS_BUF_DADDR_NULL;
-	info.fsmap_recs = fsmap_recs;
-	info.head = head;
 
 	/* For each device we support... */
 	for (i = 0; i < XFS_GETFSMAP_DEVS; i++) {
@@ -1013,17 +1013,23 @@ xfs_getfsmap(
 			break;
 
 		/*
-		 * If this device number matches the high key, we have
-		 * to pass the high key to the handler to limit the
-		 * query results.  If the device number exceeds the
-		 * low key, zero out the low key so that we get
-		 * everything from the beginning.
+		 * If this device number matches the high key, we have to pass
+		 * the high key to the handler to limit the query results, and
+		 * set the end_daddr so that we can synthesize records at the
+		 * end of the query range or device.
 		 */
 		if (handlers[i].dev == head->fmh_keys[1].fmr_device) {
 			dkeys[1] = head->fmh_keys[1];
 			info.end_daddr = min(handlers[i].nr_sectors - 1,
 					     dkeys[1].fmr_physical);
+		} else {
+			info.end_daddr = handlers[i].nr_sectors - 1;
 		}
+
+		/*
+		 * If the device number exceeds the low key, zero out the low
+		 * key so that we get everything from the beginning.
+		 */
 		if (handlers[i].dev > head->fmh_keys[0].fmr_device)
 			memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 02/21] xfs: metapath scrubber should use the already loaded inodes
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
  2024-11-26  1:24   ` [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage Darrick J. Wong
@ 2024-11-26  1:25   ` Darrick J. Wong
  2024-11-26  1:25   ` [PATCH 03/21] xfs: keep quota directory inode loaded Darrick J. Wong
                     ` (19 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:25 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Don't waste time in xchk_setup_metapath_dqinode doing a second lookup of
the quota inodes, just grab them from the quotainfo structure.  The
whole point of this scrubber is to make sure that the dirents exist, so
it's completely silly to do lookups.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/metapath.c |   41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)


diff --git a/fs/xfs/scrub/metapath.c b/fs/xfs/scrub/metapath.c
index b78db651346518..80467d6bc76389 100644
--- a/fs/xfs/scrub/metapath.c
+++ b/fs/xfs/scrub/metapath.c
@@ -196,36 +196,45 @@ xchk_setup_metapath_dqinode(
 	struct xfs_scrub	*sc,
 	xfs_dqtype_t		type)
 {
+	struct xfs_quotainfo	*qi = sc->mp->m_quotainfo;
 	struct xfs_trans	*tp = NULL;
 	struct xfs_inode	*dp = NULL;
 	struct xfs_inode	*ip = NULL;
-	const char		*path;
 	int			error;
 
+	if (!qi)
+		return -ENOENT;
+
+	switch (type) {
+	case XFS_DQTYPE_USER:
+		ip = qi->qi_uquotaip;
+		break;
+	case XFS_DQTYPE_GROUP:
+		ip = qi->qi_gquotaip;
+		break;
+	case XFS_DQTYPE_PROJ:
+		ip = qi->qi_pquotaip;
+		break;
+	default:
+		ASSERT(0);
+		return -EINVAL;
+	}
+	if (!ip)
+		return -ENOENT;
+
 	error = xfs_trans_alloc_empty(sc->mp, &tp);
 	if (error)
 		return error;
 
 	error = xfs_dqinode_load_parent(tp, &dp);
-	if (error)
-		goto out_cancel;
-
-	error = xfs_dqinode_load(tp, dp, type, &ip);
-	if (error)
-		goto out_dp;
-
 	xfs_trans_cancel(tp);
-	tp = NULL;
+	if (error)
+		return error;
 
-	path = kasprintf(GFP_KERNEL, "%s", xfs_dqinode_path(type));
-	error = xchk_setup_metapath_scan(sc, dp, path, ip);
+	error = xchk_setup_metapath_scan(sc, dp,
+			kstrdup(xfs_dqinode_path(type), GFP_KERNEL), ip);
 
-	xfs_irele(ip);
-out_dp:
 	xfs_irele(dp);
-out_cancel:
-	if (tp)
-		xfs_trans_cancel(tp);
 	return error;
 }
 #else


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 03/21] xfs: keep quota directory inode loaded
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
  2024-11-26  1:24   ` [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage Darrick J. Wong
  2024-11-26  1:25   ` [PATCH 02/21] xfs: metapath scrubber should use the already loaded inodes Darrick J. Wong
@ 2024-11-26  1:25   ` Darrick J. Wong
  2024-11-26  1:25   ` [PATCH 04/21] xfs: return a 64-bit block count from xfs_btree_count_blocks Darrick J. Wong
                     ` (18 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:25 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

In the same vein as the previous patch, there's no point in the metapath
scrub setup function doing a lookup on the quota metadir just so it can
validate that lookups work correctly.  Instead, retain the quota
directory inode in memory for the lifetime of the mount so that we can
check this meaningfully.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/metapath.c |   37 ++++++-------------------------------
 fs/xfs/xfs_qm.c         |   47 +++++++++++++++++++++++++----------------------
 fs/xfs/xfs_qm.h         |    1 +
 3 files changed, 32 insertions(+), 53 deletions(-)


diff --git a/fs/xfs/scrub/metapath.c b/fs/xfs/scrub/metapath.c
index 80467d6bc76389..c678cba1ffc3f7 100644
--- a/fs/xfs/scrub/metapath.c
+++ b/fs/xfs/scrub/metapath.c
@@ -171,23 +171,13 @@ static int
 xchk_setup_metapath_quotadir(
 	struct xfs_scrub	*sc)
 {
-	struct xfs_trans	*tp;
-	struct xfs_inode	*dp = NULL;
-	int			error;
+	struct xfs_quotainfo	*qi = sc->mp->m_quotainfo;
 
-	error = xfs_trans_alloc_empty(sc->mp, &tp);
-	if (error)
-		return error;
+	if (!qi || !qi->qi_dirip)
+		return -ENOENT;
 
-	error = xfs_dqinode_load_parent(tp, &dp);
-	xfs_trans_cancel(tp);
-	if (error)
-		return error;
-
-	error = xchk_setup_metapath_scan(sc, sc->mp->m_metadirip,
-			kasprintf(GFP_KERNEL, "quota"), dp);
-	xfs_irele(dp);
-	return error;
+	return xchk_setup_metapath_scan(sc, sc->mp->m_metadirip,
+			kstrdup("quota", GFP_KERNEL), qi->qi_dirip);
 }
 
 /* Scan a quota inode under the /quota directory. */
@@ -197,10 +187,7 @@ xchk_setup_metapath_dqinode(
 	xfs_dqtype_t		type)
 {
 	struct xfs_quotainfo	*qi = sc->mp->m_quotainfo;
-	struct xfs_trans	*tp = NULL;
-	struct xfs_inode	*dp = NULL;
 	struct xfs_inode	*ip = NULL;
-	int			error;
 
 	if (!qi)
 		return -ENOENT;
@@ -222,20 +209,8 @@ xchk_setup_metapath_dqinode(
 	if (!ip)
 		return -ENOENT;
 
-	error = xfs_trans_alloc_empty(sc->mp, &tp);
-	if (error)
-		return error;
-
-	error = xfs_dqinode_load_parent(tp, &dp);
-	xfs_trans_cancel(tp);
-	if (error)
-		return error;
-
-	error = xchk_setup_metapath_scan(sc, dp,
+	return xchk_setup_metapath_scan(sc, qi->qi_dirip,
 			kstrdup(xfs_dqinode_path(type), GFP_KERNEL), ip);
-
-	xfs_irele(dp);
-	return error;
 }
 #else
 # define xchk_setup_metapath_quotadir(...)	(-ENOENT)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index b928b036990bc3..a4fa21dfd6b4ad 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -241,6 +241,10 @@ xfs_qm_destroy_quotainos(
 		xfs_irele(qi->qi_pquotaip);
 		qi->qi_pquotaip = NULL;
 	}
+	if (qi->qi_dirip) {
+		xfs_irele(qi->qi_dirip);
+		qi->qi_dirip = NULL;
+	}
 }
 
 /*
@@ -648,8 +652,7 @@ xfs_qm_init_timelimits(
 static int
 xfs_qm_load_metadir_qinos(
 	struct xfs_mount	*mp,
-	struct xfs_quotainfo	*qi,
-	struct xfs_inode	**dpp)
+	struct xfs_quotainfo	*qi)
 {
 	struct xfs_trans	*tp;
 	int			error;
@@ -658,7 +661,7 @@ xfs_qm_load_metadir_qinos(
 	if (error)
 		return error;
 
-	error = xfs_dqinode_load_parent(tp, dpp);
+	error = xfs_dqinode_load_parent(tp, &qi->qi_dirip);
 	if (error == -ENOENT) {
 		/* no quota dir directory, but we'll create one later */
 		error = 0;
@@ -668,21 +671,21 @@ xfs_qm_load_metadir_qinos(
 		goto out_trans;
 
 	if (XFS_IS_UQUOTA_ON(mp)) {
-		error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_USER,
+		error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_USER,
 				&qi->qi_uquotaip);
 		if (error && error != -ENOENT)
 			goto out_trans;
 	}
 
 	if (XFS_IS_GQUOTA_ON(mp)) {
-		error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_GROUP,
+		error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_GROUP,
 				&qi->qi_gquotaip);
 		if (error && error != -ENOENT)
 			goto out_trans;
 	}
 
 	if (XFS_IS_PQUOTA_ON(mp)) {
-		error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_PROJ,
+		error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_PROJ,
 				&qi->qi_pquotaip);
 		if (error && error != -ENOENT)
 			goto out_trans;
@@ -698,34 +701,33 @@ xfs_qm_load_metadir_qinos(
 STATIC int
 xfs_qm_create_metadir_qinos(
 	struct xfs_mount	*mp,
-	struct xfs_quotainfo	*qi,
-	struct xfs_inode	**dpp)
+	struct xfs_quotainfo	*qi)
 {
 	int			error;
 
-	if (!*dpp) {
-		error = xfs_dqinode_mkdir_parent(mp, dpp);
+	if (!qi->qi_dirip) {
+		error = xfs_dqinode_mkdir_parent(mp, &qi->qi_dirip);
 		if (error && error != -EEXIST)
 			return error;
 	}
 
 	if (XFS_IS_UQUOTA_ON(mp) && !qi->qi_uquotaip) {
-		error = xfs_dqinode_metadir_create(*dpp, XFS_DQTYPE_USER,
-				&qi->qi_uquotaip);
+		error = xfs_dqinode_metadir_create(qi->qi_dirip,
+				XFS_DQTYPE_USER, &qi->qi_uquotaip);
 		if (error)
 			return error;
 	}
 
 	if (XFS_IS_GQUOTA_ON(mp) && !qi->qi_gquotaip) {
-		error = xfs_dqinode_metadir_create(*dpp, XFS_DQTYPE_GROUP,
-				&qi->qi_gquotaip);
+		error = xfs_dqinode_metadir_create(qi->qi_dirip,
+				XFS_DQTYPE_GROUP, &qi->qi_gquotaip);
 		if (error)
 			return error;
 	}
 
 	if (XFS_IS_PQUOTA_ON(mp) && !qi->qi_pquotaip) {
-		error = xfs_dqinode_metadir_create(*dpp, XFS_DQTYPE_PROJ,
-				&qi->qi_pquotaip);
+		error = xfs_dqinode_metadir_create(qi->qi_dirip,
+				XFS_DQTYPE_PROJ, &qi->qi_pquotaip);
 		if (error)
 			return error;
 	}
@@ -770,7 +772,6 @@ xfs_qm_init_metadir_qinos(
 	struct xfs_mount	*mp)
 {
 	struct xfs_quotainfo	*qi = mp->m_quotainfo;
-	struct xfs_inode	*dp = NULL;
 	int			error;
 
 	if (!xfs_has_quota(mp)) {
@@ -779,20 +780,22 @@ xfs_qm_init_metadir_qinos(
 			return error;
 	}
 
-	error = xfs_qm_load_metadir_qinos(mp, qi, &dp);
+	error = xfs_qm_load_metadir_qinos(mp, qi);
 	if (error)
 		goto out_err;
 
-	error = xfs_qm_create_metadir_qinos(mp, qi, &dp);
+	error = xfs_qm_create_metadir_qinos(mp, qi);
 	if (error)
 		goto out_err;
 
-	xfs_irele(dp);
+	/* The only user of the quota dir inode is online fsck */
+#if !IS_ENABLED(CONFIG_XFS_ONLINE_SCRUB)
+	xfs_irele(qi->qi_dirip);
+	qi->qi_dirip = NULL;
+#endif
 	return 0;
 out_err:
 	xfs_qm_destroy_quotainos(mp->m_quotainfo);
-	if (dp)
-		xfs_irele(dp);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index e919c7f62f5780..35b64bc3a7a867 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -55,6 +55,7 @@ struct xfs_quotainfo {
 	struct xfs_inode	*qi_uquotaip;	/* user quota inode */
 	struct xfs_inode	*qi_gquotaip;	/* group quota inode */
 	struct xfs_inode	*qi_pquotaip;	/* project quota inode */
+	struct xfs_inode	*qi_dirip;	/* quota metadir */
 	struct list_lru		qi_lru;
 	int			qi_dquots;
 	struct mutex		qi_quotaofflock;/* to serialize quotaoff */


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 04/21] xfs: return a 64-bit block count from xfs_btree_count_blocks
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (2 preceding siblings ...)
  2024-11-26  1:25   ` [PATCH 03/21] xfs: keep quota directory inode loaded Darrick J. Wong
@ 2024-11-26  1:25   ` Darrick J. Wong
  2024-11-26  1:26   ` [PATCH 05/21] xfs: don't drop errno values when we fail to ficlone the entire range Darrick J. Wong
                     ` (17 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:25 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

With the nrext64 feature enabled, it's possible for a data fork to have
2^48 extent mappings.  Even with a 64k fsblock size, that maps out to
a bmbt containing more than 2^32 blocks.  Therefore, this predicate must
return a u64 count to avoid an integer wraparound that will cause scrub
to do the wrong thing.

It's unlikely that any such filesystem currently exists, because the
incore bmbt would consume more than 64GB of kernel memory on its own,
and so far nobody except me has driven a filesystem that far, judging
from the lack of complaints.

Cc: <stable@vger.kernel.org> # v5.19
Fixes: df9ad5cc7a5240 ("xfs: Introduce macros to represent new maximum extent counts for data/attr forks")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_btree.c        |    4 ++--
 fs/xfs/libxfs/xfs_btree.h        |    2 +-
 fs/xfs/libxfs/xfs_ialloc_btree.c |    4 +++-
 fs/xfs/scrub/agheader.c          |    6 +++---
 fs/xfs/scrub/agheader_repair.c   |    6 +++---
 fs/xfs/scrub/fscounters.c        |    2 +-
 fs/xfs/scrub/ialloc.c            |    4 ++--
 fs/xfs/scrub/refcount.c          |    2 +-
 fs/xfs/xfs_bmap_util.c           |    2 +-
 9 files changed, 17 insertions(+), 15 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2b5fc5fd16435d..c748866ef92368 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -5144,7 +5144,7 @@ xfs_btree_count_blocks_helper(
 	int			level,
 	void			*data)
 {
-	xfs_extlen_t		*blocks = data;
+	xfs_filblks_t		*blocks = data;
 	(*blocks)++;
 
 	return 0;
@@ -5154,7 +5154,7 @@ xfs_btree_count_blocks_helper(
 int
 xfs_btree_count_blocks(
 	struct xfs_btree_cur	*cur,
-	xfs_extlen_t		*blocks)
+	xfs_filblks_t		*blocks)
 {
 	*blocks = 0;
 	return xfs_btree_visit_blocks(cur, xfs_btree_count_blocks_helper,
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
index 3b739459ebb0f4..c5bff273cae255 100644
--- a/fs/xfs/libxfs/xfs_btree.h
+++ b/fs/xfs/libxfs/xfs_btree.h
@@ -484,7 +484,7 @@ typedef int (*xfs_btree_visit_blocks_fn)(struct xfs_btree_cur *cur, int level,
 int xfs_btree_visit_blocks(struct xfs_btree_cur *cur,
 		xfs_btree_visit_blocks_fn fn, unsigned int flags, void *data);
 
-int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_extlen_t *blocks);
+int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_filblks_t *blocks);
 
 union xfs_btree_rec *xfs_btree_rec_addr(struct xfs_btree_cur *cur, int n,
 		struct xfs_btree_block *block);
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index 9b34896dd1a32f..6f270d8f4270cb 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -744,6 +744,7 @@ xfs_finobt_count_blocks(
 {
 	struct xfs_buf		*agbp = NULL;
 	struct xfs_btree_cur	*cur;
+	xfs_filblks_t		blocks;
 	int			error;
 
 	error = xfs_ialloc_read_agi(pag, tp, 0, &agbp);
@@ -751,9 +752,10 @@ xfs_finobt_count_blocks(
 		return error;
 
 	cur = xfs_finobt_init_cursor(pag, tp, agbp);
-	error = xfs_btree_count_blocks(cur, tree_blocks);
+	error = xfs_btree_count_blocks(cur, &blocks);
 	xfs_btree_del_cursor(cur, error);
 	xfs_trans_brelse(tp, agbp);
+	*tree_blocks = blocks;
 
 	return error;
 }
diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c
index 61f80a6410c738..1d41b85478da9d 100644
--- a/fs/xfs/scrub/agheader.c
+++ b/fs/xfs/scrub/agheader.c
@@ -458,7 +458,7 @@ xchk_agf_xref_btreeblks(
 {
 	struct xfs_agf		*agf = sc->sa.agf_bp->b_addr;
 	struct xfs_mount	*mp = sc->mp;
-	xfs_agblock_t		blocks;
+	xfs_filblks_t		blocks;
 	xfs_agblock_t		btreeblks;
 	int			error;
 
@@ -507,7 +507,7 @@ xchk_agf_xref_refcblks(
 	struct xfs_scrub	*sc)
 {
 	struct xfs_agf		*agf = sc->sa.agf_bp->b_addr;
-	xfs_agblock_t		blocks;
+	xfs_filblks_t		blocks;
 	int			error;
 
 	if (!sc->sa.refc_cur)
@@ -840,7 +840,7 @@ xchk_agi_xref_fiblocks(
 	struct xfs_scrub	*sc)
 {
 	struct xfs_agi		*agi = sc->sa.agi_bp->b_addr;
-	xfs_agblock_t		blocks;
+	xfs_filblks_t		blocks;
 	int			error = 0;
 
 	if (!xfs_has_inobtcounts(sc->mp))
diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
index 0fad0baaba2f69..b45d2b32051a63 100644
--- a/fs/xfs/scrub/agheader_repair.c
+++ b/fs/xfs/scrub/agheader_repair.c
@@ -256,7 +256,7 @@ xrep_agf_calc_from_btrees(
 	struct xfs_agf		*agf = agf_bp->b_addr;
 	struct xfs_mount	*mp = sc->mp;
 	xfs_agblock_t		btreeblks;
-	xfs_agblock_t		blocks;
+	xfs_filblks_t		blocks;
 	int			error;
 
 	/* Update the AGF counters from the bnobt. */
@@ -946,7 +946,7 @@ xrep_agi_calc_from_btrees(
 	if (error)
 		goto err;
 	if (xfs_has_inobtcounts(mp)) {
-		xfs_agblock_t	blocks;
+		xfs_filblks_t	blocks;
 
 		error = xfs_btree_count_blocks(cur, &blocks);
 		if (error)
@@ -959,7 +959,7 @@ xrep_agi_calc_from_btrees(
 	agi->agi_freecount = cpu_to_be32(freecount);
 
 	if (xfs_has_finobt(mp) && xfs_has_inobtcounts(mp)) {
-		xfs_agblock_t	blocks;
+		xfs_filblks_t	blocks;
 
 		cur = xfs_finobt_init_cursor(sc->sa.pag, sc->tp, agi_bp);
 		error = xfs_btree_count_blocks(cur, &blocks);
diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
index 4a50f8e0004092..ca23cf4db6c5ef 100644
--- a/fs/xfs/scrub/fscounters.c
+++ b/fs/xfs/scrub/fscounters.c
@@ -261,7 +261,7 @@ xchk_fscount_btreeblks(
 	struct xchk_fscounters	*fsc,
 	xfs_agnumber_t		agno)
 {
-	xfs_extlen_t		blocks;
+	xfs_filblks_t		blocks;
 	int			error;
 
 	error = xchk_ag_init_existing(sc, agno, &sc->sa);
diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
index abad54c3621d44..4dc7c83dc08a40 100644
--- a/fs/xfs/scrub/ialloc.c
+++ b/fs/xfs/scrub/ialloc.c
@@ -650,8 +650,8 @@ xchk_iallocbt_xref_rmap_btreeblks(
 	struct xfs_scrub	*sc)
 {
 	xfs_filblks_t		blocks;
-	xfs_extlen_t		inobt_blocks = 0;
-	xfs_extlen_t		finobt_blocks = 0;
+	xfs_filblks_t		inobt_blocks = 0;
+	xfs_filblks_t		finobt_blocks = 0;
 	int			error;
 
 	if (!sc->sa.ino_cur || !sc->sa.rmap_cur ||
diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
index 2b6be75e942415..1c5e45cc64190c 100644
--- a/fs/xfs/scrub/refcount.c
+++ b/fs/xfs/scrub/refcount.c
@@ -491,7 +491,7 @@ xchk_refcount_xref_rmap(
 	struct xfs_scrub	*sc,
 	xfs_filblks_t		cow_blocks)
 {
-	xfs_extlen_t		refcbt_blocks = 0;
+	xfs_filblks_t		refcbt_blocks = 0;
 	xfs_filblks_t		blocks;
 	int			error;
 
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index a59bbe767a7dc4..0836fea2d6d814 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -103,7 +103,7 @@ xfs_bmap_count_blocks(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, whichfork);
 	struct xfs_btree_cur	*cur;
-	xfs_extlen_t		btblocks = 0;
+	xfs_filblks_t		btblocks = 0;
 	int			error;
 
 	*nextents = 0;


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 05/21] xfs: don't drop errno values when we fail to ficlone the entire range
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (3 preceding siblings ...)
  2024-11-26  1:25   ` [PATCH 04/21] xfs: return a 64-bit block count from xfs_btree_count_blocks Darrick J. Wong
@ 2024-11-26  1:26   ` Darrick J. Wong
  2024-11-26  1:26   ` [PATCH 06/21] xfs: separate healthy clearing mask during repair Darrick J. Wong
                     ` (16 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Way back when we first implemented FICLONE for XFS, life was simple --
either the the entire remapping completed, or something happened and we
had to return an errno explaining what happened.  Neither of those
ioctls support returning partial results, so it's all or nothing.

Then things got complicated when copy_file_range came along, because it
actually can return the number of bytes copied, so commit 3f68c1f562f1e4
tried to make it so that we could return a partial result if the
REMAP_FILE_CAN_SHORTEN flag is set.  This is also how FIDEDUPERANGE can
indicate that the kernel performed a partial deduplication.

Unfortunately, the logic is wrong if an error stops the remapping and
CAN_SHORTEN is not set.  Because those callers cannot return partial
results, it is an error for ->remap_file_range to return a positive
quantity that is less than the @len passed in.  Implementations really
should be returning a negative errno in this case, because that's what
btrfs (which introduced FICLONE{,RANGE}) did.

Therefore, ->remap_range implementations cannot silently drop an errno
that they might have when the number of bytes remapped is less than the
number of bytes requested and CAN_SHORTEN is not set.

Found by running generic/562 on a 64k fsblock filesystem and wondering
why it reported corrupt files.

Cc: <stable@vger.kernel.org> # v4.20
Fixes: 3fc9f5e409319e ("xfs: remove xfs_reflink_remap_range")
Really-Fixes: 3f68c1f562f1e4 ("xfs: support returning partial reflink results")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index c6de6b865ef11c..73562ff1c956f0 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1228,6 +1228,14 @@ xfs_file_remap_range(
 	xfs_iunlock2_remapping(src, dest);
 	if (ret)
 		trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_);
+	/*
+	 * If the caller did not set CAN_SHORTEN, then it is not prepared to
+	 * handle partial results -- either the whole remap succeeds, or we
+	 * must say why it did not.  In this case, any error should be returned
+	 * to the caller.
+	 */
+	if (ret && remapped < len && !(remap_flags & REMAP_FILE_CAN_SHORTEN))
+		return ret;
 	return remapped > 0 ? remapped : ret;
 }

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 06/21] xfs: separate healthy clearing mask during repair
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (4 preceding siblings ...)
  2024-11-26  1:26   ` [PATCH 05/21] xfs: don't drop errno values when we fail to ficlone the entire range Darrick J. Wong
@ 2024-11-26  1:26   ` Darrick J. Wong
  2024-11-26  1:26   ` [PATCH 07/21] xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink Darrick J. Wong
                     ` (15 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

In commit d9041681dd2f53 we introduced some XFS_SICK_*ZAPPED flags so
that the inode record repair code could clean up a damaged inode record
enough to iget the inode but still be able to remember that the higher
level repair code needs to be called.  As part of that, we introduced a
xchk_mark_healthy_if_clean helper that is supposed to cause the ZAPPED
state to be removed if that higher level metadata actually checks out.
This was done by setting additional bits in sick_mask hoping that
xchk_update_health will clear all those bits after a healthy scrub.

Unfortunately, that's not quite what sick_mask means -- bits in that
mask are indeed cleared if the metadata is healthy, but they're set if
the metadata is NOT healthy.  fsck is only intended to set the ZAPPED
bits explicitly.

If something else sets the CORRUPT/XCORRUPT state after the
xchk_mark_healthy_if_clean call, we end up marking the metadata zapped.
This can happen if the following sequence happens:

1. Scrub runs, discovers that the metadata is fine but could be
   optimized and calls xchk_mark_healthy_if_clean on a ZAPPED flag.
   That causes the ZAPPED flag to be set in sick_mask because the
   metadata is not CORRUPT or XCORRUPT.

2. Repair runs to optimize the metadata.

3. Some other metadata used for cross-referencing in (1) becomes
   corrupt.

4. Post-repair scrub runs, but this time it sets CORRUPT or XCORRUPT due
   to the events in (3).

5. Now the xchk_health_update sets the ZAPPED flag on the metadata we
   just repaired.  This is not the correct state.

Fix this by moving the "if healthy" mask to a separate field, and only
ever using it to clear the sick state.

Cc: <stable@vger.kernel.org> # v6.8
Fixes: d9041681dd2f53 ("xfs: set inode sick state flags when we zap either ondisk fork")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/health.c |   57 ++++++++++++++++++++++++++++---------------------
 fs/xfs/scrub/scrub.h  |    6 +++++
 2 files changed, 39 insertions(+), 24 deletions(-)


diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c
index ce86bdad37fa42..ccc6ca5934ca6a 100644
--- a/fs/xfs/scrub/health.c
+++ b/fs/xfs/scrub/health.c
@@ -71,7 +71,8 @@
 /* Map our scrub type to a sick mask and a set of health update functions. */
 
 enum xchk_health_group {
-	XHG_FS = 1,
+	XHG_NONE = 1,
+	XHG_FS,
 	XHG_AG,
 	XHG_INO,
 	XHG_RTGROUP,
@@ -83,6 +84,7 @@ struct xchk_health_map {
 };
 
 static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = {
+	[XFS_SCRUB_TYPE_PROBE]		= { XHG_NONE,  0 },
 	[XFS_SCRUB_TYPE_SB]		= { XHG_AG,  XFS_SICK_AG_SB },
 	[XFS_SCRUB_TYPE_AGF]		= { XHG_AG,  XFS_SICK_AG_AGF },
 	[XFS_SCRUB_TYPE_AGFL]		= { XHG_AG,  XFS_SICK_AG_AGFL },
@@ -133,7 +135,7 @@ xchk_mark_healthy_if_clean(
 {
 	if (!(sc->sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
 				  XFS_SCRUB_OFLAG_XCORRUPT)))
-		sc->sick_mask |= mask;
+		sc->healthy_mask |= mask;
 }
 
 /*
@@ -189,6 +191,7 @@ xchk_update_health(
 {
 	struct xfs_perag	*pag;
 	struct xfs_rtgroup	*rtg;
+	unsigned int		mask = sc->sick_mask;
 	bool			bad;
 
 	/*
@@ -203,50 +206,56 @@ xchk_update_health(
 		return;
 	}
 
-	if (!sc->sick_mask)
-		return;
-
 	bad = (sc->sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT |
 				   XFS_SCRUB_OFLAG_XCORRUPT));
+	if (!bad)
+		mask |= sc->healthy_mask;
 	switch (type_to_health_flag[sc->sm->sm_type].group) {
+	case XHG_NONE:
+		break;
 	case XHG_AG:
+		if (!mask)
+			return;
 		pag = xfs_perag_get(sc->mp, sc->sm->sm_agno);
 		if (bad)
-			xfs_group_mark_corrupt(pag_group(pag), sc->sick_mask);
+			xfs_group_mark_corrupt(pag_group(pag), mask);
 		else
-			xfs_group_mark_healthy(pag_group(pag), sc->sick_mask);
+			xfs_group_mark_healthy(pag_group(pag), mask);
 		xfs_perag_put(pag);
 		break;
 	case XHG_INO:
 		if (!sc->ip)
 			return;
-		if (bad) {
-			unsigned int	mask = sc->sick_mask;
-
-			/*
-			 * If we're coming in for repairs then we don't want
-			 * sickness flags to propagate to the incore health
-			 * status if the inode gets inactivated before we can
-			 * fix it.
-			 */
-			if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)
-				mask |= XFS_SICK_INO_FORGET;
+		/*
+		 * If we're coming in for repairs then we don't want sickness
+		 * flags to propagate to the incore health status if the inode
+		 * gets inactivated before we can fix it.
+		 */
+		if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)
+			mask |= XFS_SICK_INO_FORGET;
+		if (!mask)
+			return;
+		if (bad)
 			xfs_inode_mark_corrupt(sc->ip, mask);
-		} else
-			xfs_inode_mark_healthy(sc->ip, sc->sick_mask);
+		else
+			xfs_inode_mark_healthy(sc->ip, mask);
 		break;
 	case XHG_FS:
+		if (!mask)
+			return;
 		if (bad)
-			xfs_fs_mark_corrupt(sc->mp, sc->sick_mask);
+			xfs_fs_mark_corrupt(sc->mp, mask);
 		else
-			xfs_fs_mark_healthy(sc->mp, sc->sick_mask);
+			xfs_fs_mark_healthy(sc->mp, mask);
 		break;
 	case XHG_RTGROUP:
+		if (!mask)
+			return;
 		rtg = xfs_rtgroup_get(sc->mp, sc->sm->sm_agno);
 		if (bad)
-			xfs_group_mark_corrupt(rtg_group(rtg), sc->sick_mask);
+			xfs_group_mark_corrupt(rtg_group(rtg), mask);
 		else
-			xfs_group_mark_healthy(rtg_group(rtg), sc->sick_mask);
+			xfs_group_mark_healthy(rtg_group(rtg), mask);
 		xfs_rtgroup_put(rtg);
 		break;
 	default:
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index a7fda3e2b01377..5dbbe93cb49bfa 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -184,6 +184,12 @@ struct xfs_scrub {
 	 */
 	unsigned int			sick_mask;
 
+	/*
+	 * Clear these XFS_SICK_* flags but only if the scan is ok.  Useful for
+	 * removing ZAPPED flags after a repair.
+	 */
+	unsigned int			healthy_mask;
+
 	/* next time we want to cond_resched() */
 	struct xchk_relax		relax;
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 07/21] xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (5 preceding siblings ...)
  2024-11-26  1:26   ` [PATCH 06/21] xfs: separate healthy clearing mask during repair Darrick J. Wong
@ 2024-11-26  1:26   ` Darrick J. Wong
  2024-11-26  1:26   ` [PATCH 08/21] xfs: mark metadir repair tempfiles with IRECOVERY Darrick J. Wong
                     ` (14 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If we need to reset a symlink target to the "durr it's busted" string,
then we clear the zapped flag as well.  However, this should be using
the provided helper so that we don't set the zapped state on an
otherwise ok symlink.

Cc: <stable@vger.kernel.org> # v6.10
Fixes: 2651923d8d8db0 ("xfs: online repair of symbolic links")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/symlink_repair.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/symlink_repair.c b/fs/xfs/scrub/symlink_repair.c
index d015a86ef460fb..953ce7be78dc2f 100644
--- a/fs/xfs/scrub/symlink_repair.c
+++ b/fs/xfs/scrub/symlink_repair.c
@@ -36,6 +36,7 @@
 #include "scrub/tempfile.h"
 #include "scrub/tempexch.h"
 #include "scrub/reap.h"
+#include "scrub/health.h"
 
 /*
  * Symbolic Link Repair
@@ -233,7 +234,7 @@ xrep_symlink_salvage(
 	 * target zapped flag.
 	 */
 	if (buflen == 0) {
-		sc->sick_mask |= XFS_SICK_INO_SYMLINK_ZAPPED;
+		xchk_mark_healthy_if_clean(sc, XFS_SICK_INO_SYMLINK_ZAPPED);
 		sprintf(target_buf, DUMMY_TARGET);
 	}
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 08/21] xfs: mark metadir repair tempfiles with IRECOVERY
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (6 preceding siblings ...)
  2024-11-26  1:26   ` [PATCH 07/21] xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink Darrick J. Wong
@ 2024-11-26  1:26   ` Darrick J. Wong
  2024-11-26  1:27   ` [PATCH 09/21] xfs: fix null bno_hint handling in xfs_rtallocate_rtg Darrick J. Wong
                     ` (13 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Once in a long while, xfs/566 and xfs/801 report directory corruption in
one of the metadata subdirectories while it's forcibly rebuilding all
filesystem metadata.  I observed the following sequence of events:

1. Initiate a repair of the parent pointers for the /quota/user file.
   This is the secret file containing user quota data.

2. The pptr repair thread creates a temporary file and begins staging
   parent pointers in the ondisk metadata in preparation for an
   exchange-range to commit the new pptr data.

3. At the same time, initiate a repair of the /quota directory itself.

4. The dir repair thread finds the temporary file from (2), scans it for
   parent pointers, and stages a dirent in its own temporary dir in
   preparation to commit the fixed directory.

5. The parent pointer repair completes and frees the temporary file.

6. The dir repair commits the new directory and scans it again.  It
   finds the dirent that points to the old temporary file in (2) and
   marks the directory corrupt.

Oops!  Repair code must never scan the temporary files that other repair
functions create to stage new metadata.  They're not supposed to do
that, but the predicate function xrep_is_tempfile is incorrect because
it assumes that any XFS_DIFLAG2_METADATA file cannot ever be a temporary
file, but xrep_tempfile_adjust_directory_tree creates exactly that.

Fix this by setting the IRECOVERY flag on temporary metadata directory
inodes and using that to correct the predicate.  Repair code is supposed
to erase all the data in temporary files before releasing them, so it's
ok if a thread scans the temporary file after we drop IRECOVERY.

Fixes: bb6cdd5529ff67 ("xfs: hide metadata inodes from everyone because they are special")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/scrub/tempfile.c |   10 ++++++++--
 fs/xfs/xfs_inode.h      |    2 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/scrub/tempfile.c b/fs/xfs/scrub/tempfile.c
index 4b7f7860e37ece..dc3802c7f678ce 100644
--- a/fs/xfs/scrub/tempfile.c
+++ b/fs/xfs/scrub/tempfile.c
@@ -223,6 +223,7 @@ xrep_tempfile_adjust_directory_tree(
 	if (error)
 		goto out_ilock;

+	xfs_iflags_set(sc->tempip, XFS_IRECOVERY);
 	xfs_qm_dqdetach(sc->tempip);
 out_ilock:
 	xrep_tempfile_iunlock(sc);
@@ -246,6 +247,8 @@ xrep_tempfile_remove_metadir(

 	ASSERT(sc->tp == NULL);

+	xfs_iflags_clear(sc->tempip, XFS_IRECOVERY);
+
 	xfs_ilock(sc->tempip, XFS_IOLOCK_EXCL);
 	sc->temp_ilock_flags |= XFS_IOLOCK_EXCL;

@@ -945,10 +948,13 @@ xrep_is_tempfile(

 	/*
 	 * Files in the metadata directory tree also have S_PRIVATE set and
-	 * IOP_XATTR unset, so we must distinguish them separately.
+	 * IOP_XATTR unset, so we must distinguish them separately.  We (ab)use
+	 * the IRECOVERY flag to mark temporary metadir inodes knowing that the
+	 * end of log recovery clears IRECOVERY, so the only ones that can
+	 * exist during online repair are the ones we create.
 	 */
 	if (xfs_has_metadir(mp) && (ip->i_diflags2 & XFS_DIFLAG2_METADATA))
-		return false;
+		return __xfs_iflags_test(ip, XFS_IRECOVERY);

 	if (IS_PRIVATE(inode) && !(inode->i_opflags & IOP_XATTR))
 		return true;
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 2a4485fb990846..bd6b37beabacdd 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -231,7 +231,7 @@ xfs_iflags_clear(xfs_inode_t *ip, unsigned long flags)
 }

 static inline int
-__xfs_iflags_test(xfs_inode_t *ip, unsigned long flags)
+__xfs_iflags_test(const struct xfs_inode *ip, unsigned long flags)
 {
 	return (ip->i_flags & flags);
 }

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 09/21] xfs: fix null bno_hint handling in xfs_rtallocate_rtg
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (7 preceding siblings ...)
  2024-11-26  1:26   ` [PATCH 08/21] xfs: mark metadir repair tempfiles with IRECOVERY Darrick J. Wong
@ 2024-11-26  1:27   ` Darrick J. Wong
  2024-11-26  1:27   ` [PATCH 10/21] xfs: fix error bailout in xfs_rtginode_create Darrick J. Wong
                     ` (12 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: stable, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

xfs_bmap_rtalloc initializes the bno_hint variable to NULLRTBLOCK (aka
NULLFSBLOCK).  If the allocation request is for a file range that's
adjacent to an existing mapping, it will then change bno_hint to the
blkno hint in the bmalloca structure.

In other words, bno_hint is either a rt block number, or it's all 1s.
Unfortunately, commit ec12f97f1b8a8f didn't take the NULLRTBLOCK state
into account, which means that it tries to translate that into a
realtime extent number.  We then end up with an obnoxiously high rtx
number and pointlessly feed that to the near allocator.  This often
fails and falls back to the by-size allocator.  Seeing as we had no
locality hint anyway, this is a waste of time.

Fix the code to detect a lack of bno_hint correctly.  This was detected
by running xfs/009 with metadir enabled and a 28k rt extent size.

Cc: <stable@vger.kernel.org> # v6.12
Fixes: ec12f97f1b8a8f ("xfs: make the rtalloc start hint a xfs_rtblock_t")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_rtalloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 0cb534d71119a5..fcfa6e0eb3ad2a 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1827,7 +1827,7 @@ xfs_rtallocate_rtg(
 	 * For an allocation to an empty file at offset 0, pick an extent that
 	 * will space things out in the rt area.
 	 */
-	if (bno_hint)
+	if (bno_hint != NULLFSBLOCK)
 		start = xfs_rtb_to_rtx(args.mp, bno_hint);
 	else if (!xfs_has_rtgroups(args.mp) && initial_user_data)
 		start = xfs_rtpick_extent(args.rtg, tp, maxlen);

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 10/21] xfs: fix error bailout in xfs_rtginode_create
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (8 preceding siblings ...)
  2024-11-26  1:27   ` [PATCH 09/21] xfs: fix null bno_hint handling in xfs_rtallocate_rtg Darrick J. Wong
@ 2024-11-26  1:27   ` Darrick J. Wong
  2024-11-26  1:27   ` [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block Darrick J. Wong
                     ` (11 subsequent siblings)
  21 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: dan.carpenter, hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

smatch reported that we screwed up the error cleanup in this function.
Fix it.

Fixes: ae897e0bed0f54 ("xfs: support creating per-RTG files in growfs")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_rtgroup.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c
index e74bb059f24fa1..4f3bfc884aff29 100644
--- a/fs/xfs/libxfs/xfs_rtgroup.c
+++ b/fs/xfs/libxfs/xfs_rtgroup.c
@@ -496,7 +496,7 @@ xfs_rtginode_create(
 
 	error = xfs_metadir_create(&upd, S_IFREG);
 	if (error)
-		return error;
+		goto out_cancel;
 
 	xfs_rtginode_lockdep_setup(upd.ip, rtg_rgno(rtg), type);
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (9 preceding siblings ...)
  2024-11-26  1:27   ` [PATCH 10/21] xfs: fix error bailout in xfs_rtginode_create Darrick J. Wong
@ 2024-11-26  1:27   ` Darrick J. Wong
  2024-11-26  5:01     ` Christoph Hellwig
  2024-11-26  1:27   ` [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved Darrick J. Wong
                     ` (10 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec
would erroneously try to update the parent's key for a block that had
been split if we decided to insert the new record into the new block.
The solution was to detect this situation and update the in-core key
value that we pass up to the caller so that the caller will (eventually)
add the new block to the parent level of the tree with the correct key.

However, I missed a subtlety about the way inode-rooted btrees work.  If
the full block was a maximally sized inode root block, we'll solve that
fullness by moving the root block's records to a new block, resizing the
root block, and updating the root to point to the new block.  We don't
pass a pointer to the new block to the caller because that work has
already been done.  The new record will /always/ land in the new block,
so in this case we need to use xfs_btree_update_keys to update the keys.

This bug can theoretically manifest itself in the very rare case that we
split a bmbt root block and the new record lands in the very first slot
of the new block, though I've never managed to trigger it in practice.
However, it is very easy to reproduce by running generic/522 with the
realtime rmapbt patchset if rtinherit=1.

Cc: <stable@vger.kernel.org> # v4.8
Fixes: 2c813ad66a7218 ("xfs: support btrees with overlapping intervals for keys")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_btree.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index c748866ef92368..68ee1c299c25fd 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -3557,14 +3557,31 @@ xfs_btree_insrec(
 	xfs_btree_log_block(cur, bp, XFS_BB_NUMRECS);
 
 	/*
-	 * If we just inserted into a new tree block, we have to
-	 * recalculate nkey here because nkey is out of date.
+	 * Update btree keys to reflect the newly added record or keyptr.
+	 * There are three cases here to be aware of.  Normally, all we have to
+	 * do is walk towards the root, updating keys as necessary.
 	 *
-	 * Otherwise we're just updating an existing block (having shoved
-	 * some records into the new tree block), so use the regular key
-	 * update mechanism.
+	 * If the caller had us target a full block for the insertion, we dealt
+	 * with that by calling the _make_block_unfull function.  If the
+	 * "make unfull" function splits the block, it'll hand us back the key
+	 * and pointer of the new block.  We haven't yet added the new block to
+	 * the next level up, so if we decide to add the new record to the new
+	 * block (bp->b_bn != old_bn), we have to update the caller's pointer
+	 * so that the caller adds the new block with the correct key.
+	 *
+	 * However, there is a third possibility-- if the selected block is the
+	 * root block of an inode-rooted btree and cannot be expanded further,
+	 * the "make unfull" function moves the root block contents to a new
+	 * block and updates the root block to point to the new block.  In this
+	 * case, no block pointer is passed back because the block has already
+	 * been added to the btree.  In this case, we need to use the regular
+	 * key update function, just like the first case.  This is critical for
+	 * overlapping btrees, because the high key must be updated to reflect
+	 * the entire tree, not just the subtree accessible through the first
+	 * child of the root (which is now two levels down from the root).
 	 */
-	if (bp && xfs_buf_daddr(bp) != old_bn) {
+	if (!xfs_btree_ptr_is_null(cur, &nptr) &&
+	    bp && xfs_buf_daddr(bp) != old_bn) {
 		xfs_btree_get_keys(cur, block, lkey);
 	} else if (xfs_btree_needs_key_update(cur, optr)) {
 		error = xfs_btree_update_keys(cur, level);


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (10 preceding siblings ...)
  2024-11-26  1:27   ` [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block Darrick J. Wong
@ 2024-11-26  1:27   ` Darrick J. Wong
  2024-11-26  5:01     ` Christoph Hellwig
  2024-11-26  1:28   ` [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir Darrick J. Wong
                     ` (9 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Fix a minor mistakes in the scrub tracepoints that can manifest when
inode-rooted btrees are enabled.  The existing code worked fine for bmap
btrees, but we should tighten the code up to be less sloppy.

Cc: <stable@vger.kernel.org> # v5.7
Fixes: 92219c292af8dd ("xfs: convert btree cursor inode-private member names")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/scrub/trace.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 9b38f5ad1eaf07..d2ae7e93acb08e 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -605,7 +605,7 @@ TRACE_EVENT(xchk_ifork_btree_op_error,
 	TP_fast_assign(
 		xfs_fsblock_t fsbno = xchk_btree_cur_fsbno(cur, level);
 		__entry->dev = sc->mp->m_super->s_dev;
-		__entry->ino = sc->ip->i_ino;
+		__entry->ino = cur->bc_ino.ip->i_ino;
 		__entry->whichfork = cur->bc_ino.whichfork;
 		__entry->type = sc->sm->sm_type;
 		__assign_str(name);


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (11 preceding siblings ...)
  2024-11-26  1:27   ` [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved Darrick J. Wong
@ 2024-11-26  1:28   ` Darrick J. Wong
  2024-11-26  5:03     ` Christoph Hellwig
  2024-11-26  1:28   ` [PATCH 14/21] xfs: only run precommits once per transaction object Darrick J. Wong
                     ` (8 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Debugging a filesystem patch with generic/475 caused the system to hang
after observing the following sequences in dmesg:

 XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x491520 len 32 error 5
 XFS (dm-0): metadata I/O error in "xfs_btree_read_buf_block+0xba/0x160 [xfs]" at daddr 0x3445608 len 8 error 5
 XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x138e1c0 len 32 error 5
 XFS (dm-0): log I/O error -5
 XFS (dm-0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x1ea/0x4b0 [xfs] (fs/xfs/xfs_trans_buf.c:311).  Shutting down filesystem.
 XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
 XFS (dm-0): Internal error dqp->q_ino.reserved < dqp->q_ino.count at line 869 of file fs/xfs/xfs_trans_dquot.c.  Caller xfs_trans_dqresv+0x236/0x440 [xfs]
 XFS (dm-0): Corruption detected. Unmount and run xfs_repair
 XFS (dm-0): Unmounting Filesystem be6bcbcc-9921-4deb-8d16-7cc94e335fa7

The system is stuck in unmount trying to lock a couple of inodes so that
they can be purged.  The dquot corruption notice above is a clue to what
happened -- a link() call tried to set up a transaction to link a child
into a directory.  Quota reservation for the transaction failed after IO
errors shut down the filesystem, but then we forgot to unlock the inodes
on our way out.  Fix that.

Cc: <stable@vger.kernel.org> # v6.10
Fixes: bd5562111d5839 ("xfs: Hold inode locks in xfs_trans_alloc_dir")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |    3 +++
 1 file changed, 3 insertions(+)


diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 30fbed27cf05cc..05b18e30368e4b 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -1435,5 +1435,8 @@ xfs_trans_alloc_dir(
 
 out_cancel:
 	xfs_trans_cancel(tp);
+	xfs_iunlock(dp, XFS_ILOCK_EXCL);
+	if (dp != ip)
+		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 14/21] xfs: only run precommits once per transaction object
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (12 preceding siblings ...)
  2024-11-26  1:28   ` [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir Darrick J. Wong
@ 2024-11-26  1:28   ` Darrick J. Wong
  2024-11-26  5:09     ` Christoph Hellwig
  2024-11-26  1:28   ` [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit Darrick J. Wong
                     ` (7 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Committing a transaction tx0 with a defer ops chain of (A, B, C)
creates a chain of transactions that looks like this:

tx0 -> txA -> txB -> txC

Prior to commit cb042117488dbf, __xfs_trans_commit would run precommits
on tx0, then call xfs_defer_finish_noroll to convert A-C to tx[A-C].
Unfortunately, after the finish_noroll loop we forgot to run precommits
on txC.  That was fixed by adding the second precommit call.

Unfortunately, none of us remembered that xfs_defer_finish_noroll
calls __xfs_trans_commit a second time to commit tx0 before finishing
work A in txA and committing that.  In other words, we run precommits
twice on tx0:

xfs_trans_commit(tx0)
    __xfs_trans_commit(tx0, false)
        xfs_trans_run_precommits(tx0)
        xfs_defer_finish_noroll(tx0)
            xfs_trans_roll(tx0)
                txA = xfs_trans_dup(tx0)
                __xfs_trans_commit(tx0, true)
                xfs_trans_run_precommits(tx0)

This currently isn't an issue because the inode item precommit is
idempotent; the iunlink item precommit deletes itself so it can't be
called again; and the buffer/dquot item precommits only check the incore
objects for corruption.  However, it doesn't make sense to run
precommits twice.

Fix this situation by only running precommits after finish_noroll.

Cc: <stable@vger.kernel.org> # v6.4
Fixes: cb042117488dbf ("xfs: defered work could create precommits")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |   16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 05b18e30368e4b..4a517250efc911 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -860,13 +860,6 @@ __xfs_trans_commit(
 
 	trace_xfs_trans_commit(tp, _RET_IP_);
 
-	error = xfs_trans_run_precommits(tp);
-	if (error) {
-		if (tp->t_flags & XFS_TRANS_PERM_LOG_RES)
-			xfs_defer_cancel(tp);
-		goto out_unreserve;
-	}
-
 	/*
 	 * Finish deferred items on final commit. Only permanent transactions
 	 * should ever have deferred ops.
@@ -877,13 +870,12 @@ __xfs_trans_commit(
 		error = xfs_defer_finish_noroll(&tp);
 		if (error)
 			goto out_unreserve;
-
-		/* Run precommits from final tx in defer chain. */
-		error = xfs_trans_run_precommits(tp);
-		if (error)
-			goto out_unreserve;
 	}
 
+	error = xfs_trans_run_precommits(tp);
+	if (error)
+		goto out_unreserve;
+
 	/*
 	 * If there is nothing to be logged by the transaction,
 	 * then unlock all of the items associated with the


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (13 preceding siblings ...)
  2024-11-26  1:28   ` [PATCH 14/21] xfs: only run precommits once per transaction object Darrick J. Wong
@ 2024-11-26  1:28   ` Darrick J. Wong
  2024-11-26  5:11     ` Christoph Hellwig
  2024-11-26  1:28   ` [PATCH 16/21] xfs: don't lose solo superblock counter update transactions Darrick J. Wong
                     ` (6 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls
__xfs_trans_commit again on the same transaction.  In other words,
there's function recursion that has caused minor amounts of confusion in
the past.  There's no reason to keep this around, since there's only one
place where we actually want the xfs_defer_finish_noroll, and that is in
the top level xfs_trans_commit call.

Fixes: 98719051e75ccf ("xfs: refactor internal dfops initialization")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 4a517250efc911..26bb2343082af4 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -860,18 +860,6 @@ __xfs_trans_commit(
 
 	trace_xfs_trans_commit(tp, _RET_IP_);
 
-	/*
-	 * Finish deferred items on final commit. Only permanent transactions
-	 * should ever have deferred ops.
-	 */
-	WARN_ON_ONCE(!list_empty(&tp->t_dfops) &&
-		     !(tp->t_flags & XFS_TRANS_PERM_LOG_RES));
-	if (!regrant && (tp->t_flags & XFS_TRANS_PERM_LOG_RES)) {
-		error = xfs_defer_finish_noroll(&tp);
-		if (error)
-			goto out_unreserve;
-	}
-
 	error = xfs_trans_run_precommits(tp);
 	if (error)
 		goto out_unreserve;
@@ -950,6 +938,20 @@ int
 xfs_trans_commit(
 	struct xfs_trans	*tp)
 {
+	/*
+	 * Finish deferred items on final commit. Only permanent transactions
+	 * should ever have deferred ops.
+	 */
+	WARN_ON_ONCE(!list_empty(&tp->t_dfops) &&
+		     !(tp->t_flags & XFS_TRANS_PERM_LOG_RES));
+	if (tp->t_flags & XFS_TRANS_PERM_LOG_RES) {
+		int error = xfs_defer_finish_noroll(&tp);
+		if (error) {
+			xfs_trans_cancel(tp);
+			return error;
+		}
+	}
+
 	return __xfs_trans_commit(tp, false);
 }
 


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 16/21] xfs: don't lose solo superblock counter update transactions
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (14 preceding siblings ...)
  2024-11-26  1:28   ` [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit Darrick J. Wong
@ 2024-11-26  1:28   ` Darrick J. Wong
  2024-11-26  5:14     ` Christoph Hellwig
  2024-11-26  1:29   ` [PATCH 17/21] xfs: don't lose solo dquot " Darrick J. Wong
                     ` (5 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Superblock counter updates are tracked via per-transaction counters in
the xfs_trans object.  These changes are then turned into dirty log
items in xfs_trans_apply_sb_deltas just prior to commiting the log items
to the CIL.

However, updating the per-transaction counter deltas do not cause
XFS_TRANS_DIRTY to be set on the transaction.  In other words, a pure sb
counter update will be silently discarded if there are no other dirty
log items attached to the transaction.

This is currently not the case anywhere in the filesystem because sb
counter updates always dirty at least one other metadata item, but let's
not leave a logic bomb.

Cc: <stable@vger.kernel.org> # v2.6.35
Fixes: 0924378a689ccb ("xfs: split out iclog writing from xfs_trans_commit()")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 26bb2343082af4..427a8ba0ab99e2 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -860,6 +860,13 @@ __xfs_trans_commit(
 
 	trace_xfs_trans_commit(tp, _RET_IP_);
 
+	/*
+	 * Commit per-transaction changes that are not already tracked through
+	 * log items.  This can add dirty log items to the transaction.
+	 */
+	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
+		xfs_trans_apply_sb_deltas(tp);
+
 	error = xfs_trans_run_precommits(tp);
 	if (error)
 		goto out_unreserve;
@@ -890,8 +897,6 @@ __xfs_trans_commit(
 	/*
 	 * If we need to update the superblock, then do it now.
 	 */
-	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
-		xfs_trans_apply_sb_deltas(tp);
 	xfs_trans_apply_dquot_deltas(tp);
 
 	xlog_cil_commit(log, tp, &commit_seq, regrant);


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 17/21] xfs: don't lose solo dquot update transactions
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (15 preceding siblings ...)
  2024-11-26  1:28   ` [PATCH 16/21] xfs: don't lose solo superblock counter update transactions Darrick J. Wong
@ 2024-11-26  1:29   ` Darrick J. Wong
  2024-11-26  5:18     ` Christoph Hellwig
  2024-11-26  1:29   ` [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush Darrick J. Wong
                     ` (4 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:29 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Quota counter updates are tracked via incore objects which hang off the
xfs_trans object.  These changes are then turned into dirty log items in
xfs_trans_apply_dquot_deltas just prior to commiting the log items to
the CIL.

However, updating the incore deltas do not cause XFS_TRANS_DIRTY to be
set on the transaction.  In other words, a pure quota counter update
will be silently discarded if there are no other dirty log items
attached to the transaction.

This is currently not the case anywhere in the filesystem because quota
updates always dirty at least one other metadata item, but a subsequent
bug fix will add dquot log item precommits, so we actually need a dirty
dquot log item prior to xfs_trans_run_precommits.  Also let's not leave
a logic bomb.

Cc: <stable@vger.kernel.org> # v2.6.35
Fixes: 0924378a689ccb ("xfs: split out iclog writing from xfs_trans_commit()")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_quota.h       |    7 ++++---
 fs/xfs/xfs_trans.c       |   10 +++-------
 fs/xfs/xfs_trans_dquot.c |   31 ++++++++++++++++++++++++++-----
 3 files changed, 33 insertions(+), 15 deletions(-)


diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index fa1317cc396c96..b864ed59787780 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -101,7 +101,8 @@ extern void xfs_trans_free_dqinfo(struct xfs_trans *);
 extern void xfs_trans_mod_dquot_byino(struct xfs_trans *, struct xfs_inode *,
 		uint, int64_t);
 extern void xfs_trans_apply_dquot_deltas(struct xfs_trans *);
-extern void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *);
+void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *tp,
+		bool already_locked);
 int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp, struct xfs_inode *ip,
 		int64_t dblocks, int64_t rblocks, bool force);
 extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
@@ -172,8 +173,8 @@ static inline void xfs_trans_mod_dquot_byino(struct xfs_trans *tp,
 		struct xfs_inode *ip, uint field, int64_t delta)
 {
 }
-#define xfs_trans_apply_dquot_deltas(tp)
-#define xfs_trans_unreserve_and_mod_dquots(tp)
+#define xfs_trans_apply_dquot_deltas(tp, a)
+#define xfs_trans_unreserve_and_mod_dquots(tp, a)
 static inline int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp,
 		struct xfs_inode *ip, int64_t dblocks, int64_t rblocks,
 		bool force)
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 427a8ba0ab99e2..4cd25717c9d130 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -866,6 +866,7 @@ __xfs_trans_commit(
 	 */
 	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
 		xfs_trans_apply_sb_deltas(tp);
+	xfs_trans_apply_dquot_deltas(tp);
 
 	error = xfs_trans_run_precommits(tp);
 	if (error)
@@ -894,11 +895,6 @@ __xfs_trans_commit(
 
 	ASSERT(tp->t_ticket != NULL);
 
-	/*
-	 * If we need to update the superblock, then do it now.
-	 */
-	xfs_trans_apply_dquot_deltas(tp);
-
 	xlog_cil_commit(log, tp, &commit_seq, regrant);
 
 	xfs_trans_free(tp);
@@ -924,7 +920,7 @@ __xfs_trans_commit(
 	 * the dqinfo portion to be.  All that means is that we have some
 	 * (non-persistent) quota reservations that need to be unreserved.
 	 */
-	xfs_trans_unreserve_and_mod_dquots(tp);
+	xfs_trans_unreserve_and_mod_dquots(tp, true);
 	if (tp->t_ticket) {
 		if (regrant && !xlog_is_shutdown(log))
 			xfs_log_ticket_regrant(log, tp->t_ticket);
@@ -1018,7 +1014,7 @@ xfs_trans_cancel(
 	}
 #endif
 	xfs_trans_unreserve_and_mod_sb(tp);
-	xfs_trans_unreserve_and_mod_dquots(tp);
+	xfs_trans_unreserve_and_mod_dquots(tp, false);
 
 	if (tp->t_ticket) {
 		xfs_log_ticket_ungrant(log, tp->t_ticket);
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index 481ba3dc9f190d..713b6d243e5631 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -606,6 +606,24 @@ xfs_trans_apply_dquot_deltas(
 			ASSERT(dqp->q_blk.reserved >= dqp->q_blk.count);
 			ASSERT(dqp->q_ino.reserved >= dqp->q_ino.count);
 			ASSERT(dqp->q_rtb.reserved >= dqp->q_rtb.count);
+
+			/*
+			 * We've applied the count changes and given back
+			 * whatever reservation we didn't use.  Zero out the
+			 * dqtrx fields.
+			 */
+			qtrx->qt_blk_res = 0;
+			qtrx->qt_bcount_delta = 0;
+			qtrx->qt_delbcnt_delta = 0;
+
+			qtrx->qt_rtblk_res = 0;
+			qtrx->qt_rtblk_res_used = 0;
+			qtrx->qt_rtbcount_delta = 0;
+			qtrx->qt_delrtb_delta = 0;
+
+			qtrx->qt_ino_res = 0;
+			qtrx->qt_ino_res_used = 0;
+			qtrx->qt_icount_delta = 0;
 		}
 	}
 }
@@ -642,7 +660,8 @@ xfs_trans_unreserve_and_mod_dquots_hook(
  */
 void
 xfs_trans_unreserve_and_mod_dquots(
-	struct xfs_trans	*tp)
+	struct xfs_trans	*tp,
+	bool			already_locked)
 {
 	int			i, j;
 	struct xfs_dquot	*dqp;
@@ -671,10 +690,12 @@ xfs_trans_unreserve_and_mod_dquots(
 			 * about the number of blocks used field, or deltas.
 			 * Also we don't bother to zero the fields.
 			 */
-			locked = false;
+			locked = already_locked;
 			if (qtrx->qt_blk_res) {
-				xfs_dqlock(dqp);
-				locked = true;
+				if (!locked) {
+					xfs_dqlock(dqp);
+					locked = true;
+				}
 				dqp->q_blk.reserved -=
 					(xfs_qcnt_t)qtrx->qt_blk_res;
 			}
@@ -695,7 +716,7 @@ xfs_trans_unreserve_and_mod_dquots(
 				dqp->q_rtb.reserved -=
 					(xfs_qcnt_t)qtrx->qt_rtblk_res;
 			}
-			if (locked)
+			if (locked && !already_locked)
 				xfs_dqunlock(dqp);
 
 		}


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (16 preceding siblings ...)
  2024-11-26  1:29   ` [PATCH 17/21] xfs: don't lose solo dquot " Darrick J. Wong
@ 2024-11-26  1:29   ` Darrick J. Wong
  2024-11-26  5:27     ` Christoph Hellwig
  2024-11-26  1:29   ` [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done} Darrick J. Wong
                     ` (3 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:29 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

The first step towards holding the dquot buffer in the li_buf instead of
reading it in the AIL is to separate the part that reads the buffer from
the actual flush code.  There should be no functional changes.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_dquot.c      |   57 +++++++++++++++++++++++++++++++----------------
 fs/xfs/xfs_dquot.h      |    4 ++-
 fs/xfs/xfs_dquot_item.c |   20 +++++++++++++---
 fs/xfs/xfs_qm.c         |   37 +++++++++++++++++++++++++------
 4 files changed, 86 insertions(+), 32 deletions(-)


diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index ff982d983989b0..6ec4087e38dfc8 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1238,6 +1238,42 @@ xfs_qm_dqflush_check(
 	return NULL;
 }
 
+/*
+ * Get the buffer containing the on-disk dquot.
+ *
+ * Requires dquot flush lock, will clear the dirty flag, delete the quota log
+ * item from the AIL, and shut down the system if something goes wrong.
+ */
+int
+xfs_dquot_read_buf(
+	struct xfs_trans	*tp,
+	struct xfs_dquot	*dqp,
+	struct xfs_buf		**bpp)
+{
+	struct xfs_mount	*mp = dqp->q_mount;
+	struct xfs_buf		*bp = NULL;
+	int			error;
+
+	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
+				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
+				   &bp, &xfs_dquot_buf_ops);
+	if (error == -EAGAIN)
+		return error;
+	if (xfs_metadata_is_sick(error))
+		xfs_dquot_mark_sick(dqp);
+	if (error)
+		goto out_abort;
+
+	*bpp = bp;
+	return 0;
+
+out_abort:
+	dqp->q_flags &= ~XFS_DQFLAG_DIRTY;
+	xfs_trans_ail_delete(&dqp->q_logitem.qli_item, 0);
+	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
+	return error;
+}
+
 /*
  * Write a modified dquot to disk.
  * The dquot must be locked and the flush lock too taken by caller.
@@ -1249,11 +1285,10 @@ xfs_qm_dqflush_check(
 int
 xfs_qm_dqflush(
 	struct xfs_dquot	*dqp,
-	struct xfs_buf		**bpp)
+	struct xfs_buf		*bp)
 {
 	struct xfs_mount	*mp = dqp->q_mount;
 	struct xfs_log_item	*lip = &dqp->q_logitem.qli_item;
-	struct xfs_buf		*bp;
 	struct xfs_dqblk	*dqblk;
 	xfs_failaddr_t		fa;
 	int			error;
@@ -1263,28 +1298,12 @@ xfs_qm_dqflush(
 
 	trace_xfs_dqflush(dqp);
 
-	*bpp = NULL;
-
 	xfs_qm_dqunpin_wait(dqp);
 
-	/*
-	 * Get the buffer containing the on-disk dquot
-	 */
-	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
-				   &bp, &xfs_dquot_buf_ops);
-	if (error == -EAGAIN)
-		goto out_unlock;
-	if (xfs_metadata_is_sick(error))
-		xfs_dquot_mark_sick(dqp);
-	if (error)
-		goto out_abort;
-
 	fa = xfs_qm_dqflush_check(dqp);
 	if (fa) {
 		xfs_alert(mp, "corrupt dquot ID 0x%x in memory at %pS",
 				dqp->q_id, fa);
-		xfs_buf_relse(bp);
 		xfs_dquot_mark_sick(dqp);
 		error = -EFSCORRUPTED;
 		goto out_abort;
@@ -1334,14 +1353,12 @@ xfs_qm_dqflush(
 	}
 
 	trace_xfs_dqflush_done(dqp);
-	*bpp = bp;
 	return 0;
 
 out_abort:
 	dqp->q_flags &= ~XFS_DQFLAG_DIRTY;
 	xfs_trans_ail_delete(lip, 0);
 	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
-out_unlock:
 	xfs_dqfunlock(dqp);
 	return error;
 }
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index d73d179df00958..50f8404c41176c 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -214,7 +214,9 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp);
 #define XFS_DQ_IS_DIRTY(dqp)	((dqp)->q_flags & XFS_DQFLAG_DIRTY)
 
 void		xfs_qm_dqdestroy(struct xfs_dquot *dqp);
-int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf **bpp);
+int		xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp,
+				struct xfs_buf **bpp);
+int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp);
 void		xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
 void		xfs_qm_adjust_dqtimers(struct xfs_dquot *d);
 void		xfs_qm_adjust_dqlimits(struct xfs_dquot *d);
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 7d19091215b080..56ecc5ed01934d 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -155,14 +155,26 @@ xfs_qm_dquot_logitem_push(
 
 	spin_unlock(&lip->li_ailp->ail_lock);
 
-	error = xfs_qm_dqflush(dqp, &bp);
+	error = xfs_dquot_read_buf(NULL, dqp, &bp);
+	if (error) {
+		if (error == -EAGAIN)
+			rval = XFS_ITEM_LOCKED;
+		xfs_dqfunlock(dqp);
+		goto out_relock_ail;
+	}
+
+	/*
+	 * dqflush completes dqflock on error, and the delwri ioend does it on
+	 * success.
+	 */
+	error = xfs_qm_dqflush(dqp, bp);
 	if (!error) {
 		if (!xfs_buf_delwri_queue(bp, buffer_list))
 			rval = XFS_ITEM_FLUSHING;
-		xfs_buf_relse(bp);
-	} else if (error == -EAGAIN)
-		rval = XFS_ITEM_LOCKED;
+	}
+	xfs_buf_relse(bp);
 
+out_relock_ail:
 	spin_lock(&lip->li_ailp->ail_lock);
 out_unlock:
 	xfs_dqunlock(dqp);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index a4fa21dfd6b4ad..341fe4821c2d77 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -148,17 +148,28 @@ xfs_qm_dqpurge(
 		 * We don't care about getting disk errors here. We need
 		 * to purge this dquot anyway, so we go ahead regardless.
 		 */
-		error = xfs_qm_dqflush(dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		if (error == -EAGAIN) {
+			xfs_dqfunlock(dqp);
+			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
+			goto out_unlock;
+		}
+		if (error)
+			goto out_funlock;
+
+		/*
+		 * dqflush completes dqflock on error, and the bwrite ioend
+		 * does it on success.
+		 */
+		error = xfs_qm_dqflush(dqp, bp);
 		if (!error) {
 			error = xfs_bwrite(bp);
 			xfs_buf_relse(bp);
-		} else if (error == -EAGAIN) {
-			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
-			goto out_unlock;
 		}
 		xfs_dqflock(dqp);
 	}
 
+out_funlock:
 	ASSERT(atomic_read(&dqp->q_pincount) == 0);
 	ASSERT(xlog_is_shutdown(dqp->q_logitem.qli_item.li_log) ||
 		!test_bit(XFS_LI_IN_AIL, &dqp->q_logitem.qli_item.li_flags));
@@ -495,7 +506,17 @@ xfs_qm_dquot_isolate(
 		/* we have to drop the LRU lock to flush the dquot */
 		spin_unlock(lru_lock);
 
-		error = xfs_qm_dqflush(dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		if (error) {
+			xfs_dqfunlock(dqp);
+			goto out_unlock_dirty;
+		}
+
+		/*
+		 * dqflush completes dqflock on error, and the delwri ioend
+		 * does it on success.
+		 */
+		error = xfs_qm_dqflush(dqp, bp);
 		if (error)
 			goto out_unlock_dirty;
 
@@ -1491,11 +1512,13 @@ xfs_qm_flush_one(
 		goto out_unlock;
 	}
 
-	error = xfs_qm_dqflush(dqp, &bp);
+	error = xfs_dquot_read_buf(NULL, dqp, &bp);
 	if (error)
 		goto out_unlock;
 
-	xfs_buf_delwri_queue(bp, buffer_list);
+	error = xfs_qm_dqflush(dqp, bp);
+	if (!error)
+		xfs_buf_delwri_queue(bp, buffer_list);
 	xfs_buf_relse(bp);
 out_unlock:
 	xfs_dqunlock(dqp);


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done}
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (17 preceding siblings ...)
  2024-11-26  1:29   ` [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush Darrick J. Wong
@ 2024-11-26  1:29   ` Darrick J. Wong
  2024-11-26  5:28     ` Christoph Hellwig
  2024-11-26  1:29   ` [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer Darrick J. Wong
                     ` (2 subsequent siblings)
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:29 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Clean up these functions a little bit before we move on to the real
modifications, and make the variable naming consistent for dquot log
items.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_dquot.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)


diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 6ec4087e38dfc8..4ba042786cfb7b 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1142,8 +1142,9 @@ static void
 xfs_qm_dqflush_done(
 	struct xfs_log_item	*lip)
 {
-	struct xfs_dq_logitem	*qip = (struct xfs_dq_logitem *)lip;
-	struct xfs_dquot	*dqp = qip->qli_dquot;
+	struct xfs_dq_logitem	*qlip =
+			container_of(lip, struct xfs_dq_logitem, qli_item);
+	struct xfs_dquot	*dqp = qlip->qli_dquot;
 	struct xfs_ail		*ailp = lip->li_ailp;
 	xfs_lsn_t		tail_lsn;
 
@@ -1156,12 +1157,12 @@ xfs_qm_dqflush_done(
 	 * holding the lock before removing the dquot from the AIL.
 	 */
 	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags) &&
-	    ((lip->li_lsn == qip->qli_flush_lsn) ||
+	    ((lip->li_lsn == qlip->qli_flush_lsn) ||
 	     test_bit(XFS_LI_FAILED, &lip->li_flags))) {
 
 		spin_lock(&ailp->ail_lock);
 		xfs_clear_li_failed(lip);
-		if (lip->li_lsn == qip->qli_flush_lsn) {
+		if (lip->li_lsn == qlip->qli_flush_lsn) {
 			/* xfs_ail_update_finish() drops the AIL lock */
 			tail_lsn = xfs_ail_delete_one(ailp, lip);
 			xfs_ail_update_finish(ailp, tail_lsn);
@@ -1319,7 +1320,7 @@ xfs_qm_dqflush(
 	dqp->q_flags &= ~XFS_DQFLAG_DIRTY;
 
 	xfs_trans_ail_copy_lsn(mp->m_ail, &dqp->q_logitem.qli_flush_lsn,
-					&dqp->q_logitem.qli_item.li_lsn);
+			&lip->li_lsn);
 
 	/*
 	 * copy the lsn into the on-disk dquot now while we have the in memory
@@ -1331,7 +1332,7 @@ xfs_qm_dqflush(
 	 * of a dquot without an up-to-date CRC getting to disk.
 	 */
 	if (xfs_has_crc(mp)) {
-		dqblk->dd_lsn = cpu_to_be64(dqp->q_logitem.qli_item.li_lsn);
+		dqblk->dd_lsn = cpu_to_be64(lip->li_lsn);
 		xfs_update_cksum((char *)dqblk, sizeof(struct xfs_dqblk),
 				 XFS_DQUOT_CRC_OFF);
 	}
@@ -1341,7 +1342,7 @@ xfs_qm_dqflush(
 	 * the AIL and release the flush lock once the dquot is synced to disk.
 	 */
 	bp->b_flags |= _XBF_DQUOTS;
-	list_add_tail(&dqp->q_logitem.qli_item.li_bio_list, &bp->b_li_list);
+	list_add_tail(&lip->li_bio_list, &bp->b_li_list);
 
 	/*
 	 * If the buffer is pinned then push on the log so we won't


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (18 preceding siblings ...)
  2024-11-26  1:29   ` [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done} Darrick J. Wong
@ 2024-11-26  1:29   ` Darrick J. Wong
  2024-11-26  5:42     ` Christoph Hellwig
  2024-11-26  1:30   ` [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers Darrick J. Wong
  2024-11-26 20:26   ` [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes Darrick J. Wong
  21 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:29 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Ever since 6.12-rc1, I've observed a pile of warnings from the kernel
when running fstests with quotas enabled:

WARNING: CPU: 1 PID: 458580 at mm/page_alloc.c:4221 __alloc_pages_noprof+0xc9c/0xf18
CPU: 1 UID: 0 PID: 458580 Comm: xfsaild/sda3 Tainted: G        W          6.12.0-rc6-djwa #rc6 6ee3e0e531f6457e2d26aa008a3b65ff184b377c
<snip>
Call trace:
 __alloc_pages_noprof+0xc9c/0xf18
 alloc_pages_mpol_noprof+0x94/0x240
 alloc_pages_noprof+0x68/0xf8
 new_slab+0x3e0/0x568
 ___slab_alloc+0x5a0/0xb88
 __slab_alloc.constprop.0+0x7c/0xf8
 __kmalloc_noprof+0x404/0x4d0
 xfs_buf_get_map+0x594/0xde0 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_buf_read_map+0x64/0x2e0 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_trans_read_buf_map+0x1dc/0x518 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_qm_dqflush+0xac/0x468 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_qm_dquot_logitem_push+0xe4/0x148 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfsaild+0x3f4/0xde8 [xfs 384cb02810558b4c490343c164e9407332118f88]
 kthread+0x110/0x128
 ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---

This corresponds to the line:

	WARN_ON_ONCE(current->flags & PF_MEMALLOC);

within the NOFAIL checks.  What's happening here is that the XFS AIL is
trying to write a disk quota update back into the filesystem, but for
that it needs to read the ondisk buffer for the dquot.  The buffer is
not in memory anymore, probably because it was evicted.  Regardless, the
buffer cache tries to allocate a new buffer, but those allocations are
NOFAIL.  The AIL thread has marked itself PF_MEMALLOC (aka noreclaim)
since commit 43ff2122e6492b ("xfs: on-stack delayed write buffer lists")
presumably because reclaim can push on XFS to push on the AIL.

An easy way to fix this probably would have been to drop the NOFAIL flag
from the xfs_buf allocation and open code a retry loop, but then there's
still the problem that for bs>ps filesystems, the buffer itself could
require up to 64k worth of pages.

Inode items had similar behavior (multi-page cluster buffers that we
don't want to allocate in the AIL) which we solved by making transaction
precommit attach the inode cluster buffers to the dirty log item.  Let's
solve the dquot problem in the same way.

So: Make a real precommit handler to read the dquot buffer and attach it
to the log item; pass it to dqflush in the push method; and have the
iodone function detach the buffer once we've flushed everything.  Add a
state flag to the log item to track when a thread has entered the
precommit -> push mechanism to skip the detaching if it turns out that
the dquot is very busy, as we don't hold the dquot lock between log item
commit and AIL push).

Reading and attaching the dquot buffer in the precommit hook is inspired
by the work done for inode cluster buffers some time ago.

Cc: <stable@vger.kernel.org> # v6.12
Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_dquot.c      |  120 +++++++++++++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_dquot.h      |    5 ++
 fs/xfs/xfs_dquot_item.c |   39 ++++++++++-----
 fs/xfs/xfs_dquot_item.h |    7 +++
 fs/xfs/xfs_qm.c         |    6 +-
 fs/xfs/xfs_trans_ail.c  |    2 -
 6 files changed, 155 insertions(+), 24 deletions(-)


diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 4ba042786cfb7b..c495f7ad80018f 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -75,8 +75,24 @@ void
 xfs_qm_dqdestroy(
 	struct xfs_dquot	*dqp)
 {
+	struct xfs_dq_logitem	*qlip = &dqp->q_logitem;
+	struct xfs_buf		*bp = NULL;
+
 	ASSERT(list_empty(&dqp->q_lru));
 
+	/*
+	 * Detach the dquot buffer if it's still attached, because we can get
+	 * called through dqpurge after a log shutdown.
+	 */
+	spin_lock(&qlip->qli_lock);
+	if (qlip->qli_item.li_buf) {
+		bp = qlip->qli_item.li_buf;
+		qlip->qli_item.li_buf = NULL;
+	}
+	spin_unlock(&qlip->qli_lock);
+	if (bp)
+		xfs_buf_rele(bp);
+
 	kvfree(dqp->q_logitem.qli_item.li_lv_shadow);
 	mutex_destroy(&dqp->q_qlock);
 
@@ -1146,6 +1162,7 @@ xfs_qm_dqflush_done(
 			container_of(lip, struct xfs_dq_logitem, qli_item);
 	struct xfs_dquot	*dqp = qlip->qli_dquot;
 	struct xfs_ail		*ailp = lip->li_ailp;
+	struct xfs_buf		*bp = NULL;
 	xfs_lsn_t		tail_lsn;
 
 	/*
@@ -1175,6 +1192,19 @@ xfs_qm_dqflush_done(
 	 * Release the dq's flush lock since we're done with it.
 	 */
 	xfs_dqfunlock(dqp);
+
+	/*
+	 * If this dquot hasn't been dirtied since initiating the last dqflush,
+	 * release the buffer reference.
+	 */
+	spin_lock(&qlip->qli_lock);
+	if (!qlip->qli_dirty) {
+		bp = lip->li_buf;
+		lip->li_buf = NULL;
+	}
+	spin_unlock(&qlip->qli_lock);
+	if (bp)
+		xfs_buf_rele(bp);
 }
 
 void
@@ -1197,7 +1227,7 @@ xfs_buf_dquot_io_fail(
 
 	spin_lock(&bp->b_mount->m_ail->ail_lock);
 	list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
-		xfs_set_li_failed(lip, bp);
+		set_bit(XFS_LI_FAILED, &lip->li_flags);
 	spin_unlock(&bp->b_mount->m_ail->ail_lock);
 }
 
@@ -1249,6 +1279,7 @@ int
 xfs_dquot_read_buf(
 	struct xfs_trans	*tp,
 	struct xfs_dquot	*dqp,
+	xfs_buf_flags_t		xbf_flags,
 	struct xfs_buf		**bpp)
 {
 	struct xfs_mount	*mp = dqp->q_mount;
@@ -1256,7 +1287,7 @@ xfs_dquot_read_buf(
 	int			error;
 
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
+				   mp->m_quotainfo->qi_dqchunklen, xbf_flags,
 				   &bp, &xfs_dquot_buf_ops);
 	if (error == -EAGAIN)
 		return error;
@@ -1275,6 +1306,77 @@ xfs_dquot_read_buf(
 	return error;
 }
 
+/*
+ * Attach a dquot buffer to this dquot to avoid allocating a buffer during a
+ * dqflush, since dqflush can be called from reclaim context.
+ */
+int
+xfs_dquot_attach_buf(
+	struct xfs_trans	*tp,
+	struct xfs_dquot	*dqp)
+{
+	struct xfs_dq_logitem	*qlip = &dqp->q_logitem;
+	struct xfs_log_item	*lip = &qlip->qli_item;
+	int			error;
+
+	spin_lock(&qlip->qli_lock);
+	if (!lip->li_buf) {
+		struct xfs_buf	*bp = NULL;
+
+		spin_unlock(&qlip->qli_lock);
+		error = xfs_dquot_read_buf(tp, dqp, 0, &bp);
+		if (error)
+			return error;
+
+		/*
+		 * Attach the dquot to the buffer so that the AIL does not have
+		 * to read the dquot buffer to push this item.
+		 */
+		xfs_buf_hold(bp);
+		spin_lock(&qlip->qli_lock);
+		lip->li_buf = bp;
+		xfs_trans_brelse(tp, bp);
+	}
+	qlip->qli_dirty = true;
+	spin_unlock(&qlip->qli_lock);
+
+	return 0;
+}
+
+/*
+ * Get a new reference the dquot buffer attached to this dquot for a dqflush
+ * operation.
+ *
+ * Returns 0 and a NULL bp if none was attached to the dquot; 0 and a locked
+ * bp; or -EAGAIN if the buffer could not be locked.
+ */
+int
+xfs_dquot_use_attached_buf(
+	struct xfs_dquot	*dqp,
+	struct xfs_buf		**bpp)
+{
+	struct xfs_buf		*bp = dqp->q_logitem.qli_item.li_buf;
+
+	/*
+	 * A NULL buffer can happen if the dquot dirty flag was set but the
+	 * filesystem shut down before transaction commit happened.  In that
+	 * case we're not going to flush anyway.
+	 */
+	if (!bp) {
+		ASSERT(xfs_is_shutdown(dqp->q_mount));
+
+		*bpp = NULL;
+		return 0;
+	}
+
+	if (!xfs_buf_trylock(bp))
+		return -EAGAIN;
+
+	xfs_buf_hold(bp);
+	*bpp = bp;
+	return 0;
+}
+
 /*
  * Write a modified dquot to disk.
  * The dquot must be locked and the flush lock too taken by caller.
@@ -1289,7 +1391,8 @@ xfs_qm_dqflush(
 	struct xfs_buf		*bp)
 {
 	struct xfs_mount	*mp = dqp->q_mount;
-	struct xfs_log_item	*lip = &dqp->q_logitem.qli_item;
+	struct xfs_dq_logitem	*qlip = &dqp->q_logitem;
+	struct xfs_log_item	*lip = &qlip->qli_item;
 	struct xfs_dqblk	*dqblk;
 	xfs_failaddr_t		fa;
 	int			error;
@@ -1319,8 +1422,15 @@ xfs_qm_dqflush(
 	 */
 	dqp->q_flags &= ~XFS_DQFLAG_DIRTY;
 
-	xfs_trans_ail_copy_lsn(mp->m_ail, &dqp->q_logitem.qli_flush_lsn,
-			&lip->li_lsn);
+	/*
+	 * We hold the dquot lock, so nobody can dirty it while we're
+	 * scheduling the write out.  Clear the dirty-since-flush flag.
+	 */
+	spin_lock(&qlip->qli_lock);
+	qlip->qli_dirty = false;
+	spin_unlock(&qlip->qli_lock);
+
+	xfs_trans_ail_copy_lsn(mp->m_ail, &qlip->qli_flush_lsn, &lip->li_lsn);
 
 	/*
 	 * copy the lsn into the on-disk dquot now while we have the in memory
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 50f8404c41176c..362ca34f7c248b 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -215,7 +215,7 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp);
 
 void		xfs_qm_dqdestroy(struct xfs_dquot *dqp);
 int		xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp,
-				struct xfs_buf **bpp);
+				xfs_buf_flags_t flags, struct xfs_buf **bpp);
 int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp);
 void		xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
 void		xfs_qm_adjust_dqtimers(struct xfs_dquot *d);
@@ -239,6 +239,9 @@ void		xfs_dqlockn(struct xfs_dqtrx *q);
 
 void		xfs_dquot_set_prealloc_limits(struct xfs_dquot *);
 
+int xfs_dquot_attach_buf(struct xfs_trans *tp, struct xfs_dquot *dqp);
+int xfs_dquot_use_attached_buf(struct xfs_dquot *dqp, struct xfs_buf **bpp);
+
 static inline struct xfs_dquot *xfs_qm_dqhold(struct xfs_dquot *dqp)
 {
 	xfs_dqlock(dqp);
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 56ecc5ed01934d..271b195ebb9326 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -123,8 +123,9 @@ xfs_qm_dquot_logitem_push(
 		__releases(&lip->li_ailp->ail_lock)
 		__acquires(&lip->li_ailp->ail_lock)
 {
-	struct xfs_dquot	*dqp = DQUOT_ITEM(lip)->qli_dquot;
-	struct xfs_buf		*bp = lip->li_buf;
+	struct xfs_dq_logitem	*qlip = DQUOT_ITEM(lip);
+	struct xfs_dquot	*dqp = qlip->qli_dquot;
+	struct xfs_buf		*bp;
 	uint			rval = XFS_ITEM_SUCCESS;
 	int			error;
 
@@ -155,11 +156,10 @@ xfs_qm_dquot_logitem_push(
 
 	spin_unlock(&lip->li_ailp->ail_lock);
 
-	error = xfs_dquot_read_buf(NULL, dqp, &bp);
-	if (error) {
-		if (error == -EAGAIN)
-			rval = XFS_ITEM_LOCKED;
+	error = xfs_dquot_use_attached_buf(dqp, &bp);
+	if (error == -EAGAIN) {
 		xfs_dqfunlock(dqp);
+		rval = XFS_ITEM_LOCKED;
 		goto out_relock_ail;
 	}
 
@@ -207,12 +207,10 @@ xfs_qm_dquot_logitem_committing(
 }
 
 #ifdef DEBUG_EXPENSIVE
-static int
-xfs_qm_dquot_logitem_precommit(
-	struct xfs_trans	*tp,
-	struct xfs_log_item	*lip)
+static void
+xfs_qm_dquot_logitem_precommit_check(
+	struct xfs_dquot	*dqp)
 {
-	struct xfs_dquot	*dqp = DQUOT_ITEM(lip)->qli_dquot;
 	struct xfs_mount	*mp = dqp->q_mount;
 	struct xfs_disk_dquot	ddq = { };
 	xfs_failaddr_t		fa;
@@ -228,13 +226,24 @@ xfs_qm_dquot_logitem_precommit(
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 		ASSERT(fa == NULL);
 	}
-
-	return 0;
 }
 #else
-# define xfs_qm_dquot_logitem_precommit	NULL
+# define xfs_qm_dquot_logitem_precommit_check(...)	((void)0)
 #endif
 
+static int
+xfs_qm_dquot_logitem_precommit(
+	struct xfs_trans	*tp,
+	struct xfs_log_item	*lip)
+{
+	struct xfs_dq_logitem	*qlip = DQUOT_ITEM(lip);
+	struct xfs_dquot	*dqp = qlip->qli_dquot;
+
+	xfs_qm_dquot_logitem_precommit_check(dqp);
+
+	return xfs_dquot_attach_buf(tp, dqp);
+}
+
 static const struct xfs_item_ops xfs_dquot_item_ops = {
 	.iop_size	= xfs_qm_dquot_logitem_size,
 	.iop_precommit	= xfs_qm_dquot_logitem_precommit,
@@ -259,5 +268,7 @@ xfs_qm_dquot_logitem_init(
 
 	xfs_log_item_init(dqp->q_mount, &lp->qli_item, XFS_LI_DQUOT,
 					&xfs_dquot_item_ops);
+	spin_lock_init(&lp->qli_lock);
 	lp->qli_dquot = dqp;
+	lp->qli_dirty = false;
 }
diff --git a/fs/xfs/xfs_dquot_item.h b/fs/xfs/xfs_dquot_item.h
index 794710c2447493..d66e52807d76d5 100644
--- a/fs/xfs/xfs_dquot_item.h
+++ b/fs/xfs/xfs_dquot_item.h
@@ -14,6 +14,13 @@ struct xfs_dq_logitem {
 	struct xfs_log_item	qli_item;	/* common portion */
 	struct xfs_dquot	*qli_dquot;	/* dquot ptr */
 	xfs_lsn_t		qli_flush_lsn;	/* lsn at last flush */
+
+	/*
+	 * We use this spinlock to coordinate access to the li_buf pointer in
+	 * the log item and the qli_dirty flag.
+	 */
+	spinlock_t		qli_lock;
+	bool			qli_dirty;	/* dirtied since last flush? */
 };
 
 void xfs_qm_dquot_logitem_init(struct xfs_dquot *dqp);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 341fe4821c2d77..a79c4a1bf27fab 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -148,7 +148,7 @@ xfs_qm_dqpurge(
 		 * We don't care about getting disk errors here. We need
 		 * to purge this dquot anyway, so we go ahead regardless.
 		 */
-		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 		if (error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
@@ -506,7 +506,7 @@ xfs_qm_dquot_isolate(
 		/* we have to drop the LRU lock to flush the dquot */
 		spin_unlock(lru_lock);
 
-		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 		if (error) {
 			xfs_dqfunlock(dqp);
 			goto out_unlock_dirty;
@@ -1512,7 +1512,7 @@ xfs_qm_flush_one(
 		goto out_unlock;
 	}
 
-	error = xfs_dquot_read_buf(NULL, dqp, &bp);
+	error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 8ede9d099d1fea..f56d62dced97b1 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -360,7 +360,7 @@ xfsaild_resubmit_item(
 
 	/* protected by ail_lock */
 	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
-		if (bp->b_flags & _XBF_INODES)
+		if (bp->b_flags & (_XBF_INODES | _XBF_DQUOTS))
 			clear_bit(XFS_LI_FAILED, &lip->li_flags);
 		else
 			xfs_clear_li_failed(lip);


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (19 preceding siblings ...)
  2024-11-26  1:29   ` [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer Darrick J. Wong
@ 2024-11-26  1:30   ` Darrick J. Wong
  2024-11-26  5:42     ` Christoph Hellwig
  2024-12-17  2:06     ` Lai, Yi
  2024-11-26 20:26   ` [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes Darrick J. Wong
  21 siblings, 2 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26  1:30 UTC (permalink / raw)
  To: djwong; +Cc: stable, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Now that we've converted the dquot logging machinery to attach the dquot
buffer to the li_buf pointer so that the AIL dqflush doesn't have to
allocate or read buffers in a reclaim path, do the same for the
quotacheck code so that the reclaim shrinker dqflush call doesn't have
to do that either.

Cc: <stable@vger.kernel.org> # v6.12
Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_dquot.c |    9 +++------
 fs/xfs/xfs_dquot.h |    2 --
 fs/xfs/xfs_qm.c    |   18 +++++++++++++-----
 3 files changed, 16 insertions(+), 13 deletions(-)


diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index c495f7ad80018f..c47f95c96fe0cf 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1275,11 +1275,10 @@ xfs_qm_dqflush_check(
  * Requires dquot flush lock, will clear the dirty flag, delete the quota log
  * item from the AIL, and shut down the system if something goes wrong.
  */
-int
+static int
 xfs_dquot_read_buf(
 	struct xfs_trans	*tp,
 	struct xfs_dquot	*dqp,
-	xfs_buf_flags_t		xbf_flags,
 	struct xfs_buf		**bpp)
 {
 	struct xfs_mount	*mp = dqp->q_mount;
@@ -1287,10 +1286,8 @@ xfs_dquot_read_buf(
 	int			error;
 
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, xbf_flags,
+				   mp->m_quotainfo->qi_dqchunklen, 0,
 				   &bp, &xfs_dquot_buf_ops);
-	if (error == -EAGAIN)
-		return error;
 	if (xfs_metadata_is_sick(error))
 		xfs_dquot_mark_sick(dqp);
 	if (error)
@@ -1324,7 +1321,7 @@ xfs_dquot_attach_buf(
 		struct xfs_buf	*bp = NULL;
 
 		spin_unlock(&qlip->qli_lock);
-		error = xfs_dquot_read_buf(tp, dqp, 0, &bp);
+		error = xfs_dquot_read_buf(tp, dqp, &bp);
 		if (error)
 			return error;
 
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 362ca34f7c248b..1c5c911615bf7f 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -214,8 +214,6 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp);
 #define XFS_DQ_IS_DIRTY(dqp)	((dqp)->q_flags & XFS_DQFLAG_DIRTY)
 
 void		xfs_qm_dqdestroy(struct xfs_dquot *dqp);
-int		xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp,
-				xfs_buf_flags_t flags, struct xfs_buf **bpp);
 int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp);
 void		xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
 void		xfs_qm_adjust_dqtimers(struct xfs_dquot *d);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index a79c4a1bf27fab..e073ad51af1a3d 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -148,13 +148,13 @@ xfs_qm_dqpurge(
 		 * We don't care about getting disk errors here. We need
 		 * to purge this dquot anyway, so we go ahead regardless.
 		 */
-		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
+		error = xfs_dquot_use_attached_buf(dqp, &bp);
 		if (error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
 			goto out_unlock;
 		}
-		if (error)
+		if (!bp)
 			goto out_funlock;
 
 		/*
@@ -506,8 +506,8 @@ xfs_qm_dquot_isolate(
 		/* we have to drop the LRU lock to flush the dquot */
 		spin_unlock(lru_lock);
 
-		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
-		if (error) {
+		error = xfs_dquot_use_attached_buf(dqp, &bp);
+		if (!bp || error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			goto out_unlock_dirty;
 		}
@@ -1330,6 +1330,10 @@ xfs_qm_quotacheck_dqadjust(
 		return error;
 	}
 
+	error = xfs_dquot_attach_buf(NULL, dqp);
+	if (error)
+		return error;
+
 	trace_xfs_dqadjust(dqp);
 
 	/*
@@ -1512,9 +1516,13 @@ xfs_qm_flush_one(
 		goto out_unlock;
 	}
 
-	error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
+	error = xfs_dquot_use_attached_buf(dqp, &bp);
 	if (error)
 		goto out_unlock;
+	if (!bp) {
+		error = -EFSCORRUPTED;
+		goto out_unlock;
+	}
 
 	error = xfs_qm_dqflush(dqp, bp);
 	if (!error)


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully
  2024-11-26  1:22   ` [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully Darrick J. Wong
@ 2024-11-26  4:55     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  4:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs

On Mon, Nov 25, 2024 at 05:22:05PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> This test creates a couple of patterned files on a tiny filesystem,
> fragments the free space, clones one patterned file to the other, and
> checks that the entire file was cloned.
> 
> However, this test doesn't work on a 64k fsblock filesystem because
> we've used up all the free space reservation for the rmapbt, and that
> causes the FICLONE to error out with ENOSPC partway through.  Hence we
> need to detect the ENOSPC and _notrun the test.
> 
> That said, it turns out that XFS has been silently dropping error codes
> if we managed to make some progress cloning extents.  That's ok if the
> operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
> FICLONE/FICLONERANGE do not permit partial results, so the dropped error
> codes is actually an error.
> 
> Therefore, this testcase now becomes a regression test for the patch to
> fix that.

Still no big fan of having a btrfs-specific must not ENOSPC
assumption in a generic test.  So my preference would be to move
the must not error at all case into a btrfs specific test and make
your newly added ENOSPC handling unconditional.  But I guess the
state with this patch is strictly better than without, so:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest
  2024-11-26  1:24   ` [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest Darrick J. Wong
@ 2024-11-26  4:56     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  4:56 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: zlang, fstests, tytso, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 16/16] xfs/122: add tests for commitrange structures
  2024-11-26  1:24   ` [PATCH 16/16] xfs/122: add tests for commitrange structures Darrick J. Wong
@ 2024-11-26  4:57     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  4:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs

On Mon, Nov 25, 2024 at 05:24:43PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Update this test to check the ioctl structure for XFS_IOC_COMMIT_RANGE,
> which was added in 6.12.  This will be the last ever addition to
> xfs/122, because in 6.13 we moved the ondisk structure checks to libxfs
> after which we'll be able to _notrun this test on newer codebases.

As we'd say in german: Dein Wort in Gottes Ohr!

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block
  2024-11-26  1:27   ` [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block Darrick J. Wong
@ 2024-11-26  5:01     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved
  2024-11-26  1:27   ` [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved Darrick J. Wong
@ 2024-11-26  5:01     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir
  2024-11-26  1:28   ` [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir Darrick J. Wong
@ 2024-11-26  5:03     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 14/21] xfs: only run precommits once per transaction object
  2024-11-26  1:28   ` [PATCH 14/21] xfs: only run precommits once per transaction object Darrick J. Wong
@ 2024-11-26  5:09     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  2024-11-26  1:28   ` [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit Darrick J. Wong
@ 2024-11-26  5:11     ` Christoph Hellwig
  2024-11-26  5:11       ` Christoph Hellwig
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 25, 2024 at 05:28:37PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls
> __xfs_trans_commit again on the same transaction.  In other words,
> there's function recursion that has caused minor amounts of confusion in
> the past.  There's no reason to keep this around, since there's only one
> place where we actually want the xfs_defer_finish_noroll, and that is in
> the top level xfs_trans_commit call.

Hmm, I don't think the current version is a recursion, because the
is keyed off the regrant argument.  That being said the new version is
a lot cleaner, but maybe adjust the commit log and drop the fixes tag?


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  2024-11-26  5:11     ` Christoph Hellwig
@ 2024-11-26  5:11       ` Christoph Hellwig
  2024-11-26 18:20         ` Darrick J. Wong
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 25, 2024 at 09:11:06PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 25, 2024 at 05:28:37PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls
> > __xfs_trans_commit again on the same transaction.  In other words,
> > there's function recursion that has caused minor amounts of confusion in
> > the past.  There's no reason to keep this around, since there's only one
> > place where we actually want the xfs_defer_finish_noroll, and that is in
> > the top level xfs_trans_commit call.
> 
> Hmm, I don't think the current version is a recursion, because the
> is keyed off the regrant argument.  That being said the new version is
> a lot cleaner, but maybe adjust the commit log and drop the fixes tag?

With that:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 16/21] xfs: don't lose solo superblock counter update transactions
  2024-11-26  1:28   ` [PATCH 16/21] xfs: don't lose solo superblock counter update transactions Darrick J. Wong
@ 2024-11-26  5:14     ` Christoph Hellwig
  2024-11-26 18:23       ` Darrick J. Wong
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

On Mon, Nov 25, 2024 at 05:28:53PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Superblock counter updates are tracked via per-transaction counters in
> the xfs_trans object.  These changes are then turned into dirty log
> items in xfs_trans_apply_sb_deltas just prior to commiting the log items
> to the CIL.
> 
> However, updating the per-transaction counter deltas do not cause
> XFS_TRANS_DIRTY to be set on the transaction.  In other words, a pure sb
> counter update will be silently discarded if there are no other dirty
> log items attached to the transaction.
> 
> This is currently not the case anywhere in the filesystem because sb
> counter updates always dirty at least one other metadata item, but let's
> not leave a logic bomb.

xfs_trans_mod_sb is the only place that sets XFS_TRANS_SB_DIRTY, and
always forces XFS_TRANS_DIRTY.  So this seems kinda intentional, even
if this new version is much cleaner.  So the change looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

But I don't think the Fixes tag is really warranted.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 17/21] xfs: don't lose solo dquot update transactions
  2024-11-26  1:29   ` [PATCH 17/21] xfs: don't lose solo dquot " Darrick J. Wong
@ 2024-11-26  5:18     ` Christoph Hellwig
  2024-11-26 18:23       ` Darrick J. Wong
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

On Mon, Nov 25, 2024 at 05:29:08PM -0800, Darrick J. Wong wrote:
> This is currently not the case anywhere in the filesystem because quota
> updates always dirty at least one other metadata item, but a subsequent
> bug fix will add dquot log item precommits, so we actually need a dirty
> dquot log item prior to xfs_trans_run_precommits.  Also let's not leave
> a logic bomb.

Unlike for the sb updates there doesn't seem to be anything in
xfs_trans_mod_dquot that forces the transaction to be dirty, so I guess
the fixes tag is fine here.

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush
  2024-11-26  1:29   ` [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush Darrick J. Wong
@ 2024-11-26  5:27     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done}
  2024-11-26  1:29   ` [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done} Darrick J. Wong
@ 2024-11-26  5:28     ` Christoph Hellwig
  2024-11-26 18:25       ` Darrick J. Wong
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

>  	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags) &&
> -	    ((lip->li_lsn == qip->qli_flush_lsn) ||
> +	    ((lip->li_lsn == qlip->qli_flush_lsn) ||

Let's drop the superfluous inner braces here while you're at it.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer
  2024-11-26  1:29   ` [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer Darrick J. Wong
@ 2024-11-26  5:42     ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

On Mon, Nov 25, 2024 at 05:29:55PM -0800, Darrick J. Wong wrote:
> +	spin_lock(&qlip->qli_lock);
> +	if (qlip->qli_item.li_buf) {
> +		bp = qlip->qli_item.li_buf;
> +		qlip->qli_item.li_buf = NULL;
> +	}
> +	spin_unlock(&qlip->qli_lock);
> +	if (bp)
> +		xfs_buf_rele(bp);

> +	spin_lock(&qlip->qli_lock);
> +	if (!qlip->qli_dirty) {
> +		bp = lip->li_buf;
> +		lip->li_buf = NULL;
> +	}
> +	spin_unlock(&qlip->qli_lock);
> +	if (bp)
> +		xfs_buf_rele(bp);

Should this move into a common helper and always use either the
buf or dirty check instead of mixing them?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers
  2024-11-26  1:30   ` [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers Darrick J. Wong
@ 2024-11-26  5:42     ` Christoph Hellwig
  2024-12-17  2:06     ` Lai, Yi
  1 sibling, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-26  5:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  2024-11-26  5:11       ` Christoph Hellwig
@ 2024-11-26 18:20         ` Darrick J. Wong
  2024-11-27  5:44           ` Christoph Hellwig
  0 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 18:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Nov 25, 2024 at 09:11:23PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 25, 2024 at 09:11:06PM -0800, Christoph Hellwig wrote:
> > On Mon, Nov 25, 2024 at 05:28:37PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls
> > > __xfs_trans_commit again on the same transaction.  In other words,
> > > there's function recursion that has caused minor amounts of confusion in
> > > the past.  There's no reason to keep this around, since there's only one
> > > place where we actually want the xfs_defer_finish_noroll, and that is in
> > > the top level xfs_trans_commit call.
> > 
> > Hmm, I don't think the current version is a recursion, because the
> > is keyed off the regrant argument.  That being said the new version is
> > a lot cleaner, but maybe adjust the commit log and drop the fixes tag?

How about:

"xfs: avoid nested calls to __xfs_trans_commit

"Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which
calls __xfs_trans_commit again on the same transaction.  In other words,
there's a nested function call (albeit with slightly different
arguments) that has caused minor amounts of confusion in the past.
There's no reason to keep this around, since there's only one place
where we actually want the xfs_defer_finish_noroll, and that is in the
top level xfs_trans_commit call.

"This also reduces stack usage a little bit."

> With that:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks!

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 16/21] xfs: don't lose solo superblock counter update transactions
  2024-11-26  5:14     ` Christoph Hellwig
@ 2024-11-26 18:23       ` Darrick J. Wong
  0 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 18:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: stable, linux-xfs

On Mon, Nov 25, 2024 at 09:14:07PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 25, 2024 at 05:28:53PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Superblock counter updates are tracked via per-transaction counters in
> > the xfs_trans object.  These changes are then turned into dirty log
> > items in xfs_trans_apply_sb_deltas just prior to commiting the log items
> > to the CIL.
> > 
> > However, updating the per-transaction counter deltas do not cause
> > XFS_TRANS_DIRTY to be set on the transaction.  In other words, a pure sb
> > counter update will be silently discarded if there are no other dirty
> > log items attached to the transaction.
> > 
> > This is currently not the case anywhere in the filesystem because sb
> > counter updates always dirty at least one other metadata item, but let's
> > not leave a logic bomb.
> 
> xfs_trans_mod_sb is the only place that sets XFS_TRANS_SB_DIRTY, and
> always forces XFS_TRANS_DIRTY.  So this seems kinda intentional, even
> if this new version is much cleaner.  So the change looks fine:

<nod> I don't think this one was absolutely necessary, it just seemed
like a cleanup to put anything that set TRANS_DIRTY before the
TRANS_DIRTY check.

> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> But I don't think the Fixes tag is really warranted.

Dropped.

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 17/21] xfs: don't lose solo dquot update transactions
  2024-11-26  5:18     ` Christoph Hellwig
@ 2024-11-26 18:23       ` Darrick J. Wong
  0 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 18:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: stable, linux-xfs

On Mon, Nov 25, 2024 at 09:18:14PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 25, 2024 at 05:29:08PM -0800, Darrick J. Wong wrote:
> > This is currently not the case anywhere in the filesystem because quota
> > updates always dirty at least one other metadata item, but a subsequent
> > bug fix will add dquot log item precommits, so we actually need a dirty
> > dquot log item prior to xfs_trans_run_precommits.  Also let's not leave
> > a logic bomb.
> 
> Unlike for the sb updates there doesn't seem to be anything in
> xfs_trans_mod_dquot that forces the transaction to be dirty, so I guess
> the fixes tag is fine here.

Correct.

> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks!

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done}
  2024-11-26  5:28     ` Christoph Hellwig
@ 2024-11-26 18:25       ` Darrick J. Wong
  0 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 18:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Nov 25, 2024 at 09:28:37PM -0800, Christoph Hellwig wrote:
> >  	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags) &&
> > -	    ((lip->li_lsn == qip->qli_flush_lsn) ||
> > +	    ((lip->li_lsn == qlip->qli_flush_lsn) ||
> 
> Let's drop the superfluous inner braces here while you're at it.

Done.

> Otherwise looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thank you!

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes
  2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
                     ` (20 preceding siblings ...)
  2024-11-26  1:30   ` [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers Darrick J. Wong
@ 2024-11-26 20:26   ` Darrick J. Wong
  2024-11-29  8:20     ` Christoph Hellwig
  2024-12-07  0:41     ` Luis Chamberlain
  21 siblings, 2 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 20:26 UTC (permalink / raw)
  To: Dave Chinner, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

For a sparse inodes filesystem, mkfs.xfs computes the values of
sb_spino_align and sb_inoalignmt with the following code:

	int     cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;

	if (cfg->sb_feat.crcs_enabled)
		cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;

	sbp->sb_spino_align = cluster_size >> cfg->blocklog;
	sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK *
			cfg->inodesize >> cfg->blocklog;

On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results
in cluster_size = 8192 * (512 / 256) = 16384.  As a result,
sb_spino_align and sb_inoalignmt are both set to zero.  Unfortunately,
this trips the new sb_spino_align check that was just added to
xfs_validate_sb_common, and the mkfs fails:

# mkfs.xfs -f -b size=64k, /dev/sda
meta-data=/dev/sda               isize=512    agcount=4, agsize=81136 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0   metadir=0
data     =                       bsize=65536  blocks=324544, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=65536  ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=65536  blocks=5006, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=65536  blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
Discarding blocks...Sparse inode alignment (0) is invalid.
Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
Sparse inode alignment (0) is invalid.
Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
mkfs.xfs: writing AG headers failed, err=22

Prior to commit 59e43f5479cce1 this all worked fine, even if "sparse"
inodes are somewhat meaningless when everything fits in a single
fsblock.  Adjust the checks to handle existing filesystems.

Fixes: 59e43f5479cce1 ("xfs: sb_spino_align is not verified")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_sb.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index a809513a290cf4..3b5623611eba02 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -494,12 +494,13 @@ xfs_validate_sb_common(
 				return -EINVAL;
 			}
 
-			if (!sbp->sb_spino_align ||
-			    sbp->sb_spino_align > sbp->sb_inoalignmt ||
-			    (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0) {
+			if (sbp->sb_spino_align &&
+			    (sbp->sb_spino_align > sbp->sb_inoalignmt ||
+			     (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0)) {
 				xfs_warn(mp,
-				"Sparse inode alignment (%u) is invalid.",
-					sbp->sb_spino_align);
+"Sparse inode alignment (%u) is invalid, must be integer factor of (%u).",
+					sbp->sb_spino_align,
+					sbp->sb_inoalignmt);
 				return -EINVAL;
 			}
 		} else if (sbp->sb_spino_align) {

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool
  2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
                     ` (15 preceding siblings ...)
  2024-11-26  1:24   ` [PATCH 16/16] xfs/122: add tests for commitrange structures Darrick J. Wong
@ 2024-11-26 20:27   ` Darrick J. Wong
  2024-11-27  5:43     ` Christoph Hellwig
  16 siblings, 1 reply; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-26 20:27 UTC (permalink / raw)
  To: zlang; +Cc: zlang, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If you happen to be running fstests on a bunch of VMs and the VMs all
have access to a shared disk pool, then it's possible that two VMs could
be running generic/459 at exactly the same time.  In that case, it's a
VERY bad thing to have two nodes trying to create an LVM volume group
named "vg_459" because one node will succeed, after which the other node
will see the vg_459 volume group that it didn't create:

  A volume group called vg_459 already exists.
  Logical volume pool_459 already exists in Volume group vg_459.
  Logical Volume "lv_459" already exists in volume group "vg_459"

But then, because this is bash, we don't abort the test script and
continue executing.  If we're lucky this fails when /dev/vg_459/lv_459
disappears before mkfs can run:

  Error accessing specified device /dev/mapper/vg_459-lv_459: No such file or directory
  Usage: mkfs.xfs

But in the bad case both nodes write filesystems to the same device and
then they trample all over each other.  Fix this by adding the hostname
and pid to all the LVM names so that they won't collide.

Fixes: 461dad511f6b91 ("generic: Test filesystem lockup on full overprovisioned dm-thin")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/generic/459 |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/tests/generic/459 b/tests/generic/459
index 98177f6b5ef8fb..32ee899f929819 100755
--- a/tests/generic/459
+++ b/tests/generic/459
@@ -47,10 +47,17 @@ _require_command "$THIN_CHECK_PROG" thin_check
 _require_freeze
 _require_odirect

-vgname=vg_$seq
-lvname=lv_$seq
-poolname=pool_$seq
-snapname=snap_$seq
+# Create all the LVM names with the hostname and pid so that we don't have any
+# collisions between VMs running from a shared pool of disks.  Hyphens become
+# underscores because LVM turns those into double hyphens, which messes with
+# accessing /dev/mapper/$vg-$lv (which you're not supposed to do but this test
+# does anyway).
+lvmsuffix="${seq}_$(hostname -s | tr '-' '_')_$$"
+
+vgname=vg_$lvmsuffix
+lvname=lv_$lvmsuffix
+poolname=pool_$lvmsuffix
+snapname=snap_$lvmsuffix
 origpsize=200
 virtsize=300
 newpsize=300

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool
  2024-11-26 20:27   ` [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool Darrick J. Wong
@ 2024-11-27  5:43     ` Christoph Hellwig
  2024-11-27 16:35       ` Darrick J. Wong
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-27  5:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: zlang, zlang, fstests, linux-xfs

On Tue, Nov 26, 2024 at 12:27:29PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If you happen to be running fstests on a bunch of VMs and the VMs all
> have access to a shared disk pool, then it's possible that two VMs could
> be running generic/459 at exactly the same time.  In that case, it's a
> VERY bad thing to have two nodes trying to create an LVM volume group
> named "vg_459" because one node will succeed, after which the other node
> will see the vg_459 volume group that it didn't create:
> 
>   A volume group called vg_459 already exists.
>   Logical volume pool_459 already exists in Volume group vg_459.
>   Logical Volume "lv_459" already exists in volume group "vg_459"
> 
> But then, because this is bash, we don't abort the test script and
> continue executing.  If we're lucky this fails when /dev/vg_459/lv_459
> disappears before mkfs can run:

How the F.. do the VG names leak out of the VM scope?

That being said, the unique names looks fine to me, so:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit
  2024-11-26 18:20         ` Darrick J. Wong
@ 2024-11-27  5:44           ` Christoph Hellwig
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-27  5:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Tue, Nov 26, 2024 at 10:20:52AM -0800, Darrick J. Wong wrote:
> How about:
> 
> "xfs: avoid nested calls to __xfs_trans_commit
> 
> "Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which
> calls __xfs_trans_commit again on the same transaction.  In other words,
> there's a nested function call (albeit with slightly different
> arguments) that has caused minor amounts of confusion in the past.
> There's no reason to keep this around, since there's only one place
> where we actually want the xfs_defer_finish_noroll, and that is in the
> top level xfs_trans_commit call.
> 
> "This also reduces stack usage a little bit."

Sounds good.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool
  2024-11-27  5:43     ` Christoph Hellwig
@ 2024-11-27 16:35       ` Darrick J. Wong
  0 siblings, 0 replies; 69+ messages in thread
From: Darrick J. Wong @ 2024-11-27 16:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: zlang, zlang, fstests, linux-xfs

On Tue, Nov 26, 2024 at 09:43:42PM -0800, Christoph Hellwig wrote:
> On Tue, Nov 26, 2024 at 12:27:29PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > If you happen to be running fstests on a bunch of VMs and the VMs all
> > have access to a shared disk pool, then it's possible that two VMs could
> > be running generic/459 at exactly the same time.  In that case, it's a
> > VERY bad thing to have two nodes trying to create an LVM volume group
> > named "vg_459" because one node will succeed, after which the other node
> > will see the vg_459 volume group that it didn't create:
> > 
> >   A volume group called vg_459 already exists.
> >   Logical volume pool_459 already exists in Volume group vg_459.
> >   Logical Volume "lv_459" already exists in volume group "vg_459"
> > 
> > But then, because this is bash, we don't abort the test script and
> > continue executing.  If we're lucky this fails when /dev/vg_459/lv_459
> > disappears before mkfs can run:
> 
> How the F.. do the VG names leak out of the VM scope?

I ran fstests-xfs on my fstests-ocfs2 cluster, wherein all nodes have
write access to all disks because we're all one big happy fleet.  Each
node gets a list of which disks it can use for fstests so in theory
there's no overlap ... until two machines tried to create LVM VGs with
the same name at exactly the same time and tripped.  A sane prod system
would adjust the access controls per fstests run but I'm too lazy to do
that every night.

(Yeah, I just confessed to occasionally fstesting ocfs2.)

> That being said, the unique names looks fine to me, so:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks!

--D

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 01/16] generic/757: fix various bugs in this test
  2024-11-26  1:20   ` [PATCH 01/16] generic/757: fix various bugs in this test Darrick J. Wong
@ 2024-11-28  7:56     ` Zorro Lang
  0 siblings, 0 replies; 69+ messages in thread
From: Zorro Lang @ 2024-11-28  7:56 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, linux-xfs

On Mon, Nov 25, 2024 at 05:20:47PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Fix this test so the check doesn't fail on XFS, and restrict runtime to
> 100 loops because otherwise this test takes many hours.
> 
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---

Thanks, this patchset looks good to me now.

Reviewed-by: Zorro Lang <zlang@redhat.com>

>  tests/generic/757 |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> index 0ff5a8ac00182b..37cf49e6bc7fd9 100755
> --- a/tests/generic/757
> +++ b/tests/generic/757
> @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
>  cur=$(_log_writes_find_next_fua $prev)
>  [ -z "$cur" ] && _fail "failed to locate next FUA write"
>  
> -while [ ! -z "$cur" ]; do
> +while _soak_loop_running $((100 * TIME_FACTOR)); do
>  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
>  
> +	# xfs_repair won't run if the log is dirty
> +	if [ $FSTYP = "xfs" ]; then
> +		_scratch_mount
> +		_scratch_unmount
> +	fi
>  	_check_scratch_fs
>  
>  	prev=$cur
> 


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/16] generic/757: convert to thinp
  2024-11-26  1:21   ` [PATCH 02/16] generic/757: convert to thinp Darrick J. Wong
@ 2024-11-28  8:08     ` Zorro Lang
  0 siblings, 0 replies; 69+ messages in thread
From: Zorro Lang @ 2024-11-28  8:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, fstests, linux-xfs

On Mon, Nov 25, 2024 at 05:21:03PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Convert this test to use dm-thinp so that discards always zero the data.
> This prevents weird replay problems if the scratch device doesn't
> guarantee that read after discard returns zeroes.
> 
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---

This patch is good to me, and g/757 finally passed on my side :)

Reviewed-by: Zorro Lang <zlang@redhat.com>

Thanks,
Zorro

>  tests/generic/757 |   23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> index 37cf49e6bc7fd9..6c13c6af41c57c 100755
> --- a/tests/generic/757
> +++ b/tests/generic/757
> @@ -8,12 +8,13 @@
>  # This can be seen on subpage FSes on Linux 6.4.
>  #
>  . ./common/preamble
> -_begin_fstest auto quick metadata log recoveryloop aio
> +_begin_fstest auto quick metadata log recoveryloop aio thin
>  
>  _cleanup()
>  {
>  	cd /
>  	_log_writes_cleanup &> /dev/null
> +	_dmthin_cleanup
>  	rm -f $tmp.*
>  }
>  
> @@ -23,11 +24,14 @@ _cleanup()
>  
>  fio_config=$tmp.fio
>  
> +. ./common/dmthin
>  . ./common/dmlogwrites
>  
> -_require_scratch
> +# Use thin device as replay device, which requires $SCRATCH_DEV
> +_require_scratch_nocheck
>  _require_aiodio
>  _require_log_writes
> +_require_dm_target thin-pool
>  
>  cat >$fio_config <<EOF
>  [global]
> @@ -47,7 +51,13 @@ _require_fio $fio_config
>  
>  cat $fio_config >> $seqres.full
>  
> -_log_writes_init $SCRATCH_DEV
> +# Use a thin device to provide deterministic discard behavior. Discards are used
> +# by the log replay tool for fast zeroing to prevent out-of-order replay issues.
> +_test_unmount
> +sectors=$(blockdev --getsz $SCRATCH_DEV)
> +sectors=$((sectors * 90 / 100))
> +_dmthin_init $sectors $sectors
> +_log_writes_init $DMTHIN_VOL_DEV
>  _log_writes_mkfs >> $seqres.full 2>&1
>  _log_writes_mark mkfs
>  
> @@ -64,14 +74,13 @@ cur=$(_log_writes_find_next_fua $prev)
>  [ -z "$cur" ] && _fail "failed to locate next FUA write"
>  
>  while _soak_loop_running $((100 * TIME_FACTOR)); do
> -	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> +	_log_writes_replay_log_range $cur $DMTHIN_VOL_DEV >> $seqres.full
>  
>  	# xfs_repair won't run if the log is dirty
>  	if [ $FSTYP = "xfs" ]; then
> -		_scratch_mount
> -		_scratch_unmount
> +		_dmthin_mount
>  	fi
> -	_check_scratch_fs
> +	_dmthin_check_fs
>  
>  	prev=$cur
>  	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> 


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes
  2024-11-26 20:26   ` [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes Darrick J. Wong
@ 2024-11-29  8:20     ` Christoph Hellwig
  2024-12-07  0:41     ` Luis Chamberlain
  1 sibling, 0 replies; 69+ messages in thread
From: Christoph Hellwig @ 2024-11-29  8:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes
  2024-11-26 20:26   ` [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes Darrick J. Wong
  2024-11-29  8:20     ` Christoph Hellwig
@ 2024-12-07  0:41     ` Luis Chamberlain
  1 sibling, 0 replies; 69+ messages in thread
From: Luis Chamberlain @ 2024-12-07  0:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, linux-xfs

On Tue, Nov 26, 2024 at 12:26:19PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> For a sparse inodes filesystem, mkfs.xfs computes the values of
> sb_spino_align and sb_inoalignmt with the following code:
> 
> 	int     cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
> 
> 	if (cfg->sb_feat.crcs_enabled)
> 		cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;
> 
> 	sbp->sb_spino_align = cluster_size >> cfg->blocklog;
> 	sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK *
> 			cfg->inodesize >> cfg->blocklog;
> 
> On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results
> in cluster_size = 8192 * (512 / 256) = 16384.  As a result,
> sb_spino_align and sb_inoalignmt are both set to zero.  Unfortunately,
> this trips the new sb_spino_align check that was just added to
> xfs_validate_sb_common, and the mkfs fails:
> 
> # mkfs.xfs -f -b size=64k, /dev/sda
> meta-data=/dev/sda               isize=512    agcount=4, agsize=81136 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1, sparse=1, rmapbt=1
>          =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
>          =                       exchange=0   metadir=0
> data     =                       bsize=65536  blocks=324544, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=65536  ascii-ci=0, ftype=1, parent=0
> log      =internal log           bsize=65536  blocks=5006, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=65536  blocks=0, rtextents=0
>          =                       rgcount=0    rgsize=0 extents
> Discarding blocks...Sparse inode alignment (0) is invalid.
> Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
> libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
> mkfs.xfs: Releasing dirty buffer to free list!
> found dirty buffer (bulk) on free list!
> Sparse inode alignment (0) is invalid.
> Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
> libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
> mkfs.xfs: writing AG headers failed, err=22
> 
> Prior to commit 59e43f5479cce1 this all worked fine, even if "sparse"
> inodes are somewhat meaningless when everything fits in a single
> fsblock.  Adjust the checks to handle existing filesystems.
> 
> Fixes: 59e43f5479cce1 ("xfs: sb_spino_align is not verified")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>

Tested-by: Luis Chamberlain <mcgrof@kernel.org>

  Luis

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers
  2024-11-26  1:30   ` [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers Darrick J. Wong
  2024-11-26  5:42     ` Christoph Hellwig
@ 2024-12-17  2:06     ` Lai, Yi
  1 sibling, 0 replies; 69+ messages in thread
From: Lai, Yi @ 2024-12-17  2:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: stable, linux-xfs, yi1.lai, syzkaller-bugs

On Mon, Nov 25, 2024 at 05:30:11PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've converted the dquot logging machinery to attach the dquot
> buffer to the li_buf pointer so that the AIL dqflush doesn't have to
> allocate or read buffers in a reclaim path, do the same for the
> quotacheck code so that the reclaim shrinker dqflush call doesn't have
> to do that either.
> 
> Cc: <stable@vger.kernel.org> # v6.12
> Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  fs/xfs/xfs_dquot.c |    9 +++------
>  fs/xfs/xfs_dquot.h |    2 --
>  fs/xfs/xfs_qm.c    |   18 +++++++++++++-----
>  3 files changed, 16 insertions(+), 13 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index c495f7ad80018f..c47f95c96fe0cf 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -1275,11 +1275,10 @@ xfs_qm_dqflush_check(
>   * Requires dquot flush lock, will clear the dirty flag, delete the quota log
>   * item from the AIL, and shut down the system if something goes wrong.
>   */
> -int
> +static int
>  xfs_dquot_read_buf(
>  	struct xfs_trans	*tp,
>  	struct xfs_dquot	*dqp,
> -	xfs_buf_flags_t		xbf_flags,
>  	struct xfs_buf		**bpp)
>  {
>  	struct xfs_mount	*mp = dqp->q_mount;
> @@ -1287,10 +1286,8 @@ xfs_dquot_read_buf(
>  	int			error;
>  
>  	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
> -				   mp->m_quotainfo->qi_dqchunklen, xbf_flags,
> +				   mp->m_quotainfo->qi_dqchunklen, 0,
>  				   &bp, &xfs_dquot_buf_ops);
> -	if (error == -EAGAIN)
> -		return error;
>  	if (xfs_metadata_is_sick(error))
>  		xfs_dquot_mark_sick(dqp);
>  	if (error)
> @@ -1324,7 +1321,7 @@ xfs_dquot_attach_buf(
>  		struct xfs_buf	*bp = NULL;
>  
>  		spin_unlock(&qlip->qli_lock);
> -		error = xfs_dquot_read_buf(tp, dqp, 0, &bp);
> +		error = xfs_dquot_read_buf(tp, dqp, &bp);
>  		if (error)
>  			return error;
>  
> diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
> index 362ca34f7c248b..1c5c911615bf7f 100644
> --- a/fs/xfs/xfs_dquot.h
> +++ b/fs/xfs/xfs_dquot.h
> @@ -214,8 +214,6 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp);
>  #define XFS_DQ_IS_DIRTY(dqp)	((dqp)->q_flags & XFS_DQFLAG_DIRTY)
>  
>  void		xfs_qm_dqdestroy(struct xfs_dquot *dqp);
> -int		xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp,
> -				xfs_buf_flags_t flags, struct xfs_buf **bpp);
>  int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp);
>  void		xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
>  void		xfs_qm_adjust_dqtimers(struct xfs_dquot *d);
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index a79c4a1bf27fab..e073ad51af1a3d 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -148,13 +148,13 @@ xfs_qm_dqpurge(
>  		 * We don't care about getting disk errors here. We need
>  		 * to purge this dquot anyway, so we go ahead regardless.
>  		 */
> -		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
> +		error = xfs_dquot_use_attached_buf(dqp, &bp);
>  		if (error == -EAGAIN) {
>  			xfs_dqfunlock(dqp);
>  			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
>  			goto out_unlock;
>  		}
> -		if (error)
> +		if (!bp)
>  			goto out_funlock;
>  
>  		/*
> @@ -506,8 +506,8 @@ xfs_qm_dquot_isolate(
>  		/* we have to drop the LRU lock to flush the dquot */
>  		spin_unlock(lru_lock);
>  
> -		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
> -		if (error) {
> +		error = xfs_dquot_use_attached_buf(dqp, &bp);
> +		if (!bp || error == -EAGAIN) {
>  			xfs_dqfunlock(dqp);
>  			goto out_unlock_dirty;
>  		}
> @@ -1330,6 +1330,10 @@ xfs_qm_quotacheck_dqadjust(
>  		return error;
>  	}
>  
> +	error = xfs_dquot_attach_buf(NULL, dqp);
> +	if (error)
> +		return error;
> +
>  	trace_xfs_dqadjust(dqp);
>  
>  	/*
> @@ -1512,9 +1516,13 @@ xfs_qm_flush_one(
>  		goto out_unlock;
>  	}
>  
> -	error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
> +	error = xfs_dquot_use_attached_buf(dqp, &bp);
>  	if (error)
>  		goto out_unlock;
> +	if (!bp) {
> +		error = -EFSCORRUPTED;
> +		goto out_unlock;
> +	}
>  
>  	error = xfs_qm_dqflush(dqp, bp);
>  	if (!error)
>
Hi Darrick J. Wong,

Greetings!

I used Syzkaller and found that there is possible deadlock in xfs_dquot_detach_buf in linux v6.13-rc3.

After bisection and the first bad commit is:
"
ca378189fdfa xfs: convert quotacheck to attach dquot buffers
"

All detailed into can be found at:
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf
Syzkaller repro code:
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf/repro.c
Syzkaller repro syscall steps:
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf/repro.prog
Syzkaller report:
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf/repro.report
Kconfig(make olddefconfig):
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf/kconfig_origin
Bisect info:
https://github.com/laifryiee/syzkaller_logs/tree/main/241216_192201_xfs_dquot_detach_buf/bisect_info.log
bzImage:
https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/241216_192201_xfs_dquot_detach_buf/bzImage_78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8
Issue dmesg:
https://github.com/laifryiee/syzkaller_logs/blob/main/241216_192201_xfs_dquot_detach_buf/78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8_dmesg.log

"
[   52.971391] ======================================================
[   52.971706] WARNING: possible circular locking dependency detected
[   52.972026] 6.13.0-rc3-78d4f34e2115+ #1 Not tainted
[   52.972282] ------------------------------------------------------
[   52.972596] repro/673 is trying to acquire lock:
[   52.972842] ffff88802366b510 (&lp->qli_lock){+.+.}-{3:3}, at: xfs_dquot_detach_buf+0x2d/0x190
[   52.973324]
[   52.973324] but task is already holding lock:
[   52.973617] ffff888015681b30 (&l->lock){+.+.}-{3:3}, at: __list_lru_walk_one+0x409/0x810
[   52.974039]
[   52.974039] which lock already depends on the new lock.
[   52.974039]
[   52.974442]
[   52.974442] the existing dependency chain (in reverse order) is:
[   52.974815]
[   52.974815] -> #3 (&l->lock){+.+.}-{3:3}:
[   52.975100]        lock_acquire+0x80/0xb0
[   52.975319]        _raw_spin_lock+0x38/0x50
[   52.975550]        list_lru_add+0x198/0x5d0
[   52.975770]        list_lru_add_obj+0x207/0x360
[   52.976008]        xfs_buf_rele+0xcb6/0x1610
[   52.976243]        xfs_trans_brelse+0x385/0x410
[   52.976484]        xfs_imap_lookup+0x38d/0x5a0
[   52.976719]        xfs_imap+0x668/0xc80
[   52.976923]        xfs_iget+0x875/0x2dd0
[   52.977129]        xfs_mountfs+0x116b/0x2060
[   52.977360]        xfs_fs_fill_super+0x12bc/0x1f10
[   52.977612]        get_tree_bdev_flags+0x3d8/0x6c0
[   52.977869]        get_tree_bdev+0x29/0x40
[   52.978086]        xfs_fs_get_tree+0x26/0x30
[   52.978310]        vfs_get_tree+0x9e/0x390
[   52.978526]        path_mount+0x707/0x2000
[   52.978742]        __x64_sys_mount+0x2bf/0x350
[   52.978974]        x64_sys_call+0x1e1d/0x2140
[   52.979210]        do_syscall_64+0x6d/0x140
[   52.979431]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   52.979723]
[   52.979723] -> #2 (&bch->bc_lock){+.+.}-{3:3}:
[   52.980029]        lock_acquire+0x80/0xb0
[   52.980240]        _raw_spin_lock+0x38/0x50
[   52.980456]        _atomic_dec_and_lock+0xb8/0x100
[   52.980712]        xfs_buf_rele+0x112/0x1610
[   52.980937]        xfs_trans_brelse+0x385/0x410
[   52.981175]        xfs_imap_lookup+0x38d/0x5a0
[   52.981410]        xfs_imap+0x668/0xc80
[   52.981612]        xfs_iget+0x875/0x2dd0
[   52.981816]        xfs_mountfs+0x116b/0x2060
[   52.982042]        xfs_fs_fill_super+0x12bc/0x1f10
[   52.982289]        get_tree_bdev_flags+0x3d8/0x6c0
[   52.982540]        get_tree_bdev+0x29/0x40
[   52.982756]        xfs_fs_get_tree+0x26/0x30
[   52.982979]        vfs_get_tree+0x9e/0x390
[   52.983194]        path_mount+0x707/0x2000
[   52.983406]        __x64_sys_mount+0x2bf/0x350
[   52.983637]        x64_sys_call+0x1e1d/0x2140
[   52.983865]        do_syscall_64+0x6d/0x140
[   52.984081]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   52.984366]
[   52.984366] -> #1 (&bp->b_lock){+.+.}-{3:3}:
[   52.984665]        lock_acquire+0x80/0xb0
[   52.984876]        _raw_spin_lock+0x38/0x50
[   52.985092]        xfs_buf_rele+0x106/0x1610
[   52.985319]        xfs_trans_brelse+0x385/0x410
[   52.985556]        xfs_dquot_attach_buf+0x312/0x490
[   52.985806]        xfs_qm_quotacheck_dqadjust+0x122/0x580
[   52.986083]        xfs_qm_dqusage_adjust+0x5e0/0x7c0
[   52.986340]        xfs_iwalk_ag_recs+0x423/0x780
[   52.986579]        xfs_iwalk_run_callbacks+0x1e2/0x540
[   52.986842]        xfs_iwalk_ag+0x6e6/0x920
[   52.987061]        xfs_iwalk_ag_work+0x160/0x1e0
[   52.987301]        xfs_pwork_work+0x8b/0x180
[   52.987528]        process_one_work+0x92e/0x1b60
[   52.987770]        worker_thread+0x68d/0xe90
[   52.987992]        kthread+0x35a/0x470
[   52.988186]        ret_from_fork+0x56/0x90
[   52.988401]        ret_from_fork_asm+0x1a/0x30
[   52.988627]
[   52.988627] -> #0 (&lp->qli_lock){+.+.}-{3:3}:
[   52.988924]        __lock_acquire+0x2ff8/0x5d60
[   52.989156]        lock_acquire.part.0+0x142/0x390
[   52.989402]        lock_acquire+0x80/0xb0
[   52.989609]        _raw_spin_lock+0x38/0x50
[   52.989820]        xfs_dquot_detach_buf+0x2d/0x190
[   52.990061]        xfs_qm_dquot_isolate+0x1c6/0x12f0
[   52.990312]        __list_lru_walk_one+0x31a/0x810
[   52.990553]        list_lru_walk_one+0x4c/0x60
[   52.990781]        xfs_qm_shrink_scan+0x1d0/0x380
[   52.991020]        do_shrink_slab+0x410/0x10a0
[   52.991253]        shrink_slab+0x349/0x1370
[   52.991469]        drop_slab+0xf5/0x1f0
[   52.991667]        drop_caches_sysctl_handler+0x179/0x1a0
[   52.991943]        proc_sys_call_handler+0x418/0x610
[   52.992197]        proc_sys_write+0x2c/0x40
[   52.992409]        vfs_write+0xc59/0x1140
[   52.992613]        ksys_write+0x14f/0x280
[   52.992817]        __x64_sys_write+0x7b/0xc0
[   52.993034]        x64_sys_call+0x16b3/0x2140
[   52.993259]        do_syscall_64+0x6d/0x140
[   52.993470]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   52.993748]
[   52.993748] other info that might help us debug this:
[   52.993748]
[   52.994133] Chain exists of:
[   52.994133]   &lp->qli_lock --> &bch->bc_lock --> &l->lock
[   52.994133]
[   52.994613]  Possible unsafe locking scenario:
[   52.994613]
[   52.994901]        CPU0                    CPU1
[   52.995125]        ----                    ----
[   52.995348]   lock(&l->lock);
[   52.995505]                                lock(&bch->bc_lock);
[   52.995797]                                lock(&l->lock);
[   52.996067]   lock(&lp->qli_lock);
[   52.996242]
[   52.996242]  *** DEADLOCK ***
[   52.996242]
[   52.996530] 3 locks held by repro/673:
[   52.996719]  #0: ffff888012734408 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x14f/0x280
[   52.997127]  #1: ffff888015681b30 (&l->lock){+.+.}-{3:3}, at: __list_lru_walk_one+0x409/0x810
[   52.997559]  #2: ffff88802366b5f8 (&dqp->q_qlock){+.+.}-{4:4}, at: xfs_qm_dquot_isolate+0x8f/0x12f0
[   52.998011]
[   52.998011] stack backtrace:
[   52.998230] CPU: 1 UID: 0 PID: 673 Comm: repro Not tainted 6.13.0-rc3-78d4f34e2115+ #1
[   52.998617] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qem4
[   52.999165] Call Trace:
[   52.999296]  <TASK>
[   52.999411]  dump_stack_lvl+0xea/0x150
[   52.999609]  dump_stack+0x19/0x20
[   52.999786]  print_circular_bug+0x47f/0x750
[   53.000002]  check_noncircular+0x2f4/0x3e0
[   53.000213]  ? __pfx_check_noncircular+0x10/0x10
[   53.000453]  ? __pfx_lockdep_lock+0x10/0x10
[   53.000668]  __lock_acquire+0x2ff8/0x5d60
[   53.000881]  ? __pfx___lock_acquire+0x10/0x10
[   53.001105]  ? __kasan_check_read+0x15/0x20
[   53.001324]  ? mark_lock.part.0+0xf3/0x17b0
[   53.001539]  ? __this_cpu_preempt_check+0x21/0x30
[   53.001779]  ? lock_acquire.part.0+0x152/0x390
[   53.002008]  lock_acquire.part.0+0x142/0x390
[   53.002228]  ? xfs_dquot_detach_buf+0x2d/0x190
[   53.002456]  ? __pfx_lock_acquire.part.0+0x10/0x10
[   53.002702]  ? debug_smp_processor_id+0x20/0x30
[   53.002933]  ? rcu_is_watching+0x19/0xc0
[   53.003141]  ? trace_lock_acquire+0x13d/0x1b0
[   53.003366]  lock_acquire+0x80/0xb0
[   53.003549]  ? xfs_dquot_detach_buf+0x2d/0x190
[   53.003776]  _raw_spin_lock+0x38/0x50
[   53.003965]  ? xfs_dquot_detach_buf+0x2d/0x190
[   53.004190]  xfs_dquot_detach_buf+0x2d/0x190
[   53.004408]  xfs_qm_dquot_isolate+0x1c6/0x12f0
[   53.004638]  ? __pfx_xfs_qm_dquot_isolate+0x10/0x10
[   53.004886]  ? lock_acquire+0x80/0xb0
[   53.005080]  __list_lru_walk_one+0x31a/0x810
[   53.005306]  ? __pfx_xfs_qm_dquot_isolate+0x10/0x10
[   53.005555]  ? __pfx_xfs_qm_dquot_isolate+0x10/0x10
[   53.005803]  list_lru_walk_one+0x4c/0x60
[   53.006006]  xfs_qm_shrink_scan+0x1d0/0x380
[   53.006221]  ? __pfx_xfs_qm_shrink_scan+0x10/0x10
[   53.006465]  do_shrink_slab+0x410/0x10a0
[   53.006673]  shrink_slab+0x349/0x1370
[   53.006864]  ? __this_cpu_preempt_check+0x21/0x30
[   53.007103]  ? lock_release+0x441/0x870
[   53.007303]  ? __pfx_lock_release+0x10/0x10
[   53.007517]  ? shrink_slab+0x161/0x1370
[   53.007717]  ? __pfx_shrink_slab+0x10/0x10
[   53.007931]  ? mem_cgroup_iter+0x3a6/0x670
[   53.008147]  drop_slab+0xf5/0x1f0
[   53.008324]  drop_caches_sysctl_handler+0x179/0x1a0
[   53.008575]  proc_sys_call_handler+0x418/0x610
[   53.008803]  ? __pfx_proc_sys_call_handler+0x10/0x10
[   53.009054]  ? rcu_is_watching+0x19/0xc0
[   53.009263]  ? __this_cpu_preempt_check+0x21/0x30
[   53.009504]  proc_sys_write+0x2c/0x40
[   53.009694]  vfs_write+0xc59/0x1140
[   53.009876]  ? __pfx_proc_sys_write+0x10/0x10
[   53.010101]  ? __pfx_vfs_write+0x10/0x10
[   53.010307]  ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[   53.010583]  ksys_write+0x14f/0x280
[   53.010765]  ? __pfx_ksys_write+0x10/0x10
[   53.010971]  ? __audit_syscall_entry+0x39c/0x500
[   53.011211]  __x64_sys_write+0x7b/0xc0
[   53.011404]  ? syscall_trace_enter+0x14f/0x280
[   53.011633]  x64_sys_call+0x16b3/0x2140
[   53.011831]  do_syscall_64+0x6d/0x140
[   53.012020]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   53.012275] RIP: 0033:0x7f56d423ee5d
[   53.012463] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 8
[   53.013348] RSP: 002b:00007ffcc7ab57f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[   53.013723] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f56d423ee5d
[   53.014070] RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000004
[   53.014416] RBP: 00007ffcc7ab5810 R08: 00007ffcc7ab5810 R09: 00007ffcc7ab5810
[   53.014763] R10: 002c646975756f6e R11: 0000000000000202 R12: 00007ffcc7ab5968
[   53.015110] R13: 0000000000402d04 R14: 000000000041ae08 R15: 00007f56d45aa000
[   53.015463]  </TASK>
"

Regards,
Yi Lai

---

If you don't need the following environment to reproduce the problem or if you
already have one reproduced environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
  // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
  // You could change the bzImage_xxx as you want
  // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install


^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2024-12-17  2:07 UTC | newest]

Thread overview: 69+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-26  1:18 [PATCHBOMB] xfs/fstests: largeish pile of bug fixes Darrick J. Wong
2024-11-26  1:20 ` [PATCHSET v3] fstests: random fixes for v2024.11.17 Darrick J. Wong
2024-11-26  1:20   ` [PATCH 01/16] generic/757: fix various bugs in this test Darrick J. Wong
2024-11-28  7:56     ` Zorro Lang
2024-11-26  1:21   ` [PATCH 02/16] generic/757: convert to thinp Darrick J. Wong
2024-11-28  8:08     ` Zorro Lang
2024-11-26  1:21   ` [PATCH 03/16] xfs/113: fix failure to corrupt the entire directory Darrick J. Wong
2024-11-26  1:21   ` [PATCH 04/16] xfs/508: fix test for 64k blocksize Darrick J. Wong
2024-11-26  1:21   ` [PATCH 05/16] common/rc: capture dmesg when oom kills happen Darrick J. Wong
2024-11-26  1:22   ` [PATCH 06/16] generic/562: handle ENOSPC while cloning gracefully Darrick J. Wong
2024-11-26  4:55     ` Christoph Hellwig
2024-11-26  1:22   ` [PATCH 07/16] xfs/163: skip test if we can't shrink due to enospc issues Darrick J. Wong
2024-11-26  1:22   ` [PATCH 08/16] xfs/009: allow logically contiguous preallocations Darrick J. Wong
2024-11-26  1:22   ` [PATCH 09/16] generic/251: use sentinel files to kill the fstrim loop Darrick J. Wong
2024-11-26  1:23   ` [PATCH 10/16] generic/251: constrain runtime via time/load/soak factors Darrick J. Wong
2024-11-26  1:23   ` [PATCH 11/16] generic/251: don't copy the fsstress source code Darrick J. Wong
2024-11-26  1:23   ` [PATCH 12/16] common/rc: _scratch_mkfs_sized supports extra arguments Darrick J. Wong
2024-11-26  1:23   ` [PATCH 13/16] xfs/157: do not drop necessary mkfs options Darrick J. Wong
2024-11-26  1:24   ` [PATCH 14/16] generic/366: fix directio requirements checking Darrick J. Wong
2024-11-26  1:24   ` [PATCH 15/16] generic/454: actually set attr value for llamapirate subtest Darrick J. Wong
2024-11-26  4:56     ` Christoph Hellwig
2024-11-26  1:24   ` [PATCH 16/16] xfs/122: add tests for commitrange structures Darrick J. Wong
2024-11-26  4:57     ` Christoph Hellwig
2024-11-26 20:27   ` [PATCH 17/16] generic/459: prevent collisions between test VMs backed by a shared disk pool Darrick J. Wong
2024-11-27  5:43     ` Christoph Hellwig
2024-11-27 16:35       ` Darrick J. Wong
2024-11-26  1:20 ` [PATCHSET] xfs: bug fixes for 6.13 Darrick J. Wong
2024-11-26  1:24   ` [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage Darrick J. Wong
2024-11-26  1:25   ` [PATCH 02/21] xfs: metapath scrubber should use the already loaded inodes Darrick J. Wong
2024-11-26  1:25   ` [PATCH 03/21] xfs: keep quota directory inode loaded Darrick J. Wong
2024-11-26  1:25   ` [PATCH 04/21] xfs: return a 64-bit block count from xfs_btree_count_blocks Darrick J. Wong
2024-11-26  1:26   ` [PATCH 05/21] xfs: don't drop errno values when we fail to ficlone the entire range Darrick J. Wong
2024-11-26  1:26   ` [PATCH 06/21] xfs: separate healthy clearing mask during repair Darrick J. Wong
2024-11-26  1:26   ` [PATCH 07/21] xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink Darrick J. Wong
2024-11-26  1:26   ` [PATCH 08/21] xfs: mark metadir repair tempfiles with IRECOVERY Darrick J. Wong
2024-11-26  1:27   ` [PATCH 09/21] xfs: fix null bno_hint handling in xfs_rtallocate_rtg Darrick J. Wong
2024-11-26  1:27   ` [PATCH 10/21] xfs: fix error bailout in xfs_rtginode_create Darrick J. Wong
2024-11-26  1:27   ` [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block Darrick J. Wong
2024-11-26  5:01     ` Christoph Hellwig
2024-11-26  1:27   ` [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved Darrick J. Wong
2024-11-26  5:01     ` Christoph Hellwig
2024-11-26  1:28   ` [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir Darrick J. Wong
2024-11-26  5:03     ` Christoph Hellwig
2024-11-26  1:28   ` [PATCH 14/21] xfs: only run precommits once per transaction object Darrick J. Wong
2024-11-26  5:09     ` Christoph Hellwig
2024-11-26  1:28   ` [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit Darrick J. Wong
2024-11-26  5:11     ` Christoph Hellwig
2024-11-26  5:11       ` Christoph Hellwig
2024-11-26 18:20         ` Darrick J. Wong
2024-11-27  5:44           ` Christoph Hellwig
2024-11-26  1:28   ` [PATCH 16/21] xfs: don't lose solo superblock counter update transactions Darrick J. Wong
2024-11-26  5:14     ` Christoph Hellwig
2024-11-26 18:23       ` Darrick J. Wong
2024-11-26  1:29   ` [PATCH 17/21] xfs: don't lose solo dquot " Darrick J. Wong
2024-11-26  5:18     ` Christoph Hellwig
2024-11-26 18:23       ` Darrick J. Wong
2024-11-26  1:29   ` [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush Darrick J. Wong
2024-11-26  5:27     ` Christoph Hellwig
2024-11-26  1:29   ` [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done} Darrick J. Wong
2024-11-26  5:28     ` Christoph Hellwig
2024-11-26 18:25       ` Darrick J. Wong
2024-11-26  1:29   ` [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer Darrick J. Wong
2024-11-26  5:42     ` Christoph Hellwig
2024-11-26  1:30   ` [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers Darrick J. Wong
2024-11-26  5:42     ` Christoph Hellwig
2024-12-17  2:06     ` Lai, Yi
2024-11-26 20:26   ` [PATCH 22/21] xfs: fix sb_spino_align checks for large fsblock sizes Darrick J. Wong
2024-11-29  8:20     ` Christoph Hellwig
2024-12-07  0:41     ` Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox