public inbox for fstests@vger.kernel.org
* [RFC 00/12] Add more tests for multi fs block atomic writes
@ 2025-06-11  9:34 Ojaswin Mujoo
  2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
                   ` (11 more replies)
  0 siblings, 12 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

These are the tests we have been using to verify that filesystems do not
tear multi fs block atomic writes. In fact, some of them, such as
generic/772, helped us catch and fix issues in ext4's early
implementations of multi fs block atomic writes, so we feel these
tests are useful to have in xfstests.

We have tested these with scsi_debug as well as a real NVMe device
supporting multi fs block atomic writes.

Thoughts and suggestions are welcome!

(This is rebased over Darrick's atomic write tests:
https://lore.kernel.org/fstests/20250605040122.63131-1-catherine.hoang@oracle.com/T/#t)

Ojaswin Mujoo (7):
  common/rc: Add a helper to run fsx on a given file
  ltp/fsx.c: Add atomic writes support to fsx
  generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  generic/771: Stress fsx with atomic writes enabled
  generic/772: Add sudden shutdown tests for multi block atomic writes
  ext4/063: Atomic write test for extent split across leaf nodes
  ext4/064: Add atomic write tests for journal credit calculation

Ritesh Harjani (IBM) (5):
  common/preamble: Fix fsx for ext4 with bigalloc
  generic/767: Add atomic write test using fio crc check verifier
  generic/769: Add atomic write test using fio verify on file mixed
    mappings
  ext4/061: Atomic writes stress test for bigalloc using fio crc
    verifier
  ext4/062: Atomic writes test for bigalloc using fio crc verifier on
    multiple files

 common/preamble       |  16 ++
 common/rc             |  21 ++-
 ltp/fsx.c             | 105 +++++++++++-
 tests/ext4/061        | 107 +++++++++++++
 tests/ext4/061.out    |   2 +
 tests/ext4/062        | 131 +++++++++++++++
 tests/ext4/062.out    |   2 +
 tests/ext4/063        | 125 +++++++++++++++
 tests/ext4/063.out    |   2 +
 tests/ext4/064        |  75 +++++++++
 tests/ext4/064.out    |   2 +
 tests/generic/767     |  84 ++++++++++
 tests/generic/767.out |   2 +
 tests/generic/769     | 101 ++++++++++++
 tests/generic/769.out |   2 +
 tests/generic/770     | 161 +++++++++++++++++++
 tests/generic/770.out |   2 +
 tests/generic/771     |  49 ++++++
 tests/generic/771.out |   2 +
 tests/generic/772     | 360 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/772.out |   2 +
 21 files changed, 1345 insertions(+), 8 deletions(-)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out
 create mode 100755 tests/ext4/064
 create mode 100644 tests/ext4/064.out
 create mode 100755 tests/generic/767
 create mode 100644 tests/generic/767.out
 create mode 100755 tests/generic/769
 create mode 100644 tests/generic/769.out
 create mode 100755 tests/generic/770
 create mode 100644 tests/generic/770.out
 create mode 100755 tests/generic/771
 create mode 100644 tests/generic/771.out
 create mode 100755 tests/generic/772
 create mode 100644 tests/generic/772.out

-- 
2.49.0



* [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 14:30   ` Darrick J. Wong
  2025-06-18 19:13   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 02/12] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

Insert range and collapse range work with bigalloc only when the range
is cluster-size aligned, which fsx does not take care of. To work around
this, disable insert range and collapse range on ext4 if bigalloc is
enabled.

This is achieved by defining a new function _setup_fs_options
which can serve as a mechanism to apply FS-wide options to
the tests.
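
For illustration only, here is a minimal standalone sketch of the same
check, so the behaviour can be tried outside the harness. The function
name pick_fsx_avoid is hypothetical and not part of fstests; it assumes
the filesystem type and mkfs options are passed as arguments rather than
read from $FSTYP and $MKFS_OPTIONS.

```bash
# Hypothetical helper mirroring the check in _setup_fs_options: print the
# fsx flags to avoid for a given filesystem type and mkfs option string.
pick_fsx_avoid() {
	fstyp=$1
	mkfs_opts=$2
	case "$fstyp" in
	ext4)
		case "$mkfs_opts" in
		# -I disables insert range, -C disables collapse range in fsx
		*bigalloc*) echo "-I -C" ;;
		esac
		;;
	esac
}

pick_fsx_avoid ext4 "-O bigalloc -C 16k"   # prints "-I -C"
pick_fsx_avoid xfs "-b size=4k"            # prints nothing
```

Note that a plain substring match like this one would also fire on an
option string such as "-O ^bigalloc", so a stricter match may be worth
considering.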

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/preamble | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/common/preamble b/common/preamble
index ba029a34..2bccff74 100644
--- a/common/preamble
+++ b/common/preamble
@@ -24,6 +24,20 @@ _register_cleanup()
 	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
 }
 
+# Set up FS-specific options that apply to every test run
+_setup_fs_options() {
+	case "$FSTYP" in
+	"ext4")
+		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
+			export FSX_AVOID="-I -C"
+		fi
+		;;
+	# Add other filesystem types here as needed
+	*)
+		;;
+	esac
+}
+
 # Prepare to run a fstest by initializing the required global variables to
 # their defaults, sourcing common functions, registering a cleanup function,
 # and removing the $seqres.full file.
@@ -55,4 +69,6 @@ _begin_fstest()
 	# remove previous $seqres.full before test
 	rm -f $seqres.full $seqres.hints
 
+	# set up filesystem options for a given test execution
+	_setup_fs_options
 }
-- 
2.49.0



* [RFC 02/12] common/rc: Add a helper to run fsx on a given file
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 14:31   ` Darrick J. Wong
  2025-06-11  9:34 ` [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

Currently _run_fsx is hardcoded to run on a file in $TEST_DIR.
Add a helper, _run_fsx_on_file, so that we can run fsx on any
given file, including one in $SCRATCH_MNT. Also refactor _run_fsx
to use this helper.

No functional change is intended in this patch.
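
As a rough sketch of the delegation pattern (not the fstests code
itself), the stand-ins below only print what would be executed. The
names run_on_file and run_default and the /tmp/junk path are
hypothetical.

```bash
# Hypothetical stand-ins for the fstests helpers; they only print what
# would be executed so the argument handling can be inspected.
run_on_file() {
	target=$1
	shift                      # remaining arguments are the fsx options
	echo "fsx $* $target"
}

run_default() {
	# Quoting "$@" keeps each option as a separate word.
	run_on_file /tmp/junk "$@"
}

run_default -N 100 -l 4096     # prints "fsx -N 100 -l 4096 /tmp/junk"
```

The old entry point keeps its behaviour by passing a fixed file as the
first argument, while new callers can pass any path.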

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/common/rc b/common/rc
index cfbe2a5f..a5d811a1 100644
--- a/common/rc
+++ b/common/rc
@@ -5115,13 +5115,22 @@ _require_hugepage_fsx()
 		_notrun "fsx binary does not support MADV_COLLAPSE"
 }
 
-_run_fsx()
+_run_fsx_on_file()
 {
+	local testfile=$1
+	shift
+
+	if ! [ -f $testfile ]
+	then
+		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
+		touch $testfile
+	fi
+
 	echo "fsx $*"
 	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
-	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
+	set -- $FSX_PROG $args $FSX_AVOID $testfile
 	echo "$@" >>$seqres.full
-	rm -f $TEST_DIR/junk
+	rm -f $testfile
 	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
 	local res=${PIPESTATUS[0]}
 	if [ $res -ne 0 ]; then
@@ -5133,6 +5142,12 @@ _run_fsx()
 	return 0
 }
 
+_run_fsx()
+{
+	_run_fsx_on_file $TEST_DIR/junk $@
+	return $?
+}
+
 # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
 # the test, but if any other error occurs then exit the test.
 _run_hugepage_fsx() {
-- 
2.49.0



* [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
  2025-06-11  9:34 ` [RFC 02/12] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 14:35   ` Darrick J. Wong
  2025-06-11  9:34 ` [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier Ojaswin Mujoo
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

Implement atomic write support to help fuzz atomic writes
with fsx.
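
To make the size and offset handling easier to follow, here is a small
shell sketch of what dowrite() does to a request when RWF_ATOMIC is set:
clamp the length to [awu_min, awu_max], round it down to a power of two,
then align the offset to that length. The function names are
hypothetical, and the sketch assumes awu_min is at least 1.

```bash
# Illustrative only: shape an (offset, size) pair the way the patch's
# dowrite() does for RWF_ATOMIC. Function names are hypothetical.
rounddown_pow2() {
	n=$1
	p=1
	# Double p while the next power of two still fits in n.
	while [ $((p * 2)) -le "$n" ]; do
		p=$((p * 2))
	done
	echo "$p"
}

shape_atomic_write() {
	offset=$1 size=$2 awu_min=$3 awu_max=$4
	# The length must lie between awu_min and awu_max.
	if [ "$size" -lt "$awu_min" ]; then size=$awu_min; fi
	if [ "$size" -gt "$awu_max" ]; then size=$awu_max; fi
	# Atomic writes need power-of-2 sizes and naturally aligned offsets.
	size=$(rounddown_pow2 "$size")
	offset=$((offset - offset % size))
	echo "$offset $size"
}

shape_atomic_write 70000 20000 4096 65536   # prints "65536 16384"
```

The check against maxfilelen that the patch performs afterwards is left
out of this sketch.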

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 ltp/fsx.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 100 insertions(+), 5 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 163b9453..9353fe6f 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -40,6 +40,7 @@
 #include <liburing.h>
 #endif
 #include <sys/syscall.h>
+#include "statx.h"
 
 #ifndef MAP_FILE
 # define MAP_FILE 0
@@ -49,6 +50,10 @@
 #define RWF_DONTCACHE	0x80
 #endif
 
+#ifndef RWF_ATOMIC
+#define RWF_ATOMIC	0x40
+#endif
+
 #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
 
 /* Operation flags (bitmask) */
@@ -110,6 +115,7 @@ enum {
 	OP_READ_DONTCACHE,
 	OP_WRITE,
 	OP_WRITE_DONTCACHE,
+	OP_WRITE_ATOMIC,
 	OP_MAPREAD,
 	OP_MAPWRITE,
 	OP_MAX_LITE,
@@ -200,6 +206,11 @@ int	uring = 0;
 int	mark_nr = 0;
 int	dontcache_io = 1;
 int	hugepages = 0;                  /* -h flag */
+int	do_atomic_writes = 0;		/* -a flag */
+
+/* Used for atomic writes */
+int awu_min = 0;
+int awu_max = 0;
 
 /* Stores info needed to periodically collapse hugepages */
 struct hugepages_collapse_info {
@@ -288,6 +299,7 @@ static const char *op_names[] = {
 	[OP_READ_DONTCACHE] = "read_dontcache",
 	[OP_WRITE] = "write",
 	[OP_WRITE_DONTCACHE] = "write_dontcache",
+	[OP_WRITE_ATOMIC] = "write_atomic",
 	[OP_MAPREAD] = "mapread",
 	[OP_MAPWRITE] = "mapwrite",
 	[OP_TRUNCATE] = "truncate",
@@ -422,6 +434,7 @@ logdump(void)
 				prt("\t***RRRR***");
 			break;
 		case OP_WRITE_DONTCACHE:
+		case OP_WRITE_ATOMIC:
 		case OP_WRITE:
 			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
 			    lp->args[0], lp->args[0] + lp->args[1] - 1,
@@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
 	file_size = offset + size;
 }
 
+static int is_power_of_2(unsigned n) {
+	return ((n & (n - 1)) == 0);
+}
+
+/*
+ * Round down n to nearest power of 2.
+ * If n is already a power of 2, return n;
+ */
+static int rounddown_pow_of_2(int n) {
+	int i = 0;
+
+	if (is_power_of_2(n))
+		return n;
+
+	for (; (1 << i) < n; i++);
+
+	return 1 << (i - 1);
+}
+
 void
 dowrite(unsigned offset, unsigned size, int flags)
 {
@@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
 	offset -= offset % writebdy;
 	if (o_direct)
 		size -= size % writebdy;
+	if (flags & RWF_ATOMIC) {
+		/* atomic write len must be between awu_min and awu_max */
+		if (size < awu_min)
+			size = awu_min;
+		if (size > awu_max)
+			size = awu_max;
+
+		/* atomic writes need power-of-2 sizes */
+		size = rounddown_pow_of_2(size);
+
+		/* atomic writes need naturally aligned offsets */
+		offset -= offset % size;
+
+		/* Skip the write if we are crossing max filesize */
+		if ((offset + size) > maxfilelen) {
+			if (!quiet && testcalls > simulatedopcount)
+				prt("skipping atomic write past maxfilelen\n");
+			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+			return;
+		}
+	}
 	if (size == 0) {
 		if (!quiet && testcalls > simulatedopcount && !o_direct)
 			prt("skipping zero size write\n");
@@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
 		return;
 	}
 
-	log4(OP_WRITE, offset, size, FL_NONE);
+	if (flags & RWF_ATOMIC)
+		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
+	else
+		log4(OP_WRITE, offset, size, FL_NONE);
 
 	gendata(original_buf, good_buf, offset, size);
 	if (offset + size > file_size) {
@@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
 		       (monitorstart == -1 ||
 			(offset + size > monitorstart &&
 			(monitorend == -1 || offset <= monitorend))))))
-		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
-		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
+		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
+		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
+		    (flags & RWF_ATOMIC) != 0);
 	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
 	if (iret != size) {
 		if (iret == -1)
@@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
 }
 #endif
 
+int test_atomic_writes(void) {
+	int ret;
+	struct statx stx;
+
+	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
+	if (ret < 0) {
+		fprintf(stderr, "main: Statx failed with %d."
+			" Failed to determine atomic write limits, "
+			"disabling!\n", ret);
+		return 0;
+	}
+
+	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
+	    stx.stx_atomic_write_unit_min > 0) {
+		awu_min = stx.stx_atomic_write_unit_min;
+		awu_max = stx.stx_atomic_write_unit_max;
+		return 1;
+	}
+
+	fprintf(stderr, "main: IO Stack does not support "
+			"atomic writes, disabling!\n");
+	return 0;
+}
+
 #ifdef HAVE_COPY_FILE_RANGE
 int
 test_copy_range(void)
@@ -2385,6 +2466,14 @@ have_op:
 			dowrite(offset, size, 0);
 		break;
 
+	case OP_WRITE_ATOMIC:
+		TRIM_OFF_LEN(offset, size, maxfilelen);
+		if (do_atomic_writes)
+			dowrite(offset, size, RWF_ATOMIC);
+		else
+			dowrite(offset, size, 0);
+		break;
+
 	case OP_MAPREAD:
 		TRIM_OFF_LEN(offset, size, file_size);
 		domapread(offset, size);
@@ -2511,13 +2600,14 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
+		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
 	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
 	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
 	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
 	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
 	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
 	   ... fname\n\
+	-a: enable atomic writes if IO stack supports it\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
@@ -3059,9 +3149,12 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
+				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
+		case 'a':
+			do_atomic_writes = 1;
+			break;
 		case 'b':
 			simulatedopcount = getnum(optarg, &endp);
 			if (!quiet)
@@ -3475,6 +3568,8 @@ main(int argc, char **argv)
 		exchange_range_calls = test_exchange_range();
 	if (dontcache_io)
 		dontcache_io = test_dontcache_io();
+	if (do_atomic_writes)
+		do_atomic_writes = test_atomic_writes();
 
 	while (keep_running())
 		if (!test())
-- 
2.49.0



* [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (2 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 14:42   ` Darrick J. Wong
  2025-06-18 19:34   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

This adds an atomic write test using fio, based on its crc check verifier.
fio adds a crc to each data block. If the underlying device supports atomic
writes, it is guaranteed that we will never see a mix of data from two
threads writing to the same physical block.
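
As a rough illustration of how the test chooses the fio block size, the
sketch below takes half of awu_max, never goes below awu_min, and caps
the result at 64k. The function name pick_bs is hypothetical; the test
itself uses its own max() and min() helpers.

```bash
# Hypothetical helper: choose the fio block size from the atomic write
# limits, as max(awu_min, awu_max / 2) capped at 64k.
pick_bs() {
	awu_min=$1 awu_max=$2
	bs=$((awu_max / 2))
	if [ "$bs" -lt "$awu_min" ]; then bs=$awu_min; fi
	if [ "$bs" -gt 65536 ]; then bs=65536; fi
	echo "$bs"
}

pick_bs 4096 16384     # prints 8192
pick_bs 4096 4096      # prints 4096
pick_bs 4096 4194304   # prints 65536
```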

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
 tests/generic/767.out |  2 ++
 2 files changed, 86 insertions(+)
 create mode 100755 tests/generic/767
 create mode 100644 tests/generic/767.out

diff --git a/tests/generic/767 b/tests/generic/767
new file mode 100755
index 00000000..4f80e7b6
--- /dev/null
+++ b/tests/generic/767
@@ -0,0 +1,84 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 767
+#
+# Validate FS atomic write using fio crc check verifier.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic
+_require_odirect
+_require_aio
+
+function max()
+{
+	if (( $1 > $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+function min()
+{
+	if (( $1 > $2 )); then
+		echo "$2"
+	else
+		echo "$1"
+	fi
+}
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+blocksize=$(max "$awu_min_write" "$((awu_max_write/2))")
+
+# XFS can have high awu_max_write due to software fallback. Cap it at 64k
+blocksize=$(min "$blocksize" "65536")
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100 * 1024 * 1024))
+
+cat >$fio_config <<EOF
+[aio-dio-aw-verify]
+direct=1
+ioengine=libaio
+rw=randwrite
+bs=$blocksize
+fallocate=native
+filename=$SCRATCH_MNT/test-file
+size=$SIZE
+iodepth=$FIO_LOAD
+numjobs=$FIO_LOAD
+group_reporting=1
+verify_state_save=0
+verify=crc32c
+verify_fatal=1
+verify_dump=0
+verify_backlog=1024
+verify_async=4
+verify_write_sequence=0
+atomic=1
+EOF
+
+_require_fio $fio_config
+
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --output=$fio_out
+cat $fio_out >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/767.out b/tests/generic/767.out
new file mode 100644
index 00000000..2bf7f989
--- /dev/null
+++ b/tests/generic/767.out
@@ -0,0 +1,2 @@
+QA output created by 767
+Silence is golden
-- 
2.49.0



* [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (3 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 15:35   ` Darrick J. Wong
  2025-06-11  9:34 ` [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

This test uses fio to first create a file with mixed mappings. Then it
does atomic writes using aio dio with parallel jobs to the same file
with mixed mappings. This forces the filesystem allocator to allocate
extents over mixed mapping regions to stress FS block allocators.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/769     | 101 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/769.out |   2 +
 2 files changed, 103 insertions(+)
 create mode 100755 tests/generic/769
 create mode 100644 tests/generic/769.out

diff --git a/tests/generic/769 b/tests/generic/769
new file mode 100755
index 00000000..469d6344
--- /dev/null
+++ b/tests/generic/769
@@ -0,0 +1,101 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 769
+#
+# Validate FS atomic write using fio crc check verifier on mixed mappings
+# of a file.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_odirect
+_require_aio
+
+function max()
+{
+	if (( $1 > $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+aw_bsize=$(max "$awu_min_write" "$((awu_max_write/4))")
+
+fsbsize=$(_get_block_size $SCRATCH_MNT)
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((128 * 1024 * 1024))
+
+cat >$fio_config <<EOF
+[global]
+ioengine=libaio
+fallocate=none
+filename=$SCRATCH_MNT/test-file
+filesize=$SIZE
+bs=$fsbsize
+direct=1
+verify=0
+group_reporting=1
+
+# Create written extents
+[written_blocks]
+stonewall
+ioengine=libaio
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+
+# Create unwritten extents
+[unwritten_blocks]
+stonewall
+ioengine=falloc
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+
+# atomic write to mixed mappings of written/unwritten/holes
+[atomic_write_aio_dio_job]
+stonewall
+direct=1
+ioengine=libaio
+rw=randwrite
+bs=$aw_bsize
+iodepth=$FIO_LOAD
+numjobs=$FIO_LOAD
+size=$SIZE
+random_generator=lfsr
+verify_state_save=0
+verify=crc32c
+verify_fatal=1
+verify_dump=0
+verify_backlog=1024
+verify_async=4
+verify_write_sequence=0
+atomic=1
+EOF
+
+_require_fio $fio_config
+
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --output=$fio_out
+cat $fio_out >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/769.out b/tests/generic/769.out
new file mode 100644
index 00000000..1512b439
--- /dev/null
+++ b/tests/generic/769.out
@@ -0,0 +1,2 @@
+QA output created by 769
+Silence is golden
-- 
2.49.0



* [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (4 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 15:36   ` Darrick J. Wong
  2025-06-18 20:17   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 07/12] generic/771: Stress fsx with atomic writes enabled Ojaswin Mujoo
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

This adds various atomic write multi-fsblock stress tests
with mixed mappings and O_SYNC, to ensure that data and metadata
are atomically persisted even if there is a shutdown.
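
To show what a "mixed mapping" looks like, here is a deterministic
sketch of the per-block choice between a written block (W), a hole (H)
and an unwritten block (U). The test picks each block with $RANDOM; this
hypothetical helper, mapping_from, takes the numbers as arguments
instead so that the result is reproducible.

```bash
# Hypothetical helper: turn a sequence of numbers into a W/H/U mapping
# string, one letter per filesystem block in the awu_max-sized region.
mapping_from() {
	mapping=""
	for r in "$@"; do
		case $((r % 3)) in
		0) mapping="${mapping}W" ;;   # written block (pwrite)
		1) mapping="${mapping}H" ;;   # hole (nothing to do)
		2) mapping="${mapping}U" ;;   # unwritten block (falloc)
		esac
	done
	echo "$mapping"
}

# With awu_max=16384 and blksz=4096 there are 16384 / 4096 = 4 blocks.
mapping_from 3 7 2 9   # prints "WHUW"
```

A single atomic write of awu_max bytes is then issued over the whole
region, whatever the mapping turned out to be.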

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/770.out |   2 +
 2 files changed, 163 insertions(+)
 create mode 100755 tests/generic/770
 create mode 100644 tests/generic/770.out

diff --git a/tests/generic/770 b/tests/generic/770
new file mode 100755
index 00000000..2b98b3b3
--- /dev/null
+++ b/tests/generic/770
@@ -0,0 +1,161 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 770
+#
+# Atomic write multi-fsblock data integrity tests with mixed mappings
+# and O_SYNC
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+check_data_integrity() {
+	actual=$(_hexdump $testfile)
+	if [[ "$expected" != "$actual" ]]
+	then
+		echo "Integrity check failed"
+		echo "Integrity check failed" >> $seqres.full
+		echo "# Expected file contents:" >> $seqres.full
+		echo "$expected" >> $seqres.full
+		echo "# Actual file contents:" >> $seqres.full
+		echo "$actual" >> $seqres.full
+	fi
+}
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+
+# Create an expected pattern to compare with
+$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
+expected=$(_hexdump $testfile)
+echo "# Expected file contents:" >> $seqres.full
+echo "$expected" >> $seqres.full
+
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations):" >> $seqres.full
+# Calculate how many blocks (e.g. 4K) fit in awu_max (e.g. 64K)
+num_blocks=$((awu_max / blksz))
+echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
+
+operations=("W" "H" "U")
+
+# Run 10 iterations of the test
+for ((iteration=1; iteration<=10; iteration++)); do
+	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
+
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+	off=0
+	mapping=""
+
+	for ((i=0; i<num_blocks; i++)); do
+		index=$((RANDOM % ${#operations[@]}))
+		map="${operations[$index]}"
+		mapping="${mapping}${map}"
+
+		case "$map" in
+			"W")
+				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
+				;;
+			"H")
+				# No operation needed for hole
+				;;
+			"U")
+				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
+				;;
+		esac
+		off=$((off + blksz))
+	done
+
+	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
+
+	sync $testfile
+
+	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
+	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
+				  grep wrote | awk -F'[/ ]' '{print $2}')
+
+	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
+	check_data_integrity
+	echo "Iteration $iteration completed: OK" >> $seqres.full
+	echo >> $seqres.full
+done
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
+
+echo >> $seqres.full
+echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
+bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
+                grep wrote | awk -F'[/ ]' '{print $2}')
+test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
+_scratch_shutdown -v >> $seqres.full
+_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-2"
+check_data_integrity
+echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
+
+echo >> $seqres.full
+echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations):" >> $seqres.full
+num_blocks=$((awu_max / blksz))
+echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
+
+operations=("W" "H" "U")
+
+for ((iteration=1; iteration<=10; iteration++)); do
+	echo "=== Mixed Mapping Shutdown Test Iteration $iteration ===" >> $seqres.full
+
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+	off=0
+	mapping=""
+
+	for ((i=0; i<num_blocks; i++)); do
+		index=$((RANDOM % ${#operations[@]}))
+		map="${operations[$index]}"
+		mapping="${mapping}${map}"
+
+		case "$map" in
+			"W")
+				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
+				;;
+			"H")
+				# No operation needed for hole
+				;;
+			"U")
+				$XFS_IO_PROG -c "falloc $off $blksz" $testfile > /dev/null
+				;;
+		esac
+		off=$((off + blksz))
+	done
+
+	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
+
+	sync $testfile
+
+	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
+	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
+				  grep wrote | awk -F'[/ ]' '{print $2}')
+
+	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
+
+	echo "Shutting down filesystem" >> $seqres.full
+	_scratch_shutdown -v >> $seqres.full
+	_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
+	check_data_integrity
+	echo "Iteration $iteration completed: OK" >> $seqres.full
+	echo >> $seqres.full
+done
+echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations): OK" >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
+
diff --git a/tests/generic/770.out b/tests/generic/770.out
new file mode 100644
index 00000000..17994ed5
--- /dev/null
+++ b/tests/generic/770.out
@@ -0,0 +1,2 @@
+QA output created by 770
+Silence is golden
-- 
2.49.0



* [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (5 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 14:45   ` Darrick J. Wong
  2025-06-18 20:27   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

Stress a file with atomic writes to ensure we exercise codepaths
where different FS operations are mixed with atomic writes.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
 tests/generic/771.out |  2 ++
 2 files changed, 51 insertions(+)
 create mode 100755 tests/generic/771
 create mode 100644 tests/generic/771.out

diff --git a/tests/generic/771 b/tests/generic/771
new file mode 100755
index 00000000..690dfa0a
--- /dev/null
+++ b/tests/generic/771
@@ -0,0 +1,49 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 771
+#
+# fuzz fsx with atomic writes
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest rw auto quick atomicwrites
+
+# Import common functions.
+. ./common/filter
+
+_require_test
+_require_odirect
+_require_scratch_write_atomic
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount  >> $seqres.full 2>&1
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+
+# fsx usage:
+#
+# -N numops: total # operations to do
+# -l flen: the upper bound on file size
+# -o oplen: the upper bound on operation size (64k default)
+# -w writebdy: $psize would make writes page aligned (on i386)
+# -Z: O_DIRECT (use -R, -W, -r and -w too)
+# -W: mapped write operations DISabled
+
+_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
+status=$?
+
+if [[ "$status" != "0" ]]
+then
+	echo "Something went wrong, check $seqres.full"
+fi
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/771.out b/tests/generic/771.out
new file mode 100644
index 00000000..c2345c7b
--- /dev/null
+++ b/tests/generic/771.out
@@ -0,0 +1,2 @@
+QA output created by 771
+Silence is golden
-- 
2.49.0



* [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (6 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 07/12] generic/771: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-11 15:38   ` Darrick J. Wong
                     ` (2 more replies)
  2025-06-11  9:34 ` [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
                   ` (3 subsequent siblings)
  11 siblings, 3 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

This test is intended to ensure that multi-block atomic writes
maintain their atomicity guarantees across sudden FS shutdowns.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
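The check this test relies on can be summarised as: after a shutdown, every
awu_max-sized chunk near the last recorded write must be entirely old data or
entirely new data. Below is a minimal sketch of the two pieces of arithmetic
involved. The function names classify_chunk and verify_window are
hypothetical and only mirror the logic of the test, they are not fstests
helpers.

```bash
# classify_chunk ACTUAL OLD NEW
# Prints "new", "old" or "torn" for one chunk of hex-dumped data.
classify_chunk() {
	local actual=$1 old=$2 new=$3

	if [ "$actual" = "$new" ]; then
		echo "new"
	elif [ "$actual" = "$old" ]; then
		echo "old"
	else
		echo "torn"
		return 1
	fi
}

# verify_window LAST_OFFSET AWU_MAX FILESIZE
# Prints the "start end" byte range to verify: five units on either side of
# the last recorded write, clamped to [0, FILESIZE]. Numeric comparisons
# (-lt/-gt) are used so that negative values are handled as numbers.
verify_window() {
	local last=$1 awu=$2 size=$3
	local start=$(( last - awu * 5 ))
	local end=$(( last + awu * 5 ))

	[ "$start" -lt 0 ] && start=0
	[ "$end" -gt "$size" ] && end=$size
	echo "$start $end"
}

classify_chunk "6161" "0000" "6161"
classify_chunk "6100" "0000" "6161"
verify_window 16384 16384 50331648
```

A chunk that is half 0x61 and half old data is reported as "torn", which is
the failure the test is looking for.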
 tests/generic/772     | 360 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/772.out |   2 +
 2 files changed, 362 insertions(+)
 create mode 100755 tests/generic/772
 create mode 100644 tests/generic/772.out

diff --git a/tests/generic/772 b/tests/generic/772
new file mode 100755
index 00000000..6af7e74c
--- /dev/null
+++ b/tests/generic/772
@@ -0,0 +1,360 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 772
+#
+# Test multi block atomic writes with sudden FS shutdowns to ensure
+# the FS is not tearing the write operation
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+echo "Awu max: $awu_max" >> $seqres.full
+
+num_blocks=$((awu_max / blksz))
+filesize=$(($blksz * 12 * 1024 ))
+
+atomic_write_loop() {
+	local off=0
+	local size=$awu_max
+	for ((i=0; i<$((filesize / $size )); i++)); do
+		# Due to sudden shutdown this can produce errors so just redirect them
+		# to seqres.full
+		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
+		echo "Written to offset: $off" >> $tmp.aw
+		off=$((off + $size))
+	done
+}
+
+create_mixed_mappings() {
+	local file=$1
+	local size_bytes=$2
+
+	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
+	#Fill the file with alternate written and unwritten blocks
+	local off=0
+	local operations=("W" "U")
+
+	for ((i=0; i<$((size_bytes / blksz )); i++)); do
+		index=$(($i % ${#operations[@]}))
+		map="${operations[$index]}"
+
+		case "$map" in
+		    "W")
+			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
+			;;
+		    "U")
+			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
+			;;
+		esac
+		off=$((off + blksz))
+	done
+
+	sync $file
+}
+
+populate_expected_data() {
+	# create a dummy file with expected old data for different cases
+	create_mixed_mappings $testfile.exp_old_mixed $awu_max
+	expected_data_old_mixed=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mixed)
+
+	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
+	expected_data_old_zeroes=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_zeroes)
+
+	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
+	expected_data_old_mapped=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mapped)
+
+	# create a dummy file with expected new data
+	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
+	expected_data_new=$(xxd -s 0 -l $awu_max -p $testfile.exp_new)
+}
+
+verify_data_blocks() {
+	local verify_start=$1
+	local verify_end=$2
+	local expected_data_old="$3"
+	local expected_data_new="$4"
+
+	echo >> $seqres.full
+	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
+
+	# After an atomic write, for every chunk we ensure that the underlying
+	# data is either the old data or new data as writes shouldn't get torn.
+	local off=$verify_start
+	while [[ "$off" -lt "$verify_end" ]]
+	do
+		actual_data=$(xxd -s $off -l $awu_max -p $testfile)
+		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
+		then
+			echo "Checksum match failed at off: $off size: $awu_max"
+			echo "Expected contents: (Either of the 2 below):"
+			echo
+			echo "Expected old: "
+			echo "$expected_data_old"
+			echo
+			echo "Expected new: "
+			echo "$expected_data_new"
+			echo
+			echo "Actual contents: "
+			echo "$actual_data"
+
+			return 1
+		fi
+		echo -n "Check at offset $off succeeded! " >> $seqres.full
+		if [[ "$actual_data" == "$expected_data_new" ]]
+		then
+			echo "matched new" >> $seqres.full
+		elif [[ "$actual_data" == "$expected_data_old" ]]
+		then
+			echo "matched old" >> $seqres.full
+		fi
+		off=$(( off + awu_max ))
+	done
+
+	return 0
+}
+
+# test data integrity for file by shutting down in between atomic writes
+test_data_integrity() {
+	echo >> $seqres.full
+	echo "# Writing atomically to file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	# Wait for at least the first write to be recorded
+	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid
+	wait $awloop_pid
+
+	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	cat $tmp.aw >> $seqres.full
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+	# we want to verify all blocks around which the shutdown happened
+	verify_start=$(( last_offset - (awu_max * 5)))
+	if [[ "$verify_start" -lt 0 ]]
+	then
+		verify_start=0
+	fi
+
+	verify_end=$(( last_offset + (awu_max * 5)))
+	if [[ "$verify_end" -gt "$filesize" ]]
+	then
+		verify_end=$filesize
+	fi
+}
+
+# test data integrity for file with written and unwritten mappings
+test_data_integrity_mixed() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with mixed mappings" >> $seqres.full
+	create_mixed_mappings $testfile $filesize
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
+
+	if [[ "$?" == "1" ]]
+	then
+		return 1
+	fi
+}
+
+# test data integrity for file with completely written mappings
+test_data_integrity_writ() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully written mapping" >> $seqres.full
+	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
+
+	if [[ "$?" == "1" ]]
+	then
+		return 1
+	fi
+}
+
+# test data integrity for file with completely unwritten mappings
+test_data_integrity_unwrit() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
+	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+
+	if [[ "$?" == "1" ]]
+	then
+		return 1
+	fi
+}
+
+# test data integrity for file with no mappings
+test_data_integrity_hole() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with no mappings" >> $seqres.full
+	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+
+	if [[ "$?" == "1" ]]
+	then
+		return 1
+	fi
+}
+
+test_filesize_integrity() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Performing extending atomic writes over file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	# Wait for at least the first write to be recorded
+	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid
+	wait $awloop_pid
+
+	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	cat $tmp.aw >> $seqres.full
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+	local filesize=$(_get_filesize $testfile)
+	echo >> $seqres.full
+	echo "# Filesize after shutdown: $filesize" >> $seqres.full
+
+	# To confirm that the write went atomically, we check:
+	# 1. The file size should be a multiple of awu_max
+	# 2. The last blocks should contain only the new data
+
+	if (( $filesize % $awu_max ))
+	then
+		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
+	fi
+
+	verify_start=$(( filesize - (awu_max * 5)))
+	if [[ "$verify_start" -lt 0 ]]
+	then
+		verify_start=0
+	fi
+
+	local verify_end=$filesize
+
+	# Here the blocks should always match new data hence, for simplicity of
+	# code, just corrupt the $expected_data_old buffer so it never matches
+	local expected_data_old="POISON"
+	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
+
+	return $?
+}
+
+$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+echo >> $seqres.full
+echo "# Populating expected data buffers" >> $seqres.full
+populate_expected_data
+
+# Loop 20 times to shake out any races due to shutdown
+for ((iter=0; iter<20; iter++))
+do
+	echo >> $seqres.full
+	echo "------ Iteration $iter ------" >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
+	test_data_integrity_mixed
+	if [[ "$?" == "1" ]]
+	then
+		status=1
+		break
+	fi
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
+	test_data_integrity_writ
+	if [[ "$?" == "1" ]]
+	then
+		status=1
+		break
+	fi
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
+	test_data_integrity_unwrit
+	if [[ "$?" == "1" ]]
+	then
+		status=1
+		break
+	fi
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
+	test_data_integrity_hole
+	if [[ "$?" == "1" ]]
+	then
+		status=1
+		break
+	fi
+
+	echo >> $seqres.full
+	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
+	test_filesize_integrity
+	if [[ "$?" == "1" ]]
+	then
+		status=1
+		break
+	fi
+done
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/772.out b/tests/generic/772.out
new file mode 100644
index 00000000..98c13968
--- /dev/null
+++ b/tests/generic/772.out
@@ -0,0 +1,2 @@
+QA output created by 772
+Silence is golden
-- 
2.49.0



* [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (7 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-19  7:43   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

We brute force all possible blocksize & clustersize combinations on
a bigalloc filesystem for stressing atomic writes using the fio data crc
verifier. We run nproc * 2 * $LOAD_FACTOR threads in parallel writing to
a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
that we never see a mix of data contents from different threads within
a given fio blocksize.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
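To get a feel for how many mkfs/mount/fio cycles the nested loops in this
test produce, the enumeration can be dry-run without touching a device. This
is a minimal sketch; enumerate_combos is a hypothetical helper that only
reproduces the loop bounds (fs blocksize up to the page size, cluster size up
to the maximum cluster size, fio blocksize up to the cluster size).

```bash
# enumerate_combos MIN_FSBLOCKSIZE PAGE_SIZE MAX_CLUSTER_SIZE
# Prints one "fsblocksize fsclustersize fiobsize" triple per line, doubling
# each value at every step just like the loops in the test.
enumerate_combos() {
	local min_bs=$1 page_size=$2 max_cluster=$3
	local bs cs fio

	bs=$min_bs
	while [ "$bs" -le "$page_size" ]; do
		cs=$bs
		while [ "$cs" -le "$max_cluster" ]; do
			fio=$bs
			while [ "$fio" -le "$cs" ]; do
				echo "$bs $cs $fio"
				fio=$(( fio * 2 ))
			done
			cs=$(( cs * 2 ))
		done
		bs=$(( bs * 2 ))
	done
}

# 4k pages, 4k minimum block size, 128k maximum cluster size
enumerate_combos 4096 4096 $((128 * 1024))
```

With a 4k page size and a 128k maximum cluster size this yields 21
combinations, so the runtime scales quickly on systems with larger pages.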
 tests/ext4/061     | 107 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/061.out |   2 +
 2 files changed, 109 insertions(+)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out

diff --git a/tests/ext4/061 b/tests/ext4/061
new file mode 100755
index 00000000..9d656613
--- /dev/null
+++ b/tests/ext4/061
@@ -0,0 +1,107 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 061
+#
+# Brute force all possible blocksize and clustersize combinations on a bigalloc
+# filesystem for stressing atomic writes using the fio data crc verifier. We run
+# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. With fio aio-dio atomic writes this test ensures that
+# we never see a mix of data contents from different threads for any
+# given fio blocksize.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+
+function max()
+{
+	if (( $1 > $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+function min()
+{
+	if (( $1 < $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+FS_MAX_CLUSTER_SIZE=$((128*1024))
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100*1024*1024))
+fiobsize=4096
+
+# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(max 4096 "$bdev_awu_min")
+FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
+
+function create_fio_config()
+{
+cat >$fio_config <<EOF
+	[aio-dio-aw-verify]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_dump=0
+	verify_backlog=1024
+	verify_async=4
+	verify_write_sequence=0
+	atomic=1
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_config=$tmp.fio
+create_fio_config
+_require_fio $fio_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_ext4 "$MKFS_OPTIONS" >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				touch $SCRATCH_MNT/f1
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+				fio_config=$tmp.fio
+				fio_out=$tmp.fio.out
+				create_fio_config
+				_require_fio $fio_config
+				cat $fio_config >> $seqres.full
+				$FIO_PROG $fio_config --output=$fio_out
+				ret=$?
+				cat $fio_out >> $seqres.full
+				_scratch_unmount
+				[[ $ret -eq 0 ]] || break;
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/061.out b/tests/ext4/061.out
new file mode 100644
index 00000000..273be9e0
--- /dev/null
+++ b/tests/ext4/061.out
@@ -0,0 +1,2 @@
+QA output created by 061
+Silence is golden
-- 
2.49.0



* [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (8 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-12 10:26   ` John Garry
  2025-06-19  7:45   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
  2025-06-11  9:34 ` [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
  11 siblings, 2 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

Brute force all possible blocksize and clustersize combinations on a bigalloc
filesystem for stressing atomic writes using the fio data crc verifier. We run
multiple threads in parallel with each job writing to its own file. The
parallel jobs running on a constrained filesystem size ensure that we stress
the ext4 allocator to allocate contiguous extents.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
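The "constrained filesystem size" can be made concrete with a little
arithmetic: the fio config sizes each file at FSSIZE / 12 and defines 8 job
sections, each with its own file. Below is a minimal sketch of the resulting
footprint. fio_footprint is a hypothetical helper, and the calculation
assumes that the numjobs clones of a section share that section's file
rather than creating extra ones.

```bash
# fio_footprint FSSIZE DIVISOR NFILES
# Prints "bytes_per_file total_bytes percent_of_fs" (percent is truncated).
fio_footprint() {
	local fssize=$1 divisor=$2 nfiles=$3
	local per_file=$(( fssize / divisor ))
	local total=$(( per_file * nfiles ))

	echo "$per_file $total $(( total * 100 / fssize ))"
}

# Values used by the test: 360 MiB filesystem, size=FSSIZE/12, 8 files
fio_footprint $((360 * 1024 * 1024)) 12 8
```

Under those assumptions the files add up to 240 MiB, roughly two thirds of
the filesystem, which leaves the allocator limited room to find contiguous
clusters.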
 tests/ext4/062     | 131 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/062.out |   2 +
 2 files changed, 133 insertions(+)
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out

diff --git a/tests/ext4/062 b/tests/ext4/062
new file mode 100755
index 00000000..50803b97
--- /dev/null
+++ b/tests/ext4/062
@@ -0,0 +1,131 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 062
+#
+# Brute force all possible blocksize and clustersize combinations on a bigalloc
+# filesystem for stressing atomic writes using the fio data crc verifier. We run
+# nproc * $LOAD_FACTOR threads in parallel for each job, with every job
+# writing to its own $SCRATCH_MNT/testfile-jobN. We create 8 such parallel
+# jobs to run on a constrained filesystem size to stress the ext4 allocator
+# to allocate contiguous extents.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+
+function max()
+{
+	if (( $1 > $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+function min()
+{
+	if (( $1 < $2 )); then
+		echo "$1"
+	else
+		echo "$2"
+	fi
+}
+
+FSSIZE=$((360*1024*1024))
+FS_MAX_CLUSTER_SIZE=$((128*1024))
+FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
+fiobsize=4096
+
+# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(max 4096 "$bdev_awu_min")
+FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
+
+function create_fio_config()
+{
+cat >$fio_config <<EOF
+	[global]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=truncate
+	size=$((FSSIZE / 12))
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_dump=0
+	verify_backlog=1024
+	verify_async=4
+	verify_write_sequence=0
+	atomic=1
+
+	[job1]
+	filename=$SCRATCH_MNT/testfile-job1
+
+	[job2]
+	filename=$SCRATCH_MNT/testfile-job2
+
+	[job3]
+	filename=$SCRATCH_MNT/testfile-job3
+
+	[job4]
+	filename=$SCRATCH_MNT/testfile-job4
+
+	[job5]
+	filename=$SCRATCH_MNT/testfile-job5
+
+	[job6]
+	filename=$SCRATCH_MNT/testfile-job6
+
+	[job7]
+	filename=$SCRATCH_MNT/testfile-job7
+
+	[job8]
+	filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_config=$tmp.fio
+create_fio_config
+_require_fio $fio_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				touch $SCRATCH_MNT/f1
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+				fio_config=$tmp.fio
+				fio_out=$tmp.fio.out
+				create_fio_config
+				_require_fio $fio_config
+				cat $fio_config >> $seqres.full
+				$FIO_PROG $fio_config --output=$fio_out
+				ret=$?
+				cat $fio_out >> $seqres.full
+				_scratch_unmount
+				[[ $ret -eq 0 ]] || break;
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/062.out b/tests/ext4/062.out
new file mode 100644
index 00000000..a1578f48
--- /dev/null
+++ b/tests/ext4/062.out
@@ -0,0 +1,2 @@
+QA output created by 062
+Silence is golden
-- 
2.49.0



* [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (9 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-19  7:52   ` Zorro Lang
  2025-06-11  9:34 ` [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

In ext4, even if an allocated range is physically and logically
contiguous, it can still be split into 2 extents. This is because ext4
does not merge extents across leaf nodes. This is an issue for atomic
writes since even for a contiguous extent the map block could (in rare
cases) return a shorter map, hence tearing the write. This test creates
such a file and ensures that the atomic write handles this case
correctly.

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
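The offsets the test works with follow directly from the extent tree
geometry described above: a 12-byte header plus 12-byte entries per leaf
block. A minimal sketch of that arithmetic is shown below. torn_offsets is a
hypothetical helper that mirrors the calculations in the test, and the 16k
awu_max in the example is only an assumed value.

```bash
# torn_offsets BLOCK_SIZE AWU_MAX
# Prints "entries_per_blk torn_ex_offset torn_aw_offset":
#  - entries_per_blk: extent entries that fit in one leaf block
#  - torn_ex_offset:  byte offset of the block written between the last
#                     extent of leaf 1 and the first extent of leaf 2
#  - torn_aw_offset:  torn_ex_offset rounded down to an awu_max boundary,
#                     i.e. where the atomic write is issued
torn_offsets() {
	local bs=$1 awu_max=$2
	local ex_hdr_bytes=12 ex_entry_bytes=12
	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
	local torn_ex_offset=$(( (entries_per_blk * 2 - 1) * bs ))
	local torn_aw_offset=$(( torn_ex_offset - torn_ex_offset % awu_max ))

	echo "$entries_per_blk $torn_ex_offset $torn_aw_offset"
}

# 4k block size, assumed 16k atomic write unit max
torn_offsets 4096 16384
```

For a 4k block size this gives 340 entries per leaf, places the filler block
at logical block 679, and starts the atomic write at block 676, so the write
covers the boundary between the two leaves.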
 tests/ext4/063     | 125 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/063.out |   2 +
 2 files changed, 127 insertions(+)
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out

diff --git a/tests/ext4/063 b/tests/ext4/063
new file mode 100755
index 00000000..b4759990
--- /dev/null
+++ b/tests/ext4/063
@@ -0,0 +1,125 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# In ext4, even if an allocated range is physically and logically contiguous,
+# it can still be split into 2 extents. This is because ext4 does not merge
+# extents across leaf nodes. This is an issue for atomic writes since even for
+# a contiguous extent the map block could (in rare cases) return a shorter map,
+# hence tearing the write. This test creates such a file and ensures that the
+# atomic write handles this case correctly.
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+prep() {
+	local bs=`_get_block_size $SCRATCH_MNT`
+	local ex_hdr_bytes=12
+	local ex_entry_bytes=12
+	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
+
+	# Fill the extent tree leaf with bs-length extents at alternate offsets. For example,
+	# for 4k bs the tree should look as follows
+	#
+	#                  +---------+---------+
+	#                  | index 1 | index 2 |
+	#                  +-----+---+-----+---+
+	#               +--------+         +-------+
+	#               |                          |
+	#    +----------+--------------+     +-----+-----+
+	#    | ex 1 | ex 2 |... | ex n |     |  ex n + 1 |
+	#    +-------------------------+     +-----------+
+	#    0      2            680          682
+	for i in $(seq 0 $entries_per_blk)
+	do
+		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
+	done
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
+	# this is a new FS the allocator would find contiguous blocks such that
+	# ex(n) ex(new) ex(n+1) are physically (and logically) contiguous. However,
+	# since we don't merge extents across leaves we will end up with a tree as:
+	#
+	#                  +---------+---------+
+	#                  | index 1 | index 2 |
+	#                  +-----+---+-----+---+
+	#               +--------+         +-------+
+	#               |                          |
+	#    +----------+--------------+     +-----+-----+
+	#    | ex 1 | ex 2 |... | ex n |     | ex merged |
+	#    +-------------------------+     +-----------+
+	#    0      2            680          681  682  684
+	#
+	echo >> $seqres.full
+	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
+	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	_scratch_cycle_mount
+}
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+awu_max=$(_get_atomic_write_unit_max $testfile)
+
+echo >> $seqres.full
+echo "# Prepping the file" >> $seqres.full
+prep
+
+torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
+
+echo >> $seqres.full
+echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
+echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+
+echo >> $seqres.full
+echo "Extent state after atomic write:">> $seqres.full
+echo "...">> $seqres.full
+$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+echo >> $seqres.full
+echo "# Checking data integrity" >> $seqres.full
+
+# create a dummy file with expected data
+$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
+expected_data=$(xxd -s 0 -l $awu_max -p $testfile.exp)
+
+# We ensure that the data after atomic writes should match the expected data
+actual_data=$(xxd -s $torn_aw_offset -l $awu_max -p $testfile)
+if [[ "$actual_data" != "$expected_data" ]]
+then
+	echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
+	echo
+	echo "Expected: "
+	echo "$expected_data"
+	echo
+	echo "Actual contents: "
+	echo "$actual_data"
+
+	status=1
+	exit
+fi
+
+echo -n "Data verification at offset $torn_aw_offset succeeded!" >> $seqres.full
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/063.out b/tests/ext4/063.out
new file mode 100644
index 00000000..de35fc52
--- /dev/null
+++ b/tests/ext4/063.out
@@ -0,0 +1,2 @@
+QA output created by 063
+Silence is golden
-- 
2.49.0



* [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation
  2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (10 preceding siblings ...)
  2025-06-11  9:34 ` [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-06-11  9:34 ` Ojaswin Mujoo
  2025-06-19  7:58   ` Zorro Lang
  11 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-11  9:34 UTC (permalink / raw)
  To: fstests; +Cc: Ritesh Harjani, djwong, john.g.garry

Test atomic writes with journal credit calculation. We take 2 cases
here:

1. Atomic writes on single mapping causing tree to collapse into
   the inode
2. Atomic writes on mixed mapping causing tree to collapse into the
   inode

This test is inspired by ext4/034.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
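Why the atomic write collapses the tree can be seen by counting extents
before and after the unwritten range is converted. The layout used by the
test has six extents, which is more than the four that fit in the inode,
and converting the unwritten 4k extent lets the first three merge. Below is
a minimal sketch of that count. count_extents is a hypothetical helper, and
it assumes logically adjacent ranges are also physically contiguous.

```bash
# count_extents RANGE...
# Each RANGE is "start_kib:len_kib:state" (state W = written, U = unwritten),
# given in ascending order. Adjacent ranges with the same state are counted
# as a single extent.
count_extents() {
	local count=0 prev_end=-1 prev_state=""
	local ext start rest len state

	for ext in "$@"; do
		start=${ext%%:*}
		rest=${ext#*:}
		len=${rest%%:*}
		state=${rest#*:}

		if [ "$start" -ne "$prev_end" ] || [ "$state" != "$prev_state" ]; then
			count=$(( count + 1 ))
		fi
		prev_end=$(( start + len ))
		prev_state=$state
	done
	echo "$count"
}

# Layout created by the test, then the same layout after the atomic write
# has converted the unwritten extent at 4k.
count_extents 0:4:W 4:4:U 8:4:W 20:4:W 28:4:W 36:4:W
count_extents 0:4:W 4:4:W 8:4:W 20:4:W 28:4:W 36:4:W
```

Going from six extents to four is what allows the extent tree to move back
into the inode, and that is the path whose journal credits are being tested.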
 tests/ext4/064     | 75 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/064.out |  2 ++
 2 files changed, 77 insertions(+)
 create mode 100755 tests/ext4/064
 create mode 100644 tests/ext4/064.out

diff --git a/tests/ext4/064 b/tests/ext4/064
new file mode 100755
index 00000000..12e48ae3
--- /dev/null
+++ b/tests/ext4/064
@@ -0,0 +1,75 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 064
+#
+# Test proper credit reservation is done when performing
+# tree collapse during an atomic write based allocation
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick quota fiemap prealloc atomicwrites
+
+# Import common functions.
+
+
+# Modify as appropriate.
+_exclude_fs ext2
+_exclude_fs ext3
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "syncfs"
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+echo "----- Testing with atomic write on non-mixed mapping -----" >> $seqres.full
+
+echo "Format and mount" >> $seqres.full
+_scratch_mkfs  >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+	     -c "fsync" $SCRATCH_MNT/foobar >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 4k 4k" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+echo "----- Testing with atomic write on mixed mapping -----" >> $seqres.full
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar2 with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+	     -c "fsync" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 0k 12k" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy2 >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/064.out b/tests/ext4/064.out
new file mode 100644
index 00000000..d9076546
--- /dev/null
+++ b/tests/ext4/064.out
@@ -0,0 +1,2 @@
+QA output created by 064
+Silence is golden
-- 
2.49.0



* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-06-11 14:30   ` Darrick J. Wong
  2025-06-12  6:11     ` Ojaswin Mujoo
  2025-06-18 19:13   ` Zorro Lang
  1 sibling, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 14:30 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> Insert range and collapse range only work with bigalloc when
> the range is cluster-size aligned, which fsx doesn't take care of. To
> work around this, disable insert range and collapse range on ext4 if
> bigalloc is enabled.

Hmmm, insert/collapse-range have the same behavior on xfs realtime,
maybe we should amend test() in fsx to round to the allocation unit
size?

Querying that programmatically might be ... interesting though.  Is
there a good way to do that for ext4 bigalloc?

(See detect_xfs_alloc_unit in punch-alternating.c)
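
A rough sketch of the rounding itself, written as shell arithmetic for brevity (the real change would live in fsx's C code). The helper names round_down and round_up and the value of unit are hypothetical; how to query the allocation unit is left open, as above.

```shell
# unit is an assumed allocation-unit size in bytes (e.g. a bigalloc
# cluster size); it is hard-coded here purely for illustration.
unit=65536

# Round $1 down to the nearest multiple of $2.
round_down() { echo $(( $1 - $1 % $2 )); }

# Round $1 up to the nearest multiple of $2.
round_up()   { echo $(( ($1 + $2 - 1) / $2 * $2 )); }

round_down 70000 "$unit"   # prints 65536
round_up   70000 "$unit"   # prints 131072
```

With offsets rounded down and lengths rounded up to the unit, insert/collapse ranges would always land on allocation-unit boundaries.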

--D

> This is achieved by defining a new function _setup_fs_options
> which can serve as a mechanism to apply FS-wide options to
> the tests.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  common/preamble | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/common/preamble b/common/preamble
> index ba029a34..2bccff74 100644
> --- a/common/preamble
> +++ b/common/preamble
> @@ -24,6 +24,20 @@ _register_cleanup()
>  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
>  }
>  
> +# setup FS options only to be available for each test run
> +_setup_fs_options() {
> +	case "$FSTYP" in
> +	"ext4")
> +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> +			export FSX_AVOID="-I -C"
> +		fi
> +		;;
> +	# Add other filesystem types here as needed
> +	*)
> +		;;
> +	esac
> +}
> +
>  # Prepare to run a fstest by initializing the required global variables to
>  # their defaults, sourcing common functions, registering a cleanup function,
>  # and removing the $seqres.full file.
> @@ -55,4 +69,6 @@ _begin_fstest()
>  	# remove previous $seqres.full before test
>  	rm -f $seqres.full $seqres.hints
>  
> +	# setup filesystem options for a given test execution
> +	_setup_fs_options
>  }
> -- 
> 2.49.0
> 
> 


* Re: [RFC 02/12] common/rc: Add a helper to run fsx on a given file
  2025-06-11  9:34 ` [RFC 02/12] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
@ 2025-06-11 14:31   ` Darrick J. Wong
  2025-06-12  6:17     ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 14:31 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:45PM +0530, Ojaswin Mujoo wrote:
> Currently _run_fsx is hardcoded to run on a file in $TEST_DIR.
> Add a helper _run_fsx_on_file so that we can run fsx on any
> given file including in $SCRATCH_MNT. Also, refactor _run_fsx
> to use this helper.
> 
> No functional change is intended in this patch.
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  common/rc | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index cfbe2a5f..a5d811a1 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5115,13 +5115,22 @@ _require_hugepage_fsx()
>  		_notrun "fsx binary does not support MADV_COLLAPSE"
>  }
>  
> -_run_fsx()
> +_run_fsx_on_file()
>  {
> +	local testfile=$1
> +	shift
> +
> +	if ! [ -f $testfile ]
> +	then
> +		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
> +		touch $testfile
> +	fi
> +
>  	echo "fsx $*"
>  	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> -	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> +	set -- $FSX_PROG $args $FSX_AVOID $testfile

	local testfile="${1:-$TEST_DIR/junk}"
	...
	set -- $FSX_PROG $args $FSX_AVOID $testfile

Then you don't need the extra helper.
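
A minimal standalone sketch of that default-argument idiom. pick_testfile is a hypothetical stand-in, not the real _run_fsx, and TEST_DIR is given a fallback only so the snippet runs outside fstests.

```shell
# Assumption: TEST_DIR is normally set by the fstests config.
TEST_DIR=${TEST_DIR:-/tmp}

# Hypothetical stand-in for _run_fsx: use $1 as the target file if it is
# given and non-empty, otherwise fall back to $TEST_DIR/junk.
pick_testfile()
{
	local testfile="${1:-$TEST_DIR/junk}"
	echo "$testfile"
}

pick_testfile                      # prints $TEST_DIR/junk
pick_testfile /mnt/scratch/file    # prints /mnt/scratch/file
```

Note that existing _run_fsx callers pass fsx options as the leading arguments, so a real conversion would still need some way to tell a file name apart from an option.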

--D

>  	echo "$@" >>$seqres.full
> -	rm -f $TEST_DIR/junk
> +	rm -f $testfile
>  	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
>  	local res=${PIPESTATUS[0]}
>  	if [ $res -ne 0 ]; then
> @@ -5133,6 +5142,12 @@ _run_fsx()
>  	return 0
>  }
>  
> +_run_fsx()
> +{
> +	_run_fsx_on_file $TEST_DIR/junk $@
> +	return $?
> +}
> +
>  # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
>  # the test, but if any other error occurs then exit the test.
>  _run_hugepage_fsx() {
> -- 
> 2.49.0
> 
> 


* Re: [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx
  2025-06-11  9:34 ` [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-06-11 14:35   ` Darrick J. Wong
  2025-06-12  6:18     ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 14:35 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:46PM +0530, Ojaswin Mujoo wrote:
> Implement atomic write support to help fuzz atomic writes
> with fsx.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  ltp/fsx.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 100 insertions(+), 5 deletions(-)
> 
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index 163b9453..9353fe6f 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -40,6 +40,7 @@
>  #include <liburing.h>
>  #endif
>  #include <sys/syscall.h>
> +#include "statx.h"
>  
>  #ifndef MAP_FILE
>  # define MAP_FILE 0
> @@ -49,6 +50,10 @@
>  #define RWF_DONTCACHE	0x80
>  #endif
>  
> +#ifndef RWF_ATOMIC
> +#define RWF_ATOMIC	0x40
> +#endif
> +
>  #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
>  
>  /* Operation flags (bitmask) */
> @@ -110,6 +115,7 @@ enum {
>  	OP_READ_DONTCACHE,
>  	OP_WRITE,
>  	OP_WRITE_DONTCACHE,
> +	OP_WRITE_ATOMIC,
>  	OP_MAPREAD,
>  	OP_MAPWRITE,
>  	OP_MAX_LITE,
> @@ -200,6 +206,11 @@ int	uring = 0;
>  int	mark_nr = 0;
>  int	dontcache_io = 1;
>  int	hugepages = 0;                  /* -h flag */
> +int	do_atomic_writes = 0;		/* -a flag */
> +
> +/* Used for atomic writes */
> +int awu_min = 0;
> +int awu_max = 0;
>  
>  /* Stores info needed to periodically collapse hugepages */
>  struct hugepages_collapse_info {
> @@ -288,6 +299,7 @@ static const char *op_names[] = {
>  	[OP_READ_DONTCACHE] = "read_dontcache",
>  	[OP_WRITE] = "write",
>  	[OP_WRITE_DONTCACHE] = "write_dontcache",
> +	[OP_WRITE_ATOMIC] = "write_atomic",
>  	[OP_MAPREAD] = "mapread",
>  	[OP_MAPWRITE] = "mapwrite",
>  	[OP_TRUNCATE] = "truncate",
> @@ -422,6 +434,7 @@ logdump(void)
>  				prt("\t***RRRR***");
>  			break;
>  		case OP_WRITE_DONTCACHE:
> +		case OP_WRITE_ATOMIC:
>  		case OP_WRITE:
>  			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
>  			    lp->args[0], lp->args[0] + lp->args[1] - 1,
> @@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
>  	file_size = offset + size;
>  }
>  
> +static int is_power_of_2(unsigned n) {
> +	return ((n & (n - 1)) == 0);
> +}
> +
> +/*
> + * Round down n to nearest power of 2.
> + * If n is already a power of 2, return n;
> + */
> +static int rounddown_pow_of_2(int n) {
> +	int i = 0;
> +
> +	if (is_power_of_2(n))
> +		return n;
> +
> +	for (; (1 << i) < n; i++);
> +
> +	return 1 << (i - 1);
> +}
> +
>  void
>  dowrite(unsigned offset, unsigned size, int flags)
>  {
> @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
>  	offset -= offset % writebdy;
>  	if (o_direct)
>  		size -= size % writebdy;
> +	if (flags & RWF_ATOMIC) {
> +		/* atomic write len must be between awu_min and awu_max */
> +		if (size < awu_min)
> +			size = awu_min;
> +		if (size > awu_max)
> +			size = awu_max;
> +
> +		/* atomic writes need power-of-2 sizes */
> +		size = rounddown_pow_of_2(size);
> +
> +		/* atomic writes need naturally aligned offsets */
> +		offset -= offset % size;
> +
> +		/* Skip the write if we are crossing max filesize */
> +		if ((offset + size) > maxfilelen) {
> +			if (!quiet && testcalls > simulatedopcount)
> +				prt("skipping atomic write past maxfilelen\n");
> +			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> +			return;
> +		}
> +	}
>  	if (size == 0) {
>  		if (!quiet && testcalls > simulatedopcount && !o_direct)
>  			prt("skipping zero size write\n");
> @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
>  		return;
>  	}
>  
> -	log4(OP_WRITE, offset, size, FL_NONE);
> +	if (flags & RWF_ATOMIC)
> +		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> +	else
> +		log4(OP_WRITE, offset, size, FL_NONE);
>  
>  	gendata(original_buf, good_buf, offset, size);
>  	if (offset + size > file_size) {
> @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
>  		       (monitorstart == -1 ||
>  			(offset + size > monitorstart &&
>  			(monitorend == -1 || offset <= monitorend))))))
> -		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> -		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> +		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> +		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> +		    (flags & RWF_ATOMIC) != 0);
>  	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
>  	if (iret != size) {
>  		if (iret == -1)
> @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
>  }
>  #endif
>  
> +int test_atomic_writes(void) {
> +	int ret;
> +	struct statx stx;
> +
> +	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
> +	if (ret < 0) {
> +		fprintf(stderr, "main: Statx failed with %d."
> +			" Failed to determine atomic write limits,"
> +			" disabling!\n", ret);
> +		return 0;
> +	}
> +
> +	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
> +	    stx.stx_atomic_write_unit_min > 0) {
> +		awu_min = stx.stx_atomic_write_unit_min;
> +		awu_max = stx.stx_atomic_write_unit_max;
> +		return 1;
> +	}
> +
> +	fprintf(stderr, "main: IO Stack does not support "
> +			"atomic writes, disabling!\n");
> +	return 0;
> +}
> +
>  #ifdef HAVE_COPY_FILE_RANGE
>  int
>  test_copy_range(void)
> @@ -2385,6 +2466,14 @@ have_op:
>  			dowrite(offset, size, 0);
>  		break;
>  
> +	case OP_WRITE_ATOMIC:
> +		TRIM_OFF_LEN(offset, size, maxfilelen);
> +		if (do_atomic_writes)
> +			dowrite(offset, size, RWF_ATOMIC);
> +		else
> +			dowrite(offset, size, 0);

Er.... shouldn't we skip OP_WRITE_ATOMIC if !do_atomic_writes?
There's a whole switch statement further up in test() that does things
like:

	case OP_COPY_RANGE:
		if (!copy_range_calls) {
			log5(op, offset, size, offset2, FL_SKIPPED);
			goto out;
		}
		break;

to break out early.

--D

> +		break;
> +
>  	case OP_MAPREAD:
>  		TRIM_OFF_LEN(offset, size, file_size);
>  		domapread(offset, size);
> @@ -2511,13 +2600,14 @@ void
>  usage(void)
>  {
>  	fprintf(stdout, "usage: %s",
> -		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
> +		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
>  	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
>  	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
>  	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
>  	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
>  	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
>  	   ... fname\n\
> +	-a: enable atomic writes if IO stack supports it\n\
>  	-b opnum: beginning operation number (default 1)\n\
>  	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
>  	-d: debug output for all operations\n\
> @@ -3059,9 +3149,12 @@ main(int argc, char **argv)
>  	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
>  
>  	while ((ch = getopt_long(argc, argv,
> -				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> +				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
>  				 longopts, NULL)) != EOF)
>  		switch (ch) {
> +		case 'a':
> +			do_atomic_writes = 1;
> +			break;
>  		case 'b':
>  			simulatedopcount = getnum(optarg, &endp);
>  			if (!quiet)
> @@ -3475,6 +3568,8 @@ main(int argc, char **argv)
>  		exchange_range_calls = test_exchange_range();
>  	if (dontcache_io)
>  		dontcache_io = test_dontcache_io();
> +	if (do_atomic_writes)
> +		do_atomic_writes = test_atomic_writes();
>  
>  	while (keep_running())
>  		if (!test())
> -- 
> 2.49.0
> 
> 


* Re: [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-11  9:34 ` [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-06-11 14:42   ` Darrick J. Wong
  2025-06-12  6:22     ` Ojaswin Mujoo
  2025-06-18 19:34   ` Zorro Lang
  1 sibling, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 14:42 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:47PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> This adds an atomic write test using fio, based on its crc check verifier.
> fio adds a crc for each data block. If the underlying device supports atomic
> writes, then it is guaranteed that we will never have a mix of data from two
> threads writing on the same physical block.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/767.out |  2 ++
>  2 files changed, 86 insertions(+)
>  create mode 100755 tests/generic/767
>  create mode 100644 tests/generic/767.out
> 
> diff --git a/tests/generic/767 b/tests/generic/767
> new file mode 100755
> index 00000000..4f80e7b6
> --- /dev/null
> +++ b/tests/generic/767
> @@ -0,0 +1,84 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 767
> +#
> +# Validate FS atomic write using fio crc check verifier.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_odirect
> +_require_aio
> +
> +function max()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}
> +
> +function min()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$2"
> +	else
> +		echo "$1"
> +	fi
> +}

Should these be common/rc helpers?

Or, since bash is ... uh fun with arguments...

_min() {
	local ret arg

	for arg in "$@"; do
		if [ -z "$ret" ] || (( $arg < $ret )); then
			ret="$arg"
		fi
	done
	echo "$ret"
}

and then you can pass as many arguments as you like.  The only downside
is that you can pass stringly typed crap "_min cow frog" and it still
returns "cow".  As if.
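
If the stringly typed case matters, a stricter variant could reject non-numeric input up front. This is only a sketch: _min_strict is a hypothetical name, and it assumes callers only ever pass non-negative integers (sizes in bytes).

```shell
# Hypothetical stricter variant of _min: print the smallest argument, or
# fail if any argument is not a non-negative integer.
_min_strict()
{
	local ret arg

	for arg in "$@"; do
		case "$arg" in
		''|*[!0-9]*)
			# Empty string, or contains a non-digit character.
			echo "_min_strict: not a non-negative integer: $arg" >&2
			return 1
			;;
		esac
		if [ -z "$ret" ] || [ "$arg" -lt "$ret" ]; then
			ret="$arg"
		fi
	done
	echo "$ret"
}

_min_strict 65536 4096 16384    # prints 4096
_min_strict cow frog            # complains on stderr, returns 1
```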

> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +blocksize=$(max "$awu_min_write" "$((awu_max_write/2))")
> +
> +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> +blocksize=$(min "$blocksize" "65536")
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))

What program is nproc?
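
(For context: nproc ships with GNU coreutils.) If the test should not depend on it unconditionally, a guarded lookup could look roughly like the sketch below. ncpus is a hypothetical helper name, and falling back to 1 when nothing is available is an assumption, not existing fstests behaviour.

```shell
# Hypothetical helper: report the number of online CPUs, preferring nproc,
# then getconf, and finally assuming a single CPU.
ncpus()
{
	if command -v nproc >/dev/null 2>&1; then
		nproc
	elif command -v getconf >/dev/null 2>&1; then
		getconf _NPROCESSORS_ONLN
	else
		echo 1
	fi
}

# Same arithmetic as the test; LOAD_FACTOR defaults to 1 for this demo.
FIO_LOAD=$(( $(ncpus) * 2 * ${LOAD_FACTOR:-1} ))
echo "$FIO_LOAD"
```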

--D

> +SIZE=$((100 * 1024 * 1024))
> +
> +cat >$fio_config <<EOF
> +[aio-dio-aw-verify]
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +bs=$blocksize
> +fallocate=native
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +iodepth=$FIO_LOAD
> +numjobs=$FIO_LOAD
> +group_reporting=1
> +verify_state_save=0
> +verify=crc32c
> +verify_fatal=1
> +verify_dump=0
> +verify_backlog=1024
> +verify_async=4
> +verify_write_sequence=0
> +atomic=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +$FIO_PROG $fio_config --output=$fio_out
> +cat $fio_out >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/767.out b/tests/generic/767.out
> new file mode 100644
> index 00000000..2bf7f989
> --- /dev/null
> +++ b/tests/generic/767.out
> @@ -0,0 +1,2 @@
> +QA output created by 767
> +Silence is golden
> -- 
> 2.49.0
> 
> 


* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-11  9:34 ` [RFC 07/12] generic/771: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-06-11 14:45   ` Darrick J. Wong
  2025-06-12  6:27     ` Ojaswin Mujoo
  2025-06-18 20:27   ` Zorro Lang
  1 sibling, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 14:45 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:
> Stress a file with atomic writes to ensure we exercise codepaths
> where we are mixing different FS operations with atomic writes.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/771.out |  2 ++
>  2 files changed, 51 insertions(+)
>  create mode 100755 tests/generic/771
>  create mode 100644 tests/generic/771.out
> 
> diff --git a/tests/generic/771 b/tests/generic/771
> new file mode 100755
> index 00000000..690dfa0a
> --- /dev/null
> +++ b/tests/generic/771
> @@ -0,0 +1,49 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 771
> +#
> +# fuzz fsx with atomic writes
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest rw auto quick atomicwrites
> +
> +# Import common functions.
> +. ./common/filter
> +
> +_require_test
> +_require_odirect
> +_require_scratch_write_atomic
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount  >> $seqres.full 2>&1
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> +
> +# fsx usage:
> +#
> +# -N numops: total # operations to do
> +# -l flen: the upper bound on file size
> +# -o oplen: the upper bound on operation size (64k default)
> +# -w writebdy: $psize would make writes page aligned (on i386)
> +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> +# -W: mapped write operations DISabled
> +
> +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full

Don't we already get fsx stress testing RWF_ATOMIC through generic/521
and generic/522?

Also why is mmap write disabled?

--D

> +status=$?
> +
> +if [[ "$status" != "0" ]]
> +then
> +	echo "Something went wrong, check $seqres.full"
> +fi
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/771.out b/tests/generic/771.out
> new file mode 100644
> index 00000000..c2345c7b
> --- /dev/null
> +++ b/tests/generic/771.out
> @@ -0,0 +1,2 @@
> +QA output created by 771
> +Silence is golden
> -- 
> 2.49.0
> 
> 


* Re: [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings
  2025-06-11  9:34 ` [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-06-11 15:35   ` Darrick J. Wong
  0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 15:35 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:48PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> This test uses fio to first create a file with mixed mappings. Then it
> does atomic writes using aio dio with parallel jobs to the same file
> with mixed mappings. This forces the filesystem allocator to allocate
> extents over mixed mapping regions to stress FS block allocators.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/769     | 101 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/769.out |   2 +
>  2 files changed, 103 insertions(+)
>  create mode 100755 tests/generic/769
>  create mode 100644 tests/generic/769.out
> 
> diff --git a/tests/generic/769 b/tests/generic/769
> new file mode 100755
> index 00000000..469d6344
> --- /dev/null
> +++ b/tests/generic/769
> @@ -0,0 +1,101 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 769
> +#
> +# Validate FS atomic write using fio crc check verifier on mixed mappings
> +# of a file.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_odirect
> +_require_aio
> +
> +function max()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}

Yeah, there's that function again.

The rest of the test looks fine.

--D

> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +aw_bsize=$(max "$awu_min_write" "$((awu_max_write/4))")
> +
> +fsbsize=$(_get_block_size $SCRATCH_MNT)
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((128 * 1024 * 1024))
> +
> +cat >$fio_config <<EOF
> +[global]
> +ioengine=libaio
> +fallocate=none
> +filename=$SCRATCH_MNT/test-file
> +filesize=$SIZE
> +bs=$fsbsize
> +direct=1
> +verify=0
> +group_reporting=1
> +
> +# Create written extents
> +[written_blocks]
> +stonewall
> +ioengine=libaio
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +
> +# Create unwritten extents
> +[unwritten_blocks]
> +stonewall
> +ioengine=falloc
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +
> +# atomic write to mixed mappings of written/unwritten/holes
> +[atomic_write_aio_dio_job]
> +stonewall
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +bs=$aw_bsize
> +iodepth=$FIO_LOAD
> +numjobs=$FIO_LOAD
> +size=$SIZE
> +random_generator=lfsr
> +verify_state_save=0
> +verify=crc32c
> +verify_fatal=1
> +verify_dump=0
> +verify_backlog=1024
> +verify_async=4
> +verify_write_sequence=0
> +atomic=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +$FIO_PROG $fio_config --output=$fio_out
> +cat $fio_out >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/769.out b/tests/generic/769.out
> new file mode 100644
> index 00000000..1512b439
> --- /dev/null
> +++ b/tests/generic/769.out
> @@ -0,0 +1,2 @@
> +QA output created by 769
> +Silence is golden
> -- 
> 2.49.0
> 
> 


* Re: [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-11  9:34 ` [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-06-11 15:36   ` Darrick J. Wong
  2025-06-12  6:23     ` Ojaswin Mujoo
  2025-06-18 20:17   ` Zorro Lang
  1 sibling, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 15:36 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:49PM +0530, Ojaswin Mujoo wrote:
> This adds various atomic write multi-fsblock stress tests
> with mixed mappings and O_SYNC, to ensure the data and metadata
> are atomically persisted even if there is a shutdown.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/770.out |   2 +
>  2 files changed, 163 insertions(+)
>  create mode 100755 tests/generic/770
>  create mode 100644 tests/generic/770.out
> 
> diff --git a/tests/generic/770 b/tests/generic/770
> new file mode 100755
> index 00000000..2b98b3b3
> --- /dev/null
> +++ b/tests/generic/770
> @@ -0,0 +1,161 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 770
> +#
> +# Atomic write multi-fsblock data integrity tests with mixed mappings
> +# and O_SYNC
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands

_require_shutdown?

--D

> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +check_data_integrity() {
> +	actual=$(_hexdump $testfile)
> +	if [[ "$expected" != "$actual" ]]
> +	then
> +		echo "Integrity check failed"
> +		echo "Integrity check failed" >> $seqres.full
> +		echo "# Expected file contents:" >> $seqres.full
> +		echo "$expected" >> $seqres.full
> +		echo "# Actual file contents:" >> $seqres.full
> +		echo "$actual" >> $seqres.full
> +	fi
> +}
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +
> +# Create an expected pattern to compare with
> +$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
> +expected=$(_hexdump $testfile)
> +echo "# Expected file contents:" >> $seqres.full
> +echo "$expected" >> $seqres.full
> +
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations):" >> $seqres.full
> +# Calculate how many blocks (e.g. 4K) fit in awu_max (e.g. 64K)
> +num_blocks=$((awu_max / blksz))
> +echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
> +
> +operations=("W" "H" "U")
> +
> +# Run 10 iterations of the test
> +for ((iteration=1; iteration<=10; iteration++)); do
> +	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
> +
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +	off=0
> +	mapping=""
> +
> +	for ((i=0; i<num_blocks; i++)); do
> +		index=$((RANDOM % ${#operations[@]}))
> +		map="${operations[$index]}"
> +		mapping="${mapping}${map}"
> +
> +		case "$map" in
> +			"W")
> +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> +				;;
> +			"H")
> +				# No operation needed for hole
> +				;;
> +			"U")
> +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
> +				;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> +
> +	sync $testfile
> +
> +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> +				  grep wrote | awk -F'[/ ]' '{print $2}')
> +
> +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +	check_data_integrity
> +	echo "Iteration $iteration completed: OK" >> $seqres.full
> +	echo >> $seqres.full
> +done
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
> +bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
> +                grep wrote | awk -F'[/ ]' '{print $2}')
> +test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +_scratch_shutdown -v >> $seqres.full
> +_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-2"
> +check_data_integrity
> +echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations):" >> $seqres.full
> +num_blocks=$((awu_max / blksz))
> +echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
> +
> +operations=("W" "H" "U")
> +
> +for ((iteration=1; iteration<=10; iteration++)); do
> +	echo "=== Mixed Mapping Shutdown Test Iteration $iteration ===" >> $seqres.full
> +
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> +	off=0
> +	mapping=""
> +
> +	for ((i=0; i<num_blocks; i++)); do
> +		index=$((RANDOM % ${#operations[@]}))
> +		map="${operations[$index]}"
> +		mapping="${mapping}${map}"
> +
> +		case "$map" in
> +			"W")
> +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> +				;;
> +			"H")
> +				# No operation needed for hole
> +				;;
> +			"U")
> +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile > /dev/null
> +				;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> +
> +	sync $testfile
> +
> +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> +				  grep wrote | awk -F'[/ ]' '{print $2}')
> +
> +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +
> +	echo "Shutting down filesystem" >> $seqres.full
> +	_scratch_shutdown -v >> $seqres.full
> +	_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> +	check_data_integrity
> +	echo "Iteration $iteration completed: OK" >> $seqres.full
> +	echo >> $seqres.full
> +done
> +echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations): OK" >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> +
> diff --git a/tests/generic/770.out b/tests/generic/770.out
> new file mode 100644
> index 00000000..17994ed5
> --- /dev/null
> +++ b/tests/generic/770.out
> @@ -0,0 +1,2 @@
> +QA output created by 770
> +Silence is golden
> -- 
> 2.49.0
> 
> 


* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-06-11 15:38   ` Darrick J. Wong
  2025-06-12  6:28     ` Ojaswin Mujoo
  2025-06-19  7:15   ` Zorro Lang
  2025-06-20 14:05   ` John Garry
  2 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-11 15:38 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 03:04:51PM +0530, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi-block atomic writes
> maintain atomic guarantees across sudden FS shutdowns.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/772     | 360 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/772.out |   2 +
>  2 files changed, 362 insertions(+)
>  create mode 100755 tests/generic/772
>  create mode 100644 tests/generic/772.out
> 
> diff --git a/tests/generic/772 b/tests/generic/772
> new file mode 100755
> index 00000000..6af7e74c
> --- /dev/null
> +++ b/tests/generic/772
> @@ -0,0 +1,360 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 0772
> +#
> +# Test multi block atomic writes with sudden FS shutdowns to ensure
> +# the FS is not tearing the write operation
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands

_require_shutdown?

Otherwise seems fine to me...

--D

> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +echo "Awu max: $awu_max" >> $seqres.full
> +
> +num_blocks=$((awu_max / blksz))
> +filesize=$(($blksz * 12 * 1024 ))
> +
> +atomic_write_loop() {
> +	local off=0
> +	local size=$awu_max
> +	for ((i=0; i<$((filesize / $size )); i++)); do
> +		# Due to sudden shutdown this can produce errors so just redirect them
> +		# to seqres.full
> +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> +		echo "Written to offset: $off" >> $tmp.aw
> +		off=$((off + $size))
> +	done
> +}
> +
> +create_mixed_mappings() {
> +	local file=$1
> +	local size_bytes=$2
> +
> +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> +	#Fill the file with alternate written and unwritten blocks
> +	local off=0
> +	local operations=("W" "U")
> +
> +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> +		index=$(($i % ${#operations[@]}))
> +		map="${operations[$index]}"
> +
> +		case "$map" in
> +		    "W")
> +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
> +			;;
> +		    "U")
> +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
> +			;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	sync $file
> +}
> +
> +populate_expected_data() {
> +	# create a dummy file with expected old data for different cases
> +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
> +	expected_data_old_mixed=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mixed)
> +
> +	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
> +	expected_data_old_zeroes=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_zeroes)
> +
> +	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
> +	expected_data_old_mapped=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mapped)
> +
> +	# create a dummy file with expected new data
> +	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
> +	expected_data_new=$(xxd -s 0 -l $awu_max -p $testfile.exp_new)
> +}
> +
> +verify_data_blocks() {
> +	local verify_start=$1
> +	local verify_end=$2
> +	local expected_data_old="$3"
> +	local expected_data_new="$4"
> +
> +	echo >> $seqres.full
> +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> +
> +	# After an atomic write, for every chunk we ensure that the underlying
> +	# data is either the old data or new data as writes shouldn't get torn.
> +	local off=$verify_start
> +	while [[ "$off" -lt "$verify_end" ]]
> +	do
> +		actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> +		then
> +			echo "Checksum match failed at off: $off size: $awu_max"
> +			echo "Expected contents: (Either of the 2 below):"
> +			echo
> +			echo "Expected old: "
> +			echo "$expected_data_old"
> +			echo
> +			echo "Expected new: "
> +			echo "$expected_data_new"
> +			echo
> +			echo "Actual contents: "
> +			echo "$actual_data"
> +
> +			return 1
> +		fi
> +		echo -n "Check at offset $off suceeded! " >> $seqres.full
> +		if [[ "$actual_data" == "$expected_data_new" ]]
> +		then
> +			echo "matched new" >> $seqres.full
> +		elif [[ "$actual_data" == "$expected_data_old" ]]
> +		then
> +			echo "matched old" >> $seqres.full
> +		fi
> +		off=$(( off + awu_max ))
> +	done
> +
> +	return 0
> +}
> +
> +# test data integrity for file by shutting down in between atomic writes
> +test_data_integrity() {
> +	echo >> $seqres.full
> +	echo "# Writing atomically to file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!
> +
> +	# Wait for atleast first write to be recorded
> +	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid
> +	wait $awloop_pid
> +
> +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	cat $tmp.aw >> $seqres.full
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +
> +	# we want to verify all blocks around which the shutdown happended
> +	verify_start=$(( last_offset - (awu_max * 5)))
> +	if [[ "$verify_start" -lt 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	verify_end=$(( last_offset + (awu_max * 5)))
> +	if [[ "$verify_end" -gt "$filesize" ]]
> +	then
> +		verify_end=$filesize
> +	fi
> +}
> +
> +# test data integrity for file wiht written and unwritten mappings
> +test_data_integrity_mixed() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> +	create_mixed_mappings $testfile $filesize
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi
> +}
> +
> +# test data integrity for file with completely written mappings
> +test_data_integrity_writ() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi
> +}
> +
> +# test data integrity for file with completely unwritten mappings
> +test_data_integrity_unwrit() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi
> +}
> +
> +# test data integrity for file with no mappings
> +test_data_integrity_hole() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with no mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi
> +}
> +
> +test_filesize_integrity() {
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!
> +
> +	# Wait for atleast first write to be recorded
> +	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid
> +	wait $awloop_pid
> +
> +	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	cat $tmp.aw >> $seqres.full
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +	local filesize=$(_get_filesize $testfile)
> +	echo >> $seqres.full
> +	echo "# Filesize after shutdown: $filesize" >> $seqres.full
> +
> +	# To confirm that the write went atomically, we check:
> +	# 1. The last block should be a multiple of awu_max
> +	# 2. The last block should be the completely new data
> +
> +	if (( $filesize % $awu_max ))
> +	then
> +		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
> +	fi
> +
> +	verify_start=$(( filesize - (awu_max * 5)))
> +	if [[ "$verify_start" -lt 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	local verify_end=$filesize
> +
> +	# Here the blocks should always match new data hence, for simplicity of
> +	# code, just corrupt the $expected_data_old buffer so it never matches
> +	local expected_data_old="POISON"
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
> +
> +	return $?
> +}
> +
> +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Populating expected data buffers" >> $seqres.full
> +populate_expected_data
> +
> +# Loop 20 times to shake out any races due to shutdown
> +for ((iter=0; iter<20; iter++))
> +do
> +	echo >> $seqres.full
> +	echo "------ Iteration $iter ------" >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> +	test_data_integrity_mixed
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> +	test_data_integrity_writ
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> +	test_data_integrity_unwrit
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> +	test_data_integrity_hole
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> +	test_filesize_integrity
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +done
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/772.out b/tests/generic/772.out
> new file mode 100644
> index 00000000..98c13968
> --- /dev/null
> +++ b/tests/generic/772.out
> @@ -0,0 +1,2 @@
> +QA output created by 772
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-11 14:30   ` Darrick J. Wong
@ 2025-06-12  6:11     ` Ojaswin Mujoo
  2025-06-12 14:36       ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > Insert range and collapse range only works with bigalloc in case
> > the range is cluster size aligned, which fsx doesnt take care. To
> > work past this, disable insert range and collapse range on ext4, if
> > bigalloc is enabled.
> 
> Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> maybe we should amend test() in fsx to round to the allocation unit
> size?
Hey Darrick,

Yes, that makes sense, but as you mentioned, I'm not sure there is a
way to programmatically detect the bigalloc cluster size (or the
allocation unit in general) like we do for xfs.

(Adding Ted to CC in case he has some idea)
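
One possible direction, only as a rough sketch: dumpe2fs -h prints a
"Cluster size:" line when bigalloc is enabled, so a shell-side helper could
parse that and fall back to "Block size:" otherwise. The helper name below
is hypothetical and this does not solve the in-fsx (C) detection problem:

```shell
# Hypothetical helper: read `dumpe2fs -h`-style output on stdin and print
# the allocation unit in bytes (cluster size if present, else block size).
_ext4_alloc_unit_from_dumpe2fs() {
	awk -F': *' '
		/^Block size:/   { blk = $2 }
		/^Cluster size:/ { clu = $2 }
		END { print (clu != "" ? clu : blk) }
	'
}
```

Usage would be something like
"dumpe2fs -h $SCRATCH_DEV 2>/dev/null | _ext4_alloc_unit_from_dumpe2fs".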
> 
> Querying that programmatically might be ... interesting though.  Is
> there a good way to do that for ext4 bigalloc?
> 
> (See detect_xfs_alloc_unit in punch-alternating.c)
> 
> --D
> 
> > This is achieved by defining a new function _setup_fs_options
> > which can serve as a mechanism to apply FS-wide options to
> > the tests.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  common/preamble | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/common/preamble b/common/preamble
> > index ba029a34..2bccff74 100644
> > --- a/common/preamble
> > +++ b/common/preamble
> > @@ -24,6 +24,20 @@ _register_cleanup()
> >  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
> >  }
> >  
> > +# setup FS options only to be available for each test run
> > +_setup_fs_options() {
> > +	case "$FSTYP" in
> > +	"ext4")
> > +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > +			export FSX_AVOID="-I -C"
> > +		fi
> > +		;;
> > +	# Add other filesystem types here as needed
> > +	*)
> > +		;;
> > +	esac
> > +}
> > +
> >  # Prepare to run a fstest by initializing the required global variables to
> >  # their defaults, sourcing common functions, registering a cleanup function,
> >  # and removing the $seqres.full file.
> > @@ -55,4 +69,6 @@ _begin_fstest()
> >  	# remove previous $seqres.full before test
> >  	rm -f $seqres.full $seqres.hints
> >  
> > +	# setup filesystem options for a given test execution
> > +	_setup_fs_options
> >  }
> > -- 
> > 2.49.0
> > 
> > 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 02/12] common/rc: Add a helper to run fsx on a given file
  2025-06-11 14:31   ` Darrick J. Wong
@ 2025-06-12  6:17     ` Ojaswin Mujoo
  2025-06-12 14:35       ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 07:31:38AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:45PM +0530, Ojaswin Mujoo wrote:
> > Currently run_fsx is hardcoded to run on a file in $TEST_DIR.
> > Add a helper _run_fsx_on_file so that we can run fsx on any
> > given file including in $SCRATCH_MNT. Also, refactor _run_fsx
> > to use this helper.
> > 
> > No functional change is intended in this patch.
> > 
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  common/rc | 21 ++++++++++++++++++---
> >  1 file changed, 18 insertions(+), 3 deletions(-)
> > 
> > diff --git a/common/rc b/common/rc
> > index cfbe2a5f..a5d811a1 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -5115,13 +5115,22 @@ _require_hugepage_fsx()
> >  		_notrun "fsx binary does not support MADV_COLLAPSE"
> >  }
> >  
> > -_run_fsx()
> > +_run_fsx_on_file()
> >  {
> > +	local testfile=$1
> > +	shift
> > +
> > +	if ! [ -f $testfile ]
> > +	then
> > +		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
> > +		touch $testfile
> > +	fi
> > +
> >  	echo "fsx $*"
> >  	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> > -	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> > +	set -- $FSX_PROG $args $FSX_AVOID $testfile
> 
> 	local testfile="${1:-$TEST_DIR/junk}"
> 	...
> 	set -- $FSX_PROG $args $FSX_AVOID $testfile
> 
> Then you don't need the extra helper.
Hi Darrick,

I originally added the extra helper so that we won't need to change the
existing run_fsx calls. For example, with the change you suggested,
generic/263 would need to be changed from:

run_fsx -N 10000  -o 8192   -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z

to 

run_fsx $TEST_DIR/junk -N 10000  -o 8192   -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z

because otherwise, the $1 would actually be "-N" and run_fsx would get
confused. 
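
The clash is easy to see in a standalone sketch (pick_file and the /tmp/junk
default are made up for illustration, they are not part of common/rc):

```shell
# Hypothetical stand-in for the "${1:-default}" idea: the first positional
# argument is taken as the file name whenever any argument is present.
pick_file() {
	local testfile="${1:-/tmp/junk}"
	echo "$testfile"
}

pick_file -N 10000 -o 8192	# prints "-N", an fsx option taken as the file
pick_file			# prints "/tmp/junk", default only used with no args
```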

However, if you would prefer to drop the helper and rather modify
run_fsx and the call sites to pass a filename, that should be doable as
well. Please let me know your preference on this.


Regards,
ojaswin

> 
> --D
> 
> >  	echo "$@" >>$seqres.full
> > -	rm -f $TEST_DIR/junk
> > +	rm -f $testfile
> >  	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
> >  	local res=${PIPESTATUS[0]}
> >  	if [ $res -ne 0 ]; then
> > @@ -5133,6 +5142,12 @@ _run_fsx()
> >  	return 0
> >  }
> >  
> > +_run_fsx()
> > +{
> > +	_run_fsx_on_file $TEST_DIR/junk $@
> > +	return $?
> > +}
> > +
> >  # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
> >  # the test, but if any other error occurs then exit the test.
> >  _run_hugepage_fsx() {
> > -- 
> > 2.49.0
> > 
> > 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx
  2025-06-11 14:35   ` Darrick J. Wong
@ 2025-06-12  6:18     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 07:35:54AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:46PM +0530, Ojaswin Mujoo wrote:
> > Implement atomic write support to help fuzz atomic writes
> > with fsx.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---

<snip>

> > +
> >  #ifdef HAVE_COPY_FILE_RANGE
> >  int
> >  test_copy_range(void)
> > @@ -2385,6 +2466,14 @@ have_op:
> >  			dowrite(offset, size, 0);
> >  		break;
> >  
> > +	case OP_WRITE_ATOMIC:
> > +		TRIM_OFF_LEN(offset, size, maxfilelen);
> > +		if (do_atomic_writes)
> > +			dowrite(offset, size, RWF_ATOMIC);
> > +		else
> > +			dowrite(offset, size, 0);
> 
> Er.... shouldn't we skip OP_ATOMIC_WRITE if !do_atomic_writes?
> There's a whole switch statement further up in test() that does things
> like:
> 
> 	case OP_COPY_RANGE:
> 		if (!copy_range_calls) {
> 			log5(op, offset, size, offset2, FL_SKIPPED);
> 			goto out;
> 		}
> 		break;
> 
> to break out early.
> 
> --D

Got it, I'll make the change, Darrick.

Thanks,
ojaswin


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-11 14:42   ` Darrick J. Wong
@ 2025-06-12  6:22     ` Ojaswin Mujoo
  2025-06-12 14:55       ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 07:42:25AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:47PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > This adds atomic write test using fio based on it's crc check verifier.
> > fio adds a crc for each data block. If the underlying device supports atomic
> > write then it is guaranteed that we will never have a mix data from two
> > threads writing on the same physical block.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/767.out |  2 ++
> >  2 files changed, 86 insertions(+)
> >  create mode 100755 tests/generic/767
> >  create mode 100644 tests/generic/767.out
> > 
> > diff --git a/tests/generic/767 b/tests/generic/767
> > new file mode 100755
> > index 00000000..4f80e7b6
> > --- /dev/null
> > +++ b/tests/generic/767
> > @@ -0,0 +1,84 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 767
> > +#
> > +# Validate FS atomic write using fio crc check verifier.
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto aio rw atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_odirect
> > +_require_aio
> > +
> > +function max()
> > +{
> > +	if (( $1 > $2 )); then
> > +		echo "$1"
> > +	else
> > +		echo "$2"
> > +	fi
> > +}
> > +
> > +function min()
> > +{
> > +	if (( $1 > $2 )); then
> > +		echo "$2"
> > +	else
> > +		echo "$1"
> > +	fi
> > +}
> 
> Should these be common/rc helpers?
> 
> Or, since bash is ... uh fun with arguments...
> 
> _min() {
> 	local ret
> 
> 	for arg in "$@"; do
> 		if [ -z "$ret" ] || (( $arg < $ret )); then
> 			ret="$arg"
> 		fi
> 	done
> 	echo $ret
> }
> 
> and then you can pass as many arguments as you like.  The only downside
> is that you can pass stringly typed crap "_min cow frog" and it still
> returns "cow".  As if.

Yes, this makes sense (sans the cow vs frog part :p). I'll make the change in v2.
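
If the stringly typed case is a concern, a variant of the helper above could
reject non-integer arguments up front. This is only a sketch; the regex
check and the error message are my assumptions, not something agreed on in
this thread:

```shell
# Sketch of a variadic _min that refuses non-integer arguments.
_min() {
	local ret arg

	for arg in "$@"; do
		if ! [[ "$arg" =~ ^-?[0-9]+$ ]]; then
			echo "_min: non-numeric argument: $arg" >&2
			return 1
		fi
		if [ -z "$ret" ] || (( arg < ret )); then
			ret="$arg"
		fi
	done
	echo "$ret"
}
```

With this, "_min 65536 4096 16384" prints 4096 and "_min cow frog" fails
with a non-zero return instead of printing "cow".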

> 
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount
> > +
> > +touch "$SCRATCH_MNT/f1"
> > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > +blocksize=$(max "$awu_min_write" "$((awu_max_write/2))")
> > +
> > +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> > +blocksize=$(min "$blocksize" "65536")
> > +
> > +fio_config=$tmp.fio
> > +fio_out=$tmp.fio.out
> > +
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> 
> What program is nproc?

It returns the number of CPUs so we can scale the load based on CPUs we
have. It comes from coreutils so I think the distros should have it.
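
For reference, the scaling boils down to the following. The getconf
fallback and the LOAD_FACTOR default of 1 are assumptions added only to
keep this sketch standalone; the test itself calls nproc directly:

```shell
# CPU count from coreutils nproc, falling back to getconf if it is missing.
ncpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
# fstests normally provides LOAD_FACTOR; default it here for the sketch.
: "${LOAD_FACTOR:=1}"
FIO_LOAD=$((ncpus * 2 * LOAD_FACTOR))
echo "FIO_LOAD=$FIO_LOAD"
```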


Thanks,
ojaswin
> 
> --D
> 
> > +SIZE=$((100 * 1024 * 1024))
> > +
> > +cat >$fio_config <<EOF
> > +[aio-dio-aw-verify]
> > +direct=1
> > +ioengine=libaio
> > +rw=randwrite
> > +bs=$blocksize
> > +fallocate=native
> > +filename=$SCRATCH_MNT/test-file
> > +size=$SIZE
> > +iodepth=$FIO_LOAD
> > +numjobs=$FIO_LOAD
> > +group_reporting=1
> > +verify_state_save=0
> > +verify=crc32c
> > +verify_fatal=1
> > +verify_dump=0
> > +verify_backlog=1024
> > +verify_async=4
> > +verify_write_sequence=0
> > +atomic=1
> > +EOF
> > +
> > +_require_fio $fio_config
> > +
> > +cat $fio_config >> $seqres.full
> > +$FIO_PROG $fio_config --output=$fio_out
> > +cat $fio_out >> $seqres.full
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/generic/767.out b/tests/generic/767.out
> > new file mode 100644
> > index 00000000..2bf7f989
> > --- /dev/null
> > +++ b/tests/generic/767.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 767
> > +Silence is golden
> > -- 
> > 2.49.0
> > 
> > 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-11 15:36   ` Darrick J. Wong
@ 2025-06-12  6:23     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 08:36:23AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:49PM +0530, Ojaswin Mujoo wrote:
> > This adds various atomic write multi-fsblock stresst tests
> > with mixed mappings and O_SYNC, to ensure the data and metadata
> > is atomically persisted even if there is a shutdown.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/770.out |   2 +
> >  2 files changed, 163 insertions(+)
> >  create mode 100755 tests/generic/770
> >  create mode 100644 tests/generic/770.out
> > 
> > diff --git a/tests/generic/770 b/tests/generic/770
> > new file mode 100755
> > index 00000000..2b98b3b3
> > --- /dev/null
> > +++ b/tests/generic/770
> > @@ -0,0 +1,161 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 770
> > +#
> > +# Atomic write multi-fsblock data integrity tests with mixed mappings
> > +# and O_SYNC
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto quick rw atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> 
> _require_shutdown?
> 
> --D

I'll add that to v2, thanks!

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-11 14:45   ` Darrick J. Wong
@ 2025-06-12  6:27     ` Ojaswin Mujoo
  2025-06-12 15:14       ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 07:45:58AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:
> > Stress file with atomic writes to ensure we excercise codepaths
> > where we are mixing different FS operations with atomic writes
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/771.out |  2 ++
> >  2 files changed, 51 insertions(+)
> >  create mode 100755 tests/generic/771
> >  create mode 100644 tests/generic/771.out
> > 
> > diff --git a/tests/generic/771 b/tests/generic/771
> > new file mode 100755
> > index 00000000..690dfa0a
> > --- /dev/null
> > +++ b/tests/generic/771
> > @@ -0,0 +1,49 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 771
> > +#
> > +# fuzz fsx with atomic writes
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest rw auto quick atomicwrites
> > +
> > +# Import common functions.
> > +. ./common/filter
> > +
> > +_require_test
> > +_require_odirect
> > +_require_scratch_write_atomic
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount  >> $seqres.full 2>&1
> > +
> > +testfile=$SCRATCH_MNT/testfile
> > +touch $testfile
> > +
> > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > +blksz=$(_get_block_size $SCRATCH_MNT)
> > +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> > +
> > +# fsx usage:
> > +#
> > +# -N numops: total # operations to do
> > +# -l flen: the upper bound on file size
> > +# -o oplen: the upper bound on operation size (64k default)
> > +# -w writebdy: $psize would make writes page aligned (on i386)
> > +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> > +# -W: mapped write operations DISabled
> > +
> > +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
> 
> Don't we already get fsx stress testing RWF_ATOMIC through generic/521
> and generic/522?

For RWF_ATOMIC we need to pass the -a flag to fsx, which the other tests
don't do. This test specifically passes -a to make sure we stress the
RWF_ATOMIC code path intermixed with other operations. Further, the other
tests run fsx on the test device, whereas our atomic write tests
generally assume the scratch device is the one with atomic capabilities.

> 
> Also why is mmap write disabled?

I thought it'd be better not to mix mmap writes with direct writes, but I
can add it back if that's preferred.

Regards,
ojaswin
> 
> --D
> 
> > +status=$?
> > +
> > +if [[ "$status" != "0" ]]
> > +then
> > +	echo "Somthing went wrong, check $seqres.full"
> > +fi
> > +
> > +echo "Silence is golden"
> > +status=0
> > +exit
> > diff --git a/tests/generic/771.out b/tests/generic/771.out
> > new file mode 100644
> > index 00000000..c2345c7b
> > --- /dev/null
> > +++ b/tests/generic/771.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 771
> > +Silence is golden
> > -- 
> > 2.49.0
> > 
> > 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-11 15:38   ` Darrick J. Wong
@ 2025-06-12  6:28     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-12  6:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Wed, Jun 11, 2025 at 08:38:22AM -0700, Darrick J. Wong wrote:
> On Wed, Jun 11, 2025 at 03:04:51PM +0530, Ojaswin Mujoo wrote:
> > This test is intended to ensure that multi blocks atomic writes
> > maintain atomic guarantees across sudden FS shutdowns.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/772     | 360 ++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/772.out |   2 +
> >  2 files changed, 362 insertions(+)
> >  create mode 100755 tests/generic/772
> >  create mode 100644 tests/generic/772.out
> > 
> > diff --git a/tests/generic/772 b/tests/generic/772
> > new file mode 100755
> > index 00000000..6af7e74c
> > --- /dev/null
> > +++ b/tests/generic/772
> > @@ -0,0 +1,360 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 0772
> > +#
> > +# Test multi block atomic writes with sudden FS shutdowns to ensure
> > +# the FS is not tearing the write operation
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> 
> _require_shutdown?
> 
> Otherwise seems fine to me...
> 
> --D

Will do, thanks for the reviews!

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-11  9:34 ` [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-06-12 10:26   ` John Garry
  2025-06-13  5:37     ` Ojaswin Mujoo
  2025-06-19  7:45   ` Zorro Lang
  1 sibling, 1 reply; 61+ messages in thread
From: John Garry @ 2025-06-12 10:26 UTC (permalink / raw)
  To: Ojaswin Mujoo, fstests; +Cc: Ritesh Harjani, djwong

On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> 
> Brute force all possible blocksize clustersize combination on a bigalloc
> filesystem for stressing atomic write using fio data crc verifier. We run
> multiple threads in parallel with each job writing to its own file. The
> parallel jobs running on a constrained filesystem size ensure that we stress
> the ext4 allocator to allocate contiguous extents.
> 
> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>

RWF_ATOMIC does not guarantee that racing atomic writes and reads are 
serialised. That is what you are testing here, right?

NVMe and SCSI do guarantee this (serialisation). However, reads in the 
block layer may be split into multiple requests, even though unlikely.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 02/12] common/rc: Add a helper to run fsx on a given file
  2025-06-12  6:17     ` Ojaswin Mujoo
@ 2025-06-12 14:35       ` Darrick J. Wong
  0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-12 14:35 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Thu, Jun 12, 2025 at 11:47:33AM +0530, Ojaswin Mujoo wrote:
> On Wed, Jun 11, 2025 at 07:31:38AM -0700, Darrick J. Wong wrote:
> > On Wed, Jun 11, 2025 at 03:04:45PM +0530, Ojaswin Mujoo wrote:
> > > Currently run_fsx is hardcoded to run on a file in $TEST_DIR.
> > > Add a helper _run_fsx_on_file so that we can run fsx on any
> > > given file including in $SCRATCH_MNT. Also, refactor _run_fsx
> > > to use this helper.
> > > 
> > > No functional change is intended in this patch.
> > > 
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  common/rc | 21 ++++++++++++++++++---
> > >  1 file changed, 18 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/common/rc b/common/rc
> > > index cfbe2a5f..a5d811a1 100644
> > > --- a/common/rc
> > > +++ b/common/rc
> > > @@ -5115,13 +5115,22 @@ _require_hugepage_fsx()
> > >  		_notrun "fsx binary does not support MADV_COLLAPSE"
> > >  }
> > >  
> > > -_run_fsx()
> > > +_run_fsx_on_file()
> > >  {
> > > +	local testfile=$1
> > > +	shift
> > > +
> > > +	if ! [ -f $testfile ]
> > > +	then
> > > +		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
> > > +		touch $testfile
> > > +	fi
> > > +
> > >  	echo "fsx $*"
> > >  	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> > > -	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> > > +	set -- $FSX_PROG $args $FSX_AVOID $testfile
> > 
> > 	local testfile="${1:-$TEST_DIR/junk}"
> > 	...
> > 	set -- $FSX_PROG $args $FSX_AVOID $testfile
> > 
> > Then you don't need the extra helper.
> Hi Darrick,
> 
> I originally added the extra helper so that we won't need to change the
> existing run_fsx calls. For example, with the change you suggested,
> generic/263 would need to be changed from:
> 
> run_fsx -N 10000  -o 8192   -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
> 
> to 
> 
> run_fsx $TEST_DIR/junk -N 10000  -o 8192   -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
> 
> because otherwise, the $1 would actually be "-N" and run_fsx would get
> confused. 
> 
> However, if you would prefer to drop the helper and rather modify
> run_fsx and the call sites to pass a filename, that should be doable as
> well. Please let me know your preference on this.

Ah, right.  I forgot that the function filters its arguments but then
passes them on to fsx.  Ignore this comment, then. :)
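
To make the argument-passing pitfall concrete, here is a minimal sketch. The names pick_file_default and pick_file_explicit, and the /mnt/test fallback for TEST_DIR, are hypothetical stand-ins for illustration only and are not part of fstests.

```shell
# Hypothetical stand-ins for illustration; not fstests helpers.
TEST_DIR=${TEST_DIR:-/mnt/test}

# Variant using "${1:-default}": whatever comes first is taken as the file,
# so an option such as -N is mistaken for the file name.
pick_file_default() {
	local testfile="${1:-$TEST_DIR/junk}"
	printf '%s\n' "$testfile"
}

# Variant with an explicit file argument, in the style of _run_fsx_on_file:
# the file is consumed with shift and the rest is passed through untouched.
pick_file_explicit() {
	local testfile=$1
	shift
	printf '%s\n' "$testfile $*"
}

pick_file_default -N 10000 -o 8192
pick_file_explicit "$TEST_DIR/junk" -N 10000 -o 8192
```

With the default-argument variant the existing call sites would have to grow a file name as their first argument, which is what the separate helper avoids.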

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> 
> Regards,
> ojaswin
> 
> > 
> > --D
> > 
> > >  	echo "$@" >>$seqres.full
> > > -	rm -f $TEST_DIR/junk
> > > +	rm -f $testfile
> > >  	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
> > >  	local res=${PIPESTATUS[0]}
> > >  	if [ $res -ne 0 ]; then
> > > @@ -5133,6 +5142,12 @@ _run_fsx()
> > >  	return 0
> > >  }
> > >  
> > > +_run_fsx()
> > > +{
> > > +	_run_fsx_on_file $TEST_DIR/junk $@
> > > +	return $?
> > > +}
> > > +
> > >  # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
> > >  # the test, but if any other error occurs then exit the test.
> > >  _run_hugepage_fsx() {
> > > -- 
> > > 2.49.0
> > > 
> > > 
> 


* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-12  6:11     ` Ojaswin Mujoo
@ 2025-06-12 14:36       ` Darrick J. Wong
  2025-06-13  5:31         ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-12 14:36 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Thu, Jun 12, 2025 at 11:41:16AM +0530, Ojaswin Mujoo wrote:
> On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > 
> > > Insert range and collapse range only work with bigalloc when the
> > > range is cluster size aligned, which fsx doesn't take care of. To
> > > work around this, disable insert range and collapse range on ext4 if
> > > bigalloc is enabled.
> > 
> > Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> > maybe we should amend test() in fsx to round to the allocation unit
> > size?
> Hey Darrick,
> 
> Yes makes sense but as you mentioned, I'm not sure if there
> is a way to programmatically detect the bigalloc cluster size (or
> allocation unit in general) like we do for xfs. 

I don't either, but maybe we should have a way to reveal the allocation
unit size for a given file?  Yet another statx field? :P

(It /would/ be useful for programs that use collapse/insert range)

--D

> (Adding Ted to CC in case he has some idea)
> > 
> > Querying that programmatically might be ... interesting though.  Is
> > there a good way to do that for ext4 bigalloc?
> > 
> > (See detect_xfs_alloc_unit in punch-alternating.c)
> > 
> > --D
> > 
> > > This is achieved by defining a new function _setup_fs_options
> > > which can serve as a mechanism to apply FS-wide options to
> > > the tests.
> > > 
> > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  common/preamble | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > > 
> > > diff --git a/common/preamble b/common/preamble
> > > index ba029a34..2bccff74 100644
> > > --- a/common/preamble
> > > +++ b/common/preamble
> > > @@ -24,6 +24,20 @@ _register_cleanup()
> > >  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
> > >  }
> > >  
> > > +# setup FS options only to be available for each test run
> > > +_setup_fs_options() {
> > > +	case "$FSTYP" in
> > > +	"ext4")
> > > +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > > +			export FSX_AVOID="-I -C"
> > > +		fi
> > > +		;;
> > > +	# Add other filesystem types here as needed
> > > +	*)
> > > +		;;
> > > +	esac
> > > +}
> > > +
> > >  # Prepare to run a fstest by initializing the required global variables to
> > >  # their defaults, sourcing common functions, registering a cleanup function,
> > >  # and removing the $seqres.full file.
> > > @@ -55,4 +69,6 @@ _begin_fstest()
> > >  	# remove previous $seqres.full before test
> > >  	rm -f $seqres.full $seqres.hints
> > >  
> > > +	# setup filesystem options for a given test execution
> > > +	_setup_fs_options
> > >  }
> > > -- 
> > > 2.49.0
> > > 
> > > 
> 


* Re: [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-12  6:22     ` Ojaswin Mujoo
@ 2025-06-12 14:55       ` Darrick J. Wong
  0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-12 14:55 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Thu, Jun 12, 2025 at 11:52:48AM +0530, Ojaswin Mujoo wrote:
> On Wed, Jun 11, 2025 at 07:42:25AM -0700, Darrick J. Wong wrote:
> > On Wed, Jun 11, 2025 at 03:04:47PM +0530, Ojaswin Mujoo wrote:
> > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > 
> > > This adds an atomic write test using fio, based on its crc check verifier.
> > > fio adds a crc for each data block. If the underlying device supports atomic
> > > writes then it is guaranteed that we will never have a mix of data from two
> > > threads writing on the same physical block.
> > > 
> > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/767.out |  2 ++
> > >  2 files changed, 86 insertions(+)
> > >  create mode 100755 tests/generic/767
> > >  create mode 100644 tests/generic/767.out
> > > 
> > > diff --git a/tests/generic/767 b/tests/generic/767
> > > new file mode 100755
> > > index 00000000..4f80e7b6
> > > --- /dev/null
> > > +++ b/tests/generic/767
> > > @@ -0,0 +1,84 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > > +#
> > > +# FS QA Test 767
> > > +#
> > > +# Validate FS atomic write using fio crc check verifier.
> > > +#
> > > +. ./common/preamble
> > > +. ./common/atomicwrites
> > > +
> > > +_begin_fstest auto aio rw atomicwrites
> > > +
> > > +_require_scratch_write_atomic
> > > +_require_odirect
> > > +_require_aio
> > > +
> > > +function max()
> > > +{
> > > +	if (( $1 > $2 )); then
> > > +		echo "$1"
> > > +	else
> > > +		echo "$2"
> > > +	fi
> > > +}
> > > +
> > > +function min()
> > > +{
> > > +	if (( $1 > $2 )); then
> > > +		echo "$2"
> > > +	else
> > > +		echo "$1"
> > > +	fi
> > > +}
> > 
> > Should these be common/rc helpers?
> > 
> > Or, since bash is ... uh fun with arguments...
> > 
> > _min() {
> > 	local ret
> > 
> > 	for arg in "$@"; do
> > 		if [ -z "$ret" ] || (( $arg < $ret )); then
> > 			ret="$arg"
> > 		fi
> > 	done
> > 	echo $ret
> > }
> > 
> > and then you can pass as many arguments as you like.  The only downside
> > is that you can pass stringly typed crap "_min cow frog" and it still
> > returns "cow".  As if.
> 
> Yes this makes sense (sans the cow vs frog part :p) . I'll make the change in v2.
> 
> > 
> > > +_scratch_mkfs >> $seqres.full 2>&1
> > > +_scratch_mount
> > > +
> > > +touch "$SCRATCH_MNT/f1"
> > > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > > +blocksize=$(max "$awu_min_write" "$((awu_max_write/2))")
> > > +
> > > +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> > > +blocksize=$(min "$blocksize" "65536")
> > > +
> > > +fio_config=$tmp.fio
> > > +fio_out=$tmp.fio.out
> > > +
> > > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > 
> > What program is nproc?
> 
> It returns the number of CPUs so we can scale the load based on CPUs we
> have. It comes from coreutils so I think the distros should have it.

Ah, another variant on getconf _NPROCESSORS_ONLN.  Though looking at the
strace I guess its benefit is that it looks at the cpu affinity mask to
compute the number of processors that can actually be used.  Hrm, maybe
that would be a good cleanup for another day...
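
As a rough illustration of the difference between the two, here is a small sketch. The fio_load variable and the LOAD_FACTOR default of 1 are assumptions made only so the snippet runs outside fstests.

```shell
# Processors currently online, system-wide ("unknown" if getconf is missing).
online=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)

# Processors this process may actually run on; nproc honours the CPU
# affinity mask, so it can be smaller than the online count.
usable=$(nproc)
echo "online=$online usable=$usable"

# Load calculation in the style of the test under review.
# LOAD_FACTOR defaults to 1 here purely for illustration.
LOAD_FACTOR=${LOAD_FACTOR:-1}
fio_load=$((usable * 2 * LOAD_FACTOR))
echo "fio_load=$fio_load"
```

Running the same snippet under a restricted affinity mask (for example via taskset, where available) should lower usable while leaving online unchanged.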

--D

> 
> Thanks,
> ojaswin
> > 
> > --D
> > 
> > > +SIZE=$((100 * 1024 * 1024))
> > > +
> > > +cat >$fio_config <<EOF
> > > +[aio-dio-aw-verify]
> > > +direct=1
> > > +ioengine=libaio
> > > +rw=randwrite
> > > +bs=$blocksize
> > > +fallocate=native
> > > +filename=$SCRATCH_MNT/test-file
> > > +size=$SIZE
> > > +iodepth=$FIO_LOAD
> > > +numjobs=$FIO_LOAD
> > > +group_reporting=1
> > > +verify_state_save=0
> > > +verify=crc32c
> > > +verify_fatal=1
> > > +verify_dump=0
> > > +verify_backlog=1024
> > > +verify_async=4
> > > +verify_write_sequence=0
> > > +atomic=1
> > > +EOF
> > > +
> > > +_require_fio $fio_config
> > > +
> > > +cat $fio_config >> $seqres.full
> > > +$FIO_PROG $fio_config --output=$fio_out
> > > +cat $fio_out >> $seqres.full
> > > +
> > > +# success, all done
> > > +echo Silence is golden
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/767.out b/tests/generic/767.out
> > > new file mode 100644
> > > index 00000000..2bf7f989
> > > --- /dev/null
> > > +++ b/tests/generic/767.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 767
> > > +Silence is golden
> > > -- 
> > > 2.49.0
> > > 
> > > 
> 


* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-12  6:27     ` Ojaswin Mujoo
@ 2025-06-12 15:14       ` Darrick J. Wong
  2025-06-13  5:20         ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-12 15:14 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry

On Thu, Jun 12, 2025 at 11:57:45AM +0530, Ojaswin Mujoo wrote:
> On Wed, Jun 11, 2025 at 07:45:58AM -0700, Darrick J. Wong wrote:
> > On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:
> > > Stress file with atomic writes to ensure we exercise codepaths
> > > where we are mixing different FS operations with atomic writes
> > > 
> > > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/771.out |  2 ++
> > >  2 files changed, 51 insertions(+)
> > >  create mode 100755 tests/generic/771
> > >  create mode 100644 tests/generic/771.out
> > > 
> > > diff --git a/tests/generic/771 b/tests/generic/771
> > > new file mode 100755
> > > index 00000000..690dfa0a
> > > --- /dev/null
> > > +++ b/tests/generic/771
> > > @@ -0,0 +1,49 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > > +#
> > > +# FS QA Test 771
> > > +#
> > > +# fuzz fsx with atomic writes
> > > +#
> > > +. ./common/preamble
> > > +. ./common/atomicwrites
> > > +_begin_fstest rw auto quick atomicwrites
> > > +
> > > +# Import common functions.
> > > +. ./common/filter
> > > +
> > > +_require_test
> > > +_require_odirect
> > > +_require_scratch_write_atomic
> > > +
> > > +_scratch_mkfs >> $seqres.full 2>&1
> > > +_scratch_mount  >> $seqres.full 2>&1
> > > +
> > > +testfile=$SCRATCH_MNT/testfile
> > > +touch $testfile
> > > +
> > > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > > +blksz=$(_get_block_size $SCRATCH_MNT)
> > > +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> > > +
> > > +# fsx usage:
> > > +#
> > > +# -N numops: total # operations to do
> > > +# -l flen: the upper bound on file size
> > > +# -o oplen: the upper bound on operation size (64k default)
> > > +# -w writebdy: $psize would make writes page aligned (on i386)
> > > +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> > > +# -W: mapped write operations DISabled
> > > +
> > > +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
> > 
> > Don't we already get fsx stress testing RWF_ATOMIC through generic/521
> > and generic/522?
> 
> So for RWF_ATOMIC we need to pass the -a flag to fsx which the other tests
> don't do. This test specifically passes -a to make sure we stress the
> RWF_ATOMIC code path intermixed with other operations. Further, other
> tests run FSX on test device whereas for our atomic write tests we
> generally assume the scratch dev is the one with atomic capabilities.

Oh, I overlooked the part where it's not enabled by default if its
presence can be detected by fsx.  Can we flip the polarity of -a, that
way RWF_ATOMIC will get tested with all the other fsx soak tests?

> > Also why is mmap write disabled?
> 
> I thought it'd be better to not mix mmap writes with direct writes but I
> can add it back if that's preferred.

It's not unreasonable that some dumb program is going to try that some
day, even though we all scream for ice crea^W^W^Wnot to mix the two IO
paths.

--D

> Regards,
> ojaswin
> > 
> > --D
> > 
> > > +status=$?
> > > +
> > > +if [[ "$status" != "0" ]]
> > > +then
> > > +	echo "Somthing went wrong, check $seqres.full"
> > > +fi
> > > +
> > > +echo "Silence is golden"
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/771.out b/tests/generic/771.out
> > > new file mode 100644
> > > index 00000000..c2345c7b
> > > --- /dev/null
> > > +++ b/tests/generic/771.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 771
> > > +Silence is golden
> > > -- 
> > > 2.49.0
> > > 
> > > 
> 


* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-12 15:14       ` Darrick J. Wong
@ 2025-06-13  5:20         ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-13  5:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry

On Thu, Jun 12, 2025 at 08:14:35AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 12, 2025 at 11:57:45AM +0530, Ojaswin Mujoo wrote:
> > On Wed, Jun 11, 2025 at 07:45:58AM -0700, Darrick J. Wong wrote:
> > > On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:

<...>

> > > > +# -N numops: total # operations to do
> > > > +# -l flen: the upper bound on file size
> > > > +# -o oplen: the upper bound on operation size (64k default)
> > > > +# -w writebdy: $psize would make writes page aligned (on i386)
> > > > +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> > > > +# -W: mapped write operations DISabled
> > > > +
> > > > +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
> > > 
> > > Don't we already get fsx stress testing RWF_ATOMIC through generic/521
> > > and generic/522?
> > 
> > So for RWF_ATOMIC we need to pass the -a flag to fsx which the other tests
> > don't do. This test specifically passes -a to make sure we stress the
> > RWF_ATOMIC code path intermixed with other operations. Further, other
> > tests run FSX on test device whereas for our atomic write tests we
> > generally assume the scratch dev is the one with atomic capabilities.
> 
> Oh, I overlooked the part where it's not enabled by default if its
> presence can be detected by fsx.  Can we flip the polarity of -a, that
> way RWF_ATOMIC will get tested with all the other fsx soak tests?

Sure Darrick, I'll do that in the next revision.

> 
> > > Also why is mmap write disabled?
> > 
> > I thought it'd be better to not mix mmap writes with direct writes but I
> > can add it back if that's preferred.
> 
> It's not unreasonable that some dumb program is going to try that some
> day, even though we all scream for ice crea^W^W^Wnot to mix the two IO
> paths.

Haha, makes sense. I'll enable it as well.

> 
> --D
> 
> > Regards,
> > ojaswin
> > > 
> > > --D
> > > 
> > > > +status=$?
> > > > +
> > > > +if [[ "$status" != "0" ]]
> > > > +then
> > > > +	echo "Somthing went wrong, check $seqres.full"
> > > > +fi
> > > > +
> > > > +echo "Silence is golden"
> > > > +status=0
> > > > +exit
> > > > diff --git a/tests/generic/771.out b/tests/generic/771.out
> > > > new file mode 100644
> > > > index 00000000..c2345c7b
> > > > --- /dev/null
> > > > +++ b/tests/generic/771.out
> > > > @@ -0,0 +1,2 @@
> > > > +QA output created by 771
> > > > +Silence is golden
> > > > -- 
> > > > 2.49.0
> > > > 
> > > > 
> > 


* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-12 14:36       ` Darrick J. Wong
@ 2025-06-13  5:31         ` Ojaswin Mujoo
  2025-06-13 15:04           ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-13  5:31 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Thu, Jun 12, 2025 at 07:36:14AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 12, 2025 at 11:41:16AM +0530, Ojaswin Mujoo wrote:
> > On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> > > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > > 
> > > > Insert range and collapse range only work with bigalloc when the
> > > > range is cluster size aligned, which fsx doesn't take care of. To
> > > > work around this, disable insert range and collapse range on ext4 if
> > > > bigalloc is enabled.
> > > 
> > > Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> > > maybe we should amend test() in fsx to round to the allocation unit
> > > size?
> > Hey Darrick,
> > 
> > Yes makes sense but as you mentioned, I'm not sure if there
> > is a way to programmatically detect the bigalloc cluster size (or
> > allocation unit in general) like we do for xfs. 
> 
> I don't either, but maybe we should have a way to reveal the allocation
> unit size for a given file?  Yet another statx field? :P
> 
> (It /would/ be useful for programs that use collapse/insert range)

Yes it would, at the very least, help with defining clear semantics for
collapse/insert range with bigalloc/rtvol because right now those
operations just EINVAL if the range is not aligned correctly, which is
confusing since it is not documented how to do it properly.
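
For illustration, this is the kind of rounding a caller would need if the allocation unit were known. The helper names round_down and round_up and the 64k cluster size are assumptions for the sketch; as discussed above, there is currently no generic way to query that size.

```shell
# Round a value down/up to a multiple of the allocation unit.
round_down() { echo $(( $1 / $2 * $2 )); }
round_up()   { echo $(( ($1 + $2 - 1) / $2 * $2 )); }

# Hypothetical values: a 64k bigalloc cluster and an unaligned range.
cluster=65536
offset=100000
len=200000

# Align the start downwards and the length upwards so that both are
# multiples of the cluster size.
aligned_off=$(round_down "$offset" "$cluster")
aligned_len=$(round_up "$len" "$cluster")
echo "collapse range: offset=$aligned_off len=$aligned_len"
```

A real caller would still have to make sure the aligned range stays within the file, which this sketch does not check.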

xfs does have an ioctl to get the geometry for rtvol. I think you are
suggesting a more generic statx field which can be used by other FSes as
well, right?


Regards,
ojaswin
> 
> --D
> 


* Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-12 10:26   ` John Garry
@ 2025-06-13  5:37     ` Ojaswin Mujoo
  2025-06-20 14:01       ` John Garry
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-13  5:37 UTC (permalink / raw)
  To: John Garry; +Cc: fstests, Ritesh Harjani, djwong

On Thu, Jun 12, 2025 at 11:26:17AM +0100, John Garry wrote:
> On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> > 
> > Brute force all possible blocksize clustersize combination on a bigalloc
> > filesystem for stressing atomic write using fio data crc verifier. We run
> > multiple threads in parallel with each job writing to its own file. The
> > parallel jobs running on a constrained filesystem size ensure that we stress
> > the ext4 allocator to allocate contiguous extents.
> > 
> > Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> 
> RWF_ATOMIC does not guarantee that racing atomic writes and reads are
> serialised. That is what you are testing here, right?
> 
> NVMe and SCSI do guarantee this (serialisation). However, reads in the block
> layer may be split into multiple requests, even though unlikely.

Hey John,

We are not really testing the serialization here
(verify_write_sequence=0) but rather that multiple threads atomically
writing to the same file should never tear the write. 

In the test, for each job, multiple threads are doing the write on the
same file with the same iosize so they should always overwrite each
other completely.  The verifier then ensures that the whole iosize chunk
written matches the checksum, which will only happen if the write is not
torn. That way we are able to ensure that even with multiple threads
writing the same ranges, we don't tear the writes (the sequence doesn't
matter as long as no write is torn).
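
A toy model of that check, for illustration only: it is not how fio verifies data (fio stores a crc with each block), and the is_torn helper and the single-character fill patterns are assumptions of this sketch.

```shell
# Toy model: each writer fills a whole (non-empty) chunk with its own
# single character. A chunk is "torn" if it holds bytes from more than
# one writer.
is_torn() {
	first=$(printf '%s' "$1" | cut -c1)
	case $1 in
	*[!"$first"]*) return 0 ;;	# some byte differs from the first one
	*) return 1 ;;			# every byte came from the same writer
	esac
}

# Two complete overwrites and one mixed chunk.
for chunk in AAAAAAAA BBBBBBBB AAAABBBB; do
	if is_torn "$chunk"; then
		echo "$chunk: torn"
	else
		echo "$chunk: intact"
	fi
done
```

Which writer wins is irrelevant in this model, matching verify_write_sequence=0: only a chunk mixing two writers counts as a failure.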


* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-13  5:31         ` Ojaswin Mujoo
@ 2025-06-13 15:04           ` Darrick J. Wong
  2025-06-17  6:22             ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-13 15:04 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Fri, Jun 13, 2025 at 11:01:25AM +0530, Ojaswin Mujoo wrote:
> On Thu, Jun 12, 2025 at 07:36:14AM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 12, 2025 at 11:41:16AM +0530, Ojaswin Mujoo wrote:
> > > On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> > > > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > > > 
> > > > > Insert range and collapse range only work with bigalloc when the
> > > > > range is cluster size aligned, which fsx doesn't take care of. To
> > > > > work around this, disable insert range and collapse range on ext4 if
> > > > > bigalloc is enabled.
> > > > 
> > > > Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> > > > maybe we should amend test() in fsx to round to the allocation unit
> > > > size?
> > > Hey Darrick,
> > > 
> > > Yes makes sense but as you mentioned, I'm not sure if there
> > > is a way to programmatically detect the bigalloc cluster size (or
> > > allocation unit in general) like we do for xfs. 
> > 
> > I don't either, but maybe we should have a way to reveal the allocation
> > unit size for a given file?  Yet another statx field? :P
> > 
> > (It /would/ be useful for programs that use collapse/insert range)
> 
> Yes it would, at the very least, help with defining clear semantics for
> collapse/insert range with bigalloc/rtvol because right now those
> operations just EINVAL if the range is not aligned correctly, which is
> confusing since it is not documented how to do it properly.
> 
> xfs does have an ioctl to get the geometry for rtvol. I think you are
> suggesting a more generic statx field which can be used by other FSes as
> well, right?

Right, since other filesystems (fat, ntfs, etc) also have allocation
units larger than the fsblock size.  Most of the time the allocunit
amplification simply doesn't matter to applications, but once in a while
it does (collapse/insert range, cow) affect performance.

--D

> 
> Regards,
> ojaswin
> > 
> > --D
> > 
> 


* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-13 15:04           ` Darrick J. Wong
@ 2025-06-17  6:22             ` Ojaswin Mujoo
  2025-06-30 15:27               ` Darrick J. Wong
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-17  6:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Fri, Jun 13, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> On Fri, Jun 13, 2025 at 11:01:25AM +0530, Ojaswin Mujoo wrote:
> > On Thu, Jun 12, 2025 at 07:36:14AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jun 12, 2025 at 11:41:16AM +0530, Ojaswin Mujoo wrote:
> > > > On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> > > > > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > > > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > > > > 
> > > > > > Insert range and collapse range only work with bigalloc when the
> > > > > > range is cluster size aligned, which fsx doesn't take care of. To
> > > > > > work around this, disable insert range and collapse range on ext4 if
> > > > > > bigalloc is enabled.
> > > > > 
> > > > > Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> > > > > maybe we should amend test() in fsx to round to the allocation unit
> > > > > size?
> > > > Hey Darrick,
> > > > 
> > > > Yes makes sense but as you mentioned, I'm not sure if there
> > > > is a way to programmatically detect the bigalloc cluster size (or
> > > > allocation unit in general) like we do for xfs. 
> > > 
> > > I don't either, but maybe we should have a way to reveal the allocation
> > > unit size for a given file?  Yet another statx field? :P
> > > 
> > > (It /would/ be useful for programs that use collapse/insert range)
> > 
> > Yes it would, at the very least, help with defining clear semantics for
> > collapse/insert range with bigalloc/rtvol because right now those
> > operations just EINVAL if the range is not aligned correctly, which is
> > confusing since it is not documented how to do it properly.
> > 
> > xfs does have an ioctl to get the geometry for rtvol. I think you are
> > suggesting a more generic statx field which can be used by other FSes as
> > well, right?
> 
> Right, since other filesystems (fat, ntfs, etc) also have allocation
> units larger than the fsblock size.  Most of the time the allocunit
> amplification simply doesn't matter to applications, but once in a while
> it does (collapse/insert range, cow) affect performance.
> 
> --D

Makes sense Darrick. I can look into it. 

For this patch, is it okay to keep the approach of disabling
collapse/insert range for bigalloc for now? We can change fsx later
if we add support for exposing alloc units.

Regards,
Ojaswin


* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
  2025-06-11 14:30   ` Darrick J. Wong
@ 2025-06-18 19:13   ` Zorro Lang
  2025-06-20  6:21     ` Ojaswin Mujoo
  1 sibling, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-18 19:13 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> Insert range and collapse range only work with bigalloc when the
> range is cluster size aligned, which fsx doesn't take care of. To
> work around this, disable insert range and collapse range on ext4 if
> bigalloc is enabled.
> 
> This is achieved by defining a new function _setup_fs_options
> which can serve as a mechanism to apply FS-wide options to
> the tests.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  common/preamble | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/common/preamble b/common/preamble
> index ba029a34..2bccff74 100644
> --- a/common/preamble
> +++ b/common/preamble
> @@ -24,6 +24,20 @@ _register_cleanup()
>  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
>  }
>  
> +# setup FS options only to be available for each test run
> +_setup_fs_options() {

If this is a function for fsx only, it's better to put "fsx" in the name, e.g.
_setup_default_fsx_avoid (or some other name).

> +	case "$FSTYP" in
> +	"ext4")
> +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> +			export FSX_AVOID="-I -C"

Hmm... I'm also wondering whether this is an issue that should be fixed in fstests.
How about letting testers who test ext4 with MKFS_OPTIONS="-O bigalloc" write a
local.config section like the one below?

[ext4-bigalloc]
...
MKFS_OPTIONS="-O bigalloc"
FSX_AVOID="-I -C"

Thanks,
Zorro


> +		fi
> +		;;
> +	# Add other filesystem types here as needed
> +	*)
> +		;;
> +	esac
> +}
> +
>  # Prepare to run a fstest by initializing the required global variables to
>  # their defaults, sourcing common functions, registering a cleanup function,
>  # and removing the $seqres.full file.
> @@ -55,4 +69,6 @@ _begin_fstest()
>  	# remove previous $seqres.full before test
>  	rm -f $seqres.full $seqres.hints
>  
> +	# setup filesystem options for a given test execution
> +	_setup_fs_options
>  }
> -- 
> 2.49.0
> 
> 



* Re: [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-11  9:34 ` [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier Ojaswin Mujoo
  2025-06-11 14:42   ` Darrick J. Wong
@ 2025-06-18 19:34   ` Zorro Lang
  2025-06-20  7:06     ` Ojaswin Mujoo
  1 sibling, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-18 19:34 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:47PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> This adds an atomic write test using fio, based on its crc check verifier.
> fio adds a crc for each data block. If the underlying device supports atomic
> writes then it is guaranteed that we will never have a mix of data from two
> threads writing on the same physical block.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/767.out |  2 ++

I'd recommend using a bigger case number for this RFC test case (and the others
in this patchset), to make it easier to rebase onto later fstests :)

>  2 files changed, 86 insertions(+)
>  create mode 100755 tests/generic/767
>  create mode 100644 tests/generic/767.out
> 
> diff --git a/tests/generic/767 b/tests/generic/767
> new file mode 100755
> index 00000000..4f80e7b6
> --- /dev/null
> +++ b/tests/generic/767
> @@ -0,0 +1,84 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 767
> +#
> +# Validate FS atomic write using fio crc check verifier.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_odirect
> +_require_aio
> +
> +function max()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}
> +
> +function min()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$2"
> +	else
> +		echo "$1"
> +	fi
> +}
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +blocksize=$(max "$awu_min_write" "$((awu_max_write/2))")
> +
> +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> +blocksize=$(min "$blocksize" "65536")
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))

This is the first time I've come across this command :-D

[root@dell-per750-41 xfstests]# nproc 
112
[root@dell-per750-41 xfstests]# type -P nproc
/usr/bin/nproc
[root@dell-per750-41 xfstests]# rpm -qf `type -P nproc`
coreutils-9.7-3.fc43.x86_64

Thanks,
Zorro

> +SIZE=$((100 * 1024 * 1024))
> +
> +cat >$fio_config <<EOF
> +[aio-dio-aw-verify]
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +bs=$blocksize
> +fallocate=native
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +iodepth=$FIO_LOAD
> +numjobs=$FIO_LOAD
> +group_reporting=1
> +verify_state_save=0
> +verify=crc32c
> +verify_fatal=1
> +verify_dump=0
> +verify_backlog=1024
> +verify_async=4
> +verify_write_sequence=0
> +atomic=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +$FIO_PROG $fio_config --output=$fio_out
> +cat $fio_out >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/767.out b/tests/generic/767.out
> new file mode 100644
> index 00000000..2bf7f989
> --- /dev/null
> +++ b/tests/generic/767.out
> @@ -0,0 +1,2 @@
> +QA output created by 767
> +Silence is golden
> -- 
> 2.49.0
> 
> 



* Re: [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-11  9:34 ` [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
  2025-06-11 15:36   ` Darrick J. Wong
@ 2025-06-18 20:17   ` Zorro Lang
  2025-06-20  8:20     ` Ojaswin Mujoo
  1 sibling, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-18 20:17 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:49PM +0530, Ojaswin Mujoo wrote:
> This adds various atomic write multi-fsblock stress tests
> with mixed mappings and O_SYNC, to ensure the data and metadata
> is atomically persisted even if there is a shutdown.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/770.out |   2 +
>  2 files changed, 163 insertions(+)
>  create mode 100755 tests/generic/770
>  create mode 100644 tests/generic/770.out
> 
> diff --git a/tests/generic/770 b/tests/generic/770
> new file mode 100755
> index 00000000..2b98b3b3
> --- /dev/null
> +++ b/tests/generic/770
> @@ -0,0 +1,161 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 770
> +#
> +# Atomic write multi-fsblock data integrity tests with mixed mappings
> +# and O_SYNC
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +check_data_integrity() {
> +	actual=$(_hexdump $testfile)
> +	if [[ "$expected" != "$actual" ]]
> +	then
> +		echo "Integrity check failed"
> +		echo "Integrity check failed" >> $seqres.full
> +		echo "# Expected file contents:" >> $seqres.full
> +		echo "$expected" >> $seqres.full
> +		echo "# Actual file contents:" >> $seqres.full
> +		echo "$actual" >> $seqres.full
> +	fi
> +}
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +
> +# Create an expected pattern to compare with
> +$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
> +expected=$(_hexdump $testfile)
> +echo "# Expected file contents:" >> $seqres.full
> +echo "$expected" >> $seqres.full
> +
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations):" >> $seqres.full
> +# Calculate how many blocks (e.g. 4K) fit in awu_max (e.g. 64K)
> +num_blocks=$((awu_max / blksz))
> +echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
> +
> +operations=("W" "H" "U")
> +
> +# Run 10 iterations of the test
> +for ((iteration=1; iteration<=10; iteration++)); do
> +	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
> +
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +	off=0
> +	mapping=""
> +
> +	for ((i=0; i<num_blocks; i++)); do
> +		index=$((RANDOM % ${#operations[@]}))
> +		map="${operations[$index]}"
> +		mapping="${mapping}${map}"
> +
> +		case "$map" in
> +			"W")
> +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> +				;;
> +			"H")
> +				# No operation needed for hole
> +				;;
> +			"U")
> +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null

_require_xfs_io_command falloc

> +				;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> +
> +	sync $testfile
> +
> +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \

_require_xfs_io_command pwrite -A


> +				  grep wrote | awk -F'[/ ]' '{print $2}')
> +
> +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +	check_data_integrity
> +	echo "Iteration $iteration completed: OK" >> $seqres.full
> +	echo >> $seqres.full
> +done
> +echo "# Test 1: Do O_SYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
> +bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
> +                grep wrote | awk -F'[/ ]' '{print $2}')
> +test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +_scratch_shutdown -v >> $seqres.full

_require_scratch_shutdown

> +_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-2"
> +check_data_integrity
> +echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations):" >> $seqres.full
> +num_blocks=$((awu_max / blksz))
> +echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
> +
> +operations=("W" "H" "U")
> +
> +for ((iteration=1; iteration<=10; iteration++)); do
> +	echo "=== Mixed Mapping Shutdown Test Iteration $iteration ===" >> $seqres.full
> +
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> +	off=0
> +	mapping=""
> +
> +	for ((i=0; i<num_blocks; i++)); do
> +		index=$((RANDOM % ${#operations[@]}))
> +		map="${operations[$index]}"
> +		mapping="${mapping}${map}"
> +
> +		case "$map" in
> +			"W")
> +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> +				;;
> +			"H")
> +				# No operation needed for hole
> +				;;
> +			"U")
> +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile > /dev/null
> +				;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> +
> +	sync $testfile
> +
> +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> +				  grep wrote | awk -F'[/ ]' '{print $2}')
> +
> +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> +
> +	echo "Shutting down filesystem" >> $seqres.full
> +	_scratch_shutdown -v >> $seqres.full
> +	_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> +	check_data_integrity
> +	echo "Iteration $iteration completed: OK" >> $seqres.full
> +	echo >> $seqres.full
> +done

It looks like there are two loops here (Test 1 and Test 3) whose code is nearly
identical. Can we move them into one function and call it twice?
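
A rough sketch of the shape I have in mind, not a drop-in patch. The function
names build_random_mapping and run_mixed_mapping_iterations are made up for
illustration, and the xfs_io, shutdown and integrity-check steps are replaced
by comments and echo so that the sketch runs on its own:

```bash
# Build one random W/H/U mapping string for $1 blocks. In the real test each
# letter would trigger the matching xfs_io call (pwrite / nothing / falloc),
# exactly as in the two loops of the patch.
build_random_mapping() {
	local num=$1
	local operations=("W" "H" "U")
	local mapping="" i index

	for ((i = 0; i < num; i++)); do
		index=$((RANDOM % ${#operations[@]}))
		mapping="${mapping}${operations[$index]}"
	done
	echo "$mapping"
}

# Hypothetical shared body for Test 1 and Test 3. $1 is a label for the log,
# $2 says whether to shut down and remount before checking the data.
run_mixed_mapping_iterations() {
	local label=$1
	local do_shutdown=$2
	local iteration mapping

	for ((iteration = 1; iteration <= 10; iteration++)); do
		mapping=$(build_random_mapping "$num_blocks")
		echo "$label iteration $iteration: mapping $mapping"
		# ... prepare the file from $mapping, then do the atomic pwrite ...
		if [ "$do_shutdown" = "yes" ]; then
			echo "$label iteration $iteration: shutdown + cycle mount"
		fi
		# ... check_data_integrity ...
	done
}

num_blocks=16	# assumed value; the test computes awu_max / blksz
run_mixed_mapping_iterations "Test 1" no
run_mixed_mapping_iterations "Test 3" yes
```

The only difference between the two callers is the second argument, so the
mapping setup and the atomic write live in one place.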

Thanks,
Zorro

> +echo "# Test 3: Do O_SYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations): OK" >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> +
> diff --git a/tests/generic/770.out b/tests/generic/770.out
> new file mode 100644
> index 00000000..17994ed5
> --- /dev/null
> +++ b/tests/generic/770.out
> @@ -0,0 +1,2 @@
> +QA output created by 770
> +Silence is golden
> -- 
> 2.49.0
> 
> 



* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-11  9:34 ` [RFC 07/12] generic/771: Stress fsx with atomic writes enabled Ojaswin Mujoo
  2025-06-11 14:45   ` Darrick J. Wong
@ 2025-06-18 20:27   ` Zorro Lang
  2025-06-20  8:26     ` Ojaswin Mujoo
  1 sibling, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-18 20:27 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:
> Stress a file with atomic writes to ensure we exercise codepaths
> where we are mixing different FS operations with atomic writes.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/771.out |  2 ++
>  2 files changed, 51 insertions(+)
>  create mode 100755 tests/generic/771
>  create mode 100644 tests/generic/771.out
> 
> diff --git a/tests/generic/771 b/tests/generic/771
> new file mode 100755
> index 00000000..690dfa0a
> --- /dev/null
> +++ b/tests/generic/771
> @@ -0,0 +1,49 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 771
> +#
> +# fuzz fsx with atomic writes
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest rw auto quick atomicwrites
> +
> +# Import common functions.
> +. ./common/filter

I think this is unnecessary for this case.

> +
> +_require_test

Do you use TEST_DEV or TEST_DIR?

> +_require_odirect
> +_require_scratch_write_atomic
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount  >> $seqres.full 2>&1
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`

Do you need _require_block_device? Or are nfs, cifs, overlay, etc. fine for this test?

> +
> +# fsx usage:
> +#
> +# -N numops: total # operations to do
> +# -l flen: the upper bound on file size
> +# -o oplen: the upper bound on operation size (64k default)
> +# -w writebdy: $psize would make writes page aligned (on i386)
> +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> +# -W: mapped write operations DISabled
> +
> +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
> +status=$?

Generally we _exit directly after changing $status.

> +
> +if [[ "$status" != "0" ]]
> +then
> +	echo "Something went wrong, check $seqres.full"
> +fi

but you don't _exit ...

> +
> +echo "Silence is golden"
> +status=0

Then the status will be set to 0 again, so the "status=$?" above is useless.
You can check "$?" directly, or exit directly if _run_fsx_on_file fails.
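
For example, something along these lines. This is only a sketch: the two
helpers below are stand-ins so that it runs outside fstests (the real
_run_fsx_on_file comes from this series, and the real _fail ends the test
with a non-zero status), and run_fsx_step is a made-up wrapper name:

```bash
# Stand-in for the real helper: pretend fsx exited with status $1.
_run_fsx_on_file() { return "$1"; }

# Stand-in for fstests' _fail: print the message and stop with an error.
_fail() { echo "$*"; exit 1; }

run_fsx_step() {
	# Fail immediately instead of saving $? into $status and then
	# overwriting it with 0 further down.
	_run_fsx_on_file "$1" || _fail "fsx failed, check \$seqres.full"
	echo "Silence is golden"
}

run_fsx_step 0
```

With this shape there is no window in which a failure can be masked by the
final "status=0".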

Thanks,
Zorro

> +exit
> diff --git a/tests/generic/771.out b/tests/generic/771.out
> new file mode 100644
> index 00000000..c2345c7b
> --- /dev/null
> +++ b/tests/generic/771.out
> @@ -0,0 +1,2 @@
> +QA output created by 771
> +Silence is golden
> -- 
> 2.49.0
> 
> 



* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
  2025-06-11 15:38   ` Darrick J. Wong
@ 2025-06-19  7:15   ` Zorro Lang
  2025-06-20 11:11     ` Ojaswin Mujoo
  2025-06-20 14:05   ` John Garry
  2 siblings, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-19  7:15 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:51PM +0530, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi blocks atomic writes
> maintain atomic guarantees across sudden FS shutdowns.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/772     | 360 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/772.out |   2 +
>  2 files changed, 362 insertions(+)
>  create mode 100755 tests/generic/772
>  create mode 100644 tests/generic/772.out
> 
> diff --git a/tests/generic/772 b/tests/generic/772
> new file mode 100755
> index 00000000..6af7e74c
> --- /dev/null
> +++ b/tests/generic/772
> @@ -0,0 +1,360 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 0772
> +#
> +# Test multi block atomic writes with sudden FS shutdowns to ensure
> +# the FS is not tearing the write operation
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +echo "Awu max: $awu_max" >> $seqres.full
> +
> +num_blocks=$((awu_max / blksz))
> +filesize=$(($blksz * 12 * 1024 ))
> +
> +atomic_write_loop() {
> +	local off=0
> +	local size=$awu_max
> +	for ((i=0; i<$((filesize / $size )); i++)); do
> +		# Due to sudden shutdown this can produce errors so just redirect them
> +		# to seqres.full
> +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full

_require_xfs_io_command pwrite -A

> +		echo "Written to offset: $off" >> $tmp.aw
> +		off=$((off + $size))
> +	done
> +}
> +
> +create_mixed_mappings() {
> +	local file=$1
> +	local size_bytes=$2
> +
> +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> +	#Fill the file with alternate written and unwritten blocks
> +	local off=0
> +	local operations=("W" "U")
> +
> +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> +		index=$(($i % ${#operations[@]}))
> +		map="${operations[$index]}"
> +
> +		case "$map" in
> +		    "W")
> +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
> +			;;
> +		    "U")
> +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null

_require_xfs_io_command falloc

> +			;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	sync $file
> +}
> +
> +populate_expected_data() {
> +	# create a dummy file with expected old data for different cases
> +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
> +	expected_data_old_mixed=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mixed)

"xxd" is not a required runtime dependency of xfstests. Please replace it with
the common/rc:_hexdump() function or a similar command from coreutils. Otherwise
you have to _notrun if xxd isn't installed.
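
As one possible coreutils-only replacement, here is a small sketch built on
od. The helper name _hexdump_range is hypothetical (it is not an existing
fstests function). Unlike "xxd -p" it prints a single line without wrapping,
which does not matter as long as the expected and actual buffers are produced
the same way:

```bash
# Hypothetical helper: print $2 bytes of file $3, starting at byte offset $1,
# as one continuous lowercase hex string.
_hexdump_range() {
	local off=$1 len=$2 file=$3

	# -A n: no address column, -t x1: one hex byte per unit,
	# -v: do not collapse repeated lines, -j/-N: skip and length.
	od -A n -t x1 -v -j "$off" -N "$len" "$file" | tr -d ' \n'
}

# Small demonstration on a temporary file.
demo_file=$(mktemp)
printf 'abcdef' > "$demo_file"
_hexdump_range 2 3 "$demo_file"; echo
rm -f "$demo_file"
```

The -v flag matters here: without it od folds runs of identical lines into
a "*", which would corrupt the comparison for files filled with one byte
value.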

> +
> +	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
> +	expected_data_old_zeroes=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_zeroes)
> +
> +	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
> +	expected_data_old_mapped=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mapped)
> +
> +	# create a dummy file with expected new data
> +	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
> +	expected_data_new=$(xxd -s 0 -l $awu_max -p $testfile.exp_new)
> +}
> +
> +verify_data_blocks() {
> +	local verify_start=$1
> +	local verify_end=$2
> +	local expected_data_old="$3"
> +	local expected_data_new="$4"
> +
> +	echo >> $seqres.full
> +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> +
> +	# After an atomic write, for every chunk we ensure that the underlying
> +	# data is either the old data or new data as writes shouldn't get torn.
> +	local off=$verify_start
> +	while [[ "$off" -lt "$verify_end" ]]
> +	do
> +		actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> +		then
> +			echo "Checksum match failed at off: $off size: $awu_max"
> +			echo "Expected contents: (Either of the 2 below):"
> +			echo
> +			echo "Expected old: "
> +			echo "$expected_data_old"
> +			echo
> +			echo "Expected new: "
> +			echo "$expected_data_new"
> +			echo
> +			echo "Actual contents: "
> +			echo "$actual_data"
> +
> +			return 1
> +		fi
> +		echo -n "Check at offset $off succeeded! " >> $seqres.full
> +		if [[ "$actual_data" == "$expected_data_new" ]]
> +		then
> +			echo "matched new" >> $seqres.full
> +		elif [[ "$actual_data" == "$expected_data_old" ]]
> +		then
> +			echo "matched old" >> $seqres.full
> +		fi
> +		off=$(( off + awu_max ))
> +	done
> +
> +	return 0
> +}
> +
> +# test data integrity for file by shutting down in between atomic writes
> +test_data_integrity() {
> +	echo >> $seqres.full
> +	echo "# Writing atomically to file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!

If there are background processes in a test case, please make sure they are
killed in _cleanup(), e.g.

_cleanup()
{
...
	[ -n "$awloop_pid" ] && kill $awloop_pid
	wait
...
}

> +
> +	# Wait for at least the first write to be recorded
> +	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown

_require_scratch_shutdown

> +
> +	kill $awloop_pid
> +	wait $awloop_pid

	unset awloop_pid

to tell _cleanup that it has already been killed.
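
Putting the kill, wait and unset together, the pattern looks roughly like
this. It is a standalone sketch: "sleep 60" stands in for atomic_write_loop,
and _cleanup is called by hand here, whereas in a real test the fstests
harness runs it on exit:

```bash
_cleanup() {
	# Only kill the writer if the test body has not reaped it already.
	[ -n "$awloop_pid" ] && kill "$awloop_pid" 2>/dev/null
	wait
}

sleep 60 &	# stand-in for "atomic_write_loop &"
awloop_pid=$!

# ... the filesystem shutdown would happen here ...

kill "$awloop_pid"
# wait reports the signal exit status of the killed job; ignore it.
wait "$awloop_pid" 2>/dev/null || true
unset awloop_pid

_cleanup	# nothing left to kill at this point
```

If the test is interrupted between starting the writer and the explicit
kill, $awloop_pid is still set and _cleanup takes care of it.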

> +
> +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	cat $tmp.aw >> $seqres.full
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +
> +	# we want to verify all blocks around which the shutdown happened
> +	verify_start=$(( last_offset - (awu_max * 5)))
> +	if [[ "$verify_start" -lt 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	verify_end=$(( last_offset + (awu_max * 5)))
> +	if [[ "$verify_end" -gt "$filesize" ]]
> +	then
> +		verify_end=$filesize
> +	fi
> +}
> +
> +# test data integrity for file with written and unwritten mappings
> +test_data_integrity_mixed() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full

_require_xfs_io_command truncate

> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> +	create_mixed_mappings $testfile $filesize
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi

If the return value is used, you'd better "return 0" explicitly. Otherwise the
return value of this function is unclear.
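
Something like the following, where the stub only stands in for
verify_data_blocks and the function name is made up for the example:

```bash
# Stand-in for verify_data_blocks: return the status passed as $1.
verify_data_blocks() { return "$1"; }

test_data_integrity_example() {
	# Propagate a verification failure to the caller ...
	verify_data_blocks "$1" || return 1

	# ... and say explicitly that everything else is success.
	return 0
}

test_data_integrity_example 0 && echo "verified"
```

The caller in the main loop can then rely on 0 meaning success and 1 meaning
a torn write was detected.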

> +}
> +
> +# test data integrity for file with completely written mappings
> +test_data_integrity_writ() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi

Same as above

> +}
> +
> +# test data integrity for file with completely unwritten mappings
> +test_data_integrity_unwrit() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi

Same

> +}
> +
> +# test data integrity for file with no mappings
> +test_data_integrity_hole() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with no mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +
> +	if [[ "$?" == "1" ]]
> +	then
> +		return 1
> +	fi

Same

> +}
> +
> +test_filesize_integrity() {
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!

Please use another name, then deal with it in _cleanup; see the awloop_pid
comment above.

> +
> +	# Wait for at least the first write to be recorded
> +	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid
> +	wait $awloop_pid
> +
> +	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	cat $tmp.aw >> $seqres.full
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +	local filesize=$(_get_filesize $testfile)
> +	echo >> $seqres.full
> +	echo "# Filesize after shutdown: $filesize" >> $seqres.full
> +
> +	# To confirm that the write went atomically, we check:
> +	# 1. The last block should be a multiple of awu_max
> +	# 2. The last block should be the completely new data
> +
> +	if (( $filesize % $awu_max ))
> +	then
> +		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
> +	fi
> +
> +	verify_start=$(( filesize - (awu_max * 5)))
> +	if [[ "$verify_start" -lt 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	local verify_end=$filesize
> +
> +	# Here the blocks should always match new data hence, for simplicity of
> +	# code, just corrupt the $expected_data_old buffer so it never matches
> +	local expected_data_old="POISON"
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
> +
> +	return $?
> +}
> +
> +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Populating expected data buffers" >> $seqres.full
> +populate_expected_data
> +
> +# Loop 20 times to shake out any races due to shutdown
> +for ((iter=0; iter<20; iter++))
> +do
> +	echo >> $seqres.full
> +	echo "------ Iteration $iter ------" >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> +	test_data_integrity_mixed
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> +	test_data_integrity_writ
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> +	test_data_integrity_unwrit
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> +	test_data_integrity_hole
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> +	test_filesize_integrity
> +	if [[ "$?" == "1" ]]
> +	then
> +		status=1
> +		break

What are these "status=1" for? You "break" out of the loop directly after
setting status=1, and then the status is set to 0 at the end of the test.

You can output something to break the golden image (.out file), or call _exit
or _fail to end the test with an error output.
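
A minimal sketch of the first option, assuming each sub-test reports failure
through its return value. run_subtest and run_all_iterations are made-up
names, and the sub-tests are reduced to stubs so that the sketch runs on its
own:

```bash
# Stand-in for one of the test_data_integrity_* functions: return the
# status passed as $1.
run_subtest() { return "$1"; }

# Run every sub-test three times. On the first failure print a message,
# which makes the output differ from the .out file, and stop.
run_all_iterations() {
	local iter st

	for iter in 0 1 2; do
		for st in "$@"; do
			if ! run_subtest "$st"; then
				echo "subtest failed in iteration $iter"
				return 1
			fi
		done
	done
	return 0
}

run_all_iterations 0 0 0
echo "Silence is golden"
```

Since the failure message goes to stdout, the test fails on the golden
output comparison and no "status=1" bookkeeping is needed.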

Thanks,
Zorro

> +	fi
> +done
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/772.out b/tests/generic/772.out
> new file mode 100644
> index 00000000..98c13968
> --- /dev/null
> +++ b/tests/generic/772.out
> @@ -0,0 +1,2 @@
> +QA output created by 772
> +Silence is golden
> -- 
> 2.49.0
> 
> 



* Re: [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
  2025-06-11  9:34 ` [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-06-19  7:43   ` Zorro Lang
  2025-06-20 15:08     ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: Zorro Lang @ 2025-06-19  7:43 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:52PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> We brute force all possible blocksize & clustersize combinations on
> a bigalloc filesystem for stressing atomic write using fio data crc
> verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
> a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> that we never see the mix of data contents from different threads on
> a given bsrange.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/ext4/061     | 107 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/061.out |   2 +
>  2 files changed, 109 insertions(+)
>  create mode 100755 tests/ext4/061
>  create mode 100644 tests/ext4/061.out
> 
> diff --git a/tests/ext4/061 b/tests/ext4/061
> new file mode 100755
> index 00000000..9d656613
> --- /dev/null
> +++ b/tests/ext4/061
> @@ -0,0 +1,107 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> +# we should never see the mix of data contents from different threads for any
> +# given fio blocksize.
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +
> +function max()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}
> +
> +function min()
> +{
> +	if (( $1 < $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}

I've seen these two functions many times. Please make them common helpers.
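
For instance, a pair like the one below could live in a common file. The
names _max and _min are only a suggestion, and the unit sizes in the
demonstration are made up:

```bash
# Print the larger of two integers.
_max() {
	if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Print the smaller of two integers.
_min() {
	if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Same computation as in generic/767, with assumed atomic write unit sizes.
awu_min_write=4096
awu_max_write=262144
blocksize=$(_max "$awu_min_write" "$((awu_max_write / 2))")
blocksize=$(_min "$blocksize" 65536)
echo "$blocksize"
```

The tests would then drop their private max()/min() copies and call the
shared versions.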

> +
> +FS_MAX_CLUSTER_SIZE=$((128*1024))
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((100*1024*1024))
> +fiobsize=4096
> +
> +# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(max 4096 "$bdev_awu_min")
> +FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
> +
> +function create_fio_config()
> +{
> +cat >$fio_config <<EOF
> +	[aio-dio-aw-verify]
> +	direct=1
> +	ioengine=libaio

_require_aiodio

> +	rw=randwrite
> +	bs=$fiobsize
> +	fallocate=native
> +	filename=$SCRATCH_MNT/test-file
> +	size=$SIZE
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	group_reporting=1
> +	verify_state_save=0
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_dump=0
> +	verify_backlog=1024
> +	verify_async=4
> +	verify_write_sequence=0
> +	atomic=1
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_config=$tmp.fio
> +create_fio_config
> +_require_fio $fio_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> +	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
> +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do

Wow, 3 for loops...

> +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> +			_scratch_mkfs_ext4 "$MKFS_OPTIONS" >> $seqres.full 2>&1 || continue

MKFS_OPTIONS is used in _scratch_mkfs_ext4 by default, so you don't need to pass
it as an argument.

Or do you want to do:

_scratch_mkfs_ext4 "-O bigalloc -b $fsblocksize -C $fsclustersize" ?

> +			if _try_scratch_mount >> $seqres.full 2>&1; then
> +				touch $SCRATCH_MNT/f1
> +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +				fio_config=$tmp.fio

Do you change "$tmp"? If not, you don't need to set fio_config=$tmp.fio
every time; you've already set fio_config above this for loop.

> +				fio_out=$tmp.fio.out

If you don't change "$tmp", you can set fio_out once, before the
for loop.

> +				create_fio_config
> +				_require_fio $fio_config

I think only $fiobsize changes in $fio_config here, so are you trying
to check "bs=$fiobsize"? If so, that doesn't make sense, because _require_fio
accepts any "bs" value except bs <= 0. So you don't need to call _require_fio
here every time, especially as you've already called it before the loop.
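
To illustrate, here is a dry-run sketch of the loop nest with the invariant
setup hoisted out. The sizes are assumed values (the test derives them from
the device and from _get_page_size), list_combinations is a made-up name, and
the mkfs, mount and fio steps are reduced to an echo:

```bash
# Assumed values for illustration only.
fsblocksize_min=4096
page_size=16384
FS_MAX_CLUSTER_SIZE=16384

list_combinations() {
	local fsblocksize fsclustersize fiobsize

	# Loop-invariant setup belongs here, once:
	#   fio_config=$tmp.fio; fio_out=$tmp.fio.out; _require_fio $fio_config

	for ((fsblocksize = fsblocksize_min; fsblocksize <= page_size; fsblocksize <<= 1)); do
		for ((fsclustersize = fsblocksize; fsclustersize <= FS_MAX_CLUSTER_SIZE; fsclustersize <<= 1)); do
			for ((fiobsize = fsblocksize; fiobsize <= fsclustersize; fiobsize <<= 1)); do
				# Only create_fio_config has to be rerun, since only bs changes.
				echo "fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize"
			done
		done
	done
}

list_combinations
```

Inside the innermost loop only the config file needs regenerating; the
option check does not depend on the block size.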

Thanks,
Zorro

> +				cat $fio_config >> $seqres.full
> +				$FIO_PROG $fio_config --output=$fio_out
> +				ret=$?
> +				cat $fio_out >> $seqres.full
> +				_scratch_unmount
> +				[[ $ret -eq 0 ]] || break;
> +			fi
> +		done
> +	done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> new file mode 100644
> index 00000000..273be9e0
> --- /dev/null
> +++ b/tests/ext4/061.out
> @@ -0,0 +1,2 @@
> +QA output created by 061
> +Silence is golden
> -- 
> 2.49.0
> 
> 



* Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-11  9:34 ` [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
  2025-06-12 10:26   ` John Garry
@ 2025-06-19  7:45   ` Zorro Lang
  1 sibling, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-19  7:45 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:53PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> Brute force all possible blocksize clustersize combination on a bigalloc
> filesystem for stressing atomic write using fio data crc verifier. We run
> multiple threads in parallel with each job writing to its own file. The
> parallel jobs running on a constrained filesystem size ensure that we stress
> the ext4 allocator to allocate contiguous extents.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---

Similar review points as patch 9/12.

>  tests/ext4/062     | 131 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/062.out |   2 +
>  2 files changed, 133 insertions(+)
>  create mode 100755 tests/ext4/062
>  create mode 100644 tests/ext4/062.out
> 
> diff --git a/tests/ext4/062 b/tests/ext4/062
> new file mode 100755
> index 00000000..50803b97
> --- /dev/null
> +++ b/tests/ext4/062
> @@ -0,0 +1,131 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * $LOAD_FACTOR threads in parallel writing to a single
> +# $SCRATCH_MNT/test-file. We also create 8 such parallel jobs to run on
> +# a constrained filesystem size to stress the ext4 allocator to allocate
> +# contiguous extents.
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +
> +function max()
> +{
> +	if (( $1 > $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}
> +
> +function min()
> +{
> +	if (( $1 < $2 )); then
> +		echo "$1"
> +	else
> +		echo "$2"
> +	fi
> +}
> +
> +FSSIZE=$((360*1024*1024))
> +FS_MAX_CLUSTER_SIZE=$((128*1024))
> +FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
> +fiobsize=4096
> +
> +# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(max 4096 "$bdev_awu_min")
> +FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
> +
> +function create_fio_config()
> +{
> +cat >$fio_config <<EOF
> +	[global]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite
> +	bs=$fiobsize
> +	fallocate=truncate
> +	size=$((FSSIZE / 12))
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	group_reporting=1
> +	verify_state_save=0
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_dump=0
> +	verify_backlog=1024
> +	verify_async=4
> +	verify_write_sequence=0
> +	atomic=1
> +
> +	[job1]
> +	filename=$SCRATCH_MNT/testfile-job1
> +
> +	[job2]
> +	filename=$SCRATCH_MNT/testfile-job2
> +
> +	[job3]
> +	filename=$SCRATCH_MNT/testfile-job3
> +
> +	[job4]
> +	filename=$SCRATCH_MNT/testfile-job4
> +
> +	[job5]
> +	filename=$SCRATCH_MNT/testfile-job5
> +
> +	[job6]
> +	filename=$SCRATCH_MNT/testfile-job6
> +
> +	[job7]
> +	filename=$SCRATCH_MNT/testfile-job7
> +
> +	[job8]
> +	filename=$SCRATCH_MNT/testfile-job8
> +
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_config=$tmp.fio
> +create_fio_config
> +_require_fio $fio_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> +	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
> +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> +			_scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
> +			if _try_scratch_mount >> $seqres.full 2>&1; then
> +				touch $SCRATCH_MNT/f1
> +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +				fio_config=$tmp.fio
> +				fio_out=$tmp.fio.out
> +				create_fio_config
> +				_require_fio $fio_config
> +				cat $fio_config >> $seqres.full
> +				$FIO_PROG $fio_config --output=$fio_out
> +				ret=$?
> +				cat $fio_out >> $seqres.full
> +				_scratch_unmount
> +				[[ $ret -eq 0 ]] || break;
> +			fi
> +		done
> +	done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/062.out b/tests/ext4/062.out
> new file mode 100644
> index 00000000..a1578f48
> --- /dev/null
> +++ b/tests/ext4/062.out
> @@ -0,0 +1,2 @@
> +QA output created by 062
> +Silence is golden
> -- 
> 2.49.0
> 
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes
  2025-06-11  9:34 ` [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-06-19  7:52   ` Zorro Lang
  0 siblings, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-19  7:52 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:54PM +0530, Ojaswin Mujoo wrote:
> In ext4, even if an allocated range is physically and logically
> contiguous, it can still be split into 2 extents. This is because ext4
> does not merge extents across leaf nodes. This is an issue for atomic
> writes since even for a continuous extent the map block could (in rare
> cases) return a shorter map, hence tearning the write. This test creates
> such a file and ensures that the atomic write handles this case
> correctly
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/ext4/063     | 125 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/063.out |   2 +
>  2 files changed, 127 insertions(+)
>  create mode 100755 tests/ext4/063
>  create mode 100644 tests/ext4/063.out
> 
> diff --git a/tests/ext4/063 b/tests/ext4/063
> new file mode 100755
> index 00000000..b4759990
> --- /dev/null
> +++ b/tests/ext4/063
> @@ -0,0 +1,125 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# In ext4, even if an allocated range is physically and logically contiguous,
> +# it can still be split into 2 extents. This is because ext4 does not merge
> +# extents across leaf nodes. This is an issue for atomic writes since even for
> +# a continuous extent the map block could (in rare cases) return a shorter map,
> +# hence tearning the write. This test creates such a file and ensures that the
> +# atomic write handles this case correctly
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +
> +prep() {
> +	local bs=`_get_block_size $SCRATCH_MNT`
> +	local ex_hdr_bytes=12
> +	local ex_entry_bytes=12
> +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> +
> +	# fill the extent tree leaf which bs len extents at alternate offsets. For example,
> +	# for 4k bs the tree should look as follows
> +	#
> +	#                  +---------+---------+
> +	#                  | index 1 | index 2 |
> +	#                  +-----+---+-----+---+
> +	#               +--------+         +-------+
> +	#               |                          |
> +	#    +----------+--------------+     +-----+-----+
> +	#    | ex 1 | ex 2 |... | ex n |     |  ex n + 1 |
> +	#    +-------------------------+     +-----------+
> +	#    0      2            680          682
> +	for i in $(seq 0 $entries_per_blk)
> +	do
> +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> +	done
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full

_require_command "$DEBUGFS_PROG" debugfs

> +
> +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
> +	# this is a new FS the allocator would find continuous blocks such that
> +	# ex(n) ex(new) ex(n+1) are physically(and logically) contiguous. However,
> +	# since we dont merge extents across leaf we will end up with a tree as:
> +	#
> +	#                  +---------+---------+
> +	#                  | index 1 | index 2 |
> +	#                  +-----+---+-----+---+
> +	#               +--------+         +-------+
> +	#               |                          |
> +	#    +----------+--------------+     +-----+-----+
> +	#    | ex 1 | ex 2 |... | ex n |     | ex merged |
> +	#    +-------------------------+     +-----------+
> +	#    0      2            680          681  682  684
> +	#
> +	echo >> $seqres.full
> +	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> +	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +	_scratch_cycle_mount
> +}
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +
> +echo >> $seqres.full
> +echo "# Prepping the file" >> $seqres.full
> +prep
> +
> +torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
> +
> +echo >> $seqres.full
> +echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
> +echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full

_require_xfs_io_command pwrite -A

> +$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "Extent state after atomic write:">> $seqres.full
> +echo "...">> $seqres.full
> +$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Checking data integrity" >> $seqres.full
> +
> +# create a dummy file with expected data
> +$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
> +expected_data=$(xxd -s 0 -l $awu_max -p $testfile.exp)

xxd isn't a required runtime dependency of xfstests, so please use common/rc:_hexdump()
or another command from coreutils.
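
As one possible coreutils-only alternative, od can produce the same kind
of plain hex string as "xxd -s OFF -l LEN -p". This is just a sketch; the
hex_range name and the sample file are made up for illustration.

```shell
# Print $3 bytes of file $1, starting at byte offset $2, as one hex string.
# -v stops od from collapsing repeated lines into "*".
hex_range() {
	od -An -v -tx1 -j "$2" -N "$3" "$1" | tr -d ' \n'
}

sample=${TMPDIR:-/tmp}/hex_range_sample.$$
printf 'xxxxaaaaaaaa' > "$sample"

# Skip the 4 'x' bytes and dump the 8 'a' (0x61) bytes that follow.
actual_data=$(hex_range "$sample" 4 8)
expected_data=6161616161616161
rm -f "$sample"

[ "$actual_data" = "$expected_data" ] && echo "data matches"
```

Unlike xxd -p, the output is a single line with no wrapping, which is
fine for a string comparison like the one in this test.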

> +
> +# We ensure that the data after atomic writes should match the expected data
> +actual_data=$(xxd -s $torn_aw_offset -l $awu_max -p $testfile)
> +if [[ "$actual_data" != "$expected_data" ]]
> +then
> +	echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
> +	echo
> +	echo "Expected: "
> +	echo "$expected_data"
> +	echo
> +	echo "Actual contents: "
> +	echo "$actual_data"
> +


> +	status=1
> +	exit

_exit 1

> +fi
> +
> +echo -n "Data verification at offset $torn_aw_offset suceeded!" >> $seqres.full
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/ext4/063.out b/tests/ext4/063.out
> new file mode 100644
> index 00000000..de35fc52
> --- /dev/null
> +++ b/tests/ext4/063.out
> @@ -0,0 +1,2 @@
> +QA output created by 063
> +Silence is golden
> -- 
> 2.49.0
> 
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation
  2025-06-11  9:34 ` [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
@ 2025-06-19  7:58   ` Zorro Lang
  0 siblings, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-19  7:58 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Wed, Jun 11, 2025 at 03:04:55PM +0530, Ojaswin Mujoo wrote:
> Test atomic writes with journal credit calculation. We take 2 cases
> here:
> 
> 1. Atomic writes on single mapping causing tree to collapse into
>    the inode
> 2. Atomic writes on mixed mapping causing tree to collapse into the
>    inode
> 
> This test is inspired by ext4/034.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/ext4/064     | 75 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/064.out |  2 ++
>  2 files changed, 77 insertions(+)
>  create mode 100755 tests/ext4/064
>  create mode 100644 tests/ext4/064.out
> 
> diff --git a/tests/ext4/064 b/tests/ext4/064
> new file mode 100755
> index 00000000..12e48ae3
> --- /dev/null
> +++ b/tests/ext4/064
> @@ -0,0 +1,75 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 034
> +#
> +# Test proper credit reservation is done when performing
> +# tree collapse during an aotmic write based allocation
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick quota fiemap prealloc atomicwrites
> +
> +# Import common functions.
> +
> +
> +# Modify as appropriate.
> +_exclude_fs ext2
> +_exclude_fs ext3
> +_require_xfs_io_command "falloc"
> +_require_xfs_io_command "fiemap"

Great, this case requires the falloc and fiemap commands, and adds the fiemap
and prealloc test groups. The other test patches should follow this.

> +_require_xfs_io_command "syncfs"
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +
> +echo "----- Testing with atomi write on non-mixed mapping -----" >> $seqres.full
> +
> +echo "Format and mount" >> $seqres.full
> +_scratch_mkfs  > $seqres.full 2>&1
> +_scratch_mount > $seqres.full 2>&1
> +
> +echo "Create the original file" >> $seqres.full
> +touch $SCRATCH_MNT/foobar >> $seqres.full
> +
> +echo "Create 2 level extent tree (btree) for foobar with a unwritten extent" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
> +	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
> +	     -c "fsync" $SCRATCH_MNT/foobar >> $seqres.full
> +
> +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar >> $seqres.full
> +
> +echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
> +$XFS_IO_PROG -dc "pwrite -A -V1 4k 4k" $SCRATCH_MNT/foobar >> $seqres.full

_require_xfs_io_command pwrite -A

> +
> +echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy >> $seqres.full
> +
> +echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
> +$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
> +
> +echo "----- Testing with atomi write on mixed mapping -----" >> $seqres.full
> +
> +echo "Create the original file" >> $seqres.full
> +touch $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Create 2 level extent tree (btree) for foobar2 with a unwritten extent" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
> +	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
> +	     -c "fsync" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
> +$XFS_IO_PROG -dc "pwrite -A -V1 0k 12k" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy2 >> $seqres.full
> +
> +echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
> +$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/ext4/064.out b/tests/ext4/064.out
> new file mode 100644
> index 00000000..d9076546
> --- /dev/null
> +++ b/tests/ext4/064.out
> @@ -0,0 +1,2 @@
> +QA output created by 064
> +Silence is golden
> -- 
> 2.49.0
> 
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-18 19:13   ` Zorro Lang
@ 2025-06-20  6:21     ` Ojaswin Mujoo
  2025-06-20  9:59       ` Zorro Lang
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20  6:21 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 03:13:24AM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > Insert range and collapse range only works with bigalloc in case
> > the range is cluster size aligned, which fsx doesnt take care. To
> > work past this, disable insert range and collapse range on ext4, if
> > bigalloc is enabled.
> > 
> > This is achieved by defining a new function _setup_fs_options
> > which can serve as a mechanism to apply FS-wide options to
> > the tests.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  common/preamble | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/common/preamble b/common/preamble
> > index ba029a34..2bccff74 100644
> > --- a/common/preamble
> > +++ b/common/preamble
> > @@ -24,6 +24,20 @@ _register_cleanup()
> >  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
> >  }
> >  
> > +# setup FS options only to be available for each test run
> > +_setup_fs_options() {
> 
> If this's a function for fsx only, better to name it with "fsx", e.g.
> _setup_default_fsx_avoid (or some other names).
> 
> > +	case "$FSTYP" in
> > +	"ext4")
> > +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > +			export FSX_AVOID="-I -C"
> 
> Hmm... I'm also wondering if this's an issue should be fixed in fstests. How about
> let the testers who tests ext4 with MKFS_OPTIONS="-O bigalloc" write local.config
> as below?
> 
> [ext4-bigalloc]
> ...
> MKFS_OPTIONS="-O bigalloc"
> FSX_AVOID="-I -C"
> 
> Thanks,
> Zorro

Hey Zorro, 

Basically the idea is that _setup_fs_options is a generic function that
can be used to do any fs specific modifications to the global options.

This way we can set options to avoid known issues with different FSes,
which can otherwise confuse the user if they are not aware of such
issues. 

Does that sound okay?

Regards,
ojaswin

> 
> 
> > +		fi
> > +		;;
> > +	# Add other filesystem types here as needed
> > +	*)
> > +		;;
> > +	esac
> > +}
> > +
> >  # Prepare to run a fstest by initializing the required global variables to
> >  # their defaults, sourcing common functions, registering a cleanup function,
> >  # and removing the $seqres.full file.
> > @@ -55,4 +69,6 @@ _begin_fstest()
> >  	# remove previous $seqres.full before test
> >  	rm -f $seqres.full $seqres.hints
> >  
> > +	# setup filesystem options for a given test execution
> > +	_setup_fs_options
> >  }
> > -- 
> > 2.49.0
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier
  2025-06-18 19:34   ` Zorro Lang
@ 2025-06-20  7:06     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20  7:06 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 03:34:34AM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:47PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > This adds atomic write test using fio based on it's crc check verifier.
> > fio adds a crc for each data block. If the underlying device supports atomic
> > write then it is guaranteed that we will never have a mix data from two
> > threads writing on the same physical block.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/767     | 84 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/767.out |  2 ++
> 
> I'd like to recommend using a bigger case number for this RFC test case (and others
> in this patchset), to help you to rebase on later fstests easily :)

Sure Zorro, I'll make this change in v2. I think g/1226 should be okay
as the other atomic writes patchset [1] ends at 1225.

Thanks,
Ojaswin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-18 20:17   ` Zorro Lang
@ 2025-06-20  8:20     ` Ojaswin Mujoo
  2025-06-20 12:12       ` Zorro Lang
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20  8:20 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 04:17:11AM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:49PM +0530, Ojaswin Mujoo wrote:
> > This adds various atomic write multi-fsblock stresst tests
> > with mixed mappings and O_SYNC, to ensure the data and metadata
> > is atomically persisted even if there is a shutdown.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/770.out |   2 +
> >  2 files changed, 163 insertions(+)
> >  create mode 100755 tests/generic/770
> >  create mode 100644 tests/generic/770.out
> > 
> > diff --git a/tests/generic/770 b/tests/generic/770
> > new file mode 100755
> > index 00000000..2b98b3b3
> > --- /dev/null
> > +++ b/tests/generic/770
> > @@ -0,0 +1,161 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 770
> > +#
> > +# Atomic write multi-fsblock data integrity tests with mixed mappings
> > +# and O_SYNC
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto quick rw atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +

<snip>

> > +			"U")
> > +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
> 
> _require_xfs_io_command falloc
> 
> > +				;;
> > +		esac
> > +		off=$((off + blksz))
> > +	done
> > +
> > +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> > +
> > +	sync $testfile
> > +
> > +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> > +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> 
> _require_xfs_io_command pwrite -A

Hey Zorro,

The pwrite -A and falloc commands are already checked by the
_require_atomic_write_test_commands helper called at the top.

> 
> 
> > +				  grep wrote | awk -F'[/ ]' '{print $2}')
> > +
> > +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> > +	check_data_integrity
> > +	echo "Iteration $iteration completed: OK" >> $seqres.full
> > +	echo >> $seqres.full
> > +done
> > +echo "# Test 1: Do O_SYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
> > +
> > +echo >> $seqres.full
> > +echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
> > +bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
> > +                grep wrote | awk -F'[/ ]' '{print $2}')
> > +test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> > +_scratch_shutdown -v >> $seqres.full
> 
> _require_scratch_shutdown

Thanks, I'll add it.

> 
> > +_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-2"
> > +check_data_integrity
> > +echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
> > +
> > +echo >> $seqres.full
> > +echo "# Test 3: Do O_DSYNC atomic write on random mixed mapping with sudden fs shutdown (10 iterations):" >> $seqres.full
> > +num_blocks=$((awu_max / blksz))
> > +echo "Testing $num_blocks blocks of $blksz size within $awu_max region" >> $seqres.full
> > +
> > +operations=("W" "H" "U")
> > +
> > +for ((iteration=1; iteration<=10; iteration++)); do
> > +	echo "=== Mixed Mapping Shutdown Test Iteration $iteration ===" >> $seqres.full
> > +
> > +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> > +
> > +	off=0
> > +	mapping=""
> > +
> > +	for ((i=0; i<num_blocks; i++)); do
> > +		index=$((RANDOM % ${#operations[@]}))
> > +		map="${operations[$index]}"
> > +		mapping="${mapping}${map}"
> > +
> > +		case "$map" in
> > +			"W")
> > +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> > +				;;
> > +			"H")
> > +				# No operation needed for hole
> > +				;;
> > +			"U")
> > +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile > /dev/null
> > +				;;
> > +		esac
> > +		off=$((off + blksz))
> > +	done
> > +
> > +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> > +
> > +	sync $testfile
> > +
> > +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> > +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> > +				  grep wrote | awk -F'[/ ]' '{print $2}')
> > +
> > +	test $bytes_written -eq $awu_max || echo "atomic write len=$awu_max failed"
> > +
> > +	echo "Shutting down filesystem" >> $seqres.full
> > +	_scratch_shutdown -v >> $seqres.full
> > +	_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> > +	check_data_integrity
> > +	echo "Iteration $iteration completed: OK" >> $seqres.full
> > +	echo >> $seqres.full
> > +done
> 
> Looks like there're two iterations (loop running codes), the code looks much similar, can we
> move them to one function then call it twice?

Sure, I'll refactor it in v2.
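
A rough sketch of what such a shared function could look like, assuming
the only difference between the two loops is the shutdown step. The
function name and its arguments are hypothetical, and the fstests helpers
are replaced by counting stubs so the snippet runs standalone.

```shell
# Stubs standing in for the real helpers (assumptions, not fstests code).
shutdowns=0
checks=0
_scratch_shutdown() { shutdowns=$((shutdowns + 1)); }
check_data_integrity() { checks=$((checks + 1)); }

# Hypothetical shared body for Test 1 and Test 3.
# $1: number of iterations
# $2: 1 to shut the fs down after each atomic write, 0 otherwise
run_mixed_mapping_test() {
	iterations=$1
	do_shutdown=$2
	i=1
	while [ "$i" -le "$iterations" ]; do
		# The random W/H/U mapping setup and the atomic pwrite go here.
		if [ "$do_shutdown" -eq 1 ]; then
			# The real test would also cycle the mount after this.
			_scratch_shutdown
		fi
		check_data_integrity
		i=$((i + 1))
	done
}

run_mixed_mapping_test 10 0	# Test 1: no shutdown
run_mixed_mapping_test 10 1	# Test 3: shutdown after each write
echo "shutdowns=$shutdowns checks=$checks"
```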

Thanks,
ojaswin
> 
> Thanks,
> Zorro
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 07/12] generic/771: Stress fsx with atomic writes enabled
  2025-06-18 20:27   ` Zorro Lang
@ 2025-06-20  8:26     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20  8:26 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 04:27:41AM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:50PM +0530, Ojaswin Mujoo wrote:
> > Stress file with atomic writes to ensure we excercise codepaths
> > where we are mixing different FS operations with atomic writes
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/generic/771     | 49 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/771.out |  2 ++
> >  2 files changed, 51 insertions(+)
> >  create mode 100755 tests/generic/771
> >  create mode 100644 tests/generic/771.out
> > 
> > diff --git a/tests/generic/771 b/tests/generic/771
> > new file mode 100755
> > index 00000000..690dfa0a
> > --- /dev/null
> > +++ b/tests/generic/771
> > @@ -0,0 +1,49 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 771
> > +#
> > +# fuzz fsx with atomic writes
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest rw auto quick atomicwrites
> > +
> > +# Import common functions.
> > +. ./common/filter
> 
> I think this's useless for this case.
> 
> > +
> > +_require_test
> 
> Do you use TEST_DEV or TEST_MNT ?

No, we don't; thanks for pointing this out. I'll change that and remove
common/filter as well.
> 
> > +_require_odirect
> > +_require_scratch_write_atomic
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount  >> $seqres.full 2>&1
> > +
> > +testfile=$SCRATCH_MNT/testfile
> > +touch $testfile
> > +
> > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > +blksz=$(_get_block_size $SCRATCH_MNT)
> > +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> 
> Do you need _require_block_device? Or is nfs, cifs or overlay... good for this test?

I think it should be okay to not use _require_block_device, since all we care about is
that the underlying device supports atomic writes, which we already ensure via
_require_scratch_write_atomic.

> 
> > +
> > +# fsx usage:
> > +#
> > +# -N numops: total # operations to do
> > +# -l flen: the upper bound on file size
> > +# -o oplen: the upper bound on operation size (64k default)
> > +# -w writebdy: $psize would make writes page aligned (on i386)
> > +# -Z: O_DIRECT (use -R, -W, -r and -w too)
> > +# -W: mapped write operations DISabled
> > +
> > +_run_fsx_on_file $testfile -N 10000 -a -o $awu_max  -l 500000 -r $bsize -w $bsize -Z -W $FSX_AVOID >> $seqres.full
> > +status=$?
> 
> Generally we _exit directly after changing $status.
> 
> > +
> > +if [[ "$status" != "0" ]]
> > +then
> > +	echo "Somthing went wrong, check $seqres.full"
> > +fi
> 
> but you don't _exit ...
> 
> > +
> > +echo "Silence is golden"
> > +status=0
> 
> Then the status will be set to 0 again. So above "status=$?" is useless.
> You can check "$?" directly, or exit directly if _run_fsx fails.
> 
> Thanks,
> Zorro

Yes, this seems like a mistake. I will directly _fail in case _run_fsx
returns an error.
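
Something along these lines, as a standalone sketch. Here _fail and the
fsx call are replaced by stubs (fake_fsx just returns the status it is
given), so this only shows the control flow, not the real helpers.

```shell
_fail() { echo "FAIL: $*"; exit 1; }	# stub for the fstests _fail helper
fake_fsx() { return "$1"; }		# stub: returns the given status

# Runs in a subshell so that the exit in _fail ends only this function.
run_test() (
	fake_fsx "$1" || _fail "fsx returned an error"
	echo "Silence is golden"
)

ok_out=$(run_test 0) && ok_rc=0 || ok_rc=$?
bad_out=$(run_test 1) && bad_rc=0 || bad_rc=$?
echo "ok: rc=$ok_rc out='$ok_out'"
echo "bad: rc=$bad_rc out='$bad_out'"
```

With the failure handled at the call site there is no intermediate
status variable to be overwritten later.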

Thanks for the review!
Ojaswin
> 
> > +exit
> > diff --git a/tests/generic/771.out b/tests/generic/771.out
> > new file mode 100644
> > index 00000000..c2345c7b
> > --- /dev/null
> > +++ b/tests/generic/771.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 771
> > +Silence is golden
> > -- 
> > 2.49.0
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-20  6:21     ` Ojaswin Mujoo
@ 2025-06-20  9:59       ` Zorro Lang
  0 siblings, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-20  9:59 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry, linux-ext4

On Fri, Jun 20, 2025 at 11:51:07AM +0530, Ojaswin Mujoo wrote:
> On Thu, Jun 19, 2025 at 03:13:24AM +0800, Zorro Lang wrote:
> > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > 
> > > Insert range and collapse range only works with bigalloc in case
> > > the range is cluster size aligned, which fsx doesnt take care. To
> > > work past this, disable insert range and collapse range on ext4, if
> > > bigalloc is enabled.
> > > 
> > > This is achieved by defining a new function _setup_fs_options
> > > which can serve as a mechanism to apply FS-wide options to
> > > the tests.
> > > 
> > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  common/preamble | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > > 
> > > diff --git a/common/preamble b/common/preamble
> > > index ba029a34..2bccff74 100644
> > > --- a/common/preamble
> > > +++ b/common/preamble
> > > @@ -24,6 +24,20 @@ _register_cleanup()
> > >  	trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
> > >  }
> > >  
> > > +# setup FS options only to be available for each test run
> > > +_setup_fs_options() {
> > 
> > If this's a function for fsx only, better to name it with "fsx", e.g.
> > _setup_default_fsx_avoid (or some other names).
> > 
> > > +	case "$FSTYP" in
> > > +	"ext4")
> > > +		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > > +			export FSX_AVOID="-I -C"
> > 
> > Hmm... I'm also wondering if this's an issue should be fixed in fstests. How about
> > let the testers who tests ext4 with MKFS_OPTIONS="-O bigalloc" write local.config
> > as below?
> > 
> > [ext4-bigalloc]
> > ...
> > MKFS_OPTIONS="-O bigalloc"
> > FSX_AVOID="-I -C"
> > 
> > Thanks,
> > Zorro
> 
> Hey Zorro, 
> 
> Basically the idea is that _setup_fs_options is a generic function that
> can be used to do any fs specific modifications to the global options.
> 
> This way we can set options to avoid known issues with different FSes,
> which can otherwise confuse the user if they are not aware of such
> issues. 
> 
> Does that sound okay?

I think we should leave this choice to the user or to the test case that
wants to test the ext4 bigalloc feature. Also, MKFS_OPTIONS isn't the only
way to enable bigalloc; some test cases can do _scratch_mkfs "-O bigalloc".

But if the *ext4 list* really wants to force FSX_AVOID="-I -C" when
"$MKFS_OPTIONS" =~ bigalloc, then it's better to do this in the _run_fsx*
functions first. I don't think it's worth having a global _setup_fs_options
and calling it at the beginning of every test case for now.
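
For illustration, the decision itself could be factored out roughly as
below. This is only a sketch: the name fsx_avoid_for is hypothetical (it is
not an existing fstests helper), and it takes FSTYP and MKFS_OPTIONS as
arguments so it can be tried outside fstests. It also skips "^bigalloc",
which a plain substring match would wrongly treat as enabled.

```bash
# Hypothetical helper: print the fsx flags to avoid for a given setup.
# $1 = filesystem type, $2 = mkfs options string
fsx_avoid_for() {
	local fstyp=$1 mkfs_opts=$2

	[[ "$fstyp" == "ext4" ]] || return 0

	# "^bigalloc" disables the feature, so do not count it as enabled
	if [[ "$mkfs_opts" == *bigalloc* && "$mkfs_opts" != *"^bigalloc"* ]]; then
		echo "-I -C"
	fi
}
```

A _run_fsx style wrapper could then append the output of
fsx_avoid_for "$FSTYP" "$MKFS_OPTIONS" to its fsx arguments. As noted
above, this still misses test cases that enable bigalloc through
_scratch_mkfs directly.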

Thanks,
Zorro

> 
> Regards,
> ojaswin
> 
> > 
> > 
> > > +		fi
> > > +		;;
> > > +	# Add other filesystem types here as needed
> > > +	*)
> > > +		;;
> > > +	esac
> > > +}
> > > +
> > >  # Prepare to run a fstest by initializing the required global variables to
> > >  # their defaults, sourcing common functions, registering a cleanup function,
> > >  # and removing the $seqres.full file.
> > > @@ -55,4 +69,6 @@ _begin_fstest()
> > >  	# remove previous $seqres.full before test
> > >  	rm -f $seqres.full $seqres.hints
> > >  
> > > +	# setup filesystem options for a given test execution
> > > +	_setup_fs_options
> > >  }
> > > -- 
> > > 2.49.0
> > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-19  7:15   ` Zorro Lang
@ 2025-06-20 11:11     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20 11:11 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 03:15:06PM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:51PM +0530, Ojaswin Mujoo wrote:
> > +
> > +atomic_write_loop() {
> > +	local off=0
> > +	local size=$awu_max
> > +	for ((i=0; i<$((filesize / $size )); i++)); do
> > +		# Due to sudden shutdown this can produce errors so just redirect them
> > +		# to seqres.full
> > +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> 
> _require_xfs_io_command pwrite -A
> 
> > +		echo "Written to offset: $off" >> $tmp.aw
> > +		off=$((off + $size))
> > +	done
> > +}
> > +
> > +create_mixed_mappings() {
> > +	local file=$1
> > +	local size_bytes=$2
> > +
> > +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> > +	#Fill the file with alternate written and unwritten blocks
> > +	local off=0
> > +	local operations=("W" "U")
> > +
> > +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> > +		index=$(($i % ${#operations[@]}))
> > +		map="${operations[$index]}"
> > +
> > +		case "$map" in
> > +		    "W")
> > +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
> > +			;;
> > +		    "U")
> > +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
> 
> _require_xfs_io_command falloc
> 
> > +			;;
> > +		esac
> > +		off=$((off + blksz))
> > +	done
> > +
> > +	sync $file
> > +}
> > +
> > +populate_expected_data() {
> > +	# create a dummy file with expected old data for different cases
> > +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
> > +	expected_data_old_mixed=$(xxd -s 0 -l $awu_max -p $testfile.exp_old_mixed)
> 
> "xxd" it's not a necessary running dependence of xfstests, please replace it with
> common/rc:_hexdump() function or other similar commands in coreutils. Or you have
> to _notrun if there's not xxd installed.

Hey Zorro, got it. I'll switch to using od, which is what _hexdump also
uses. I won't be using the _hexdump helper though, since its defaults print
the byte offsets as well, which is not something I want.
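
A minimal sketch of what the od replacement might look like. The function
name hex_bytes is made up for this example; the od options used (-An, -v,
-tx1, -j, -N) are standard. Unlike "xxd -p", which wraps its output every
30 bytes, this produces a single unbroken hex string.

```bash
# Print $3 bytes of file $1, starting at byte offset $2, as one hex string
hex_bytes() {
	local file=$1 off=$2 len=$3

	# -An: no offset column, -v: do not collapse repeated lines,
	# -tx1: one hex pair per byte, -j/-N: skip and length in bytes
	od -An -v -tx1 -j "$off" -N "$len" "$file" | tr -d ' \n'
}
```

For example, hex_bytes "$testfile.exp_old_mixed" 0 "$awu_max" would stand
in for the xxd call quoted above.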

<snip>

> > +
> > +# test data integrity for file by shutting down in between atomic writes
> > +test_data_integrity() {
> > +	echo >> $seqres.full
> > +	echo "# Writing atomically to file in background" >> $seqres.full
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> 
> If there's background processes in test case, please make sure it's killed in
> _cleanup(), e.g.
> 
> _cleanup()
> {
> ...
> 	[ -n "$awloop_pid" ] && kill $awloop_pid
> 	wait
> ...

Thanks for pointing this out, I'll take care of this.
> }
> 
> > +
> > +	# Wait for atleast first write to be recorded
> > +	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
> > +
> > +	echo >> $seqres.full
> > +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> > +	_scratch_shutdown
> 
> _require_scratch_shutdown
> 
> > +
> > +	kill $awloop_pid
> > +	wait $awloop_pid
> 
> 	unset awloop_pid
> 
> tell _cleanup "it's killed".
> 
> > +
> > +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > +	cat $tmp.aw >> $seqres.full
> > +	echo >> $seqres.full
> > +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> > +
> > +	rm $tmp.aw
> > +	sleep 0.5
> > +
> > +	_scratch_cycle_mount
> > +
> > +	# we want to verify all blocks around which the shutdown happended
> > +	verify_start=$(( last_offset - (awu_max * 5)))
> > +	if [[ $verify_start < 0 ]]
> > +	then
> > +		verify_start=0
> > +	fi
> > +
> > +	verify_end=$(( last_offset + (awu_max * 5)))
> > +	if [[ "$verify_end" -gt "$filesize" ]]
> > +	then
> > +		verify_end=$filesize
> > +	fi
> > +}
> > +
> > +# test data integrity for file wiht written and unwritten mappings
> > +test_data_integrity_mixed() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> 
> _require_xfs_io_command truncate

Will do.

> 
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> > +	create_mixed_mappings $testfile $filesize
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> > +
> > +	if [[ "$?" == "1" ]]
> > +	then
> > +		return 1
> > +	fi
> 
> If the return value is useful, you'd better to "return 0" clearly. Or the
> return value of this function will be unclear.
> 
> > +}
> > +
> > +# test data integrity for file with completely written mappings
> > +test_data_integrity_writ() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> > +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> > +
> > +	if [[ "$?" == "1" ]]
> > +	then
> > +		return 1
> > +	fi
> 
> Same as above
> 
> > +}
> > +
> > +# test data integrity for file with completely unwritten mappings
> > +test_data_integrity_unwrit() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> > +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> > +
> > +	if [[ "$?" == "1" ]]
> > +	then
> > +		return 1
> > +	fi
> 
> Same
> 
> > +}
> > +
> > +# test data integrity for file with no mappings
> > +test_data_integrity_hole() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with no mappings" >> $seqres.full
> > +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> > +
> > +	if [[ "$?" == "1" ]]
> > +	then
> > +		return 1
> > +	fi
> 
> Same
> 
> > +}
> > +
> > +test_filesize_integrity() {
> > +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> 
> Please use another name, then deal with it in _cleanup, refer to
> above awloop_pid.
> 



<snip>

> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> > +	test_filesize_integrity
> > +	if [[ "$?" == "1" ]]
> > +	then
> > +		status=1
> > +		break
> 
> What are these "status=1" for? You "break" the loop run directly after setting
> status=1, then the status will be set to 0 at the end of the test.
> 
> You can output something to break the golden image (.out file). Or call _exit
> or _fail to end the test with an error output.
> 
> Thanks,
> Zorro

Ohh right, I don't even know why I didn't just _fail if the check in
verify_data_blocks fails. That will save all this stupid return value
handling logic I've added. Thanks so much for pointing this out, I'll
fix it in next iteration!
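
Roughly what I have in mind, as a sketch only. The fail function here is
a stand-in for the fstests _fail helper so the snippet runs on its own, and
check_not_torn is a hypothetical name, not the final verify_data_blocks
code.

```bash
# Stand-in for fstests _fail: print the message and exit non-zero
fail() {
	echo "$*" >&2
	exit 1
}

# Hypothetical check: $1 is the data read back, $2 the expected old data,
# $3 the expected new data. Anything else means the write was torn.
check_not_torn() {
	local actual=$1 old=$2 new=$3

	[[ "$actual" == "$old" || "$actual" == "$new" ]] || \
		fail "torn write: data is neither fully old nor fully new"
}
```

With this the callers no longer need to propagate a return value or set
status by hand.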

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-20  8:20     ` Ojaswin Mujoo
@ 2025-06-20 12:12       ` Zorro Lang
  0 siblings, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-20 12:12 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Fri, Jun 20, 2025 at 01:50:18PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jun 19, 2025 at 04:17:11AM +0800, Zorro Lang wrote:
> > On Wed, Jun 11, 2025 at 03:04:49PM +0530, Ojaswin Mujoo wrote:
> > > This adds various atomic write multi-fsblock stresst tests
> > > with mixed mappings and O_SYNC, to ensure the data and metadata
> > > is atomically persisted even if there is a shutdown.
> > > 
> > > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  tests/generic/770     | 161 ++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/770.out |   2 +
> > >  2 files changed, 163 insertions(+)
> > >  create mode 100755 tests/generic/770
> > >  create mode 100644 tests/generic/770.out
> > > 
> > > diff --git a/tests/generic/770 b/tests/generic/770
> > > new file mode 100755
> > > index 00000000..2b98b3b3
> > > --- /dev/null
> > > +++ b/tests/generic/770
> > > @@ -0,0 +1,161 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > > +#
> > > +# FS QA Test 770
> > > +#
> > > +# Atomic write multi-fsblock data integrity tests with mixed mappings
> > > +# and O_SYNC
> > > +#
> > > +. ./common/preamble
> > > +. ./common/atomicwrites
> > > +_begin_fstest auto quick rw atomicwrites
> > > +
> > > +_require_scratch_write_atomic_multi_fsblock
> > > +_require_atomic_write_test_commands
> > > +
> 
> <snip>
> 
> > > +			"U")
> > > +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
> > 
> > _require_xfs_io_command falloc
> > 
> > > +				;;
> > > +		esac
> > > +		off=$((off + blksz))
> > > +	done
> > > +
> > > +	echo "Mixed mapping preparation complete. Full mapping pattern: $mapping" >> $seqres.full
> > > +
> > > +	sync $testfile
> > > +
> > > +	echo "Performing O_DSYNC atomic write over the entire $awu_max region" >> $seqres.full
> > > +	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> > 
> > _require_xfs_io_command pwrite -A
> 
> Hey Zorro,
> 
> pwrite -A and falloc command are already checked for in
> _require_atomic_write_test_commands helper used on top.

Oh, sure, sorry I didn't notice that :)

Thanks,
Zorro


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-13  5:37     ` Ojaswin Mujoo
@ 2025-06-20 14:01       ` John Garry
  2025-06-20 16:49         ` Ojaswin Mujoo
  0 siblings, 1 reply; 61+ messages in thread
From: John Garry @ 2025-06-20 14:01 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong

On 13/06/2025 06:37, Ojaswin Mujoo wrote:
> On Thu, Jun 12, 2025 at 11:26:17AM +0100, John Garry wrote:
>> On 11/06/2025 10:34, Ojaswin Mujoo wrote:
>>> From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
>>>
>>> Brute force all possible blocksize clustersize combination on a bigalloc
>>> filesystem for stressing atomic write using fio data crc verifier. We run
>>> multiple threads in parallel with each job writing to its own file. The
>>> parallel jobs running on a constrained filesystem size ensure that we stress
>>> the ext4 allocator to allocate contiguous extents.
>>>
>>> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
>>> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
>>
>> RWF_ATOMIC does not guarantee that racing atomic writes and reads are
>> serialised. That is what you are testing here, right?
>>
>> NVMe and SCSI do guarantee this (serialisation). However, reads in the block
>> layer may be split into multiple requests, even though unlikely.
> 
> Hey John,
> 
> We are not really testing the serialization here
> (verify_write_sequence=0) but rather that multiple threads atomically
> writing to the same file should never tear the write.
> 
> In the test, for each job, multiple threads are doing the write on the
> same file with the same iosize so they should always overwrite each
> other completely.  The verifier then ensures that the whole iosize chunk
> written matches the checksum, which will only happen if the write is not
> torn. That way we are able to ensure that even with multiple threads
> writing the same ranges, we don't break the writes (the sequence doesn't
> matter as long as it is not breaking)

So the threads are overwriting the same data range, right?

If so, as an experiment, try setting /sys/block/DEV/queue/max_sectors_kb 
lower than the bsize and see what happens...

Thanks,
John

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
  2025-06-11 15:38   ` Darrick J. Wong
  2025-06-19  7:15   ` Zorro Lang
@ 2025-06-20 14:05   ` John Garry
  2025-06-20 15:24     ` Ojaswin Mujoo
  2 siblings, 1 reply; 61+ messages in thread
From: John Garry @ 2025-06-20 14:05 UTC (permalink / raw)
  To: Ojaswin Mujoo, fstests; +Cc: Ritesh Harjani, djwong

On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi blocks atomic writes
> maintain atomic guarantees across sudden FS shutdowns.

This looks like an interesting test. Can you please consider writing a 
fuller commit message describing how it works?

And do you verify that the test would fail for non-atomic writes
(before proving that it works for atomic writes)?

I've had a lot of difficulty testing atomic write behavior for system 
crash/unexpected shutdown, so it would be interesting to see how this works.

In the meantime, I'll try to understand the code...

Thanks,
John

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
  2025-06-19  7:43   ` Zorro Lang
@ 2025-06-20 15:08     ` Ojaswin Mujoo
  2025-06-20 16:53       ` Zorro Lang
  0 siblings, 1 reply; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20 15:08 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 19, 2025 at 03:43:51PM +0800, Zorro Lang wrote:
> On Wed, Jun 11, 2025 at 03:04:52PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > We brute force all possible blocksize & clustersize combinations on
> > a bigalloc filesystem for stressing atomic write using fio data crc
> > verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
> > a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> > that we never see the mix of data contents from different threads on
> > a given bsrange.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/ext4/061     | 107 +++++++++++++++++++++++++++++++++++++++++++++
> >  tests/ext4/061.out |   2 +
> >  2 files changed, 109 insertions(+)
> >  create mode 100755 tests/ext4/061
> >  create mode 100644 tests/ext4/061.out
> > 

<snip>

> > +function max()
> > +{
> > +	if (( $1 > $2 )); then
> > +		echo "$1"
> > +	else
> > +		echo "$2"
> > +	fi
> > +}
> > +
> > +function min()
> > +{
> > +	if (( $1 < $2 )); then
> > +		echo "$1"
> > +	else
> > +		echo "$2"
> > +	fi
> > +}
> 
> I've seen these two functions many times, please make them to be common helpers.

Yes I will do this in v2.

> 
> > +
> > +FS_MAX_CLUSTER_SIZE=$((128*1024))
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > +SIZE=$((100*1024*1024))
> > +fiobsize=4096
> > +
> > +# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
> > +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> > +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> > +fsblocksize=$(max 4096 "$bdev_awu_min")
> > +FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
> > +
> > +function create_fio_config()
> > +{
> > +cat >$fio_config <<EOF
> > +	[aio-dio-aw-verify]
> > +	direct=1
> > +	ioengine=libaio
> 
> _require_aiodio

Got it, thanks
> 
> > +	rw=randwrite
> > +	bs=$fiobsize
> > +	fallocate=native
> > +	filename=$SCRATCH_MNT/test-file
> > +	size=$SIZE
> > +	iodepth=$FIO_LOAD
> > +	numjobs=$FIO_LOAD
> > +	group_reporting=1
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	verify_fatal=1
> > +	verify_dump=0
> > +	verify_backlog=1024
> > +	verify_async=4
> > +	verify_write_sequence=0
> > +	atomic=1
> > +EOF
> > +}
> > +
> > +# Let's create a sample fio config to check whether fio supports all options.
> > +fio_config=$tmp.fio
> > +create_fio_config
> > +_require_fio $fio_config
> > +
> > +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> > +	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
> > +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> 
> Wow, 3 for loops...

Yes :) We want to test all the combinations. Since the amount of I/O is
small, it usually finishes within a reasonable time (~5 mins).
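
To give an idea of how many combinations that is, here is a standalone
sketch of the same three nested loops. enumerate_combos is a made-up name,
and the page size and maximum cluster size are passed in as arguments
instead of coming from _get_page_size and the bdev limits.

```bash
# Print one "fsblocksize fsclustersize fiobsize" line per combination.
# $1 = minimum fs block size, $2 = page size, $3 = maximum cluster size
enumerate_combos() {
	local min_bs=$1 page_size=$2 max_cluster=$3
	local bs cs io

	for ((bs = min_bs; bs <= page_size; bs <<= 1)); do
		for ((cs = bs; cs <= max_cluster; cs <<= 1)); do
			for ((io = bs; io <= cs; io <<= 1)); do
				echo "$bs $cs $io"
			done
		done
	done
}
```

Piping the output through "wc -l" gives the number of mkfs + fio runs for
a given device.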

> 
> > +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> > +			_scratch_mkfs_ext4 "$MKFS_OPTIONS" >> $seqres.full 2>&1 || continue
> 
> MKFS_OPTIONS is used in _scratch_mkfs_ext4 by default, you don't need to use it as
> an argument.
> 
> Or do you want to do:
> 
> _scratch_mkfs_ext4 "-O bigalloc -b $fsblocksize -C $fsclustersize" ?

Yes, we want to explicitly use these options so that various block and
cluster sizes get tested irrespective of what the user passes in
MKFS_OPTIONS.

However, looking at _scratch_mkfs_ext4() > _scratch_do_mkfs(), we seem
to try mkfs.ext4 $MKFS_OPTIONS $extra_mkfs_options first, and if that
fails we try mkfs $extra_mkfs_options anyway.

That being said, I'm a bit unsure whether something like:

mkfs.ext4 $MKFS_OPTIONS -O bigalloc -b $fsblocksize -C $fsclustersize

would always produce reliable results, so:

		MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
		_scratch_mkfs_ext4

seems like a better approach. Thoughts?

> 
> > +			if _try_scratch_mount >> $seqres.full 2>&1; then
> > +				touch $SCRATCH_MNT/f1
> > +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> > +				fio_config=$tmp.fio
> 
> Do you change the "$tmp"? If not, you don't need to set fio_config=$tmp.fio
> everytime, you've set fio_config above this for loop.
> 
> > +				fio_out=$tmp.fio.out
> 
> If you don't change "$tmp", you can set fio_out once, before the
> for loop running.
> 
> > +				create_fio_config
> > +				_require_fio $fio_config
> 
> I think only $fiobsize will be changed in $fio_config at here, so you're trying
> to check "bs=$fiobsize"? If so, that doesn't make sense, due to _require_fio accepts
> any "bs" number, except bs <= 0. So you don't need to call _require_fio
> at here everytime, especially you've called it before the loop running.
> 

Makes sense, I'll clean this up.

Thanks,
ojaswin

> Thanks,
> Zorro
> 
> > +				cat $fio_config >> $seqres.full
> > +				$FIO_PROG $fio_config --output=$fio_out
> > +				ret=$?
> > +				cat $fio_out >> $seqres.full
> > +				_scratch_unmount
> > +				[[ $ret -eq 0 ]] || break;
> > +			fi
> > +		done
> > +	done
> > +done
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> > new file mode 100644
> > index 00000000..273be9e0
> > --- /dev/null
> > +++ b/tests/ext4/061.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 061
> > +Silence is golden
> > -- 
> > 2.49.0
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes
  2025-06-20 14:05   ` John Garry
@ 2025-06-20 15:24     ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20 15:24 UTC (permalink / raw)
  To: John Garry; +Cc: fstests, Ritesh Harjani, djwong

On Fri, Jun 20, 2025 at 03:05:42PM +0100, John Garry wrote:
> On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> > This test is intended to ensure that multi blocks atomic writes
> > maintain atomic guarantees across sudden FS shutdowns.
> 
> This looks like an interesting test. Can you please consider writing a
> fuller commit message describing how it works?

Hey John, sure, I'll add a more comprehensive commit message. Basically
what we do here is create a mixed mapping and then do atomic writes on it
while shutting the FS down in parallel.

Then we note down the last offset n where a write happened before the
shutdown and check the range from n - 5 * awu_max to n + 5 * awu_max to
ensure each awu_max chunk has either completely old data or completely new
data, but not a mix of both.

We repeat this for non-mixed mappings like fully written, hole etc., but
the mixed one is the one most likely to get torn.
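
The window computation can be sketched on its own as below. It mirrors the
verify_start/verify_end arithmetic in the patch, but verify_window is a
hypothetical name and takes its inputs as arguments. It uses (( )) for the
comparisons so they are numeric.

```bash
# Print "start end" of the byte range to verify around the last write.
# $1 = last written offset, $2 = awu_max, $3 = file size
verify_window() {
	local last=$1 awu=$2 filesize=$3
	local start=$(( last - awu * 5 ))
	local end=$(( last + awu * 5 ))

	# clamp the window to the file
	(( start < 0 )) && start=0
	(( end > filesize )) && end=$filesize

	echo "$start $end"
}
```

Each awu_max sized chunk inside that window is then compared against the
expected old and new data.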

> 
> And do you verify that the test would fail for a non-atomic writes (before
> proving that it works for atomic writes)?


Yes, we were reliably able to see torn data, every 20 iterations or so,
when we tried this test on a non-atomic block device (after modifying
the test a bit, e.g. not passing -A to pwrite).

In fact, even with an atomic device, we were able to detect an issue
in ext4's early multiblock atomic write implementation where mishandling
of the transaction was resulting in torn data.

> 
> I've had a lot of difficulty testing atomic write behavior for system
> crash/unexpected shutdown, so it would be interesting to see how this works.
> 
> In the meantime, I'll try to understand the code...

Sure. One thing to note is that we use FS shutdown and focus mostly on
making sure the FS doesn't tear the write.

We might need something similar for the block layer as well at some point.
> 
> Thanks,
> John

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-20 14:01       ` John Garry
@ 2025-06-20 16:49         ` Ojaswin Mujoo
  0 siblings, 0 replies; 61+ messages in thread
From: Ojaswin Mujoo @ 2025-06-20 16:49 UTC (permalink / raw)
  To: John Garry; +Cc: fstests, Ritesh Harjani, djwong

On Fri, Jun 20, 2025 at 03:01:40PM +0100, John Garry wrote:
> On 13/06/2025 06:37, Ojaswin Mujoo wrote:
> > On Thu, Jun 12, 2025 at 11:26:17AM +0100, John Garry wrote:
> > > On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> > > > From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> > > > 
> > > > Brute force all possible blocksize clustersize combination on a bigalloc
> > > > filesystem for stressing atomic write using fio data crc verifier. We run
> > > > multiple threads in parallel with each job writing to its own file. The
> > > > parallel jobs running on a constrained filesystem size ensure that we stress
> > > > the ext4 allocator to allocate contiguous extents.
> > > > 
> > > > Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> > > > Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> > > 
> > > RWF_ATOMIC does not guarantee that racing atomic writes and reads are
> > > serialised. That is what you are testing here, right?
> > > 
> > > NVMe and SCSI do guarantee this (serialisation). However, reads in the block
> > > layer may be split into multiple requests, even though unlikely.
> > 
> > Hey John,
> > 
> > We are not really testing the serialization here
> > (verify_write_sequence=0) but rather that multiple threads atomically
> > writing to the same file should never tear the write.
> > 
> > In the test, for each job, multiple threads are doing the write on the
> > same file with the same iosize so they should always overwrite each
> > other completely.  The verifier then ensures that the whole iosize chunk
> > written matches the checksum, which will only happen if the write is not
> > torn. That way we are able to ensure that even with multiple threads
> > writing the same ranges, we don't break the writes (the sequence doesn't
> > matter as long as it is not breaking)
> 
> So the threads are overwriting the same data range, right?
> 
> If so, as an experiment, try setting /sys/block/DEV/queue/max_sectors_kb
> lower than the bsize and see what happens...

Okay so I tried this flow:

1. mount ext4 FS
2. max_sectors_kb = 4
3. Do a fio atomic write of 64k (awu_max = 64k in this case)

And I'm able to see checksum errors, which means the write is getting
torn. I'm not sure of the exact block layer code around max_sectors_kb, but
I do see that max_sectors_kb sets max_user_sectors, which in turn sets
max_sectors. But then get_max_io_size() ignores max_sectors for atomic
writes and uses atomic_write_max_sectors, which should be correctly set.

Hmm.. I must be missing something, what is splitting the bio?
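
For anyone repeating the experiment, a small sketch for comparing the two
queue limits involved. check_queue_limits is a made-up name; it takes the
queue directory (e.g. /sys/block/DEV/queue) as an argument and only reports
the values. It does not say anything about what actually splits the bio.

```bash
# Compare max_sectors_kb against atomic_write_unit_max_bytes.
# $1 = path to the device's sysfs queue directory
check_queue_limits() {
	local qdir=$1
	local max_kb awu_max

	max_kb=$(cat "$qdir/max_sectors_kb")
	awu_max=$(cat "$qdir/atomic_write_unit_max_bytes")

	if (( max_kb * 1024 < awu_max )); then
		echo "max_sectors_kb=$max_kb is below atomic_write_unit_max_bytes=$awu_max"
	else
		echo "ok"
	fi
}
```

In the flow above (max_sectors_kb = 4, awu_max = 64k) this would print the
"is below" line.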

Regards,
ojaswin

> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
  2025-06-20 15:08     ` Ojaswin Mujoo
@ 2025-06-20 16:53       ` Zorro Lang
  0 siblings, 0 replies; 61+ messages in thread
From: Zorro Lang @ 2025-06-20 16:53 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, djwong, john.g.garry

On Fri, Jun 20, 2025 at 08:38:40PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jun 19, 2025 at 03:43:51PM +0800, Zorro Lang wrote:
> > On Wed, Jun 11, 2025 at 03:04:52PM +0530, Ojaswin Mujoo wrote:
> > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > 
> > > We brute force all possible blocksize & clustersize combinations on
> > > a bigalloc filesystem for stressing atomic write using fio data crc
> > > verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
> > > a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> > > that we never see the mix of data contents from different threads on
> > > a given bsrange.
> > > 
> > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > >  tests/ext4/061     | 107 +++++++++++++++++++++++++++++++++++++++++++++
> > >  tests/ext4/061.out |   2 +
> > >  2 files changed, 109 insertions(+)
> > >  create mode 100755 tests/ext4/061
> > >  create mode 100644 tests/ext4/061.out
> > > 
> 
> <snip>
> 
> > > +function max()
> > > +{
> > > +	if (( $1 > $2 )); then
> > > +		echo "$1"
> > > +	else
> > > +		echo "$2"
> > > +	fi
> > > +}
> > > +
> > > +function min()
> > > +{
> > > +	if (( $1 < $2 )); then
> > > +		echo "$1"
> > > +	else
> > > +		echo "$2"
> > > +	fi
> > > +}
> > 
> > I've seen these two functions many times, please make them to be common helpers.
> 
> Yes I will do this in v2.
> 
> > 
> > > +
> > > +FS_MAX_CLUSTER_SIZE=$((128*1024))
> > > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > > +SIZE=$((100*1024*1024))
> > > +fiobsize=4096
> > > +
> > > +# Calculate fsblocksize and FS_MAX_CLUSTER_SIZE as per bdev atomic write units.
> > > +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> > > +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> > > +fsblocksize=$(max 4096 "$bdev_awu_min")
> > > +FS_MAX_CLUSTER_SIZE=$(min "$FS_MAX_CLUSTER_SIZE" "$bdev_awu_max")
> > > +
> > > +function create_fio_config()
> > > +{
> > > +cat >$fio_config <<EOF
> > > +	[aio-dio-aw-verify]
> > > +	direct=1
> > > +	ioengine=libaio
> > 
> > _require_aiodio
> 
> Got it, thanks
> > 
> > > +	rw=randwrite
> > > +	bs=$fiobsize
> > > +	fallocate=native
> > > +	filename=$SCRATCH_MNT/test-file
> > > +	size=$SIZE
> > > +	iodepth=$FIO_LOAD
> > > +	numjobs=$FIO_LOAD
> > > +	group_reporting=1
> > > +	verify_state_save=0
> > > +	verify=crc32c
> > > +	verify_fatal=1
> > > +	verify_dump=0
> > > +	verify_backlog=1024
> > > +	verify_async=4
> > > +	verify_write_sequence=0
> > > +	atomic=1
> > > +EOF
> > > +}
> > > +
> > > +# Let's create a sample fio config to check whether fio supports all options.
> > > +fio_config=$tmp.fio
> > > +create_fio_config
> > > +_require_fio $fio_config
> > > +
> > > +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> > > +	for ((fsclustersize=$fsblocksize; fsclustersize <= $FS_MAX_CLUSTER_SIZE; fsclustersize = $fsclustersize << 1)); do
> > > +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> > 
> > Wow, 3 for loops...
> 
> Yes :) We want to test all the combinations. Since the IO is less, it
> usually finishes within a reasonable time (~5 mins)
> 
> > 
> > > +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> > > +			_scratch_mkfs_ext4 "$MKFS_OPTIONS" >> $seqres.full 2>&1 || continue
> > 
> > MKFS_OPTIONS is used in _scratch_mkfs_ext4 by default, you don't need to use it as
> > an argument.
> > 
> > Or do you want to do:
> > 
> > _scratch_mkfs_ext4 "-O bigalloc -b $fsblocksize -C $fsclustersize" ?
> 
> Yes we want to explicitly run these options so various block and cluster
> sizes get tested irrespective of what the user passes in MKFS_OPTIONS.
> 
> However, looking at _scratch_mkfs_ext4() > _scratch_do_mkfs() we seem
> to try mkfs.ext4 $MKFS_OPTIONS $extra_mkfs_options first and if that fails we 
> anyways try mkfs $extra_mkfs_options.
> 
> That being said, I'm a bit unsure whether something like:
> 
> mkfs.ext4 $MKFS_OPTIONS -O bigalloc -b $fsblocksize -C $fsclustersize
> 
> would produce reliable results always so:
> 
> 		MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> 		_scratch_mkfs_ext4
> 
> seems better approach. Thoughts?

Sure, this looks good to me :)

Thanks,
Zorro

> 
> > 
> > > +			if _try_scratch_mount >> $seqres.full 2>&1; then
> > > +				touch $SCRATCH_MNT/f1
> > > +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> > > +				fio_config=$tmp.fio
> > 
> > Do you change the "$tmp"? If not, you don't need to set fio_config=$tmp.fio
> > everytime, you've set fio_config above this for loop.
> > 
> > > +				fio_out=$tmp.fio.out
> > 
> > If you don't change "$tmp", you can set fio_out once, before the
> > for loop running.
> > 
> > > +				create_fio_config
> > > +				_require_fio $fio_config
> > 
> > I think only $fiobsize will be changed in $fio_config at here, so you're trying
> > to check "bs=$fiobsize"? If so, that doesn't make sense, due to _require_fio accepts
> > any "bs" number, except bs <= 0. So you don't need to call _require_fio
> > at here everytime, especially you've called it before the loop running.
> > 
> 
> Makes sense, I'll clean this up.
> 
> Thanks,
> ojaswin
> 
> > Thanks,
> > Zorro
> > 
> > > +				cat $fio_config >> $seqres.full
> > > +				$FIO_PROG $fio_config --output=$fio_out
> > > +				ret=$?
> > > +				cat $fio_out >> $seqres.full
> > > +				_scratch_unmount
> > > +				[[ $ret -eq 0 ]] || break;
> > > +			fi
> > > +		done
> > > +	done
> > > +done
> > > +
> > > +# success, all done
> > > +echo Silence is golden
> > > +status=0
> > > +exit
> > > diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> > > new file mode 100644
> > > index 00000000..273be9e0
> > > --- /dev/null
> > > +++ b/tests/ext4/061.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 061
> > > +Silence is golden
> > > -- 
> > > 2.49.0
> > > 
> > > 
> > 
> 



* Re: [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc
  2025-06-17  6:22             ` Ojaswin Mujoo
@ 2025-06-30 15:27               ` Darrick J. Wong
  0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2025-06-30 15:27 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: fstests, Ritesh Harjani, john.g.garry, tytso

On Tue, Jun 17, 2025 at 11:52:02AM +0530, Ojaswin Mujoo wrote:
> On Fri, Jun 13, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> > On Fri, Jun 13, 2025 at 11:01:25AM +0530, Ojaswin Mujoo wrote:
> > > On Thu, Jun 12, 2025 at 07:36:14AM -0700, Darrick J. Wong wrote:
> > > > On Thu, Jun 12, 2025 at 11:41:16AM +0530, Ojaswin Mujoo wrote:
> > > > > On Wed, Jun 11, 2025 at 07:30:05AM -0700, Darrick J. Wong wrote:
> > > > > > On Wed, Jun 11, 2025 at 03:04:44PM +0530, Ojaswin Mujoo wrote:
> > > > > > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > > > > > 
> > > > > > > Insert range and collapse range only work with bigalloc when
> > > > > > > the range is cluster size aligned, which fsx doesn't take care of.
> > > > > > > To work around this, disable insert range and collapse range on
> > > > > > > ext4 if bigalloc is enabled.
> > > > > > 
> > > > > > Hmmm, insert/collapse-range have the same behavior on xfs realtime,
> > > > > > maybe we should amend test() in fsx to round to the allocation unit
> > > > > > size?
> > > > > Hey Darrick,
> > > > > 
> > > > > Yes, makes sense, but as you mentioned, I'm not sure if there
> > > > > is a way to programmatically detect the bigalloc cluster size (or
> > > > > allocation unit in general) like we do for xfs.
> > > > 
> > > > I don't either, but maybe we should have a way to reveal the allocation
> > > > unit size for a given file?  Yet another statx field? :P
> > > > 
> > > > (It /would/ be useful for programs that use collapse/insert range)
> > > 
> > > Yes, it would at the very least help with defining clear semantics for
> > > collapse/insert range with bigalloc/rtvol, because right now those
> > > operations just fail with EINVAL if the range is not aligned correctly,
> > > which is confusing since it is not documented how to do it properly.
> > > 
> > > xfs does have an ioctl to get the geometry for rtvol. I think you are
> > > suggesting a more generic statx field which can be used by other FSes as
> > > well, right?
> > 
> > Right, since other filesystems (fat, ntfs, etc) also have allocation
> > units larger than the fsblock size.  Most of the time the allocunit
> > amplification simply doesn't matter to applications, but once in a while
> > it does (collapse/insert range, cow) affect performance.
> > 
> > --D
> 
> Makes sense, Darrick. I can look into it.
> 
> For this patch, is it okay to keep the approach of disabling
> collapse/insert range for bigalloc for now, and change fsx later
> if we add support for exposing alloc units?

Yeah, particularly since there's no easy way for userspace to find out
the collapse/insert unit size.

--D
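
(For illustration, a minimal sketch of the rounding idea discussed earlier in
this subthread: aligning a collapse/insert range to the allocation unit. This
is not what the patch does, since the patch disables these operations instead.
round_down is a hypothetical helper, and cluster_size is assumed to be known
out of band, e.g. from the -C value given to mkfs, because userspace has no
generic way to query it today.)

```shell
#!/bin/bash
# Hypothetical helper: largest multiple of UNIT that is <= VALUE.
# usage: round_down VALUE UNIT
round_down() {
	echo $(( $1 / $2 * $2 ))
}

# Assumption: the allocation unit is known to the caller (here a 16k cluster).
cluster_size=16384

# Arbitrary example values standing in for a randomly chosen range.
offset=$(round_down 70000 "$cluster_size")
length=$(round_down 40000 "$cluster_size")

echo "aligned offset=$offset length=$length"
# prints: aligned offset=65536 length=32768
```

A range aligned this way would avoid the EINVAL on misaligned requests, but
only if the caller can learn the unit size, which is the missing piece
discussed above.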

> Regards,
> Ojaswin
> 



Thread overview: 61+ messages
2025-06-11  9:34 [RFC 00/12] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-06-11  9:34 ` [RFC 01/12] common/preamble: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
2025-06-11 14:30   ` Darrick J. Wong
2025-06-12  6:11     ` Ojaswin Mujoo
2025-06-12 14:36       ` Darrick J. Wong
2025-06-13  5:31         ` Ojaswin Mujoo
2025-06-13 15:04           ` Darrick J. Wong
2025-06-17  6:22             ` Ojaswin Mujoo
2025-06-30 15:27               ` Darrick J. Wong
2025-06-18 19:13   ` Zorro Lang
2025-06-20  6:21     ` Ojaswin Mujoo
2025-06-20  9:59       ` Zorro Lang
2025-06-11  9:34 ` [RFC 02/12] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
2025-06-11 14:31   ` Darrick J. Wong
2025-06-12  6:17     ` Ojaswin Mujoo
2025-06-12 14:35       ` Darrick J. Wong
2025-06-11  9:34 ` [RFC 03/12] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
2025-06-11 14:35   ` Darrick J. Wong
2025-06-12  6:18     ` Ojaswin Mujoo
2025-06-11  9:34 ` [RFC 04/12] generic/767: Add atomic write test using fio crc check verifier Ojaswin Mujoo
2025-06-11 14:42   ` Darrick J. Wong
2025-06-12  6:22     ` Ojaswin Mujoo
2025-06-12 14:55       ` Darrick J. Wong
2025-06-18 19:34   ` Zorro Lang
2025-06-20  7:06     ` Ojaswin Mujoo
2025-06-11  9:34 ` [RFC 05/12] generic/769: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
2025-06-11 15:35   ` Darrick J. Wong
2025-06-11  9:34 ` [RFC 06/12] generic/770: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
2025-06-11 15:36   ` Darrick J. Wong
2025-06-12  6:23     ` Ojaswin Mujoo
2025-06-18 20:17   ` Zorro Lang
2025-06-20  8:20     ` Ojaswin Mujoo
2025-06-20 12:12       ` Zorro Lang
2025-06-11  9:34 ` [RFC 07/12] generic/771: Stress fsx with atomic writes enabled Ojaswin Mujoo
2025-06-11 14:45   ` Darrick J. Wong
2025-06-12  6:27     ` Ojaswin Mujoo
2025-06-12 15:14       ` Darrick J. Wong
2025-06-13  5:20         ` Ojaswin Mujoo
2025-06-18 20:27   ` Zorro Lang
2025-06-20  8:26     ` Ojaswin Mujoo
2025-06-11  9:34 ` [RFC 08/12] generic/772: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
2025-06-11 15:38   ` Darrick J. Wong
2025-06-12  6:28     ` Ojaswin Mujoo
2025-06-19  7:15   ` Zorro Lang
2025-06-20 11:11     ` Ojaswin Mujoo
2025-06-20 14:05   ` John Garry
2025-06-20 15:24     ` Ojaswin Mujoo
2025-06-11  9:34 ` [RFC 09/12] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
2025-06-19  7:43   ` Zorro Lang
2025-06-20 15:08     ` Ojaswin Mujoo
2025-06-20 16:53       ` Zorro Lang
2025-06-11  9:34 ` [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
2025-06-12 10:26   ` John Garry
2025-06-13  5:37     ` Ojaswin Mujoo
2025-06-20 14:01       ` John Garry
2025-06-20 16:49         ` Ojaswin Mujoo
2025-06-19  7:45   ` Zorro Lang
2025-06-11  9:34 ` [RFC 11/12] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
2025-06-19  7:52   ` Zorro Lang
2025-06-11  9:34 ` [RFC 12/12] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
2025-06-19  7:58   ` Zorro Lang
