public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/13] Add more tests for multi fs block atomic writes
@ 2025-06-26 11:58 Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
                   ` (13 more replies)
  0 siblings, 14 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Changes:

- (1/13) new patch with _min and _max helpers
- (2/13) remove setup_fs_options and add an fsx-specific helper
- (4/13) skip atomic write instead of falling back to normal write (fsx)
- (4/13) make atomic write default on instead of default off (fsx)
- (5,6/13) refactor and cleanup fio tests
- (7/13) refactored common code
- (8/13) don't ignore mmap writes for fsx with atomic writes
- (9/13) use od instead of xxd; handle cleanup of bg threads in _cleanup()
- (10-13/13) minor refactors
- change all tests to use _fail for better consistency
- use higher test numbers for easier merging

Link to rfc: https://lore.kernel.org/fstests/cover.1749629233.git.ojaswin@linux.ibm.com/

PS: I'm on vacation till next week, so there might be a delay in responding.

Ojaswin Mujoo (9):
  common/rc: Add _min() and _max() helpers
  common/rc: Fix fsx for ext4 with bigalloc
  common/rc: Add a helper to run fsx on a given file
  ltp/fsx.c: Add atomic writes support to fsx
  generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
  generic/1229: Stress fsx with atomic writes enabled
  generic/1230: Add sudden shutdown tests for multi block atomic writes
  ext4/063: Atomic write test for extent split across leaf nodes
  ext4/064: Add atomic write tests for journal credit calculation

Ritesh Harjani (IBM) (4):
  generic/1226: Add atomic write test using fio crc check verifier
  generic/1227: Add atomic write test using fio verify on file mixed
    mappings
  ext4/061: Atomic writes stress test for bigalloc using fio crc
    verifier
  ext4/062: Atomic writes test for bigalloc using fio crc verifier on
    multiple files

 common/rc              |  62 +++++++-
 ltp/fsx.c              | 109 +++++++++++++-
 tests/ext4/061         |  97 +++++++++++++
 tests/ext4/061.out     |   2 +
 tests/ext4/062         | 121 ++++++++++++++++
 tests/ext4/062.out     |   2 +
 tests/ext4/063         | 125 ++++++++++++++++
 tests/ext4/063.out     |   2 +
 tests/ext4/064         |  75 ++++++++++
 tests/ext4/064.out     |   2 +
 tests/generic/1226     |  66 +++++++++
 tests/generic/1226.out |   2 +
 tests/generic/1227     |  92 ++++++++++++
 tests/generic/1227.out |   2 +
 tests/generic/1228     | 139 ++++++++++++++++++
 tests/generic/1228.out |   2 +
 tests/generic/1229     |  41 ++++++
 tests/generic/1229.out |   2 +
 tests/generic/1230     | 321 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1230.out |   2 +
 20 files changed, 1257 insertions(+), 9 deletions(-)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out
 create mode 100755 tests/ext4/064
 create mode 100644 tests/ext4/064.out
 create mode 100755 tests/generic/1226
 create mode 100644 tests/generic/1226.out
 create mode 100755 tests/generic/1227
 create mode 100644 tests/generic/1227.out
 create mode 100755 tests/generic/1228
 create mode 100644 tests/generic/1228.out
 create mode 100755 tests/generic/1229
 create mode 100644 tests/generic/1229.out
 create mode 100755 tests/generic/1230
 create mode 100644 tests/generic/1230.out

-- 
2.49.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 01/13] common/rc: Add _min() and _max() helpers
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Many programs open-code this functionality, so add generic helpers
in common/rc

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/common/rc b/common/rc
index 94ae3d8c..2f94a4f9 100644
--- a/common/rc
+++ b/common/rc
@@ -5817,6 +5817,28 @@ _require_program() {
 	_have_program "$1" || _notrun "$tag required"
 }
 
+_min() {
+	local ret
+
+	for arg in "$@"; do
+		if [ -z "$ret" ] || (( $arg < $ret )); then
+			ret="$arg"
+		fi
+	done
+	echo $ret
+}
+
+_max() {
+	local ret
+
+	for arg in "$@"; do
+		if [ -z "$ret" ] || (( $arg > $ret )); then
+			ret="$arg"
+		fi
+	done
+	echo $ret
+}
+
 ################################################################################
 # make sure this script returns success
 /bin/true
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread
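As a quick sanity check, the helpers added by this patch can be exercised outside the harness. The sketch below carries standalone copies of _min()/_max() exactly as introduced in common/rc (only the surrounding script scaffolding is hypothetical):

```shell
#!/bin/bash
# Standalone copies of the _min()/_max() helpers added to common/rc.
_min() {
	local ret
	for arg in "$@"; do
		if [ -z "$ret" ] || (( arg < ret )); then
			ret="$arg"
		fi
	done
	echo $ret
}

_max() {
	local ret
	for arg in "$@"; do
		if [ -z "$ret" ] || (( arg > ret )); then
			ret="$arg"
		fi
	done
	echo $ret
}

# Typical use in a test: clamp an I/O size between device limits.
echo "min of 4096 65536 16384 -> $(_min 4096 65536 16384)"
echo "max of 4096 65536 16384 -> $(_max 4096 65536 16384)"
```

Later patches in this series use these helpers to pick a block size between the atomic write unit limits reported by statx.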

* [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 13:32   ` Theodore Ts'o
  2025-06-26 11:58 ` [PATCH v2 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Insert range and collapse range only work with bigalloc when the
range is cluster-size aligned, which fsx does not take care of. To
work around this, disable insert range and collapse range on ext4 if
bigalloc is enabled.

This is achieved by defining a new function _set_default_fsx_avoid,
called via the _run_fsx helper. It can be used to selectively disable
fsx options based on the configuration.

Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/common/rc b/common/rc
index 2f94a4f9..3cfa432d 100644
--- a/common/rc
+++ b/common/rc
@@ -5113,10 +5113,26 @@ _require_hugepage_fsx()
 		_notrun "fsx binary does not support MADV_COLLAPSE"
 }
 
+_set_default_fsx_avoid() {
+       case "$FSTYP" in
+       "ext4")
+               if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
+                       export FSX_AVOID+=" -I -C"
+               fi
+               ;;
+       # Add other filesystem types here as needed
+       *)
+               ;;
+       esac
+}
+
 _run_fsx()
 {
 	echo "fsx $*"
 	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
+
+	_set_default_fsx_avoid
+
 	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
 	echo "$@" >>$seqres.full
 	rm -f $TEST_DIR/junk
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread
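The effect of the new hook can be modelled in isolation. The sketch below is a simplified standalone copy of _set_default_fsx_avoid; in the harness, $FSTYP, $MKFS_OPTIONS and $FSX_AVOID are set by the fstests environment, and the sample values here are hypothetical:

```shell
#!/bin/bash
# Simplified model of _set_default_fsx_avoid: append fsx options to
# avoid (-I: no insert range, -C: no collapse range) when the ext4
# filesystem was made with bigalloc, since those ops require
# cluster-aligned ranges that fsx does not guarantee.
_set_default_fsx_avoid() {
	case "$FSTYP" in
	ext4)
		if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
			FSX_AVOID+=" -I -C"
		fi
		;;
	esac
}

# Hypothetical sample configuration:
FSTYP=ext4 MKFS_OPTIONS="-O bigalloc -C 65536" FSX_AVOID=""
_set_default_fsx_avoid
echo "fsx avoid options:$FSX_AVOID"
```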

* [PATCH v2 03/13] common/rc: Add a helper to run fsx on a given file
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Currently _run_fsx is hardcoded to run on a file in $TEST_DIR.
Add a helper _run_fsx_on_file so that we can run fsx on any
given file, including one in $SCRATCH_MNT. Also, refactor _run_fsx
to use this helper.

No functional change is intended in this patch.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/common/rc b/common/rc
index 3cfa432d..3ac9a4e5 100644
--- a/common/rc
+++ b/common/rc
@@ -5126,16 +5126,26 @@ _set_default_fsx_avoid() {
        esac
 }
 
-_run_fsx()
+_run_fsx_on_file()
 {
+	local testfile=$1
+	shift
+
+	if ! [ -f $testfile ]
+	then
+		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
+		touch $testfile
+	fi
+
 	echo "fsx $*"
 	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
 
 	_set_default_fsx_avoid
 
-	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
+	set -- $FSX_PROG $args $FSX_AVOID $testfile
+
 	echo "$@" >>$seqres.full
-	rm -f $TEST_DIR/junk
+	rm -f $testfile
 	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
 	local res=${PIPESTATUS[0]}
 	if [ $res -ne 0 ]; then
@@ -5147,6 +5157,12 @@ _run_fsx()
 	return 0
 }
 
+_run_fsx()
+{
+	_run_fsx_on_file $TEST_DIR/junk $@
+	return $?
+}
+
 # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
 # the test, but if any other error occurs then exit the test.
 _run_hugepage_fsx() {
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 04/13] ltp/fsx.c: Add atomic writes support to fsx
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (2 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Implement atomic write support in fsx so that atomic writes can be
fuzzed alongside the other fsx operations.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 104 insertions(+), 5 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 163b9453..8e9011f4 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -40,6 +40,7 @@
 #include <liburing.h>
 #endif
 #include <sys/syscall.h>
+#include "statx.h"
 
 #ifndef MAP_FILE
 # define MAP_FILE 0
@@ -49,6 +50,10 @@
 #define RWF_DONTCACHE	0x80
 #endif
 
+#ifndef RWF_ATOMIC
+#define RWF_ATOMIC	0x40
+#endif
+
 #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
 
 /* Operation flags (bitmask) */
@@ -110,6 +115,7 @@ enum {
 	OP_READ_DONTCACHE,
 	OP_WRITE,
 	OP_WRITE_DONTCACHE,
+	OP_WRITE_ATOMIC,
 	OP_MAPREAD,
 	OP_MAPWRITE,
 	OP_MAX_LITE,
@@ -200,6 +206,11 @@ int	uring = 0;
 int	mark_nr = 0;
 int	dontcache_io = 1;
 int	hugepages = 0;                  /* -h flag */
+int	do_atomic_writes = 1;		/* -a flag disables */
+
+/* Used for atomic writes */
+int awu_min = 0;
+int awu_max = 0;
 
 /* Stores info needed to periodically collapse hugepages */
 struct hugepages_collapse_info {
@@ -288,6 +299,7 @@ static const char *op_names[] = {
 	[OP_READ_DONTCACHE] = "read_dontcache",
 	[OP_WRITE] = "write",
 	[OP_WRITE_DONTCACHE] = "write_dontcache",
+	[OP_WRITE_ATOMIC] = "write_atomic",
 	[OP_MAPREAD] = "mapread",
 	[OP_MAPWRITE] = "mapwrite",
 	[OP_TRUNCATE] = "truncate",
@@ -422,6 +434,7 @@ logdump(void)
 				prt("\t***RRRR***");
 			break;
 		case OP_WRITE_DONTCACHE:
+		case OP_WRITE_ATOMIC:
 		case OP_WRITE:
 			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
 			    lp->args[0], lp->args[0] + lp->args[1] - 1,
@@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
 	file_size = offset + size;
 }
 
+static int is_power_of_2(unsigned n) {
+	return ((n & (n - 1)) == 0);
+}
+
+/*
+ * Round down n to nearest power of 2.
+ * If n is already a power of 2, return n;
+ */
+static int rounddown_pow_of_2(int n) {
+	int i = 0;
+
+	if (is_power_of_2(n))
+		return n;
+
+	for (; (1 << i) < n; i++);
+
+	return 1 << (i - 1);
+}
+
 void
 dowrite(unsigned offset, unsigned size, int flags)
 {
@@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
 	offset -= offset % writebdy;
 	if (o_direct)
 		size -= size % writebdy;
+	if (flags & RWF_ATOMIC) {
+		/* atomic write len must be between awu_min and awu_max */
+		if (size < awu_min)
+			size = awu_min;
+		if (size > awu_max)
+			size = awu_max;
+
+		/* atomic writes need power-of-2 sizes */
+		size = rounddown_pow_of_2(size);
+
+		/* atomic writes need naturally aligned offsets */
+		offset -= offset % size;
+
+		/* Skip the write if we are crossing max filesize */
+		if ((offset + size) > maxfilelen) {
+			if (!quiet && testcalls > simulatedopcount)
+				prt("skipping atomic write past maxfilelen\n");
+			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+			return;
+		}
+	}
 	if (size == 0) {
 		if (!quiet && testcalls > simulatedopcount && !o_direct)
 			prt("skipping zero size write\n");
@@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
 		return;
 	}
 
-	log4(OP_WRITE, offset, size, FL_NONE);
+	if (flags & RWF_ATOMIC)
+		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
+	else
+		log4(OP_WRITE, offset, size, FL_NONE);
 
 	gendata(original_buf, good_buf, offset, size);
 	if (offset + size > file_size) {
@@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
 		       (monitorstart == -1 ||
 			(offset + size > monitorstart &&
 			(monitorend == -1 || offset <= monitorend))))))
-		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
-		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
+		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
+		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
+		    (flags & RWF_ATOMIC) != 0);
 	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
 	if (iret != size) {
 		if (iret == -1)
@@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
 }
 #endif
 
+int test_atomic_writes(void) {
+	int ret;
+	struct statx stx;
+
+	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
+	if (ret < 0) {
+		fprintf(stderr, "main: Statx failed with %d. "
+			"Failed to determine atomic write limits, "
+			"disabling!\n", ret);
+		return 0;
+	}
+
+	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
+	    stx.stx_atomic_write_unit_min > 0) {
+		awu_min = stx.stx_atomic_write_unit_min;
+		awu_max = stx.stx_atomic_write_unit_max;
+		return 1;
+	}
+
+	fprintf(stderr, "main: IO Stack does not support "
+			"atomic writes, disabling!\n");
+	return 0;
+}
+
 #ifdef HAVE_COPY_FILE_RANGE
 int
 test_copy_range(void)
@@ -2356,6 +2437,12 @@ have_op:
 			goto out;
 		}
 		break;
+	case OP_WRITE_ATOMIC:
+		if (!do_atomic_writes) {
+			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+			goto out;
+		}
+		break;
 	}
 
 	switch (op) {
@@ -2385,6 +2472,11 @@ have_op:
 			dowrite(offset, size, 0);
 		break;
 
+	case OP_WRITE_ATOMIC:
+		TRIM_OFF_LEN(offset, size, maxfilelen);
+		dowrite(offset, size, RWF_ATOMIC);
+		break;
+
 	case OP_MAPREAD:
 		TRIM_OFF_LEN(offset, size, file_size);
 		domapread(offset, size);
@@ -2511,13 +2603,14 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
+		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
 	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
 	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
 	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
 	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
 	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
 	   ... fname\n\
+	-a: disable atomic writes\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
@@ -3059,9 +3152,13 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
+				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
+		case 'a':
+			prt("main(): Atomic writes disabled\n");
+			do_atomic_writes = 0;
+			break;
 		case 'b':
 			simulatedopcount = getnum(optarg, &endp);
 			if (!quiet)
@@ -3475,6 +3572,8 @@ main(int argc, char **argv)
 		exchange_range_calls = test_exchange_range();
 	if (dontcache_io)
 		dontcache_io = test_dontcache_io();
+	if (do_atomic_writes)
+		do_atomic_writes = test_atomic_writes();
 
 	while (keep_running())
 		if (!test())
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread
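The size/offset adjustment that dowrite() performs for RWF_ATOMIC can be modelled in shell. This is a sketch of the C logic above, not the patch itself; the awu_min/awu_max values in the example are hypothetical:

```shell
#!/bin/bash
# Model of the RWF_ATOMIC clamping in dowrite(): clamp size into
# [awu_min, awu_max], round it down to a power of two, then naturally
# align the offset to that (power-of-two) size.
rounddown_pow_of_2() {
	local n=$1 p=1
	while (( p * 2 <= n )); do p=$((p * 2)); done
	echo $p
}

atomic_adjust() {
	local offset=$1 size=$2 awu_min=$3 awu_max=$4
	(( size < awu_min )) && size=$awu_min
	(( size > awu_max )) && size=$awu_max
	size=$(rounddown_pow_of_2 $size)
	offset=$((offset - offset % size))	# natural alignment
	echo "$offset $size"
}

# e.g. a 5000-byte write at offset 12345 with awu_min=4096 awu_max=65536
atomic_adjust 12345 5000 4096 65536
```

The resulting (offset, size) pair always satisfies the RWF_ATOMIC constraints: power-of-two length within the advertised limits, naturally aligned offset.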

* [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (3 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-27 14:09   ` John Garry
  2025-06-26 11:58 ` [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

This adds an atomic write test using fio, based on its crc check
verifier. fio adds a crc for each data block. If the underlying device
supports atomic writes, then it is guaranteed that we will never see a
mix of data from two threads writing to the same physical block.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1226     | 66 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/1226.out |  2 ++
 2 files changed, 68 insertions(+)
 create mode 100755 tests/generic/1226
 create mode 100644 tests/generic/1226.out

diff --git a/tests/generic/1226 b/tests/generic/1226
new file mode 100755
index 00000000..a5ce8556
--- /dev/null
+++ b/tests/generic/1226
@@ -0,0 +1,66 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1226
+#
+# Validate FS atomic write using fio crc check verifier.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic
+_require_odirect
+_require_aio
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
+
+# XFS can have high awu_max_write due to software fallback. Cap it at 64k
+blocksize=$(_min "$blocksize" "65536")
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100 * 1024 * 1024))
+
+cat >$fio_config <<EOF
+[aio-dio-aw-verify]
+direct=1
+ioengine=libaio
+rw=randwrite
+bs=$blocksize
+fallocate=native
+filename=$SCRATCH_MNT/test-file
+size=$SIZE
+iodepth=$FIO_LOAD
+numjobs=$FIO_LOAD
+group_reporting=1
+verify_state_save=0
+verify=crc32c
+verify_fatal=1
+verify_dump=0
+verify_backlog=1024
+verify_async=4
+verify_write_sequence=0
+atomic=1
+EOF
+
+_require_fio $fio_config
+
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --output=$fio_out
+cat $fio_out >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1226.out b/tests/generic/1226.out
new file mode 100644
index 00000000..6dce0ea5
--- /dev/null
+++ b/tests/generic/1226.out
@@ -0,0 +1,2 @@
+QA output created by 1226
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread
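The block size picked by the test above reduces to bs = min(max(awu_min, awu_max/2), 64k). A standalone sketch of that computation (with hypothetical awu values; in the test they come from _get_atomic_write_unit_min/_max):

```shell
#!/bin/bash
# Sketch of the bs computation in generic/1226: at least awu_min, at
# most half of awu_max, capped at 64k to bound runtime when a software
# fallback reports a very large awu_max.
pick_blocksize() {
	local awu_min=$1 awu_max=$2 bs
	bs=$((awu_max / 2))
	(( bs < awu_min )) && bs=$awu_min
	(( bs > 65536 )) && bs=65536
	echo $bs
}

pick_blocksize 4096 16384    # -> 8192
pick_blocksize 16384 16384   # -> 16384 (half would be below awu_min)
pick_blocksize 4096 1048576  # -> 65536 (capped)
```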

* [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (4 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-27 14:48   ` John Garry
  2025-06-26 11:58 ` [PATCH v2 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

This test uses fio to first create a file with mixed mappings. Then it
does atomic writes using AIO DIO with parallel jobs to the same file
with mixed mappings. This forces the filesystem to allocate extents
over mixed-mapping regions, stressing the FS block allocators.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1227     | 92 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/1227.out |  2 +
 2 files changed, 94 insertions(+)
 create mode 100755 tests/generic/1227
 create mode 100644 tests/generic/1227.out

diff --git a/tests/generic/1227 b/tests/generic/1227
new file mode 100755
index 00000000..e2668758
--- /dev/null
+++ b/tests/generic/1227
@@ -0,0 +1,92 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1227
+#
+# Validate FS atomic write using fio crc check verifier on mixed mappings
+# of a file.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_odirect
+_require_aio
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
+
+fsbsize=$(_get_block_size $SCRATCH_MNT)
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((128 * 1024 * 1024))
+
+cat >$fio_config <<EOF
+[global]
+ioengine=libaio
+fallocate=none
+filename=$SCRATCH_MNT/test-file
+filesize=$SIZE
+bs=$fsbsize
+direct=1
+verify=0
+group_reporting=1
+
+# Create written extents
+[written_blocks]
+stonewall
+ioengine=libaio
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+
+# Create unwritten extents
+[unwritten_blocks]
+stonewall
+ioengine=falloc
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+
+# atomic write to mixed mappings of written/unwritten/holes
+[atomic_write_aio_dio_job]
+stonewall
+direct=1
+ioengine=libaio
+rw=randwrite
+bs=$aw_bsize
+iodepth=$FIO_LOAD
+numjobs=$FIO_LOAD
+size=$SIZE
+random_generator=lfsr
+verify_state_save=0
+verify=crc32c
+verify_fatal=1
+verify_dump=0
+verify_backlog=1024
+verify_async=4
+verify_write_sequence=0
+atomic=1
+EOF
+
+_require_fio $fio_config
+
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --output=$fio_out
+cat $fio_out >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1227.out b/tests/generic/1227.out
new file mode 100644
index 00000000..2605d062
--- /dev/null
+++ b/tests/generic/1227.out
@@ -0,0 +1,2 @@
+QA output created by 1227
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (5 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 11:58 ` [PATCH v2 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

This adds various atomic write multi-fsblock stress tests
with mixed mappings and O_SYNC, to ensure that data and metadata
are atomically persisted even if there is a shutdown.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1228     | 139 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1228.out |   2 +
 2 files changed, 141 insertions(+)
 create mode 100755 tests/generic/1228
 create mode 100644 tests/generic/1228.out

diff --git a/tests/generic/1228 b/tests/generic/1228
new file mode 100755
index 00000000..3f9a6af1
--- /dev/null
+++ b/tests/generic/1228
@@ -0,0 +1,139 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1228
+#
+# Atomic write multi-fsblock data integrity tests with mixed mappings
+# and O_SYNC
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+check_data_integrity() {
+	actual=$(_hexdump $testfile)
+	if [[ "$expected" != "$actual" ]]
+	then
+		echo "Integrity check failed"
+		echo "Integrity check failed" >> $seqres.full
+		echo "# Expected file contents:" >> $seqres.full
+		echo "$expected" >> $seqres.full
+		echo "# Actual file contents:" >> $seqres.full
+		echo "$actual" >> $seqres.full
+
+		_fail "Data integrity check failed. The atomic write was torn."
+	fi
+}
+
+prep_mixed_mapping() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+	local off=0
+	local mapping=""
+
+	local operations=("W" "H" "U")
+	local num_blocks=$((awu_max / blksz))
+	for ((i=0; i<num_blocks; i++)); do
+		local index=$((RANDOM % ${#operations[@]}))
+		local map="${operations[$index]}"
+		local mapping="${mapping}${map}"
+
+		case "$map" in
+			"W")
+				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
+				;;
+			"H")
+				# No operation needed for hole
+				;;
+			"U")
+				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
+				;;
+		esac
+		off=$((off + blksz))
+	done
+
+	echo "+ + Mixed mapping prep done. Full mapping pattern: $mapping" >> $seqres.full
+
+	sync $testfile
+}
+
+verify_atomic_write() {
+	if [[ "$1" == "shutdown" ]]
+	then
+		local do_shutdown=1
+	fi
+
+	test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
+
+	if [[ $do_shutdown -eq "1" ]]
+	then
+		echo "Shutting down filesystem" >> $seqres.full
+		_scratch_shutdown >> $seqres.full
+		_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount after shutdown failed"
+	fi
+
+	check_data_integrity
+}
+
+mixed_mapping_test() {
+	prep_mixed_mapping
+
+	echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
+	bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
+		        grep wrote | awk -F'[/ ]' '{print $2}')
+
+	verify_atomic_write $1
+}
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+
+# Create an expected pattern to compare with
+$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
+expected=$(_hexdump $testfile)
+echo "# Expected file contents:" >> $seqres.full
+echo "$expected" >> $seqres.full
+echo >> $seqres.full
+
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping:" >> $seqres.full
+echo >> $seqres.full
+for ((iteration=1; iteration<=10; iteration++)); do
+	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
+
+	echo "+ Testing without shutdown..." >> $seqres.full
+	mixed_mapping_test
+	echo "Passed!" >> $seqres.full
+
+	echo "+ Testing with sudden shutdown..." >> $seqres.full
+	mixed_mapping_test "shutdown"
+	echo "Passed!" >> $seqres.full
+
+	echo "Iteration $iteration completed: OK" >> $seqres.full
+	echo >> $seqres.full
+done
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
+
+
+echo >> $seqres.full
+echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
+bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
+                grep wrote | awk -F'[/ ]' '{print $2}')
+verify_atomic_write "shutdown"
+echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
+
diff --git a/tests/generic/1228.out b/tests/generic/1228.out
new file mode 100644
index 00000000..1baffa91
--- /dev/null
+++ b/tests/generic/1228.out
@@ -0,0 +1,2 @@
+QA output created by 1228
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread
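The mixed-mapping layout used by prep_mixed_mapping() above can be sketched in isolation. This only generates the random per-block pattern, without the actual I/O; the awu_max/blksz values in the example are hypothetical:

```shell
#!/bin/bash
# Generate a random per-block mapping pattern over the atomic write
# region, as prep_mixed_mapping() does: W = written (pwrite),
# H = hole (no operation), U = unwritten (falloc).
gen_mapping() {
	local num_blocks=$1 mapping="" ops=(W H U)
	for ((i = 0; i < num_blocks; i++)); do
		mapping+="${ops[RANDOM % 3]}"
	done
	echo "$mapping"
}

# e.g. awu_max=65536, blksz=4096 -> 16 blocks
pattern=$(gen_mapping $((65536 / 4096)))
echo "mapping pattern: $pattern (${#pattern} blocks)"
```

A single awu_max-sized atomic write then spans all of these blocks at once, so a torn write would be visible as a mix of old and new data across the pattern.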

* [PATCH v2 08/13] generic/1229: Stress fsx with atomic writes enabled
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (6 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-06-26 11:58 ` Ojaswin Mujoo
  2025-06-26 11:59 ` [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:58 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Stress a file with atomic writes to ensure we exercise codepaths
where different FS operations are mixed with atomic writes.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1229     | 41 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1229.out |  2 ++
 2 files changed, 43 insertions(+)
 create mode 100755 tests/generic/1229
 create mode 100644 tests/generic/1229.out

diff --git a/tests/generic/1229 b/tests/generic/1229
new file mode 100755
index 00000000..98e9b50c
--- /dev/null
+++ b/tests/generic/1229
@@ -0,0 +1,41 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1229
+#
+# fuzz fsx with atomic writes
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest rw auto quick atomicwrites
+
+_require_odirect
+_require_scratch_write_atomic
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount  >> $seqres.full 2>&1
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+
+# fsx usage:
+#
+# -N numops: total # operations to do
+# -l flen: the upper bound on file size
+# -o oplen: the upper bound on operation size (64k default)
+# -Z: O_DIRECT ()
+
+_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID  >> $seqres.full
+ret=$?
+if [[ $ret -ne 0 ]]; then
+	_fail "fsx returned error: $ret"
+fi
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1229.out b/tests/generic/1229.out
new file mode 100644
index 00000000..737d61c6
--- /dev/null
+++ b/tests/generic/1229.out
@@ -0,0 +1,2 @@
+QA output created by 1229
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (7 preceding siblings ...)
  2025-06-26 11:58 ` [PATCH v2 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-06-26 11:59 ` Ojaswin Mujoo
  2025-06-27 16:11   ` John Garry
  2025-06-26 11:59 ` [PATCH v2 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:59 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

This test is intended to ensure that multi-block atomic writes
maintain their atomicity guarantees across sudden FS shutdowns.

The test lays out a file with a random mix of written, unwritten and
hole extents. It then performs atomic writes sequentially on the file
while shutting down the FS in parallel. Next, it notes the last offset
that was written atomically just before the shutdown and makes sure the
blocks around it contain either completely old data or completely new
data, i.e. the write was not torn by the shutdown.

The same is repeated with a completely written, a completely unwritten
and a completely empty file to ensure these cases are not torn either.
Finally, a similar test covers append atomic writes.
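The post-shutdown check reduces to a byte-wise comparison: each
atomic-write-sized region must match either the old pattern or the new
pattern in full. A standalone sketch of that comparison (file names,
sizes and patterns here are purely illustrative, not part of the test):

```shell
#!/bin/bash
# Illustrative sketch of the torn-write check; all paths and patterns are
# made up for demonstration.
tmpdir=$(mktemp -d)
awu=4096    # stands in for awu_max

head -c $awu /dev/zero | tr '\0' 'x' > $tmpdir/old    # pre-shutdown data
head -c $awu /dev/zero | tr '\0' 'a' > $tmpdir/new    # data the atomic write carried
cp $tmpdir/new $tmpdir/actual                         # pretend the write completed

old=$(od -An -t x1 -N $awu $tmpdir/old)
new=$(od -An -t x1 -N $awu $tmpdir/new)
actual=$(od -An -t x1 -N $awu $tmpdir/actual)

# The region is intact iff it matches one of the two patterns completely;
# any other content means the write was torn.
if [ "$actual" = "$old" ] || [ "$actual" = "$new" ]; then
	echo "not torn"
else
	echo "torn"
fi
rm -rf $tmpdir
```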

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1230     | 321 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1230.out |   2 +
 2 files changed, 323 insertions(+)
 create mode 100755 tests/generic/1230
 create mode 100644 tests/generic/1230.out

diff --git a/tests/generic/1230 b/tests/generic/1230
new file mode 100755
index 00000000..c65ad55f
--- /dev/null
+++ b/tests/generic/1230
@@ -0,0 +1,321 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test No. 1230
+#
+# Test multi block atomic writes with sudden FS shutdowns to ensure
+# the FS is not tearing the write operation
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+echo "Awu max: $awu_max" >> $seqres.full
+
+num_blocks=$((awu_max / blksz))
+filesize=$(($blksz * 12 * 1024 ))
+
+_cleanup() {
+	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
+	wait
+}
+
+atomic_write_loop() {
+	local off=0
+	local size=$awu_max
+	for ((i=0; i<$((filesize / $size )); i++)); do
+		# Due to sudden shutdown this can produce errors so just redirect them
+		# to seqres.full
+		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
+		echo "Written to offset: $off" >> $tmp.aw
+		off=$((off + $size))
+	done
+}
+
+create_mixed_mappings() {
+	local file=$1
+	local size_bytes=$2
+
+	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
+	#Fill the file with alternate written and unwritten blocks
+	local off=0
+	local operations=("W" "U")
+
+	for ((i=0; i<$((size_bytes / blksz )); i++)); do
+		index=$(($i % ${#operations[@]}))
+		map="${operations[$index]}"
+
+		case "$map" in
+		    "W")
+			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
+			;;
+		    "U")
+			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
+			;;
+		esac
+		off=$((off + blksz))
+	done
+
+	sync $file
+}
+
+populate_expected_data() {
+	# create a dummy file with expected old data for different cases
+	create_mixed_mappings $testfile.exp_old_mixed $awu_max
+	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
+
+	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
+	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
+
+	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
+	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
+
+	# create a dummy file with expected new data
+	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
+	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
+}
+
+verify_data_blocks() {
+	local verify_start=$1
+	local verify_end=$2
+	local expected_data_old="$3"
+	local expected_data_new="$4"
+
+	echo >> $seqres.full
+	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
+
+	# After an atomic write, for every chunk we ensure that the underlying
+	# data is either the old data or new data as writes shouldn't get torn.
+	local off=$verify_start
+	while [[ "$off" -lt "$verify_end" ]]
+	do
+		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
+		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
+		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
+		then
+			echo "Checksum match failed at off: $off size: $awu_max"
+			echo "Expected contents: (Either of the 2 below):"
+			echo
+			echo "Expected old: "
+			echo "$expected_data_old"
+			echo
+			echo "Expected new: "
+			echo "$expected_data_new"
+			echo
+			echo "Actual contents: "
+			echo "$actual_data"
+
+			_fail
+		fi
+		echo -n "Check at offset $off succeeded! " >> $seqres.full
+		if [[ "$actual_data" == "$expected_data_new" ]]
+		then
+			echo "matched new" >> $seqres.full
+		elif [[ "$actual_data" == "$expected_data_old" ]]
+		then
+			echo "matched old" >> $seqres.full
+		fi
+		off=$(( off + awu_max ))
+	done
+}
+
+# test data integrity for file by shutting down in between atomic writes
+test_data_integrity() {
+	echo >> $seqres.full
+	echo "# Writing atomically to file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	# Wait for at least the first write to be recorded
+	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid
+	wait $awloop_pid
+	unset awloop_pid
+
+	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	cat $tmp.aw >> $seqres.full
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+	# verify all the blocks around which the shutdown happened
+	verify_start=$(( last_offset - (awu_max * 5)))
+	if [[ $verify_start -lt 0 ]]
+	then
+		verify_start=0
+	fi
+
+	verify_end=$(( last_offset + (awu_max * 5)))
+	if [[ "$verify_end" -gt "$filesize" ]]
+	then
+		verify_end=$filesize
+	fi
+}
+
+# test data integrity for file with written and unwritten mappings
+test_data_integrity_mixed() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with mixed mappings" >> $seqres.full
+	create_mixed_mappings $testfile $filesize
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
+}
+
+# test data integrity for file with completely written mappings
+test_data_integrity_writ() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully written mapping" >> $seqres.full
+	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
+}
+
+# test data integrity for file with completely unwritten mappings
+test_data_integrity_unwrit() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
+	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+# test data integrity for file with no mappings
+test_data_integrity_hole() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with no mappings" >> $seqres.full
+	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+test_filesize_integrity() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Performing extending atomic writes over file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	# Wait for at least the first write to be recorded
+	while [ ! -f "$tmp.aw" ]; do sleep 0.2; done
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid
+	wait $awloop_pid
+	unset awloop_pid
+
+	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	cat $tmp.aw >> $seqres.full
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+	local filesize=$(_get_filesize $testfile)
+	echo >> $seqres.full
+	echo "# Filesize after shutdown: $filesize" >> $seqres.full
+
+	# To confirm that the write went atomically, we check:
+	# 1. The last block should be a multiple of awu_max
+	# 2. The last block should be the completely new data
+
+	if (( $filesize % $awu_max ))
+	then
+		_fail "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
+	fi
+
+	verify_start=$(( filesize - (awu_max * 5)))
+	if [[ $verify_start -lt 0 ]]
+	then
+		verify_start=0
+	fi
+
+	local verify_end=$filesize
+
+	# Here the blocks should always match new data hence, for simplicity of
+	# code, just corrupt the $expected_data_old buffer so it never matches
+	local expected_data_old="POISON"
+	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
+}
+
+$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+echo >> $seqres.full
+echo "# Populating expected data buffers" >> $seqres.full
+populate_expected_data
+
+# Loop 20 times to shake out any races due to shutdown
+for ((iter=0; iter<20; iter++))
+do
+	echo >> $seqres.full
+	echo "------ Iteration $iter ------" >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
+	test_data_integrity_mixed
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
+	test_data_integrity_writ
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
+	test_data_integrity_unwrit
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
+	test_data_integrity_hole
+
+	echo >> $seqres.full
+	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
+	test_filesize_integrity
+done
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1230.out b/tests/generic/1230.out
new file mode 100644
index 00000000..d01f54ea
--- /dev/null
+++ b/tests/generic/1230.out
@@ -0,0 +1,2 @@
+QA output created by 1230
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (8 preceding siblings ...)
  2025-06-26 11:59 ` [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-06-26 11:59 ` Ojaswin Mujoo
  2025-06-26 11:59 ` [PATCH v2 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:59 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

We brute force all possible blocksize and clustersize combinations on
a bigalloc filesystem to stress atomic writes using the fio data crc
verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
a single $SCRATCH_MNT/test-file. With atomic writes, this test ensures
that we never see a mix of data contents from different threads in any
given bsrange.
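For reference, a minimal fio job combining atomic writes with crc32c
verification might look like the sketch below; all values are
illustrative only, and the test itself generates the real job file in
its create_fio_config helper:

```ini
; Illustrative fio job, not the one generated by the test.
[aw-verify]
direct=1            ; O_DIRECT I/O
ioengine=libaio
rw=randwrite
bs=4096             ; one atomic write unit per I/O
atomic=1            ; issue writes with torn-write protection
verify=crc32c       ; store and check a crc32c per block
verify_fatal=1      ; abort on the first verification failure
size=16m
filename=test-file
```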

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/061     | 97 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/061.out |  2 +
 2 files changed, 99 insertions(+)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out

diff --git a/tests/ext4/061 b/tests/ext4/061
new file mode 100755
index 00000000..1c600c70
--- /dev/null
+++ b/tests/ext4/061
@@ -0,0 +1,97 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 061
+#
+# Brute force all possible blocksize and clustersize combinations on a bigalloc
+# filesystem for stressing atomic write using fio data crc verifier. We run
+# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
+# we should never see the mix of data contents from different threads for any
+# given fio blocksize.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100*1024*1024))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_config()
+{
+cat >$fio_config <<EOF
+	[aio-dio-aw-verify]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_dump=0
+	verify_backlog=1024
+	verify_async=4
+	verify_write_sequence=0
+	atomic=1
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+create_fio_config
+_require_fio $fio_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	# cluster sizes above 16 x blocksize are experimental so avoid them.
+	# Also, cap the cluster size at 128k to keep it reasonable for large
+	# block sizes.
+	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+				touch $SCRATCH_MNT/f1
+				create_fio_config
+
+				cat $fio_config >> $seqres.full
+
+				$FIO_PROG $fio_config --output=$fio_out
+				ret=$?
+				cat $fio_out >> $seqres.full
+
+				_scratch_unmount
+
+				[[ $ret -eq 0 ]] || _fail "fio with atomic write failed"
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/061.out b/tests/ext4/061.out
new file mode 100644
index 00000000..273be9e0
--- /dev/null
+++ b/tests/ext4/061.out
@@ -0,0 +1,2 @@
+QA output created by 061
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (9 preceding siblings ...)
  2025-06-26 11:59 ` [PATCH v2 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-06-26 11:59 ` Ojaswin Mujoo
  2025-06-26 11:59 ` [PATCH v2 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:59 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

Brute force all possible blocksize and clustersize combinations on a bigalloc
filesystem for stressing atomic writes using the fio data crc verifier. We run
multiple threads in parallel with each job writing to its own file. The
parallel jobs running on a constrained filesystem size ensure that we stress
the ext4 allocator to allocate contiguous extents.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/062     | 121 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/062.out |   2 +
 2 files changed, 123 insertions(+)
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out

diff --git a/tests/ext4/062 b/tests/ext4/062
new file mode 100755
index 00000000..d59701c8
--- /dev/null
+++ b/tests/ext4/062
@@ -0,0 +1,121 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 062
+#
+# Brute force all possible blocksize clustersize combinations on a bigalloc
+# filesystem for stressing atomic writes using the fio data crc verifier. We
+# run nproc * $LOAD_FACTOR threads in parallel, with 8 parallel jobs each
+# writing to its own $SCRATCH_MNT/testfile-jobN file. The jobs run on a
+# constrained filesystem size so that the ext4 allocator is stressed to
+# allocate contiguous extents.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FSSIZE=$((360*1024*1024))
+FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_config()
+{
+cat >$fio_config <<EOF
+	[global]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=truncate
+	size=$((FSSIZE / 12))
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_dump=0
+	verify_backlog=1024
+	verify_async=4
+	verify_write_sequence=0
+	atomic=1
+
+	[job1]
+	filename=$SCRATCH_MNT/testfile-job1
+
+	[job2]
+	filename=$SCRATCH_MNT/testfile-job2
+
+	[job3]
+	filename=$SCRATCH_MNT/testfile-job3
+
+	[job4]
+	filename=$SCRATCH_MNT/testfile-job4
+
+	[job5]
+	filename=$SCRATCH_MNT/testfile-job5
+
+	[job6]
+	filename=$SCRATCH_MNT/testfile-job6
+
+	[job7]
+	filename=$SCRATCH_MNT/testfile-job7
+
+	[job8]
+	filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+create_fio_config
+_require_fio $fio_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	# cluster sizes above 16 x blocksize are experimental so avoid them.
+	# Also, cap the cluster size at 128k to keep it reasonable for large
+	# block size cases.
+	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+				touch $SCRATCH_MNT/f1
+				create_fio_config
+
+				cat $fio_config >> $seqres.full
+
+				$FIO_PROG $fio_config --output=$fio_out
+				ret=$?
+				cat $fio_out >> $seqres.full
+
+				_scratch_unmount
+
+				[[ $ret -eq 0 ]] || _fail "fio with atomic write failed"
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/062.out b/tests/ext4/062.out
new file mode 100644
index 00000000..a1578f48
--- /dev/null
+++ b/tests/ext4/062.out
@@ -0,0 +1,2 @@
+QA output created by 062
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 12/13] ext4/063: Atomic write test for extent split across leaf nodes
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (10 preceding siblings ...)
  2025-06-26 11:59 ` [PATCH v2 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-06-26 11:59 ` Ojaswin Mujoo
  2025-06-26 11:59 ` [PATCH v2 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
  2025-06-27 13:56 ` [PATCH v2 00/13] Add more tests for multi fs block atomic writes John Garry
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:59 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

In ext4, even if an allocated range is physically and logically
contiguous, it can still be split into 2 extents, because ext4 does
not merge extents across leaf nodes. This is an issue for atomic
writes: even for a contiguous extent, the block mapping could (in rare
cases) return a shorter map, tearing the write. This test creates
such a file and ensures that atomic writes handle this case
correctly.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/063     | 125 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/063.out |   2 +
 2 files changed, 127 insertions(+)
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out

diff --git a/tests/ext4/063 b/tests/ext4/063
new file mode 100755
index 00000000..25b5693d
--- /dev/null
+++ b/tests/ext4/063
@@ -0,0 +1,125 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# In ext4, even if an allocated range is physically and logically contiguous,
+# it can still be split into 2 extents, because ext4 does not merge extents
+# across leaf nodes. This is an issue for atomic writes since even for a
+# contiguous extent the block mapping could (in rare cases) return a shorter
+# map, hence tearing the write. This test creates such a file and ensures
+# that the atomic write handles this case correctly.
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_command "$DEBUGFS_PROG" debugfs
+
+prep() {
+	local bs=`_get_block_size $SCRATCH_MNT`
+	local ex_hdr_bytes=12
+	local ex_entry_bytes=12
+	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
+
+	# fill the extent tree leaf which bs len extents at alternate offsets. For example,
+	# for 4k bs the tree should look as follows
+	#
+	#                  +---------+---------+
+	#                  | index 1 | index 2 |
+	#                  +-----+---+-----+---+
+	#               +--------+         +-------+
+	#               |                          |
+	#    +----------+--------------+     +-----+-----+
+	#    | ex 1 | ex 2 |... | ex n |     |  ex n + 1 |
+	#    +-------------------------+     +-----------+
+	#    0      2            680          682
+	for i in $(seq 0 $entries_per_blk)
+	do
+		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
+	done
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
+	# this is a new FS the allocator would find continuous blocks such that
+	# ex(n) ex(new) ex(n+1) are physically(and logically) contiguous. However,
+	# since extents are not merged across leaf nodes, we will end up with a tree as:
+	#
+	#                  +---------+---------+
+	#                  | index 1 | index 2 |
+	#                  +-----+---+-----+---+
+	#               +--------+         +-------+
+	#               |                          |
+	#    +----------+--------------+     +-----+-----+
+	#    | ex 1 | ex 2 |... | ex n |     | ex merged |
+	#    +-------------------------+     +-----------+
+	#    0      2            680          681  682  684
+	#
+	echo >> $seqres.full
+	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
+	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	_scratch_cycle_mount
+}
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+awu_max=$(_get_atomic_write_unit_max $testfile)
+
+echo >> $seqres.full
+echo "# Prepping the file" >> $seqres.full
+prep
+
+torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
+
+echo >> $seqres.full
+echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
+echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+
+echo >> $seqres.full
+echo "Extent state after atomic write:">> $seqres.full
+echo "...">> $seqres.full
+$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+echo >> $seqres.full
+echo "# Checking data integrity" >> $seqres.full
+
+# create a dummy file with expected data
+$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
+expected_data=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp)
+
+# We ensure that the data after atomic writes should match the expected data
+actual_data=$(od -An -t x1 -j $torn_aw_offset -N $awu_max $testfile)
+if [[ "$actual_data" != "$expected_data" ]]
+then
+	echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
+	echo
+	echo "Expected: "
+	echo "$expected_data"
+	echo
+	echo "Actual contents: "
+	echo "$actual_data"
+
+	_fail
+fi
+
+echo -n "Data verification at offset $torn_aw_offset succeeded!" >> $seqres.full
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/063.out b/tests/ext4/063.out
new file mode 100644
index 00000000..de35fc52
--- /dev/null
+++ b/tests/ext4/063.out
@@ -0,0 +1,2 @@
+QA output created by 063
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 13/13] ext4/064: Add atomic write tests for journal credit calculation
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (11 preceding siblings ...)
  2025-06-26 11:59 ` [PATCH v2 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-06-26 11:59 ` Ojaswin Mujoo
  2025-06-27 13:56 ` [PATCH v2 00/13] Add more tests for multi fs block atomic writes John Garry
  13 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-06-26 11:59 UTC (permalink / raw)
  To: Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, john.g.garry, tytso

Test atomic writes with journal credit calculation. We take 2 cases
here:

1. Atomic writes on single mapping causing tree to collapse into
   the inode
2. Atomic writes on mixed mapping causing tree to collapse into the
   inode

This test is inspired by ext4/034.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/064     | 75 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/064.out |  2 ++
 2 files changed, 77 insertions(+)
 create mode 100755 tests/ext4/064
 create mode 100644 tests/ext4/064.out

diff --git a/tests/ext4/064 b/tests/ext4/064
new file mode 100755
index 00000000..ec31f983
--- /dev/null
+++ b/tests/ext4/064
@@ -0,0 +1,75 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 064
+#
+# Test that proper credit reservation is done when performing
+# a tree collapse during an atomic write based allocation
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick quota fiemap prealloc atomicwrites
+
+# Import common functions.
+
+
+# Modify as appropriate.
+_exclude_fs ext2
+_exclude_fs ext3
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "syncfs"
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+echo "----- Testing with atomic write on non-mixed mapping -----" >> $seqres.full
+
+echo "Format and mount" >> $seqres.full
+_scratch_mkfs  >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+	     -c "fsync" $SCRATCH_MNT/foobar >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 4k 4k" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+echo "----- Testing with atomic write on mixed mapping -----" >> $seqres.full
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar2 with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+	     -c "pwrite 20k 4k"  -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+	     -c "fsync" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 0k 12k" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy2 >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/064.out b/tests/ext4/064.out
new file mode 100644
index 00000000..d9076546
--- /dev/null
+++ b/tests/ext4/064.out
@@ -0,0 +1,2 @@
+QA output created by 064
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc
  2025-06-26 11:58 ` [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-06-26 13:32   ` Theodore Ts'o
  2025-06-30 15:28     ` Darrick J. Wong
  0 siblings, 1 reply; 38+ messages in thread
From: Theodore Ts'o @ 2025-06-26 13:32 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, john.g.garry

On Thu, Jun 26, 2025 at 05:28:53PM +0530, Ojaswin Mujoo wrote:
> This is achieved by defining a new function _set_default_fsx_avoid
> called via run_fsx helper. This can be used to selectively disable
> fsx options based on the configuration.

> +_set_default_fsx_avoid() {
> +       case "$FSTYP" in
> +       "ext4")
> +               if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> +                       export FSX_AVOID+=" -I -C"
> +               fi
> +               ;;

This assumes that MKFS_OPTIONS reflects the file system features
enabled on the file system.  That's true by definition for the scratch
fs, but it's not necessarily true for the test fs.  At least in
theory, someone could run fstests without a scratch device, and then
MKFS_OPTIONS might not be set.

I'm not sure that we care; is this an assumption that we make in
other tests?

					- Ted

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 00/13] Add more tests for multi fs block atomic writes
  2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (12 preceding siblings ...)
  2025-06-26 11:59 ` [PATCH v2 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
@ 2025-06-27 13:56 ` John Garry
  13 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2025-06-27 13:56 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, tytso

On 26/06/2025 12:58, Ojaswin Mujoo wrote:
> Changes:

A small request:

please cc linux-xfs and linux-ext4 lists for these patches in future

thanks

> 
> - (1/13) new patch with _min and _max helpers
> - (2/13) remove setup_fs_options and add fsx specific helper
> - (4/13) skip atomic write instead of falling back to normal write (fsx)
> - (4/13) make atomic write default on instead of default off (fsx)
> - (5,6/13) refactor and cleanup fio tests
> - (7/13) refactored common code
> - (8/13) don't ignore mmap writes for fsx with atomic writes
> - (9/13) use od instead of xxd. handle cleanup of bg threads in _cleanup()
> - (10-13/13) minor refactors
> - change all tests to use _fail for better consistency
> - use higher test numbers for easier merging
> 
> Link to rfc: https://lore.kernel.org/fstests/cover.1749629233.git.ojaswin@linux.ibm.com/
> 
> PS: I'm on vacation till next week so there might be a delay in response
> 
> Ojaswin Mujoo (9):
>    common/rc: Add _min() and _max() helpers
>    common/rc: Fix fsx for ext4 with bigalloc
>    common/rc: Add a helper to run fsx on a given file
>    ltp/fsx.c: Add atomic writes support to fsx
>    generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
>    generic/1229: Stress fsx with atomic writes enabled
>    generic/1230: Add sudden shutdown tests for multi block atomic writes
>    ext4/063: Atomic write test for extent split across leaf nodes
>    ext4/064: Add atomic write tests for journal credit calculation


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-06-26 11:58 ` [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-06-27 14:09   ` John Garry
  2025-07-01 16:18     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-06-27 14:09 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, tytso


> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> +
> +# XFS can have high awu_max_write due to software fallback. Cap it at 64k

This test fails on xfs for this reason: software fallback-based
atomic writes have no serialization against reads or other writes.

Can you cap at the atomic write unit max opt? That will mean that we get
serialization from atomic write HW support.

BTW, to repeat what I said before, it can also fail for atomic write 
bios using HW support, as reads may be split.

See this sample output:

generic/1226 68s ... - output mismatch (see 
/home/opc/xfstests-dev/results//generic/1226.out.bad)
     --- tests/generic/1226.out  2025-06-27 11:26:58.411121674 +0000
     +++ /home/opc/xfstests-dev/results//generic/1226.out.bad 
2025-06-27 14:03:08.933811064 +0000
     @@ -1,2 +1,104 @@
      QA output created by 1226
     +crc32c: verify failed at file /home/opc/mnt/scratch/test-file 
offset 16457728, length 8192 (requested block: offset=16457728, 
length=8192, flags=84)
     +       Expected CRC: 712de900
     +       Received CRC: f872a18d
     +crc32c: verify failed at file /home/opc/mnt/scratch/test-file 
offset 88104960, length 8192 (requested block: offset=88104960, 
length=8192, flags=84)
     +       Expected CRC: 4aaaf009
     +       Received CRC: 2a5770cf
     ...
     (Run 'diff -u /home/opc/xfstests-dev/tests/generic/1226.out 
/home/opc/xfstests-dev/results//generic/1226.out.bad'  to see the entire 
diff)
Ran: generic/1226
Failures: generic/1226
Failed 1 of 1 tests

[root@jgarry-ol9new xfstests-dev]# cat /sys/block/sdi/queue/max_sectors_kb
4
[root@jgarry-ol9new xfstests-dev]#

However, nobody should fiddle with max_sectors_kb, so the test still has 
value.


> +blocksize=$(_min "$blocksize" "65536")
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
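For reference, the quoted blocksize selection can be traced end to end with the _min/_max helpers from patch 1/13. This is a standalone sketch; the awu values below are made-up examples, not output from any real device:

```shell
#!/bin/bash
# Standalone sketch of the quoted blocksize computation. The awu values
# are illustrative; _min/_max mirror the helpers added in patch 1/13.
_min() { [ "$1" -le "$2" ] && echo "$1" || echo "$2"; }
_max() { [ "$1" -ge "$2" ] && echo "$1" || echo "$2"; }

awu_min_write=4096        # example: 4k minimum atomic write unit
awu_max_write=1048576     # example: XFS software fallback reporting 1M
blocksize=$(_max "$awu_min_write" "$((awu_max_write / 2))")
blocksize=$(_min "$blocksize" 65536)   # the 64k cap under discussion
echo "$blocksize"   # prints 65536 with these example values
```

With these example values the fallback-inflated awu_max_write/2 (512k) gets clamped to 64k, which is exactly the cap the comment in the test justifies.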

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
  2025-06-26 11:58 ` [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-06-27 14:48   ` John Garry
  0 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2025-06-27 14:48 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, tytso

On 26/06/2025 12:58, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> 
> This test uses fio to first create a file with mixed mappings. Then it
> does atomic writes using aio dio with parallel jobs to the same file
> with mixed mappings. This forces the filesystem allocator to allocate
> extents over mixed mapping regions to stress FS block allocators.
> 
> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>

I think that the same comments I had for 1226 apply here, but I also 
get this error on xfs:

[root@jgarry-ol9new xfstests-dev]# diff -u 
/home/opc/xfstests-dev/tests/generic/1227.out 
/home/opc/xfstests-dev/results//generic/1227.out.bad
--- /home/opc/xfstests-dev/tests/generic/1227.out       2025-06-27 
11:26:58.411121674 +0000
+++ /home/opc/xfstests-dev/results//generic/1227.out.bad 
2025-06-27 14:46:24.931649292 +0000
@@ -1,2 +1,3 @@
  QA output created by 1227
+/home/opc/xfstests-dev/tests/generic/1227: line 15: 
_require_scratch_write_atomic_multi_fsblock: command not found
  Silence is golden


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes
  2025-06-26 11:59 ` [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-06-27 16:11   ` John Garry
  2025-07-01  6:34     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-06-27 16:11 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests; +Cc: Ritesh Harjani, djwong, tytso

On 26/06/2025 12:59, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi-block atomic writes
> maintain atomic guarantees across sudden FS shutdowns.
> 
> The way this works is that we lay out a file with a random mix of written,
> unwritten and hole extents. Then we start performing atomic writes
> sequentially on the file while we shut down the FS in parallel. Then we
> note the last offset where the atomic write happened just before shutdown
> and make sure the blocks around it either have completely old
> data or completely new data, i.e. the write was not torn during shutdown.
> 
> We repeat the same with completely written, completely unwritten and completely
> empty file to ensure these cases are not torn either.  Finally, we have a
> similar test for append atomic writes
> 
> Suggested-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>

this seems to work ok for xfs, as I get data verify errors when I remove 
the -A arg to xfs_io when doing the atomic writes.

But I see this (always):

@@ -1,2 +1,83 @@
  QA output created by 1230
+/home/opc/xfstests-dev/tests/generic/1230: line 13: 
_require_scratch_write_atomic_multi_fsblock: command not found
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (535173) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (535256) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (535339) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (535419) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (560146) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (560229) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (560312) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (560392) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (585304) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (585387) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (585470) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (585550) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (610300) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (610384) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (610468) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (610548) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (635276) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (635359) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (635442) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (635523) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (660400) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (660483) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (660566) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (660646) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (685377) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (685460) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (685543) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (685624) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (710361) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (710444) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (710527) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (710607) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (735333) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (735417) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (735500) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (735580) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (760306) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (760390) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (760473) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (760553) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (785276) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (785359) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (785442) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (785522) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (810249) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (810332) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (810415) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (810495) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (835221) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (835304) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (835388) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (835469) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (860199) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (860282) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (860365) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (860445) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (885169) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (885253) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (885336) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (885416) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (910140) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (910223) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (910307) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (910387) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (935110) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (935193) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (935276) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (935356) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (960079) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (960162) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (960245) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (960326) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (985052) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (985135) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (985218) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (985299) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (1010026) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (1010109) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (1010192) - 
No such process
+/home/opc/xfstests-dev/tests/generic/1230: line 247: kill: (1010272) - 
No such process
  Silence is golden
[root@jgarry-ol9new xfstests-dev]#


any idea (apart from _require_scratch_write_atomic_multi_fsblock)?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc
  2025-06-26 13:32   ` Theodore Ts'o
@ 2025-06-30 15:28     ` Darrick J. Wong
  2025-07-01  6:26       ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: Darrick J. Wong @ 2025-06-30 15:28 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Ojaswin Mujoo, Zorro Lang, fstests, Ritesh Harjani, john.g.garry

On Thu, Jun 26, 2025 at 09:32:39AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 26, 2025 at 05:28:53PM +0530, Ojaswin Mujoo wrote:
> > This is achieved by defining a new function _set_default_fsx_avoid
> > called via run_fsx helper. This can be used to selectively disable
> > fsx options based on the configuration.
> 
> > +_set_default_fsx_avoid() {
> > +       case "$FSTYP" in
> > +       "ext4")
> > +               if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > +                       export FSX_AVOID+=" -I -C"
> > +               fi
> > +               ;;
> 
> This assumes that MKFS_OPTIONS reflects the file system features
> enabled on the file system.  That's true by definition for the scratch
> fs, but it's not necessarily true for the test fs.  At least in
> theory, someone could run fstests without a scratch device, and then
> MKFS_OPTIONS might not be set.
> 
> I'm not sure that we care; is this an assumption that we make in
> other tests?

/me suspects it ought to be checking dumpe2fs -h output.

--D

> 
> 					- Ted
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc
  2025-06-30 15:28     ` Darrick J. Wong
@ 2025-07-01  6:26       ` Ojaswin Mujoo
  2025-07-02 15:13         ` Darrick J. Wong
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-01  6:26 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Theodore Ts'o, Zorro Lang, fstests, Ritesh Harjani,
	john.g.garry

On Mon, Jun 30, 2025 at 08:28:57AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 26, 2025 at 09:32:39AM -0400, Theodore Ts'o wrote:
> > On Thu, Jun 26, 2025 at 05:28:53PM +0530, Ojaswin Mujoo wrote:
> > > This is achieved by defining a new function _set_default_fsx_avoid
> > > called via run_fsx helper. This can be used to selectively disable
> > > fsx options based on the configuration.
> > 
> > > +_set_default_fsx_avoid() {
> > > +       case "$FSTYP" in
> > > +       "ext4")
> > > +               if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > > +                       export FSX_AVOID+=" -I -C"
> > > +               fi
> > > +               ;;
> > 
> > This assumes that MKFS_OPTIONS reflects the file system features
> > enabled on the file system.  That's true by definition for the scratch
> > fs, but it's not necessarily true for the test fs.  At least in
> > theory, someone could run fstests without a scratch device, and then
> > MKFS_OPTIONS might not be set.
> > 
> > I'm not sure that we care; is this an assumption that we make in
> > other tests?
> 
> /me suspects it ought to be checking dumpe2fs -h output.
> 
> --D

Hey Ted, Darrick,

I agree that MKFS_OPTIONS is not ideal because it won't reflect the
TEST_DEV options. Most of the existing users of the run_fsx helper
actually use the TEST_DEV instead, so Darrick's suggestion of using
dumpe2fs makes sense.

I can use that approach in v3 if that sounds okay to everyone.
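For concreteness, the dumpe2fs-based check could look roughly like this. The helper names here are hypothetical (not part of the series); only the "Filesystem features:" line of "dumpe2fs -h" output is relied on:

```shell
#!/bin/bash
# Sketch of detecting on-disk ext4 features via dumpe2fs instead of
# grepping MKFS_OPTIONS. Helper names are illustrative.

# Return 0 if the whitespace-separated feature list $1 contains $2.
_features_contain() {
	case " $1 " in
	*" $2 "*) return 0 ;;
	*) return 1 ;;
	esac
}

# Extract the on-disk feature list of an ext4 device.
_ext4_features() {
	dumpe2fs -h "$1" 2>/dev/null | \
		sed -n 's/^Filesystem features:[[:space:]]*//p'
}

# _set_default_fsx_avoid could then check the actual filesystem:
#	if _features_contain "$(_ext4_features "$TEST_DEV")" bigalloc; then
#		export FSX_AVOID+=" -I -C"
#	fi
```

This way the check reflects the filesystem actually mounted, whether or not MKFS_OPTIONS is set.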

Regards,
ojaswin

> 
> > 
> > 					- Ted
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes
  2025-06-27 16:11   ` John Garry
@ 2025-07-01  6:34     ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-01  6:34 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Fri, Jun 27, 2025 at 05:11:12PM +0100, John Garry wrote:
> On 26/06/2025 12:59, Ojaswin Mujoo wrote:
> > This test is intended to ensure that multi-block atomic writes
> > maintain atomic guarantees across sudden FS shutdowns.
> > 
> > The way this works is that we lay out a file with a random mix of written,
> > unwritten and hole extents. Then we start performing atomic writes
> > sequentially on the file while we shut down the FS in parallel. Then we
> > note the last offset where the atomic write happened just before shutdown
> > and make sure the blocks around it either have completely old
> > data or completely new data, i.e. the write was not torn during shutdown.
> > 
> > We repeat the same with completely written, completely unwritten and completely
> > empty file to ensure these cases are not torn either.  Finally, we have a
> > similar test for append atomic writes
> > 
> > Suggested-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> 
> this seems to work ok for xfs, as I get data verify errors when I remove the
> -A arg to xfs_io when doing the atomic writes.
> 
> But I see this (always):
> 
> @@ -1,2 +1,83 @@
>  QA output created by 1230
> +/home/opc/xfstests-dev/tests/generic/1230: line 13:
> _require_scratch_write_atomic_multi_fsblock: command not found

Hi John,

I missed mentioning that these patches are based on [1] which provides
this command.

[1] https://lore.kernel.org/fstests/20250626002735.22827-1-catherine.hoang@oracle.com/T/#t

> +/home/opc/xfstests-dev/tests/generic/1230: line 149: kill: (535173) - No
> such process

Hmm so we follow the flow like:

1. spawn a thread A that does mixed atomic writes in bg
2. sleep 0.2s
3. shutdown and kill A

Seems like in your case, the disk may be fast enough to actually finish
the task within the 0.2s we sleep for.

I'll try to tune this better or see if we can just remove the sleep. In
the meantime, if you can share the seqres.full and out.bad file for this run
I'll be able to confirm if my analysis is correct.
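Independent of the timing, the "No such process" noise can be silenced by probing and reaping pids defensively. A rough sketch (the array and helper names are illustrative, not the actual test code):

```shell
#!/bin/bash
# Race-tolerant cleanup of background writer processes.
# bg_pids and _cleanup_bg are made-up names; only kill/wait semantics matter.
bg_pids=()

_cleanup_bg() {
	local pid
	for pid in "${bg_pids[@]}"; do
		# kill -0 only probes for existence; the stderr redirects
		# swallow the race where the pid exits in between the calls.
		if kill -0 "$pid" 2>/dev/null; then
			kill "$pid" 2>/dev/null
		fi
		# reap it so no zombie (and no job-control noise) is left
		wait "$pid" 2>/dev/null || :
	done
}
```

With this, a writer that already finished inside the 0.2s window is simply skipped instead of producing "kill: ... No such process" in the output.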

Regards,
ojaswin

> any idea (apart from _require_scratch_write_atomic_multi_fsblock)?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-06-27 14:09   ` John Garry
@ 2025-07-01 16:18     ` Ojaswin Mujoo
  2025-07-02  7:46       ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-01 16:18 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Fri, Jun 27, 2025 at 03:09:09PM +0100, John Garry wrote:
> 
> > +touch "$SCRATCH_MNT/f1"
> > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> > +
> > +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> 
> This test fails on xfs due for this reason. software fallback -based atomic
> writes have no serialization against reads or other writes.

Okay, so I'm still not understanding why serialization is affecting
anything here. Let's assume the IO never breaks: from what I understand
of the fio code, we do the write and then, once it is complete, we try
to read it back and verify it.

Why does serialization matter here? Even if the device reorders our read
before the atomic write, we still fetch an older $blocksize chunk which
should have the data and the header. The data crc should match the
header because it would have been written atomically. What am I missing?

(Or do you mean that max_sectors_kb is not big enough to support
reads after software atomic writes and thats the issue?)

> 
> Can you cap at atomic write unit max opt? That will mean that we get
> serialization from atomic write HW support.
> 
> BTW, to repeat what I said before, it can also fail for atomic write bios
> using HW support, as reads may be split.

This one I can understand: fio's read is split, causing a short read,
and it seems like fio is not requeueing the short read properly for the
verify step(?), so it tries to verify against the short read itself.
> 
> See this sample output:
> 
> generic/1226 68s ... - output mismatch (see
> /home/opc/xfstests-dev/results//generic/1226.out.bad)
>     --- tests/generic/1226.out  2025-06-27 11:26:58.411121674 +0000
>     +++ /home/opc/xfstests-dev/results//generic/1226.out.bad 2025-06-27
> 14:03:08.933811064 +0000
>     @@ -1,2 +1,104 @@
>      QA output created by 1226
>     +crc32c: verify failed at file /home/opc/mnt/scratch/test-file offset
> 16457728, length 8192 (requested block: offset=16457728, length=8192,
> flags=84)
>     +       Expected CRC: 712de900
>     +       Received CRC: f872a18d
>     +crc32c: verify failed at file /home/opc/mnt/scratch/test-file offset
> 88104960, length 8192 (requested block: offset=88104960, length=8192,
> flags=84)
>     +       Expected CRC: 4aaaf009
>     +       Received CRC: 2a5770cf
>     ...
>     (Run 'diff -u /home/opc/xfstests-dev/tests/generic/1226.out
> /home/opc/xfstests-dev/results//generic/1226.out.bad'  to see the entire
> diff)
> Ran: generic/1226
> Failures: generic/1226
> Failed 1 of 1 tests
> 
> [root@jgarry-ol9new xfstests-dev]# cat /sys/block/sdi/queue/max_sectors_kb
> 4
> [root@jgarry-ol9new xfstests-dev]#
> 
> However, nobody should fiddle with max_sectors_kb, so the test still has
> value.
> 
> 
> > +blocksize=$(_min "$blocksize" "65536")
> > +
> > +fio_config=$tmp.fio
> > +fio_out=$tmp.fio.out
> > +

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-01 16:18     ` Ojaswin Mujoo
@ 2025-07-02  7:46       ` John Garry
  2025-07-03  6:42         ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-02  7:46 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 01/07/2025 17:18, Ojaswin Mujoo wrote:
> On Fri, Jun 27, 2025 at 03:09:09PM +0100, John Garry wrote:
>>
>>> +touch "$SCRATCH_MNT/f1"
>>> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
>>> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
>>> +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
>>> +
>>> +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
>>
>> This test fails on xfs due for this reason. software fallback -based atomic
>> writes have no serialization against reads or other writes.
> 
> Okay, so I'm still not understanding why serialization is affecting
> anything here. Let's assume the IO never breaks,

That's a false assumption.

An atomic write bio with REQ_ATOMIC set should never be split into
multiple requests. However, regular reads and writes may be split in the
block stack for any reason. We can easily force a split in regular reads 
and writes by setting max_sectors_kb to a value less than the read/write 
size.

Check blk_mq_submit_bio() -> __bio_split_to_limits() -> bio_split_rw()
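To make the splitting concrete (the sysfs device name is a placeholder and the helper below is just demonstration arithmetic, not kernel code):

```shell
#!/bin/bash
# Illustration of the read splitting described above.
#
# Forcing splits of plain reads/writes (REQ_ATOMIC bios are exempt):
#	echo 4 > /sys/block/sdX/queue/max_sectors_kb

# Number of requests a plain I/O of $1 KiB becomes with max_sectors_kb=$2.
splits() {
	local io_kb=$1 max_kb=$2
	echo $(( (io_kb + max_kb - 1) / max_kb ))
}

splits 8 4    # the 8k reads in the failure above become two 4k requests
```

So with max_sectors_kb=4, each 8k verify read is serviced as two requests, and an atomic write can land between them.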

> from what I understand
> in fio code, we do the write and then once it is complete we try to read
> it back and verify it.
> 
> Why does serialization matter here? Even if the device reorders our read
> before the atomic write, we still fetch an older $blocksize chunk which
> should have the data and the header.

Yes, we fetch $blocksize, but it may be done as multiple read requests,
so we can have:

partial read $blocksize request, atomic write request, partial read 
$blocksize request

> The data crc should match the
> header because it would have gone atomically. What am I missing?
> 
> (Or do you mean that max_sectors_kb is not big enough to support
> reads after software atomic writes and thats the issue?)
> 
>>
>> Can you cap at atomic write unit max opt? That will mean that we get
>> serialization from atomic write HW support.
>>
>> BTW, to repeat what I said before, it can also fail for atomic write bios
>> using HW support, as reads may be split.
> 
> This one I can understand, the fio's read is split causing a short read
> and seems like fio is not requeing short read properly for the verify
> step(?) so it tries to verify against short read itself.

this is not an fio problem, as above.

I would like to repeat that the test should be ok to use, as long as we 
don't try to use an atomic write size > HW support (and we also have 
max_sectors_kb > $blocksize, which it would normally be).

Thanks,
John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc
  2025-07-01  6:26       ` Ojaswin Mujoo
@ 2025-07-02 15:13         ` Darrick J. Wong
  0 siblings, 0 replies; 38+ messages in thread
From: Darrick J. Wong @ 2025-07-02 15:13 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Theodore Ts'o, Zorro Lang, fstests, Ritesh Harjani,
	john.g.garry

On Tue, Jul 01, 2025 at 11:56:03AM +0530, Ojaswin Mujoo wrote:
> On Mon, Jun 30, 2025 at 08:28:57AM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 26, 2025 at 09:32:39AM -0400, Theodore Ts'o wrote:
> > > On Thu, Jun 26, 2025 at 05:28:53PM +0530, Ojaswin Mujoo wrote:
> > > > This is achieved by defining a new function _set_default_fsx_avoid
> > > > called via run_fsx helper. This can be used to selectively disable
> > > > fsx options based on the configuration.
> > > 
> > > > +_set_default_fsx_avoid() {
> > > > +       case "$FSTYP" in
> > > > +       "ext4")
> > > > +               if [[ "$MKFS_OPTIONS" =~ bigalloc ]]; then
> > > > +                       export FSX_AVOID+=" -I -C"
> > > > +               fi
> > > > +               ;;
> > > 
> > > This assumes that MKFS_OPTIONS reflects the file system features
> > > enabled on the file system.  That's true by definition for the scratch
> > > fs, but it's not necessarily true for the test fs.  At least in
> > > theory, someone could run fstests without a scratch device, and then
> > > MKFS_OPTIONS might not be set.
> > > 
> > > I'm not sure that we care; is this an assumption that we make in
> > > other tests?
> > 
> > /me suspects it ought to be checking dumpe2fs -h output.
> > 
> > --D
> 
> Hey Ted, Darrick,
> 
> I agree that MKFS_OPTIONS is not ideal because it won't reflect the
> TEST_DEV options. Most of the existing users of the run_fsx helper
> actually use the TEST_DEV instead, so Darrick's suggestion of using
> dumpe2fs makes sense.
> 
> I can use that approach in v3 if that sounds okay to everyone.

Sounds fine to me...

--D

> Regards,
> ojaswin
> 
> > 
> > > 
> > > 					- Ted
> > > 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-02  7:46       ` John Garry
@ 2025-07-03  6:42         ` Ojaswin Mujoo
  2025-07-03 16:26           ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-03  6:42 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Wed, Jul 02, 2025 at 08:46:39AM +0100, John Garry wrote:
> On 01/07/2025 17:18, Ojaswin Mujoo wrote:
> > On Fri, Jun 27, 2025 at 03:09:09PM +0100, John Garry wrote:
> > > 
> > > > +touch "$SCRATCH_MNT/f1"
> > > > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > > > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > > > +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> > > > +
> > > > +# XFS can have high awu_max_write due to software fallback. Cap it at 64k
> > > 
> > > This test fails on xfs due for this reason. software fallback -based atomic
> > > writes have no serialization against reads or other writes.
> > 
> > Okay so Im still not understanding why serialization is affecting
> > anything here? Lets assume the IO never breaks,
> 
> That's a false assumption.
> 
> An atomic write should never be split into multiple requests for REQ_ATOMIC
> set. However, regular reads and writes may be split in the block stack for
> any reason. We can easily force a split in regular reads and writes by
> setting max_sectors_kb to a value less than the read/write size.
> 
> Check blk_mq_submit_bio() -> __bio_split_to_limits() -> bio_split_rw()
> 
> > from what I understand
> > in fio code, we do the write and then once it is complete we try to read
> > it back and verify it.
> > 
> > Why serialization matters here, because even if device reorders our read
> > before the atomic write, we still fetch an older $blocksize chunk which
> > should have the data and the header.
> 
> Yes, we fetch $blocksize, but it may be as multiple read requests,
> so we can have:
> 
> partial read $blocksize request, atomic write request, partial read
> $blocksize request

Ahh, now I get the picture. Since software atomic writes may have a
limit way higher than the max_sectors that a bio may be split at, this
causes the reads to split, which is the issue, as you mentioned.

> 
> > The data crc should match the
> > header because it would have gone atomically. What am I missing?
> > 
> > (Or do you mean that max_sectors_kb is not big enough to support
> > reads after software atomic writes and thats the issue?)
> > 
> > > 
> > > Can you cap at atomic write unit max opt? That will mean that we get
> > > serialization from atomic write HW support.
> > > 
> > > BTW, to repeat what I said before, it can also fail for atomic write bios
> > > using HW support, as reads may be split.
> > 
> > This one I can understand, the fio's read is split causing a short read
> > and seems like fio is not requeing short read properly for the verify
> > step(?) so it tries to verify against short read itself.
> 
> this is not an fio problem, as above.

Yep got it, its the block layer splitting the reads.

So as a solution, do you think it is okay to cap $blocksize to
max_sectors_kb, or would you suggest it is better to just stick to
awu max opt?

Thanks,
ojaswin
> 
> I would like to repeat that the test should be ok to use, as long as we
> don't try to use an atomic write size > HW support (and we also have
> max_sectors_kb > $blocksize, which it would normally be).
> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-03  6:42         ` Ojaswin Mujoo
@ 2025-07-03 16:26           ` John Garry
  2025-07-04 14:35             ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-03 16:26 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 03/07/2025 07:42, Ojaswin Mujoo wrote:
>>> Why serialization matters here, because even if device reorders our read
>>> before the atomic write, we still fetch an older $blocksize chunk which
>>> should have the data and the header.
>> Yes, we fetch $blocksize, but it may be as multiple read requests,
>> so we can have:
>>
>> partial read $blocksize request, atomic write request, partial read
>> $blocksize request
> Ahh now I get the picture. Since software atomic writes might have a
> value way higher than the max_sectors that the bio may split at, this
> causes the reads to split which is the issue happnes, as you mentioned.

This issue also exists for atomic writes based on REQ_ATOMIC.

> 
>>> The data crc should match the
>>> header because it would have gone atomically. What am I missing?
>>>
>>> (Or do you mean that max_sectors_kb is not big enough to support
>>> reads after software atomic writes and thats the issue?)
>>>
>>>> Can you cap at atomic write unit max opt? That will mean that we get
>>>> serialization from atomic write HW support.
>>>>
>>>> BTW, to repeat what I said before, it can also fail for atomic write bios
>>>> using HW support, as reads may be split.
>>> This one I can understand, the fio's read is split causing a short read
>>> and seems like fio is not requeing short read properly for the verify
>>> step(?) so it tries to verify against short read itself.
>> this is not an fio problem, as above.
> Yep got it, its the block layer splitting the reads.
> 
> So as a solution, do you think it is okay to cap  $blocksize to
> max_sectors_kb or would you suggest it would be better to just stick to
> awu max opt?

I tend to think that it is better to cap at awu max opt. If zero, then 
don't do the test.

What awu max opt is reported for ext4 (for awu max > FS block size)? Is 
min(bigalloc size, bdev awu max)?

Thanks,
John


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-03 16:26           ` John Garry
@ 2025-07-04 14:35             ` Ojaswin Mujoo
  2025-07-04 15:23               ` Ojaswin Mujoo
  2025-07-07  8:02               ` John Garry
  0 siblings, 2 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-04 14:35 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Thu, Jul 03, 2025 at 05:26:49PM +0100, John Garry wrote:
> On 03/07/2025 07:42, Ojaswin Mujoo wrote:
> > > > Why serialization matters here, because even if device reorders our read
> > > > before the atomic write, we still fetch an older $blocksize chunk which
> > > > should have the data and the header.
> > > Yes, we fetch $blocksize, but it may be as multiple read requests,
> > > so we can have:
> > > 
> > > partial read $blocksize request, atomic write request, partial read
> > > $blocksize request
> > Ahh now I get the picture. Since software atomic writes might have a
> > value way higher than the max_sectors that the bio may split at, this
> > causes the reads to split which is the issue happnes, as you mentioned.
> 
> This issue also exists for atomic writes based on REQ_ATOMIC.

Yep got it now, thanks.


> 
> > 
> > > > The data crc should match the
> > > > header because it would have gone atomically. What am I missing?
> > > > 
> > > > (Or do you mean that max_sectors_kb is not big enough to support
> > > > reads after software atomic writes and thats the issue?)
> > > > 
> > > > > Can you cap at atomic write unit max opt? That will mean that we get
> > > > > serialization from atomic write HW support.
> > > > > 
> > > > > BTW, to repeat what I said before, it can also fail for atomic write bios
> > > > > using HW support, as reads may be split.
> > > > This one I can understand, the fio's read is split causing a short read
> > > > and seems like fio is not requeing short read properly for the verify
> > > > step(?) so it tries to verify against short read itself.
> > > this is not an fio problem, as above.
> > Yep got it, its the block layer splitting the reads.
> > 
> > So as a solution, do you think it is okay to cap  $blocksize to
> > max_sectors_kb or would you suggest it would be better to just stick to
> > awu max opt?
> 
> I tend to think that it is better to cap at awu max opt. If zero, then don't
> do the test.

Hmm yep this makes sense but seems like we don't expose it yet in
xfs_io. Will need to add that support as well.

> 
> What awu max opt is reported for ext4 (for awu max > FS block size)? Is
> min(bigalloc size, bdev awu max)?

awu_max_opt is returned as 0 for ext4, since 0 indicates there
is no such slow fallback. From commit 

  5d894321c49e fs: add atomic write unit max opt to statx

  ...
  When zero, it means that there is no such performance boundary.
	...

Regards,
ojaswin

> 
> Thanks,
> John
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-04 14:35             ` Ojaswin Mujoo
@ 2025-07-04 15:23               ` Ojaswin Mujoo
  2025-07-07  8:18                 ` John Garry
  2025-07-07  8:02               ` John Garry
  1 sibling, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-04 15:23 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Fri, Jul 04, 2025 at 08:05:36PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jul 03, 2025 at 05:26:49PM +0100, John Garry wrote:
> > On 03/07/2025 07:42, Ojaswin Mujoo wrote:
> > > > > Why serialization matters here, because even if device reorders our read
> > > > > before the atomic write, we still fetch an older $blocksize chunk which
> > > > > should have the data and the header.
> > > > Yes, we fetch $blocksize, but it may be as multiple read requests,
> > > > so we can have:
> > > > 
> > > > partial read $blocksize request, atomic write request, partial read
> > > > $blocksize request
> > > Ahh now I get the picture. Since software atomic writes might have a
> > > value way higher than the max_sectors that the bio may split at, this
> > > causes the reads to split which is the issue happnes, as you mentioned.
> > 
> > This issue also exists for atomic writes based on REQ_ATOMIC.
> 
> Yep got it now, thanks.
> 
> 
> > 
> > > 
> > > > > The data crc should match the
> > > > > header because it would have gone atomically. What am I missing?
> > > > > 
> > > > > (Or do you mean that max_sectors_kb is not big enough to support
> > > > > reads after software atomic writes and thats the issue?)
> > > > > 
> > > > > > Can you cap at atomic write unit max opt? That will mean that we get
> > > > > > serialization from atomic write HW support.
> > > > > > 
> > > > > > BTW, to repeat what I said before, it can also fail for atomic write bios
> > > > > > using HW support, as reads may be split.
> > > > > This one I can understand, the fio's read is split causing a short read
> > > > > and seems like fio is not requeing short read properly for the verify
> > > > > step(?) so it tries to verify against short read itself.
> > > > this is not an fio problem, as above.
> > > Yep got it, its the block layer splitting the reads.
> > > 
> > > So as a solution, do you think it is okay to cap  $blocksize to
> > > max_sectors_kb or would you suggest it would be better to just stick to
> > > awu max opt?
> > 
> > I tend to think that it is better to cap at awu max opt. If zero, then don't
> > do the test.
> 
> Hmm yep this makes sense but seems like we don't expose it yet in
> xfs_io. Will need to add that support as well.

Okay, so thinking about it a bit more, I'm not sure if capping to
awu_max_opt is the right thing to do. 

There is no guarantee in the kernel that awu_max_opt or even awu_max
would be smaller than lim->max_sectors, so why derive our IO size based
on that? Further, handling awu_max vs awu_max_opt and their different
interpretations for XFS vs EXT4 etc. adds unnecessary complexity.

The simplest way seems to be to just limit the $blocksize to 
something like $(_min 16KB awu_max) and hope that 16KB is small enough
not to be split during reads. We anyway have other tests like
generic/1230 that can be used to test larger atomic write sizes.

Thoughts?
> 
> > 
> > What awu max opt is reported for ext4 (for awu max > FS block size)? Is
> > min(bigalloc size, bdev awu max)?
> 
> For awu_max_opt is returned as 0 for ext4, since 0 indicates there
> is no such slow fallback. From commit 
> 
>   5d894321c49e fs: add atomic write unit max opt to statx
> 
>   ...
>   When zero, it means that there is no such performance boundary.
> 	...
> 
> Regards,
> ojaswin
> 
> > 
> > Thanks,
> > John
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-04 14:35             ` Ojaswin Mujoo
  2025-07-04 15:23               ` Ojaswin Mujoo
@ 2025-07-07  8:02               ` John Garry
  1 sibling, 0 replies; 38+ messages in thread
From: John Garry @ 2025-07-07  8:02 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 04/07/2025 15:35, Ojaswin Mujoo wrote:
> On Thu, Jul 03, 2025 at 05:26:49PM +0100, John Garry wrote:
>> On 03/07/2025 07:42, Ojaswin Mujoo wrote:
>>>>> Why serialization matters here, because even if device reorders our read
>>>>> before the atomic write, we still fetch an older $blocksize chunk which
>>>>> should have the data and the header.
>>>> Yes, we fetch $blocksize, but it may be as multiple read requests,
>>>> so we can have:
>>>>
>>>> partial read $blocksize request, atomic write request, partial read
>>>> $blocksize request
>>> Ahh now I get the picture. Since software atomic writes might have a
>>> value way higher than the max_sectors that the bio may split at, this
>>> causes the reads to split which is the issue happnes, as you mentioned.
>>
>> This issue also exists for atomic writes based on REQ_ATOMIC.
> 
> Yep got it now, thanks.
> 
> 
>>
>>>
>>>>> The data crc should match the
>>>>> header because it would have gone atomically. What am I missing?
>>>>>
>>>>> (Or do you mean that max_sectors_kb is not big enough to support
>>>>> reads after software atomic writes and thats the issue?)
>>>>>
>>>>>> Can you cap at atomic write unit max opt? That will mean that we get
>>>>>> serialization from atomic write HW support.
>>>>>>
>>>>>> BTW, to repeat what I said before, it can also fail for atomic write bios
>>>>>> using HW support, as reads may be split.
>>>>> This one I can understand, the fio's read is split causing a short read
>>>>> and seems like fio is not requeing short read properly for the verify
>>>>> step(?) so it tries to verify against short read itself.
>>>> this is not an fio problem, as above.
>>> Yep got it, its the block layer splitting the reads.
>>>
>>> So as a solution, do you think it is okay to cap  $blocksize to
>>> max_sectors_kb or would you suggest it would be better to just stick to
>>> awu max opt?
>>
>> I tend to think that it is better to cap at awu max opt. If zero, then don't
>> do the test.
> 
> Hmm yep this makes sense but seems like we don't expose it yet in
> xfs_io. Will need to add that support as well.

Darrick sent a patch for that.

> 
>>
>> What awu max opt is reported for ext4 (for awu max > FS block size)? Is
>> min(bigalloc size, bdev awu max)?
> 
> For awu_max_opt is returned as 0 for ext4, since 0 indicates there
> is no such slow fallback. From commit
> 
>    5d894321c49e fs: add atomic write unit max opt to statx
> 
>    ...
>    When zero, it means that there is no such performance boundary.
> 	...
> 

Sure, but I think that for a 1x FS block size atomic write unit max,
you can make atomic write unit max opt == atomic write unit max.

Or just always make atomic write unit max opt == atomic write unit max.
I don't think that anyone will care.

BTW, you seem to have weird line endings configured in your client.

Thanks,
John


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-04 15:23               ` Ojaswin Mujoo
@ 2025-07-07  8:18                 ` John Garry
  2025-07-08  6:50                   ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-07  8:18 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 04/07/2025 16:23, Ojaswin Mujoo wrote:
>> Hmm yep this makes sense but seems like we don't expose it yet in
>> xfs_io. Will need to add that support as well.
> Okay so thinking about it a bit more, Im not sure if capping to
> awu_max_opt is the right thing to do.
> 
> There is no guarantee in kernel that awu_max_opt or even awu_max would
> be smaller than lim->max_sectors so why to derive our io size based on
> that. 

Absolutely, there is no guarantee. But generally awu_max would not
exceed lim->max_sectors, whose kernel default is 1280KB, which is quite
large. However, there are other reasons why requests could be split. I
am just giving the simple example of lim->max_sectors, as that is
configurable from userspace (so useful for experimenting).

> Further, handling awu_max vs awu_max_opt and their different
> interpretations for XFS vs EXT4 etc are adding uneccessary complexity.

You could assume awu max opt == awu max for ext4, as we know there are 
no software-based atomic writes. That could be simpler than my previous 
suggestion, which involves a kernel change.

> 
> The simplest way seems to be to just limit the $blocksize to
> something like $(_min 16KB awu_max) and hope that 16KB is small enough
> to not be split during reads. We anyways have other tests like
> generic/1230 that can be used to test larger atomic write sizes
> 
> Thoughts?

As I said at the start, we never had guarantees of serialization of 
reads and atomic writes.

However I still think that this is a useful test. It's just it is 
theoretically possible to give false positives.

You could get the test to read max_sectors_kb, and check whether it is 
greater than the bs. Again, more complexity.

As an alternative, maybe it's better to maintain this test out-of-tree. 
I'm not sure.

Thanks,
John


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-07  8:18                 ` John Garry
@ 2025-07-08  6:50                   ` Ojaswin Mujoo
  2025-07-08 11:11                     ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-08  6:50 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Mon, Jul 07, 2025 at 09:18:04AM +0100, John Garry wrote:
> On 04/07/2025 16:23, Ojaswin Mujoo wrote:
> > > Hmm yep this makes sense but seems like we don't expose it yet in
> > > xfs_io. Will need to add that support as well.
> > Okay so thinking about it a bit more, Im not sure if capping to
> > awu_max_opt is the right thing to do.
> > 
> > There is no guarantee in kernel that awu_max_opt or even awu_max would
> > be smaller than lim->max_sectors so why to derive our io size based on
> > that.
> 
> Absolutely. There is no guarantee. But generally it would not be less than
> lim->max_sectors. But lim->max_sectors kernel default is 1280KB, which is
> quite large. However, there are other reasons why splits could be broken. I
> am just giving the simple example in lim->max_sectors, as that is
> configurable from userspace (so useful for experimenting).

Got it.
> 
> > Further, handling awu_max vs awu_max_opt and their different
> > interpretations for XFS vs EXT4 etc are adding uneccessary complexity.
> 
> You could assume awu max opt == awu max for ext4, as we no there are no
> software-based atomic writes. That could be simpler than my previous
> suggestion, which involves a kernel change.

I want to avoid adding something like 

iosize = FSTYP == "ext4" ? awu_max : awu_max_opt

in the test itself, because it is a bit difficult to remember to change
this if things change in the kernel later, e.g. if ext4 gets a
fallback.

I'm not sure if returning non-zero for ext4 is correct either, as per
the documented behavior from 5d894321c49e613.

   When zero, it means that there is no such performance boundary.

Anyways, I think there might be a simpler solution...
> 
> > 
> > The simplest way seems to be to just limit the $blocksize to
> > something like $(_min 16KB awu_max) and hope that 16KB is small enough
> > to not be split during reads. We anyways have other tests like
> > generic/1230 that can be used to test larger atomic write sizes
> > 
> > Thoughts?
> 
> As I said at the start, we never had guarantees of serialization of reads
> and atomic writes.
> 
> However I still think that this is a useful test. It's just it is
> theoretically possible to give false positives.
> 
> You could get the test to read max_sectors_kb, and check whether it is
> greater than the bs. Again, more complexity.

I think this might work. So I can set the iosize as 16KB, which is
already low enough. In case max_sectors is less than 16KB, we bail
out. (But then maybe there is still a very small chance of a read split.)

OR another approach is to do the verify at the very end, when all threads
are done writing, so the reads don't race. The tradeoff is that this
will reduce the effectiveness of the test to some extent.
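If I read the fio docs right, a "verify only at the end" job could be shaped roughly like this. Hedged sketch: atomic, verify and verify_fatal are real fio options, but whether this interacts cleanly with RWF_ATOMIC writes is exactly what is under discussion; directory, bs and size are placeholders.

```ini
; Sketch of a write-then-verify job: with verify= set on a write job
; and no verify_backlog, fio completes the write phase first and only
; then reads everything back to verify, so verify reads cannot race
; with in-flight atomic writes. Each job gets its own file under
; directory=, keeping per-job verification self-consistent.
[global]
directory=/mnt/scratch
direct=1
bs=16k
atomic=1

[atomic-write-verify]
rw=randwrite
size=64m
numjobs=4
verify=crc32c
verify_fatal=1
; no verify_backlog => verification runs after the write phase
```

The cost, as noted above, is that nothing exercises reads racing against atomic writes anymore.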

> 
> As an alternative, maybe it's better to maintain this test out-of-tree. I'm
> not sure.
That'll be a hassle for everyone :p Do either of the above two approaches
seem acceptable for upstreaming this test?

Thanks for the reviews,
ojaswin

P.S: I'm not seeing anything weird with the line endings, but thanks,
I'll try to look into it.
> 
> Thanks,
> John
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-08  6:50                   ` Ojaswin Mujoo
@ 2025-07-08 11:11                     ` John Garry
  2025-07-08 12:01                       ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-08 11:11 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 08/07/2025 07:50, Ojaswin Mujoo wrote:
>>> Further, handling awu_max vs awu_max_opt and their different
>>> interpretations for XFS vs EXT4 etc are adding uneccessary complexity.
>> You could assume awu max opt == awu max for ext4, as we no there are no
>> software-based atomic writes. That could be simpler than my previous
>> suggestion, which involves a kernel change.
> I want to avoid adding something like
> 
> iosize = FSTYP == "ext4" ? awu_max : awu_max_opt
> 
> in the test iteslf, because it is a bit difficult to remember changing
> this in case things change in the kernel later, eg if ext4 gets a
> fallback.

You could do the statx call on the mounted bdev, and see if atomics are 
supported there. If not, skip the test.

But even for xfs when HW support is available, we can still use software 
fallback when HW support is not possible, i.e. misaligned. So, in that 
regard, we really should only test this on ext4.

> 
> Im not sure if returning non-zero for ext4 is also correct as per the
> documented behavior from 5d894321c49e613.
> 
>     When zero, it means that there is no such performance boundary.
> 
> Anyways, I think there might be a simpler solution...
>>> The simplest way seems to be to just limit the $blocksize to
>>> something like $(_min 16KB awu_max) and hope that 16KB is small enough
>>> to not be split during reads. We anyways have other tests like
>>> generic/1230 that can be used to test larger atomic write sizes
>>>
>>> Thoughts?
>> As I said at the start, we never had guarantees of serialization of reads
>> and atomic writes.
>>
>> However I still think that this is a useful test. It's just it is
>> theoretically possible to give false positives.
>>
>> You could get the test to read max_sectors_kb, and check whether it is
>> greater than the bs. Again, more complexity.
> I think this might work. So i can set the iosize as 16kb which is
> already low enought. Incase the max_sectors is < than 16kb then we bail
> out. (but then maybe there is a very small chance of read split still)

Sure, that seems fine.

> 
> OR another approach is to do the verify at the very end when all threads
> are done writing so the reads don't race. The tradeoff is that this
> will reduce the effictiveness of the test though to some extent.

Yeah, right. We would want this test to prove that atomic writes are not 
getting split by the block stack, so racing reads can help prove that.

> 
>> As an alternative, maybe it's better to maintain this test out-of-tree. I'm
>> not sure.
> That'll be a hassle for everyone :p Does any of the above 2 approaches
> seem acceptable for upstreaming this test?

Sure, I think either is fine, but I am still a bit hesitant to even have 
this test for xfs (and not just ext4).

Thanks,
John



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-08 11:11                     ` John Garry
@ 2025-07-08 12:01                       ` Ojaswin Mujoo
  2025-07-08 12:34                         ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-08 12:01 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Tue, Jul 08, 2025 at 12:11:06PM +0100, John Garry wrote:
> On 08/07/2025 07:50, Ojaswin Mujoo wrote:
> > > > Further, handling awu_max vs awu_max_opt and their different
> > > > interpretations for XFS vs EXT4 etc are adding uneccessary complexity.
> > > You could assume awu max opt == awu max for ext4, as we no there are no
> > > software-based atomic writes. That could be simpler than my previous
> > > suggestion, which involves a kernel change.
> > I want to avoid adding something like
> > 
> > iosize = FSTYP == "ext4" ? awu_max : awu_max_opt
> > 
> > in the test iteslf, because it is a bit difficult to remember changing
> > this in case things change in the kernel later, eg if ext4 gets a
> > fallback.
> 
> You could do the statx call on the mounted bdev, and see if atomics are
> supported there. If not, skip the test.
> 
> But even for xfs when HW support is available, we can still use software
> fallback when HW support is not possible, i.e. misaligned. So, in that
> regard, we really should only test this on ext4.

Hmm, okay, so I'm a bit confused now. What is different between ext4 and
xfs? Why does the software fallback for xfs matter in this case? If we
limit IO to 16KB, hoping that reads don't split, then either way the fio
verify should work. And if reads split, then both ext4 and xfs suffer,
irrespective of hardware or software fallback. (We also saw that
hardware atomic writes suffer from the same issue when max_sectors is
low.)

> 
> > 
> > Im not sure if returning non-zero for ext4 is also correct as per the
> > documented behavior from 5d894321c49e613.
> > 
> >     When zero, it means that there is no such performance boundary.
> > 
> > Anyways, I think there might be a simpler solution...
> > > > The simplest way seems to be to just limit the $blocksize to
> > > > something like $(_min 16KB awu_max) and hope that 16KB is small enough
> > > > to not be split during reads. We anyways have other tests like
> > > > generic/1230 that can be used to test larger atomic write sizes
> > > > 
> > > > Thoughts?
> > > As I said at the start, we never had guarantees of serialization of reads
> > > and atomic writes.
> > > 
> > > However I still think that this is a useful test. It's just it is
> > > theoretically possible to give false positives.
> > > 
> > > You could get the test to read max_sectors_kb, and check whether it is
> > > greater than the bs. Again, more complexity.
> > I think this might work. So i can set the iosize as 16kb which is
> > already low enought. Incase the max_sectors is < than 16kb then we bail
> > out. (but then maybe there is a very small chance of read split still)
> 
> Sure, that seems fine.
> 
> > 
> > OR another approach is to do the verify at the very end when all threads
> > are done writing so the reads don't race. The tradeoff is that this
> > will reduce the effictiveness of the test though to some extent.
> 
> Yeah, right. We would want this test to prove that atomic writes are not
> getting split by the block stack, so racing reads can help prove that.

Yeah, but now I'm leaning towards this approach of keeping the verify at
the very end instead of in parallel. Yes, it is a bit less effective, but
having a data integrity test that can theoretically return false
positives is not good; it'll always keep me guessing about any failure. 
Having the verify at the end will not produce false failures, even
though it comes at the cost of less coverage.

Maybe we can also get some other perspectives on this. 
@Zorro, @Darrick, can you please give your thoughts on this?

As a sidenote, with this issue of fio's verify read splitting, it seems
like the parallel verify feature in fio itself is flawed for non-atomic
writes as well since the read can always split and race with the writes.

Not sure what the solution is here, maybe time for non splittable atomic
writes (:p)

> 
> > 
> > > As an alternative, maybe it's better to maintain this test out-of-tree. I'm
> > > not sure.
> > That'll be a hassle for everyone :p Does any of the above 2 approaches
> > seem acceptable for upstreaming this test?
> 
> Sure, I think ether are fine, but I am still a bit hesitant to even have
> this test for xfs (and not just ext4).

> 
> Thanks,
> John
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-08 12:01                       ` Ojaswin Mujoo
@ 2025-07-08 12:34                         ` John Garry
  2025-07-11 10:39                           ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-08 12:34 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 08/07/2025 13:01, Ojaswin Mujoo wrote:
>> You could do the statx call on the mounted bdev, and see if atomics are
>> supported there. If not, skip the test.
>>
>> But even for xfs when HW support is available, we can still use software
>> fallback when HW support is not possible, i.e. misaligned. So, in that
>> regard, we really should only test this on ext4.
> Hmm, okay so I'm a bit confused now. What is different in ext4 vs xfs.
> Why does the software fallback for xfs matter in this case?

For xfs, software fallback-based atomic writes are not even serialized 
against other software fallback-based atomic writes. They can trample on 
one another. That is a reason I say that this test should only be for 
HW-based atomic writes.

But even if the bs <= HW limit, xfs may rarely still use the atomic 
write software fallback when a HW-based atomic write is not possible, 
e.g. misaligned, multiple extents, etc. I will emphasize that this 
would be rare, but it is still possible.

For ext4, alignment is always guaranteed, so HW-based atomics will 
only ever be used.

> If we limit
> io to 16KB hoping that reads don't break, then either ways the fio
> verify should work. And if reads break then both ext4 and xfs suffer
> irrespective of hardware or software fallback. (We also saw that
> hardware atomic writes also suffer with same issue when max_sectors is
> low)
> 
>>> I'm not sure if returning non-zero for ext4 is also correct as per the
>>> documented behavior from 5d894321c49e613.
>>>
>>>      When zero, it means that there is no such performance boundary.
>>>
>>> Anyways, I think there might be a simpler solution...
>>>>> The simplest way seems to be to just limit the $blocksize to
>>>>> something like $(_min 16KB awu_max) and hope that 16KB is small enough
>>>>> to not be split during reads. We anyways have other tests like
>>>>> generic/1230 that can be used to test larger atomic write sizes
>>>>>
>>>>> Thoughts?
>>>> As I said at the start, we never had guarantees of serialization of reads
>>>> and atomic writes.
>>>>
>>>> However I still think that this is a useful test. It's just it is
>>>> theoretically possible to give false positives.
>>>>
>>>> You could get the test to read max_sectors_kb, and check whether it is
>>>> greater than the bs. Again, more complexity.
>>> I think this might work. So I can set the iosize as 16kb which is
>>> already low enough. In case the max_sectors is < 16kb then we bail
>>> out. (but then maybe there is still a very small chance of a read split)
>> Sure, that seems fine.
>>
>>> OR another approach is to do the verify at the very end when all threads
>>> are done writing so the reads don't race. The tradeoff is that this
>>> will reduce the effectiveness of the test to some extent, though.
>> Yeah, right. We would want this test to prove that atomic writes are not
>> getting split by the block stack, so racing reads can help prove that.
> Yeah, but now I'm leaning towards this approach of keeping the verify at
> the very end instead of in parallel. 

I don't think that fio supports such a mode. Or does it?

> Yes, it is a bit less effective but
> then having a data integrity test that can theoretically return false
> positives is not good, it'll always keep me guessing about any failure.
> Having verify at end will not have false failures for sure. Even though
> it comes at a cost of less coverage.
> 
> Maybe we can also get some other perspectives on this.
> @Zorro, @Darrick, can you please give your thoughts on this?
> 
> As a sidenote, with this issue of fio's verify read splitting, it seems
> like the parallel verify feature in fio itself is flawed for non-atomic
> writes as well since the read can always split and race with the writes.

fio does warn that having multiple readers and writers in verify mode is unsafe.

> 
> Not sure what the solution is here, maybe time for non splittable atomic
> writes (:p)

I just think that fio in verify mode for ext4 is ok, as long as 
max_sectors_kb is large.
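A sketch of that max_sectors_kb guard (hedged: the helper name, the sda
device path, and the 16KB blocksize are illustrative, not from the
patchset):

```shell
# Hedged sketch of the max_sectors_kb guard discussed above; helper
# name, device path and 16KB blocksize are illustrative.

# reads_may_split <bs_bytes> <max_sectors_kb>
# Exit 0 when the block layer's per-I/O limit is below the verify read
# size, i.e. a racing verify read could be split and see a torn write.
reads_may_split() {
    [ $(($2 * 1024)) -lt "$1" ]
}

bs=16384
max_kb=$(cat /sys/block/sda/queue/max_sectors_kb 2>/dev/null || echo 1024)

if reads_may_split "$bs" "$max_kb"; then
    echo "max_sectors_kb=$max_kb < bs=$bs: verify reads may split, bail out"
fi
```

An fstests version would presumably resolve the queue directory from
SCRATCH_DEV and call _notrun instead of echoing.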

But fundamentally we are trying to test a feature of RWF_ATOMIC which is 
not guaranteed.

Thanks,
John



* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-08 12:34                         ` John Garry
@ 2025-07-11 10:39                           ` Ojaswin Mujoo
  2025-07-11 10:51                             ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-11 10:39 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Tue, Jul 08, 2025 at 01:34:39PM +0100, John Garry wrote:
> On 08/07/2025 13:01, Ojaswin Mujoo wrote:
> > > You could do the statx call on the mounted bdev, and see if atomics are
> > > supported there. If not, skip the test.
> > > 
> > > But even for xfs when HW support is available, we can still use software
> > > fallback when HW support is not possible, i.e. misaligned. So, in that
> > > regard, we really should only test this on ext4.
> > Hmm, okay so I'm a bit confused now. What is different in ext4 vs xfs?
> > Why does the software fallback for xfs matter in this case?
> 
> For xfs, software fallback-based atomic writes are not even serialized
> against other software fallback-based atomic writes. They can trample on one
> another. That is one reason I say that this test should only be for HW-based
> atomic writes.
> 
> But even if the bs <= HW limit, xfs may rarely still use the atomic write
> software fallback if a HW-based atomic write is not possible, e.g. misaligned,
> multiple extents, etc. I will emphasise that this would be rare, but it is
> still possible.
> 
> For ext4, alignment is always guaranteed, so HW-based atomics will only
> ever be used.
> 
> > If we limit
> > io to 16KB hoping that reads don't break, then either ways the fio
> > verify should work. And if reads break then both ext4 and xfs suffer
> > irrespective of hardware or software fallback. (We also saw that
> > hardware atomic writes also suffer with same issue when max_sectors is
> > low)
> > 
> > > > I'm not sure if returning non-zero for ext4 is also correct as per the
> > > > documented behavior from 5d894321c49e613.
> > > > 
> > > >      When zero, it means that there is no such performance boundary.
> > > > 
> > > > Anyways, I think there might be a simpler solution...
> > > > > > The simplest way seems to be to just limit the $blocksize to
> > > > > > something like $(_min 16KB awu_max) and hope that 16KB is small enough
> > > > > > to not be split during reads. We anyways have other tests like
> > > > > > generic/1230 that can be used to test larger atomic write sizes
> > > > > > 
> > > > > > Thoughts?
> > > > > As I said at the start, we never had guarantees of serialization of reads
> > > > > and atomic writes.
> > > > > 
> > > > > However I still think that this is a useful test. It's just it is
> > > > > theoretically possible to give false positives.
> > > > > 
> > > > > You could get the test to read max_sectors_kb, and check whether it is
> > > > > greater than the bs. Again, more complexity.
> > > > I think this might work. So I can set the iosize as 16kb which is
> > > > already low enough. In case the max_sectors is < 16kb then we bail
> > > > out. (but then maybe there is still a very small chance of a read split)
> > > Sure, that seems fine.
> > > 
> > > > OR another approach is to do the verify at the very end when all threads
> > > > are done writing so the reads don't race. The tradeoff is that this
> > > > will reduce the effectiveness of the test to some extent, though.
> > > Yeah, right. We would want this test to prove that atomic writes are not
> > > getting split by the block stack, so racing reads can help prove that.
> > Yeah, but now I'm leaning towards this approach of keeping the verify at
> > the very end instead of in parallel.
> 
> I don't think that fio supports such a mode. Or does it?

Yes, we can achieve it via a combination of a write job with
verify=crc,do_verify=0 followed by a verify job with
verify=crc,verify_only=1. I'm testing this right now to see if it can
survive max_sectors_kb = 4 (which I think it should, since there are no
races now). If everything is fine, I'll use this approach for the fio
tests in the next revision.
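To make the two-phase idea concrete, a hedged job-file sketch (the
thread's verify=crc shorthand is spelled crc32c here; the filename,
sizes, and the atomic=1 option are assumptions about the eventual test,
not its actual contents):

```ini
; write phase: lay down data plus crc32c verify headers, but skip the
; racing read-back (do_verify=0)
[write-phase]
filename=/mnt/test/atomic.img
rw=randwrite
bs=16k
size=64m
direct=1
atomic=1
verify=crc32c
do_verify=0
```

The verify pass then runs only after every writer has finished, e.g.
`fio --verify_only job.fio` against the same job file, so no read can
race a write in flight.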
> 
> > Yes, it is a bit less effective but
> > then having a data integrity test that can theoretically return false
> > positives is not good, it'll always keep me guessing about any failure.
> > Having verify at end will not have false failures for sure. Even though
> > it comes at a cost of less coverage.
> > 
> > Maybe we can also get some other perspectives on this.
> > @Zorro, @Darrick, can you please give your thoughts on this?
> > 
> > As a sidenote, with this issue of fio's verify read splitting, it seems
> > like the parallel verify feature in fio itself is flawed for non-atomic
> > writes as well since the read can always split and race with the writes.
> 
> fio does warn that having multiple readers and writers in verify mode is unsafe.
Right.

> 
> > 
> > Not sure what the solution is here, maybe time for non splittable atomic
> > writes (:p)
> 
> I just think that fio in verify mode for ext4 is ok, as long as
> max_sectors_kb is large.

Yes, but I'd rather have the write-followed-by-verify version to avoid
false positives in case the read bio does end up splitting for whatever
reason.

Thanks for catching this issue in the test!

Regards,
ojaswin


* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-11 10:39                           ` Ojaswin Mujoo
@ 2025-07-11 10:51                             ` John Garry
  2025-07-11 18:16                               ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-07-11 10:51 UTC (permalink / raw)
  To: Ojaswin Mujoo; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On 11/07/2025 11:39, Ojaswin Mujoo wrote:
>> I don't think that fio supports such a mode. Or does it?
> Yes, we can achieve it via a combination of a write job with verify=crc,do_verify=0 followed by
> a verify job with verify=crc,verify_only=1. 

So does verify=crc,do_verify=0 really omit the verify and only do the 
write? If so, seems good.

> I'm testing this right now to
> see if it can survive max_sectors_kb = 4 (which i think it should since
> there are no races now). If everything is fine I'll use this approach
> for the fio tests in the next revision.

understood

>>> Yes, it is a bit less effective but
>>> then having a data integrity test that can theoretically return false
>>> positives is not good, it'll always keep me guessing about any failure.
>>> Having verify at end will not have false failures for sure. Even though
>>> it comes at a cost of less coverage.
>>>
>>> Maybe we can also get some other perspectives on this.
>>> @Zorro, @Darrick, can you please give your thoughts on this?
>>>
>>> As a sidenote, with this issue of fio's verify read splitting, it seems
>>> like the parallel verify feature in fio itself is flawed for non-atomic
>>> writes as well since the read can always split and race with the writes.
>> fio does warn that having multiple readers and writers in verify mode is unsafe.
> Right.
> 
>>> Not sure what the solution is here, maybe time for non splittable atomic
>>> writes (:p)
>> I just think that fio in verify mode for ext4 is ok, as long as
>> max_sectors_kb is large.
> Yes, but I'd rather have the write-followed-by-verify version to avoid
> false positives in case the read bio does end up splitting for whatever
> reason.
> 
> Thanks for catching this issue in the test!

So, if you want to test this way, then you need to omit xfs, or ensure 
that bs <= the awu max opt for xfs and hope for no problems.

Thanks,
John



* Re: [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier
  2025-07-11 10:51                             ` John Garry
@ 2025-07-11 18:16                               ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-07-11 18:16 UTC (permalink / raw)
  To: John Garry; +Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso

On Fri, Jul 11, 2025 at 11:51:22AM +0100, John Garry wrote:
> On 11/07/2025 11:39, Ojaswin Mujoo wrote:
> > > I don't think that fio supports such a mode. Or does it?
> > Yes, we can achieve it via a combination of a write job with verify=crc,do_verify=0 followed by
> > a verify job with verify=crc,verify_only=1.
> 
> So does verify=crc,do_verify=0 really omit the verify and only do write? If
> so, seems good.

Yep, it does the write with verify headers but doesn't do the actual
verification. We can later use a verify_only=1 job to verify it. 

I tested this on a setup where I was able to hit crc issues with
max_sectors_kb=4 reliably within 20 iterations with the original
approach. With the above approach we are no longer hitting crc issues.
> 
> > I'm testing this right now to
> > see if it can survive max_sectors_kb = 4 (which i think it should since
> > there are no races now). If everything is fine I'll use this approach
> > for the fio tests in the next revision.
> 
> understood
> 
> > > > Yes, it is a bit less effective but
> > > > then having a data integrity test that can theoretically return false
> > > > positives is not good, it'll always keep me guessing about any failure.
> > > > Having verify at end will not have false failures for sure. Even though
> > > > it comes at a cost of less coverage.
> > > > 
> > > > Maybe we can also get some other perspectives on this.
> > > > @Zorro, @Darrick, can you please give your thoughts on this?
> > > > 
> > > > As a sidenote, with this issue of fio's verify read splitting, it seems
> > > > like the parallel verify feature in fio itself is flawed for non-atomic
> > > > writes as well since the read can always split and race with the writes.
> > > fio does warn that having multiple readers and writers in verify mode is unsafe.
> > Right.
> > 
> > > > Not sure what the solution is here, maybe time for non splittable atomic
> > > > writes (:p)
> > > I just think that fio in verify mode for ext4 is ok, as long as
> > > max_sectors_kb is large.
> > Yes, but I'd rather have the write-followed-by-verify version to avoid
> > false positives in case the read bio does end up splitting for whatever
> > reason.
> > 
> > Thanks for catching this issue in the test!
> 
> So, if you want to test this way then you need to omit xfs or ensure that
> the bs <= awu max opt for xfs and hope for no problems.

Hmm, having parallel verifies would surely have been useful, but the
possibility of false failures would make me lose confidence in the
test. Even with ext4 hardware support, something someday might end up
splitting the read bio. I feel a write-followed-by-verify approach is
most suitable here.

Regards,
ojaswin

> 
> Thanks,
> John
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2025-07-11 18:17 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-26 11:58 [PATCH v2 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-06-26 11:58 ` [PATCH v2 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
2025-06-26 11:58 ` [PATCH v2 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
2025-06-26 13:32   ` Theodore Ts'o
2025-06-30 15:28     ` Darrick J. Wong
2025-07-01  6:26       ` Ojaswin Mujoo
2025-07-02 15:13         ` Darrick J. Wong
2025-06-26 11:58 ` [PATCH v2 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
2025-06-26 11:58 ` [PATCH v2 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
2025-06-26 11:58 ` [PATCH v2 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
2025-06-27 14:09   ` John Garry
2025-07-01 16:18     ` Ojaswin Mujoo
2025-07-02  7:46       ` John Garry
2025-07-03  6:42         ` Ojaswin Mujoo
2025-07-03 16:26           ` John Garry
2025-07-04 14:35             ` Ojaswin Mujoo
2025-07-04 15:23               ` Ojaswin Mujoo
2025-07-07  8:18                 ` John Garry
2025-07-08  6:50                   ` Ojaswin Mujoo
2025-07-08 11:11                     ` John Garry
2025-07-08 12:01                       ` Ojaswin Mujoo
2025-07-08 12:34                         ` John Garry
2025-07-11 10:39                           ` Ojaswin Mujoo
2025-07-11 10:51                             ` John Garry
2025-07-11 18:16                               ` Ojaswin Mujoo
2025-07-07  8:02               ` John Garry
2025-06-26 11:58 ` [PATCH v2 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
2025-06-27 14:48   ` John Garry
2025-06-26 11:58 ` [PATCH v2 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
2025-06-26 11:58 ` [PATCH v2 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
2025-06-26 11:59 ` [PATCH v2 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
2025-06-27 16:11   ` John Garry
2025-07-01  6:34     ` Ojaswin Mujoo
2025-06-26 11:59 ` [PATCH v2 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
2025-06-26 11:59 ` [PATCH v2 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
2025-06-26 11:59 ` [PATCH v2 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
2025-06-26 11:59 ` [PATCH v2 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
2025-06-27 13:56 ` [PATCH v2 00/13] Add more tests for multi fs block atomic writes John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox