linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/11] Add more tests for multi fs block atomic writes
@ 2025-08-10 13:41 Ojaswin Mujoo
  2025-08-10 13:41 ` [PATCH v4 01/11] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
                   ` (10 more replies)
  0 siblings, 11 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

Changes in v4: (Thanks to Darrick, John and Zorro for the reviews)

- g/1226,1227: Modify fio threads to not issue overlapping atomic writes
- g/1228: Use xfs_io -c "shutdown" instead of _scratch_shutdown to avoid
          bash overhead
- g/1229: Remove FSX_AVOID handling for bigalloc from common/rc. It is
          part of the specific test now
- ext4/063: add more clearer extent diagram
- ext4/064: Drop the test for now as im taking sometime to understand
            the behavior better.
- Removed test numbers from commit message
- For tests with significant changes I've removed the RVBs

Changes in v3 [3]:

- (2/13) use dumpe2fs to figure out if FS is bigalloc
- (9/13) generic/1230: Detect device speeds for more accurate testing. ALso
  speeds up the test
- fio tests - switch to write followed by verify approach to avoid false
  failures due to fio verify reads splitting and racing with atomic
  writes. Discussion thread:

  https://lore.kernel.org/fstests/0430bd73-e6c2-4ce9-af24-67b1e1fa9b5b@oracle.com/

  [3] https://lore.kernel.org/fstests/cover.1752329098.git.ojaswin@linux.ibm.com/

Changes in v2 [1]:

- (1/13) new patch with _min and _max helpers
- (2/13) remove setup_fs_options and add fsx specifc helper
- (4/13) skip atomic write instead of falling back to normal write (fsx)
- (4/13) make atomic write default on instead of default off (fsx)
- (5,6/13) refactor and cleanup fio tests
- (7/13) refactored common code
- (8/13) dont ignore mmap writes for fsx with atomic writes
- (9/13) use od instead of xxd. handle cleanup of bg threads in _cleanup()
- (10-13/13) minor refactors
- change all tests use _fail for better consistency
- use higher tests numbers for easier merging

 [1] https://lore.kernel.org/fstests/cover.1750924903.git.ojaswin@linux.ibm.com/

* Original cover [2] *

These are the tests we were using to verify that filesystems are not
tearing multi fs block atomic writes. Infact some of the tests like
generic/772 (now: g/1230) actually helped us catch and fix issues in
ext4's early implementations of multi fs block atomic writes and hence
we feel these tests are useful to have in xfstests.

We have tested these with scsi debug as well as a real nvme device
supporting multi fs block atomic writes.

Thoughts and suggestions are welcome!

[2] rfc: https://lore.kernel.org/fstests/cover.1749629233.git.ojaswin@linux.ibm.com/

Ojaswin Mujoo (9):
  common/rc: Add _min() and _max() helpers
  common/rc: Add a helper to run fsx on a given file
  ltp/fsx.c: Add atomic writes support to fsx
  generic: Add atomic write test using fio crc check verifier
  generic: Add atomic write test using fio verify on file mixed mappings
  generic: Add atomic write multi-fsblock O_[D]SYNC tests
  generic: Stress fsx with atomic writes enabled
  generic: Add sudden shutdown tests for multi block atomic writes
  ext4: Atomic write test for extent split across leaf nodes

Ritesh Harjani (IBM) (2):
  ext4: Atomic writes stress test for bigalloc using fio crc verifier
  ext4: Atomic writes test for bigalloc using fio crc verifier on
    multiple files

 common/rc              |  45 ++++-
 ltp/fsx.c              | 109 ++++++++++-
 tests/ext4/061         | 130 ++++++++++++++
 tests/ext4/061.out     |   2 +
 tests/ext4/062         | 176 ++++++++++++++++++
 tests/ext4/062.out     |   2 +
 tests/ext4/063         | 129 +++++++++++++
 tests/ext4/063.out     |   2 +
 tests/generic/1226     | 107 +++++++++++
 tests/generic/1226.out |   2 +
 tests/generic/1227     | 131 ++++++++++++++
 tests/generic/1227.out |   2 +
 tests/generic/1228     | 137 ++++++++++++++
 tests/generic/1228.out |   2 +
 tests/generic/1229     |  68 +++++++
 tests/generic/1229.out |   2 +
 tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1230.out |   2 +
 18 files changed, 1437 insertions(+), 8 deletions(-)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out
 create mode 100755 tests/generic/1226
 create mode 100644 tests/generic/1226.out
 create mode 100755 tests/generic/1227
 create mode 100644 tests/generic/1227.out
 create mode 100755 tests/generic/1228
 create mode 100644 tests/generic/1228.out
 create mode 100755 tests/generic/1229
 create mode 100644 tests/generic/1229.out
 create mode 100755 tests/generic/1230
 create mode 100644 tests/generic/1230.out

-- 
2.49.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v4 01/11] common/rc: Add _min() and _max() helpers
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-13 12:20   ` David Laight
  2025-08-10 13:41 ` [PATCH v4 02/11] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

Many programs open code these functionalities so add it as a generic helper
in common/rc

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/common/rc b/common/rc
index 96578d15..3858ddce 100644
--- a/common/rc
+++ b/common/rc
@@ -5873,6 +5873,28 @@ _require_program() {
 	_have_program "$1" || _notrun "$tag required"
 }
 
+_min() {
+	local ret
+
+	for arg in "$@"; do
+		if [ -z "$ret" ] || (( $arg < $ret )); then
+			ret="$arg"
+		fi
+	done
+	echo $ret
+}
+
+_max() {
+	local ret
+
+	for arg in "$@"; do
+		if [ -z "$ret" ] || (( $arg > $ret )); then
+			ret="$arg"
+		fi
+	done
+	echo $ret
+}
+
 ################################################################################
 # make sure this script returns success
 /bin/true
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 02/11] common/rc: Add a helper to run fsx on a given file
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-08-10 13:41 ` [PATCH v4 01/11] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-10 13:41 ` [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

Currently run_fsx is hardcoded to run on a file in $TEST_DIR.
Add a helper _run_fsx_on_file so that we can run fsx on any
given file including in $SCRATCH_MNT. Also, refactor _run_fsx
to use this helper.

No functional change is intended in this patch.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 common/rc | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/common/rc b/common/rc
index 3858ddce..dc45bff6 100644
--- a/common/rc
+++ b/common/rc
@@ -5169,13 +5169,24 @@ _require_hugepage_fsx()
 		_notrun "fsx binary does not support MADV_COLLAPSE"
 }
 
-_run_fsx()
+_run_fsx_on_file()
 {
+	local testfile=$1
+	shift
+
+	if ! [ -f $testfile ]
+	then
+		echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
+		touch $testfile
+	fi
+
 	echo "fsx $*"
 	local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
-	set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
+
+	set -- $FSX_PROG $args $FSX_AVOID $testfile
+
 	echo "$@" >>$seqres.full
-	rm -f $TEST_DIR/junk
+	rm -f $testfile
 	"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
 	local res=${PIPESTATUS[0]}
 	if [ $res -ne 0 ]; then
@@ -5187,6 +5198,12 @@ _run_fsx()
 	return 0
 }
 
+_run_fsx()
+{
+	_run_fsx_on_file $TEST_DIR/junk $@
+	return $?
+}
+
 # Run fsx with -h(ugepage buffers).  If we can't set up a hugepage then skip
 # the test, but if any other error occurs then exit the test.
 _run_hugepage_fsx() {
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
  2025-08-10 13:41 ` [PATCH v4 01/11] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
  2025-08-10 13:41 ` [PATCH v4 02/11] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-13 13:42   ` John Garry
  2025-08-10 13:41 ` [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier Ojaswin Mujoo
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

Implement atomic write support to help fuzz atomic writes
with fsx.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 104 insertions(+), 5 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 163b9453..ea39ca29 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -40,6 +40,7 @@
 #include <liburing.h>
 #endif
 #include <sys/syscall.h>
+#include "statx.h"
 
 #ifndef MAP_FILE
 # define MAP_FILE 0
@@ -49,6 +50,10 @@
 #define RWF_DONTCACHE	0x80
 #endif
 
+#ifndef RWF_ATOMIC
+#define RWF_ATOMIC	0x40
+#endif
+
 #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
 
 /* Operation flags (bitmask) */
@@ -110,6 +115,7 @@ enum {
 	OP_READ_DONTCACHE,
 	OP_WRITE,
 	OP_WRITE_DONTCACHE,
+	OP_WRITE_ATOMIC,
 	OP_MAPREAD,
 	OP_MAPWRITE,
 	OP_MAX_LITE,
@@ -200,6 +206,11 @@ int	uring = 0;
 int	mark_nr = 0;
 int	dontcache_io = 1;
 int	hugepages = 0;                  /* -h flag */
+int	do_atomic_writes = 1;		/* -a flag disables */
+
+/* User for atomic writes */
+int awu_min = 0;
+int awu_max = 0;
 
 /* Stores info needed to periodically collapse hugepages */
 struct hugepages_collapse_info {
@@ -288,6 +299,7 @@ static const char *op_names[] = {
 	[OP_READ_DONTCACHE] = "read_dontcache",
 	[OP_WRITE] = "write",
 	[OP_WRITE_DONTCACHE] = "write_dontcache",
+	[OP_WRITE_ATOMIC] = "write_atomic",
 	[OP_MAPREAD] = "mapread",
 	[OP_MAPWRITE] = "mapwrite",
 	[OP_TRUNCATE] = "truncate",
@@ -422,6 +434,7 @@ logdump(void)
 				prt("\t***RRRR***");
 			break;
 		case OP_WRITE_DONTCACHE:
+		case OP_WRITE_ATOMIC:
 		case OP_WRITE:
 			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
 			    lp->args[0], lp->args[0] + lp->args[1] - 1,
@@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
 	file_size = offset + size;
 }
 
+static int is_power_of_2(unsigned n) {
+	return ((n & (n - 1)) == 0);
+}
+
+/*
+ * Round down n to nearest power of 2.
+ * If n is already a power of 2, return n;
+ */
+static int rounddown_pow_of_2(int n) {
+	int i = 0;
+
+	if (is_power_of_2(n))
+		return n;
+
+	for (; (1 << i) < n; i++);
+
+	return 1 << (i - 1);
+}
+
 void
 dowrite(unsigned offset, unsigned size, int flags)
 {
@@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
 	offset -= offset % writebdy;
 	if (o_direct)
 		size -= size % writebdy;
+	if (flags & RWF_ATOMIC) {
+		/* atomic write len must be inbetween awu_min and awu_max */
+		if (size < awu_min)
+			size = awu_min;
+		if (size > awu_max)
+			size = awu_max;
+
+		/* atomic writes need power-of-2 sizes */
+		size = rounddown_pow_of_2(size);
+
+		/* atomic writes need naturally aligned offsets */
+		offset -= offset % size;
+
+		/* Skip the write if we are crossing max filesize */
+		if ((offset + size) > maxfilelen) {
+			if (!quiet && testcalls > simulatedopcount)
+				prt("skipping atomic write past maxfilelen\n");
+			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+			return;
+		}
+	}
 	if (size == 0) {
 		if (!quiet && testcalls > simulatedopcount && !o_direct)
 			prt("skipping zero size write\n");
@@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
 		return;
 	}
 
-	log4(OP_WRITE, offset, size, FL_NONE);
+	if (flags & RWF_ATOMIC)
+		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
+	else
+		log4(OP_WRITE, offset, size, FL_NONE);
 
 	gendata(original_buf, good_buf, offset, size);
 	if (offset + size > file_size) {
@@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
 		       (monitorstart == -1 ||
 			(offset + size > monitorstart &&
 			(monitorend == -1 || offset <= monitorend))))))
-		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
-		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
+		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
+		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
+		    (flags & RWF_ATOMIC) != 0);
 	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
 	if (iret != size) {
 		if (iret == -1)
@@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
 }
 #endif
 
+int test_atomic_writes(void) {
+	int ret;
+	struct statx stx;
+
+	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
+	if (ret < 0) {
+		fprintf(stderr, "main: Statx failed with %d."
+			" Failed to determine atomic write limits, "
+			" disabling!\n", ret);
+		return 0;
+	}
+
+	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
+	    stx.stx_atomic_write_unit_min > 0) {
+		awu_min = stx.stx_atomic_write_unit_min;
+		awu_max = stx.stx_atomic_write_unit_max;
+		return 1;
+	}
+
+	fprintf(stderr, "main: IO Stack does not support "
+			"atomic writes, disabling!\n");
+	return 0;
+}
+
 #ifdef HAVE_COPY_FILE_RANGE
 int
 test_copy_range(void)
@@ -2356,6 +2437,12 @@ have_op:
 			goto out;
 		}
 		break;
+	case OP_WRITE_ATOMIC:
+		if (!do_atomic_writes) {
+			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+			goto out;
+		}
+		break;
 	}
 
 	switch (op) {
@@ -2385,6 +2472,11 @@ have_op:
 			dowrite(offset, size, 0);
 		break;
 
+	case OP_WRITE_ATOMIC:
+		TRIM_OFF_LEN(offset, size, maxfilelen);
+		dowrite(offset, size, RWF_ATOMIC);
+		break;
+
 	case OP_MAPREAD:
 		TRIM_OFF_LEN(offset, size, file_size);
 		domapread(offset, size);
@@ -2511,13 +2603,14 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
+		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
 	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
 	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
 	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
 	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
 	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
 	   ... fname\n\
+	-a: disable atomic writes\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
@@ -3059,9 +3152,13 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
+				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
+		case 'a':
+			prt("main(): Atomic writes disabled\n");
+			do_atomic_writes = 0;
+			break;
 		case 'b':
 			simulatedopcount = getnum(optarg, &endp);
 			if (!quiet)
@@ -3475,6 +3572,8 @@ main(int argc, char **argv)
 		exchange_range_calls = test_exchange_range();
 	if (dontcache_io)
 		dontcache_io = test_dontcache_io();
+	if (do_atomic_writes)
+		do_atomic_writes = test_atomic_writes();
 
 	while (keep_running())
 		if (!test())
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (2 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-12 17:16   ` Darrick J. Wong
  2025-08-13 13:39   ` John Garry
  2025-08-10 13:41 ` [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

This adds atomic write test using fio based on it's crc check verifier.
fio adds a crc for each data block. If the underlying device supports
atomic write then it is guaranteed that we will never have a mix data from
two threads writing on the same physical block.

Avoid doing overlapping parallel atomic writes because it might give
unexpected results. Use offset_increment=, size= fio options to achieve
this behavior.

Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1226     | 107 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1226.out |   2 +
 2 files changed, 109 insertions(+)
 create mode 100755 tests/generic/1226
 create mode 100644 tests/generic/1226.out

diff --git a/tests/generic/1226 b/tests/generic/1226
new file mode 100755
index 00000000..efc360e1
--- /dev/null
+++ b/tests/generic/1226
@@ -0,0 +1,107 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1226
+#
+# Validate FS atomic write using fio crc check verifier.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic
+_require_odirect
+_require_aio
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+_require_xfs_io_command "falloc"
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+
+blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
+threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
+filesize=$((blocksize * threads * 100))
+depth=$threads
+io_size=$((filesize / threads))
+io_inc=$io_size
+testfile=$SCRATCH_MNT/test-file
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+
+function create_fio_configs()
+{
+	create_fio_aw_config
+	create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+	[verify-job]
+	direct=1
+	ioengine=libaio
+	rw=read
+	bs=$blocksize
+	filename=$testfile
+	size=$filesize
+	iodepth=$depth
+	group_reporting=1
+
+	verify_only=1
+	verify=crc32c
+	verify_fatal=1
+	verify_state_save=0
+	verify_write_sequence=0
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+	[atomicwrite-job]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$blocksize
+	filename=$testfile
+	size=$io_inc
+	offset_increment=$io_inc
+	iodepth=$depth
+	numjobs=$threads
+	group_reporting=1
+	atomic=1
+
+	verify_state_save=0
+	verify=crc32c
+	do_verify=0
+EOF
+}
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+cat $fio_aw_config >> $seqres.full
+cat $fio_verify_config >> $seqres.full
+
+$XFS_IO_PROG -fc "falloc 0 $filesize" $testfile >> $seqres.full
+
+$FIO_PROG $fio_aw_config >> $seqres.full
+ret1=$?
+$FIO_PROG $fio_verify_config >> $seqres.full
+ret2=$?
+
+[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1226.out b/tests/generic/1226.out
new file mode 100644
index 00000000..6dce0ea5
--- /dev/null
+++ b/tests/generic/1226.out
@@ -0,0 +1,2 @@
+QA output created by 1226
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (3 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-12 17:16   ` Darrick J. Wong
  2025-08-10 13:41 ` [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

This tests uses fio to first create a file with mixed mappings. Then it
does atomic writes using aio dio with parallel jobs to the same file with
mixed mappings. This forces the filesystem allocator to allocate extents
over mixed mapping regions to stress FS block allocators.

Avoid doing overlapping parallel atomic writes because it might give
unexpected results. Use offset_increment=, size= fio options to achieve
this behavior.

Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1227     | 131 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1227.out |   2 +
 2 files changed, 133 insertions(+)
 create mode 100755 tests/generic/1227
 create mode 100644 tests/generic/1227.out

diff --git a/tests/generic/1227 b/tests/generic/1227
new file mode 100755
index 00000000..7423e67c
--- /dev/null
+++ b/tests/generic/1227
@@ -0,0 +1,131 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1227
+#
+# Validate FS atomic write using fio crc check verifier on mixed mappings
+# of a file.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_odirect
+_require_aio
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+
+aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
+fsbsize=$(_get_block_size $SCRATCH_MNT)
+
+threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
+filesize=$((aw_bsize * threads * 100))
+depth=$threads
+aw_io_size=$((filesize / threads))
+aw_io_inc=$aw_io_size
+testfile=$SCRATCH_MNT/test-file
+
+fio_prep_config=$tmp.prep.fio
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+cat >$fio_prep_config <<EOF
+# prep file to have mixed mappings
+[global]
+ioengine=libaio
+filename=$testfile
+size=$filesize
+bs=$fsbsize
+direct=1
+iodepth=$depth
+group_reporting=1
+
+# Create written extents
+[prep_written_blocks]
+ioengine=libaio
+rw=randwrite
+io_size=$((filesize/3))
+random_generator=lfsr
+
+# Create unwritten extents
+[prep_unwritten_blocks]
+ioengine=falloc
+rw=randwrite
+io_size=$((filesize/3))
+random_generator=lfsr
+EOF
+
+cat >$fio_aw_config <<EOF
+# atomic write to mixed mappings of written/unwritten/holes
+[atomic_write_job]
+ioengine=libaio
+rw=randwrite
+direct=1
+atomic=1
+random_generator=lfsr
+group_reporting=1
+
+filename=$testfile
+bs=$aw_bsize
+size=$aw_io_size
+offset_increment=$aw_io_inc
+iodepth=$depth
+numjobs=$threads
+
+verify_state_save=0
+verify=crc32c
+do_verify=0
+EOF
+
+cat >$fio_verify_config <<EOF
+# verify atomic writes done by previous job
+[verify_job]
+ioengine=libaio
+rw=read
+random_generator=lfsr
+group_reporting=1
+
+filename=$testfile
+size=$filesize
+bs=$aw_bsize
+iodepth=$depth
+
+verify_state_save=0
+verify_only=1
+verify=crc32c
+verify_fatal=1
+verify_write_sequence=0
+EOF
+
+_require_fio $fio_aw_config
+_require_fio $fio_verify_config
+
+cat $fio_prep_config >> $seqres.full
+cat $fio_aw_config >> $seqres.full
+cat $fio_verify_config >> $seqres.full
+
+$XFS_IO_PROG -fc "truncate $filesize" $testfile >> $seqres.full
+
+#prepare file with mixed mappings
+$FIO_PROG $fio_prep_config >> $seqres.full
+
+# do atomic writes without verifying
+$FIO_PROG $fio_aw_config >> $seqres.full
+
+# verify data is not torn
+$FIO_PROG $fio_verify_config >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1227.out b/tests/generic/1227.out
new file mode 100644
index 00000000..2605d062
--- /dev/null
+++ b/tests/generic/1227.out
@@ -0,0 +1,2 @@
+QA output created by 1227
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (4 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-11 15:29   ` Darrick J. Wong
  2025-08-10 13:41 ` [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled Ojaswin Mujoo
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

This adds various atomic write multi-fsblock stresst tests
with mixed mappings and O_SYNC, to ensure the data and metadata
is atomically persisted even if there is a shutdown.

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1228     | 137 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1228.out |   2 +
 2 files changed, 139 insertions(+)
 create mode 100755 tests/generic/1228
 create mode 100644 tests/generic/1228.out

diff --git a/tests/generic/1228 b/tests/generic/1228
new file mode 100755
index 00000000..888599ce
--- /dev/null
+++ b/tests/generic/1228
@@ -0,0 +1,137 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1228
+#
+# Atomic write multi-fsblock data integrity tests with mixed mappings
+# and O_SYNC
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+check_data_integrity() {
+	actual=$(_hexdump $testfile)
+	if [[ "$expected" != "$actual" ]]
+	then
+		echo "Integrity check failed"
+		echo "Integrity check failed" >> $seqres.full
+		echo "# Expected file contents:" >> $seqres.full
+		echo "$expected" >> $seqres.full
+		echo "# Actual file contents:" >> $seqres.full
+		echo "$actual" >> $seqres.full
+
+		_fail "Data integrity check failed. The atomic write was torn."
+	fi
+}
+
+prep_mixed_mapping() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+	local off=0
+	local mapping=""
+
+	local operations=("W" "H" "U")
+	local num_blocks=$((awu_max / blksz))
+	for ((i=0; i<num_blocks; i++)); do
+		local index=$((RANDOM % ${#operations[@]}))
+		local map="${operations[$index]}"
+		local mapping="${mapping}${map}"
+
+		case "$map" in
+			"W")
+				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
+				;;
+			"H")
+				# No operation needed for hole
+				;;
+			"U")
+				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
+				;;
+		esac
+		off=$((off + blksz))
+	done
+
+	echo "+ + Mixed mapping prep done. Full mapping pattern: $mapping" >> $seqres.full
+
+	sync $testfile
+}
+
+verify_atomic_write() {
+	test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
+	check_data_integrity
+}
+
+mixed_mapping_test() {
+	prep_mixed_mapping
+
+	echo -"+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
+	if [[ "$1" == "shutdown" ]]
+	then
+		bytes_written=$($XFS_IO_PROG -x -dc \
+				"pwrite -DA -V1 -b $awu_max 0 $awu_max" \
+				-c "shutdown" $testfile | grep wrote | \
+				awk -F'[/ ]' '{print $2}')
+		_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed"
+	else
+		bytes_written=$($XFS_IO_PROG -dc \
+				"pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
+				grep wrote | awk -F'[/ ]' '{print $2}')
+	fi
+
+	verify_atomic_write
+}
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+
+# Create an expected pattern to compare with
+$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
+expected=$(_hexdump $testfile)
+echo "# Expected file contents:" >> $seqres.full
+echo "$expected" >> $seqres.full
+echo >> $seqres.full
+
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping:" >> $seqres.full
+echo >> $seqres.full
+for ((iteration=1; iteration<=10; iteration++)); do
+	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
+
+	echo "+ Testing without shutdown..." >> $seqres.full
+	mixed_mapping_test
+	echo "Passed!" >> $seqres.full
+
+	echo "+ Testing with sudden shutdown..." >> $seqres.full
+	mixed_mapping_test "shutdown"
+	echo "Passed!" >> $seqres.full
+
+	echo "Iteration $iteration completed: OK" >> $seqres.full
+	echo >> $seqres.full
+done
+echo "# Test 1: Do O_SYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
+
+
+echo >> $seqres.full
+echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
+bytes_written=$($XFS_IO_PROG -x -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" \
+		-c "shutdown" $testfile | grep wrote | awk -F'[/ ]' '{print $2}')
+_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed"
+verify_atomic_write
+echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
+
diff --git a/tests/generic/1228.out b/tests/generic/1228.out
new file mode 100644
index 00000000..1baffa91
--- /dev/null
+++ b/tests/generic/1228.out
@@ -0,0 +1,2 @@
+QA output created by 1228
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (5 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-12 17:18   ` Darrick J. Wong
  2025-08-10 13:41 ` [PATCH v4 08/11] generic: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

Stress file with atomic writes to ensure we excercise codepaths
where we are mixing different FS operations with atomic writes

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1229     | 68 ++++++++++++++++++++++++++++++++++++++++++
 tests/generic/1229.out |  2 ++
 2 files changed, 70 insertions(+)
 create mode 100755 tests/generic/1229
 create mode 100644 tests/generic/1229.out

diff --git a/tests/generic/1229 b/tests/generic/1229
new file mode 100755
index 00000000..7fa57105
--- /dev/null
+++ b/tests/generic/1229
@@ -0,0 +1,68 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1229
+#
+# fuzz fsx with atomic writes
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest rw auto quick atomicwrites
+
+_require_odirect
+_require_scratch_write_atomic
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount  >> $seqres.full 2>&1
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+
+set_fsx_avoid() {
+	local file=$1
+
+	case "$FSTYP" in
+	"ext4")
+		local dev=$(findmnt -n -o SOURCE --target $testfile)
+
+		# fsx insert/collpase range support for ext4+bigalloc is
+		# currently broken, so disable it. Also disable incase we can't
+		# detect bigalloc to be on safer side.
+		if [ -z "$DUMPE2FS_PROG" ]; then
+			echo "dumpe2fs not found, disabling insert/collapse range" >> $seqres.full
+			FSX_AVOID+=" -I -C"
+			return
+		fi
+
+		$DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
+			echo "fsx insert/collapse range not supported with bigalloc. Disabling.." >> $seqres.full
+			FSX_AVOID+=" -I -C"
+		}
+		;;
+	*)
+		;;
+	esac
+}
+
+# fsx usage:
+#
+# -N numops: total # operations to do
+# -l flen: the upper bound on file size
+# -o oplen: the upper bound on operation size (64k default)
+# -Z: O_DIRECT ()
+
+set_fsx_avoid
+_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID  >> $seqres.full
+if [[ "$?" != "0" ]]
+then
+	_fail "fsx returned error: $?"
+fi
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1229.out b/tests/generic/1229.out
new file mode 100644
index 00000000..737d61c6
--- /dev/null
+++ b/tests/generic/1229.out
@@ -0,0 +1,2 @@
+QA output created by 1229
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 08/11] generic: Add sudden shutdown tests for multi block atomic writes
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (6 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-08-10 13:41 ` Ojaswin Mujoo
  2025-08-10 13:42 ` [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:41 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

This test is intended to ensure that multi blocks atomic writes
maintain atomic guarantees across sudden FS shutdowns.

The way we work is that we lay out a file with random mix of written,
unwritten and hole extents. Then we start performing atomic writes
sequentially on the file while we parallely shutdown the FS. Then we
note the last offset where the atomic write happened just before shut
down and then make sure blocks around it either have completely old
data or completely new data, ie the write was not torn during shutdown.

We repeat the same with completely written, completely unwritten and completely
empty file to ensure these cases are not torn either.  Finally, we have a
similar test for append atomic writes

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1230.out |   2 +
 2 files changed, 399 insertions(+)
 create mode 100755 tests/generic/1230
 create mode 100644 tests/generic/1230.out

diff --git a/tests/generic/1230 b/tests/generic/1230
new file mode 100755
index 00000000..cff5adc0
--- /dev/null
+++ b/tests/generic/1230
@@ -0,0 +1,397 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test No. 1230
+#
+# Test multi block atomic writes with sudden FS shutdowns to ensure
+# the FS is not tearing the write operation
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+echo "Awu max: $awu_max" >> $seqres.full
+
+num_blocks=$((awu_max / blksz))
+# keep initial value high for dry run. This will be
+# tweaked in dry_run() based on device write speed.
+filesize=$(( 10 * 1024 * 1024 * 1024 ))
+
+_cleanup() {
+	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
+	wait
+}
+
+atomic_write_loop() {
+	local off=0
+	local size=$awu_max
+	for ((i=0; i<$((filesize / $size )); i++)); do
+		# Due to sudden shutdown this can produce errors so just
+		# redirect them to seqres.full
+		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
+		echo "Written to offset: $off" >> $tmp.aw
+		off=$((off + $size))
+	done
+}
+
+# This test has the following flow:
+# 1. Start doing sequential atomic writes in bg, upto $filesize
+# 2. Sleep for 0.2s and shutdown the FS
+# 3. kill the atomic write process
+# 4. verify the writes were not torn
+#
+# We ideally want the shutdown to happen while an atomic write is ongoing
+# but this gets tricky since faster devices can actually finish the whole
+# atomic write loop before sleep 0.2s completes, resulting in the shutdown
+# happening after the write loop which is not what we want. A simple solution
+# to this is to increase $filesize so step 1 takes long enough but a big
+# $filesize leads to create_mixed_mappings() taking very long, which is not
+# ideal.
+#
+# Hence, use the dry_run function to figure out the rough device speed and set
+# $filesize accordingly.
+dry_run() {
+	echo >> $seqres.full
+	echo "# Estimating ideal filesize..." >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
+
+	filesize=$((bytes_written * 3))
+	echo "# Setting \$filesize=$filesize" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+}
+
+create_mixed_mappings() {
+	local file=$1
+	local size_bytes=$2
+
+	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
+	#Fill the file with alternate written and unwritten blocks
+	local off=0
+	local operations=("W" "U")
+
+	for ((i=0; i<$((size_bytes / blksz )); i++)); do
+		index=$(($i % ${#operations[@]}))
+		map="${operations[$index]}"
+
+		case "$map" in
+		    "W")
+			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
+			;;
+		    "U")
+			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
+			;;
+		esac
+		off=$((off + blksz))
+	done
+
+	sync $file
+}
+
+populate_expected_data() {
+	# create a dummy file with expected old data for different cases
+	create_mixed_mappings $testfile.exp_old_mixed $awu_max
+	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
+
+	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
+	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
+
+	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
+	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
+
+	# create a dummy file with expected new data
+	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
+	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
+}
+
+verify_data_blocks() {
+	local verify_start=$1
+	local verify_end=$2
+	local expected_data_old="$3"
+	local expected_data_new="$4"
+
+	echo >> $seqres.full
+	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
+
+	# After an atomic write, for every chunk we ensure that the underlying
+	# data is either the old data or new data as writes shouldn't get torn.
+	local off=$verify_start
+	while [[ "$off" -lt "$verify_end" ]]
+	do
+		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
+		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
+		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
+		then
+			echo "Checksum match failed at off: $off size: $awu_max"
+			echo "Expected contents: (Either of the 2 below):"
+			echo
+			echo "Expected old: "
+			echo "$expected_data_old"
+			echo
+			echo "Expected new: "
+			echo "$expected_data_new"
+			echo
+			echo "Actual contents: "
+			echo "$actual_data"
+
+			_fail
+		fi
+		echo -n "Check at offset $off suceeded! " >> $seqres.full
+		if [[ "$actual_data" == "$expected_data_new" ]]
+		then
+			echo "matched new" >> $seqres.full
+		elif [[ "$actual_data" == "$expected_data_old" ]]
+		then
+			echo "matched old" >> $seqres.full
+		fi
+		off=$(( off + awu_max ))
+	done
+}
+
+# test data integrity for file by shutting down in between atomic writes
+test_data_integrity() {
+	echo >> $seqres.full
+	echo "# Writing atomically to file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	if [[ -z $last_offset ]]
+	then
+		last_offset=0
+	fi
+
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+	# we want to verify all blocks around which the shutdown happended
+	verify_start=$(( last_offset - (awu_max * 5)))
+	if [[ $verify_start < 0 ]]
+	then
+		verify_start=0
+	fi
+
+	verify_end=$(( last_offset + (awu_max * 5)))
+	if [[ "$verify_end" -gt "$filesize" ]]
+	then
+		verify_end=$filesize
+	fi
+}
+
+# test data integrity for file wiht written and unwritten mappings
+test_data_integrity_mixed() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with mixed mappings" >> $seqres.full
+	create_mixed_mappings $testfile $filesize
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
+}
+
+# test data integrity for file with completely written mappings
+test_data_integrity_writ() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully written mapping" >> $seqres.full
+	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
+}
+
+# test data integrity for file with completely unwritten mappings
+test_data_integrity_unwrit() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
+	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+# test data integrity for file with no mappings
+test_data_integrity_hole() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with no mappings" >> $seqres.full
+	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+test_filesize_integrity() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Performing extending atomic writes over file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	if [[ -z $last_offset ]]
+	then
+		last_offset=0
+	fi
+
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+	local filesize=$(_get_filesize $testfile)
+	echo >> $seqres.full
+	echo "# Filesize after shutdown: $filesize" >> $seqres.full
+
+	# To confirm that the write went atomically, we check:
+	# 1. The last block should be a multiple of awu_max
+	# 2. The last block should be the completely new data
+
+	if (( $filesize % $awu_max ))
+	then
+		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
+	fi
+
+	verify_start=$(( filesize - (awu_max * 5)))
+	if [[ $verify_start < 0 ]]
+	then
+		verify_start=0
+	fi
+
+	local verify_end=$filesize
+
+	# Here the blocks should always match new data hence, for simplicity of
+	# code, just corrupt the $expected_data_old buffer so it never matches
+	local expected_data_old="POISON"
+	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
+}
+
+$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+dry_run
+
+echo >> $seqres.full
+echo "# Populating expected data buffers" >> $seqres.full
+populate_expected_data
+
+# Loop 20 times to shake out any races due to shutdown
+for ((iter=0; iter<20; iter++))
+do
+	echo >> $seqres.full
+	echo "------ Iteration $iter ------" >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
+	test_data_integrity_mixed
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
+	test_data_integrity_writ
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
+	test_data_integrity_unwrit
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
+	test_data_integrity_hole
+
+	echo >> $seqres.full
+	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
+	test_filesize_integrity
+done
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1230.out b/tests/generic/1230.out
new file mode 100644
index 00000000..d01f54ea
--- /dev/null
+++ b/tests/generic/1230.out
@@ -0,0 +1,2 @@
+QA output created by 1230
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (7 preceding siblings ...)
  2025-08-10 13:41 ` [PATCH v4 08/11] generic: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-08-10 13:42 ` Ojaswin Mujoo
  2025-08-12  8:08   ` John Garry
  2025-08-10 13:42 ` [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
  2025-08-10 13:42 ` [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:42 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

We brute force all possible blocksize & clustersize combinations on
a bigalloc filesystem for stressing atomic write using fio data crc
verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
that we never see the mix of data contents from different threads on
a given bsrange.

This test might do overlapping atomic writes but that should be okay
since overlapping parallel hardware atomic writes don't cause tearing as
long as io size is the same for all writes.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/061.out |   2 +
 2 files changed, 132 insertions(+)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out

diff --git a/tests/ext4/061 b/tests/ext4/061
new file mode 100755
index 00000000..a0e49249
--- /dev/null
+++ b/tests/ext4/061
@@ -0,0 +1,130 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 061
+#
+# Brute force all possible blocksize clustersize combination on a bigalloc
+# filesystem for stressing atomic write using fio data crc verifier. We run
+# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
+# we should never see the mix of data contents from different threads for any
+# given fio blocksize.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100*1024*1024))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_configs()
+{
+	create_fio_aw_config
+	create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+	[aio-dio-aw-verify]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	atomic=1
+	group_reporting=1
+
+	verify_only=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_write_sequence=0
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+	[aio-dio-aw]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	atomic=1
+
+	verify_state_save=0
+	verify=crc32c
+	do_verify=0
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	# cluster sizes above 16 x blocksize are experimental so avoid them
+	# Also, cap cluster size at 128kb to keep it reasonable for large
+	# blocks size
+	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+				touch $SCRATCH_MNT/f1
+				create_fio_configs
+
+				cat $fio_aw_config >> $seqres.full
+				echo >> $seqres.full
+				cat $fio_verify_config >> $seqres.full
+
+				$FIO_PROG $fio_aw_config >> $seqres.full
+				ret1=$?
+
+				$FIO_PROG $fio_verify_config >> $seqres.full
+				ret2=$?
+
+				_scratch_unmount
+
+				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/061.out b/tests/ext4/061.out
new file mode 100644
index 00000000..273be9e0
--- /dev/null
+++ b/tests/ext4/061.out
@@ -0,0 +1,2 @@
+QA output created by 061
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (8 preceding siblings ...)
  2025-08-10 13:42 ` [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-08-10 13:42 ` Ojaswin Mujoo
  2025-08-13 13:45   ` John Garry
  2025-08-10 13:42 ` [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
  10 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:42 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

Brute force all possible blocksize clustersize combination on a bigalloc
filesystem for stressing atomic write using fio data crc verifier. We run
multiple threads in parallel with each job writing to its own file. The
parallel jobs running on a constrained filesystem size ensure that we
stress the ext4 allocator to allocate contiguous extents.

This test might do overlapping atomic writes but that should be okay
since overlapping parallel hardware atomic writes don't cause tearing as
long as io size is the same for all writes.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/062     | 176 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/062.out |   2 +
 2 files changed, 178 insertions(+)
 create mode 100755 tests/ext4/062
 create mode 100644 tests/ext4/062.out

diff --git a/tests/ext4/062 b/tests/ext4/062
new file mode 100755
index 00000000..da5e1076
--- /dev/null
+++ b/tests/ext4/062
@@ -0,0 +1,176 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 062
+#
+# Brute force all possible blocksize clustersize combination on a bigalloc
+# filesystem for stressing atomic write using fio data crc verifier. We run
+# nproc * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. We also create 8 such parallel jobs to run on
+# a constrained filesystem size to stress the ext4 allocator to allocate
+# contiguous extents.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FSSIZE=$((360*1024*1024))
+FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_configs()
+{
+	create_fio_aw_config
+	create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+	[global]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=truncate
+	size=$((FSSIZE / 12))
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	atomic=1
+
+	verify_only=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_write_sequence=0
+
+	[verify-job1]
+	filename=$SCRATCH_MNT/testfile-job1
+
+	[verify-job2]
+	filename=$SCRATCH_MNT/testfile-job2
+
+	[verify-job3]
+	filename=$SCRATCH_MNT/testfile-job3
+
+	[verify-job4]
+	filename=$SCRATCH_MNT/testfile-job4
+
+	[verify-job5]
+	filename=$SCRATCH_MNT/testfile-job5
+
+	[verify-job6]
+	filename=$SCRATCH_MNT/testfile-job6
+
+	[verify-job7]
+	filename=$SCRATCH_MNT/testfile-job7
+
+	[verify-job8]
+	filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+	[global]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=truncate
+	size=$((FSSIZE / 12))
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	atomic=1
+
+	verify_state_save=0
+	verify=crc32c
+	do_verify=0
+
+	[write-job1]
+	filename=$SCRATCH_MNT/testfile-job1
+
+	[write-job2]
+	filename=$SCRATCH_MNT/testfile-job2
+
+	[write-job3]
+	filename=$SCRATCH_MNT/testfile-job3
+
+	[write-job4]
+	filename=$SCRATCH_MNT/testfile-job4
+
+	[write-job5]
+	filename=$SCRATCH_MNT/testfile-job5
+
+	[write-job6]
+	filename=$SCRATCH_MNT/testfile-job6
+
+	[write-job7]
+	filename=$SCRATCH_MNT/testfile-job7
+
+	[write-job8]
+	filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	# cluster sizes above 16 x blocksize are experimental so avoid them
+	# Also, cap cluster size at 128kb to keep it reasonable for large
+	# blocks size cases.
+	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+				touch $SCRATCH_MNT/f1
+				create_fio_configs
+
+				cat $fio_aw_config >> $seqres.full
+				cat $fio_verify_config >> $seqres.full
+
+				$FIO_PROG $fio_aw_config >> $seqres.full
+				ret1=$?
+
+				$FIO_PROG $fio_verify_config  >> $seqres.full
+				ret2=$?
+
+				_scratch_unmount
+
+				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/062.out b/tests/ext4/062.out
new file mode 100644
index 00000000..a1578f48
--- /dev/null
+++ b/tests/ext4/062.out
@@ -0,0 +1,2 @@
+QA output created by 062
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
                   ` (9 preceding siblings ...)
  2025-08-10 13:42 ` [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-08-10 13:42 ` Ojaswin Mujoo
  2025-08-12 17:19   ` Darrick J. Wong
  2025-08-13 13:54   ` John Garry
  10 siblings, 2 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 13:42 UTC (permalink / raw)
  To: Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
	linux-kernel, linux-ext4

In ext4, even if an allocated range is physically and logically
contiguous, it can still be split into 2 extents. This is because ext4
does not merge extents across leaf nodes. This is an issue for atomic
writes since even for a continuous extent the map block could (in rare
cases) return a shorter map, hence tearning the write. This test creates
such a file and ensures that the atomic write handles this case
correctly

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/063.out |   2 +
 2 files changed, 131 insertions(+)
 create mode 100755 tests/ext4/063
 create mode 100644 tests/ext4/063.out

diff --git a/tests/ext4/063 b/tests/ext4/063
new file mode 100755
index 00000000..40867acb
--- /dev/null
+++ b/tests/ext4/063
@@ -0,0 +1,129 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# In ext4, even if an allocated range is physically and logically contiguous,
+# it can still be split into 2 extents. This is because ext4 does not merge
+# extents across leaf nodes. This is an issue for atomic writes since even for
+# a continuous extent the map block could (in rare cases) return a shorter map,
+# hence tearning the write. This test creates such a file and ensures that the
+# atomic write handles this case correctly
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_command "$DEBUGFS_PROG" debugfs
+
+prep() {
+	local bs=`_get_block_size $SCRATCH_MNT`
+	local ex_hdr_bytes=12
+	local ex_entry_bytes=12
+	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
+
+	# fill the extent tree leaf with bs len extents at alternate offsets.
+	# The tree should look as follows
+	#
+	#                    +---------+---------+
+	#                    | index 1 | index 2 |
+	#                    +-----+---+-----+---+
+	#                   +------+         +-----------+
+	#                   |                            |
+	#      +-------+-------+---+---------+     +-----+----+
+	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
+	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
+	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
+	#      +-------+-------+---+---------+     +----------+
+	#
+	for i in $(seq 0 $entries_per_blk)
+	do
+		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
+	done
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
+	# Since this is a new FS the allocator would find continuous blocks
+	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
+	# contiguous. However, since we dont merge extents across leaf we will
+	# end up with a tree as:
+	#
+	#                    +---------+---------+
+	#                    | index 1 | index 2 |
+	#                    +-----+---+-----+---+
+	#                   +------+         +------------+
+	#                   |                             |
+	#      +-------+-------+---+---------+     +------+-----------+
+	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
+	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
+	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
+	#      +-------+-------+---+---------+     +------------------+
+	#
+	echo >> $seqres.full
+	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
+	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
+	sync $testfile
+
+	echo >> $seqres.full
+	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
+	echo "...">> $seqres.full
+	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+	_scratch_cycle_mount
+}
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+awu_max=$(_get_atomic_write_unit_max $testfile)
+
+echo >> $seqres.full
+echo "# Prepping the file" >> $seqres.full
+prep
+
+torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
+
+echo >> $seqres.full
+echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
+echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+
+echo >> $seqres.full
+echo "Extent state after atomic write:">> $seqres.full
+echo "...">> $seqres.full
+$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+echo >> $seqres.full
+echo "# Checking data integrity" >> $seqres.full
+
+# create a dummy file with expected data
+$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
+expected_data=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp)
+
+# We ensure that the data after atomic writes should match the expected data
+actual_data=$(od -An -t x1 -j $torn_aw_offset -N $awu_max $testfile)
+if [[ "$actual_data" != "$expected_data" ]]
+then
+	echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
+	echo
+	echo "Expected: "
+	echo "$expected_data"
+	echo
+	echo "Actual contents: "
+	echo "$actual_data"
+
+	_fail
+fi
+
+echo -n "Data verification at offset $torn_aw_offset suceeded!" >> $seqres.full
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/063.out b/tests/ext4/063.out
new file mode 100644
index 00000000..de35fc52
--- /dev/null
+++ b/tests/ext4/063.out
@@ -0,0 +1,2 @@
+QA output created by 063
+Silence is golden
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests
  2025-08-10 13:41 ` [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-08-11 15:29   ` Darrick J. Wong
  0 siblings, 0 replies; 38+ messages in thread
From: Darrick J. Wong @ 2025-08-11 15:29 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, Aug 10, 2025 at 07:11:57PM +0530, Ojaswin Mujoo wrote:
> This adds various atomic write multi-fsblock stresst tests
> with mixed mappings and O_SYNC, to ensure the data and metadata
> is atomically persisted even if there is a shutdown.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Looks fine to me now.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  tests/generic/1228     | 137 +++++++++++++++++++++++++++++++++++++++++
>  tests/generic/1228.out |   2 +
>  2 files changed, 139 insertions(+)
>  create mode 100755 tests/generic/1228
>  create mode 100644 tests/generic/1228.out
> 
> diff --git a/tests/generic/1228 b/tests/generic/1228
> new file mode 100755
> index 00000000..888599ce
> --- /dev/null
> +++ b/tests/generic/1228
> @@ -0,0 +1,137 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1228
> +#
> +# Atomic write multi-fsblock data integrity tests with mixed mappings
> +# and O_SYNC
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_scratch_shutdown
> +_require_xfs_io_command "truncate"
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +check_data_integrity() {
> +	actual=$(_hexdump $testfile)
> +	if [[ "$expected" != "$actual" ]]
> +	then
> +		echo "Integrity check failed"
> +		echo "Integrity check failed" >> $seqres.full
> +		echo "# Expected file contents:" >> $seqres.full
> +		echo "$expected" >> $seqres.full
> +		echo "# Actual file contents:" >> $seqres.full
> +		echo "$actual" >> $seqres.full
> +
> +		_fail "Data integrity check failed. The atomic write was torn."
> +	fi
> +}
> +
> +prep_mixed_mapping() {
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +	local off=0
> +	local mapping=""
> +
> +	local operations=("W" "H" "U")
> +	local num_blocks=$((awu_max / blksz))
> +	for ((i=0; i<num_blocks; i++)); do
> +		local index=$((RANDOM % ${#operations[@]}))
> +		local map="${operations[$index]}"
> +		local mapping="${mapping}${map}"
> +
> +		case "$map" in
> +			"W")
> +				$XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> +				;;
> +			"H")
> +				# No operation needed for hole
> +				;;
> +			"U")
> +				$XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
> +				;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	echo "+ + Mixed mapping prep done. Full mapping pattern: $mapping" >> $seqres.full
> +
> +	sync $testfile
> +}
> +
> +verify_atomic_write() {
> +	test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
> +	check_data_integrity
> +}
> +
> +mixed_mapping_test() {
> +	prep_mixed_mapping
> +
> +	echo -"+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
> +	if [[ "$1" == "shutdown" ]]
> +	then
> +		bytes_written=$($XFS_IO_PROG -x -dc \
> +				"pwrite -DA -V1 -b $awu_max 0 $awu_max" \
> +				-c "shutdown" $testfile | grep wrote | \
> +				awk -F'[/ ]' '{print $2}')
> +		_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed"
> +	else
> +		bytes_written=$($XFS_IO_PROG -dc \
> +				"pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> +				grep wrote | awk -F'[/ ]' '{print $2}')
> +	fi
> +
> +	verify_atomic_write
> +}
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +
> +# Create an expected pattern to compare with
> +$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
> +expected=$(_hexdump $testfile)
> +echo "# Expected file contents:" >> $seqres.full
> +echo "$expected" >> $seqres.full
> +echo >> $seqres.full
> +
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping:" >> $seqres.full
> +echo >> $seqres.full
> +for ((iteration=1; iteration<=10; iteration++)); do
> +	echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
> +
> +	echo "+ Testing without shutdown..." >> $seqres.full
> +	mixed_mapping_test
> +	echo "Passed!" >> $seqres.full
> +
> +	echo "+ Testing with sudden shutdown..." >> $seqres.full
> +	mixed_mapping_test "shutdown"
> +	echo "Passed!" >> $seqres.full
> +
> +	echo "Iteration $iteration completed: OK" >> $seqres.full
> +	echo >> $seqres.full
> +done
> +echo "# Test 1: Do O_SYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
> +
> +
> +echo >> $seqres.full
> +echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
> +bytes_written=$($XFS_IO_PROG -x -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" \
> +		-c "shutdown" $testfile | grep wrote | awk -F'[/ ]' '{print $2}')
> +_scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed"
> +verify_atomic_write
> +echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> +
> diff --git a/tests/generic/1228.out b/tests/generic/1228.out
> new file mode 100644
> index 00000000..1baffa91
> --- /dev/null
> +++ b/tests/generic/1228.out
> @@ -0,0 +1,2 @@
> +QA output created by 1228
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
  2025-08-10 13:42 ` [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-08-12  8:08   ` John Garry
  2025-08-13  7:08     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-12  8:08 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
	linux-ext4

On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> We brute force all possible blocksize & clustersize combinations on
> a bigalloc filesystem for stressing atomic write using fio data crc
> verifier. 

you seem to run mkfs per block size. Why not just mkfs for largest 
blocksize once, which will support all block sizes?

> We run nproc * $LOAD_FACTOR threads in parallel writing to
> a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> that we never see the mix of data contents from different threads on
> a given bsrange.
> 
> This test might do overlapping atomic writes but that should be okay
> since overlapping parallel hardware atomic writes don't cause tearing as
> long as io size is the same for all writes.

Please mention that serializing racing writes is not guaranteed for 
RWF_ATOMIC, and that NVMe and SCSI provide this guarantee as an 
inseparable feature to power-fail atomicity.

Please also mention that the value is that we test that we split no bios 
or only generate a single bio per write syscall.

> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>   tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
>   tests/ext4/061.out |   2 +
>   2 files changed, 132 insertions(+)
>   create mode 100755 tests/ext4/061
>   create mode 100644 tests/ext4/061.out
> 
> diff --git a/tests/ext4/061 b/tests/ext4/061
> new file mode 100755
> index 00000000..a0e49249
> --- /dev/null
> +++ b/tests/ext4/061
> @@ -0,0 +1,130 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> +# we should never see the mix of data contents from different threads for any
> +# given fio blocksize.
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_aiodio

do you require fio with a certain version somewhere?

> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((100*1024*1024))
> +fiobsize=4096
> +
> +# Calculate fsblocksize as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(_max 4096 "$bdev_awu_min")
> +
> +function create_fio_configs()
> +{
> +	create_fio_aw_config
> +	create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> +	[aio-dio-aw-verify]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite

it prob makes sense to just have read, but I guess with verify_only=1 
that this makes no difference

> +	bs=$fiobsize
> +	fallocate=native
> +	filename=$SCRATCH_MNT/test-file
> +	size=$SIZE
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	atomic=1
> +	group_reporting=1
> +
> +	verify_only=1
> +	verify_state_save=0
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> +	[aio-dio-aw]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite
> +	bs=$fiobsize
> +	fallocate=native
> +	filename=$SCRATCH_MNT/test-file
> +	size=$SIZE
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	group_reporting=1
> +	atomic=1
> +
> +	verify_state_save=0
> +	verify=crc32c
> +	do_verify=0
> +
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> +	# cluster sizes above 16 x blocksize are experimental so avoid them
> +	# Also, cap cluster size at 128kb to keep it reasonable for large
> +	# blocks size
> +	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> +
> +	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"

this is quite heavy indentation. Maybe the below steps can be put into a 
separate routine (to make the code more readable).


> +			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
> +			if _try_scratch_mount >> $seqres.full 2>&1; then
> +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +
> +				touch $SCRATCH_MNT/f1
> +				create_fio_configs
> +
> +				cat $fio_aw_config >> $seqres.full
> +				echo >> $seqres.full
> +				cat $fio_verify_config >> $seqres.full
> +
> +				$FIO_PROG $fio_aw_config >> $seqres.full
> +				ret1=$?
> +
> +				$FIO_PROG $fio_verify_config >> $seqres.full
> +				ret2=$?
> +
> +				_scratch_unmount
> +
> +				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> +			fi
> +		done
> +	done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> new file mode 100644
> index 00000000..273be9e0
> --- /dev/null
> +++ b/tests/ext4/061.out
> @@ -0,0 +1,2 @@
> +QA output created by 061
> +Silence is golden


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-10 13:41 ` [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-08-12 17:16   ` Darrick J. Wong
  2025-08-13 13:39   ` John Garry
  1 sibling, 0 replies; 38+ messages in thread
From: Darrick J. Wong @ 2025-08-12 17:16 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, Aug 10, 2025 at 07:11:55PM +0530, Ojaswin Mujoo wrote:
> This adds atomic write test using fio based on it's crc check verifier.
> fio adds a crc for each data block. If the underlying device supports
> atomic write then it is guaranteed that we will never have a mix data from
> two threads writing on the same physical block.
> 
> Avoid doing overlapping parallel atomic writes because it might give
> unexpected results. Use offset_increment=, size= fio options to achieve
> this behavior.
> 
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

John had the most opinions about the last iteration of this test, but it
looks reasonable enough to me...

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  tests/generic/1226     | 107 +++++++++++++++++++++++++++++++++++++++++
>  tests/generic/1226.out |   2 +
>  2 files changed, 109 insertions(+)
>  create mode 100755 tests/generic/1226
>  create mode 100644 tests/generic/1226.out
> 
> diff --git a/tests/generic/1226 b/tests/generic/1226
> new file mode 100755
> index 00000000..efc360e1
> --- /dev/null
> +++ b/tests/generic/1226
> @@ -0,0 +1,107 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1226
> +#
> +# Validate FS atomic write using fio crc check verifier.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_odirect
> +_require_aio
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +_require_xfs_io_command "falloc"
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +
> +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> +threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
> +filesize=$((blocksize * threads * 100))
> +depth=$threads
> +io_size=$((filesize / threads))
> +io_inc=$io_size
> +testfile=$SCRATCH_MNT/test-file
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +
> +function create_fio_configs()
> +{
> +	create_fio_aw_config
> +	create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> +	[verify-job]
> +	direct=1
> +	ioengine=libaio
> +	rw=read
> +	bs=$blocksize
> +	filename=$testfile
> +	size=$filesize
> +	iodepth=$depth
> +	group_reporting=1
> +
> +	verify_only=1
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_state_save=0
> +	verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> +	[atomicwrite-job]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite
> +	bs=$blocksize
> +	filename=$testfile
> +	size=$io_inc
> +	offset_increment=$io_inc
> +	iodepth=$depth
> +	numjobs=$threads
> +	group_reporting=1
> +	atomic=1
> +
> +	verify_state_save=0
> +	verify=crc32c
> +	do_verify=0
> +EOF
> +}
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +$XFS_IO_PROG -fc "falloc 0 $filesize" $testfile >> $seqres.full
> +
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +ret1=$?
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +ret2=$?
> +
> +[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1226.out b/tests/generic/1226.out
> new file mode 100644
> index 00000000..6dce0ea5
> --- /dev/null
> +++ b/tests/generic/1226.out
> @@ -0,0 +1,2 @@
> +QA output created by 1226
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings
  2025-08-10 13:41 ` [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-08-12 17:16   ` Darrick J. Wong
  0 siblings, 0 replies; 38+ messages in thread
From: Darrick J. Wong @ 2025-08-12 17:16 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, Aug 10, 2025 at 07:11:56PM +0530, Ojaswin Mujoo wrote:
> This tests uses fio to first create a file with mixed mappings. Then it
> does atomic writes using aio dio with parallel jobs to the same file with
> mixed mappings. This forces the filesystem allocator to allocate extents
> over mixed mapping regions to stress FS block allocators.
> 
> Avoid doing overlapping parallel atomic writes because it might give
> unexpected results. Use offset_increment=, size= fio options to achieve
> this behavior.
> 
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Looks ok still...
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  tests/generic/1227     | 131 +++++++++++++++++++++++++++++++++++++++++
>  tests/generic/1227.out |   2 +
>  2 files changed, 133 insertions(+)
>  create mode 100755 tests/generic/1227
>  create mode 100644 tests/generic/1227.out
> 
> diff --git a/tests/generic/1227 b/tests/generic/1227
> new file mode 100755
> index 00000000..7423e67c
> --- /dev/null
> +++ b/tests/generic/1227
> @@ -0,0 +1,131 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1227
> +#
> +# Validate FS atomic write using fio crc check verifier on mixed mappings
> +# of a file.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_odirect
> +_require_aio
> +_require_xfs_io_command "truncate"
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +
> +aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
> +fsbsize=$(_get_block_size $SCRATCH_MNT)
> +
> +threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
> +filesize=$((aw_bsize * threads * 100))
> +depth=$threads
> +aw_io_size=$((filesize / threads))
> +aw_io_inc=$aw_io_size
> +testfile=$SCRATCH_MNT/test-file
> +
> +fio_prep_config=$tmp.prep.fio
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +cat >$fio_prep_config <<EOF
> +# prep file to have mixed mappings
> +[global]
> +ioengine=libaio
> +filename=$testfile
> +size=$filesize
> +bs=$fsbsize
> +direct=1
> +iodepth=$depth
> +group_reporting=1
> +
> +# Create written extents
> +[prep_written_blocks]
> +ioengine=libaio
> +rw=randwrite
> +io_size=$((filesize/3))
> +random_generator=lfsr
> +
> +# Create unwritten extents
> +[prep_unwritten_blocks]
> +ioengine=falloc
> +rw=randwrite
> +io_size=$((filesize/3))
> +random_generator=lfsr
> +EOF
> +
> +cat >$fio_aw_config <<EOF
> +# atomic write to mixed mappings of written/unwritten/holes
> +[atomic_write_job]
> +ioengine=libaio
> +rw=randwrite
> +direct=1
> +atomic=1
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$testfile
> +bs=$aw_bsize
> +size=$aw_io_size
> +offset_increment=$aw_io_inc
> +iodepth=$depth
> +numjobs=$threads
> +
> +verify_state_save=0
> +verify=crc32c
> +do_verify=0
> +EOF
> +
> +cat >$fio_verify_config <<EOF
> +# verify atomic writes done by previous job
> +[verify_job]
> +ioengine=libaio
> +rw=read
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$testfile
> +size=$filesize
> +bs=$aw_bsize
> +iodepth=$depth
> +
> +verify_state_save=0
> +verify_only=1
> +verify=crc32c
> +verify_fatal=1
> +verify_write_sequence=0
> +EOF
> +
> +_require_fio $fio_aw_config
> +_require_fio $fio_verify_config
> +
> +cat $fio_prep_config >> $seqres.full
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +$XFS_IO_PROG -fc "truncate $filesize" $testfile >> $seqres.full
> +
> +#prepare file with mixed mappings
> +$FIO_PROG $fio_prep_config >> $seqres.full
> +
> +# do atomic writes without verifying
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +
> +# verify data is not torn
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1227.out b/tests/generic/1227.out
> new file mode 100644
> index 00000000..2605d062
> --- /dev/null
> +++ b/tests/generic/1227.out
> @@ -0,0 +1,2 @@
> +QA output created by 1227
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled
  2025-08-10 13:41 ` [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-08-12 17:18   ` Darrick J. Wong
  2025-08-13  5:45     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: Darrick J. Wong @ 2025-08-12 17:18 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, Aug 10, 2025 at 07:11:58PM +0530, Ojaswin Mujoo wrote:
> Stress file with atomic writes to ensure we excercise codepaths
> where we are mixing different FS operations with atomic writes
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Didn't I already tag this
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  tests/generic/1229     | 68 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/1229.out |  2 ++
>  2 files changed, 70 insertions(+)
>  create mode 100755 tests/generic/1229
>  create mode 100644 tests/generic/1229.out
> 
> diff --git a/tests/generic/1229 b/tests/generic/1229
> new file mode 100755
> index 00000000..7fa57105
> --- /dev/null
> +++ b/tests/generic/1229
> @@ -0,0 +1,68 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1229
> +#
> +# fuzz fsx with atomic writes
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest rw auto quick atomicwrites
> +
> +_require_odirect
> +_require_scratch_write_atomic
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount  >> $seqres.full 2>&1
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> +
> +set_fsx_avoid() {
> +	local file=$1
> +
> +	case "$FSTYP" in
> +	"ext4")
> +		local dev=$(findmnt -n -o SOURCE --target $testfile)
> +
> +		# fsx insert/collpase range support for ext4+bigalloc is
> +		# currently broken, so disable it. Also disable incase we can't
> +		# detect bigalloc to be on safer side.
> +		if [ -z "$DUMPE2FS_PROG" ]; then
> +			echo "dumpe2fs not found, disabling insert/collapse range" >> $seqres.full
> +			FSX_AVOID+=" -I -C"
> +			return
> +		fi
> +
> +		$DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
> +			echo "fsx insert/collapse range not supported with bigalloc. Disabling.." >> $seqres.full
> +			FSX_AVOID+=" -I -C"
> +		}
> +		;;
> +	*)
> +		;;
> +	esac
> +}
> +
> +# fsx usage:
> +#
> +# -N numops: total # operations to do
> +# -l flen: the upper bound on file size
> +# -o oplen: the upper bound on operation size (64k default)
> +# -Z: O_DIRECT ()
> +
> +set_fsx_avoid
> +_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID  >> $seqres.full
> +if [[ "$?" != "0" ]]
> +then
> +	_fail "fsx returned error: $?"
> +fi
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/1229.out b/tests/generic/1229.out
> new file mode 100644
> index 00000000..737d61c6
> --- /dev/null
> +++ b/tests/generic/1229.out
> @@ -0,0 +1,2 @@
> +QA output created by 1229
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-10 13:42 ` [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-08-12 17:19   ` Darrick J. Wong
  2025-08-13  5:45     ` Ojaswin Mujoo
  2025-08-13 13:54   ` John Garry
  1 sibling, 1 reply; 38+ messages in thread
From: Darrick J. Wong @ 2025-08-12 17:19 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, Aug 10, 2025 at 07:12:02PM +0530, Ojaswin Mujoo wrote:
> In ext4, even if an allocated range is physically and logically
> contiguous, it can still be split into 2 extents. This is because ext4
> does not merge extents across leaf nodes. This is an issue for atomic
> writes since even for a continuous extent the map block could (in rare
> cases) return a shorter map, hence tearning the write. This test creates
> such a file and ensures that the atomic write handles this case
> correctly
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/063.out |   2 +
>  2 files changed, 131 insertions(+)
>  create mode 100755 tests/ext4/063
>  create mode 100644 tests/ext4/063.out
> 
> diff --git a/tests/ext4/063 b/tests/ext4/063
> new file mode 100755
> index 00000000..40867acb
> --- /dev/null
> +++ b/tests/ext4/063
> @@ -0,0 +1,129 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# In ext4, even if an allocated range is physically and logically contiguous,
> +# it can still be split into 2 extents. This is because ext4 does not merge
> +# extents across leaf nodes. This is an issue for atomic writes since even for
> +# a continuous extent the map block could (in rare cases) return a shorter map,
> +# hence tearning the write. This test creates such a file and ensures that the
> +# atomic write handles this case correctly
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_command "$DEBUGFS_PROG" debugfs
> +
> +prep() {
> +	local bs=`_get_block_size $SCRATCH_MNT`
> +	local ex_hdr_bytes=12
> +	local ex_entry_bytes=12
> +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> +
> +	# fill the extent tree leaf with bs len extents at alternate offsets.
> +	# The tree should look as follows
> +	#
> +	#                    +---------+---------+
> +	#                    | index 1 | index 2 |
> +	#                    +-----+---+-----+---+
> +	#                   +------+         +-----------+
> +	#                   |                            |
> +	#      +-------+-------+---+---------+     +-----+----+
> +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
> +	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
> +	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
> +	#      +-------+-------+---+---------+     +----------+
> +	#
> +	for i in $(seq 0 $entries_per_blk)
> +	do
> +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> +	done
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
> +	# Since this is a new FS the allocator would find continuous blocks
> +	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
> +	# contiguous. However, since we dont merge extents across leaf we will
> +	# end up with a tree as:
> +	#
> +	#                    +---------+---------+
> +	#                    | index 1 | index 2 |
> +	#                    +-----+---+-----+---+
> +	#                   +------+         +------------+
> +	#                   |                             |
> +	#      +-------+-------+---+---------+     +------+-----------+
> +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
> +	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
> +	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
> +	#      +-------+-------+---+---------+     +------------------+
> +	#

Thanks for the nice picture demonstrating what you're trying to test!
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> +	echo >> $seqres.full
> +	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> +	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +	_scratch_cycle_mount
> +}
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +
> +echo >> $seqres.full
> +echo "# Prepping the file" >> $seqres.full
> +prep
> +
> +torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
> +
> +echo >> $seqres.full
> +echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
> +echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
> +$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "Extent state after atomic write:">> $seqres.full
> +echo "...">> $seqres.full
> +$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Checking data integrity" >> $seqres.full
> +
> +# create a dummy file with expected data
> +$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
> +expected_data=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp)
> +
> +# We ensure that the data after atomic writes should match the expected data
> +actual_data=$(od -An -t x1 -j $torn_aw_offset -N $awu_max $testfile)
> +if [[ "$actual_data" != "$expected_data" ]]
> +then
> +	echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
> +	echo
> +	echo "Expected: "
> +	echo "$expected_data"
> +	echo
> +	echo "Actual contents: "
> +	echo "$actual_data"
> +
> +	_fail
> +fi
> +
> +echo -n "Data verification at offset $torn_aw_offset suceeded!" >> $seqres.full
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/ext4/063.out b/tests/ext4/063.out
> new file mode 100644
> index 00000000..de35fc52
> --- /dev/null
> +++ b/tests/ext4/063.out
> @@ -0,0 +1,2 @@
> +QA output created by 063
> +Silence is golden
> -- 
> 2.49.0
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled
  2025-08-12 17:18   ` Darrick J. Wong
@ 2025-08-13  5:45     ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-13  5:45 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Tue, Aug 12, 2025 at 10:18:55AM -0700, Darrick J. Wong wrote:
> On Sun, Aug 10, 2025 at 07:11:58PM +0530, Ojaswin Mujoo wrote:
> > Stress file with atomic writes to ensure we excercise codepaths
> > where we are mixing different FS operations with atomic writes
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> 
> Didn't I already tag this
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

Yes you did but since I moved the fsx avoid logic from common/rc to here
I just thought it'd be better to remove old reviews.

(also fyi, i also removed the reviews from g/1227 for the same reason)

Thanks for the review again! 

Regards,
ojaswin
> 
> --D
> 
> > ---
> >  tests/generic/1229     | 68 ++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/1229.out |  2 ++
> >  2 files changed, 70 insertions(+)
> >  create mode 100755 tests/generic/1229
> >  create mode 100644 tests/generic/1229.out
> > 
> > diff --git a/tests/generic/1229 b/tests/generic/1229
> > new file mode 100755
> > index 00000000..7fa57105
> > --- /dev/null
> > +++ b/tests/generic/1229
> > @@ -0,0 +1,68 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 1229
> > +#
> > +# fuzz fsx with atomic writes
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest rw auto quick atomicwrites
> > +
> > +_require_odirect
> > +_require_scratch_write_atomic
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount  >> $seqres.full 2>&1
> > +
> > +testfile=$SCRATCH_MNT/testfile
> > +touch $testfile
> > +
> > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > +blksz=$(_get_block_size $SCRATCH_MNT)
> > +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> > +
> > +set_fsx_avoid() {
> > +	local file=$1
> > +
> > +	case "$FSTYP" in
> > +	"ext4")
> > +		local dev=$(findmnt -n -o SOURCE --target $testfile)
> > +
> > +		# fsx insert/collpase range support for ext4+bigalloc is
> > +		# currently broken, so disable it. Also disable incase we can't
> > +		# detect bigalloc to be on safer side.
> > +		if [ -z "$DUMPE2FS_PROG" ]; then
> > +			echo "dumpe2fs not found, disabling insert/collapse range" >> $seqres.full
> > +			FSX_AVOID+=" -I -C"
> > +			return
> > +		fi
> > +
> > +		$DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
> > +			echo "fsx insert/collapse range not supported with bigalloc. Disabling.." >> $seqres.full
> > +			FSX_AVOID+=" -I -C"
> > +		}
> > +		;;
> > +	*)
> > +		;;
> > +	esac
> > +}
> > +
> > +# fsx usage:
> > +#
> > +# -N numops: total # operations to do
> > +# -l flen: the upper bound on file size
> > +# -o oplen: the upper bound on operation size (64k default)
> > +# -Z: O_DIRECT ()
> > +
> > +set_fsx_avoid
> > +_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID  >> $seqres.full
> > +if [[ "$?" != "0" ]]
> > +then
> > +	_fail "fsx returned error: $?"
> > +fi
> > +
> > +echo "Silence is golden"
> > +status=0
> > +exit
> > diff --git a/tests/generic/1229.out b/tests/generic/1229.out
> > new file mode 100644
> > index 00000000..737d61c6
> > --- /dev/null
> > +++ b/tests/generic/1229.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 1229
> > +Silence is golden
> > -- 
> > 2.49.0
> > 
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-12 17:19   ` Darrick J. Wong
@ 2025-08-13  5:45     ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-13  5:45 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Tue, Aug 12, 2025 at 10:19:35AM -0700, Darrick J. Wong wrote:
> On Sun, Aug 10, 2025 at 07:12:02PM +0530, Ojaswin Mujoo wrote:
> > In ext4, even if an allocated range is physically and logically
> > contiguous, it can still be split into 2 extents. This is because ext4
> > does not merge extents across leaf nodes. This is an issue for atomic
> > writes since even for a continuous extent the map block could (in rare
> > cases) return a shorter map, hence tearning the write. This test creates
> > such a file and ensures that the atomic write handles this case
> > correctly
> > 
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
> >  tests/ext4/063.out |   2 +
> >  2 files changed, 131 insertions(+)
> >  create mode 100755 tests/ext4/063
> >  create mode 100644 tests/ext4/063.out
> > 
> > diff --git a/tests/ext4/063 b/tests/ext4/063
> > new file mode 100755
> > index 00000000..40867acb
> > --- /dev/null
> > +++ b/tests/ext4/063
> > @@ -0,0 +1,129 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# In ext4, even if an allocated range is physically and logically contiguous,
> > +# it can still be split into 2 extents. This is because ext4 does not merge
> > +# extents across leaf nodes. This is an issue for atomic writes since even for
> > +# a continuous extent the map block could (in rare cases) return a shorter map,
> > +# hence tearning the write. This test creates such a file and ensures that the
> > +# atomic write handles this case correctly
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_command "$DEBUGFS_PROG" debugfs
> > +
> > +prep() {
> > +	local bs=`_get_block_size $SCRATCH_MNT`
> > +	local ex_hdr_bytes=12
> > +	local ex_entry_bytes=12
> > +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> > +
> > +	# fill the extent tree leaf with bs len extents at alternate offsets.
> > +	# The tree should look as follows
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +-----------+
> > +	#                   |                            |
> > +	#      +-------+-------+---+---------+     +-----+----+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
> > +	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
> > +	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
> > +	#      +-------+-------+---+---------+     +----------+
> > +	#
> > +	for i in $(seq 0 $entries_per_blk)
> > +	do
> > +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> > +	done
> > +	sync $testfile
> > +
> > +	echo >> $seqres.full
> > +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> > +	echo "...">> $seqres.full
> > +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
> > +	# Since this is a new FS the allocator would find continuous blocks
> > +	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
> > +	# contiguous. However, since we dont merge extents across leaf we will
> > +	# end up with a tree as:
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +------------+
> > +	#                   |                             |
> > +	#      +-------+-------+---+---------+     +------+-----------+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
> > +	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
> > +	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
> > +	#      +-------+-------+---+---------+     +------------------+
> > +	#
> 
> Thanks for the nice picture demonstrating what you're trying to test!
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

Sure, thanks for the suggestions and review!

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
  2025-08-12  8:08   ` John Garry
@ 2025-08-13  7:08     ` Ojaswin Mujoo
  2025-08-13  7:33       ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-13  7:08 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Tue, Aug 12, 2025 at 09:08:59AM +0100, John Garry wrote:
> On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > We brute force all possible blocksize & clustersize combinations on
> > a bigalloc filesystem for stressing atomic write using fio data crc
> > verifier.
> 
> you seem to run mkfs per block size. Why not just mkfs for largest blocksize
> once, which will support all block sizes?

We are just stressing all the possible combinations to shake out any
bugs. This is marked as stress so I feel the extra loops should be okay.

> 
> > We run nproc * $LOAD_FACTOR threads in parallel writing to
> > a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> > that we never see the mix of data contents from different threads on
> > a given bsrange.
> > 
> > This test might do overlapping atomic writes but that should be okay
> > since overlapping parallel hardware atomic writes don't cause tearing as
> > long as io size is the same for all writes.
> 
> Please mention that serializing racing writes is not guaranteed for
> RWF_ATOMIC, and that NVMe and SCSI provide this guarantee as an inseparable
> feature to power-fail atomicity.
> 
> Please also mention that the value is that we test that we split no bios or
> only generate a single bio per write syscall.

Got it, will do.
> 
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >   tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
> >   tests/ext4/061.out |   2 +
> >   2 files changed, 132 insertions(+)
> >   create mode 100755 tests/ext4/061
> >   create mode 100644 tests/ext4/061.out
> > 
> > diff --git a/tests/ext4/061 b/tests/ext4/061
> > new file mode 100755
> > index 00000000..a0e49249
> > --- /dev/null
> > +++ b/tests/ext4/061
> > @@ -0,0 +1,130 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 061
> > +#
> > +# Brute force all possible blocksize clustersize combination on a bigalloc
> > +# filesystem for stressing atomic write using fio data crc verifier. We run
> > +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> > +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> > +# we should never see the mix of data contents from different threads for any
> > +# given fio blocksize.
> > +#
> > +
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto rw stress atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_aiodio
> 
> do you require fio with a certain version somewhere?

Oh right you mentioned that atomic=1 was broken on some older fios.
Would you happen to know which version fixed it?

> 
> > +
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > +SIZE=$((100*1024*1024))
> > +fiobsize=4096
> > +
> > +# Calculate fsblocksize as per bdev atomic write units.
> > +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> > +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> > +fsblocksize=$(_max 4096 "$bdev_awu_min")
> > +
> > +function create_fio_configs()
> > +{
> > +	create_fio_aw_config
> > +	create_fio_verify_config
> > +}
> > +
> > +function create_fio_verify_config()
> > +{
> > +cat >$fio_verify_config <<EOF
> > +	[aio-dio-aw-verify]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=randwrite
> 
> it prob makes sense to just have read, but I guess with verify_only=1 that
> this makes no difference

Right, but I can change it in the next revision.

> 
> > +	bs=$fiobsize
> > +	fallocate=native
> > +	filename=$SCRATCH_MNT/test-file
> > +	size=$SIZE
> > +	iodepth=$FIO_LOAD
> > +	numjobs=$FIO_LOAD
> > +	atomic=1
> > +	group_reporting=1
> > +
> > +	verify_only=1
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	verify_fatal=1
> > +	verify_write_sequence=0
> > +EOF
> > +}
> > +
> > +function create_fio_aw_config()
> > +{
> > +cat >$fio_aw_config <<EOF
> > +	[aio-dio-aw]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=randwrite
> > +	bs=$fiobsize
> > +	fallocate=native
> > +	filename=$SCRATCH_MNT/test-file
> > +	size=$SIZE
> > +	iodepth=$FIO_LOAD
> > +	numjobs=$FIO_LOAD
> > +	group_reporting=1
> > +	atomic=1
> > +
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	do_verify=0
> > +
> > +EOF
> > +}
> > +
> > +# Let's create a sample fio config to check whether fio supports all options.
> > +fio_aw_config=$tmp.aw.fio
> > +fio_verify_config=$tmp.verify.fio
> > +fio_out=$tmp.fio.out
> > +
> > +create_fio_configs
> > +_require_fio $fio_aw_config
> > +
> > +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> > +	# cluster sizes above 16 x blocksize are experimental so avoid them
> > +	# Also, cap cluster size at 128kb to keep it reasonable for large
> > +	# blocks size
> > +	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> > +
> > +	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> > +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> > +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> 
> this is quite heavy indentation. Maybe the below steps can be put into a
> separate routine (to make the code more readable).

Got it.

> 
> 
> > +			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
> > +			if _try_scratch_mount >> $seqres.full 2>&1; then
> > +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> > +
> > +				touch $SCRATCH_MNT/f1
> > +				create_fio_configs
> > +
> > +				cat $fio_aw_config >> $seqres.full
> > +				echo >> $seqres.full
> > +				cat $fio_verify_config >> $seqres.full
> > +
> > +				$FIO_PROG $fio_aw_config >> $seqres.full
> > +				ret1=$?
> > +
> > +				$FIO_PROG $fio_verify_config >> $seqres.full
> > +				ret2=$?
> > +
> > +				_scratch_unmount
> > +
> > +				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> > +			fi
> > +		done
> > +	done
> > +done
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> > new file mode 100644
> > index 00000000..273be9e0
> > --- /dev/null
> > +++ b/tests/ext4/061.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 061
> > +Silence is golden
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
  2025-08-13  7:08     ` Ojaswin Mujoo
@ 2025-08-13  7:33       ` John Garry
  2025-08-21  8:29         ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-13  7:33 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On 13/08/2025 08:08, Ojaswin Mujoo wrote:
>>> +_begin_fstest auto rw stress atomicwrites
>>> +
>>> +_require_scratch_write_atomic
>>> +_require_aiodio
>> do you require fio with a certain version somewhere?
> Oh right you mentioned that atomic=1 was broken on some older fios.

It was not broken - it just did nothing. I suppose that they are the 
same thing.

> Would you happen to know which version fixed it?

fio 3.38

thanks,
John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 01/11] common/rc: Add _min() and _max() helpers
  2025-08-10 13:41 ` [PATCH v4 01/11] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
@ 2025-08-13 12:20   ` David Laight
  2025-08-21 10:35     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: David Laight @ 2025-08-13 12:20 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Sun, 10 Aug 2025 19:11:52 +0530
Ojaswin Mujoo <ojaswin@linux.ibm.com> wrote:

> Many programs open code these functionalities so add it as a generic helper
> in common/rc
> 
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  common/rc | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/common/rc b/common/rc
> index 96578d15..3858ddce 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5873,6 +5873,28 @@ _require_program() {
>  	_have_program "$1" || _notrun "$tag required"
>  }
>  
> +_min() {
> +	local ret
> +
> +	for arg in "$@"; do
> +		if [ -z "$ret" ] || (( $arg < $ret )); then
> +			ret="$arg"
> +		fi
> +	done
> +	echo $ret
> +}

Perhaps:
	local ret="$1"
	shift
	for arg in "$@"; do
		ret=$(((arg) < (ret) ? (arg) : (ret)))
	done;
	echo "$ret"
that should work for 'min 10 "2 + 3"' (with bash, but not dash).

	David

> +
> +_max() {
> +	local ret
> +
> +	for arg in "$@"; do
> +		if [ -z "$ret" ] || (( $arg > $ret )); then
> +			ret="$arg"
> +		fi
> +	done
> +	echo $ret
> +}
> +
>  ################################################################################
>  # make sure this script returns success
>  /bin/true


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-10 13:41 ` [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier Ojaswin Mujoo
  2025-08-12 17:16   ` Darrick J. Wong
@ 2025-08-13 13:39   ` John Garry
  2025-08-21  8:42     ` Ojaswin Mujoo
  1 sibling, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-13 13:39 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
	linux-ext4

On 10/08/2025 14:41, Ojaswin Mujoo wrote:
> This adds atomic write test using fio based on it's crc check verifier.
> fio adds a crc for each data block. If the underlying device supports
> atomic write then it is guaranteed that we will never have a mix data from
> two threads writing on the same physical block.
> 
> Avoid doing overlapping parallel atomic writes because it might give
> unexpected results. Use offset_increment=, size= fio options to achieve
> this behavior.
> 

You are not really describing what the test does.

In the first paragraph, you state what fio verify function does and then 
describe what RWF_ATOMIC means when we only use HW support, i.e. 
serialises. In the second you mention that we guarantee no inter-thread 
overlapping writes.

 From a glance at the code below, in this test each thread writes to a 
separate part of the file and then verifies no crc corruption. But even 
with atomic=0, I would expect no corruption here.

Thanks,
John

> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>   tests/generic/1226     | 107 +++++++++++++++++++++++++++++++++++++++++
>   tests/generic/1226.out |   2 +
>   2 files changed, 109 insertions(+)
>   create mode 100755 tests/generic/1226
>   create mode 100644 tests/generic/1226.out
> 
> diff --git a/tests/generic/1226 b/tests/generic/1226
> new file mode 100755
> index 00000000..efc360e1
> --- /dev/null
> +++ b/tests/generic/1226
> @@ -0,0 +1,107 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1226
> +#
> +# Validate FS atomic write using fio crc check verifier.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_odirect
> +_require_aio
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +_require_xfs_io_command "falloc"
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +
> +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> +threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
> +filesize=$((blocksize * threads * 100))
> +depth=$threads
> +io_size=$((filesize / threads))
> +io_inc=$io_size
> +testfile=$SCRATCH_MNT/test-file
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +
> +function create_fio_configs()
> +{
> +	create_fio_aw_config
> +	create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> +	[verify-job]
> +	direct=1
> +	ioengine=libaio
> +	rw=read
> +	bs=$blocksize
> +	filename=$testfile
> +	size=$filesize
> +	iodepth=$depth
> +	group_reporting=1
> +
> +	verify_only=1
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_state_save=0
> +	verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> +	[atomicwrite-job]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite
> +	bs=$blocksize
> +	filename=$testfile
> +	size=$io_inc
> +	offset_increment=$io_inc
> +	iodepth=$depth
> +	numjobs=$threads
> +	group_reporting=1
> +	atomic=1
> +
> +	verify_state_save=0
> +	verify=crc32c
> +	do_verify=0
> +EOF
> +}
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +$XFS_IO_PROG -fc "falloc 0 $filesize" $testfile >> $seqres.full
> +
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +ret1=$?
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +ret2=$?
> +
> +[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1226.out b/tests/generic/1226.out
> new file mode 100644
> index 00000000..6dce0ea5
> --- /dev/null
> +++ b/tests/generic/1226.out
> @@ -0,0 +1,2 @@
> +QA output created by 1226
> +Silence is golden


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx
  2025-08-10 13:41 ` [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-08-13 13:42   ` John Garry
  2025-08-21  9:45     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-13 13:42 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
	linux-ext4

On 10/08/2025 14:41, Ojaswin Mujoo wrote:
> Implement atomic write support to help fuzz atomic writes
> with fsx.
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

this looks ok, but I am not familiar with fsx.

BTW, you guarantee O_DIRECT with RWF_ATOMIC only, right?

Thanks,
John

> ---
>   ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 104 insertions(+), 5 deletions(-)
> 
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index 163b9453..ea39ca29 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -40,6 +40,7 @@
>   #include <liburing.h>
>   #endif
>   #include <sys/syscall.h>
> +#include "statx.h"
>   
>   #ifndef MAP_FILE
>   # define MAP_FILE 0
> @@ -49,6 +50,10 @@
>   #define RWF_DONTCACHE	0x80
>   #endif
>   
> +#ifndef RWF_ATOMIC
> +#define RWF_ATOMIC	0x40
> +#endif
> +
>   #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
>   
>   /* Operation flags (bitmask) */
> @@ -110,6 +115,7 @@ enum {
>   	OP_READ_DONTCACHE,
>   	OP_WRITE,
>   	OP_WRITE_DONTCACHE,
> +	OP_WRITE_ATOMIC,
>   	OP_MAPREAD,
>   	OP_MAPWRITE,
>   	OP_MAX_LITE,
> @@ -200,6 +206,11 @@ int	uring = 0;
>   int	mark_nr = 0;
>   int	dontcache_io = 1;
>   int	hugepages = 0;                  /* -h flag */
> +int	do_atomic_writes = 1;		/* -a flag disables */
> +
> +/* User for atomic writes */
> +int awu_min = 0;
> +int awu_max = 0;
>   
>   /* Stores info needed to periodically collapse hugepages */
>   struct hugepages_collapse_info {
> @@ -288,6 +299,7 @@ static const char *op_names[] = {
>   	[OP_READ_DONTCACHE] = "read_dontcache",
>   	[OP_WRITE] = "write",
>   	[OP_WRITE_DONTCACHE] = "write_dontcache",
> +	[OP_WRITE_ATOMIC] = "write_atomic",
>   	[OP_MAPREAD] = "mapread",
>   	[OP_MAPWRITE] = "mapwrite",
>   	[OP_TRUNCATE] = "truncate",
> @@ -422,6 +434,7 @@ logdump(void)
>   				prt("\t***RRRR***");
>   			break;
>   		case OP_WRITE_DONTCACHE:
> +		case OP_WRITE_ATOMIC:
>   		case OP_WRITE:
>   			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
>   			    lp->args[0], lp->args[0] + lp->args[1] - 1,
> @@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
>   	file_size = offset + size;
>   }
>   
> +static int is_power_of_2(unsigned n) {
> +	return ((n & (n - 1)) == 0);
> +}
> +
> +/*
> + * Round down n to nearest power of 2.
> + * If n is already a power of 2, return n;
> + */
> +static int rounddown_pow_of_2(int n) {
> +	int i = 0;
> +
> +	if (is_power_of_2(n))
> +		return n;
> +
> +	for (; (1 << i) < n; i++);
> +
> +	return 1 << (i - 1);
> +}
> +
>   void
>   dowrite(unsigned offset, unsigned size, int flags)
>   {
> @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
>   	offset -= offset % writebdy;
>   	if (o_direct)
>   		size -= size % writebdy;
> +	if (flags & RWF_ATOMIC) {
> +		/* atomic write len must be inbetween awu_min and awu_max */
> +		if (size < awu_min)
> +			size = awu_min;
> +		if (size > awu_max)
> +			size = awu_max;
> +
> +		/* atomic writes need power-of-2 sizes */
> +		size = rounddown_pow_of_2(size);
> +
> +		/* atomic writes need naturally aligned offsets */
> +		offset -= offset % size;
> +
> +		/* Skip the write if we are crossing max filesize */
> +		if ((offset + size) > maxfilelen) {
> +			if (!quiet && testcalls > simulatedopcount)
> +				prt("skipping atomic write past maxfilelen\n");
> +			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> +			return;
> +		}
> +	}
>   	if (size == 0) {
>   		if (!quiet && testcalls > simulatedopcount && !o_direct)
>   			prt("skipping zero size write\n");
> @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
>   		return;
>   	}
>   
> -	log4(OP_WRITE, offset, size, FL_NONE);
> +	if (flags & RWF_ATOMIC)
> +		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> +	else
> +		log4(OP_WRITE, offset, size, FL_NONE);
>   
>   	gendata(original_buf, good_buf, offset, size);
>   	if (offset + size > file_size) {
> @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
>   		       (monitorstart == -1 ||
>   			(offset + size > monitorstart &&
>   			(monitorend == -1 || offset <= monitorend))))))
> -		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> -		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> +		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> +		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> +		    (flags & RWF_ATOMIC) != 0);
>   	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
>   	if (iret != size) {
>   		if (iret == -1)
> @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
>   }
>   #endif
>   
> +int test_atomic_writes(void) {
> +	int ret;
> +	struct statx stx;
> +
> +	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
> +	if (ret < 0) {
> +		fprintf(stderr, "main: Statx failed with %d."
> +			" Failed to determine atomic write limits, "
> +			" disabling!\n", ret);
> +		return 0;
> +	}
> +
> +	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
> +	    stx.stx_atomic_write_unit_min > 0) {
> +		awu_min = stx.stx_atomic_write_unit_min;
> +		awu_max = stx.stx_atomic_write_unit_max;
> +		return 1;
> +	}
> +
> +	fprintf(stderr, "main: IO Stack does not support "
> +			"atomic writes, disabling!\n");
> +	return 0;
> +}
> +
>   #ifdef HAVE_COPY_FILE_RANGE
>   int
>   test_copy_range(void)
> @@ -2356,6 +2437,12 @@ have_op:
>   			goto out;
>   		}
>   		break;
> +	case OP_WRITE_ATOMIC:
> +		if (!do_atomic_writes) {
> +			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> +			goto out;
> +		}
> +		break;
>   	}
>   
>   	switch (op) {
> @@ -2385,6 +2472,11 @@ have_op:
>   			dowrite(offset, size, 0);
>   		break;
>   
> +	case OP_WRITE_ATOMIC:
> +		TRIM_OFF_LEN(offset, size, maxfilelen);
> +		dowrite(offset, size, RWF_ATOMIC);
> +		break;
> +
>   	case OP_MAPREAD:
>   		TRIM_OFF_LEN(offset, size, file_size);
>   		domapread(offset, size);
> @@ -2511,13 +2603,14 @@ void
>   usage(void)
>   {
>   	fprintf(stdout, "usage: %s",
> -		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
> +		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
>   	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
>   	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
>   	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
>   	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
>   	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
>   	   ... fname\n\
> +	-a: disable atomic writes\n\
>   	-b opnum: beginning operation number (default 1)\n\
>   	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
>   	-d: debug output for all operations\n\
> @@ -3059,9 +3152,13 @@ main(int argc, char **argv)
>   	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
>   
>   	while ((ch = getopt_long(argc, argv,
> -				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> +				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
>   				 longopts, NULL)) != EOF)
>   		switch (ch) {
> +		case 'a':
> +			prt("main(): Atomic writes disabled\n");
> +			do_atomic_writes = 0;
> +			break;
>   		case 'b':
>   			simulatedopcount = getnum(optarg, &endp);
>   			if (!quiet)
> @@ -3475,6 +3572,8 @@ main(int argc, char **argv)
>   		exchange_range_calls = test_exchange_range();
>   	if (dontcache_io)
>   		dontcache_io = test_dontcache_io();
> +	if (do_atomic_writes)
> +		do_atomic_writes = test_atomic_writes();
>   
>   	while (keep_running())
>   		if (!test())


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-08-10 13:42 ` [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-08-13 13:45   ` John Garry
  2025-08-21  8:28     ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-13 13:45 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
	linux-ext4

On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> 
> Brute force all possible blocksize clustersize combination on a bigalloc
> filesystem for stressing atomic write using fio data crc verifier. We run
> multiple threads in parallel with each job writing to its own file. The
> parallel jobs running on a constrained filesystem size ensure that we
> stress the ext4 allocator to allocate contiguous extents.
> 
> This test might do overlapping atomic writes but that should be okay
> since overlapping parallel hardware atomic writes don't cause tearing as
> long as io size is the same for all writes.
> 
> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> Reviewed-by: Darrick J. Wong<djwong@kernel.org>
> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> ---
>   tests/ext4/062     | 176 +++++++++++++++++++++++++++++++++++++++++++++
>   tests/ext4/062.out |   2 +
>   2 files changed, 178 insertions(+)
>   create mode 100755 tests/ext4/062
>   create mode 100644 tests/ext4/062.out

Is the only difference to 061 that we have multiple files (and not a 
single file)?

Thanks,
John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-10 13:42 ` [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
  2025-08-12 17:19   ` Darrick J. Wong
@ 2025-08-13 13:54   ` John Garry
  2025-08-21  8:25     ` Ojaswin Mujoo
  1 sibling, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-13 13:54 UTC (permalink / raw)
  To: Ojaswin Mujoo, Zorro Lang, fstests
  Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
	linux-ext4

On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> In ext4, even if an allocated range is physically and logically
> contiguous, it can still be split into 2 extents. This is because ext4
> does not merge extents across leaf nodes. This is an issue for atomic
> writes since even for a continuous extent the map block could (in rare
> cases) return a shorter map, hence tearning the write. This test creates
> such a file and ensures that the atomic write handles this case
> correctly
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>   tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
>   tests/ext4/063.out |   2 +
>   2 files changed, 131 insertions(+)
>   create mode 100755 tests/ext4/063
>   create mode 100644 tests/ext4/063.out
> 
> diff --git a/tests/ext4/063 b/tests/ext4/063
> new file mode 100755
> index 00000000..40867acb
> --- /dev/null
> +++ b/tests/ext4/063
> @@ -0,0 +1,129 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# In ext4, even if an allocated range is physically and logically contiguous,
> +# it can still be split into 2 extents. 

Nit: I assume that you mean "2 or more extents"

> +# This is because ext4 does not merge
> +# extents across leaf nodes. This is an issue for atomic writes since even for
> +# a continuous extent the map block could (in rare cases) return a shorter map,
> +# hence tearning the write. This test creates such a file and ensures that the

tearing

> +# atomic write handles this case correctly
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_command "$DEBUGFS_PROG" debugfs
> +
> +prep() {
> +	local bs=`_get_block_size $SCRATCH_MNT`
> +	local ex_hdr_bytes=12
> +	local ex_entry_bytes=12
> +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> +
> +	# fill the extent tree leaf with bs len extents at alternate offsets.
> +	# The tree should look as follows
> +	#
> +	#                    +---------+---------+
> +	#                    | index 1 | index 2 |
> +	#                    +-----+---+-----+---+
> +	#                   +------+         +-----------+
> +	#                   |                            |
> +	#      +-------+-------+---+---------+     +-----+----+
> +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
> +	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
> +	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
> +	#      +-------+-------+---+---------+     +----------+
> +	#
> +	for i in $(seq 0 $entries_per_blk)
> +	do
> +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> +	done
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
> +	# Since this is a new FS the allocator would find continuous blocks
> +	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
> +	# contiguous. However, since we dont merge extents across leaf we will

don't

> +	# end up with a tree as:
> +	#
> +	#                    +---------+---------+
> +	#                    | index 1 | index 2 |
> +	#                    +-----+---+-----+---+
> +	#                   +------+         +------------+
> +	#                   |                             |
> +	#      +-------+-------+---+---------+     +------+-----------+
> +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
> +	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
> +	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
> +	#      +-------+-------+---+---------+     +------------------+
> +	#
> +	echo >> $seqres.full
> +	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> +	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> +	sync $testfile
> +
> +	echo >> $seqres.full
> +	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> +	echo "...">> $seqres.full
> +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +	_scratch_cycle_mount
> +}
> +

Out of curiosity, for such a file with split extents, what would 
filefrag output look like? An example would be nice.

Thanks,
John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-13 13:54   ` John Garry
@ 2025-08-21  8:25     ` Ojaswin Mujoo
  2025-08-21  9:23       ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21  8:25 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 02:54:04PM +0100, John Garry wrote:
> On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> > In ext4, even if an allocated range is physically and logically
> > contiguous, it can still be split into 2 extents. This is because ext4
> > does not merge extents across leaf nodes. This is an issue for atomic
> > writes since even for a continuous extent the map block could (in rare
> > cases) return a shorter map, hence tearning the write. This test creates
> > such a file and ensures that the atomic write handles this case
> > correctly
> > 
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >   tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
> >   tests/ext4/063.out |   2 +
> >   2 files changed, 131 insertions(+)
> >   create mode 100755 tests/ext4/063
> >   create mode 100644 tests/ext4/063.out
> > 
> > diff --git a/tests/ext4/063 b/tests/ext4/063
> > new file mode 100755
> > index 00000000..40867acb
> > --- /dev/null
> > +++ b/tests/ext4/063
> > @@ -0,0 +1,129 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# In ext4, even if an allocated range is physically and logically contiguous,
> > +# it can still be split into 2 extents.
> 
> Nit: I assume that you mean "2 or more extents"
> 
> > +# This is because ext4 does not merge
> > +# extents across leaf nodes. This is an issue for atomic writes since even for
> > +# a continuous extent the map block could (in rare cases) return a shorter map,
> > +# hence tearning the write. This test creates such a file and ensures that the
> 
> tearing
> 
> > +# atomic write handles this case correctly
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_command "$DEBUGFS_PROG" debugfs
> > +
> > +prep() {
> > +	local bs=`_get_block_size $SCRATCH_MNT`
> > +	local ex_hdr_bytes=12
> > +	local ex_entry_bytes=12
> > +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> > +
> > +	# fill the extent tree leaf with bs len extents at alternate offsets.
> > +	# The tree should look as follows
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +-----------+
> > +	#                   |                            |
> > +	#      +-------+-------+---+---------+     +-----+----+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
> > +	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
> > +	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
> > +	#      +-------+-------+---+---------+     +----------+
> > +	#
> > +	for i in $(seq 0 $entries_per_blk)
> > +	do
> > +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> > +	done
> > +	sync $testfile
> > +
> > +	echo >> $seqres.full
> > +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> > +	echo "...">> $seqres.full
> > +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
> > +	# Since this is a new FS the allocator would find continuous blocks
> > +	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
> > +	# contiguous. However, since we dont merge extents across leaf we will
> 
> don't
> 
> > +	# end up with a tree as:
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +------------+
> > +	#                   |                             |
> > +	#      +-------+-------+---+---------+     +------+-----------+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
> > +	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
> > +	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
> > +	#      +-------+-------+---+---------+     +------------------+
> > +	#
> > +	echo >> $seqres.full
> > +	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> > +	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> > +	sync $testfile
> > +
> > +	echo >> $seqres.full
> > +	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> > +	echo "...">> $seqres.full
> > +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > +	_scratch_cycle_mount
> > +}
> > +
> 
> Out of curiosity, for such a file with split extents, what would filefrag
> output look like? An example would be nice.

Hey John thanks for the review. Sorry for the late reply i had a mini
vacation followed by lei suddenly not pulling emails :/

Anyways, yes I've added the $DEBUGFS command so we can observe the
extent structure, but the filefrag would look something like this (last
few extents):

 ...
 337:      674..     674:      10130..     10130:      1:
 338:      676..     676:      10132..     10132:      1:
 339:      678..     678:      10134..     10134:      1:
 340:      679..     680:      10135..     10136:      2:             last,eof

Notice that the last 2 extents are logically and physically continuous
but not merged.

Reards,
ojaswin

> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-08-13 13:45   ` John Garry
@ 2025-08-21  8:28     ` Ojaswin Mujoo
  2025-08-21  9:28       ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21  8:28 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 02:45:28PM +0100, John Garry wrote:
> On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> > 
> > Brute force all possible blocksize clustersize combination on a bigalloc
> > filesystem for stressing atomic write using fio data crc verifier. We run
> > multiple threads in parallel with each job writing to its own file. The
> > parallel jobs running on a constrained filesystem size ensure that we
> > stress the ext4 allocator to allocate contiguous extents.
> > 
> > This test might do overlapping atomic writes but that should be okay
> > since overlapping parallel hardware atomic writes don't cause tearing as
> > long as io size is the same for all writes.
> > 
> > Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> > Reviewed-by: Darrick J. Wong<djwong@kernel.org>
> > Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> > ---
> >   tests/ext4/062     | 176 +++++++++++++++++++++++++++++++++++++++++++++
> >   tests/ext4/062.out |   2 +
> >   2 files changed, 178 insertions(+)
> >   create mode 100755 tests/ext4/062
> >   create mode 100644 tests/ext4/062.out
> 
> Is the only difference to 061 that we have multiple files (and not a single
> file)?

Hey John,

Yes these 2 tests are similar however 061 uses fallocate=native +
_scratch_mkfs_ext4 to test whether atomic writes on preallocated file
via multiple threads works correctly.

The 062 uses fallocate=truncate + _scratch_mkfs_sized 360MB +
'multiple jobs each writing to a different file' to ensure we are
extensively stressing the allocation logic in low space scenarios.

Regards,
ojaswin
> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
  2025-08-13  7:33       ` John Garry
@ 2025-08-21  8:29         ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21  8:29 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 08:33:44AM +0100, John Garry wrote:
> On 13/08/2025 08:08, Ojaswin Mujoo wrote:
> > > > +_begin_fstest auto rw stress atomicwrites
> > > > +
> > > > +_require_scratch_write_atomic
> > > > +_require_aiodio
> > > do you require fio with a certain version somewhere?
> > Oh right you mentioned that atomic=1 was broken on some older fios.
> 
> It was not broken - it just did nothing. I suppose that they are the same
> thing.
> 
> > Would you happen to know which version fixed it?
> 
> fio 3.38

Thanks, I'll add the version check.
> 
> thanks,
> John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-13 13:39   ` John Garry
@ 2025-08-21  8:42     ` Ojaswin Mujoo
  2025-08-21  9:24       ` John Garry
  0 siblings, 1 reply; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21  8:42 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 02:39:40PM +0100, John Garry wrote:
> On 10/08/2025 14:41, Ojaswin Mujoo wrote:
> > This adds atomic write test using fio based on it's crc check verifier.
> > fio adds a crc for each data block. If the underlying device supports
> > atomic write then it is guaranteed that we will never have a mix data from
> > two threads writing on the same physical block.
> > 
> > Avoid doing overlapping parallel atomic writes because it might give
> > unexpected results. Use offset_increment=, size= fio options to achieve
> > this behavior.
> > 
> 
> You are not really describing what the test does.
> 
> In the first paragraph, you state what fio verify function does and then
> describe what RWF_ATOMIC means when we only use HW support, i.e. serialises.
> In the second you mention that we guarantee no inter-thread overlapping
> writes.

Got it John, I will add better commit messages for the fio tests.
> 
> From a glance at the code below, in this test each thread writes to a
> separate part of the file and then verifies no crc corruption. But even with
> atomic=0, I would expect no corruption here.

Right, this is mostly a stress test that is ensuring that all the new
atomic write code paths are not causing anything to break or
introducing any regressions. This should pass with both atomic or non
atomic writes but by using RWF_ATOMIC we excercise the atomic specific
code paths, improving the code coverage.

Regards,
ojaswin
> 
> Thanks,
> John
> 
> > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >   tests/generic/1226     | 107 +++++++++++++++++++++++++++++++++++++++++
> >   tests/generic/1226.out |   2 +
> >   2 files changed, 109 insertions(+)
> >   create mode 100755 tests/generic/1226
> >   create mode 100644 tests/generic/1226.out
> > 
> > diff --git a/tests/generic/1226 b/tests/generic/1226
> > new file mode 100755
> > index 00000000..efc360e1
> > --- /dev/null
> > +++ b/tests/generic/1226
> > @@ -0,0 +1,107 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 1226
> > +#
> > +# Validate FS atomic write using fio crc check verifier.
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto aio rw atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_odirect
> > +_require_aio
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount
> > +_require_xfs_io_command "falloc"
> > +
> > +touch "$SCRATCH_MNT/f1"
> > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > +
> > +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> > +threads=$(_min "$(($(nproc) * 2 * LOAD_FACTOR))" "100")
> > +filesize=$((blocksize * threads * 100))
> > +depth=$threads
> > +io_size=$((filesize / threads))
> > +io_inc=$io_size
> > +testfile=$SCRATCH_MNT/test-file
> > +
> > +fio_config=$tmp.fio
> > +fio_out=$tmp.fio.out
> > +
> > +fio_aw_config=$tmp.aw.fio
> > +fio_verify_config=$tmp.verify.fio
> > +
> > +function create_fio_configs()
> > +{
> > +	create_fio_aw_config
> > +	create_fio_verify_config
> > +}
> > +
> > +function create_fio_verify_config()
> > +{
> > +cat >$fio_verify_config <<EOF
> > +	[verify-job]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=read
> > +	bs=$blocksize
> > +	filename=$testfile
> > +	size=$filesize
> > +	iodepth=$depth
> > +	group_reporting=1
> > +
> > +	verify_only=1
> > +	verify=crc32c
> > +	verify_fatal=1
> > +	verify_state_save=0
> > +	verify_write_sequence=0
> > +EOF
> > +}
> > +
> > +function create_fio_aw_config()
> > +{
> > +cat >$fio_aw_config <<EOF
> > +	[atomicwrite-job]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=randwrite
> > +	bs=$blocksize
> > +	filename=$testfile
> > +	size=$io_inc
> > +	offset_increment=$io_inc
> > +	iodepth=$depth
> > +	numjobs=$threads
> > +	group_reporting=1
> > +	atomic=1
> > +
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	do_verify=0
> > +EOF
> > +}
> > +
> > +create_fio_configs
> > +_require_fio $fio_aw_config
> > +
> > +cat $fio_aw_config >> $seqres.full
> > +cat $fio_verify_config >> $seqres.full
> > +
> > +$XFS_IO_PROG -fc "falloc 0 $filesize" $testfile >> $seqres.full
> > +
> > +$FIO_PROG $fio_aw_config >> $seqres.full
> > +ret1=$?
> > +$FIO_PROG $fio_verify_config >> $seqres.full
> > +ret2=$?
> > +
> > +[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/generic/1226.out b/tests/generic/1226.out
> > new file mode 100644
> > index 00000000..6dce0ea5
> > --- /dev/null
> > +++ b/tests/generic/1226.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 1226
> > +Silence is golden
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes
  2025-08-21  8:25     ` Ojaswin Mujoo
@ 2025-08-21  9:23       ` John Garry
  0 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2025-08-21  9:23 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On 21/08/2025 09:25, Ojaswin Mujoo wrote:
> Anyways, yes I've added the $DEBUGFS command so we can observe the
> extent structure, but the filefrag would look something like this (last
> few extents):
> 
>   ...
>   337:      674..     674:      10130..     10130:      1:
>   338:      676..     676:      10132..     10132:      1:
>   339:      678..     678:      10134..     10134:      1:
>   340:      679..     680:      10135..     10136:      2:             last,eof
> 
> Notice that the last 2 extents are logically and physically continuous
> but not merged.

I see, thanks for the example!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-21  8:42     ` Ojaswin Mujoo
@ 2025-08-21  9:24       ` John Garry
  2025-08-21 12:18         ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-21  9:24 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On 21/08/2025 09:42, Ojaswin Mujoo wrote:
> On Wed, Aug 13, 2025 at 02:39:40PM +0100, John Garry wrote:
>> On 10/08/2025 14:41, Ojaswin Mujoo wrote:
>>> This adds atomic write test using fio based on it's crc check verifier.
>>> fio adds a crc for each data block. If the underlying device supports
>>> atomic write then it is guaranteed that we will never have a mix data from
>>> two threads writing on the same physical block.
>>>
>>> Avoid doing overlapping parallel atomic writes because it might give
>>> unexpected results. Use offset_increment=, size= fio options to achieve
>>> this behavior.
>>>
>> You are not really describing what the test does.
>>
>> In the first paragraph, you state what fio verify function does and then
>> describe what RWF_ATOMIC means when we only use HW support, i.e. serialises.
>> In the second you mention that we guarantee no inter-thread overlapping
>> writes.
> Got it John, I will add better commit messages for the fio tests.
>>  From a glance at the code below, in this test each thread writes to a
>> separate part of the file and then verifies no crc corruption. But even with
>> atomic=0, I would expect no corruption here.
> Right, this is mostly a stress test that is ensuring that all the new
> atomic write code paths are not causing anything to break or
> introducing any regressions. This should pass with both atomic or non
> atomic writes but by using RWF_ATOMIC we excercise the atomic specific
> code paths, improving the code coverage.

I am not sure really how much value this has. At least it should be 
documented what we are doing here and what value there is in this test.

Thanks,
John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-08-21  8:28     ` Ojaswin Mujoo
@ 2025-08-21  9:28       ` John Garry
  2025-08-21 12:19         ` Ojaswin Mujoo
  0 siblings, 1 reply; 38+ messages in thread
From: John Garry @ 2025-08-21  9:28 UTC (permalink / raw)
  To: Ojaswin Mujoo
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On 21/08/2025 09:28, Ojaswin Mujoo wrote:
> Yes these 2 tests are similar however 061 uses fallocate=native +
> _scratch_mkfs_ext4 to test whether atomic writes on preallocated file
> via multiple threads works correctly.
> 
> The 062 uses fallocate=truncate + _scratch_mkfs_sized 360MB +
> 'multiple jobs each writing to a different file' to ensure we are
> extensively stressing the allocation logic in low space scenarios.

I see, please at least fully document this in the commit messages.

Thanks

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx
  2025-08-13 13:42   ` John Garry
@ 2025-08-21  9:45     ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21  9:45 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 02:42:09PM +0100, John Garry wrote:
> On 10/08/2025 14:41, Ojaswin Mujoo wrote:
> > Implement atomic write support to help fuzz atomic writes
> > with fsx.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> 
> this looks ok, but I am not familiar with fsx.
> 
> BTW, you guarantee O_DIRECT with RWF_ATOMIC only, right?

So right now the user will have to pass in -Z to enable O_DIRECT.

So if the underlying stack has atomic support, fsx will try to do atomic
IO but if -Z is not passed it will fail the writes.

I think I will change this to switch atomic writes off if -Z is not
passed for convinience right now. Later if we have buffered support we
can lift this off. That's what you were asking about right?

Regards,
ojaswin

> 
> Thanks,
> John
> 
> > ---
> >   ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> >   1 file changed, 104 insertions(+), 5 deletions(-)
> > 
> > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > index 163b9453..ea39ca29 100644
> > --- a/ltp/fsx.c
> > +++ b/ltp/fsx.c
> > @@ -40,6 +40,7 @@
> >   #include <liburing.h>
> >   #endif
> >   #include <sys/syscall.h>
> > +#include "statx.h"
> >   #ifndef MAP_FILE
> >   # define MAP_FILE 0
> > @@ -49,6 +50,10 @@
> >   #define RWF_DONTCACHE	0x80
> >   #endif
> > +#ifndef RWF_ATOMIC
> > +#define RWF_ATOMIC	0x40
> > +#endif
> > +
> >   #define NUMPRINTCOLUMNS 32	/* # columns of data to print on each line */
> >   /* Operation flags (bitmask) */
> > @@ -110,6 +115,7 @@ enum {
> >   	OP_READ_DONTCACHE,
> >   	OP_WRITE,
> >   	OP_WRITE_DONTCACHE,
> > +	OP_WRITE_ATOMIC,
> >   	OP_MAPREAD,
> >   	OP_MAPWRITE,
> >   	OP_MAX_LITE,
> > @@ -200,6 +206,11 @@ int	uring = 0;
> >   int	mark_nr = 0;
> >   int	dontcache_io = 1;
> >   int	hugepages = 0;                  /* -h flag */
> > +int	do_atomic_writes = 1;		/* -a flag disables */
> > +
> > +/* User for atomic writes */
> > +int awu_min = 0;
> > +int awu_max = 0;
> >   /* Stores info needed to periodically collapse hugepages */
> >   struct hugepages_collapse_info {
> > @@ -288,6 +299,7 @@ static const char *op_names[] = {
> >   	[OP_READ_DONTCACHE] = "read_dontcache",
> >   	[OP_WRITE] = "write",
> >   	[OP_WRITE_DONTCACHE] = "write_dontcache",
> > +	[OP_WRITE_ATOMIC] = "write_atomic",
> >   	[OP_MAPREAD] = "mapread",
> >   	[OP_MAPWRITE] = "mapwrite",
> >   	[OP_TRUNCATE] = "truncate",
> > @@ -422,6 +434,7 @@ logdump(void)
> >   				prt("\t***RRRR***");
> >   			break;
> >   		case OP_WRITE_DONTCACHE:
> > +		case OP_WRITE_ATOMIC:
> >   		case OP_WRITE:
> >   			prt("WRITE    0x%x thru 0x%x\t(0x%x bytes)",
> >   			    lp->args[0], lp->args[0] + lp->args[1] - 1,
> > @@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
> >   	file_size = offset + size;
> >   }
> > +static int is_power_of_2(unsigned n) {
> > +	return ((n & (n - 1)) == 0);
> > +}
> > +
> > +/*
> > + * Round down n to nearest power of 2.
> > + * If n is already a power of 2, return n;
> > + */
> > +static int rounddown_pow_of_2(int n) {
> > +	int i = 0;
> > +
> > +	if (is_power_of_2(n))
> > +		return n;
> > +
> > +	for (; (1 << i) < n; i++);
> > +
> > +	return 1 << (i - 1);
> > +}
> > +
> >   void
> >   dowrite(unsigned offset, unsigned size, int flags)
> >   {
> > @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
> >   	offset -= offset % writebdy;
> >   	if (o_direct)
> >   		size -= size % writebdy;
> > +	if (flags & RWF_ATOMIC) {
> > +		/* atomic write len must be inbetween awu_min and awu_max */
> > +		if (size < awu_min)
> > +			size = awu_min;
> > +		if (size > awu_max)
> > +			size = awu_max;
> > +
> > +		/* atomic writes need power-of-2 sizes */
> > +		size = rounddown_pow_of_2(size);
> > +
> > +		/* atomic writes need naturally aligned offsets */
> > +		offset -= offset % size;
> > +
> > +		/* Skip the write if we are crossing max filesize */
> > +		if ((offset + size) > maxfilelen) {
> > +			if (!quiet && testcalls > simulatedopcount)
> > +				prt("skipping atomic write past maxfilelen\n");
> > +			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> > +			return;
> > +		}
> > +	}
> >   	if (size == 0) {
> >   		if (!quiet && testcalls > simulatedopcount && !o_direct)
> >   			prt("skipping zero size write\n");
> > @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
> >   		return;
> >   	}
> > -	log4(OP_WRITE, offset, size, FL_NONE);
> > +	if (flags & RWF_ATOMIC)
> > +		log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> > +	else
> > +		log4(OP_WRITE, offset, size, FL_NONE);
> >   	gendata(original_buf, good_buf, offset, size);
> >   	if (offset + size > file_size) {
> > @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
> >   		       (monitorstart == -1 ||
> >   			(offset + size > monitorstart &&
> >   			(monitorend == -1 || offset <= monitorend))))))
> > -		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> > -		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> > +		prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> > +		    offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> > +		    (flags & RWF_ATOMIC) != 0);
> >   	iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
> >   	if (iret != size) {
> >   		if (iret == -1)
> > @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
> >   }
> >   #endif
> > +int test_atomic_writes(void) {
> > +	int ret;
> > +	struct statx stx;
> > +
> > +	ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
> > +	if (ret < 0) {
> > +		fprintf(stderr, "main: Statx failed with %d."
> > +			" Failed to determine atomic write limits, "
> > +			" disabling!\n", ret);
> > +		return 0;
> > +	}
> > +
> > +	if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
> > +	    stx.stx_atomic_write_unit_min > 0) {
> > +		awu_min = stx.stx_atomic_write_unit_min;
> > +		awu_max = stx.stx_atomic_write_unit_max;
> > +		return 1;
> > +	}
> > +
> > +	fprintf(stderr, "main: IO Stack does not support "
> > +			"atomic writes, disabling!\n");
> > +	return 0;
> > +}
> > +
> >   #ifdef HAVE_COPY_FILE_RANGE
> >   int
> >   test_copy_range(void)
> > @@ -2356,6 +2437,12 @@ have_op:
> >   			goto out;
> >   		}
> >   		break;
> > +	case OP_WRITE_ATOMIC:
> > +		if (!do_atomic_writes) {
> > +			log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> > +			goto out;
> > +		}
> > +		break;
> >   	}
> >   	switch (op) {
> > @@ -2385,6 +2472,11 @@ have_op:
> >   			dowrite(offset, size, 0);
> >   		break;
> > +	case OP_WRITE_ATOMIC:
> > +		TRIM_OFF_LEN(offset, size, maxfilelen);
> > +		dowrite(offset, size, RWF_ATOMIC);
> > +		break;
> > +
> >   	case OP_MAPREAD:
> >   		TRIM_OFF_LEN(offset, size, file_size);
> >   		domapread(offset, size);
> > @@ -2511,13 +2603,14 @@ void
> >   usage(void)
> >   {
> >   	fprintf(stdout, "usage: %s",
> > -		"fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
> > +		"fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
> >   	   [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
> >   	   [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
> >   	   [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
> >   	   [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
> >   	   [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
> >   	   ... fname\n\
> > +	-a: disable atomic writes\n\
> >   	-b opnum: beginning operation number (default 1)\n\
> >   	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
> >   	-d: debug output for all operations\n\
> > @@ -3059,9 +3152,13 @@ main(int argc, char **argv)
> >   	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
> >   	while ((ch = getopt_long(argc, argv,
> > -				 "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > +				 "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> >   				 longopts, NULL)) != EOF)
> >   		switch (ch) {
> > +		case 'a':
> > +			prt("main(): Atomic writes disabled\n");
> > +			do_atomic_writes = 0;
> > +			break;
> >   		case 'b':
> >   			simulatedopcount = getnum(optarg, &endp);
> >   			if (!quiet)
> > @@ -3475,6 +3572,8 @@ main(int argc, char **argv)
> >   		exchange_range_calls = test_exchange_range();
> >   	if (dontcache_io)
> >   		dontcache_io = test_dontcache_io();
> > +	if (do_atomic_writes)
> > +		do_atomic_writes = test_atomic_writes();
> >   	while (keep_running())
> >   		if (!test())
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 01/11] common/rc: Add _min() and _max() helpers
  2025-08-13 12:20   ` David Laight
@ 2025-08-21 10:35     ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21 10:35 UTC (permalink / raw)
  To: David Laight
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, john.g.garry, tytso,
	linux-xfs, linux-kernel, linux-ext4

On Wed, Aug 13, 2025 at 01:20:34PM +0100, David Laight wrote:
> On Sun, 10 Aug 2025 19:11:52 +0530
> Ojaswin Mujoo <ojaswin@linux.ibm.com> wrote:
> 
> > Many programs open code these functionalities so add it as a generic helper
> > in common/rc
> > 
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >  common/rc | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/common/rc b/common/rc
> > index 96578d15..3858ddce 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -5873,6 +5873,28 @@ _require_program() {
> >  	_have_program "$1" || _notrun "$tag required"
> >  }
> >  
> > +_min() {
> > +	local ret
> > +
> > +	for arg in "$@"; do
> > +		if [ -z "$ret" ] || (( $arg < $ret )); then
> > +			ret="$arg"
> > +		fi
> > +	done
> > +	echo $ret
> > +}
> 
> Perhaps:
> 	local ret="$1"
> 	shift
> 	for arg in "$@"; do
> 		ret=$(((arg) < (ret) ? (arg) : (ret)))
> 	done;
> 	echo "$ret"
> that should work for 'min 10 "2 + 3"' (with bash, but not dash).
> 
> 	David

Hi David,

Thanks for the feedback. I agree that your way is slightly better but i
would like to keep the current patch as is since we already have some
reviews on it and I would prefer to keep the code as is (especially
since both ways are close enough). Hope this is okay.

Also, we can always do 

_min 10 $((2 + 3)) 

which is a bit more intuitive imo

Regards,
ojaswin

> 
> > +
> > +_max() {
> > +	local ret
> > +
> > +	for arg in "$@"; do
> > +		if [ -z "$ret" ] || (( $arg > $ret )); then
> > +			ret="$arg"
> > +		fi
> > +	done
> > +	echo $ret
> > +}
> > +
> >  ################################################################################
> >  # make sure this script returns success
> >  /bin/true
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier
  2025-08-21  9:24       ` John Garry
@ 2025-08-21 12:18         ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21 12:18 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Thu, Aug 21, 2025 at 10:24:58AM +0100, John Garry wrote:
> On 21/08/2025 09:42, Ojaswin Mujoo wrote:
> > On Wed, Aug 13, 2025 at 02:39:40PM +0100, John Garry wrote:
> > > On 10/08/2025 14:41, Ojaswin Mujoo wrote:
> > > > This adds atomic write test using fio based on it's crc check verifier.
> > > > fio adds a crc for each data block. If the underlying device supports
> > > > atomic write then it is guaranteed that we will never have a mix data from
> > > > two threads writing on the same physical block.
> > > > 
> > > > Avoid doing overlapping parallel atomic writes because it might give
> > > > unexpected results. Use offset_increment=, size= fio options to achieve
> > > > this behavior.
> > > > 
> > > You are not really describing what the test does.
> > > 
> > > In the first paragraph, you state what fio verify function does and then
> > > describe what RWF_ATOMIC means when we only use HW support, i.e. serialises.
> > > In the second you mention that we guarantee no inter-thread overlapping
> > > writes.
> > Got it John, I will add better commit messages for the fio tests.
> > >  From a glance at the code below, in this test each thread writes to a
> > > separate part of the file and then verifies no crc corruption. But even with
> > > atomic=0, I would expect no corruption here.
> > Right, this is mostly a stress test that is ensuring that all the new
> > atomic write code paths are not causing anything to break or
> > introducing any regressions. This should pass with both atomic or non
> > atomic writes but by using RWF_ATOMIC we excercise the atomic specific
> > code paths, improving the code coverage.
> 
> I am not sure really how much value this has. At least it should be
> documented what we are doing here and what value there is in this test.

Got it John, I'll add it with information about what we are doing and
what we are trying to stress.

Regards,
ojaswin

> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files
  2025-08-21  9:28       ` John Garry
@ 2025-08-21 12:19         ` Ojaswin Mujoo
  0 siblings, 0 replies; 38+ messages in thread
From: Ojaswin Mujoo @ 2025-08-21 12:19 UTC (permalink / raw)
  To: John Garry
  Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
	linux-kernel, linux-ext4

On Thu, Aug 21, 2025 at 10:28:56AM +0100, John Garry wrote:
> On 21/08/2025 09:28, Ojaswin Mujoo wrote:
> > Yes these 2 tests are similar however 061 uses fallocate=native +
> > _scratch_mkfs_ext4 to test whether atomic writes on preallocated file
> > via multiple threads works correctly.
> > 
> > The 062 uses fallocate=truncate + _scratch_mkfs_sized 360MB +
> > 'multiple jobs each writing to a different file' to ensure we are
> > extensively stressing the allocation logic in low space scenarios.
> 
> I see, please at least fully document this in the commit messages.
> 
> Thanks

Will do.

Regards,
ojaswin

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2025-08-21 12:19 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-10 13:41 [PATCH v4 00/11] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 01/11] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
2025-08-13 12:20   ` David Laight
2025-08-21 10:35     ` Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 02/11] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 03/11] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
2025-08-13 13:42   ` John Garry
2025-08-21  9:45     ` Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 04/11] generic: Add atomic write test using fio crc check verifier Ojaswin Mujoo
2025-08-12 17:16   ` Darrick J. Wong
2025-08-13 13:39   ` John Garry
2025-08-21  8:42     ` Ojaswin Mujoo
2025-08-21  9:24       ` John Garry
2025-08-21 12:18         ` Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 05/11] generic: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
2025-08-12 17:16   ` Darrick J. Wong
2025-08-10 13:41 ` [PATCH v4 06/11] generic: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
2025-08-11 15:29   ` Darrick J. Wong
2025-08-10 13:41 ` [PATCH v4 07/11] generic: Stress fsx with atomic writes enabled Ojaswin Mujoo
2025-08-12 17:18   ` Darrick J. Wong
2025-08-13  5:45     ` Ojaswin Mujoo
2025-08-10 13:41 ` [PATCH v4 08/11] generic: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
2025-08-10 13:42 ` [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
2025-08-12  8:08   ` John Garry
2025-08-13  7:08     ` Ojaswin Mujoo
2025-08-13  7:33       ` John Garry
2025-08-21  8:29         ` Ojaswin Mujoo
2025-08-10 13:42 ` [PATCH v4 10/11] ext4: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
2025-08-13 13:45   ` John Garry
2025-08-21  8:28     ` Ojaswin Mujoo
2025-08-21  9:28       ` John Garry
2025-08-21 12:19         ` Ojaswin Mujoo
2025-08-10 13:42 ` [PATCH v4 11/11] ext4: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
2025-08-12 17:19   ` Darrick J. Wong
2025-08-13  5:45     ` Ojaswin Mujoo
2025-08-13 13:54   ` John Garry
2025-08-21  8:25     ` Ojaswin Mujoo
2025-08-21  9:23       ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).