* [PATCH v3 00/13] Add more tests for multi fs block atomic writes
@ 2025-07-12 14:12 Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
` (12 more replies)
0 siblings, 13 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
NOTE: This patch series is based on:
https://lore.kernel.org/fstests/20250626002735.22827-1-catherine.hoang@oracle.com/T/#t
Changes in v3:
- (2/13) use dumpe2fs to figure out if FS is bigalloc
- (9/13) generic/1230: Detect device speeds for more accurate testing. Also
speeds up the test
- fio tests - switch to write followed by verify approach to avoid false
failures due to fio verify reads splitting and racing with atomic
writes. Discussion thread:
https://lore.kernel.org/fstests/0430bd73-e6c2-4ce9-af24-67b1e1fa9b5b@oracle.com/
Changes in v2 [1]:
- (1/13) new patch with _min and _max helpers
- (2/13) remove setup_fs_options and add fsx specific helper
- (4/13) skip atomic write instead of falling back to normal write (fsx)
- (4/13) make atomic write default on instead of default off (fsx)
- (5,6/13) refactor and cleanup fio tests
- (7/13) refactored common code
- (8/13) don't ignore mmap writes for fsx with atomic writes
- (9/13) use od instead of xxd. handle cleanup of bg threads in _cleanup()
- (10-13/13) minor refactors
- change all tests to use _fail for better consistency
- use higher test numbers for easier merging
[1] https://lore.kernel.org/fstests/cover.1750924903.git.ojaswin@linux.ibm.com/
* Original cover [2] *
These are the tests we were using to verify that filesystems are not
tearing multi fs block atomic writes. In fact, some of the tests like
generic/772 (now: g/1230) actually helped us catch and fix issues in
ext4's early implementations of multi fs block atomic writes and hence
we feel these tests are useful to have in xfstests.
We have tested these with scsi debug as well as a real nvme device
supporting multi fs block atomic writes.
Thoughts and suggestions are welcome!
[2] rfc: https://lore.kernel.org/fstests/cover.1749629233.git.ojaswin@linux.ibm.com/
Ojaswin Mujoo (9):
common/rc: Add _min() and _max() helpers
common/rc: Fix fsx for ext4 with bigalloc
common/rc: Add a helper to run fsx on a given file
ltp/fsx.c: Add atomic writes support to fsx
generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
generic/1229: Stress fsx with atomic writes enabled
generic/1230: Add sudden shutdown tests for multi block atomic writes
ext4/063: Atomic write test for extent split across leaf nodes
ext4/064: Add atomic write tests for journal credit calculation
Ritesh Harjani (IBM) (4):
generic/1226: Add atomic write test using fio crc check verifier
generic/1227: Add atomic write test using fio verify on file mixed
mappings
ext4/061: Atomic writes stress test for bigalloc using fio crc
verifier
ext4/062: Atomic writes test for bigalloc using fio crc verifier on
multiple files
common/rc | 71 +++++++-
ltp/fsx.c | 109 ++++++++++-
tests/ext4/061 | 130 ++++++++++++++
tests/ext4/061.out | 2 +
tests/ext4/062 | 176 ++++++++++++++++++
tests/ext4/062.out | 2 +
tests/ext4/063 | 125 +++++++++++++
tests/ext4/063.out | 2 +
tests/ext4/064 | 75 ++++++++
tests/ext4/064.out | 2 +
tests/generic/1226 | 101 +++++++++++
tests/generic/1226.out | 2 +
tests/generic/1227 | 123 +++++++++++++
tests/generic/1227.out | 2 +
tests/generic/1228 | 139 +++++++++++++++
tests/generic/1228.out | 2 +
tests/generic/1229 | 41 +++++
tests/generic/1229.out | 2 +
tests/generic/1230 | 397 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1230.out | 2 +
20 files changed, 1497 insertions(+), 8 deletions(-)
create mode 100755 tests/ext4/061
create mode 100644 tests/ext4/061.out
create mode 100755 tests/ext4/062
create mode 100644 tests/ext4/062.out
create mode 100755 tests/ext4/063
create mode 100644 tests/ext4/063.out
create mode 100755 tests/ext4/064
create mode 100644 tests/ext4/064.out
create mode 100755 tests/generic/1226
create mode 100644 tests/generic/1226.out
create mode 100755 tests/generic/1227
create mode 100644 tests/generic/1227.out
create mode 100755 tests/generic/1228
create mode 100644 tests/generic/1228.out
create mode 100755 tests/generic/1229
create mode 100644 tests/generic/1229.out
create mode 100755 tests/generic/1230
create mode 100644 tests/generic/1230.out
--
2.49.0
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v3 01/13] common/rc: Add _min() and _max() helpers
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 15:02 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
` (11 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Many programs open-code this functionality, so add generic helpers for it
in common/rc.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
common/rc | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/common/rc b/common/rc
index f71cc8f0..9a9d3cc8 100644
--- a/common/rc
+++ b/common/rc
@@ -5817,6 +5817,28 @@ _require_program() {
_have_program "$1" || _notrun "$tag required"
}
+_min() {
+ local ret
+
+ for arg in "$@"; do
+ if [ -z "$ret" ] || (( $arg < $ret )); then
+ ret="$arg"
+ fi
+ done
+ echo $ret
+}
+
+_max() {
+ local ret
+
+ for arg in "$@"; do
+ if [ -z "$ret" ] || (( $arg > $ret )); then
+ ret="$arg"
+ fi
+ done
+ echo $ret
+}
+
################################################################################
# make sure this script returns success
/bin/true
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
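As a rough illustration of how the two helpers behave, the sketch below restates them in a standalone script and calls them with made-up atomic write limits. The values 4096, 16384 and 65536 are assumptions for the example, and the POSIX `[ -lt ]` comparison and the local loop variable are choices made here, not something the patch does.

```shell
# Standalone restatement of the helpers, for illustration only.
_min() {
	local ret="" arg
	for arg in "$@"; do
		if [ -z "$ret" ] || [ "$arg" -lt "$ret" ]; then
			ret="$arg"
		fi
	done
	echo "$ret"
}

_max() {
	local ret="" arg
	for arg in "$@"; do
		if [ -z "$ret" ] || [ "$arg" -gt "$ret" ]; then
			ret="$arg"
		fi
	done
	echo "$ret"
}

# Assumed limits, not read from a real device.
awu_min=4096
awu_max=65536

echo "min: $(_min "$awu_min" "$awu_max" 16384)"       # prints "min: 4096"
echo "max: $(_max "$awu_min" "$((awu_max / 2))")"     # prints "max: 32768"
```

Both helpers print an empty line when called without arguments, so callers probably want to pass at least one value.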
* [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 16:11 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
` (10 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Insert range and collapse range only work with bigalloc when the range
is cluster-size aligned, which fsx does not take care of. To work
around this, disable insert range and collapse range on ext4 if
bigalloc is enabled.
This is achieved by defining a new function _set_default_fsx_avoid
called via the _run_fsx helper. This can be used to selectively disable
fsx options based on the configuration.
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
common/rc | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/common/rc b/common/rc
index 9a9d3cc8..218cf253 100644
--- a/common/rc
+++ b/common/rc
@@ -5113,10 +5113,37 @@ _require_hugepage_fsx()
_notrun "fsx binary does not support MADV_COLLAPSE"
}
+_set_default_fsx_avoid() {
+ local file=$1
+
+ case "$FSTYP" in
+ "ext4")
+ local dev=$(findmnt -n -o SOURCE --target $file)
+
+ # open code instead of _require_dumpe2fs cause we don't
+ # want to _notrun if dumpe2fs is not available
+ if [ -z "$DUMPE2FS_PROG" ]; then
+ echo "_set_default_fsx_avoid: dumpe2fs not found, skipping bigalloc check." >> $seqres.full
+ return
+ fi
+
+ $DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
+ export FSX_AVOID+=" -I -C"
+ }
+ ;;
+ # Add other filesystem types here as needed
+ *)
+ ;;
+ esac
+}
+
_run_fsx()
{
echo "fsx $*"
local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
+
+ _set_default_fsx_avoid $testfile
+
set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
echo "$@" >>$seqres.full
rm -f $TEST_DIR/junk
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
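The detection step can be tried without a real ext4 device by feeding sample text to the same kind of grep. This is only a sketch: `has_bigalloc` is a hypothetical name, the sample line is hand-written in the shape of the "Filesystem features:" line that dumpe2fs -h prints, and restricting the match to that line is a stricter variant than the plain `grep -q bigalloc` used in the patch.

```shell
# Hypothetical helper: reads superblock summary text on stdin and succeeds
# if the "Filesystem features:" line lists bigalloc as a whole word.
has_bigalloc() {
	grep '^Filesystem features:' | grep -qw bigalloc
}

# Hand-written sample, not captured from a real device.
sample='Filesystem features:      has_journal ext_attr extent bigalloc metadata_csum'

fsx_avoid=""
if printf '%s\n' "$sample" | has_bigalloc; then
	# -I and -C are the fsx options the patch appends for bigalloc.
	fsx_avoid="$fsx_avoid -I -C"
fi

echo "FSX_AVOID would be:$fsx_avoid"
```

Assigning to a fresh variable on each call, as done here, avoids the options piling up if the check runs more than once in a test.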
* [PATCH v3 03/13] common/rc: Add a helper to run fsx on a given file
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
` (9 subsequent siblings)
12 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Currently _run_fsx is hardcoded to run on a file in $TEST_DIR.
Add a helper _run_fsx_on_file so that we can run fsx on any
given file including in $SCRATCH_MNT. Also, refactor _run_fsx
to use this helper.
No functional change is intended in this patch.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
common/rc | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/common/rc b/common/rc
index 218cf253..2d4592b6 100644
--- a/common/rc
+++ b/common/rc
@@ -5137,16 +5137,26 @@ _set_default_fsx_avoid() {
esac
}
-_run_fsx()
+_run_fsx_on_file()
{
+ local testfile=$1
+ shift
+
+ if ! [ -f $testfile ]
+ then
+ echo "_run_fsx_on_file: $testfile doesn't exist. Creating" >> $seqres.full
+ touch $testfile
+ fi
+
echo "fsx $*"
local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
_set_default_fsx_avoid $testfile
- set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
+ set -- $FSX_PROG $args $FSX_AVOID $testfile
+
echo "$@" >>$seqres.full
- rm -f $TEST_DIR/junk
+ rm -f $testfile
"$@" 2>&1 | tee -a $seqres.full >$tmp.fsx
local res=${PIPESTATUS[0]}
if [ $res -ne 0 ]; then
@@ -5158,6 +5168,12 @@ _run_fsx()
return 0
}
+_run_fsx()
+{
+ _run_fsx_on_file $TEST_DIR/junk $@
+ return $?
+}
+
# Run fsx with -h(ugepage buffers). If we can't set up a hugepage then skip
# the test, but if any other error occurs then exit the test.
_run_hugepage_fsx() {
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
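The shape of this refactor, a general helper that takes the target file first plus a thin wrapper that pins the default target, can be sketched in isolation. `run_on_file` and `run_default` are hypothetical stand-ins, and `/tmp/junk` is an arbitrary path for the example. The sketch forwards arguments with a quoted `"$@"`, whereas the patch uses an unquoted `$@`, which would word-split an argument containing whitespace.

```shell
# Hypothetical stand-in for _run_fsx_on_file: the first argument is the
# target file, the remaining arguments are passed through unchanged.
run_on_file() {
	local target="$1"
	shift
	echo "target=$target nargs=$# args=$*"
}

# Thin wrapper that fixes the target, analogous to _run_fsx.
run_default() {
	run_on_file "/tmp/junk" "$@"
}

# The last argument contains a space and still arrives as one argument.
run_default -N 100 -o "64 k"    # prints "target=/tmp/junk nargs=4 args=-N 100 -o 64 k"
```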
* [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (2 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 16:17 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
` (8 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Implement atomic write support to help fuzz atomic writes
with fsx.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 104 insertions(+), 5 deletions(-)
diff --git a/ltp/fsx.c b/ltp/fsx.c
index 163b9453..ea39ca29 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -40,6 +40,7 @@
#include <liburing.h>
#endif
#include <sys/syscall.h>
+#include "statx.h"
#ifndef MAP_FILE
# define MAP_FILE 0
@@ -49,6 +50,10 @@
#define RWF_DONTCACHE 0x80
#endif
+#ifndef RWF_ATOMIC
+#define RWF_ATOMIC 0x40
+#endif
+
#define NUMPRINTCOLUMNS 32 /* # columns of data to print on each line */
/* Operation flags (bitmask) */
@@ -110,6 +115,7 @@ enum {
OP_READ_DONTCACHE,
OP_WRITE,
OP_WRITE_DONTCACHE,
+ OP_WRITE_ATOMIC,
OP_MAPREAD,
OP_MAPWRITE,
OP_MAX_LITE,
@@ -200,6 +206,11 @@ int uring = 0;
int mark_nr = 0;
int dontcache_io = 1;
int hugepages = 0; /* -h flag */
+int do_atomic_writes = 1; /* -a flag disables */
+
+/* Used for atomic writes */
+int awu_min = 0;
+int awu_max = 0;
/* Stores info needed to periodically collapse hugepages */
struct hugepages_collapse_info {
@@ -288,6 +299,7 @@ static const char *op_names[] = {
[OP_READ_DONTCACHE] = "read_dontcache",
[OP_WRITE] = "write",
[OP_WRITE_DONTCACHE] = "write_dontcache",
+ [OP_WRITE_ATOMIC] = "write_atomic",
[OP_MAPREAD] = "mapread",
[OP_MAPWRITE] = "mapwrite",
[OP_TRUNCATE] = "truncate",
@@ -422,6 +434,7 @@ logdump(void)
prt("\t***RRRR***");
break;
case OP_WRITE_DONTCACHE:
+ case OP_WRITE_ATOMIC:
case OP_WRITE:
prt("WRITE 0x%x thru 0x%x\t(0x%x bytes)",
lp->args[0], lp->args[0] + lp->args[1] - 1,
@@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
file_size = offset + size;
}
+static int is_power_of_2(unsigned n) {
+ return ((n & (n - 1)) == 0);
+}
+
+/*
+ * Round down n to nearest power of 2.
+ * If n is already a power of 2, return n;
+ */
+static int rounddown_pow_of_2(int n) {
+ int i = 0;
+
+ if (is_power_of_2(n))
+ return n;
+
+ for (; (1 << i) < n; i++);
+
+ return 1 << (i - 1);
+}
+
void
dowrite(unsigned offset, unsigned size, int flags)
{
@@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
offset -= offset % writebdy;
if (o_direct)
size -= size % writebdy;
+ if (flags & RWF_ATOMIC) {
+ /* atomic write len must be between awu_min and awu_max */
+ if (size < awu_min)
+ size = awu_min;
+ if (size > awu_max)
+ size = awu_max;
+
+ /* atomic writes need power-of-2 sizes */
+ size = rounddown_pow_of_2(size);
+
+ /* atomic writes need naturally aligned offsets */
+ offset -= offset % size;
+
+ /* Skip the write if we are crossing max filesize */
+ if ((offset + size) > maxfilelen) {
+ if (!quiet && testcalls > simulatedopcount)
+ prt("skipping atomic write past maxfilelen\n");
+ log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+ return;
+ }
+ }
if (size == 0) {
if (!quiet && testcalls > simulatedopcount && !o_direct)
prt("skipping zero size write\n");
@@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
return;
}
- log4(OP_WRITE, offset, size, FL_NONE);
+ if (flags & RWF_ATOMIC)
+ log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
+ else
+ log4(OP_WRITE, offset, size, FL_NONE);
gendata(original_buf, good_buf, offset, size);
if (offset + size > file_size) {
@@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
(monitorstart == -1 ||
(offset + size > monitorstart &&
(monitorend == -1 || offset <= monitorend))))))
- prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
- offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
+ prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
+ offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
+ (flags & RWF_ATOMIC) != 0);
iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
if (iret != size) {
if (iret == -1)
@@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
}
#endif
+int test_atomic_writes(void) {
+ int ret;
+ struct statx stx;
+
+ ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
+ if (ret < 0) {
+ fprintf(stderr, "main: Statx failed with %d."
+ " Failed to determine atomic write limits, "
+ " disabling!\n", ret);
+ return 0;
+ }
+
+ if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
+ stx.stx_atomic_write_unit_min > 0) {
+ awu_min = stx.stx_atomic_write_unit_min;
+ awu_max = stx.stx_atomic_write_unit_max;
+ return 1;
+ }
+
+ fprintf(stderr, "main: IO Stack does not support "
+ "atomic writes, disabling!\n");
+ return 0;
+}
+
#ifdef HAVE_COPY_FILE_RANGE
int
test_copy_range(void)
@@ -2356,6 +2437,12 @@ have_op:
goto out;
}
break;
+ case OP_WRITE_ATOMIC:
+ if (!do_atomic_writes) {
+ log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
+ goto out;
+ }
+ break;
}
switch (op) {
@@ -2385,6 +2472,11 @@ have_op:
dowrite(offset, size, 0);
break;
+ case OP_WRITE_ATOMIC:
+ TRIM_OFF_LEN(offset, size, maxfilelen);
+ dowrite(offset, size, RWF_ATOMIC);
+ break;
+
case OP_MAPREAD:
TRIM_OFF_LEN(offset, size, file_size);
domapread(offset, size);
@@ -2511,13 +2603,14 @@ void
usage(void)
{
fprintf(stdout, "usage: %s",
- "fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
+ "fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
[-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
[-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
[-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
[-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
[--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
... fname\n\
+ -a: disable atomic writes\n\
-b opnum: beginning operation number (default 1)\n\
-c P: 1 in P chance of file close+open at each op (default infinity)\n\
-d: debug output for all operations\n\
@@ -3059,9 +3152,13 @@ main(int argc, char **argv)
setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
while ((ch = getopt_long(argc, argv,
- "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
+ "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
longopts, NULL)) != EOF)
switch (ch) {
+ case 'a':
+ prt("main(): Atomic writes disabled\n");
+ do_atomic_writes = 0;
+ break;
case 'b':
simulatedopcount = getnum(optarg, &endp);
if (!quiet)
@@ -3475,6 +3572,8 @@ main(int argc, char **argv)
exchange_range_calls = test_exchange_range();
if (dontcache_io)
dontcache_io = test_dontcache_io();
+ if (do_atomic_writes)
+ do_atomic_writes = test_atomic_writes();
while (keep_running())
if (!test())
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
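The adjustment dowrite() applies to an atomic write (clamp the size to the [awu_min, awu_max] range, round it down to a power of two, then align the offset down to that size) can be checked by hand with a small shell sketch. The function names and the sample numbers are assumptions for illustration; the arithmetic is meant to mirror the three steps in the hunk above, not to replace the C code.

```shell
# Largest power of two that is <= n (assumes n >= 1).
rounddown_pow2() {
	local n="$1" p=1
	while [ $((p * 2)) -le "$n" ]; do
		p=$((p * 2))
	done
	echo "$p"
}

# Prints "offset size" after clamping, rounding and aligning.
adjust_atomic_write() {
	local off="$1" size="$2" min="$3" max="$4"

	if [ "$size" -lt "$min" ]; then size="$min"; fi
	if [ "$size" -gt "$max" ]; then size="$max"; fi

	# Atomic writes need power-of-2 sizes.
	size=$(rounddown_pow2 "$size")
	# Atomic writes need naturally aligned offsets.
	off=$((off - off % size))

	echo "$off $size"
}

# offset=10000 size=5000 with assumed limits 4096..65536
adjust_atomic_write 10000 5000 4096 65536     # prints "8192 4096"
# a tiny request is raised to awu_min first
adjust_atomic_write 300000 100 4096 65536     # prints "299008 4096"
```

Note that the offset is aligned after the size is finalized, so the write can start earlier than the originally requested offset.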
* [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (3 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 13:00 ` John Garry
2025-07-12 14:12 ` [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
` (7 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
This adds an atomic write test using fio, based on its crc check verifier.
fio adds a crc for each data block. If the underlying device supports atomic
writes, then it is guaranteed that we will never see a mix of data from two
threads writing to the same physical block.
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
tests/generic/1226 | 101 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1226.out | 2 +
2 files changed, 103 insertions(+)
create mode 100755 tests/generic/1226
create mode 100644 tests/generic/1226.out
diff --git a/tests/generic/1226 b/tests/generic/1226
new file mode 100755
index 00000000..455fc55f
--- /dev/null
+++ b/tests/generic/1226
@@ -0,0 +1,101 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1226
+#
+# Validate FS atomic write using fio crc check verifier.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic
+_require_odirect
+_require_aio
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100 * 1024 * 1024))
+
+function create_fio_configs()
+{
+ create_fio_aw_config
+ create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+ [verify-job]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$blocksize
+ fallocate=native
+ filename=$SCRATCH_MNT/test-file
+ size=$SIZE
+ iodepth=$FIO_LOAD
+ group_reporting=1
+
+ verify_only=1
+ verify=crc32c
+ verify_fatal=1
+ verify_state_save=0
+ verify_write_sequence=0
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+ [atomicwrite-job]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$blocksize
+ fallocate=native
+ filename=$SCRATCH_MNT/test-file
+ size=$SIZE
+ iodepth=$FIO_LOAD
+ numjobs=$FIO_LOAD
+ group_reporting=1
+ atomic=1
+
+ verify_state_save=0
+ verify=crc32c
+ do_verify=0
+EOF
+}
+
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+cat $fio_aw_config >> $seqres.full
+cat $fio_verify_config >> $seqres.full
+
+$FIO_PROG $fio_aw_config >> $seqres.full
+ret1=$?
+$FIO_PROG $fio_verify_config >> $seqres.full
+ret2=$?
+
+[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1226.out b/tests/generic/1226.out
new file mode 100644
index 00000000..6dce0ea5
--- /dev/null
+++ b/tests/generic/1226.out
@@ -0,0 +1,2 @@
+QA output created by 1226
+Silence is golden
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
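To make the derived fio parameters in this test concrete, here is a small worked sketch. The CPU count, load factor and atomic write limits are assumed values, and `max2` is a hypothetical two-argument stand-in for the _max helper.

```shell
# Two-argument maximum, standing in for the _max helper.
max2() {
	if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Assumed values, for illustration only.
ncpus=4
load_factor=1
fio_load=$((ncpus * 2 * load_factor))

# Device with a wide atomic write range: half of awu_max is used.
bs_wide=$(max2 4096 $((65536 / 2)))

# Device where min and max coincide: awu_max/2 would fall below awu_min,
# so taking the maximum keeps the block size at awu_min.
bs_narrow=$(max2 4096 $((4096 / 2)))

echo "iodepth=numjobs=$fio_load bs_wide=$bs_wide bs_narrow=$bs_narrow"
# prints "iodepth=numjobs=8 bs_wide=32768 bs_narrow=4096"
```

The second case shows why the test takes a maximum rather than simply halving awu_max: the block size never drops below the smallest unit the device can write atomically.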
* [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (4 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 16:32 ` Darrick J. Wong
2025-07-28 8:58 ` Zorro Lang
2025-07-12 14:12 ` [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
` (6 subsequent siblings)
12 siblings, 2 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
This test uses fio to first create a file with mixed mappings. Then it
does atomic writes using aio dio with parallel jobs to the same file
with mixed mappings. This forces the filesystem allocator to allocate
extents over mixed mapping regions to stress FS block allocators.
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
tests/generic/1227 | 123 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1227.out | 2 +
2 files changed, 125 insertions(+)
create mode 100755 tests/generic/1227
create mode 100644 tests/generic/1227.out
diff --git a/tests/generic/1227 b/tests/generic/1227
new file mode 100755
index 00000000..cfdc54ec
--- /dev/null
+++ b/tests/generic/1227
@@ -0,0 +1,123 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1227
+#
+# Validate FS atomic write using fio crc check verifier on mixed mappings
+# of a file.
+#
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto aio rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_odirect
+_require_aio
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+
+touch "$SCRATCH_MNT/f1"
+awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
+awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
+aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
+
+fsbsize=$(_get_block_size $SCRATCH_MNT)
+
+fio_prep_config=$tmp.prep.fio
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((128 * 1024 * 1024))
+
+cat >$fio_prep_config <<EOF
+# prep file to have mixed mappings
+[global]
+ioengine=libaio
+fallocate=none
+filename=$SCRATCH_MNT/test-file
+filesize=$SIZE
+bs=$fsbsize
+direct=1
+group_reporting=1
+
+# Create written extents
+[prep_written_blocks]
+ioengine=libaio
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+
+# Create unwritten extents
+[prep_unwritten_blocks]
+ioengine=falloc
+rw=randwrite
+io_size=$((SIZE/3))
+random_generator=lfsr
+EOF
+
+cat >$fio_aw_config <<EOF
+# atomic write to mixed mappings of written/unwritten/holes
+[atomic_write_job]
+ioengine=libaio
+rw=randwrite
+direct=1
+atomic=1
+random_generator=lfsr
+group_reporting=1
+
+filename=$SCRATCH_MNT/test-file
+size=$SIZE
+bs=$aw_bsize
+iodepth=$FIO_LOAD
+numjobs=$FIO_LOAD
+
+verify_state_save=0
+verify=crc32c
+do_verify=0
+EOF
+
+cat >$fio_verify_config <<EOF
+# verify atomic writes done by previous job
+[verify_job]
+ioengine=libaio
+rw=randwrite
+random_generator=lfsr
+group_reporting=1
+
+filename=$SCRATCH_MNT/test-file
+size=$SIZE
+bs=$aw_bsize
+iodepth=$FIO_LOAD
+
+verify_state_save=0
+verify_only=1
+verify=crc32c
+verify_fatal=1
+verify_write_sequence=0
+EOF
+
+_require_fio $fio_aw_config
+_require_fio $fio_verify_config
+
+cat $fio_prep_config >> $seqres.full
+cat $fio_aw_config >> $seqres.full
+cat $fio_verify_config >> $seqres.full
+
+#prepare file with mixed mappings
+$FIO_PROG $fio_prep_config >> $seqres.full
+
+# do atomic writes without verifying
+$FIO_PROG $fio_aw_config >> $seqres.full
+
+# verify data is not torn
+$FIO_PROG $fio_verify_config >> $seqres.full
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/generic/1227.out b/tests/generic/1227.out
new file mode 100644
index 00000000..2605d062
--- /dev/null
+++ b/tests/generic/1227.out
@@ -0,0 +1,2 @@
+QA output created by 1227
+Silence is golden
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
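The test above runs three fio phases in sequence (prep, atomic write, verify). One possible way to stop at the first failing phase and report which one it was is sketched below. This is not what the patch does: `run_phase` is a hypothetical helper, and plain commands stand in for the fio invocations so the sketch runs anywhere.

```shell
# Hypothetical helper: run a named phase and report its status on failure.
run_phase() {
	local name="$1" ret=0
	shift
	"$@" || ret=$?
	if [ "$ret" -ne 0 ]; then
		echo "phase '$name' failed with status $ret"
	fi
	return "$ret"
}

# Stand-ins for the three fio invocations; the last one fails on purpose.
if run_phase prep true &&
   run_phase atomic_write true &&
   run_phase verify sh -c 'exit 3'; then
	echo "all phases passed"
else
	echo "stopping: a phase failed"
fi
```

In a real test the else branch would be the natural place for a _fail call, so that a torn write detected by the verify phase is reported against the right step.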
* [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (5 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 16:35 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
` (5 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
This adds various atomic write multi-fsblock stress tests
with mixed mappings and O_SYNC, to ensure that data and metadata
are atomically persisted even if there is a shutdown.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
tests/generic/1228 | 139 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1228.out | 2 +
2 files changed, 141 insertions(+)
create mode 100755 tests/generic/1228
create mode 100644 tests/generic/1228.out
diff --git a/tests/generic/1228 b/tests/generic/1228
new file mode 100755
index 00000000..3f9a6af1
--- /dev/null
+++ b/tests/generic/1228
@@ -0,0 +1,139 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1228
+#
+# Atomic write multi-fsblock data integrity tests with mixed mappings
+# and O_SYNC
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick rw atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+check_data_integrity() {
+ actual=$(_hexdump $testfile)
+ if [[ "$expected" != "$actual" ]]
+ then
+ echo "Integrity check failed"
+ echo "Integrity check failed" >> $seqres.full
+ echo "# Expected file contents:" >> $seqres.full
+ echo "$expected" >> $seqres.full
+ echo "# Actual file contents:" >> $seqres.full
+ echo "$actual" >> $seqres.full
+
+ _fail "Data integrity check failed. The atomic write was torn."
+ fi
+}
+
+prep_mixed_mapping() {
+ $XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+ local off=0
+ local mapping=""
+
+ local operations=("W" "H" "U")
+ local num_blocks=$((awu_max / blksz))
+ for ((i=0; i<num_blocks; i++)); do
+ local index=$((RANDOM % ${#operations[@]}))
+ local map="${operations[$index]}"
+ local mapping="${mapping}${map}"
+
+ case "$map" in
+ "W")
+ $XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
+ ;;
+ "H")
+ # No operation needed for hole
+ ;;
+ "U")
+ $XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
+ ;;
+ esac
+ off=$((off + blksz))
+ done
+
+ echo "+ + Mixed mapping prep done. Full mapping pattern: $mapping" >> $seqres.full
+
+ sync $testfile
+}
+
+verify_atomic_write() {
+ if [[ "$1" == "shutdown" ]]
+ then
+ local do_shutdown=1
+ fi
+
+ test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
+
+ if [[ $do_shutdown -eq "1" ]]
+ then
+ echo "Shutting down filesystem" >> $seqres.full
+ _scratch_shutdown >> $seqres.full
+ _scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
+ fi
+
+ check_data_integrity
+}
+
+mixed_mapping_test() {
+ prep_mixed_mapping
+
+ echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
+ bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
+ grep wrote | awk -F'[/ ]' '{print $2}')
+
+ verify_atomic_write $1
+}
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+
+# Create an expected pattern to compare with
+$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
+expected=$(_hexdump $testfile)
+echo "# Expected file contents:" >> $seqres.full
+echo "$expected" >> $seqres.full
+echo >> $seqres.full
+
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping:" >> $seqres.full
+echo >> $seqres.full
+for ((iteration=1; iteration<=10; iteration++)); do
+ echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
+
+ echo "+ Testing without shutdown..." >> $seqres.full
+ mixed_mapping_test
+ echo "Passed!" >> $seqres.full
+
+ echo "+ Testing with sudden shutdown..." >> $seqres.full
+ mixed_mapping_test "shutdown"
+ echo "Passed!" >> $seqres.full
+
+ echo "Iteration $iteration completed: OK" >> $seqres.full
+ echo >> $seqres.full
+done
+echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
+
+
+echo >> $seqres.full
+echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
+bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
+ grep wrote | awk -F'[/ ]' '{print $2}')
+verify_atomic_write "shutdown"
+echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
+
diff --git a/tests/generic/1228.out b/tests/generic/1228.out
new file mode 100644
index 00000000..1baffa91
--- /dev/null
+++ b/tests/generic/1228.out
@@ -0,0 +1,2 @@
+QA output created by 1228
+Silence is golden
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
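The test extracts the number of bytes written by splitting the "wrote" line on both "/" and space. The sketch below runs the same grep and awk pipeline on a hand-written sample line in the shape xfs_io prints after a pwrite; the sample text and the 65536 limit are assumptions for illustration.

```shell
# Hand-written sample in the shape of xfs_io pwrite output.
sample='wrote 65536/65536 bytes at offset 0'
awu_max=65536    # assumed limit for the example

# With "/" and " " as separators, field 1 is "wrote" and field 2 is the
# number of bytes actually written.
bytes_written=$(printf '%s\n' "$sample" | grep wrote | awk -F'[/ ]' '{print $2}')

if [ "$bytes_written" -eq "$awu_max" ]; then
	echo "full atomic write: $bytes_written bytes"
else
	echo "short write: $bytes_written of $awu_max bytes"
fi
```

If xfs_io fails and prints no "wrote" line, bytes_written ends up empty, so a numeric comparison on it reports an error rather than a clean mismatch.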
* [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (6 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-17 16:22 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
` (4 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Stress file with atomic writes to ensure we exercise codepaths
where we are mixing different FS operations with atomic writes.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
tests/generic/1229 | 41 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1229.out | 2 ++
2 files changed, 43 insertions(+)
create mode 100755 tests/generic/1229
create mode 100644 tests/generic/1229.out
diff --git a/tests/generic/1229 b/tests/generic/1229
new file mode 100755
index 00000000..98e9b50c
--- /dev/null
+++ b/tests/generic/1229
@@ -0,0 +1,41 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 1229
+#
+# fuzz fsx with atomic writes
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest rw auto quick atomicwrites
+
+_require_odirect
+_require_scratch_write_atomic
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+
+# fsx usage:
+#
+# -N numops: total # operations to do
+# -l flen: the upper bound on file size
+# -o oplen: the upper bound on operation size (64k default)
+# -Z: O_DIRECT (used here with -r and -w)
+
+_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID >> $seqres.full
+ret=$?
+if [[ "$ret" != "0" ]]; then
+ _fail "fsx returned error: $ret"
+fi
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1229.out b/tests/generic/1229.out
new file mode 100644
index 00000000..737d61c6
--- /dev/null
+++ b/tests/generic/1229.out
@@ -0,0 +1,2 @@
+QA output created by 1229
+Silence is golden
--
2.49.0
* [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (7 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-29 19:49 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
` (3 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
This test is intended to ensure that multi block atomic writes
maintain atomic guarantees across sudden FS shutdowns.
The way it works is that we lay out a file with alternating written and
unwritten extents. Then we start performing atomic writes sequentially
on the file while we shut down the FS in parallel. Then we note the last
offset where an atomic write happened just before the shutdown and make
sure the blocks around it have either completely old data or completely
new data, i.e. the write was not torn during shutdown.
We repeat the same with completely written, completely unwritten and
completely empty files to ensure these cases are not torn either. Finally,
we have a similar test for append atomic writes.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
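The "completely old or completely new" check described above can be sketched
outside fstests as below. This is only an illustration under assumptions:
dump_chunk() and chunk_is_untorn() are hypothetical names that do not exist
in the patch, and the demo runs on small scratch files in a temporary
directory instead of a real filesystem.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): dump $len bytes at offset
# $off of $file as hex, the same way the test uses od.
dump_chunk() {
	local file=$1 off=$2 len=$3
	od -An -t x1 -j "$off" -N "$len" "$file"
}

# Return 0 if the chunk is entirely old or entirely new data, 1 if it is
# anything else (i.e. the write was torn).
chunk_is_untorn() {
	local file=$1 off=$2 len=$3 old=$4 new=$5
	local actual
	actual=$(dump_chunk "$file" "$off" "$len")
	[[ "$actual" == "$old" || "$actual" == "$new" ]]
}

# Demo on scratch files: old data is zeroes, new data is 0x61 ('a'),
# and the "torn" file holds half of each.
demo_dir=$(mktemp -d)
len=16
head -c "$len" /dev/zero > "$demo_dir/old"
head -c "$len" /dev/zero | tr '\0' 'a' > "$demo_dir/new"
{ head -c 8 /dev/zero | tr '\0' 'a'; head -c 8 /dev/zero; } > "$demo_dir/torn"

old=$(dump_chunk "$demo_dir/old" 0 "$len")
new=$(dump_chunk "$demo_dir/new" 0 "$len")

for f in old new torn; do
	if chunk_is_untorn "$demo_dir/$f" 0 "$len" "$old" "$new"; then
		echo "$f: ok"
	else
		echo "$f: torn"
	fi
done
rm -rf "$demo_dir"
```

Comparing the hex dump as a whole string keeps the check simple: any chunk
that mixes old and new bytes matches neither expected buffer.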
tests/generic/1230 | 397 +++++++++++++++++++++++++++++++++++++++++
tests/generic/1230.out | 2 +
2 files changed, 399 insertions(+)
create mode 100755 tests/generic/1230
create mode 100644 tests/generic/1230.out
diff --git a/tests/generic/1230 b/tests/generic/1230
new file mode 100755
index 00000000..cff5adc0
--- /dev/null
+++ b/tests/generic/1230
@@ -0,0 +1,397 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test No. 1230
+#
+# Test multi block atomic writes with sudden FS shutdowns to ensure
+# the FS is not tearing the write operation
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+echo "Awu max: $awu_max" >> $seqres.full
+
+num_blocks=$((awu_max / blksz))
+# keep initial value high for dry run. This will be
+# tweaked in dry_run() based on device write speed.
+filesize=$(( 10 * 1024 * 1024 * 1024 ))
+
+_cleanup() {
+ [ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
+ wait
+}
+
+atomic_write_loop() {
+ local off=0
+ local size=$awu_max
+ for ((i=0; i<$((filesize / $size )); i++)); do
+ # Due to sudden shutdown this can produce errors so just
+ # redirect them to seqres.full
+ $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
+ echo "Written to offset: $off" >> $tmp.aw
+ off=$((off + $size))
+ done
+}
+
+# This test has the following flow:
+# 1. Start doing sequential atomic writes in bg, up to $filesize
+# 2. Sleep for 0.2s and shutdown the FS
+# 3. kill the atomic write process
+# 4. verify the writes were not torn
+#
+# We ideally want the shutdown to happen while an atomic write is ongoing
+# but this gets tricky since faster devices can actually finish the whole
+# atomic write loop before sleep 0.2s completes, resulting in the shutdown
+# happening after the write loop which is not what we want. A simple solution
+# to this is to increase $filesize so step 1 takes long enough but a big
+# $filesize leads to create_mixed_mappings() taking very long, which is not
+# ideal.
+#
+# Hence, use the dry_run function to figure out the rough device speed and set
+# $filesize accordingly.
+dry_run() {
+ echo >> $seqres.full
+ echo "# Estimating ideal filesize..." >> $seqres.full
+ atomic_write_loop &
+ awloop_pid=$!
+
+ local i=0
+ # Wait for at least first write to be recorded or 10s
+ while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+ if [[ $i -gt 50 ]]
+ then
+ _fail "atomic write process took too long to start"
+ fi
+
+ echo >> $seqres.full
+ echo "# Shutting down filesystem while write is running" >> $seqres.full
+ _scratch_shutdown
+
+ kill $awloop_pid 2>/dev/null # the process might have finished already
+ wait $awloop_pid
+ unset awloop_pid
+
+ bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+ echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
+
+ filesize=$((bytes_written * 3))
+ echo "# Setting \$filesize=$filesize" >> $seqres.full
+
+ rm $tmp.aw
+ sleep 0.5
+
+ _scratch_cycle_mount
+
+}
+
+create_mixed_mappings() {
+ local file=$1
+ local size_bytes=$2
+
+ echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
+ #Fill the file with alternate written and unwritten blocks
+ local off=0
+ local operations=("W" "U")
+
+ for ((i=0; i<$((size_bytes / blksz )); i++)); do
+ index=$(($i % ${#operations[@]}))
+ map="${operations[$index]}"
+
+ case "$map" in
+ "W")
+ $XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file >> /dev/null
+ ;;
+ "U")
+ $XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
+ ;;
+ esac
+ off=$((off + blksz))
+ done
+
+ sync $file
+}
+
+populate_expected_data() {
+ # create a dummy file with expected old data for different cases
+ create_mixed_mappings $testfile.exp_old_mixed $awu_max
+ expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
+
+ $XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
+ expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
+
+ $XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
+ expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
+
+ # create a dummy file with expected new data
+ $XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
+ expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
+}
+
+verify_data_blocks() {
+ local verify_start=$1
+ local verify_end=$2
+ local expected_data_old="$3"
+ local expected_data_new="$4"
+
+ echo >> $seqres.full
+ echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
+
+ # After an atomic write, for every chunk we ensure that the underlying
+ # data is either the old data or new data as writes shouldn't get torn.
+ local off=$verify_start
+ while [[ "$off" -lt "$verify_end" ]]
+ do
+ #actual_data=$(xxd -s $off -l $awu_max -p $testfile)
+ actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
+ if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
+ then
+ echo "Checksum match failed at off: $off size: $awu_max"
+ echo "Expected contents: (Either of the 2 below):"
+ echo
+ echo "Expected old: "
+ echo "$expected_data_old"
+ echo
+ echo "Expected new: "
+ echo "$expected_data_new"
+ echo
+ echo "Actual contents: "
+ echo "$actual_data"
+
+ _fail
+ fi
+ echo -n "Check at offset $off succeeded! " >> $seqres.full
+ if [[ "$actual_data" == "$expected_data_new" ]]
+ then
+ echo "matched new" >> $seqres.full
+ elif [[ "$actual_data" == "$expected_data_old" ]]
+ then
+ echo "matched old" >> $seqres.full
+ fi
+ off=$(( off + awu_max ))
+ done
+}
+
+# test data integrity for file by shutting down in between atomic writes
+test_data_integrity() {
+ echo >> $seqres.full
+ echo "# Writing atomically to file in background" >> $seqres.full
+ atomic_write_loop &
+ awloop_pid=$!
+
+ local i=0
+ # Wait for at least first write to be recorded or 10s
+ while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+ if [[ $i -gt 50 ]]
+ then
+ _fail "atomic write process took too long to start"
+ fi
+
+ echo >> $seqres.full
+ echo "# Shutting down filesystem while write is running" >> $seqres.full
+ _scratch_shutdown
+
+ kill $awloop_pid 2>/dev/null # the process might have finished already
+ wait $awloop_pid
+ unset awloop_pid
+
+ last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+ if [[ -z $last_offset ]]
+ then
+ last_offset=0
+ fi
+
+ echo >> $seqres.full
+ echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+
+ rm $tmp.aw
+ sleep 0.5
+
+ _scratch_cycle_mount
+
+ # we want to verify all blocks around which the shutdown happened
+ verify_start=$(( last_offset - (awu_max * 5)))
+ if [[ "$verify_start" -lt 0 ]]
+ then
+ verify_start=0
+ fi
+
+ verify_end=$(( last_offset + (awu_max * 5)))
+ if [[ "$verify_end" -gt "$filesize" ]]
+ then
+ verify_end=$filesize
+ fi
+}
+
+# test data integrity for file with written and unwritten mappings
+test_data_integrity_mixed() {
+ $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Creating testfile with mixed mappings" >> $seqres.full
+ create_mixed_mappings $testfile $filesize
+
+ test_data_integrity
+
+ verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
+}
+
+# test data integrity for file with completely written mappings
+test_data_integrity_writ() {
+ $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Creating testfile with fully written mapping" >> $seqres.full
+ $XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
+ sync $testfile
+
+ test_data_integrity
+
+ verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
+}
+
+# test data integrity for file with completely unwritten mappings
+test_data_integrity_unwrit() {
+ $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
+ $XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
+ sync $testfile
+
+ test_data_integrity
+
+ verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+# test data integrity for file with no mappings
+test_data_integrity_hole() {
+ $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Creating testfile with no mappings" >> $seqres.full
+ $XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
+ sync $testfile
+
+ test_data_integrity
+
+ verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+test_filesize_integrity() {
+ $XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Performing extending atomic writes over file in background" >> $seqres.full
+ atomic_write_loop &
+ awloop_pid=$!
+
+ local i=0
+ # Wait for at least first write to be recorded or 10s
+ while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+ if [[ $i -gt 50 ]]
+ then
+ _fail "atomic write process took too long to start"
+ fi
+
+ echo >> $seqres.full
+ echo "# Shutting down filesystem while write is running" >> $seqres.full
+ _scratch_shutdown
+
+ kill $awloop_pid 2>/dev/null # the process might have finished already
+ wait $awloop_pid
+ unset awloop_pid
+
+ local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+ if [[ -z $last_offset ]]
+ then
+ last_offset=0
+ fi
+
+ echo >> $seqres.full
+ echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+ rm $tmp.aw
+ sleep 0.5
+
+ _scratch_cycle_mount
+ local filesize=$(_get_filesize $testfile)
+ echo >> $seqres.full
+ echo "# Filesize after shutdown: $filesize" >> $seqres.full
+
+ # To confirm that the write went atomically, we check:
+ # 1. The file size should be a multiple of awu_max
+ # 2. The last block should be the completely new data
+
+ if (( $filesize % $awu_max ))
+ then
+ echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
+ fi
+
+ verify_start=$(( filesize - (awu_max * 5)))
+ if [[ "$verify_start" -lt 0 ]]
+ then
+ verify_start=0
+ fi
+
+ local verify_end=$filesize
+
+ # Here the blocks should always match new data hence, for simplicity of
+ # code, just corrupt the $expected_data_old buffer so it never matches
+ local expected_data_old="POISON"
+ verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
+}
+
+$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+dry_run
+
+echo >> $seqres.full
+echo "# Populating expected data buffers" >> $seqres.full
+populate_expected_data
+
+# Loop 20 times to shake out any races due to shutdown
+for ((iter=0; iter<20; iter++))
+do
+ echo >> $seqres.full
+ echo "------ Iteration $iter ------" >> $seqres.full
+
+ echo >> $seqres.full
+ echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
+ test_data_integrity_mixed
+
+ echo >> $seqres.full
+ echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
+ test_data_integrity_writ
+
+ echo >> $seqres.full
+ echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
+ test_data_integrity_unwrit
+
+ echo >> $seqres.full
+ echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
+ test_data_integrity_hole
+
+ echo >> $seqres.full
+ echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
+ test_filesize_integrity
+done
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1230.out b/tests/generic/1230.out
new file mode 100644
index 00000000..d01f54ea
--- /dev/null
+++ b/tests/generic/1230.out
@@ -0,0 +1,2 @@
+QA output created by 1230
+Silence is golden
--
2.49.0
* [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (8 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-29 19:47 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
` (2 subsequent siblings)
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
We brute force all possible blocksize & clustersize combinations on
a bigalloc filesystem for stressing atomic write using fio data crc
verifier. We run nproc * 2 * $LOAD_FACTOR threads in parallel writing to
a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
that we never see a mix of data contents from different threads for
any given fio blocksize.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
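For reference, the combinations that get brute forced can be listed with a
small standalone sketch. It assumes min() and max() behave like the _min()
and _max() helpers added in patch 01/13 (print the smallest/largest of the
integer arguments); list_combinations() and the sample values passed to it
are made up for illustration and are not part of the test.

```shell
#!/bin/bash
# Stand-ins for _min()/_max() from common/rc (assumed behaviour: print the
# smallest/largest of the integer arguments).
min() {
	local m=$1 v
	for v in "$@"; do (( v < m )) && m=$v; done
	echo "$m"
}

max() {
	local m=$1 v
	for v in "$@"; do (( v > m )) && m=$v; done
	echo "$m"
}

# Print every "fsblocksize fsclustersize fiobsize" triple for the given
# bdev atomic write unit min/max and page size, mirroring the loop bounds
# used by the test: cluster size capped at 16 x blocksize, at the bdev
# awu_max and at 128KB.
list_combinations() {
	local awu_min=$1 awu_max=$2 pagesize=$3
	local bs cs fiobs max_cs

	for ((bs = $(max 4096 "$awu_min"); bs <= pagesize; bs <<= 1)); do
		max_cs=$(min $((16 * bs)) "$awu_max" $((128 * 1024)))
		for ((cs = bs; cs <= max_cs; cs <<= 1)); do
			for ((fiobs = bs; fiobs <= cs; fiobs <<= 1)); do
				echo "$bs $cs $fiobs"
			done
		done
	done
}

# Sample values: awu_min=512, awu_max=16k, 4k pages
list_combinations 512 16384 4096
```

With 4k pages only one fs block size is tried, so the number of mkfs/fio
rounds is driven mostly by the bdev awu_max.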
tests/ext4/061 | 130 +++++++++++++++++++++++++++++++++++++++++++++
tests/ext4/061.out | 2 +
2 files changed, 132 insertions(+)
create mode 100755 tests/ext4/061
create mode 100644 tests/ext4/061.out
diff --git a/tests/ext4/061 b/tests/ext4/061
new file mode 100755
index 00000000..a0e49249
--- /dev/null
+++ b/tests/ext4/061
@@ -0,0 +1,130 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 061
+#
+# Brute force all possible blocksize clustersize combination on a bigalloc
+# filesystem for stressing atomic write using fio data crc verifier. We run
+# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
+# we should never see the mix of data contents from different threads for any
+# given fio blocksize.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100*1024*1024))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_configs()
+{
+ create_fio_aw_config
+ create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+ [aio-dio-aw-verify]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$fiobsize
+ fallocate=native
+ filename=$SCRATCH_MNT/test-file
+ size=$SIZE
+ iodepth=$FIO_LOAD
+ numjobs=$FIO_LOAD
+ atomic=1
+ group_reporting=1
+
+ verify_only=1
+ verify_state_save=0
+ verify=crc32c
+ verify_fatal=1
+ verify_write_sequence=0
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+ [aio-dio-aw]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$fiobsize
+ fallocate=native
+ filename=$SCRATCH_MNT/test-file
+ size=$SIZE
+ iodepth=$FIO_LOAD
+ numjobs=$FIO_LOAD
+ group_reporting=1
+ atomic=1
+
+ verify_state_save=0
+ verify=crc32c
+ do_verify=0
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+ # cluster sizes above 16 x blocksize are experimental so avoid them
+ # Also, cap cluster size at 128kb to keep it reasonable for large
+ # blocks size
+ fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+ for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+ for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+ MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+ _scratch_mkfs_ext4 >> $seqres.full 2>&1 || continue
+ if _try_scratch_mount >> $seqres.full 2>&1; then
+ echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+ touch $SCRATCH_MNT/f1
+ create_fio_configs
+
+ cat $fio_aw_config >> $seqres.full
+ echo >> $seqres.full
+ cat $fio_verify_config >> $seqres.full
+
+ $FIO_PROG $fio_aw_config >> $seqres.full
+ ret1=$?
+
+ $FIO_PROG $fio_verify_config >> $seqres.full
+ ret2=$?
+
+ _scratch_unmount
+
+ [[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+ fi
+ done
+ done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/061.out b/tests/ext4/061.out
new file mode 100644
index 00000000..273be9e0
--- /dev/null
+++ b/tests/ext4/061.out
@@ -0,0 +1,2 @@
+QA output created by 061
+Silence is golden
--
2.49.0
* [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (9 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-29 19:44 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Brute force all possible blocksize and clustersize combinations on a bigalloc
filesystem for stressing atomic write using fio data crc verifier. We run
multiple threads in parallel with each job writing to its own file. The
parallel jobs running on a constrained filesystem size ensure that we stress
the ext4 allocator to allocate contiguous extents.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
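The fio configs in this test spell out eight per-file job sections by hand.
Purely as an illustration of that layout (this is not what the patch does),
the same sections could be emitted by a loop. emit_fio_job_sections() is a
hypothetical helper and /mnt/scratch is a placeholder path.

```shell
#!/bin/bash
# Hypothetical helper: emit one fio job section per file.
#   $1 - job name prefix (e.g. "write" or "verify")
#   $2 - directory holding the per-job files
#   $3 - number of jobs
emit_fio_job_sections() {
	local prefix=$1 dir=$2 njobs=$3 i

	for ((i = 1; i <= njobs; i++)); do
		printf '[%s-job%d]\nfilename=%s/testfile-job%d\n\n' \
			"$prefix" "$i" "$dir" "$i"
	done
}

# Print the eight write job sections for a placeholder mount point
emit_fio_job_sections write /mnt/scratch 8
```

The write and verify configs must name the same files, so generating both
from one helper would keep them in sync, at the cost of a config that is
less obvious to read in the test source.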
tests/ext4/062 | 176 +++++++++++++++++++++++++++++++++++++++++++++
tests/ext4/062.out | 2 +
2 files changed, 178 insertions(+)
create mode 100755 tests/ext4/062
create mode 100644 tests/ext4/062.out
diff --git a/tests/ext4/062 b/tests/ext4/062
new file mode 100755
index 00000000..85b82f97
--- /dev/null
+++ b/tests/ext4/062
@@ -0,0 +1,176 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 062
+#
+# Brute force all possible blocksize and clustersize combinations on a bigalloc
+# filesystem for stressing atomic write using fio data crc verifier. We run
+# nproc * $LOAD_FACTOR threads in parallel for each of 8 jobs, with every job
+# writing to its own file ($SCRATCH_MNT/testfile-jobN). Running these jobs on
+# a constrained filesystem size stresses the ext4 allocator to allocate
+# contiguous extents.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FSSIZE=$((360*1024*1024))
+FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_configs()
+{
+ create_fio_aw_config
+ create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+ [global]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$fiobsize
+ fallocate=truncate
+ size=$((FSSIZE / 12))
+ iodepth=$FIO_LOAD
+ numjobs=$FIO_LOAD
+ group_reporting=1
+ atomic=1
+
+ verify_only=1
+ verify_state_save=0
+ verify=crc32c
+ verify_fatal=1
+ verify_write_sequence=0
+
+ [verify-job1]
+ filename=$SCRATCH_MNT/testfile-job1
+
+ [verify-job2]
+ filename=$SCRATCH_MNT/testfile-job2
+
+ [verify-job3]
+ filename=$SCRATCH_MNT/testfile-job3
+
+ [verify-job4]
+ filename=$SCRATCH_MNT/testfile-job4
+
+ [verify-job5]
+ filename=$SCRATCH_MNT/testfile-job5
+
+ [verify-job6]
+ filename=$SCRATCH_MNT/testfile-job6
+
+ [verify-job7]
+ filename=$SCRATCH_MNT/testfile-job7
+
+ [verify-job8]
+ filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+ [global]
+ direct=1
+ ioengine=libaio
+ rw=randwrite
+ bs=$fiobsize
+ fallocate=truncate
+ size=$((FSSIZE / 12))
+ iodepth=$FIO_LOAD
+ numjobs=$FIO_LOAD
+ group_reporting=1
+ atomic=1
+
+ verify_state_save=0
+ verify=crc32c
+ do_verify=0
+
+ [write-job1]
+ filename=$SCRATCH_MNT/testfile-job1
+
+ [write-job2]
+ filename=$SCRATCH_MNT/testfile-job2
+
+ [write-job3]
+ filename=$SCRATCH_MNT/testfile-job3
+
+ [write-job4]
+ filename=$SCRATCH_MNT/testfile-job4
+
+ [write-job5]
+ filename=$SCRATCH_MNT/testfile-job5
+
+ [write-job6]
+ filename=$SCRATCH_MNT/testfile-job6
+
+ [write-job7]
+ filename=$SCRATCH_MNT/testfile-job7
+
+ [write-job8]
+ filename=$SCRATCH_MNT/testfile-job8
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+ # cluster sizes above 16 x blocksize are experimental so avoid them
+ # Also, cap cluster size at 128kb to keep it reasonable for large
+ # blocks size cases.
+ fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+ for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+ for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+ MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+ _scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
+ if _try_scratch_mount >> $seqres.full 2>&1; then
+ echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+ touch $SCRATCH_MNT/f1
+ create_fio_configs
+
+ cat $fio_aw_config >> $seqres.full
+ cat $fio_verify_config >> $seqres.full
+
+ $FIO_PROG $fio_aw_config >> $seqres.full
+ ret1=$?
+
+ $FIO_PROG $fio_verify_config >> $seqres.full
+ ret2=$?
+
+ _scratch_unmount
+
+ [[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+ fi
+ done
+ done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/062.out b/tests/ext4/062.out
new file mode 100644
index 00000000..a1578f48
--- /dev/null
+++ b/tests/ext4/062.out
@@ -0,0 +1,2 @@
+QA output created by 062
+Silence is golden
--
2.49.0
* [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (10 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-29 19:41 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
In ext4, even if an allocated range is physically and logically
contiguous, it can still be split into 2 extents. This is because ext4
does not merge extents across leaf nodes. This is an issue for atomic
writes since even for a contiguous extent the map block could (in rare
cases) return a shorter map, hence tearing the write. This test creates
such a file and ensures that the atomic write handles this case
correctly.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
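The offsets used by the test follow from simple arithmetic on the extent
leaf layout. The sketch below works it through for a 4k block size. The
function names are hypothetical, and awu_max=16384 is only an assumed value
for illustration (the test reads the real one from the file).

```shell
#!/bin/bash
# Sizes of the on-disk ext4 extent header and extent entry, as used by
# prep() in the test.
ex_hdr_bytes=12
ex_entry_bytes=12

# Number of extent entries that fit in one leaf block of size $1.
entries_per_leaf() {
	echo $(( ($1 - ex_hdr_bytes) / ex_entry_bytes ))
}

# Byte offset of the single block written between the last extent of the
# first leaf and the first extent of the second leaf, for block size $1.
torn_extent_offset() {
	local bs=$1 n
	n=$(entries_per_leaf "$bs")
	echo $(( ((n * 2) - 1) * bs ))
}

# Round offset $1 down to a multiple of the atomic write unit $2.
align_down() {
	echo $(( $1 - ($1 % $2) ))
}

bs=4096
awu_max=16384	# assumed value, for illustration only
off=$(torn_extent_offset "$bs")
echo "entries per leaf: $(entries_per_leaf "$bs")"
echo "torn extent offset: $off (block $((off / bs)))"
echo "atomic write offset: $(align_down "$off" "$awu_max")"
```

For a 4k block size this gives 340 entries per leaf and a write at block
679, i.e. the gap between the extents at blocks 678 and 680.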
tests/ext4/063 | 125 +++++++++++++++++++++++++++++++++++++++++++++
tests/ext4/063.out | 2 +
2 files changed, 127 insertions(+)
create mode 100755 tests/ext4/063
create mode 100644 tests/ext4/063.out
diff --git a/tests/ext4/063 b/tests/ext4/063
new file mode 100755
index 00000000..25b5693d
--- /dev/null
+++ b/tests/ext4/063
@@ -0,0 +1,125 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# In ext4, even if an allocated range is physically and logically contiguous,
+# it can still be split into 2 extents. This is because ext4 does not merge
+# extents across leaf nodes. This is an issue for atomic writes since even for
+# a contiguous extent the map block could (in rare cases) return a shorter map,
+# hence tearing the write. This test creates such a file and ensures that the
+# atomic write handles this case correctly.
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_command "$DEBUGFS_PROG" debugfs
+
+prep() {
+ local bs=`_get_block_size $SCRATCH_MNT`
+ local ex_hdr_bytes=12
+ local ex_entry_bytes=12
+ local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
+
+ # fill the extent tree leaf with bs len extents at alternate offsets. For example,
+ # for 4k bs the tree should look as follows
+ #
+ # +---------+---------+
+ # | index 1 | index 2 |
+ # +-----+---+-----+---+
+ # +--------+ +-------+
+ # | |
+ # +----------+--------------+ +-----+-----+
+ # | ex 1 | ex 2 |... | ex n | | ex n + 1 |
+ # +-------------------------+ +-----------+
+ # 0 2 680 682
+ for i in $(seq 0 $entries_per_blk)
+ do
+ $XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
+ done
+ sync $testfile
+
+ echo >> $seqres.full
+ echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
+ echo "...">> $seqres.full
+ $DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+ # Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
+ # this is a new FS the allocator would find contiguous blocks such that
+ # ex(n) ex(new) ex(n+1) are physically (and logically) contiguous. However,
+ # since we don't merge extents across leaves we will end up with a tree as:
+ #
+ # +---------+---------+
+ # | index 1 | index 2 |
+ # +-----+---+-----+---+
+ # +--------+ +-------+
+ # | |
+ # +----------+--------------+ +-----+-----+
+ # | ex 1 | ex 2 |... | ex n | | ex merged |
+ # +-------------------------+ +-----------+
+ # 0 2 680 681 682 684
+ #
+ echo >> $seqres.full
+ torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
+ $XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
+ sync $testfile
+
+ echo >> $seqres.full
+ echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
+ echo "...">> $seqres.full
+ $DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+ _scratch_cycle_mount
+}
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+awu_max=$(_get_atomic_write_unit_max $testfile)
+
+echo >> $seqres.full
+echo "# Prepping the file" >> $seqres.full
+prep
+
+torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
+
+echo >> $seqres.full
+echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
+echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
+
+echo >> $seqres.full
+echo "Extent state after atomic write:">> $seqres.full
+echo "...">> $seqres.full
+$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
+
+echo >> $seqres.full
+echo "# Checking data integrity" >> $seqres.full
+
+# create a dummy file with expected data
+$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
+expected_data=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp)
+
+# We ensure that the data after atomic writes should match the expected data
+actual_data=$(od -An -t x1 -j $torn_aw_offset -N $awu_max $testfile)
+if [[ "$actual_data" != "$expected_data" ]]
+then
+ echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
+ echo
+ echo "Expected: "
+ echo "$expected_data"
+ echo
+ echo "Actual contents: "
+ echo "$actual_data"
+
+ _fail
+fi
+
+echo -n "Data verification at offset $torn_aw_offset succeeded!" >> $seqres.full
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/063.out b/tests/ext4/063.out
new file mode 100644
index 00000000..de35fc52
--- /dev/null
+++ b/tests/ext4/063.out
@@ -0,0 +1,2 @@
+QA output created by 063
+Silence is golden
--
2.49.0
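
For reference, a worked example of the offset arithmetic used by ext4/063
above. This is only a sketch: bs=4096 and awu_max=16384 are assumed values
picked for illustration, whereas the test reads the real ones from the
scratch filesystem.

```shell
#!/bin/bash
# Assumed values for illustration only; ext4/063 queries these at runtime.
bs=4096
awu_max=16384
ex_hdr_bytes=12
ex_entry_bytes=12

# Number of extent entries that fit in one leaf block.
entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))

# The single-block hole just before the last extent written by the loop.
torn_ex_offset=$(( ((entries_per_blk * 2) - 1) * bs ))

# Round down to an awu_max boundary so the atomic write is naturally aligned.
torn_aw_offset=$(( torn_ex_offset - (torn_ex_offset % awu_max) ))

echo "entries_per_blk=$entries_per_blk"
echo "torn_ex_offset=$torn_ex_offset (block $((torn_ex_offset / bs)))"
echo "torn_aw_offset=$torn_aw_offset (block $((torn_aw_offset / bs)))"
```

With these assumed values the 16k atomic write starts at block 676 and ends
at block 679, which is the block that prep() filled in last.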
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
` (11 preceding siblings ...)
2025-07-12 14:12 ` [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-07-12 14:12 ` Ojaswin Mujoo
2025-07-29 19:36 ` Darrick J. Wong
12 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-12 14:12 UTC (permalink / raw)
To: Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
Test atomic writes with journal credit calculation. We take 2 cases
here:
1. Atomic writes on single mapping causing tree to collapse into
the inode
2. Atomic writes on mixed mapping causing tree to collapse into the
inode
This test is inspired by ext4/034.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
tests/ext4/064 | 75 ++++++++++++++++++++++++++++++++++++++++++++++
tests/ext4/064.out | 2 ++
2 files changed, 77 insertions(+)
create mode 100755 tests/ext4/064
create mode 100644 tests/ext4/064.out
diff --git a/tests/ext4/064 b/tests/ext4/064
new file mode 100755
index 00000000..ec31f983
--- /dev/null
+++ b/tests/ext4/064
@@ -0,0 +1,75 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 064
+#
+# Test that proper credit reservation is done when performing
+# tree collapse during an atomic write based allocation
+#
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto quick quota fiemap prealloc atomicwrites
+
+# Import common functions.
+
+
+# Modify as appropriate.
+_exclude_fs ext2
+_exclude_fs ext3
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "syncfs"
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+
+echo "----- Testing with atomic write on non-mixed mapping -----" >> $seqres.full
+
+echo "Format and mount" >> $seqres.full
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+ -c "pwrite 20k 4k" -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+ -c "fsync" $SCRATCH_MNT/foobar >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 4k 4k" $SCRATCH_MNT/foobar >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+echo "----- Testing with atomic write on mixed mapping -----" >> $seqres.full
+
+echo "Create the original file" >> $seqres.full
+touch $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create 2 level extent tree (btree) for foobar2 with an unwritten extent" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
+ -c "pwrite 20k 4k" -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
+ -c "fsync" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
+$XFS_IO_PROG -dc "pwrite -A -V1 0k 12k" $SCRATCH_MNT/foobar2 >> $seqres.full
+
+echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy2 >> $seqres.full
+
+echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
+$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/ext4/064.out b/tests/ext4/064.out
new file mode 100644
index 00000000..d9076546
--- /dev/null
+++ b/tests/ext4/064.out
@@ -0,0 +1,2 @@
+QA output created by 064
+Silence is golden
--
2.49.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-12 14:12 ` [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
@ 2025-07-17 13:00 ` John Garry
2025-07-17 13:52 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-17 13:00 UTC (permalink / raw)
To: Ojaswin Mujoo, Zorro Lang, fstests
Cc: Ritesh Harjani, djwong, tytso, linux-xfs, linux-kernel,
linux-ext4
On 12/07/2025 15:12, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>
> This adds atomic write test using fio based on it's crc check verifier.
> fio adds a crc for each data block. If the underlying device supports atomic
> write then it is guaranteed that we will never have a mix data from two
> threads writing on the same physical block.
I think that you should mention the 2-phase approach.
Is there something which ensures that we have a fio which supports
RWF_ATOMIC? For some time fio accepted the "atomic" cmdline param but
did not do anything with it until recently.
>
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/generic/1226 | 101 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1226.out | 2 +
Was this tested with xfs?
> 2 files changed, 103 insertions(+)
> create mode 100755 tests/generic/1226
> create mode 100644 tests/generic/1226.out
>
> diff --git a/tests/generic/1226 b/tests/generic/1226
> new file mode 100755
> index 00000000..455fc55f
> --- /dev/null
> +++ b/tests/generic/1226
> @@ -0,0 +1,101 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1226
> +#
> +# Validate FS atomic write using fio crc check verifier.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_odirect
> +_require_aio
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((100 * 1024 * 1024))
> +
> +function create_fio_configs()
> +{
> + create_fio_aw_config
it's strange ordering in this file, since create_fio_aw_config is
declared below here
> + create_fio_verify_config
same
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> + [verify-job]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
is this really required? Maybe it is. I would use read if something was
required for this param
> + bs=$blocksize
> + fallocate=native
> + filename=$SCRATCH_MNT/test-file
> + size=$SIZE
> + iodepth=$FIO_LOAD
> + group_reporting=1
> +
> + verify_only=1
> + verify=crc32c
> + verify_fatal=1
> + verify_state_save=0
> + verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> + [atomicwrite-job]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
> + bs=$blocksize
> + fallocate=native
> + filename=$SCRATCH_MNT/test-file
> + size=$SIZE
> + iodepth=$FIO_LOAD
> + numjobs=$FIO_LOAD
> + group_reporting=1
> + atomic=1
> +
> + verify_state_save=0
> + verify=crc32c
> + do_verify=0
> +EOF
> +}
> +
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +ret1=$?
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +ret2=$?
> +
> +[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1226.out b/tests/generic/1226.out
> new file mode 100644
> index 00000000..6dce0ea5
> --- /dev/null
> +++ b/tests/generic/1226.out
> @@ -0,0 +1,2 @@
> +QA output created by 1226
> +Silence is golden
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-17 13:00 ` John Garry
@ 2025-07-17 13:52 ` Ojaswin Mujoo
2025-07-17 14:06 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-17 13:52 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 02:00:18PM +0100, John Garry wrote:
> On 12/07/2025 15:12, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> >
> > This adds atomic write test using fio based on it's crc check verifier.
> > fio adds a crc for each data block. If the underlying device supports atomic
> > write then it is guaranteed that we will never have a mix data from two
> > threads writing on the same physical block.
>
> I think that you should mention that 2-phase approach.
Sure I can add a comment and update the commit message with this.
>
> Is there something which ensures that we have fio which supports RWF_ATOMIC?
> fio for some time supported the "atomic" cmdline param, but did not do
> anything until recently
We do have _require_fio which ensures the options passed are supported
by the current fio. If you are saying some versions of fio accept --atomic
but don't actually issue RWF_ATOMIC, then I'm not really sure that can be
caught.
>
> >
> > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> > tests/generic/1226 | 101 +++++++++++++++++++++++++++++++++++++++++
> > tests/generic/1226.out | 2 +
>
> Was this tested with xfs?
Yes, I've tested with XFS with software fallback as well. Also, tested
xfs while keeping io size as 16kb so we stress the hw paths too. Both
seem to be passing as expected.
>
> > 2 files changed, 103 insertions(+)
> > create mode 100755 tests/generic/1226
> > create mode 100644 tests/generic/1226.out
> >
> > diff --git a/tests/generic/1226 b/tests/generic/1226
> > new file mode 100755
> > index 00000000..455fc55f
> > --- /dev/null
> > +++ b/tests/generic/1226
> > @@ -0,0 +1,101 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 1226
> > +#
> > +# Validate FS atomic write using fio crc check verifier.
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto aio rw atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_odirect
> > +_require_aio
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount
> > +
> > +touch "$SCRATCH_MNT/f1"
> > +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> > +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> > +blocksize=$(_max "$awu_min_write" "$((awu_max_write/2))")
> > +
> > +fio_config=$tmp.fio
> > +fio_out=$tmp.fio.out
> > +
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > +SIZE=$((100 * 1024 * 1024))
> > +
> > +function create_fio_configs()
> > +{
> > + create_fio_aw_config
>
> it's strange ordering in this file, since create_fio_aw_config is declared
> below here
>
> > + create_fio_verify_config
>
> same
That works in bash.
>
> > +}
> > +
> > +function create_fio_verify_config()
> > +{
> > +cat >$fio_verify_config <<EOF
> > + [verify-job]
> > + direct=1
> > + ioengine=libaio
> > + rw=randwrite
>
> is this really required? Maybe it is. I would use read if something was
> required for this param
Usually the fio verify phase internally converts writes to reads so
rw=write and rw=read doesn't matter much. I can make the change though,
should be fine.
Thanks,
ojaswin
>
> > + bs=$blocksize
> > + fallocate=native
> > + filename=$SCRATCH_MNT/test-file
> > + size=$SIZE
> > + iodepth=$FIO_LOAD
> > + group_reporting=1
> > +
> > + verify_only=1
> > + verify=crc32c
> > + verify_fatal=1
> > + verify_state_save=0
> > + verify_write_sequence=0
> > +EOF
> > +}
> > +
> > +function create_fio_aw_config()
> > +{
> > +cat >$fio_aw_config <<EOF
> > + [atomicwrite-job]
> > + direct=1
> > + ioengine=libaio
> > + rw=randwrite
> > + bs=$blocksize
> > + fallocate=native
> > + filename=$SCRATCH_MNT/test-file
> > + size=$SIZE
> > + iodepth=$FIO_LOAD
> > + numjobs=$FIO_LOAD
> > + group_reporting=1
> > + atomic=1
> > +
> > + verify_state_save=0
> > + verify=crc32c
> > + do_verify=0
> > +EOF
> > +}
> > +
> > +fio_aw_config=$tmp.aw.fio
> > +fio_verify_config=$tmp.verify.fio
> > +
> > +create_fio_configs
> > +_require_fio $fio_aw_config
> > +
> > +cat $fio_aw_config >> $seqres.full
> > +cat $fio_verify_config >> $seqres.full
> > +
> > +$FIO_PROG $fio_aw_config >> $seqres.full
> > +ret1=$?
> > +$FIO_PROG $fio_verify_config >> $seqres.full
> > +ret2=$?
> > +
> > +[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/generic/1226.out b/tests/generic/1226.out
> > new file mode 100644
> > index 00000000..6dce0ea5
> > --- /dev/null
> > +++ b/tests/generic/1226.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 1226
> > +Silence is golden
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-17 13:52 ` Ojaswin Mujoo
@ 2025-07-17 14:06 ` John Garry
2025-07-22 8:47 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-17 14:06 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 17/07/2025 14:52, Ojaswin Mujoo wrote:
> On Thu, Jul 17, 2025 at 02:00:18PM +0100, John Garry wrote:
>> On 12/07/2025 15:12, Ojaswin Mujoo wrote:
>>> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>>>
>>> This adds atomic write test using fio based on it's crc check verifier.
>>> fio adds a crc for each data block. If the underlying device supports atomic
>>> write then it is guaranteed that we will never have a mix data from two
>>> threads writing on the same physical block.
>>
>> I think that you should mention that 2-phase approach.
>
> Sure I can add a comment and update the commit message with this.
>
>>
>> Is there something which ensures that we have fio which supports RWF_ATOMIC?
>> fio for some time supported the "atomic" cmdline param, but did not do
>> anything until recently
>
> We do have _require_fio which ensures the options passed are supported
> by the current fio. If you are saying some versions of fio have --atomic
> valid but dont do an RWF_ATOMIC then I'm not really sure if that can be
> caught though.
Can you check the fio version?
>
>>
>>>
>>> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>>> ---
>>> tests/generic/1226 | 101 +++++++++++++++++++++++++++++++++++++++++
>>> tests/generic/1226.out | 2 +
>>
>> Was this tested with xfs?
>
> Yes, I've tested with XFS with software fallback as well. Also, tested
> xfs while keeping io size as 16kb so we stress the hw paths too.
so is that requirement implemented with the
_require_scratch_write_atomic check?
> Both
> seem to be passing as expected.
>>
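
On the fio version question above, one possible shape for such a check is
sketched below. fio_version_at_least is a hypothetical helper, not an
existing fstests function, and MIN_FIO_ATOMIC is a placeholder rather than a
confirmed release number; the real minimum would have to be taken from the
fio changelog.

```shell
#!/bin/bash
# Hypothetical helper: succeed if version string $1 (e.g. "fio-3.38") is at
# least $2 (e.g. "3.38"). Not part of fstests.
fio_version_at_least() {
	local have=${1#fio-}
	local want=$2

	# sort -V orders version strings; if $want sorts first (or the two are
	# equal), the installed version is new enough.
	[ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Placeholder minimum version, an assumption made for this sketch.
MIN_FIO_ATOMIC=${MIN_FIO_ATOMIC:-3.38}
FIO_PROG=${FIO_PROG:-fio}

if command -v "$FIO_PROG" >/dev/null 2>&1; then
	ver=$("$FIO_PROG" --version)
	if fio_version_at_least "$ver" "$MIN_FIO_ATOMIC"; then
		echo "$ver is at least $MIN_FIO_ATOMIC"
	else
		echo "$ver is older than $MIN_FIO_ATOMIC"
	fi
fi
```

In a test, the "older" branch would presumably end in _notrun instead of an
echo.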
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 01/13] common/rc: Add _min() and _max() helpers
2025-07-12 14:12 ` [PATCH v3 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
@ 2025-07-17 15:02 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 15:02 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:43PM +0530, Ojaswin Mujoo wrote:
> Many programs open code these functionalities so add it as a generic helper
> in common/rc
>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Looks decent,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> common/rc | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index f71cc8f0..9a9d3cc8 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5817,6 +5817,28 @@ _require_program() {
> _have_program "$1" || _notrun "$tag required"
> }
>
> +_min() {
> + local ret
> +
> + for arg in "$@"; do
> + if [ -z "$ret" ] || (( $arg < $ret )); then
> + ret="$arg"
> + fi
> + done
> + echo $ret
> +}
> +
> +_max() {
> + local ret
> +
> + for arg in "$@"; do
> + if [ -z "$ret" ] || (( $arg > $ret )); then
> + ret="$arg"
> + fi
> + done
> + echo $ret
> +}
> +
> ################################################################################
> # make sure this script returns success
> /bin/true
> --
> 2.49.0
>
>
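
A short usage sketch of the two helpers, for anyone reading along. The
function bodies are copied from the quoted patch so the snippet runs on its
own; the awu_min_write/awu_max_write values are made-up sample limits, not
numbers from a real device.

```shell
#!/bin/bash
# Copies of the helpers from the quoted patch.
_min() {
	local ret

	for arg in "$@"; do
		if [ -z "$ret" ] || (( $arg < $ret )); then
			ret="$arg"
		fi
	done
	echo $ret
}

_max() {
	local ret

	for arg in "$@"; do
		if [ -z "$ret" ] || (( $arg > $ret )); then
			ret="$arg"
		fi
	done
	echo $ret
}

# Sample limits, assumed for illustration.
awu_min_write=4096
awu_max_write=65536

# Same pattern generic/1226 uses to pick its fio block size.
blocksize=$(_max "$awu_min_write" "$((awu_max_write / 2))")
echo "blocksize=$blocksize"
echo "smallest=$(_min 8192 4096 16384)"
```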
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc
2025-07-12 14:12 ` [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
@ 2025-07-17 16:11 ` Darrick J. Wong
2025-07-22 9:53 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 16:11 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:44PM +0530, Ojaswin Mujoo wrote:
> Insert range and collapse range only works with bigalloc in case
> the range is cluster size aligned, which fsx doesnt take care. To
> work past this, disable insert range and collapse range on ext4, if
> bigalloc is enabled.
>
> This is achieved by defining a new function _set_default_fsx_avoid
> called via run_fsx helper. This can be used to selectively disable
> fsx options based on the configuration.
>
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> common/rc | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index 9a9d3cc8..218cf253 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5113,10 +5113,37 @@ _require_hugepage_fsx()
> _notrun "fsx binary does not support MADV_COLLAPSE"
> }
>
> +_set_default_fsx_avoid() {
> + local file=$1
> +
> + case "$FSTYP" in
> + "ext4")
> + local dev=$(findmnt -n -o SOURCE --target $file)
> +
> + # open code instead of _require_dumpe2fs cause we don't
> + # want to _notrun if dumpe2fs is not available
> + if [ -z "$DUMPE2FS_PROG" ]; then
> + echo "_set_default_fsx_avoid: dumpe2fs not found, skipping bigalloc check." >> $seqres.full
> + return
> + fi
I hate to be the guy who says one thing and then another, but ...
If we extended _get_file_block_size to report the ext4 bigalloc cluster
size, would that be sufficient to keep testing collapse/insert range?
I guess the tricky part here is that bigalloc allows sub-cluster
mappings and we might not want to do all file IO testing in such big
units.
> +
> + $DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
> + export FSX_AVOID+=" -I -C"
No need to export FSX_AVOID to subprocesses.
--D
> + }
> + ;;
> + # Add other filesystem types here as needed
> + *)
> + ;;
> + esac
> +}
> +
> _run_fsx()
> {
> echo "fsx $*"
> local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> +
> + _set_default_fsx_avoid $testfile
> +
> set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> echo "$@" >>$seqres.full
> rm -f $TEST_DIR/junk
> --
> 2.49.0
>
>
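
To illustrate the export comment above: FSX_AVOID is expanded by the shell
when _run_fsx builds the fsx command line, so a plain assignment is enough.
A minimal sketch follows; has_bigalloc is a hypothetical helper and the
sample features line is hand-written to stand in for `dumpe2fs -h` output.

```shell
#!/bin/bash
# Hypothetical helper: reads `dumpe2fs -h` style output on stdin and succeeds
# if the "Filesystem features:" line lists bigalloc.
has_bigalloc() {
	grep '^Filesystem features:' | grep -qw bigalloc
}

# Hand-written sample, standing in for real dumpe2fs output.
sample="Filesystem features:      has_journal ext_attr extent bigalloc metadata_csum"

FSX_AVOID=""
if echo "$sample" | has_bigalloc; then
	# Plain assignment, no export: the variable only needs to be visible
	# to the shell that expands it into the fsx argument list.
	FSX_AVOID+=" -I -C"
fi

echo "fsx args:$FSX_AVOID"
```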
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx
2025-07-12 14:12 ` [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
@ 2025-07-17 16:17 ` Darrick J. Wong
2025-07-22 9:59 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 16:17 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:46PM +0530, Ojaswin Mujoo wrote:
> Implement atomic write support to help fuzz atomic writes
> with fsx.
>
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> ltp/fsx.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 104 insertions(+), 5 deletions(-)
>
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index 163b9453..ea39ca29 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -40,6 +40,7 @@
> #include <liburing.h>
> #endif
> #include <sys/syscall.h>
> +#include "statx.h"
>
> #ifndef MAP_FILE
> # define MAP_FILE 0
> @@ -49,6 +50,10 @@
> #define RWF_DONTCACHE 0x80
> #endif
>
> +#ifndef RWF_ATOMIC
> +#define RWF_ATOMIC 0x40
> +#endif
> +
> #define NUMPRINTCOLUMNS 32 /* # columns of data to print on each line */
>
> /* Operation flags (bitmask) */
> @@ -110,6 +115,7 @@ enum {
> OP_READ_DONTCACHE,
> OP_WRITE,
> OP_WRITE_DONTCACHE,
> + OP_WRITE_ATOMIC,
> OP_MAPREAD,
> OP_MAPWRITE,
> OP_MAX_LITE,
> @@ -200,6 +206,11 @@ int uring = 0;
> int mark_nr = 0;
> int dontcache_io = 1;
> int hugepages = 0; /* -h flag */
> +int do_atomic_writes = 1; /* -a flag disables */
> +
> +/* Used for atomic writes */
> +int awu_min = 0;
> +int awu_max = 0;
>
> /* Stores info needed to periodically collapse hugepages */
> struct hugepages_collapse_info {
> @@ -288,6 +299,7 @@ static const char *op_names[] = {
> [OP_READ_DONTCACHE] = "read_dontcache",
> [OP_WRITE] = "write",
> [OP_WRITE_DONTCACHE] = "write_dontcache",
> + [OP_WRITE_ATOMIC] = "write_atomic",
> [OP_MAPREAD] = "mapread",
> [OP_MAPWRITE] = "mapwrite",
> [OP_TRUNCATE] = "truncate",
> @@ -422,6 +434,7 @@ logdump(void)
> prt("\t***RRRR***");
> break;
> case OP_WRITE_DONTCACHE:
> + case OP_WRITE_ATOMIC:
> case OP_WRITE:
> prt("WRITE 0x%x thru 0x%x\t(0x%x bytes)",
> lp->args[0], lp->args[0] + lp->args[1] - 1,
> @@ -1073,6 +1086,25 @@ update_file_size(unsigned offset, unsigned size)
> file_size = offset + size;
> }
>
> +static int is_power_of_2(unsigned n) {
> + return ((n & (n - 1)) == 0);
> +}
> +
> +/*
> + * Round down n to nearest power of 2.
> + * If n is already a power of 2, return n;
> + */
> +static int rounddown_pow_of_2(int n) {
> + int i = 0;
> +
> + if (is_power_of_2(n))
> + return n;
> +
> + for (; (1 << i) < n; i++);
> +
> + return 1 << (i - 1);
> +}
> +
> void
> dowrite(unsigned offset, unsigned size, int flags)
> {
> @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
> offset -= offset % writebdy;
> if (o_direct)
> size -= size % writebdy;
> + if (flags & RWF_ATOMIC) {
> + /* atomic write len must be between awu_min and awu_max */
> + if (size < awu_min)
> + size = awu_min;
> + if (size > awu_max)
> + size = awu_max;
> +
> + /* atomic writes need power-of-2 sizes */
> + size = rounddown_pow_of_2(size);
> +
> + /* atomic writes need naturally aligned offsets */
> + offset -= offset % size;
I don't think you should be modifying offset/size here. Normally for
fsx we do all the rounding of the file range in the switch statement
after the "calculate appropriate op to run" comment statement.
--D
> +
> + /* Skip the write if we are crossing max filesize */
> + if ((offset + size) > maxfilelen) {
> + if (!quiet && testcalls > simulatedopcount)
> + prt("skipping atomic write past maxfilelen\n");
> + log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> + return;
> + }
> + }
> if (size == 0) {
> if (!quiet && testcalls > simulatedopcount && !o_direct)
> prt("skipping zero size write\n");
> @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
> return;
> }
>
> - log4(OP_WRITE, offset, size, FL_NONE);
> + if (flags & RWF_ATOMIC)
> + log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> + else
> + log4(OP_WRITE, offset, size, FL_NONE);
>
> gendata(original_buf, good_buf, offset, size);
> if (offset + size > file_size) {
> @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
> (monitorstart == -1 ||
> (offset + size > monitorstart &&
> (monitorend == -1 || offset <= monitorend))))))
> - prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> - offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> + prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> + offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> + (flags & RWF_ATOMIC) != 0);
> iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
> if (iret != size) {
> if (iret == -1)
> @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
> }
> #endif
>
> +int test_atomic_writes(void) {
> + int ret;
> + struct statx stx;
> +
> + ret = xfstests_statx(AT_FDCWD, fname, 0, STATX_WRITE_ATOMIC, &stx);
> + if (ret < 0) {
> + fprintf(stderr, "main: Statx failed with %d."
> + " Failed to determine atomic write limits, "
> + " disabling!\n", ret);
> + return 0;
> + }
> +
> + if (stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC &&
> + stx.stx_atomic_write_unit_min > 0) {
> + awu_min = stx.stx_atomic_write_unit_min;
> + awu_max = stx.stx_atomic_write_unit_max;
> + return 1;
> + }
> +
> + fprintf(stderr, "main: IO Stack does not support "
> + "atomic writes, disabling!\n");
> + return 0;
> +}
> +
> #ifdef HAVE_COPY_FILE_RANGE
> int
> test_copy_range(void)
> @@ -2356,6 +2437,12 @@ have_op:
> goto out;
> }
> break;
> + case OP_WRITE_ATOMIC:
> + if (!do_atomic_writes) {
> + log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> + goto out;
> + }
> + break;
> }
>
> switch (op) {
> @@ -2385,6 +2472,11 @@ have_op:
> dowrite(offset, size, 0);
> break;
>
> + case OP_WRITE_ATOMIC:
> + TRIM_OFF_LEN(offset, size, maxfilelen);
> + dowrite(offset, size, RWF_ATOMIC);
> + break;
> +
> case OP_MAPREAD:
> TRIM_OFF_LEN(offset, size, file_size);
> domapread(offset, size);
> @@ -2511,13 +2603,14 @@ void
> usage(void)
> {
> fprintf(stdout, "usage: %s",
> - "fsx [-dfhknqxyzBEFHIJKLORWXZ0]\n\
> + "fsx [-adfhknqxyzBEFHIJKLORWXZ0]\n\
> [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid]\n\
> [-l flen] [-m start:end] [-o oplen] [-p progressinterval]\n\
> [-r readbdy] [-s style] [-t truncbdy] [-w writebdy]\n\
> [-A|-U] [-D startingop] [-N numops] [-P dirpath] [-S seed]\n\
> [--replay-ops=opsfile] [--record-ops[=opsfile]] [--duration=seconds]\n\
> ... fname\n\
> + -a: disable atomic writes\n\
> -b opnum: beginning operation number (default 1)\n\
> -c P: 1 in P chance of file close+open at each op (default infinity)\n\
> -d: debug output for all operations\n\
> @@ -3059,9 +3152,13 @@ main(int argc, char **argv)
> setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
>
> while ((ch = getopt_long(argc, argv,
> - "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> + "0ab:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> longopts, NULL)) != EOF)
> switch (ch) {
> + case 'a':
> + prt("main(): Atomic writes disabled\n");
> + do_atomic_writes = 0;
> + break;
> case 'b':
> simulatedopcount = getnum(optarg, &endp);
> if (!quiet)
> @@ -3475,6 +3572,8 @@ main(int argc, char **argv)
> exchange_range_calls = test_exchange_range();
> if (dontcache_io)
> dontcache_io = test_dontcache_io();
> + if (do_atomic_writes)
> + do_atomic_writes = test_atomic_writes();
>
> while (keep_running())
> if (!test())
> --
> 2.49.0
>
>
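
For reference, the size/offset adjustments in the quoted dowrite() hunk boil
down to the arithmetic below, wherever they end up living in fsx. This is a
shell sketch of the arithmetic only, not a proposed fsx change; aw_trim is a
hypothetical name and the sample limits are assumptions.

```shell
#!/bin/bash
# Largest power of two that is <= n, for n >= 1.
rounddown_pow_of_2() {
	local n=$1 p=1

	while (( p * 2 <= n )); do
		p=$(( p * 2 ))
	done
	echo "$p"
}

# Hypothetical helper: clamp size to [awu_min, awu_max], round it down to a
# power of two, then align the offset down to a multiple of the final size.
# Prints "offset size".
aw_trim() {
	local offset=$1 size=$2 awu_min=$3 awu_max=$4

	(( size < awu_min )) && size=$awu_min
	(( size > awu_max )) && size=$awu_max
	size=$(rounddown_pow_of_2 "$size")
	offset=$(( offset - offset % size ))
	echo "$offset $size"
}

# Sample limits: awu_min=4096, awu_max=16384 (assumed).
aw_trim 13000 5000 4096 16384
aw_trim 100000 70000 4096 16384
```

The first call shrinks a 5000 byte request to 4096 bytes at offset 12288; the
second clamps a 70000 byte request to 16384 bytes at offset 98304.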
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled
2025-07-12 14:12 ` [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
@ 2025-07-17 16:22 ` Darrick J. Wong
2025-07-23 6:30 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 16:22 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:50PM +0530, Ojaswin Mujoo wrote:
> Stress file with atomic writes to ensure we exercise codepaths
> where we are mixing different FS operations with atomic writes
>
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Hrm, doesn't generic/521 test this already if the fs happens to support
atomic writes?
--D
> ---
> tests/generic/1229 | 41 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1229.out | 2 ++
> 2 files changed, 43 insertions(+)
> create mode 100755 tests/generic/1229
> create mode 100644 tests/generic/1229.out
>
> diff --git a/tests/generic/1229 b/tests/generic/1229
> new file mode 100755
> index 00000000..98e9b50c
> --- /dev/null
> +++ b/tests/generic/1229
> @@ -0,0 +1,41 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1229
> +#
> +# fuzz fsx with atomic writes
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest rw auto quick atomicwrites
> +
> +_require_odirect
> +_require_scratch_write_atomic
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +bsize=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
> +
> +# fsx usage:
> +#
> +# -N numops: total # operations to do
> +# -l flen: the upper bound on file size
> +# -o oplen: the upper bound on operation size (64k default)
> +# -Z: O_DIRECT
> +
> +_run_fsx_on_file $testfile -N 10000 -o $awu_max -A -l 500000 -r $bsize -w $bsize -Z $FSX_AVOID >> $seqres.full
> +ret=$?
> +if [[ "$ret" != "0" ]]; then
> + _fail "fsx returned error: $ret"
> +fi
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/1229.out b/tests/generic/1229.out
> new file mode 100644
> index 00000000..737d61c6
> --- /dev/null
> +++ b/tests/generic/1229.out
> @@ -0,0 +1,2 @@
> +QA output created by 1229
> +Silence is golden
> --
> 2.49.0
>
>
* Re: [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
2025-07-12 14:12 ` [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
@ 2025-07-17 16:32 ` Darrick J. Wong
2025-07-28 8:58 ` Zorro Lang
1 sibling, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 16:32 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:48PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>
> This test uses fio to first create a file with mixed mappings. Then it
> does atomic writes using aio dio with parallel jobs to the same file
> with mixed mappings. This forces the filesystem allocator to allocate
> extents over mixed mapping regions to stress FS block allocators.
>
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Seems reasonable to me...
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> tests/generic/1227 | 123 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1227.out | 2 +
> 2 files changed, 125 insertions(+)
> create mode 100755 tests/generic/1227
> create mode 100644 tests/generic/1227.out
>
> diff --git a/tests/generic/1227 b/tests/generic/1227
> new file mode 100755
> index 00000000..cfdc54ec
> --- /dev/null
> +++ b/tests/generic/1227
> @@ -0,0 +1,123 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1227
> +#
> +# Validate FS atomic write using fio crc check verifier on mixed mappings
> +# of a file.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_odirect
> +_require_aio
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
> +
> +fsbsize=$(_get_block_size $SCRATCH_MNT)
> +
> +fio_prep_config=$tmp.prep.fio
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((128 * 1024 * 1024))
> +
> +cat >$fio_prep_config <<EOF
> +# prep file to have mixed mappings
> +[global]
> +ioengine=libaio
> +fallocate=none
> +filename=$SCRATCH_MNT/test-file
> +filesize=$SIZE
> +bs=$fsbsize
> +direct=1
> +group_reporting=1
> +
> +# Create written extents
> +[prep_written_blocks]
> +ioengine=libaio
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +
> +# Create unwritten extents
> +[prep_unwritten_blocks]
> +ioengine=falloc
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +EOF
> +
> +cat >$fio_aw_config <<EOF
> +# atomic write to mixed mappings of written/unwritten/holes
> +[atomic_write_job]
> +ioengine=libaio
> +rw=randwrite
> +direct=1
> +atomic=1
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +bs=$aw_bsize
> +iodepth=$FIO_LOAD
> +numjobs=$FIO_LOAD
> +
> +verify_state_save=0
> +verify=crc32c
> +do_verify=0
> +EOF
> +
> +cat >$fio_verify_config <<EOF
> +# verify atomic writes done by previous job
> +[verify_job]
> +ioengine=libaio
> +rw=randwrite
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +bs=$aw_bsize
> +iodepth=$FIO_LOAD
> +
> +verify_state_save=0
> +verify_only=1
> +verify=crc32c
> +verify_fatal=1
> +verify_write_sequence=0
> +EOF
> +
> +_require_fio $fio_aw_config
> +_require_fio $fio_verify_config
> +
> +cat $fio_prep_config >> $seqres.full
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +#prepare file with mixed mappings
> +$FIO_PROG $fio_prep_config >> $seqres.full
> +
> +# do atomic writes without verifying
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +
> +# verify data is not torn
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1227.out b/tests/generic/1227.out
> new file mode 100644
> index 00000000..2605d062
> --- /dev/null
> +++ b/tests/generic/1227.out
> @@ -0,0 +1,2 @@
> +QA output created by 1227
> +Silence is golden
> --
> 2.49.0
>
>
* Re: [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
2025-07-12 14:12 ` [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
@ 2025-07-17 16:35 ` Darrick J. Wong
2025-07-23 13:53 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-17 16:35 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:49PM +0530, Ojaswin Mujoo wrote:
> This adds various atomic write multi-fsblock stress tests
> with mixed mappings and O_SYNC, to ensure the data and metadata
> are atomically persisted even if there is a shutdown.
>
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/generic/1228 | 139 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1228.out | 2 +
> 2 files changed, 141 insertions(+)
> create mode 100755 tests/generic/1228
> create mode 100644 tests/generic/1228.out
>
> diff --git a/tests/generic/1228 b/tests/generic/1228
> new file mode 100755
> index 00000000..3f9a6af1
> --- /dev/null
> +++ b/tests/generic/1228
> @@ -0,0 +1,139 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1228
> +#
> +# Atomic write multi-fsblock data integrity tests with mixed mappings
> +# and O_SYNC
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_scratch_shutdown
> +_require_xfs_io_command "truncate"
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +check_data_integrity() {
> + actual=$(_hexdump $testfile)
> + if [[ "$expected" != "$actual" ]]
> + then
> + echo "Integrity check failed"
> + echo "Integrity check failed" >> $seqres.full
> + echo "# Expected file contents:" >> $seqres.full
> + echo "$expected" >> $seqres.full
> + echo "# Actual file contents:" >> $seqres.full
> + echo "$actual" >> $seqres.full
> +
> + _fail "Data integrity check failed. The atomic write was torn."
> + fi
> +}
> +
> +prep_mixed_mapping() {
> + $XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> + local off=0
> + local mapping=""
> +
> + local operations=("W" "H" "U")
> + local num_blocks=$((awu_max / blksz))
> + for ((i=0; i<num_blocks; i++)); do
> + local index=$((RANDOM % ${#operations[@]}))
> + local map="${operations[$index]}"
> + local mapping="${mapping}${map}"
> +
> + case "$map" in
> + "W")
> + $XFS_IO_PROG -dc "pwrite -S 0x61 -b $blksz $off $blksz" $testfile > /dev/null
> + ;;
> + "H")
> + # No operation needed for hole
> + ;;
> + "U")
> + $XFS_IO_PROG -c "falloc $off $blksz" $testfile >> /dev/null
> + ;;
> + esac
> + off=$((off + blksz))
> + done
> +
> + echo "+ + Mixed mapping prep done. Full mapping pattern: $mapping" >> $seqres.full
> +
> + sync $testfile
> +}
> +
> +verify_atomic_write() {
> + if [[ "$1" == "shutdown" ]]
> + then
> + local do_shutdown=1
> + fi
> +
> + test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
> +
> + if [[ $do_shutdown -eq "1" ]]
> + then
> + echo "Shutting down filesystem" >> $seqres.full
> + _scratch_shutdown >> $seqres.full
> + _scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> + fi
> +
> + check_data_integrity
> +}
> +
> +mixed_mapping_test() {
> + prep_mixed_mapping
> +
> + echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
> + bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> + grep wrote | awk -F'[/ ]' '{print $2}')
> +
> + verify_atomic_write $1
The shutdown happens after the synchronous write completes? If so, then
what part of recovery is this testing?
--D
> +}
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +
> +# Create an expected pattern to compare with
> +$XFS_IO_PROG -tc "pwrite -b $awu_max 0 $awu_max" $testfile >> $seqres.full
> +expected=$(_hexdump $testfile)
> +echo "# Expected file contents:" >> $seqres.full
> +echo "$expected" >> $seqres.full
> +echo >> $seqres.full
> +
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping:" >> $seqres.full
> +echo >> $seqres.full
> +for ((iteration=1; iteration<=10; iteration++)); do
> + echo "=== Mixed Mapping Test Iteration $iteration ===" >> $seqres.full
> +
> + echo "+ Testing without shutdown..." >> $seqres.full
> + mixed_mapping_test
> + echo "Passed!" >> $seqres.full
> +
> + echo "+ Testing with sudden shutdown..." >> $seqres.full
> + mixed_mapping_test "shutdown"
> + echo "Passed!" >> $seqres.full
> +
> + echo "Iteration $iteration completed: OK" >> $seqres.full
> + echo >> $seqres.full
> +done
> +echo "# Test 1: Do O_DSYNC atomic write on random mixed mapping (10 iterations): OK" >> $seqres.full
> +
> +
> +echo >> $seqres.full
> +echo "# Test 2: Do extending O_SYNC atomic writes: " >> $seqres.full
> +bytes_written=$($XFS_IO_PROG -dstc "pwrite -A -V1 -b $awu_max 0 $awu_max" $testfile | \
> + grep wrote | awk -F'[/ ]' '{print $2}')
> +verify_atomic_write "shutdown"
> +echo "# Test 2: Do extending O_SYNC atomic writes: OK" >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> +
> diff --git a/tests/generic/1228.out b/tests/generic/1228.out
> new file mode 100644
> index 00000000..1baffa91
> --- /dev/null
> +++ b/tests/generic/1228.out
> @@ -0,0 +1,2 @@
> +QA output created by 1228
> +Silence is golden
> --
> 2.49.0
>
>
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-17 14:06 ` John Garry
@ 2025-07-22 8:47 ` Ojaswin Mujoo
2025-07-23 11:33 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-22 8:47 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 03:06:01PM +0100, John Garry wrote:
> On 17/07/2025 14:52, Ojaswin Mujoo wrote:
> > On Thu, Jul 17, 2025 at 02:00:18PM +0100, John Garry wrote:
> > > On 12/07/2025 15:12, Ojaswin Mujoo wrote:
> > > > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > > >
> > > > This adds an atomic write test using fio, based on its crc check verifier.
> > > > fio adds a crc for each data block. If the underlying device supports atomic
> > > > writes, then it is guaranteed that we will never have a mix of data from two
> > > > threads writing on the same physical block.
> > >
> > > I think that you should mention that 2-phase approach.
> >
> > Sure I can add a comment and update the commit message with this.
> >
> > >
> > > Is there something which ensures that we have fio which supports RWF_ATOMIC?
> > > fio for some time supported the "atomic" cmdline param, but did not do
> > > anything until recently
> >
> > We do have _require_fio which ensures the options passed are supported
> > by the current fio. If you are saying some versions of fio have --atomic
> > valid but dont do an RWF_ATOMIC then I'm not really sure if that can be
> > caught though.
>
> Can you check the fio version?
We don't have a helper, but yes, I think that should be possible.
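A rough sketch of what such a version check could look like. Note that
_fio_version_ge is a made-up name, not an existing fstests helper, and the
minimum version that actually issues RWF_ATOMIC is left to the caller since
it is not stated in this thread. It assumes "fio --version" prints a string
of the form "fio-X.Y" and that GNU sort with -V is available:

```shell
# Hypothetical helper (not part of fstests): succeed if the fio version
# is at least $1. The version string can be passed as $2 so the
# comparison can be exercised without fio installed.
_fio_version_ge()
{
	local want="$1"
	local have="$2"

	if [ -z "$have" ]; then
		have=$(${FIO_PROG:-fio} --version 2>/dev/null)
	fi

	# Strip the assumed "fio-" prefix and any git-describe style suffix,
	# e.g. "fio-3.38-12-gabcdef" becomes "3.38".
	have=${have#fio-}
	have=${have%%-*}
	[ -n "$have" ] || return 1

	# sort -V orders version strings; if the wanted version sorts first
	# (or both are equal), the installed fio is new enough.
	[ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}
```

A test could then do something like
"_fio_version_ge <min> || _notrun ..." once the right minimum is known.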
>
> >
> > >
> > > >
> > > > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > > ---
> > > > tests/generic/1226 | 101 +++++++++++++++++++++++++++++++++++++++++
> > > > tests/generic/1226.out | 2 +
> > >
> > > Was this tested with xfs?
> >
> > Yes, I've tested with XFS with software fallback as well. Also, tested
> > xfs while keeping io size as 16kb so we stress the hw paths too.
>
> so is that requirement implemented with the _require_scratch_write_atomic
> check?
No, it's just something I hardcoded for that particular run. This patch
doesn't enforce hardware-only atomic writes.
Regards,
ojaswin
>
> > Both
> > seem to be passing as expected.
> > >
>
>
* Re: [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc
2025-07-17 16:11 ` Darrick J. Wong
@ 2025-07-22 9:53 ` Ojaswin Mujoo
2025-07-23 14:50 ` Darrick J. Wong
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-22 9:53 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 09:11:54AM -0700, Darrick J. Wong wrote:
> On Sat, Jul 12, 2025 at 07:42:44PM +0530, Ojaswin Mujoo wrote:
> > Insert range and collapse range only work with bigalloc if the range
> > is cluster size aligned, which fsx doesn't take care of. To work
> > around this, disable insert range and collapse range on ext4 if
> > bigalloc is enabled.
> >
> > This is achieved by defining a new function _set_default_fsx_avoid
> > called via run_fsx helper. This can be used to selectively disable
> > fsx options based on the configuration.
> >
> > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> > common/rc | 27 +++++++++++++++++++++++++++
> > 1 file changed, 27 insertions(+)
> >
> > diff --git a/common/rc b/common/rc
> > index 9a9d3cc8..218cf253 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -5113,10 +5113,37 @@ _require_hugepage_fsx()
> > _notrun "fsx binary does not support MADV_COLLAPSE"
> > }
> >
> > +_set_default_fsx_avoid() {
> > + local file=$1
> > +
> > + case "$FSTYP" in
> > + "ext4")
> > + local dev=$(findmnt -n -o SOURCE --target $file)
> > +
> > + # open code instead of _require_dumpe2fs cause we don't
> > + # want to _notrun if dumpe2fs is not available
> > + if [ -z "$DUMPE2FS_PROG" ]; then
> > + echo "_set_default_fsx_avoid: dumpe2fs not found, skipping bigalloc check." >> $seqres.full
> > + return
> > + fi
>
> I hate to be the guy who says one thing and then another, but ...
>
> If we extended _get_file_block_size to report the ext4 bigalloc cluster
> size, would that be sufficient to keep testing collapse/insert range?
>
> I guess the tricky part here is that bigalloc allows sub-cluster
> mappings and we might not want to do all file IO testing in such big
> units.
Hmm, so maybe a better way is to just add a parameter like alloc_unit to
fsx, where we can pass the cluster_size that INSERT/COLLAPSE range should
be aligned to. For now we can pass it explicitly in the tests if needed.
I do plan on working on your suggestion of exposing alloc unit via
statx(). Once we have that in the kernel, fsx can use that as well.
If this approach sounds okay, I can try to send the whole "fixing of
insert/collapse range in fsx" as a patchset separate from atomic
writes.
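To illustrate the alignment idea, here is a minimal sketch of the
arithmetic in shell. The function name and the way alloc_unit is passed
are assumptions for illustration only; the eventual fsx change would be in
C and may look quite different:

```shell
# Hypothetical sketch: round an insert/collapse range down to a multiple
# of the allocation unit (e.g. the bigalloc cluster size).
# Prints "offset length"; a length of 0 means the op should be skipped.
align_range_to_alloc_unit()
{
	local off=$1
	local len=$2
	local unit=$3

	off=$((off - off % unit))
	len=$((len - len % unit))
	echo "$off $len"
}
```

For example, with a 64k cluster, a request at offset 70000 for 200000
bytes would be trimmed to start at 65536 and cover 196608 bytes.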
>
> > +
> > + $DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
> > + export FSX_AVOID+=" -I -C"
>
> No need to export FSX_AVOID to subprocesses.
>
> --D
Got it, will fix. Thanks for review!
Regards,
ojaswin
>
> > + }
> > + ;;
> > + # Add other filesystem types here as needed
> > + *)
> > + ;;
> > + esac
> > +}
> > +
> > _run_fsx()
> > {
> > echo "fsx $*"
> > local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> > +
> > + _set_default_fsx_avoid $testfile
> > +
> > set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> > echo "$@" >>$seqres.full
> > rm -f $TEST_DIR/junk
> > --
> > 2.49.0
> >
> >
* Re: [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx
2025-07-17 16:17 ` Darrick J. Wong
@ 2025-07-22 9:59 ` Ojaswin Mujoo
2025-07-23 14:57 ` Darrick J. Wong
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-22 9:59 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 09:17:47AM -0700, Darrick J. Wong wrote:
<snip>
> > +
> > +/*
> > + * Round down n to nearest power of 2.
> > + * If n is already a power of 2, return n;
> > + */
> > +static int rounddown_pow_of_2(int n) {
> > + int i = 0;
> > +
> > + if (is_power_of_2(n))
> > + return n;
> > +
> > + for (; (1 << i) < n; i++);
> > +
> > + return 1 << (i - 1);
> > +}
> > +
> > void
> > dowrite(unsigned offset, unsigned size, int flags)
> > {
> > @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
> > offset -= offset % writebdy;
> > if (o_direct)
> > size -= size % writebdy;
> > + if (flags & RWF_ATOMIC) {
> > + /* atomic write len must be between awu_min and awu_max */
> > + if (size < awu_min)
> > + size = awu_min;
> > + if (size > awu_max)
> > + size = awu_max;
> > +
> > + /* atomic writes need power-of-2 sizes */
> > + size = rounddown_pow_of_2(size);
> > +
> > + /* atomic writes need naturally aligned offsets */
> > + offset -= offset % size;
>
> I don't think you should be modifying offset/size here. Normally for
> fsx we do all the rounding of the file range in the switch statement
> after the "calculate appropriate op to run" comment statement.
>
> --D
Yes, I noticed that, but then I saw that we make size/offset adjustments
in dowrite() for writebdy, and I wanted the atomic write adjustments to be
done after that.
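For reference, the adjustments in the quoted hunk boil down to the
following arithmetic, sketched here in shell rather than C. The function
names mirror the patch but this is only an illustration of the order of
operations (clamp, round down to a power of 2, align the offset, then skip
if the write would cross maxfilelen), not the fsx code itself:

```shell
# Largest power of two that is <= n (n must be >= 1).
rounddown_pow_of_2()
{
	local n=$1
	local p=1

	while [ $((p * 2)) -le "$n" ]; do
		p=$((p * 2))
	done
	echo "$p"
}

# Prints "offset size" after applying the atomic write constraints, or
# "skip" if the adjusted write would extend past maxfilelen.
adjust_atomic_write()
{
	local offset=$1
	local size=$2
	local awu_min=$3
	local awu_max=$4
	local maxfilelen=$5

	# Clamp the length to [awu_min, awu_max].
	if [ "$size" -lt "$awu_min" ]; then
		size=$awu_min
	fi
	if [ "$size" -gt "$awu_max" ]; then
		size=$awu_max
	fi

	# Atomic writes need power-of-2 sizes and naturally aligned offsets.
	size=$(rounddown_pow_of_2 "$size")
	offset=$((offset - offset % size))

	if [ $((offset + size)) -gt "$maxfilelen" ]; then
		echo "skip"
		return
	fi
	echo "$offset $size"
}
```

Because the offset is aligned to the final size, the clamp and rounding
have to happen first, which is why the order matters here.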
Regards,
ojaswin
>
> > +
> > + /* Skip the write if we are crossing max filesize */
> > + if ((offset + size) > maxfilelen) {
> > + if (!quiet && testcalls > simulatedopcount)
> > + prt("skipping atomic write past maxfilelen\n");
> > + log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> > + return;
> > + }
> > + }
> > if (size == 0) {
> > if (!quiet && testcalls > simulatedopcount && !o_direct)
> > prt("skipping zero size write\n");
> > @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
> > return;
> > }
> >
> > - log4(OP_WRITE, offset, size, FL_NONE);
> > + if (flags & RWF_ATOMIC)
> > + log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> > + else
> > + log4(OP_WRITE, offset, size, FL_NONE);
> >
> > gendata(original_buf, good_buf, offset, size);
> > if (offset + size > file_size) {
> > @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
> > (monitorstart == -1 ||
> > (offset + size > monitorstart &&
> > (monitorend == -1 || offset <= monitorend))))))
> > - prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> > - offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> > + prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> > + offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> > + (flags & RWF_ATOMIC) != 0);
> > iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
> > if (iret != size) {
> > if (iret == -1)
> > @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
> > }
> > #endif
> >
> > +int test_atomic_writes(void) {
> > + int ret;
> > + struct statx stx;
> > +
* Re: [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled
2025-07-17 16:22 ` Darrick J. Wong
@ 2025-07-23 6:30 ` Ojaswin Mujoo
2025-07-23 14:56 ` Darrick J. Wong
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-23 6:30 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 09:22:30AM -0700, Darrick J. Wong wrote:
> On Sat, Jul 12, 2025 at 07:42:50PM +0530, Ojaswin Mujoo wrote:
> > Stress file with atomic writes to ensure we exercise codepaths
> > where we are mixing different FS operations with atomic writes.
> >
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>
> Hrm, doesn't generic/521 test this already if the fs happens to support
> atomic writes?
>
> --D
Hi Darrick,
Yes, but I wanted a test with _require_scratch_write_atomic and writes
going to the SCRATCH fs to explicitly test atomic writes, as that can get
missed in g/521.
Would you instead prefer to have those changes in g/521?
Regards,
Ojaswin
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-22 8:47 ` Ojaswin Mujoo
@ 2025-07-23 11:33 ` John Garry
2025-07-23 13:51 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-23 11:33 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 22/07/2025 09:47, Ojaswin Mujoo wrote:
>>> Yes, I've tested with XFS with software fallback as well. Also, tested
>>> xfs while keeping io size as 16kb so we stress the hw paths too.
>> so is that requirement implemented with the _require_scratch_write_atomic
>> check?
> No, it's just something I hardcoded for that particular run. This patch
> doesn't enforce hardware-only atomic writes.
If we are to test this for XFS then we need to ensure that HW atomics
are available.
Thanks,
John
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-23 11:33 ` John Garry
@ 2025-07-23 13:51 ` Ojaswin Mujoo
2025-07-23 16:25 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-23 13:51 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Wed, Jul 23, 2025 at 12:33:27PM +0100, John Garry wrote:
> On 22/07/2025 09:47, Ojaswin Mujoo wrote:
> > > > Yes, I've tested with XFS with software fallback as well. Also, tested
> > > > xfs while keeping io size as 16kb so we stress the hw paths too.
> > > so is that requirement implemented with the _require_scratch_write_atomic
> > > check?
> > > No, it's just something I hardcoded for that particular run. This patch
> > > doesn't enforce hardware-only atomic writes.
>
> If we are to test this for XFS then we need to ensure that HW atomics are
> available.
Why is that? Now with the verification step happening after writes,
software atomic writes should also pass this test since there are no
racing writes to the verify reads.
Regards,
ojaswin
>
> Thanks,
> John
* Re: [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
2025-07-17 16:35 ` Darrick J. Wong
@ 2025-07-23 13:53 ` Ojaswin Mujoo
2025-07-23 14:54 ` Darrick J. Wong
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-23 13:53 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Thu, Jul 17, 2025 at 09:35:10AM -0700, Darrick J. Wong wrote:
<snip>
> > +verify_atomic_write() {
> > + if [[ "$1" == "shutdown" ]]
> > + then
> > + local do_shutdown=1
> > + fi
> > +
> > + test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
> > +
> > + if [[ $do_shutdown -eq "1" ]]
> > + then
> > + echo "Shutting down filesystem" >> $seqres.full
> > + _scratch_shutdown >> $seqres.full
> > + _scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> > + fi
> > +
> > + check_data_integrity
> > +}
> > +
> > +mixed_mapping_test() {
> > + prep_mixed_mapping
> > +
> > + echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
> > + bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> > + grep wrote | awk -F'[/ ]' '{print $2}')
> > +
> > + verify_atomic_write $1
>
> The shutdown happens after the synchronous write completes? If so, then
> what part of recovery is this testing?
>
> --D
Right, it is mostly inspired by [1] where sometimes isize update could
be lost after dio completion. Although this might not exactly be
affected by atomic writes, we added it here out of caution.
[1] https://lore.kernel.org/fstests/434beffaf18d39f898518ea9eb1cea4548e77c3a.1695383715.git.ritesh.list@gmail.com/
>
> > +}
> > +
* Re: [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc
2025-07-22 9:53 ` Ojaswin Mujoo
@ 2025-07-23 14:50 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-23 14:50 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Tue, Jul 22, 2025 at 03:23:02PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jul 17, 2025 at 09:11:54AM -0700, Darrick J. Wong wrote:
> > On Sat, Jul 12, 2025 at 07:42:44PM +0530, Ojaswin Mujoo wrote:
> > > > Insert range and collapse range only work with bigalloc if the range
> > > > is cluster size aligned, which fsx doesn't take care of. To work
> > > > around this, disable insert range and collapse range on ext4 if
> > > > bigalloc is enabled.
> > >
> > > This is achieved by defining a new function _set_default_fsx_avoid
> > > called via run_fsx helper. This can be used to selectively disable
> > > fsx options based on the configuration.
> > >
> > > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > > ---
> > > common/rc | 27 +++++++++++++++++++++++++++
> > > 1 file changed, 27 insertions(+)
> > >
> > > diff --git a/common/rc b/common/rc
> > > index 9a9d3cc8..218cf253 100644
> > > --- a/common/rc
> > > +++ b/common/rc
> > > @@ -5113,10 +5113,37 @@ _require_hugepage_fsx()
> > > _notrun "fsx binary does not support MADV_COLLAPSE"
> > > }
> > >
> > > +_set_default_fsx_avoid() {
> > > + local file=$1
> > > +
> > > + case "$FSTYP" in
> > > + "ext4")
> > > + local dev=$(findmnt -n -o SOURCE --target $file)
> > > +
> > > + # open code instead of _require_dumpe2fs cause we don't
> > > + # want to _notrun if dumpe2fs is not available
> > > + if [ -z "$DUMPE2FS_PROG" ]; then
> > > + echo "_set_default_fsx_avoid: dumpe2fs not found, skipping bigalloc check." >> $seqres.full
> > > + return
> > > + fi
> >
> > I hate to be the guy who says one thing and then another, but ...
> >
> > If we extended _get_file_block_size to report the ext4 bigalloc cluster
> > size, would that be sufficient to keep testing collapse/insert range?
> >
> > I guess the tricky part here is that bigalloc allows sub-cluster
> > mappings and we might not want to do all file IO testing in such big
> > units.
>
> Hmm, so maybe a better way is to just add a parameter like alloc_unit to
> fsx, where we can pass the cluster_size that INSERT/COLLAPSE range should
> be aligned to. For now we can pass it explicitly in the tests if needed.
>
> I do plan on working on your suggestion of exposing alloc unit via
> statx(). Once we have that in the kernel, fsx can use that as well.
>
> If this approach sounds okay, I can try to send the whole "fixing of
> insert/collapse range in fsx" as a patchset separate from atomic
> writes.
Yeah, that sounds like a good longer-term solution to me. :)
--D
> >
> > > +
> > > + $DUMPE2FS_PROG -h $dev 2>&1 | grep -q bigalloc && {
> > > + export FSX_AVOID+=" -I -C"
> >
> > No need to export FSX_AVOID to subprocesses.
> >
> > --D
>
> Got it, will fix. Thanks for review!
>
>
> Regards,
> ojaswin
> >
> > > + }
> > > + ;;
> > > + # Add other filesystem types here as needed
> > > + *)
> > > + ;;
> > > + esac
> > > +}
> > > +
> > > _run_fsx()
> > > {
> > > echo "fsx $*"
> > > local args=`echo $@ | sed -e "s/ BSIZE / $bsize /g" -e "s/ PSIZE / $psize /g"`
> > > +
> > > + _set_default_fsx_avoid $testfile
> > > +
> > > set -- $FSX_PROG $args $FSX_AVOID $TEST_DIR/junk
> > > echo "$@" >>$seqres.full
> > > rm -f $TEST_DIR/junk
> > > --
> > > 2.49.0
> > >
> > >
>
* Re: [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
2025-07-23 13:53 ` Ojaswin Mujoo
@ 2025-07-23 14:54 ` Darrick J. Wong
2025-08-10 9:41 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-23 14:54 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Wed, Jul 23, 2025 at 07:23:58PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jul 17, 2025 at 09:35:10AM -0700, Darrick J. Wong wrote:
>
> <snip>
>
> > > +verify_atomic_write() {
> > > + if [[ "$1" == "shutdown" ]]
> > > + then
> > > + local do_shutdown=1
> > > + fi
> > > +
> > > + test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
> > > +
> > > + if [[ $do_shutdown -eq "1" ]]
> > > + then
> > > + echo "Shutting down filesystem" >> $seqres.full
> > > + _scratch_shutdown >> $seqres.full
> > > + _scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> > > + fi
> > > +
> > > + check_data_integrity
> > > +}
> > > +
> > > +mixed_mapping_test() {
> > > + prep_mixed_mapping
> > > +
> > > + echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
> > > + bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> > > + grep wrote | awk -F'[/ ]' '{print $2}')
> > > +
> > > + verify_atomic_write $1
> >
> > The shutdown happens after the synchronous write completes? If so, then
> > what part of recovery is this testing?
> >
> > --D
>
> Right, it is mostly inspired by [1] where sometimes isize update could
> be lost after dio completion. Although this might not exactly be
> affected by atomic writes, we added it here out of caution.
>
> [1] https://lore.kernel.org/fstests/434beffaf18d39f898518ea9eb1cea4548e77c3a.1695383715.git.ritesh.list@gmail.com/
Ah, so we're racing with background log flush then. Would it improve
the potential failure detection rate to call shutdown right after the
pwrite, e.g.
$XFS_IO_PROG -dxc "pwrite -DA..." -c 'shutdown' $testfile
It can take a few milliseconds to walk down the bash functions and
fork/exec another child process.
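If the shutdown moves into the same xfs_io invocation, the test still has
to pick the byte count out of the combined output. A small sketch of that
parse, run against canned text rather than a real xfs_io: the sample
output below is an assumption about what the pwrite summary looks like,
and parse_bytes_written is a made-up wrapper around the grep/awk pipeline
from the quoted test:

```shell
# Canned text standing in for the output of something like
#   $XFS_IO_PROG -dxc "pwrite -DA ..." -c 'shutdown' $testfile
# The exact output format is assumed here, not captured from a real run.
sample_output='wrote 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0012 sec (52.083 MiB/sec and 833.3333 ops/sec)'

# Same pipeline as in the test: keep the "wrote" line, split on '/' and
# ' ', and print the first number (bytes actually written).
parse_bytes_written()
{
	grep wrote | awk -F'[/ ]' '{print $2}'
}

bytes_written=$(echo "$sample_output" | parse_bytes_written)
echo "$bytes_written"
```

As long as the shutdown command does not print a line containing "wrote",
the existing comparison against $awu_max should keep working unchanged.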
--D
> > > +}
> > > +
>
* Re: [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled
2025-07-23 6:30 ` Ojaswin Mujoo
@ 2025-07-23 14:56 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-23 14:56 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Wed, Jul 23, 2025 at 12:00:48PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jul 17, 2025 at 09:22:30AM -0700, Darrick J. Wong wrote:
> > On Sat, Jul 12, 2025 at 07:42:50PM +0530, Ojaswin Mujoo wrote:
> > > Stress file with atomic writes to ensure we excercise codepaths
> > > where we are mixing different FS operations with atomic writes
> > >
> > > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> >
> > Hrm, doesn't generic/521 test this already if the fs happens to support
> > atomic writes?
> >
> > --D
>
> Hi Darrick,
>
> Yes but I wanted one with _require_scratch_write_atomic and writes going
> to SCRATCH fs to explicitly test atomic writes as that can get missed in
> g/521.
>
> Would you instead prefer to have those changes in g/521?
Oh, I see. You're setting the opsize to awu_max so that you're
guaranteed to get maximally sized atomic writes, which might not happen
with regular g/521.
Ok I'm convinced,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> Regards,
> Ojaswin
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx
2025-07-22 9:59 ` Ojaswin Mujoo
@ 2025-07-23 14:57 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-23 14:57 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Tue, Jul 22, 2025 at 03:29:02PM +0530, Ojaswin Mujoo wrote:
> On Thu, Jul 17, 2025 at 09:17:47AM -0700, Darrick J. Wong wrote:
>
> <snip>
>
> > > +
> > > +/*
> > > + * Round down n to nearest power of 2.
> > > + * If n is already a power of 2, return n;
> > > + */
> > > +static int rounddown_pow_of_2(int n) {
> > > + int i = 0;
> > > +
> > > + if (is_power_of_2(n))
> > > + return n;
> > > +
> > > + for (; (1 << i) < n; i++);
> > > +
> > > + return 1 << (i - 1);
> > > +}
> > > +
> > > void
> > > dowrite(unsigned offset, unsigned size, int flags)
> > > {
> > > @@ -1081,6 +1113,27 @@ dowrite(unsigned offset, unsigned size, int flags)
> > > offset -= offset % writebdy;
> > > if (o_direct)
> > > size -= size % writebdy;
> > > + if (flags & RWF_ATOMIC) {
> > > + /* atomic write len must be inbetween awu_min and awu_max */
> > > + if (size < awu_min)
> > > + size = awu_min;
> > > + if (size > awu_max)
> > > + size = awu_max;
> > > +
> > > + /* atomic writes need power-of-2 sizes */
> > > + size = rounddown_pow_of_2(size);
> > > +
> > > + /* atomic writes need naturally aligned offsets */
> > > + offset -= offset % size;
> >
> > I don't think you should be modifying offset/size here. Normally for
> > fsx we do all the rounding of the file range in the switch statement
> > after the "calculate appropriate op to run" comment statement.
> >
> > --D
>
> Yes, I noticed that but then I saw we make size/offset adjustments in
> do write for writebdy and I wanted atomic writes adjustments to be done
> after that.
<nod> ok then, I forgot that we already tweak the file range for
write...
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> Regads,
> ojaswin
>
> >
> > > +
> > > + /* Skip the write if we are crossing max filesize */
> > > + if ((offset + size) > maxfilelen) {
> > > + if (!quiet && testcalls > simulatedopcount)
> > > + prt("skipping atomic write past maxfilelen\n");
> > > + log4(OP_WRITE_ATOMIC, offset, size, FL_SKIPPED);
> > > + return;
> > > + }
> > > + }
> > > if (size == 0) {
> > > if (!quiet && testcalls > simulatedopcount && !o_direct)
> > > prt("skipping zero size write\n");
> > > @@ -1088,7 +1141,10 @@ dowrite(unsigned offset, unsigned size, int flags)
> > > return;
> > > }
> > >
> > > - log4(OP_WRITE, offset, size, FL_NONE);
> > > + if (flags & RWF_ATOMIC)
> > > + log4(OP_WRITE_ATOMIC, offset, size, FL_NONE);
> > > + else
> > > + log4(OP_WRITE, offset, size, FL_NONE);
> > >
> > > gendata(original_buf, good_buf, offset, size);
> > > if (offset + size > file_size) {
> > > @@ -1108,8 +1164,9 @@ dowrite(unsigned offset, unsigned size, int flags)
> > > (monitorstart == -1 ||
> > > (offset + size > monitorstart &&
> > > (monitorend == -1 || offset <= monitorend))))))
> > > - prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d\n", testcalls,
> > > - offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0);
> > > + prt("%lld write\t0x%x thru\t0x%x\t(0x%x bytes)\tdontcache=%d atomic_wr=%d\n", testcalls,
> > > + offset, offset + size - 1, size, (flags & RWF_DONTCACHE) != 0,
> > > + (flags & RWF_ATOMIC) != 0);
> > > iret = fsxwrite(fd, good_buf + offset, size, offset, flags);
> > > if (iret != size) {
> > > if (iret == -1)
> > > @@ -1785,6 +1842,30 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
> > > }
> > > #endif
> > >
> > > +int test_atomic_writes(void) {
> > > + int ret;
> > > + struct statx stx;
> > > +
>
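As an illustration of the adjustment quoted above, the sketch below pulls
the clamp, round-down and natural-alignment steps out of dowrite() into a
standalone helper. It is a minimal sketch and not the fsx code:
adjust_atomic_write(), struct aw_range and rounddown_pow2() are hypothetical
names, and awu_min is assumed to be a non-zero power of 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the adjusted range; not an fsx type. */
struct aw_range {
	unsigned offset;
	unsigned size;
	bool skipped;	/* true if the write would cross maxfilelen */
};

/* Largest power of 2 that is <= n; n must be non-zero. */
unsigned rounddown_pow2(unsigned n)
{
	unsigned p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Mirror the order used in the quoted hunk: clamp the size to
 * [awu_min, awu_max], round it down to a power of 2, then align the
 * offset down to a multiple of the final size.
 */
struct aw_range adjust_atomic_write(unsigned offset, unsigned size,
				    unsigned awu_min, unsigned awu_max,
				    unsigned maxfilelen)
{
	struct aw_range r;

	if (size < awu_min)
		size = awu_min;
	if (size > awu_max)
		size = awu_max;
	size = rounddown_pow2(size);

	r.size = size;
	r.offset = offset - offset % size;
	r.skipped = (r.offset + r.size) > maxfilelen;
	return r;
}
```

With awu_min=4096 and awu_max=65536, a 50000 byte write at offset 70000
becomes a 32768 byte write at offset 65536.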
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-23 13:51 ` Ojaswin Mujoo
@ 2025-07-23 16:25 ` John Garry
2025-07-25 6:27 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-23 16:25 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 23/07/2025 14:51, Ojaswin Mujoo wrote:
>>> No, its just something i hardcoded for that particular run. This patch
>>> doesn't enforce hardware only atomic writes
>> If we are to test this for XFS then we need to ensure that HW atomics are
>> available.
> Why is that? Now with the verification step happening after writes,
> software atomic writes should also pass this test since there are no
> racing writes to the verify reads.
Sure, but racing software atomic writes against other software atomic
writes is not safe.
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-23 16:25 ` John Garry
@ 2025-07-25 6:27 ` Ojaswin Mujoo
2025-07-25 8:14 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-25 6:27 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Wed, Jul 23, 2025 at 05:25:41PM +0100, John Garry wrote:
> On 23/07/2025 14:51, Ojaswin Mujoo wrote:
> > > > No, its just something i hardcoded for that particular run. This patch
> > > > doesn't enforce hardware only atomic writes
> > > If we are to test this for XFS then we need to ensure that HW atomics are
> > > available.
> > Why is that? Now with the verification step happening after writes,
> > software atomic writes should also pass this test since there are no
> > racing writes to the verify reads.
>
> Sure, but racing software atomic writes against other software atomic writes
> is not safe.
>
> Thanks,
> John
What do you mean by not safe? Does it mean the test can fail?
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-25 6:27 ` Ojaswin Mujoo
@ 2025-07-25 8:14 ` John Garry
2025-07-28 6:43 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-25 8:14 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 25/07/2025 07:27, Ojaswin Mujoo wrote:
> On Wed, Jul 23, 2025 at 05:25:41PM +0100, John Garry wrote:
>> On 23/07/2025 14:51, Ojaswin Mujoo wrote:
>>>>> No, its just something i hardcoded for that particular run. This patch
>>>>> doesn't enforce hardware only atomic writes
>>>> If we are to test this for XFS then we need to ensure that HW atomics are
>>>> available.
>>> Why is that? Now with the verification step happening after writes,
>>> software atomic writes should also pass this test since there are no
>>> racing writes to the verify reads.
>> Sure, but racing software atomic writes against other software atomic writes
>> is not safe.
>>
>> Thanks,
>> John
> What do you mean by not safe?
Multiple threads issuing atomic writes may trample over one another.
This is due to the steps used to issue an atomic write in xfs with the
software method. There are 3 steps:
a. allocate blocks for out-of-place write
b. do write in those blocks
c. atomically update extent mapping.
In this, threads wanting to atomic write to the same address will use
the new blocks and can trample over one another before we atomically
update the mapping.
So we do not guarantee serialization of atomic writes vs atomic writes.
And this is why I said that this test is never totally safe for xfs.
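The race described above can be shown with a toy model. This is only a
sketch under stated assumptions and has nothing to do with the real xfs
code: struct toy_file and the toy_*() helpers are hypothetical, the
"staging" buffer stands in for the out-of-place blocks shared by both
writers, and the interleaving is fixed by hand rather than produced by
real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLK 8	/* toy "atomic write" size in bytes */

/* Toy model only: one staging buffer shared by all writers of a range. */
struct toy_file {
	char staging[BLK];	/* step a: blocks for the out-of-place write */
	char mapped[BLK];	/* what readers see after step c */
};

/* Step b, split into two halves so that writers can be interleaved. */
void toy_write_half(struct toy_file *f, char fill, int half)
{
	assert(half == 0 || half == 1);
	memset(f->staging + half * (BLK / 2), fill, BLK / 2);
}

/* Step c: switch the mapping to the staged blocks in one go. */
void toy_commit(struct toy_file *f)
{
	memcpy(f->mapped, f->staging, BLK);
}

/* The range is torn if it does not hold a single writer's fill byte. */
bool toy_is_torn(const struct toy_file *f)
{
	for (int i = 1; i < BLK; i++)
		if (f->mapped[i] != f->mapped[0])
			return true;
	return false;
}

/* Writers A and B race: A's second half lands after B's whole write. */
bool toy_race(void)
{
	struct toy_file f = { {0}, {0} };

	toy_write_half(&f, 'A', 0);
	toy_write_half(&f, 'B', 0);
	toy_write_half(&f, 'B', 1);
	toy_write_half(&f, 'A', 1);
	toy_commit(&f);
	return toy_is_torn(&f);
}

/* Same two writers, but B only starts after A has committed. */
bool toy_serialized(void)
{
	struct toy_file f = { {0}, {0} };

	toy_write_half(&f, 'A', 0);
	toy_write_half(&f, 'A', 1);
	toy_commit(&f);
	toy_write_half(&f, 'B', 0);
	toy_write_half(&f, 'B', 1);
	toy_commit(&f);
	return toy_is_torn(&f);
}
```

In this model the mapping update itself is all-or-nothing in both cases;
the mixed result in toy_race() comes purely from two writers filling the
same staged blocks before either commit.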
We could simply change this to serialize software-based
atomic writes against all other dio, as follows:
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -747,6 +747,7 @@ xfs_file_dio_write_atomic(
unsigned int iolock = XFS_IOLOCK_SHARED;
ssize_t ret, ocount = iov_iter_count(from);
const struct iomap_ops *dops;
+ unsigned int dio_flags = 0;
/*
* HW offload should be faster, so try that first if it is already
@@ -766,15 +767,12 @@ xfs_file_dio_write_atomic(
if (ret)
goto out_unlock;
- /* Demote similar to xfs_file_dio_write_aligned() */
- if (iolock == XFS_IOLOCK_EXCL) {
- xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
- iolock = XFS_IOLOCK_SHARED;
- }
+ if (dio_flags & IOMAP_DIO_FORCE_WAIT)
+ inode_dio_wait(VFS_I(ip));
trace_xfs_file_direct_write(iocb, from);
ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
- 0, NULL, 0);
+ dio_flags, NULL, 0);
/*
* The retry mechanism is based on the ->iomap_begin method returning
@@ -785,6 +783,8 @@ xfs_file_dio_write_atomic(
if (ret == -ENOPROTOOPT && dops == &xfs_direct_write_iomap_ops) {
xfs_iunlock(ip, iolock);
dops = &xfs_atomic_write_cow_iomap_ops;
+ iolock = XFS_IOLOCK_EXCL;
+ dio_flags = IOMAP_DIO_FORCE_WAIT;
goto retry;
}
But it may affect performance.
> Does it mean the test can fail?
Yes, but it is unlikely if we have HW atomics available. That is because
we will rarely be using the software-based atomic method, as the HW method
should often be possible.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-25 8:14 ` John Garry
@ 2025-07-28 6:43 ` Ojaswin Mujoo
2025-07-28 9:09 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-28 6:43 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Fri, Jul 25, 2025 at 09:14:25AM +0100, John Garry wrote:
> On 25/07/2025 07:27, Ojaswin Mujoo wrote:
> > On Wed, Jul 23, 2025 at 05:25:41PM +0100, John Garry wrote:
> > > On 23/07/2025 14:51, Ojaswin Mujoo wrote:
> > > > > > No, its just something i hardcoded for that particular run. This patch
> > > > > > doesn't enforce hardware only atomic writes
> > > > > If we are to test this for XFS then we need to ensure that HW atomics are
> > > > > available.
> > > > Why is that? Now with the verification step happening after writes,
> > > > software atomic writes should also pass this test since there are no
> > > > racing writes to the verify reads.
> > > Sure, but racing software atomic writes against other software atomic writes
> > > is not safe.
> > >
> > > Thanks,
> > > John
> > What do you mean by not safe?
>
> Multiple threads issuing atomic writes may trample over one another.
>
> It is due to the steps used to issue an atomic write in xfs by software
> method. Here we do 3x steps:
> a. allocate blocks for out-of-place write
> b. do write in those blocks
> c. atomically update extent mapping.
>
> In this, threads wanting to atomic write to the same address will use the
> new blocks and can trample over one another before we atomically update the
> mapping.
So iiuc, w/ software fallback, a thread atomically writing to a range
will use a new block A. Another parallel thread trying to atomically
write to the same range will also use A, and there is no serialization
b/w the 2 so A could end up with a mix of data from both threads.
If this is true, aren't we violating the atomic guarantees? Nothing
prevents userspace from doing overlapping parallel atomic writes, and
it is the kernel's duty to error out if the write could get torn.
>
> So we do not guarantee serialization of atomic writes vs atomic writes. And
> this is why I said that this test is never totally safe for xfs.
>
> We could change this simply to have serialization of software-based atomic
> writes against all other dio, like follows:
>
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -747,6 +747,7 @@ xfs_file_dio_write_atomic(
> unsigned int iolock = XFS_IOLOCK_SHARED;
> ssize_t ret, ocount = iov_iter_count(from);
> const struct iomap_ops *dops;
> + unsigned int dio_flags = 0;
>
> /*
> * HW offload should be faster, so try that first if it is already
> @@ -766,15 +767,12 @@ xfs_file_dio_write_atomic(
> if (ret)
> goto out_unlock;
>
> - /* Demote similar to xfs_file_dio_write_aligned() */
> - if (iolock == XFS_IOLOCK_EXCL) {
> - xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
> - iolock = XFS_IOLOCK_SHARED;
> - }
> + if (dio_flags & IOMAP_DIO_FORCE_WAIT)
> + inode_dio_wait(VFS_I(ip));
>
> trace_xfs_file_direct_write(iocb, from);
> ret = iomap_dio_rw(iocb, from, dops, &xfs_dio_write_ops,
> - 0, NULL, 0);
> + dio_flags, NULL, 0);
>
> /*
> * The retry mechanism is based on the ->iomap_begin method returning
> @@ -785,6 +783,8 @@ xfs_file_dio_write_atomic(
> if (ret == -ENOPROTOOPT && dops == &xfs_direct_write_iomap_ops) {
> xfs_iunlock(ip, iolock);
> dops = &xfs_atomic_write_cow_iomap_ops;
> + iolock = XFS_IOLOCK_EXCL;
> + dio_flags = IOMAP_DIO_FORCE_WAIT;
> goto retry;
> }
>
>
> But it may affect performance.
>
> > Does it mean the test can fail?
>
> Yes, but it is unlikely if we have HW atomics available. That is because we
> will rarely be using software-based atomic method, as HW method should often
> be possible.
Yes, but having a chance to tear the write when it is not possible is
not the right behavior.
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
2025-07-12 14:12 ` [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
2025-07-17 16:32 ` Darrick J. Wong
@ 2025-07-28 8:58 ` Zorro Lang
2025-07-28 9:27 ` Ojaswin Mujoo
1 sibling, 1 reply; 60+ messages in thread
From: Zorro Lang @ 2025-07-28 8:58 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: fstests, Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:48PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>
> This tests uses fio to first create a file with mixed mappings. Then it
> does atomic writes using aio dio with parallel jobs to the same file
> with mixed mappings. This forces the filesystem allocator to allocate
> extents over mixed mapping regions to stress FS block allocators.
>
> Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
This patch looks good to me, just the subject:
"generic/1227: Add atomic write test using fio verify on file mixed mappings"
Generally, when we add a new test case, we don't use a temporary case number
in the commit subject; you can just write "generic: add atomic write test using ...".
The same applies to the other patches that add new test cases.
With this change,
Reviewed-by: Zorro Lang <zlang@redhat.com>
> tests/generic/1227 | 123 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1227.out | 2 +
> 2 files changed, 125 insertions(+)
> create mode 100755 tests/generic/1227
> create mode 100644 tests/generic/1227.out
>
> diff --git a/tests/generic/1227 b/tests/generic/1227
> new file mode 100755
> index 00000000..cfdc54ec
> --- /dev/null
> +++ b/tests/generic/1227
> @@ -0,0 +1,123 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 1227
> +#
> +# Validate FS atomic write using fio crc check verifier on mixed mappings
> +# of a file.
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto aio rw atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_odirect
> +_require_aio
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +
> +touch "$SCRATCH_MNT/f1"
> +awu_min_write=$(_get_atomic_write_unit_min "$SCRATCH_MNT/f1")
> +awu_max_write=$(_get_atomic_write_unit_max "$SCRATCH_MNT/f1")
> +aw_bsize=$(_max "$awu_min_write" "$((awu_max_write/4))")
> +
> +fsbsize=$(_get_block_size $SCRATCH_MNT)
> +
> +fio_prep_config=$tmp.prep.fio
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((128 * 1024 * 1024))
> +
> +cat >$fio_prep_config <<EOF
> +# prep file to have mixed mappings
> +[global]
> +ioengine=libaio
> +fallocate=none
> +filename=$SCRATCH_MNT/test-file
> +filesize=$SIZE
> +bs=$fsbsize
> +direct=1
> +group_reporting=1
> +
> +# Create written extents
> +[prep_written_blocks]
> +ioengine=libaio
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +
> +# Create unwritten extents
> +[prep_unwritten_blocks]
> +ioengine=falloc
> +rw=randwrite
> +io_size=$((SIZE/3))
> +random_generator=lfsr
> +EOF
> +
> +cat >$fio_aw_config <<EOF
> +# atomic write to mixed mappings of written/unwritten/holes
> +[atomic_write_job]
> +ioengine=libaio
> +rw=randwrite
> +direct=1
> +atomic=1
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +bs=$aw_bsize
> +iodepth=$FIO_LOAD
> +numjobs=$FIO_LOAD
> +
> +verify_state_save=0
> +verify=crc32c
> +do_verify=0
> +EOF
> +
> +cat >$fio_verify_config <<EOF
> +# verify atomic writes done by previous job
> +[verify_job]
> +ioengine=libaio
> +rw=randwrite
> +random_generator=lfsr
> +group_reporting=1
> +
> +filename=$SCRATCH_MNT/test-file
> +size=$SIZE
> +bs=$aw_bsize
> +iodepth=$FIO_LOAD
> +
> +verify_state_save=0
> +verify_only=1
> +verify=crc32c
> +verify_fatal=1
> +verify_write_sequence=0
> +EOF
> +
> +_require_fio $fio_aw_config
> +_require_fio $fio_verify_config
> +
> +cat $fio_prep_config >> $seqres.full
> +cat $fio_aw_config >> $seqres.full
> +cat $fio_verify_config >> $seqres.full
> +
> +#prepare file with mixed mappings
> +$FIO_PROG $fio_prep_config >> $seqres.full
> +
> +# do atomic writes without verifying
> +$FIO_PROG $fio_aw_config >> $seqres.full
> +
> +# verify data is not torn
> +$FIO_PROG $fio_verify_config >> $seqres.full
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/generic/1227.out b/tests/generic/1227.out
> new file mode 100644
> index 00000000..2605d062
> --- /dev/null
> +++ b/tests/generic/1227.out
> @@ -0,0 +1,2 @@
> +QA output created by 1227
> +Silence is golden
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-28 6:43 ` Ojaswin Mujoo
@ 2025-07-28 9:09 ` John Garry
2025-07-28 13:35 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-28 9:09 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 28/07/2025 07:43, Ojaswin Mujoo wrote:
>>> What do you mean by not safe?
>> Multiple threads issuing atomic writes may trample over one another.
>>
>> It is due to the steps used to issue an atomic write in xfs by software
>> method. Here we do 3x steps:
>> a. allocate blocks for out-of-place write
>> b. do write in those blocks
>> c. atomically update extent mapping.
>>
>> In this, threads wanting to atomic write to the same address will use the
>> new blocks and can trample over one another before we atomically update the
>> mapping.
> So iiuc, w/ software fallback, a thread atomically writing to a range
> will use a new block A. Another parallel thread trying to atomically
> write to the same range will also use A, and there is no serialization
> b/w the 2 so A could end up with a mix of data from both threads.
right
>
> If this is true, aren't we violating the atomic guarantees. Nothing
> prevents the userspace from doing overlapping parallel atomic writes and
> it is kernels duty to error out if the write could get torn.
Correct, but userspace simply should not do this. Direct I/O
applications are responsible for ordering.
We guarantee that the write is committed all-or-nothing, but do rely on
userspace not issuing racing atomic writes or racing regular writes.
I can easily change this, as I mentioned, but I am not convinced that it
is a must.
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings
2025-07-28 8:58 ` Zorro Lang
@ 2025-07-28 9:27 ` Ojaswin Mujoo
0 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-28 9:27 UTC (permalink / raw)
To: Zorro Lang
Cc: fstests, Ritesh Harjani, djwong, john.g.garry, tytso, linux-xfs,
linux-kernel, linux-ext4
On Mon, Jul 28, 2025 at 04:58:51PM +0800, Zorro Lang wrote:
> On Sat, Jul 12, 2025 at 07:42:48PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> >
> > This tests uses fio to first create a file with mixed mappings. Then it
> > does atomic writes using aio dio with parallel jobs to the same file
> > with mixed mappings. This forces the filesystem allocator to allocate
> > extents over mixed mapping regions to stress FS block allocators.
> >
> > Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
>
> This patch looks good to me, just the subject:
> "generic/1227: Add atomic write test using fio verify on file mixed mappings"
>
> generally if we write a new test case, we don't use a temporary case number
> in commit subject, you can just write as "generic: add atomic write test using ..."
>
> Other patches (with new test cases) refer to this.
>
> With this change,
> Reviewed-by: Zorro Lang <zlang@redhat.com>
Hi Zorro, thanks for pointing it out. I'll make the change in the next
revision.
Regards,
Ojaswin
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-28 9:09 ` John Garry
@ 2025-07-28 13:35 ` Ojaswin Mujoo
2025-07-28 14:00 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-28 13:35 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Mon, Jul 28, 2025 at 10:09:47AM +0100, John Garry wrote:
> On 28/07/2025 07:43, Ojaswin Mujoo wrote:
> > > > What do you mean by not safe?
> > > Multiple threads issuing atomic writes may trample over one another.
> > >
> > > It is due to the steps used to issue an atomic write in xfs by software
> > > method. Here we do 3x steps:
> > > a. allocate blocks for out-of-place write
> > > b. do write in those blocks
> > > c. atomically update extent mapping.
> > >
> > > In this, threads wanting to atomic write to the same address will use the
> > > new blocks and can trample over one another before we atomically update the
> > > mapping.
> > So iiuc, w/ software fallback, a thread atomically writing to a range
> > will use a new block A. Another parallel thread trying to atomically
> > write to the same range will also use A, and there is no serialization
> > b/w the 2 so A could end up with a mix of data from both threads.
>
> right
>
> >
> > If this is true, aren't we violating the atomic guarantees. Nothing
> > prevents the userspace from doing overlapping parallel atomic writes and
> > it is kernels duty to error out if the write could get torn.
>
> Correct, but simply userspace should not do this. Direct I/O applications
> are responsible for ordering.
>
> We guarantee that the write is committed all-or-nothing, but do rely on
> userspace not issuing racing atomic writes or racing regular writes.
>
> I can easily change this, as I mentioned, but I am not convinced that it is
> a must.
Purely from a design point of view, I feel we are breaking atomicity and
hence we should serialize or just stop userspace from doing this (which
is a bit extreme).
I know userspace should ideally not do overwriting atomic writes but if
it is something we are allowing (which we are) then it is the
kernel's responsibility to ensure atomicity. Sure, we can penalize them
by serializing the writes, but not by tearing them.
With that reasoning, I don't think the test should accommodate this
particular scenario.
Regards,
ojaswin
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-28 13:35 ` Ojaswin Mujoo
@ 2025-07-28 14:00 ` John Garry
2025-07-29 6:11 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-28 14:00 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On 28/07/2025 14:35, Ojaswin Mujoo wrote:
>> We guarantee that the write is committed all-or-nothing, but do rely on
>> userspace not issuing racing atomic writes or racing regular writes.
>>
>> I can easily change this, as I mentioned, but I am not convinced that it is
>> a must.
> Purely from a design point of view, I feel we are breaking atomicity and
> hence we should serialize or just stop userspace from doing this (which
> is a bit extreme).
If you check the man page description of RWF_ATOMIC, it does not mention
serialization. The user should conclude that usual direct IO rules
apply, i.e. userspace is responsible for serializing.
>
> I know userspace should ideally not do overwriting atomic writes but if
> it is something we are allowing (which we do) then it is
> kernel's responsibility to ensure atomicity. Sure we can penalize them
> by serializing the writes but not by tearing it.
>
> With that reasoning, I don't think the test should accomodate for this
> particular scenario.
I can send a patch to the community for xfs (to provide serialization),
like I showed earlier, to get opinions.
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-28 14:00 ` John Garry
@ 2025-07-29 6:11 ` Ojaswin Mujoo
2025-07-29 14:45 ` Darrick J. Wong
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-29 6:11 UTC (permalink / raw)
To: John Garry
Cc: Zorro Lang, fstests, Ritesh Harjani, djwong, tytso, linux-xfs,
linux-kernel, linux-ext4
On Mon, Jul 28, 2025 at 03:00:40PM +0100, John Garry wrote:
> On 28/07/2025 14:35, Ojaswin Mujoo wrote:
> > > We guarantee that the write is committed all-or-nothing, but do rely on
> > > userspace not issuing racing atomic writes or racing regular writes.
> > >
> > > I can easily change this, as I mentioned, but I am not convinced that it is
> > > a must.
> > Purely from a design point of view, I feel we are breaking atomicity and
> > hence we should serialize or just stop userspace from doing this (which
> > is a bit extreme).
>
> If you check the man page description of RWF_ATOMIC, it does not mention
> serialization. The user should conclude that usual direct IO rules apply,
> i.e. userspace is responsible for serializing.
My mental model of serialization in the context of atomic writes is that if
a user does a 64k atomic write A followed by a parallel overlapping 64k
atomic write B, then the user might see complete A or complete B (we
don't guarantee which) but not a mix of A and B.
>
> >
> > I know userspace should ideally not do overwriting atomic writes but if
> > it is something we are allowing (which we do) then it is
> > kernel's responsibility to ensure atomicity. Sure we can penalize them
> > by serializing the writes but not by tearing it.
> >
> > With that reasoning, I don't think the test should accomodate for this
> > particular scenario.
>
> I can send a patch to the community for xfs (to provide serialization), like
> I showed earlier, to get opinion.
Thanks, that would be great.
Regards,
John
>
> Thanks,
> John
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-29 6:11 ` Ojaswin Mujoo
@ 2025-07-29 14:45 ` Darrick J. Wong
2025-07-31 4:18 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 14:45 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: John Garry, Zorro Lang, fstests, Ritesh Harjani, tytso, linux-xfs,
linux-kernel, linux-ext4
On Tue, Jul 29, 2025 at 11:41:39AM +0530, Ojaswin Mujoo wrote:
> On Mon, Jul 28, 2025 at 03:00:40PM +0100, John Garry wrote:
> > On 28/07/2025 14:35, Ojaswin Mujoo wrote:
> > > > We guarantee that the write is committed all-or-nothing, but do rely on
> > > > userspace not issuing racing atomic writes or racing regular writes.
> > > >
> > > > I can easily change this, as I mentioned, but I am not convinced that it is
> > > > a must.
> > > Purely from a design point of view, I feel we are breaking atomicity and
> > > hence we should serialize or just stop userspace from doing this (which
> > > is a bit extreme).
> >
> > If you check the man page description of RWF_ATOMIC, it does not mention
> > serialization. The user should conclude that usual direct IO rules apply,
> > i.e. userspace is responsible for serializing.
>
> My mental model of serialization in context of atomic writes is that if
> user does 64k atomic write A followed by a parallel overlapping 64kb
> atomic write B then the user might see complete A or complete B (we
> don't guarantee) but not a mix of A and B.
Heh, here comes that feature naming confusion again. This is my
definition:
RWF_ATOMIC means the system won't introduce new tearing when persisting
file writes. The application is allowed to introduce tearing by writing
to overlapping ranges at the same time. The system does not isolate
overlapping reads from writes.
--D
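Under that definition, keeping overlapping writes apart is the
application's job. One way an application might decide when it has to
wait is a plain range-overlap check over its own in-flight writes. This
is a minimal sketch, assuming the application tracks in-flight I/O
itself: struct io_range, ranges_overlap() and must_serialize() are
hypothetical names, not kernel or libc interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for an in-flight write; not a kernel type. */
struct io_range {
	uint64_t offset;
	uint64_t len;
};

/* True if the half-open ranges [offset, offset + len) share any byte. */
bool ranges_overlap(struct io_range a, struct io_range b)
{
	assert(a.len > 0 && b.len > 0);
	return a.offset < b.offset + b.len && b.offset < a.offset + a.len;
}

/*
 * True if the next write overlaps any of the n writes still in flight,
 * in which case the caller should wait before submitting it.
 */
bool must_serialize(const struct io_range *inflight, size_t n,
		    struct io_range next)
{
	for (size_t i = 0; i < n; i++)
		if (ranges_overlap(inflight[i], next))
			return true;
	return false;
}
```

Two 64k writes at offsets 0 and 32k overlap and would need ordering by
the caller; writes at 0 and 64k are disjoint and can be issued in
parallel.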
> >
> > >
> > > I know userspace should ideally not do overwriting atomic writes but if
> > > it is something we are allowing (which we do) then it is
> > > kernel's responsibility to ensure atomicity. Sure we can penalize them
> > > by serializing the writes but not by tearing it.
> > >
> > > With that reasoning, I don't think the test should accomodate for this
> > > particular scenario.
> >
> > I can send a patch to the community for xfs (to provide serialization), like
> > I showed earlier, to get opinion.
>
> Thanks, that would be great.
>
> Regards,
> John
> >
> > Thanks,
> > John
> >
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation
2025-07-12 14:12 ` [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
@ 2025-07-29 19:36 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 19:36 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:55PM +0530, Ojaswin Mujoo wrote:
> Test atomic writes with journal credit calculation. We take 2 cases
> here:
>
> 1. Atomic writes on single mapping causing tree to collapse into
> the inode
> 2. Atomic writes on mixed mapping causing tree to collapse into the
> inode
>
> This test is inspired by ext4/034.
>
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/ext4/064 | 75 ++++++++++++++++++++++++++++++++++++++++++++++
> tests/ext4/064.out | 2 ++
> 2 files changed, 77 insertions(+)
> create mode 100755 tests/ext4/064
> create mode 100644 tests/ext4/064.out
>
> diff --git a/tests/ext4/064 b/tests/ext4/064
> new file mode 100755
> index 00000000..ec31f983
> --- /dev/null
> +++ b/tests/ext4/064
> @@ -0,0 +1,75 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 064
> +#
> +# Test proper credit reservation is done when performing
> +# tree collapse during an atomic write based allocation
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto quick quota fiemap prealloc atomicwrites
> +
> +# Import common functions.
> +
> +
> +# Modify as appropriate.
> +_exclude_fs ext2
> +_exclude_fs ext3
> +_require_xfs_io_command "falloc"
> +_require_xfs_io_command "fiemap"
> +_require_xfs_io_command "syncfs"
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
_require_metadata_journaling $SCRATCH_DEV ?
> +
> +echo "----- Testing with atomic write on non-mixed mapping -----" >> $seqres.full
> +
> +echo "Format and mount" >> $seqres.full
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +echo "Create the original file" >> $seqres.full
> +touch $SCRATCH_MNT/foobar >> $seqres.full
> +
> +echo "Create 2 level extent tree (btree) for foobar with an unwritten extent" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
> + -c "pwrite 20k 4k" -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
> + -c "fsync" $SCRATCH_MNT/foobar >> $seqres.full
What happens if the block size isn't 4k?
--D
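One way to address the block-size question above is to derive every offset from the filesystem block size instead of hard-coding 4k. The following is a minimal sketch, not part of the posted patch: `layout_cmds` is a hypothetical helper, and the block layout (blocks 0, 1, 2, 5, 7, 9) simply mirrors the 4k offsets used in the test. Whether scaling the offsets this way still yields a two-level extent tree at other block sizes is an assumption that would need checking.

```shell
# Hypothetical helper (not in the patch): print the xfs_io commands for the
# test's file layout, scaled to the block size given in $1 instead of 4k.
layout_cmds() {
	local bs=$1
	local blk

	echo "pwrite 0 $bs"      # block 0: written
	echo "falloc $bs $bs"    # block 1: unwritten (preallocated)
	for blk in 2 5 7 9; do   # remaining written blocks, with holes in between
		echo "pwrite $((blk * bs)) $bs"
	done
	echo "fsync"
}

# Example: show the commands for a 1k-block filesystem.
layout_cmds 1024
```

In the test itself, each printed line would become one `-c` argument to `$XFS_IO_PROG`, with the block size taken from `_get_block_size $SCRATCH_MNT` as ext4/063 does.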
> +
> +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar >> $seqres.full
> +
> +echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
> +$XFS_IO_PROG -dc "pwrite -A -V1 4k 4k" $SCRATCH_MNT/foobar >> $seqres.full
> +
> +echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy >> $seqres.full
> +
> +echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
> +$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
> +
> +echo "----- Testing with atomic write on mixed mapping -----" >> $seqres.full
> +
> +echo "Create the original file" >> $seqres.full
> +touch $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Create 2 level extent tree (btree) for foobar2 with an unwritten extent" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "falloc 4k 4k" -c "pwrite 8k 4k" \
> + -c "pwrite 20k 4k" -c "pwrite 28k 4k" -c "pwrite 36k 4k" \
> + -c "fsync" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Convert unwritten extent to written and collapse extent tree to inode" >> $seqres.full
> +$XFS_IO_PROG -dc "pwrite -A -V1 0k 12k" $SCRATCH_MNT/foobar2 >> $seqres.full
> +
> +echo "Create a new file and do fsync to force a jbd2 commit" >> $seqres.full
> +$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/dummy2 >> $seqres.full
> +
> +echo "sync $SCRATCH_MNT to writeback" >> $seqres.full
> +$XFS_IO_PROG -c "syncfs" $SCRATCH_MNT >> $seqres.full
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/ext4/064.out b/tests/ext4/064.out
> new file mode 100644
> index 00000000..d9076546
> --- /dev/null
> +++ b/tests/ext4/064.out
> @@ -0,0 +1,2 @@
> +QA output created by 064
> +Silence is golden
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes
2025-07-12 14:12 ` [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
@ 2025-07-29 19:41 ` Darrick J. Wong
2025-07-30 14:06 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 19:41 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:54PM +0530, Ojaswin Mujoo wrote:
> In ext4, even if an allocated range is physically and logically
> contiguous, it can still be split into 2 extents. This is because ext4
> does not merge extents across leaf nodes. This is an issue for atomic
> writes since even for a continuous extent the map block could (in rare
> cases) return a shorter map, hence tearing the write. This test creates
> such a file and ensures that the atomic write handles this case
> correctly
>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/ext4/063 | 125 +++++++++++++++++++++++++++++++++++++++++++++
> tests/ext4/063.out | 2 +
> 2 files changed, 127 insertions(+)
> create mode 100755 tests/ext4/063
> create mode 100644 tests/ext4/063.out
>
> diff --git a/tests/ext4/063 b/tests/ext4/063
> new file mode 100755
> index 00000000..25b5693d
> --- /dev/null
> +++ b/tests/ext4/063
> @@ -0,0 +1,125 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# In ext4, even if an allocated range is physically and logically contiguous,
> +# it can still be split into 2 extents. This is because ext4 does not merge
> +# extents across leaf nodes. This is an issue for atomic writes since even for
> +# a continuous extent the map block could (in rare cases) return a shorter map,
> +# hence tearing the write. This test creates such a file and ensures that the
> +# atomic write handles this case correctly
> +#
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_command "$DEBUGFS_PROG" debugfs
> +
> +prep() {
> + local bs=`_get_block_size $SCRATCH_MNT`
> + local ex_hdr_bytes=12
> + local ex_entry_bytes=12
> + local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> +
> + # fill the extent tree leaf with bs-length extents at alternate offsets. For example,
> + # for 4k bs the tree should look as follows
> + #
> + # +---------+---------+
> + # | index 1 | index 2 |
> + # +-----+---+-----+---+
> + # +--------+ +-------+
> + # | |
> + # +----------+--------------+ +-----+-----+
> + # | ex 1 | ex 2 |... | ex n | | ex n + 1 |
> + # +-------------------------+ +-----------+
> + # 0 2 680 682
> + for i in $(seq 0 $entries_per_blk)
> + do
> + $XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> + done
> + sync $testfile
> +
> + echo >> $seqres.full
> + echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> + echo "...">> $seqres.full
> + $DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> + # Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
> + # this is a new FS the allocator would find continuous blocks such that
> + # ex(n) ex(new) ex(n+1) are physically(and logically) contiguous. However,
> + # since we don't merge extents across leaves we will end up with a tree as:
> + #
> + # +---------+---------+
> + # | index 1 | index 2 |
> + # +-----+---+-----+---+
> + # +--------+ +-------+
> + # | |
> + # +----------+--------------+ +-----+-----+
> + # | ex 1 | ex 2 |... | ex n | | ex merged |
> + # +-------------------------+ +-----------+
> + # 0 2 680 681 682 684
Where did 684 come from? It's not in the 'before' diagram. Did
"ex n + 1" previously map 682-684, and now it maps 681-684?
The rest looks ok though.
--D
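The arithmetic behind the diagram can be worked through for a concrete block size. This is a sketch that reuses the formulas from `prep()` and the later `torn_aw_offset` computation in the patch; the helper names `leaf_capacity`, `torn_offset`, and `align_down` are hypothetical, and the 64k `awu_max` is an assumed value for illustration only.

```shell
# Number of extent entries that fit in one leaf block: 12-byte header plus
# 12-byte entries, the same constants the patch uses.
leaf_capacity() {
	local bs=$1
	echo $(( (bs - 12) / 12 ))
}

# Byte offset of the single-block hole that the test fills in afterwards
# (same formula as torn_ex_offset in the patch).
torn_offset() {
	local bs=$1
	local n
	n=$(leaf_capacity "$bs")
	echo $(( (n * 2 - 1) * bs ))
}

# Round $1 down to a multiple of $2 (same formula as torn_aw_offset).
align_down() {
	echo $(( $1 - ($1 % $2) ))
}

bs=4096
awu_max=65536   # assumed value, normally read from the file
off=$(torn_offset $bs)
echo "extents per leaf: $(leaf_capacity $bs)"
echo "new extent at block: $((off / bs))"
echo "atomic write starts at byte: $(align_down "$off" $awu_max)"
```

For 4k blocks these formulas give 340 entries per leaf and place the new extent at block 679, that is, between the extents written at blocks 678 and 680 by the fill loop.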
> + #
> + echo >> $seqres.full
> + torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> + $XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> + sync $testfile
> +
> + echo >> $seqres.full
> + echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> + echo "...">> $seqres.full
> + $DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> + _scratch_cycle_mount
> +}
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +
> +echo >> $seqres.full
> +echo "# Prepping the file" >> $seqres.full
> +prep
> +
> +torn_aw_offset=$((torn_ex_offset - (torn_ex_offset % awu_max)))
> +
> +echo >> $seqres.full
> +echo "# Performing atomic IO on the torn extent range. Command: " >> $seqres.full
> +echo $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
> +$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $awu_max $torn_aw_offset $awu_max" >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "Extent state after atomic write:">> $seqres.full
> +echo "...">> $seqres.full
> +$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> +
> +echo >> $seqres.full
> +echo "# Checking data integrity" >> $seqres.full
> +
> +# create a dummy file with expected data
> +$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp >> /dev/null
> +expected_data=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp)
> +
> +# We ensure that the data after atomic writes should match the expected data
> +actual_data=$(od -An -t x1 -j $torn_aw_offset -N $awu_max $testfile)
> +if [[ "$actual_data" != "$expected_data" ]]
> +then
> + echo "Checksum match failed at off: $torn_aw_offset size: $awu_max"
> + echo
> + echo "Expected: "
> + echo "$expected_data"
> + echo
> + echo "Actual contents: "
> + echo "$actual_data"
> +
> + _fail
> +fi
> +
> +echo -n "Data verification at offset $torn_aw_offset succeeded!" >> $seqres.full
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/ext4/063.out b/tests/ext4/063.out
> new file mode 100644
> index 00000000..de35fc52
> --- /dev/null
> +++ b/tests/ext4/063.out
> @@ -0,0 +1,2 @@
> +QA output created by 063
> +Silence is golden
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files
2025-07-12 14:12 ` [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
@ 2025-07-29 19:44 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 19:44 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:53PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>
> Brute force all possible blocksize clustersize combination on a bigalloc
> filesystem for stressing atomic write using fio data crc verifier. We run
> multiple threads in parallel with each job writing to its own file. The
> parallel jobs running on a constrained filesystem size ensure that we stress
> the ext4 allocator to allocate contiguous extents.
>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/ext4/062 | 176 +++++++++++++++++++++++++++++++++++++++++++++
> tests/ext4/062.out | 2 +
> 2 files changed, 178 insertions(+)
> create mode 100755 tests/ext4/062
> create mode 100644 tests/ext4/062.out
>
> diff --git a/tests/ext4/062 b/tests/ext4/062
> new file mode 100755
> index 00000000..85b82f97
> --- /dev/null
> +++ b/tests/ext4/062
> @@ -0,0 +1,176 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * $LOAD_FACTOR threads per job, with 8 parallel jobs each writing to
> +# its own $SCRATCH_MNT/testfile-jobN. Running them on a constrained
> +# filesystem size stresses the ext4 allocator to allocate contiguous extents.
Looks ok to me,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_aiodio
> +
> +FSSIZE=$((360*1024*1024))
> +FIO_LOAD=$(($(nproc) * LOAD_FACTOR))
> +fiobsize=4096
> +
> +# Calculate fsblocksize as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(_max 4096 "$bdev_awu_min")
> +
> +function create_fio_configs()
> +{
> + create_fio_aw_config
> + create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> + [global]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
> + bs=$fiobsize
> + fallocate=truncate
> + size=$((FSSIZE / 12))
> + iodepth=$FIO_LOAD
> + numjobs=$FIO_LOAD
> + group_reporting=1
> + atomic=1
> +
> + verify_only=1
> + verify_state_save=0
> + verify=crc32c
> + verify_fatal=1
> + verify_write_sequence=0
> +
> + [verify-job1]
> + filename=$SCRATCH_MNT/testfile-job1
> +
> + [verify-job2]
> + filename=$SCRATCH_MNT/testfile-job2
> +
> + [verify-job3]
> + filename=$SCRATCH_MNT/testfile-job3
> +
> + [verify-job4]
> + filename=$SCRATCH_MNT/testfile-job4
> +
> + [verify-job5]
> + filename=$SCRATCH_MNT/testfile-job5
> +
> + [verify-job6]
> + filename=$SCRATCH_MNT/testfile-job6
> +
> + [verify-job7]
> + filename=$SCRATCH_MNT/testfile-job7
> +
> + [verify-job8]
> + filename=$SCRATCH_MNT/testfile-job8
> +
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> + [global]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
> + bs=$fiobsize
> + fallocate=truncate
> + size=$((FSSIZE / 12))
> + iodepth=$FIO_LOAD
> + numjobs=$FIO_LOAD
> + group_reporting=1
> + atomic=1
> +
> + verify_state_save=0
> + verify=crc32c
> + do_verify=0
> +
> + [write-job1]
> + filename=$SCRATCH_MNT/testfile-job1
> +
> + [write-job2]
> + filename=$SCRATCH_MNT/testfile-job2
> +
> + [write-job3]
> + filename=$SCRATCH_MNT/testfile-job3
> +
> + [write-job4]
> + filename=$SCRATCH_MNT/testfile-job4
> +
> + [write-job5]
> + filename=$SCRATCH_MNT/testfile-job5
> +
> + [write-job6]
> + filename=$SCRATCH_MNT/testfile-job6
> +
> + [write-job7]
> + filename=$SCRATCH_MNT/testfile-job7
> +
> + [write-job8]
> + filename=$SCRATCH_MNT/testfile-job8
> +
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> + # cluster sizes above 16 x blocksize are experimental so avoid them
> + # Also, cap cluster size at 128kb to keep it reasonable for large
> + # block size cases.
> + fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> +
> + for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> + for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> + MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> + _scratch_mkfs_sized "$FSSIZE" >> $seqres.full 2>&1 || continue
> + if _try_scratch_mount >> $seqres.full 2>&1; then
> + echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +
> + touch $SCRATCH_MNT/f1
> + create_fio_configs
> +
> + cat $fio_aw_config >> $seqres.full
> + cat $fio_verify_config >> $seqres.full
> +
> + $FIO_PROG $fio_aw_config >> $seqres.full
> + ret1=$?
> +
> + $FIO_PROG $fio_verify_config >> $seqres.full
> + ret2=$?
> +
> + _scratch_unmount
> +
> + [[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> + fi
> + done
> + done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/062.out b/tests/ext4/062.out
> new file mode 100644
> index 00000000..a1578f48
> --- /dev/null
> +++ b/tests/ext4/062.out
> @@ -0,0 +1,2 @@
> +QA output created by 062
> +Silence is golden
> --
> 2.49.0
>
>
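For readers following the nested loops in the patch above, the block size / cluster size / fio block size sweep can be sketched on its own. This is an illustration only: `min3` is a hypothetical stand-in for the series' `_min` helper (whose actual implementation is not shown in this message), and the starting block size, page size, and `bdev_awu_max` passed in at the bottom are assumed values rather than ones probed from a device.

```shell
# Hypothetical stand-in for _min: print the smallest of the integer arguments.
min3() {
	local m v
	m=$1
	for v in "$@"; do
		if [ "$v" -lt "$m" ]; then m=$v; fi
	done
	echo "$m"
}

# Print one "fsblocksize fsclustersize fiobsize" triple per line.
# $1 = smallest block size, $2 = page size, $3 = device atomic write unit max.
sweep() {
	local bs cs fio max_cs
	local page_size=$2
	local awu_max=$3

	bs=$1
	while [ "$bs" -le "$page_size" ]; do
		# Cap the cluster size at 16 x block size, awu_max and 128k.
		max_cs=$(min3 $((16 * bs)) "$awu_max" $((128 * 1024)))
		cs=$bs
		while [ "$cs" -le "$max_cs" ]; do
			fio=$bs
			while [ "$fio" -le "$cs" ]; do
				echo "$bs $cs $fio"
				fio=$((fio * 2))
			done
			cs=$((cs * 2))
		done
		bs=$((bs * 2))
	done
}

sweep 4096 4096 65536
```

Each printed triple corresponds to one mkfs, write and verify pass in the test; with the assumed values above the sweep yields 15 passes.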
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
2025-07-12 14:12 ` [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
@ 2025-07-29 19:47 ` Darrick J. Wong
2025-07-30 13:56 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 19:47 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:52PM +0530, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>
> We brute force all possible blocksize & clustersize combinations on
> a bigalloc filesystem for stressing atomic write using fio data crc
> verifier. We run nproc * $LOAD_FACTOR threads in parallel writing to
> a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> that we never see the mix of data contents from different threads on
> a given bsrange.
Err, how does this differ from the next patch? It looks like this one
creates one IO thread, whereas the next one creates 8? If so, what does
this test add over ext4/062?
(and now that I look at it, ext4/062 says "FS QA Test 061"...)
--D
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
> tests/ext4/061 | 130 +++++++++++++++++++++++++++++++++++++++++++++
> tests/ext4/061.out | 2 +
> 2 files changed, 132 insertions(+)
> create mode 100755 tests/ext4/061
> create mode 100644 tests/ext4/061.out
>
> diff --git a/tests/ext4/061 b/tests/ext4/061
> new file mode 100755
> index 00000000..a0e49249
> --- /dev/null
> +++ b/tests/ext4/061
> @@ -0,0 +1,130 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> +# we should never see the mix of data contents from different threads for any
> +# given fio blocksize.
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_aiodio
> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((100*1024*1024))
> +fiobsize=4096
> +
> +# Calculate fsblocksize as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(_max 4096 "$bdev_awu_min")
> +
> +function create_fio_configs()
> +{
> + create_fio_aw_config
> + create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> + [aio-dio-aw-verify]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
> + bs=$fiobsize
> + fallocate=native
> + filename=$SCRATCH_MNT/test-file
> + size=$SIZE
> + iodepth=$FIO_LOAD
> + numjobs=$FIO_LOAD
> + atomic=1
> + group_reporting=1
> +
> + verify_only=1
> + verify_state_save=0
> + verify=crc32c
> + verify_fatal=1
> + verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> + [aio-dio-aw]
> + direct=1
> + ioengine=libaio
> + rw=randwrite
> + bs=$fiobsize
> + fallocate=native
> + filename=$SCRATCH_MNT/test-file
> + size=$SIZE
> + iodepth=$FIO_LOAD
> + numjobs=$FIO_LOAD
> + group_reporting=1
> + atomic=1
> +
> + verify_state_save=0
> + verify=crc32c
> + do_verify=0
> +
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> + # cluster sizes above 16 x blocksize are experimental so avoid them
> + # Also, cap cluster size at 128kb to keep it reasonable for large
> + # block size
> + fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> +
> + for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> + for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> + MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> + _scratch_mkfs_ext4 >> $seqres.full 2>&1 || continue
> + if _try_scratch_mount >> $seqres.full 2>&1; then
> + echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +
> + touch $SCRATCH_MNT/f1
> + create_fio_configs
> +
> + cat $fio_aw_config >> $seqres.full
> + echo >> $seqres.full
> + cat $fio_verify_config >> $seqres.full
> +
> + $FIO_PROG $fio_aw_config >> $seqres.full
> + ret1=$?
> +
> + $FIO_PROG $fio_verify_config >> $seqres.full
> + ret2=$?
> +
> + _scratch_unmount
> +
> + [[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> + fi
> + done
> + done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> new file mode 100644
> index 00000000..273be9e0
> --- /dev/null
> +++ b/tests/ext4/061.out
> @@ -0,0 +1,2 @@
> +QA output created by 061
> +Silence is golden
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes
2025-07-12 14:12 ` [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
@ 2025-07-29 19:49 ` Darrick J. Wong
0 siblings, 0 replies; 60+ messages in thread
From: Darrick J. Wong @ 2025-07-29 19:49 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Sat, Jul 12, 2025 at 07:42:51PM +0530, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi blocks atomic writes
> maintain atomic guarantees across sudden FS shutdowns.
>
> The way we work is that we lay out a file with random mix of written,
> unwritten and hole extents. Then we start performing atomic writes
> sequentially on the file while we shut down the FS in parallel. Then we
> note the last offset where the atomic write happened just before shut
> down and then make sure blocks around it either have completely old
> data or completely new data, i.e. the write was not torn during shutdown.
>
> We repeat the same with completely written, completely unwritten and completely
> empty file to ensure these cases are not torn either. Finally, we have a
> similar test for append atomic writes
>
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Looks fine to me,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
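The pass criterion described in the commit message, that each atomic-write-sized chunk near the shutdown point holds either entirely old or entirely new data, can be sketched in isolation. This is a minimal illustration on ordinary temporary files rather than on a scratch filesystem; `classify_chunk` is a hypothetical name and the 16-byte chunk size is chosen only to keep the example small.

```shell
# Print "new", "old" or "torn" for the $len-byte chunk of $file at offset $off,
# comparing hex dumps the same way the test compares od output.
classify_chunk() {
	local file=$1 off=$2 len=$3 old_file=$4 new_file=$5
	local actual old new

	actual=$(od -An -t x1 -j "$off" -N "$len" "$file")
	old=$(od -An -t x1 -N "$len" "$old_file")
	new=$(od -An -t x1 -N "$len" "$new_file")

	if [ "$actual" = "$new" ]; then
		echo new
	elif [ "$actual" = "$old" ]; then
		echo old
	else
		echo torn
	fi
}

dir=$(mktemp -d)
len=16
printf 'aaaaaaaaaaaaaaaa' > "$dir/new"   # 16 bytes of 0x61, the new pattern
head -c $len /dev/zero > "$dir/old"      # 16 zero bytes, the old contents

# File under test: chunk 0 fully new, chunk 1 fully old, chunk 2 half and half.
{ cat "$dir/new" "$dir/old"; head -c 8 "$dir/new"; head -c 8 "$dir/old"; } > "$dir/testfile"

for i in 0 1 2; do
	echo "chunk $i: $(classify_chunk "$dir/testfile" $((i * len)) $len "$dir/old" "$dir/new")"
done
rm -r "$dir"
```

Only the third chunk, which mixes both patterns, is reported as torn; that is the condition the test treats as a failure.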
> ---
> tests/generic/1230 | 397 +++++++++++++++++++++++++++++++++++++++++
> tests/generic/1230.out | 2 +
> 2 files changed, 399 insertions(+)
> create mode 100755 tests/generic/1230
> create mode 100644 tests/generic/1230.out
>
> diff --git a/tests/generic/1230 b/tests/generic/1230
> new file mode 100755
> index 00000000..cff5adc0
> --- /dev/null
> +++ b/tests/generic/1230
> @@ -0,0 +1,397 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test No. 1230
> +#
> +# Test multi block atomic writes with sudden FS shutdowns to ensure
> +# the FS is not tearing the write operation
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_scratch_shutdown
> +_require_xfs_io_command "truncate"
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +echo "Awu max: $awu_max" >> $seqres.full
> +
> +num_blocks=$((awu_max / blksz))
> +# keep initial value high for dry run. This will be
> +# tweaked in dry_run() based on device write speed.
> +filesize=$(( 10 * 1024 * 1024 * 1024 ))
> +
> +_cleanup() {
> + [ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
> + wait
> +}
> +
> +atomic_write_loop() {
> + local off=0
> + local size=$awu_max
> + for ((i=0; i<$((filesize / $size )); i++)); do
> + # Due to sudden shutdown this can produce errors so just
> + # redirect them to seqres.full
> + $XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> + echo "Written to offset: $off" >> $tmp.aw
> + off=$((off + $size))
> + done
> +}
> +
> +# This test has the following flow:
> +# 1. Start doing sequential atomic writes in bg, up to $filesize
> +# 2. Sleep for 0.2s and shutdown the FS
> +# 3. kill the atomic write process
> +# 4. verify the writes were not torn
> +#
> +# We ideally want the shutdown to happen while an atomic write is ongoing
> +# but this gets tricky since faster devices can actually finish the whole
> +# atomic write loop before sleep 0.2s completes, resulting in the shutdown
> +# happening after the write loop which is not what we want. A simple solution
> +# to this is to increase $filesize so step 1 takes long enough but a big
> +# $filesize leads to create_mixed_mappings() taking very long, which is not
> +# ideal.
> +#
> +# Hence, use the dry_run function to figure out the rough device speed and set
> +# $filesize accordingly.
> +dry_run() {
> + echo >> $seqres.full
> + echo "# Estimating ideal filesize..." >> $seqres.full
> + atomic_write_loop &
> + awloop_pid=$!
> +
> + local i=0
> + # Wait for at least the first write to be recorded or 10s
> + while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> + if [[ $i -gt 50 ]]
> + then
> + _fail "atomic write process took too long to start"
> + fi
> +
> + echo >> $seqres.full
> + echo "# Shutting down filesystem while write is running" >> $seqres.full
> + _scratch_shutdown
> +
> + kill $awloop_pid 2>/dev/null # the process might have finished already
> + wait $awloop_pid
> + unset awloop_pid
> +
> + bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> + echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
> +
> + filesize=$((bytes_written * 3))
> + echo "# Setting \$filesize=$filesize" >> $seqres.full
> +
> + rm $tmp.aw
> + sleep 0.5
> +
> + _scratch_cycle_mount
> +
> +}
> +
> +create_mixed_mappings() {
> + local file=$1
> + local size_bytes=$2
> +
> + echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> + #Fill the file with alternate written and unwritten blocks
> + local off=0
> + local operations=("W" "U")
> +
> + for ((i=0; i<$((size_bytes / blksz )); i++)); do
> + index=$(($i % ${#operations[@]}))
> + map="${operations[$index]}"
> +
> + case "$map" in
> + "W")
> + $XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file >> /dev/null
> + ;;
> + "U")
> + $XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
> + ;;
> + esac
> + off=$((off + blksz))
> + done
> +
> + sync $file
> +}
> +
> +populate_expected_data() {
> + # create a dummy file with expected old data for different cases
> + create_mixed_mappings $testfile.exp_old_mixed $awu_max
> + expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
> +
> + $XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
> + expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
> +
> + $XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
> + expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
> +
> + # create a dummy file with expected new data
> + $XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
> + expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
> +}
> +
> +verify_data_blocks() {
> + local verify_start=$1
> + local verify_end=$2
> + local expected_data_old="$3"
> + local expected_data_new="$4"
> +
> + echo >> $seqres.full
> + echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> +
> + # After an atomic write, for every chunk we ensure that the underlying
> + # data is either the old data or new data as writes shouldn't get torn.
> + local off=$verify_start
> + while [[ "$off" -lt "$verify_end" ]]
> + do
> + #actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> + actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
> + if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> + then
> + echo "Checksum match failed at off: $off size: $awu_max"
> + echo "Expected contents: (Either of the 2 below):"
> + echo
> + echo "Expected old: "
> + echo "$expected_data_old"
> + echo
> + echo "Expected new: "
> + echo "$expected_data_new"
> + echo
> + echo "Actual contents: "
> + echo "$actual_data"
> +
> + _fail
> + fi
> + echo -n "Check at offset $off succeeded! " >> $seqres.full
> + if [[ "$actual_data" == "$expected_data_new" ]]
> + then
> + echo "matched new" >> $seqres.full
> + elif [[ "$actual_data" == "$expected_data_old" ]]
> + then
> + echo "matched old" >> $seqres.full
> + fi
> + off=$(( off + awu_max ))
> + done
> +}
> +
> +# test data integrity for file by shutting down in between atomic writes
> +test_data_integrity() {
> + echo >> $seqres.full
> + echo "# Writing atomically to file in background" >> $seqres.full
> + atomic_write_loop &
> + awloop_pid=$!
> +
> + local i=0
> + # Wait for at least the first write to be recorded or 10s
> + while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> + if [[ $i -gt 50 ]]
> + then
> + _fail "atomic write process took too long to start"
> + fi
> +
> + echo >> $seqres.full
> + echo "# Shutting down filesystem while write is running" >> $seqres.full
> + _scratch_shutdown
> +
> + kill $awloop_pid 2>/dev/null # the process might have finished already
> + wait $awloop_pid
> + unset awloop_pid
> +
> + last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> + if [[ -z $last_offset ]]
> + then
> + last_offset=0
> + fi
> +
> + echo >> $seqres.full
> + echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +
> + rm $tmp.aw
> + sleep 0.5
> +
> + _scratch_cycle_mount
> +
> + # we want to verify all blocks around which the shutdown happened
> + verify_start=$(( last_offset - (awu_max * 5)))
> + if [[ $verify_start -lt 0 ]]
> + then
> + verify_start=0
> + fi
> +
> + verify_end=$(( last_offset + (awu_max * 5)))
> + if [[ "$verify_end" -gt "$filesize" ]]
> + then
> + verify_end=$filesize
> + fi
> +}
> +
> +# test data integrity for file with written and unwritten mappings
> +test_data_integrity_mixed() {
> + $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Creating testfile with mixed mappings" >> $seqres.full
> + create_mixed_mappings $testfile $filesize
> +
> + test_data_integrity
> +
> + verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> +}
> +
> +# test data integrity for file with completely written mappings
> +test_data_integrity_writ() {
> + $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Creating testfile with fully written mapping" >> $seqres.full
> + $XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> + sync $testfile
> +
> + test_data_integrity
> +
> + verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> +}
> +
> +# test data integrity for file with completely unwritten mappings
> +test_data_integrity_unwrit() {
> + $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> + $XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> + sync $testfile
> +
> + test_data_integrity
> +
> + verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +}
> +
> +# test data integrity for file with no mappings
> +test_data_integrity_hole() {
> + $XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Creating testfile with no mappings" >> $seqres.full
> + $XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> + sync $testfile
> +
> + test_data_integrity
> +
> + verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +}
> +
> +test_filesize_integrity() {
> + $XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Performing extending atomic writes over file in background" >> $seqres.full
> + atomic_write_loop &
> + awloop_pid=$!
> +
> + local i=0
> + # Wait for at least the first write to be recorded, or 10s
> + while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> + if [[ $i -gt 50 ]]
> + then
> + _fail "atomic write process took too long to start"
> + fi
> +
> + echo >> $seqres.full
> + echo "# Shutting down filesystem while write is running" >> $seqres.full
> + _scratch_shutdown
> +
> + kill $awloop_pid 2>/dev/null # the process might have finished already
> + wait $awloop_pid
> + unset awloop_pid
> +
> + local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> + if [[ -z $last_offset ]]
> + then
> + last_offset=0
> + fi
> +
> + echo >> $seqres.full
> + echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> + rm $tmp.aw
> + sleep 0.5
> +
> + _scratch_cycle_mount
> + local filesize=$(_get_filesize $testfile)
> + echo >> $seqres.full
> + echo "# Filesize after shutdown: $filesize" >> $seqres.full
> +
> + # To confirm that the write went atomically, we check:
> + # 1. The file size should be a multiple of awu_max
> + # 2. The last block should be the completely new data
> +
> + if (( $filesize % $awu_max ))
> + then
> + echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
> + fi
> +
> + verify_start=$(( filesize - (awu_max * 5)))
> + if [[ $verify_start -lt 0 ]]
> + then
> + verify_start=0
> + fi
> +
> + local verify_end=$filesize
> +
> + # Here the blocks should always match new data hence, for simplicity of
> + # code, just corrupt the $expected_data_old buffer so it never matches
> + local expected_data_old="POISON"
> + verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
> +}
> +
> +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +dry_run
> +
> +echo >> $seqres.full
> +echo "# Populating expected data buffers" >> $seqres.full
> +populate_expected_data
> +
> +# Loop 20 times to shake out any races due to shutdown
> +for ((iter=0; iter<20; iter++))
> +do
> + echo >> $seqres.full
> + echo "------ Iteration $iter ------" >> $seqres.full
> +
> + echo >> $seqres.full
> + echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> + test_data_integrity_mixed
> +
> + echo >> $seqres.full
> + echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> + test_data_integrity_writ
> +
> + echo >> $seqres.full
> + echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> + test_data_integrity_unwrit
> +
> + echo >> $seqres.full
> + echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> + test_data_integrity_hole
> +
> + echo >> $seqres.full
> + echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> + test_filesize_integrity
> +done
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/1230.out b/tests/generic/1230.out
> new file mode 100644
> index 00000000..d01f54ea
> --- /dev/null
> +++ b/tests/generic/1230.out
> @@ -0,0 +1,2 @@
> +QA output created by 1230
> +Silence is golden
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier
2025-07-29 19:47 ` Darrick J. Wong
@ 2025-07-30 13:56 ` Ojaswin Mujoo
0 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-30 13:56 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Tue, Jul 29, 2025 at 12:47:08PM -0700, Darrick J. Wong wrote:
> On Sat, Jul 12, 2025 at 07:42:52PM +0530, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> >
> > We brute force all possible blocksize & clustersize combinations on
> > a bigalloc filesystem for stressing atomic write using fio data crc
> > verifier. We run nproc * 2 * $LOAD_FACTOR threads in parallel writing to
> > a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> > that we never see the mix of data contents from different threads on
> > a given bsrange.
Hi Darrick, thanks for the reviews.
>
> Err, how does this differ from the next patch? It looks like this one
> creates one IO thread, whereas the next one creates 8? If so, what does
> this test add over ext4/062?
Yes, these 2 tests are similar. However, this one uses fallocate=native +
_scratch_mkfs_ext4 to test whether atomic writes to a preallocated file
from multiple threads work correctly.
The other one uses fallocate=truncate + _scratch_mkfs_sized 360MB +
'multiple jobs each writing to a different file' to ensure we are
extensively stressing the allocation logic in low space scenarios.
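A rough sketch of how the two layouts described above differ at the fio
job level. The helper name emit_layout_opts and the shared/per-job labels
are made up for illustration, and the per-job variant assumes fio's default
behaviour of naming one file per job when only directory= is given; the
actual ext4/062 job file may well be written differently.

```shell
#!/bin/bash
# Hypothetical helper: print the fio options that differ between the two
# test layouts discussed above. $1 is the layout, $2 is the mount point.
emit_layout_opts() {
	local layout=$1 mnt=$2

	case "$layout" in
	shared)
		# ext4/061 style: every job writes to one preallocated file
		echo "fallocate=native"
		echo "filename=$mnt/test-file"
		;;
	per-job)
		# ext4/062 style (assumed): no preallocation, and fio names
		# one file per job inside the directory
		echo "fallocate=truncate"
		echo "directory=$mnt"
		;;
	*)
		echo "unknown layout: $layout" >&2
		return 1
		;;
	esac
}

emit_layout_opts shared /mnt/scratch
emit_layout_opts per-job /mnt/scratch
```

The remaining options (direct=1, atomic=1, verify=crc32c and so on) would
be common to both job files.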
>
> (and now that I look at it, ext4/062 says "FS QA Test 061"...)
Ahh, I missed it somehow. Thanks, I'll fix it.
Regards,
ojaswin
>
> --D
>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> > tests/ext4/061 | 130 +++++++++++++++++++++++++++++++++++++++++++++
> > tests/ext4/061.out | 2 +
> > 2 files changed, 132 insertions(+)
> > create mode 100755 tests/ext4/061
> > create mode 100644 tests/ext4/061.out
> >
> > diff --git a/tests/ext4/061 b/tests/ext4/061
> > new file mode 100755
> > index 00000000..a0e49249
> > --- /dev/null
> > +++ b/tests/ext4/061
> > @@ -0,0 +1,130 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 061
> > +#
> > +# Brute force all possible blocksize & clustersize combinations on a bigalloc
> > +# filesystem for stressing atomic write using fio data crc verifier. We run
> > +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> > +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> > +# we should never see the mix of data contents from different threads for any
> > +# given fio blocksize.
> > +#
> > +
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto rw stress atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_aiodio
> > +
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > +SIZE=$((100*1024*1024))
> > +fiobsize=4096
> > +
> > +# Calculate fsblocksize as per bdev atomic write units.
> > +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> > +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> > +fsblocksize=$(_max 4096 "$bdev_awu_min")
> > +
> > +function create_fio_configs()
> > +{
> > + create_fio_aw_config
> > + create_fio_verify_config
> > +}
> > +
> > +function create_fio_verify_config()
> > +{
> > +cat >$fio_verify_config <<EOF
> > + [aio-dio-aw-verify]
> > + direct=1
> > + ioengine=libaio
> > + rw=randwrite
> > + bs=$fiobsize
> > + fallocate=native
> > + filename=$SCRATCH_MNT/test-file
> > + size=$SIZE
> > + iodepth=$FIO_LOAD
> > + numjobs=$FIO_LOAD
> > + atomic=1
> > + group_reporting=1
> > +
> > + verify_only=1
> > + verify_state_save=0
> > + verify=crc32c
> > + verify_fatal=1
> > + verify_write_sequence=0
> > +EOF
> > +}
> > +
> > +function create_fio_aw_config()
> > +{
> > +cat >$fio_aw_config <<EOF
> > + [aio-dio-aw]
> > + direct=1
> > + ioengine=libaio
> > + rw=randwrite
> > + bs=$fiobsize
> > + fallocate=native
> > + filename=$SCRATCH_MNT/test-file
> > + size=$SIZE
> > + iodepth=$FIO_LOAD
> > + numjobs=$FIO_LOAD
> > + group_reporting=1
> > + atomic=1
> > +
> > + verify_state_save=0
> > + verify=crc32c
> > + do_verify=0
> > +
> > +EOF
> > +}
> > +
> > +# Let's create a sample fio config to check whether fio supports all options.
> > +fio_aw_config=$tmp.aw.fio
> > +fio_verify_config=$tmp.verify.fio
> > +fio_out=$tmp.fio.out
> > +
> > +create_fio_configs
> > +_require_fio $fio_aw_config
> > +
> > +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> > + # cluster sizes above 16 x blocksize are experimental so avoid them
> > + # Also, cap cluster size at 128kb to keep it reasonable for large
> > + # block sizes
> > + fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> > +
> > + for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> > + for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> > + MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> > + _scratch_mkfs_ext4 >> $seqres.full 2>&1 || continue
> > + if _try_scratch_mount >> $seqres.full 2>&1; then
> > + echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> > +
> > + touch $SCRATCH_MNT/f1
> > + create_fio_configs
> > +
> > + cat $fio_aw_config >> $seqres.full
> > + echo >> $seqres.full
> > + cat $fio_verify_config >> $seqres.full
> > +
> > + $FIO_PROG $fio_aw_config >> $seqres.full
> > + ret1=$?
> > +
> > + $FIO_PROG $fio_verify_config >> $seqres.full
> > + ret2=$?
> > +
> > + _scratch_unmount
> > +
> > + [[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> > + fi
> > + done
> > + done
> > +done
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> > new file mode 100644
> > index 00000000..273be9e0
> > --- /dev/null
> > +++ b/tests/ext4/061.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 061
> > +Silence is golden
> > --
> > 2.49.0
> >
> >
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes
2025-07-29 19:41 ` Darrick J. Wong
@ 2025-07-30 14:06 ` Ojaswin Mujoo
0 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-30 14:06 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Tue, Jul 29, 2025 at 12:41:54PM -0700, Darrick J. Wong wrote:
> On Sat, Jul 12, 2025 at 07:42:54PM +0530, Ojaswin Mujoo wrote:
> > In ext4, even if an allocated range is physically and logically
> > contiguous, it can still be split into 2 extents. This is because ext4
> > does not merge extents across leaf nodes. This is an issue for atomic
> > writes since even for a continuous extent the map block could (in rare
> > cases) return a shorter map, hence tearing the write. This test creates
> > such a file and ensures that the atomic write handles this case
> > correctly
> >
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> > tests/ext4/063 | 125 +++++++++++++++++++++++++++++++++++++++++++++
> > tests/ext4/063.out | 2 +
> > 2 files changed, 127 insertions(+)
> > create mode 100755 tests/ext4/063
> > create mode 100644 tests/ext4/063.out
> >
> > diff --git a/tests/ext4/063 b/tests/ext4/063
> > new file mode 100755
> > index 00000000..25b5693d
> > --- /dev/null
> > +++ b/tests/ext4/063
> > @@ -0,0 +1,125 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# In ext4, even if an allocated range is physically and logically contiguous,
> > +# it can still be split into 2 extents. This is because ext4 does not merge
> > +# extents across leaf nodes. This is an issue for atomic writes since even for
> > +# a continuous extent the map block could (in rare cases) return a shorter map,
> > +# hence tearing the write. This test creates such a file and ensures that the
> > +# atomic write handles this case correctly
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_command "$DEBUGFS_PROG" debugfs
> > +
> > +prep() {
> > + local bs=`_get_block_size $SCRATCH_MNT`
> > + local ex_hdr_bytes=12
> > + local ex_entry_bytes=12
> > + local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> > +
> > + # fill the extent tree leaf with bs-length extents at alternate offsets. For example,
> > + # for 4k bs the tree should look as follows
> > + #
> > + # +---------+---------+
> > + # | index 1 | index 2 |
> > + # +-----+---+-----+---+
> > + # +--------+ +-------+
> > + # | |
> > + # +----------+--------------+ +-----+-----+
> > + # | ex 1 | ex 2 |... | ex n | | ex n + 1 |
> > + # +-------------------------+ +-----------+
> > + # 0 2 680 682
> > + for i in $(seq 0 $entries_per_blk)
> > + do
> > + $XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> > + done
> > + sync $testfile
> > +
> > + echo >> $seqres.full
> > + echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> > + echo "...">> $seqres.full
> > + $DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > + # Now try to insert a new extent ex(new) between ex(n) and ex(n+1). Since
> > + # this is a new FS the allocator would find continuous blocks such that
> > + # ex(n) ex(new) ex(n+1) are physically (and logically) contiguous. However,
> > + # since we don't merge extents across leaves we will end up with a tree as:
> > + #
> > + # +---------+---------+
> > + # | index 1 | index 2 |
> > + # +-----+---+-----+---+
> > + # +--------+ +-------+
> > + # | |
> > + # +----------+--------------+ +-----+-----+
> > + # | ex 1 | ex 2 |... | ex n | | ex merged |
> > + # +-------------------------+ +-----------+
> > + # 0 2 680 681 682 684
>
> Where did 684 come from? It's not in the 'before' diagram. Did
> "ex n + 1" previously map 682-684, and now it maps 681-684?
Okay, so the 684 is a bit misleading, as there is nothing there.
The extent at 682 is len=1 and spans [682-683). Now that you pointed it
out, I think the 0..2...680 logical offsets are confusing, since they
are actually ext4_extent.ee_block values but the diagram makes it seem
like they are indexes into the array of extents. Let me see if I can
make it better.
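To make the ee_block point concrete, here is a small sketch that only
redoes the arithmetic from prep() for a given block size. The function
name list_ee_blocks is made up for illustration; the 12-byte header and
entry sizes are the ones the test itself uses.

```shell
#!/bin/bash
# Hypothetical helper: print the logical start block (ee_block) of each
# single-block extent that the prep() loop writes, i.e. i * 2 for
# i = 0..entries_per_blk.
list_ee_blocks() {
	local bs=$1
	local ex_hdr_bytes=12
	local ex_entry_bytes=12
	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
	local i

	for ((i = 0; i <= entries_per_blk; i++)); do
		echo $((i * 2))
	done
}

# First three and the last ee_block for bs=4096
list_ee_blocks 4096 | head -n 3
list_ee_blocks 4096 | tail -n 1
```

Each printed value is a logical offset in filesystem blocks, not an index
into the extent array, and each extent covers exactly one block.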
Thanks for the review!
ojaswin
>
> The rest looks ok though.
>
> --D
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-29 14:45 ` Darrick J. Wong
@ 2025-07-31 4:18 ` Ojaswin Mujoo
2025-07-31 7:58 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-07-31 4:18 UTC (permalink / raw)
To: Darrick J. Wong
Cc: John Garry, Zorro Lang, fstests, Ritesh Harjani, tytso, linux-xfs,
linux-kernel, linux-ext4
On Tue, Jul 29, 2025 at 07:45:26AM -0700, Darrick J. Wong wrote:
> On Tue, Jul 29, 2025 at 11:41:39AM +0530, Ojaswin Mujoo wrote:
> > On Mon, Jul 28, 2025 at 03:00:40PM +0100, John Garry wrote:
> > > On 28/07/2025 14:35, Ojaswin Mujoo wrote:
> > > > > We guarantee that the write is committed all-or-nothing, but do rely on
> > > > > userspace not issuing racing atomic writes or racing regular writes.
> > > > >
> > > > > I can easily change this, as I mentioned, but I am not convinced that it is
> > > > > a must.
> > > > Purely from a design point of view, I feel we are breaking atomicity and
> > > > hence we should serialize or just stop userspace from doing this (which
> > > > is a bit extreme).
> > >
> > > If you check the man page description of RWF_ATOMIC, it does not mention
> > > serialization. The user should conclude that usual direct IO rules apply,
> > > i.e. userspace is responsible for serializing.
> >
> > My mental model of serialization in context of atomic writes is that if
> > user does 64k atomic write A followed by a parallel overlapping 64kb
> > atomic write B then the user might see complete A or complete B (we
> > don't guarantee) but not a mix of A and B.
>
> Heh, here comes that feature naming confusion again. This is my
> definition:
>
> RWF_ATOMIC means the system won't introduce new tearing when persisting
> file writes. The application is allowed to introduce tearing by writing
> to overlapping ranges at the same time. The system does not isolate
> overlapping reads from writes.
>
> --D
Hey Darrick, John,
So seems like my expectations of RWF_ATOMIC were a bit misplaced. I
understand now that we don't really guarantee much when there are
overlapping parallel writes going on. Even 2 overlapping RWF_ATOMIC
writes can get torn. Seems like there are some edge cases where this
might happen with hardware atomic writes as well.
In that sense, if this fio test is doing overlapped atomic io and
expecting them to be untorn, I don't think that's the correct way to
test it since that is beyond what RWF_ATOMIC guarantees.
I'll try to check if we can modify the tests to write on non-overlapping
ranges in a file.
Thanks and sorry for the confusion :)
Ojaswin
>
> > >
> > > >
> > > > I know userspace should ideally not do overwriting atomic writes but if
> > > > it is something we are allowing (which we do) then it is
> > > > kernel's responsibility to ensure atomicity. Sure we can penalize them
> > > > by serializing the writes but not by tearing it.
> > > >
> > > > With that reasoning, I don't think the test should accommodate this
> > > > particular scenario.
> > >
> > > I can send a patch to the community for xfs (to provide serialization), like
> > > I showed earlier, to get opinion.
> >
> > Thanks, that would be great.
> >
> > Regards,
> > John
> > >
> > > Thanks,
> > > John
> > >
> >
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-31 4:18 ` Ojaswin Mujoo
@ 2025-07-31 7:58 ` John Garry
2025-08-01 6:41 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-07-31 7:58 UTC (permalink / raw)
To: Ojaswin Mujoo, Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, tytso, linux-xfs,
linux-kernel, linux-ext4
On 31/07/2025 05:18, Ojaswin Mujoo wrote:
>> Heh, here comes that feature naming confusion again. This is my
>> definition:
>>
>> RWF_ATOMIC means the system won't introduce new tearing when persisting
>> file writes. The application is allowed to introduce tearing by writing
>> to overlapping ranges at the same time. The system does not isolate
>> overlapping reads from writes.
>>
>> --D
> Hey Darrick, John,
>
> So seems like my expectations of RWF_ATOMIC were a bit misplaced. I
> understand now that we don't really guarantee much when there are
> overlapping parallel writes going on. Even 2 overlapping RWF_ATOMIC
> writes can get torn. Seems like there are some edge cases where this
> might happen with hardware atomic writes as well.
>
> In that sense, if this fio test is doing overlapped atomic io and
> expecting them to be untorn, I don't think that's the correct way to
> test it since that is beyond what RWF_ATOMIC guarantees.
I think that this test has value, but can only be used for ext4 or any
FS which relies only on HW atomics.
The value is that we prove that we don't get any bios being split in the
storage stack, which is essential for HW atomics support.
Both NVMe and SCSI guarantee serialization of atomic writes.
>
> I'll try to check if we can modify the tests to write on non-overlapping
> ranges in a file.
JFYI, for testing SW-based atomic writes on XFS, I do something like
this. I have multiple threads each writing to separate regions of a file
or writing to separate files. I use this for power-fail testing with my
RPI. Indeed, I have also been using this sort of test in qemu for
shutting down the VM when fio is running - I would like to automate
this, but I am not sure how yet.
Please let me know if you want further info on the fio script.
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-07-31 7:58 ` John Garry
@ 2025-08-01 6:41 ` Ojaswin Mujoo
2025-08-01 8:23 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-08-01 6:41 UTC (permalink / raw)
To: John Garry
Cc: Darrick J. Wong, Zorro Lang, fstests, Ritesh Harjani, tytso,
linux-xfs, linux-kernel, linux-ext4
On Thu, Jul 31, 2025 at 08:58:59AM +0100, John Garry wrote:
> On 31/07/2025 05:18, Ojaswin Mujoo wrote:
> > > Heh, here comes that feature naming confusion again. This is my
> > > definition:
> > >
> > > RWF_ATOMIC means the system won't introduce new tearing when persisting
> > > file writes. The application is allowed to introduce tearing by writing
> > > to overlapping ranges at the same time. The system does not isolate
> > > overlapping reads from writes.
> > >
> > > --D
> > Hey Darrick, John,
> >
> > So seems like my expectations of RWF_ATOMIC were a bit misplaced. I
> > understand now that we don't really guarantee much when there are
> > overlapping parallel writes going on. Even 2 overlapping RWF_ATOMIC
> > writes can get torn. Seems like there are some edge cases where this
> > might happen with hardware atomic writes as well.
> >
> > In that sense, if this fio test is doing overlapped atomic io and
> > expecting them to be untorn, I don't think that's the correct way to
> > test it since that is beyond what RWF_ATOMIC guarantees.
>
> I think that this test has value, but can only be used for ext4 or any FS
> which relies only on HW atomics.
>
> The value is that we prove that we don't get any bios being split in the
> storage stack, which is essential for HW atomics support.
>
> Both NVMe and SCSI guarantee serialization of atomic writes.
Hi John,
Got it, I think I can make this test work for ext4 only but then it might
be more appropriate to run the fio tests directly on atomic blkdev and
skip the FS, since we anyways want to focus on the storage stack.
>
> >
> > I'll try to check if we can modify the tests to write on non-overlapping
> > ranges in a file.
>
> JFYI, for testing SW-based atomic writes on XFS, I do something like this. I
> have multiple threads each writing to separate regions of a file or writing
> to separate files. I use this for power-fail testing with my RPI. Indeed, I
> have also been using this sort of test in qemu for shutting down the VM
> when fio is running - I would like to automate this, but I am not sure how
> yet.
>
> Please let me know if you want further info on the fio script.
Got it, thanks for the insights. I was thinking of something similar now
where I can modify the fio files of this test to write on
non-overlapping ranges in the same file. The only doubt I have right now
is that when I have e.g. numjobs=10 filesize=1G, how do I ensure each job
writes to its own separate range and does not overlap with the others.
I saw the offset_increment= fio option which might help, yet to try it
out though. If you know any better way please do share.
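One way the numjobs=10 filesize=1G case could be laid out, as a minimal
sketch. It assumes fio's documented offset_increment behaviour (each job
starts at offset + offset_increment * job index) and that size= applies
per job; the helper name nonoverlap_opts is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: split a file of $1 bytes among $2 jobs so that each
# job gets its own region whose length is a multiple of the I/O size $3.
# Prints the fio options describing that layout.
nonoverlap_opts() {
	local filesize=$1 numjobs=$2 bs=$3
	# Round the per-job share down to a multiple of bs
	local per_job=$(( filesize / numjobs / bs * bs ))

	if (( per_job == 0 )); then
		echo "file too small for $numjobs jobs at bs=$bs" >&2
		return 1
	fi

	echo "--numjobs=$numjobs --offset_increment=$per_job --size=$per_job"
}

# 10 jobs over a 1G file using 64k writes
nonoverlap_opts $((1024 * 1024 * 1024)) 10 $((64 * 1024))
```

Rounding down to the I/O size keeps every job's region aligned for the
atomic writes, at the cost of leaving a small unused tail at the end of
the file.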
Thanks,
Ojaswin
>
> Thanks,
> John
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-08-01 6:41 ` Ojaswin Mujoo
@ 2025-08-01 8:23 ` John Garry
2025-08-02 6:49 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-08-01 8:23 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Darrick J. Wong, Zorro Lang, fstests, Ritesh Harjani, tytso,
linux-xfs, linux-kernel, linux-ext4
On 01/08/2025 07:41, Ojaswin Mujoo wrote:
> Got it, I think I can make this test work for ext4 only but then it might
> be more appropriate to run the fio tests directly on atomic blkdev and
> skip the FS, since we anyways want to focus on the storage stack.
>
Testing on ext4 will also prove that the FS and iomap behave correctly
in that they generate a single bio per atomic write (as well as testing
the block stack and below).
>>> I'll try to check if we can modify the tests to write on non-overlapping
>>> ranges in a file.
>> JFYI, for testing SW-based atomic writes on XFS, I do something like this. I
>> have multiple threads each writing to separate regions of a file or writing
>> to separate files. I use this for power-fail testing with my RPI. Indeed, I
>> have also been using this sort of test in qemu for shutting down the VM
>> when fio is running - I would like to automate this, but I am not sure how
>> yet.
>>
>> Please let me know if you want further info on the fio script.
> Got it, thanks for the insights. I was thinking of something similar now
> where I can modify the fio files of this test to write on
> non-overlapping ranges in the same file. The only doubt I have right now
> is that when I have e.g. numjobs=10 filesize=1G, how do I ensure each job
> writes to its own separate range and does not overlap with the others.
>
> I saw the offset_increment= fio option which might help, yet to try it
> out though. If you know any better way please do share.
Yeah, so I use something like:
--numjobs=2 --offset_align=0 --offset_increment=1M --size=1M
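For reference, a small sketch of the byte ranges these options should
produce. It assumes job k starts at k * offset_increment (with the base
offset left at 0) and writes size bytes; the helper name job_ranges is
made up for illustration and does not call fio.

```shell
#!/bin/bash
# Hypothetical helper: print the byte range each job would cover, assuming
# job k starts at k * offset_increment and writes "size" bytes.
# Returns non-zero if neighbouring ranges would overlap.
job_ranges() {
	local numjobs=$1 increment=$2 size=$3
	local k start end

	for ((k = 0; k < numjobs; k++)); do
		start=$((k * increment))
		end=$((start + size))
		echo "job $k: [$start, $end)"
	done

	# Ranges stay disjoint as long as no job writes past the next job's start
	(( size <= increment ))
}

# The options above: numjobs=2, offset_increment=1M, size=1M
job_ranges 2 $((1024 * 1024)) $((1024 * 1024))
```

With size equal to offset_increment the regions are back to back, so the
jobs never touch the same range.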
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-08-01 8:23 ` John Garry
@ 2025-08-02 6:49 ` Ojaswin Mujoo
2025-08-04 7:12 ` John Garry
0 siblings, 1 reply; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-08-02 6:49 UTC (permalink / raw)
To: John Garry
Cc: Darrick J. Wong, Zorro Lang, fstests, Ritesh Harjani, tytso,
linux-xfs, linux-kernel, linux-ext4
On Fri, Aug 01, 2025 at 09:23:46AM +0100, John Garry wrote:
> On 01/08/2025 07:41, Ojaswin Mujoo wrote:
> > Got it, I think I can make this test work for ext4 only but then it might
> > be more appropriate to run the fio tests directly on atomic blkdev and
> > skip the FS, since we anyways want to focus on the storage stack.
> >
>
> testing on ext4 will prove also that the FS and iomap behave correctly in
> that they generate a single bio per atomic write (as well as testing the
> block stack and below).
Okay, I think we are already testing those in the ext4/061 ext4/062
tests of this patchset. Just thought blkdev test might be useful to keep
in generic. Do you see a value in that or shall I just drop the generic
overlapping write tests?
Also, just for the record, ext4 passes the fio tests ONLY because we use
the same io size for all threads. If we happen to start overlapping
RWF_ATOMIC writes with different sizes, they can get torn due to racing
unwritten conversion.
>
> > > > I'll try to check if we can modify the tests to write on non-overlapping
> > > > ranges in a file.
> > > JFYI, for testing SW-based atomic writes on XFS, I do something like this. I
> > > have multiple threads each writing to separate regions of a file or writing
> > > to separate files. I use this for power-fail testing with my RPI. Indeed, I
> > > have also been using this sort of test in qemu for shutting down the VM
> > > when fio is running - I would like to automate this, but I am not sure how
> > > yet.
> > >
> > > Please let me know if you want further info on the fio script.
> > Got it, thanks for the insights. I was thinking of something similar now
> > where I can modify the fio files of this test to write on
> > non-overlapping ranges in the same file. The only doubt I have right now
> > is that when I have e.g. numjobs=10 filesize=1G, how do I ensure each job
> > writes to its own separate range and does not overlap with the others.
> >
> > I saw the offset_increment= fio option which might help, yet to try it
> > out though. If you know any better way please do share.
>
> Yeah, so I use something like:
> --numjobs=2 --offset_align=0 --offset_increment=1M --size=1M
Got it, thanks!
ojaswin
>
> Thanks,
> John
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-08-02 6:49 ` Ojaswin Mujoo
@ 2025-08-04 7:12 ` John Garry
2025-08-08 6:00 ` Ojaswin Mujoo
0 siblings, 1 reply; 60+ messages in thread
From: John Garry @ 2025-08-04 7:12 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: Darrick J. Wong, Zorro Lang, fstests, Ritesh Harjani, tytso,
linux-xfs, linux-kernel, linux-ext4
On 02/08/2025 07:49, Ojaswin Mujoo wrote:
> On Fri, Aug 01, 2025 at 09:23:46AM +0100, John Garry wrote:
>> On 01/08/2025 07:41, Ojaswin Mujoo wrote:
>>> Got it, I think I can make this test work for ext4 only but then it might
>>> be more appropriate to run the fio tests directly on atomic blkdev and
>>> skip the FS, since we anyways want to focus on the storage stack.
>>>
>> testing on ext4 will prove also that the FS and iomap behave correctly in
>> that they generate a single bio per atomic write (as well as testing the
>> block stack and below).
> Okay, I think we are already testing those in the ext4/061 ext4/062
> tests of this patchset. Just thought blkdev test might be useful to keep
> in generic. Do you see a value in that or shall I just drop the generic
> overlapping write tests?
If you want to just test fio on the blkdev, then I think that is fine.
Indeed, maybe such tests are useful in blktests also.
>
> Also, just for the record, ext4 passes the fio tests ONLY because we use
> the same io size for all threads. If we happen to start overlapping
> RWF_ATOMIC writes with different sizes, they can get torn due to racing
> unwritten conversion.
I'd keep the same io size for all threads in the tests.
Thanks,
John
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier
2025-08-04 7:12 ` John Garry
@ 2025-08-08 6:00 ` Ojaswin Mujoo
0 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-08-08 6:00 UTC (permalink / raw)
To: John Garry
Cc: Darrick J. Wong, Zorro Lang, fstests, Ritesh Harjani, tytso,
linux-xfs, linux-kernel, linux-ext4
On Mon, Aug 04, 2025 at 08:12:00AM +0100, John Garry wrote:
> On 02/08/2025 07:49, Ojaswin Mujoo wrote:
> > On Fri, Aug 01, 2025 at 09:23:46AM +0100, John Garry wrote:
> > > On 01/08/2025 07:41, Ojaswin Mujoo wrote:
> > > > Got it, I think I can make this test work for ext4 only but then it might
> > > > be more appropriate to run the fio tests directly on atomic blkdev and
> > > > skip the FS, since we anyways want to focus on the storage stack.
> > > >
> > > testing on ext4 will prove also that the FS and iomap behave correctly in
> > > that they generate a single bio per atomic write (as well as testing the
> > > block stack and below).
> > Okay, I think we are already testing those in the ext4/061 ext4/062
> > tests of this patchset. Just thought blkdev test might be useful to keep
> > in generic. Do you see a value in that or shall I just drop the generic
> > overlapping write tests?
>
> If you want to just test fio on the blkdev, then I think that is fine.
> Indeed, maybe such tests are useful in blktests also.
Okay, I think it is better suited for blktests, so I'll add it there.
>
> >
> > Also, just for the record, ext4 passes the fio tests ONLY because we use
> > the same io size for all threads. If we start overlapping RWF_ATOMIC
> > writes with different sizes, they can get torn due to racing unwritten
> > conversion.
>
> I'd keep the same io size for all threads in the tests.
Yep
Thanks,
Ojaswin
>
> Thanks,
> John
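
For illustration only, the "same io size for all threads" constraint
discussed above could be expressed as a generated fio job file. This is a
rough sketch and not the job file from the patch: the helper name
gen_atomic_fio_job, the file path and the sizes are hypothetical, and
atomic=1 assumes a fio build recent enough to issue RWF_ATOMIC writes.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch series): print a fio job file
# in which every thread writes to the same file with one shared block size,
# so overlapping atomic writes never differ in length.
gen_atomic_fio_job() {
	local file="$1" bs="$2" jobs="$3" size="$4"

	cat <<EOF
[global]
filename=$file
size=$size
direct=1
atomic=1
rw=randwrite
bs=$bs
numjobs=$jobs
group_reporting=1

[atomic-writers]
EOF
}

# Example: 8 threads, all using 16k atomic writes on a placeholder path.
gen_atomic_fio_job /mnt/scratch/testfile 16k 8 64m
```

Because bs= is set once in [global], no thread can end up with a different
write size; a per-job bs= override is exactly what the thread advises
against.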
* Re: [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests
2025-07-23 14:54 ` Darrick J. Wong
@ 2025-08-10 9:41 ` Ojaswin Mujoo
0 siblings, 0 replies; 60+ messages in thread
From: Ojaswin Mujoo @ 2025-08-10 9:41 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Zorro Lang, fstests, Ritesh Harjani, john.g.garry, tytso,
linux-xfs, linux-kernel, linux-ext4
On Wed, Jul 23, 2025 at 07:54:23AM -0700, Darrick J. Wong wrote:
> On Wed, Jul 23, 2025 at 07:23:58PM +0530, Ojaswin Mujoo wrote:
> > On Thu, Jul 17, 2025 at 09:35:10AM -0700, Darrick J. Wong wrote:
> >
> > <snip>
> >
> > > > +verify_atomic_write() {
> > > > + if [[ "$1" == "shutdown" ]]
> > > > + then
> > > > + local do_shutdown=1
> > > > + fi
> > > > +
> > > > + test $bytes_written -eq $awu_max || _fail "atomic write len=$awu_max assertion failed"
> > > > +
> > > > + if [[ $do_shutdown -eq "1" ]]
> > > > + then
> > > > + echo "Shutting down filesystem" >> $seqres.full
> > > > + _scratch_shutdown >> $seqres.full
> > > > + _scratch_cycle_mount >>$seqres.full 2>&1 || _fail "remount failed for Test-3"
> > > > + fi
> > > > +
> > > > + check_data_integrity
> > > > +}
> > > > +
> > > > +mixed_mapping_test() {
> > > > + prep_mixed_mapping
> > > > +
> > > > + echo "+ + Performing O_DSYNC atomic write from 0 to $awu_max" >> $seqres.full
> > > > + bytes_written=$($XFS_IO_PROG -dc "pwrite -DA -V1 -b $awu_max 0 $awu_max" $testfile | \
> > > > + grep wrote | awk -F'[/ ]' '{print $2}')
> > > > +
> > > > + verify_atomic_write $1
> > >
> > > The shutdown happens after the synchronous write completes? If so, then
> > > what part of recovery is this testing?
> > >
> > > --D
> >
> > Right, it is mostly inspired by [1] where sometimes isize update could
> > be lost after dio completion. Although this might not exactly be
> > affected by atomic writes, we added it here out of caution.
> >
> > [1] https://lore.kernel.org/fstests/434beffaf18d39f898518ea9eb1cea4548e77c3a.1695383715.git.ritesh.list@gmail.com/
>
> Ah, so we're racing with background log flush then. Would it improve
> the potential failure detection rate to call shutdown right after the
> pwrite, e.g.
>
> $XFS_IO_PROG -dxc "pwrite -DA..." -c 'shutdown' $testfile
>
> It can take a few milliseconds to walk down the bash functions and
> fork/exec another child process.
Sounds good, I can make that change.
Thanks!
>
> --D
>
> > > > +}
> > > > +
> >
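
As a rough sketch of the suggestion above (chaining the shutdown into the
same xfs_io invocation as the atomic pwrite), the snippet below shows how
the byte count is parsed and how the argument list could be built. The
helper names parse_bytes_written and pwrite_then_shutdown_args are
hypothetical and the file path is a placeholder; the pwrite flags and the
grep/awk pipeline are taken from the quoted patch. Nothing here runs
xfs_io itself.

```shell
#!/bin/bash
# Extract the completed byte count from xfs_io pwrite output.
# Splitting "wrote N/M bytes at offset O" on '/' or ' ' makes N field 2.
parse_bytes_written() {
	grep wrote | awk -F'[/ ]' '{print $2}'
}

# Hypothetical helper: print, one per line, the arguments for a single
# xfs_io call that issues the O_DSYNC atomic write and then shuts the
# filesystem down without returning to bash in between.
# -d opens with O_DIRECT, -x enables expert commands such as shutdown.
pwrite_then_shutdown_args() {
	local len="$1" file="$2"

	printf '%s\n' -dx -c "pwrite -DA -V1 -b $len 0 $len" -c shutdown "$file"
}

# Sample line in the format xfs_io reports for a completed 64 KiB write.
sample='wrote 65536/65536 bytes at offset 0'
echo "$sample" | parse_bytes_written
```

In a test, the printed arguments would be passed to $XFS_IO_PROG and its
output piped through parse_bytes_written, which removes the extra
fork/exec between the write and the shutdown.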
Thread overview: 60+ messages
2025-07-12 14:12 [PATCH v3 00/13] Add more tests for multi fs block atomic writes Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 01/13] common/rc: Add _min() and _max() helpers Ojaswin Mujoo
2025-07-17 15:02 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 02/13] common/rc: Fix fsx for ext4 with bigalloc Ojaswin Mujoo
2025-07-17 16:11 ` Darrick J. Wong
2025-07-22 9:53 ` Ojaswin Mujoo
2025-07-23 14:50 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 03/13] common/rc: Add a helper to run fsx on a given file Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 04/13] ltp/fsx.c: Add atomic writes support to fsx Ojaswin Mujoo
2025-07-17 16:17 ` Darrick J. Wong
2025-07-22 9:59 ` Ojaswin Mujoo
2025-07-23 14:57 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 05/13] generic/1226: Add atomic write test using fio crc check verifier Ojaswin Mujoo
2025-07-17 13:00 ` John Garry
2025-07-17 13:52 ` Ojaswin Mujoo
2025-07-17 14:06 ` John Garry
2025-07-22 8:47 ` Ojaswin Mujoo
2025-07-23 11:33 ` John Garry
2025-07-23 13:51 ` Ojaswin Mujoo
2025-07-23 16:25 ` John Garry
2025-07-25 6:27 ` Ojaswin Mujoo
2025-07-25 8:14 ` John Garry
2025-07-28 6:43 ` Ojaswin Mujoo
2025-07-28 9:09 ` John Garry
2025-07-28 13:35 ` Ojaswin Mujoo
2025-07-28 14:00 ` John Garry
2025-07-29 6:11 ` Ojaswin Mujoo
2025-07-29 14:45 ` Darrick J. Wong
2025-07-31 4:18 ` Ojaswin Mujoo
2025-07-31 7:58 ` John Garry
2025-08-01 6:41 ` Ojaswin Mujoo
2025-08-01 8:23 ` John Garry
2025-08-02 6:49 ` Ojaswin Mujoo
2025-08-04 7:12 ` John Garry
2025-08-08 6:00 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 06/13] generic/1227: Add atomic write test using fio verify on file mixed mappings Ojaswin Mujoo
2025-07-17 16:32 ` Darrick J. Wong
2025-07-28 8:58 ` Zorro Lang
2025-07-28 9:27 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 07/13] generic/1228: Add atomic write multi-fsblock O_[D]SYNC tests Ojaswin Mujoo
2025-07-17 16:35 ` Darrick J. Wong
2025-07-23 13:53 ` Ojaswin Mujoo
2025-07-23 14:54 ` Darrick J. Wong
2025-08-10 9:41 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 08/13] generic/1229: Stress fsx with atomic writes enabled Ojaswin Mujoo
2025-07-17 16:22 ` Darrick J. Wong
2025-07-23 6:30 ` Ojaswin Mujoo
2025-07-23 14:56 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 09/13] generic/1230: Add sudden shutdown tests for multi block atomic writes Ojaswin Mujoo
2025-07-29 19:49 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 10/13] ext4/061: Atomic writes stress test for bigalloc using fio crc verifier Ojaswin Mujoo
2025-07-29 19:47 ` Darrick J. Wong
2025-07-30 13:56 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 11/13] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files Ojaswin Mujoo
2025-07-29 19:44 ` Darrick J. Wong
2025-07-12 14:12 ` [PATCH v3 12/13] ext4/063: Atomic write test for extent split across leaf nodes Ojaswin Mujoo
2025-07-29 19:41 ` Darrick J. Wong
2025-07-30 14:06 ` Ojaswin Mujoo
2025-07-12 14:12 ` [PATCH v3 13/13] ext4/064: Add atomic write tests for journal credit calculation Ojaswin Mujoo
2025-07-29 19:36 ` Darrick J. Wong