From: Dave Chinner <david@fromorbit.com>
To: fstests@vger.kernel.org
Subject: [PATCH 15/40] fstests: mark tests that are unreliable when run in parallel
Date: Wed, 27 Nov 2024 15:51:45 +1100
Message-ID: <20241127045403.3665299-16-david@fromorbit.com>
In-Reply-To: <20241127045403.3665299-1-david@fromorbit.com>

From: Dave Chinner <dchinner@redhat.com>

Add a group named "unreliable_in_parallel" to mark tests that
do not give reliable results when multiple tests are run in
parallel. Generally this happens with tests that are reliant on
caching in some way, such as generating specific file layouts using
buffered IO or expecting inodes to be cached in memory. These are
perturbed by other tests running sync(), generating memory pressure,
dropping caches, etc.

Whether these tests pass or fail therefore depends entirely on which
other tests are running at the same time, so they fail randomly when
nothing has actually gone wrong. That makes them unreliable as
regression tests in a parallel run, so we add them to the
"unreliable_in_parallel" group, which a parallel check run can
exclude.

As tests are updated to be robust against external interference,
they can be removed from the unreliable_in_parallel group.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
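
Not part of the commit: a minimal sketch of how a runner might drop the
tests tagged by this patch from a parallel run. filter_parallel_safe is a
hypothetical helper, not something fstests provides; it only assumes that
each test declares its groups on a single _begin_fstest line, as the hunks
below do.

```shell
#!/bin/bash
# Sketch only: filter_parallel_safe is a hypothetical helper, not part of
# fstests. It reads test file paths on stdin, one per line, and prints the
# ones whose _begin_fstest line does NOT list the given group.

filter_parallel_safe() {
    local group="${1:-unreliable_in_parallel}"
    local t

    while IFS= read -r t; do
        # Match the group as a whole word anywhere after _begin_fstest.
        if grep -Eq "^_begin_fstest( .*)? ${group}( .*)?\$" "$t"; then
            continue
        fi
        printf '%s\n' "$t"
    done
}

# Example (run from an fstests checkout):
#   printf '%s\n' tests/xfs/[0-9]* | filter_parallel_safe
```

For ordinary runs no helper should be needed: check's existing -x option
excludes whole groups, so something like
"./check -g auto -x unreliable_in_parallel" is the expected way to skip
these tests.
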
 doc/group-names.txt | 1 +
 tests/generic/336   | 7 ++++++-
 tests/generic/561   | 8 +++++++-
 tests/xfs/177       | 8 ++++++--
 tests/xfs/232       | 6 +++++-
 tests/xfs/237       | 8 +++++++-
 tests/xfs/243       | 7 +++++--
 tests/xfs/300       | 8 ++++++--
 tests/xfs/440       | 6 +++++-
 tests/xfs/527       | 5 ++++-
 tests/xfs/631       | 7 ++++++-
 tests/xfs/802       | 7 ++++++-
 12 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/doc/group-names.txt b/doc/group-names.txt
index ed886caac..f5bf79a56 100644
--- a/doc/group-names.txt
+++ b/doc/group-names.txt
@@ -138,6 +138,7 @@ trim			FITRIM ioctl
 udf			UDF functionality tests
 union			tests from the unionmount test suite
 unlink			O_TMPFILE unlinked files
+unreliable_in_parallel	randomly fail when run in parallel with other tests
 unshare			fallocate FALLOC_FL_UNSHARE_RANGE
 v2log			XFS v2 log format tests
 verity			fsverity
diff --git a/tests/generic/336 b/tests/generic/336
index 06391a93f..c874997e4 100755
--- a/tests/generic/336
+++ b/tests/generic/336
@@ -9,8 +9,13 @@
 # file F2 from directory B into directory C, fsync inode F1, power fail and
 # remount the filesystem, file F2 exists and is located only in directory C.
 #
+
+# unreliable_in_parallel: external sync operations can change what is synced to
+# the log before the flakey device drops writes. Hence post-remount file
+# contents can be different to what the test expects.
+
 . ./common/preamble
-_begin_fstest auto quick metadata log
+_begin_fstest auto quick metadata log unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/generic/561 b/tests/generic/561
index 3e931b1a7..602c235bc 100755
--- a/tests/generic/561
+++ b/tests/generic/561
@@ -7,8 +7,14 @@
 # Dedup & random I/O race test, do multi-threads fsstress and dedupe on
 # same directory/files
 #
+
+# unreliable_in_parallel: duperemove is buggy. It can get stuck in endless
+# fiemap mapping loops, and this seems to happen a *lot* when the system is
+# under heavy load. When this happens, the duperemove processes don't die when
+# they are supposed to, and so have to be manually killed to end the test.
+
 . ./common/preamble
-_begin_fstest auto stress dedupe
+_begin_fstest auto stress dedupe unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/177 b/tests/xfs/177
index 773049524..22719ba1c 100755
--- a/tests/xfs/177
+++ b/tests/xfs/177
@@ -21,9 +21,13 @@
 # Regrettably, there is no way to poke /only/ XFS inode reclamation directly,
 # so we're stuck with setting xfssyncd_centisecs to a low value and sleeping
 # while watching the internal inode cache counters.
-#
+
+# unreliable_in_parallel: cache residency is affected by external drop caches
+# operations. Hence counting inodes "in cache" often does not reflect what the
+# test has actually done.
+
 . ./common/preamble
-_begin_fstest auto ioctl
+_begin_fstest auto ioctl unreliable_in_parallel
 
 _cleanup()
 {
diff --git a/tests/xfs/232 b/tests/xfs/232
index 0eea2c098..f0f3916e7 100755
--- a/tests/xfs/232
+++ b/tests/xfs/232
@@ -12,8 +12,12 @@
 # - Wait for the reclaim to run.
 # - Write more and see how bad fragmentation is.
 #
+
+# unreliable_in_parallel: external sync operations affect what happens while
+# the test is waiting for COW expiration.
+
 . ./common/preamble
-_begin_fstest auto quick clone fiemap prealloc
+_begin_fstest auto quick clone fiemap prealloc unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/237 b/tests/xfs/237
index f172aaf59..91f56d6c1 100755
--- a/tests/xfs/237
+++ b/tests/xfs/237
@@ -6,8 +6,14 @@
 #
 # Test AIO DIO CoW behavior when the write temporarily fails.
 #
+
+# unreliable_in_parallel: external drop caches can coincide with the error
+# table being loaded, so the test being run fails with EIO trying to load the
+# inode from disk instead of whatever operation it is supposed to fail on when
+# the inode is already cached in memory.
+
 . ./common/preamble
-_begin_fstest auto quick clone eio
+_begin_fstest auto quick clone eio unreliable_in_parallel
 
 # Override the default cleanup function.
 _cleanup()
diff --git a/tests/xfs/243 b/tests/xfs/243
index 964e94e1d..f9cc2d50f 100755
--- a/tests/xfs/243
+++ b/tests/xfs/243
@@ -15,9 +15,12 @@
 #     5. delalloc
 #   - CoW across the halfway mark, starting with the unwritten extent.
 #   - Check that the files are now different where we say they're different.
-#
+
+# unreliable_in_parallel: external sync can affect the layout of the files being
+# created, resulting in unreliable detection of delalloc extents.
+
 . ./common/preamble
-_begin_fstest auto quick clone punch prealloc
+_begin_fstest auto quick clone punch prealloc unreliable_in_parallel
 
 # Import common functions.
 . ./common/filter
diff --git a/tests/xfs/300 b/tests/xfs/300
index 3f0dbb9ac..c4c3b1ab8 100755
--- a/tests/xfs/300
+++ b/tests/xfs/300
@@ -5,9 +5,13 @@
 # FS QA Test No. 300
 #
 # Test xfs_fsr / exchangerange management of di_forkoff w/ selinux
-#
+
+# unreliable_in_parallel: file layout appears to be perturbed by load-related
+# timing issues. Not 100% sure, but the backwards write does not reliably
+# fragment the source file under heavy external load.
+
 . ./common/preamble
-_begin_fstest auto fsr
+_begin_fstest auto fsr unreliable_in_parallel
 
 # Import common functions.
 . ./common/filter
diff --git a/tests/xfs/440 b/tests/xfs/440
index 0cc679aeb..c0b6756ba 100755
--- a/tests/xfs/440
+++ b/tests/xfs/440
@@ -8,8 +8,12 @@
 # a file that has CoW reservations and no dirty pages.  The reservations
 # should shift over to the new owner, but they do not.
 #
+
+# unreliable_in_parallel: external sync(1) and/or drop caches can reclaim inodes
+# and free post-eof space, resulting in lower than expected block counts.
+
 . ./common/preamble
-_begin_fstest auto quick clone quota
+_begin_fstest auto quick clone quota unreliable_in_parallel
 
 # Import common functions.
 . ./common/reflink
diff --git a/tests/xfs/527 b/tests/xfs/527
index 2ef428c25..0d06b128c 100755
--- a/tests/xfs/527
+++ b/tests/xfs/527
@@ -14,8 +14,11 @@
 # xfs: fix incorrect root dquot corruption error when switching group/project
 # quota types
 
+# unreliable_in_parallel: dmesg check can pick up corruptions from other tests.
+# Need to filter corruption reports by short scratch dev name.
+
 . ./common/preamble
-_begin_fstest auto quick quota
+_begin_fstest auto quick quota unreliable_in_parallel
 
 # Import common functions.
 . ./common/quota
diff --git a/tests/xfs/631 b/tests/xfs/631
index 4d79b821f..319995f81 100755
--- a/tests/xfs/631
+++ b/tests/xfs/631
@@ -7,8 +7,13 @@
 # Post-EOF preallocation defeat test for direct I/O with extent size hints.
 #
 
+# unreliable_in_parallel: external cache drops can result in the extent size
+# being truncated as the inode is evicted from cache between writes. This can
+# increase the number of extents significantly beyond what would be expected
+# from the extent size hint.
+
 . ./common/preamble
-_begin_fstest auto quick prealloc rw
+_begin_fstest auto quick prealloc rw unreliable_in_parallel
 
 . ./common/filter
 
diff --git a/tests/xfs/802 b/tests/xfs/802
index ea09817fd..fc4767acb 100755
--- a/tests/xfs/802
+++ b/tests/xfs/802
@@ -8,8 +8,13 @@
 # filesystem, and that we can read the health reports after the fact.  IOWs,
 # this is basic testing for the systemd background services.
 #
+
+# unreliable_in_parallel: this appears to try to run scrub services on all
+# mounted filesystems - that's a problem when there are a hundred other test
+# filesystems mounted running other tests...
+
 . ./common/preamble
-_begin_fstest auto scrub
+_begin_fstest auto scrub unreliable_in_parallel
 
 _cleanup()
 {
-- 
2.45.2

