FS/XFS testing framework
From: Dave Chinner <david@fromorbit.com>
To: fstests@vger.kernel.org
Cc: zlang@kernel.org
Subject: [PATCH 00/28] check-parallel: Running tests without check
Date: Thu, 17 Apr 2025 13:00:41 +1000
Message-ID: <20250417031208.1852171-1-david@fromorbit.com>

Hi folks,

This set of patches is intended to move check-parallel away from
using check to execute tests.

To do this, we need to share a bunch of check code between check and
check-parallel. This is mainly the code that parses and builds the
test list, the config section parsing and iteration, and the test
execution loop itself.

To do this, test list parsing and building is factored out of check
into common/test_list. check is converted to use the test list
functions at the same time, and then check-parallel is converted to
use the factored code to directly build its test list rather than
the open-coded grep hack it currently uses.  This allows the
check-parallel CLI to use the same group selection interface as
check.
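
As a rough illustration of group-based test selection (this is not the
code in common/test_list), here is a minimal sketch. It assumes a
group.list-style file with one "<test number> <group> [<group> ...]"
entry per line and '#' comments; the helper name list_tests_in_group
is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "<dir>/<test>" for every test in directory $1
# that is a member of group $2.
# Assumes $1/group.list holds lines of the form "<test> <group> [<group> ...]".
list_tests_in_group() {
    local dir="$1" group="$2"

    awk -v dir="$(basename "$dir")" -v grp="$group" '
        # Skip blank lines and comments.
        /^[ \t]*(#|$)/ { next }
        {
            # Field 1 is the test name, the remaining fields are its groups.
            for (i = 2; i <= NF; i++) {
                if ($i == grp) {
                    print dir "/" $1
                    break
                }
            }
        }' "$dir/group.list"
}
```

A caller would run something like "list_tests_in_group tests/generic
quick" once per test directory and concatenate the output to form the
overall test list.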

The next change is to factor the config section parsing out of
common/config and move it to common/config-sections. This allows
check-parallel to parse and implement section iteration itself
without needing to run all the environment setup code in
common/config. This also allows check-parallel to implement its own
config section to define the device sizes that it will use
independently of the sections that run tests.
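
To show what parsing sections separately from environment setup can
look like, here is a minimal sketch. It assumes an INI-style config
file with "[name]" section headers followed by VAR=value lines; both
helper names are hypothetical and this is not the code in
common/config-sections.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every "[section]" header in file $1.
list_config_sections() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p' "$1"
}

# Hypothetical helper: print the VAR=value lines belonging to section $2 of
# file $1, without evaluating them or touching the environment.
print_section_options() {
    local file="$1" section="$2"

    awk -v want="[$section]" '
        # A section header switches the "in_section" state on or off.
        /^[ \t]*\[.*\][ \t]*$/ {
            gsub(/[ \t]/, "")
            in_section = ($0 == want)
            next
        }
        # Only emit variable assignments from the wanted section.
        in_section && /^[ \t]*[A-Za-z_][A-Za-z0-9_]*=/ {
            sub(/^[ \t]+/, "")
            print
        }' "$file"
}
```

Because neither helper sources the file, a caller can iterate over
sections and decide which ones to act on before any environment setup
is run.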

Next, we change check-parallel to use a global test list from which
runner scripts can safely dequeue the next test to run. This uses a test
list file and a lock file to serialise access to the file. Hence a
runner can dequeue the next test and remove it from the test list
file without racing with any other runner trying to dequeue the next
test to run. This means we get rid of the static per-runner test
lists that result in many runners finishing and going idle while
other test runners have pending tests still to run. i.e. all test
runners keep executing tests until there are no tests left in the
queue, hence keeping utilisation as high as possible across the test
run.
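
The list-file-plus-lock-file scheme can be sketched roughly as follows.
This is only an illustration under assumptions: it uses flock(1) from
util-linux and GNU sed, the helper name dequeue_next_test is
hypothetical, and the real implementation in the series may differ.

```shell
#!/bin/bash
# Hypothetical helper: atomically pop the first line of test list file $1,
# using $2 as the lock file. Prints the dequeued test name, or returns
# non-zero when the queue is empty.
dequeue_next_test() {
    local list="$1" lock="$2"

    (
        # Hold an exclusive lock on fd 9 for the lifetime of this subshell,
        # so no other runner can read or modify the list concurrently.
        flock -x 9 || exit 1

        next=$(head -n 1 "$list")
        [ -n "$next" ] || exit 1

        # Remove the claimed entry before the lock is released.
        sed -i '1d' "$list"
        echo "$next"
    ) 9>"$lock"
}
```

Each runner would then loop with something like
'while seq=$(dequeue_next_test "$list" "$lock"); do ...; done', so a
runner only goes idle once the shared queue is empty.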

Then we factor the test execution loop out of check and put it in
common/test_exec. This abstraction makes the results array part of
the test execution, as well as using a context-defined helper
"_run_seq" to do the actual execution of the test. This allows the
test execution loop to be completely generic, whilst allowing check
and check-parallel to do completely independent things with
individual test execution and overall results reporting.
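
The shape of that abstraction can be sketched as below. Only the
"_run_seq" name comes from the description above; the loop function
name, the test_results array name and the pass/fail-only outcomes are
simplifying assumptions, not the contents of common/test_exec.

```shell
#!/bin/bash
# Results array shared between the generic loop and its caller.
# (Assumed name; requires bash 4.2+ for "declare -gA".)
declare -gA test_results

# Hypothetical generic loop: runs every test named in "$@" through the
# caller-supplied _run_seq function and records the outcome per test.
run_test_list() {
    local seq

    for seq in "$@"; do
        if _run_seq "$seq"; then
            test_results["$seq"]="pass"
        else
            test_results["$seq"]="fail"
        fi
    done
}
```

The caller defines _run_seq before invoking the loop, for example
'_run_seq() { ./tests/$1; }' in one context and a namespace-wrapped
variant in another, and then reports from test_results however it
likes.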

Finally, we change check-parallel to run tests directly via the
common/test_exec infrastructure rather than executing them via
check. This requires a new helper function that does the test
environment setup in the private mount+pid namespace, but this is
much simpler and faster than using check itself to execute
individual tests. This last bit of functionality is still a work in
progress, so this specific patch is still tagged with [RFC].

There are lots of other smaller changes. The way common/rc and
common/config are used is changed. common/config only sets up the
execution environment now, and should not contain any code that
needs to be executed outside of environment setup. It should only be
sourced once at the highest level to set up the environment, and
never called again.

common/rc is similar - all directly executed code has been removed
from it, and that is now called from the high level code that needs
initialisation work done.  It no longer sources common/config,
either. Tests no longer need to run init_rc() in their preamble;
they just need to source the generic and fs-specific functions they
may run. Also, because check does some weird things and lots of
_require_*() functions assume the TEST_DEV is mounted without first
running _require_test(), the preamble also needs to ensure the
TEST_DEV is mounted...

check-parallel can now take a "-t N" parameter to specify how many
execution threads it will use. If this is not specified, it will
default to the number of CPUs in the machine. Testing with 4p
restrictions shows that check-parallel will run the quick group 3.5x
faster on a 4p system with 8 execution threads than it will with a
single execution thread. IOWs, even on small test systems,
check-parallel can result in dramatic reductions in test runtime
over check.
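
A minimal sketch of the "-t N" handling with a CPU-count default might
look like this. It assumes nproc(1) from coreutils as the CPU count
source; the function name parse_runner_count and the nr_runners
variable are hypothetical and not taken from check-parallel.

```shell
#!/bin/bash
# Hypothetical option parser: sets nr_runners from "-t N", defaulting to
# the number of available CPUs reported by nproc(1).
parse_runner_count() {
    local opt OPTIND=1

    nr_runners=$(nproc)
    while getopts "t:" opt; do
        case "$opt" in
        t) nr_runners="$OPTARG" ;;
        *) return 1 ;;
        esac
    done

    # Reject anything that is not a positive integer.
    case "$nr_runners" in
    ''|*[!0-9]*|0) return 1 ;;
    esac
}
```

With this, 'parse_runner_count -t 8' selects eight runners, while
calling it with no arguments falls back to one runner per CPU.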

On a 64p machine, testing XFS with the quick group drops from 61
minutes to just under 4 minutes. Testing XFS with the auto group
drops from 246 minutes to just under 8 minutes.

Other miscellaneous stuff in the series:

	- kill non-numeric test name support
	- creating common/exit for all the general test exit
	  functions to fix circular dependencies between common/rc
	  and common/config
	- fix _scratch_mkfs_sized to make USE_EXTERNAL on XFS work
	  the same as ext4.
	- dm-logwrites devices are now created by check-parallel
	- several test conversions from sync() to syncfs()
	- removal of a couple of stale .c test source files.
	- address poor CPU count scaling in a couple of tests

I have tried not to cause any regressions for people running plain
check. I've tested that a bit with XFS and ext4, but I can't
guarantee that there aren't issues I haven't uncovered. e.g. btrfs,
as yet, is untested. It is unfortunate that the problem I seek to
address - running exhaustive check testing across many filesystem
types and configurations is prohibitively expensive in terms of time
- is the very reason I can't really adequately test check for
regressions as I develop check-parallel functionality...

Thoughts, comments and code review all welcome!

-Dave.

 .gitignore                          |   1 -
 check                               | 727 ++++--------------------------------
 check-parallel                      | 351 ++++++++++++++---
 common/config                       | 612 +-----------------------------
 common/config-sections              | 461 +++++++++++++++++++++++
 common/dmlogwrites                  |   5 +-
 common/exit                         |  48 +++
 common/preamble                     |  19 +-
 common/rc                           | 253 +++++++++++--
 common/report                       |   2 +-
 common/test_exec                    | 377 +++++++++++++++++++
 common/test_list                    | 308 +++++++++++++++
 common/test_names                   |   8 +-
 new                                 |  24 --
 src/Makefile                        |   4 +-
 src/bulkstat_unlink_test.c          |  12 +-
 src/bulkstat_unlink_test_modified.c | 193 ----------
 src/fsync-tester.c                  |   2 +-
 src/open_by_handle.c                |   6 +-
 src/scaleread.c                     | 224 -----------
 src/scaleread.sh                    |  64 ----
 src/stale_handle.c                  |  15 +-
 tests/generic/531                   |   8 +-
 tests/xfs/259                       |   1 -
 tests/xfs/271                       |   2 -
 tools/run_test.sh                   | 116 ++++++
 26 files changed, 1954 insertions(+), 1889 deletions(-)


Thread overview: 80+ messages
2025-04-17  3:00 Dave Chinner [this message]
2025-04-17  3:00 ` [PATCH 01/28] fstests: remove support for non-numeric test names Dave Chinner
2025-04-30  9:17   ` Nirjhar Roy (IBM)
2025-05-21  2:39     ` Dave Chinner
2025-05-26  5:14       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 02/28] _scratch_mkfs_sized: obey USE_EXTERNAL for XFS filesystems Dave Chinner
2025-05-05  6:14   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 03/28] fstests: move test exit functions to common/exit Dave Chinner
2025-04-17  3:00 ` [PATCH 04/28] check-parallel: report how many tests were _notrun Dave Chinner
2025-05-05  9:58   ` Nirjhar Roy (IBM)
2025-05-21  2:53     ` Dave Chinner
2025-05-26  6:09       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 05/28] check: factor out test list building code Dave Chinner
2025-05-06 11:32   ` Nirjhar Roy (IBM)
2025-05-21  3:55     ` Dave Chinner
2025-05-26  6:48       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 06/28] check-parallel: use common group list parsing code Dave Chinner
2025-05-06 15:56   ` Nirjhar Roy (IBM)
2025-05-21  4:13     ` Dave Chinner
2025-05-26  6:58       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 07/28] check-parallel: adjust concurrency according to CPU count Dave Chinner
2025-05-07  6:45   ` Nirjhar Roy (IBM)
2025-05-21  4:32     ` Dave Chinner
2025-05-26  8:50       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 08/28] check-parallel: add logwrite device support Dave Chinner
2025-05-07  8:18   ` Nirjhar Roy (IBM)
2025-05-21 10:07     ` Dave Chinner
2025-05-26  8:59       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 09/28] check-parallel: allow FSTYP selection from the CLI Dave Chinner
2025-05-07  8:49   ` Nirjhar Roy (IBM)
2025-05-21 10:17     ` Dave Chinner
2025-05-26  9:00       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 10/28] check-parallel: use PID namespaces for runner process isolation Dave Chinner
2025-05-07  9:02   ` Nirjhar Roy (IBM)
2025-05-21 10:19     ` Dave Chinner
2025-05-26  9:04       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 11/28] check-parallel: initial support for specifying device sizes Dave Chinner
2025-05-07 10:05   ` Nirjhar Roy (IBM)
2025-05-21 11:11     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 12/28] config: move config section code to it's own file Dave Chinner
2025-05-09  6:09   ` Nirjhar Roy
2025-05-21 11:28     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 13/28] check-parallel: introduce config file support Dave Chinner
2025-05-09 12:01   ` Nirjhar Roy
2025-05-21 12:23     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 14/28] fstests: further separate sourcing common/rc and common/config from initialisation Dave Chinner
2025-05-10 14:08   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 15/28] check-parallel: de-batch test execution Dave Chinner
2025-05-09 13:16   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 16/28] check-parallel: run sections directly Dave Chinner
2025-05-09 14:03   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 17/28] check-parallel: rebuild test list when FSTYP changes Dave Chinner
2025-05-09 16:00   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 18/28] check-parallel: create a "results-latest" symlink Dave Chinner
2025-05-10 13:12   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 19/28] check: factor test running Dave Chinner
2025-05-12 13:57   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 20/28] [RFC] check-parallel: run tests directly without using check Dave Chinner
2025-05-13 14:48   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 21/28] generic/531: limit max files per CPU Dave Chinner
2025-05-10 13:15   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 22/28] fsync-tester.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  9:08   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 23/28] open-by-handle.c: " Dave Chinner
2025-04-30  9:02   ` Nirjhar Roy (IBM)
2025-05-21  2:32     ` Dave Chinner
2025-05-26  5:11       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 24/28] " Dave Chinner
2025-04-30  8:56   ` Nirjhar Roy (IBM)
2025-05-21  2:30     ` Dave Chinner
2025-05-26  4:56       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 25/28] bulkstat_unlink_test_modified.c: remove unused test code Dave Chinner
2025-04-30  8:47   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 26/28] stale-handle.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  8:34   ` Nirjhar Roy (IBM)
2025-05-21  2:24     ` Dave Chinner
2025-04-17  3:01 ` [PATCH 27/28] scaleread: remove dead test code Dave Chinner
2025-04-30  8:10   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 28/28] xfs/259: no need to call sync Dave Chinner
2025-04-30  7:56   ` Nirjhar Roy (IBM)