[PATCH 07/28] check-parallel: adjust concurrency according to CPU count

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: fstests@vger.kernel.org
Cc: zlang@kernel.org
Subject: [PATCH 07/28] check-parallel: adjust concurrency according to CPU count
Date: Thu, 17 Apr 2025 13:00:48 +1000	[thread overview]
Message-ID: <20250417031208.1852171-8-david@fromorbit.com> (raw)
In-Reply-To: <20250417031208.1852171-1-david@fromorbit.com>

From: Dave Chinner <dchinner@redhat.com>

Concurrency is currently hard coded at 64 worker threads. This is
too many for small CPU count machines; the idea is to create a
sustained load of roughly one test per CPU as they are mostly single
threaded/single process tests. The number "64" was chosen because
I've been developing this functionality on a 64p VM.

Rather than hard coding the concurrency, probe the number of CPUs
available and create that many running contexts as the default
concurrency to use.

Further, add a CLI option to specify the number of threads to run so
that we can over- or under-commit the CPU resources to enable direct
benchmarking of performance with different levels of concurrency.

Let's use that capability to show how much check-parallel can
benefit small systems. Using a single check execution thread for all
tests inside a 4p control group to limit maximum CPU usage to the
equivalent of a small 4p machine:

$ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 1 -g quick -s xfs -x dump -X generic/531
Runner 0 Failures:  generic/504
Tests run: 921
Tests _notrun: 272
Failure count: 2
.....

real    61m31.362s
user    0m0.029s
sys     0m0.059s

the quick group on XFS takes *over an hour* to run.

If we use the same 4p control group setup and run with 8 test
execution threads to ensure the 4 CPUs are fully utilised for most
of the test run:

$ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 8 -g quick -s xfs -x dump -X generic/531
Runner 7 Failures:  generic/504
Tests run: 921
Tests _notrun: 145
Failure count: 1
.....

real    17m33.124s
user    0m0.009s
sys     0m0.017s

The same test run takes only 17m33s. The same number of tests were
run, the same failures occurred. [ Ignore the differences in
notrun/failure count - the multi-file aggregation currently doesn't
work correctly for the single log file case. ]

That's a reduction in test runtime of ~72% for a 4 CPU system. Or,
if we want to measure it the other way, we get a ~3.5x improvement
in runtime scalability. i.e. going from 1 -> 4 CPUs being used for
test execution (4x increase) we get a 3.5x improvement in
scalability when we go from check to check-parallel.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 check-parallel | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/check-parallel b/check-parallel
index cb5d6aedf..0649a417f 100755
--- a/check-parallel
+++ b/check-parallel
@@ -10,7 +10,7 @@
 # the loop devices.

 basedir=""
-runners=64
+runners=$(getconf _NPROCESSORS_CONF)
 runner_list=()
 runtimes=()
 show_test_list=
@@ -30,6 +30,7 @@ usage()

 check options
     -D <dir>		Directory to run in
+    -t <n>		Number of concurrent tests to  run
     -n			Output test list, do not run tests
     -r			randomize test order
     --exact-order	run tests in the exact order specified
@@ -81,6 +82,7 @@ while [ $# -gt 0 ]; do
 	-\? | -h | --help) usage ;;

 	-D)	basedir=$2; shift ;;
+	-t)	runners=$2; shift ;;
 	-g)	_tl_setup_group $2 ; shift ;;
 	-e)	_tl_setup_exclude_tests $2 ; shift ;;
 	-E)	_tl_setup_exclude_file $2 ; shift ;;
@@ -111,6 +113,11 @@ if [ ! -d "$basedir" ]; then
 	echo "Invalid basedir specification"
 	usage
 fi
+if [[ $runners -le 0 || $runners -gt 1024 ]]; then
+	echo "Invalid thread specificaton: $runners"
+	usage
+fi
+
 if [ -d "$basedir/runner-0/" ]; then
 	prev_results=`ls -tr $basedir/runner-0/ | grep results | tail -1`
 fi
-- 
2.45.2

next prev parent reply	other threads:[~2025-04-17  3:29 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-17  3:00 [PATCH 00/28] check-parallel: Running tests without check Dave Chinner
2025-04-17  3:00 ` [PATCH 01/28] fstests: remove support for non-numeric test names Dave Chinner
2025-04-30  9:17   ` Nirjhar Roy (IBM)
2025-05-21  2:39     ` Dave Chinner
2025-05-26  5:14       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 02/28] _scratch_mkfs_sized: obey USE_EXTERNAL for XFS filesystems Dave Chinner
2025-05-05  6:14   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 03/28] fstests: move test exit functions to common/exit Dave Chinner
2025-04-17  3:00 ` [PATCH 04/28] check-parallel: report how many tests were _notrun Dave Chinner
2025-05-05  9:58   ` Nirjhar Roy (IBM)
2025-05-21  2:53     ` Dave Chinner
2025-05-26  6:09       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 05/28] check: factor out test list building code Dave Chinner
2025-05-06 11:32   ` Nirjhar Roy (IBM)
2025-05-21  3:55     ` Dave Chinner
2025-05-26  6:48       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 06/28] check-parallel: use common group list parsing code Dave Chinner
2025-05-06 15:56   ` Nirjhar Roy (IBM)
2025-05-21  4:13     ` Dave Chinner
2025-05-26  6:58       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` Dave Chinner [this message]
2025-05-07  6:45   ` [PATCH 07/28] check-parallel: adjust concurrency according to CPU count Nirjhar Roy (IBM)
2025-05-21  4:32     ` Dave Chinner
2025-05-26  8:50       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 08/28] check-parallel: add logwrite device support Dave Chinner
2025-05-07  8:18   ` Nirjhar Roy (IBM)
2025-05-21 10:07     ` Dave Chinner
2025-05-26  8:59       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 09/28] check-parallel: allow FSTYP selection from the CLI Dave Chinner
2025-05-07  8:49   ` Nirjhar Roy (IBM)
2025-05-21 10:17     ` Dave Chinner
2025-05-26  9:00       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 10/28] check-parallel: use PID namespaces for runner process isolation Dave Chinner
2025-05-07  9:02   ` Nirjhar Roy (IBM)
2025-05-21 10:19     ` Dave Chinner
2025-05-26  9:04       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 11/28] check-parallel: initial support for specifying device sizes Dave Chinner
2025-05-07 10:05   ` Nirjhar Roy (IBM)
2025-05-21 11:11     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 12/28] config: move config section code to it's own file Dave Chinner
2025-05-09  6:09   ` Nirjhar Roy
2025-05-21 11:28     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 13/28] check-parallel: introduce config file support Dave Chinner
2025-05-09 12:01   ` Nirjhar Roy
2025-05-21 12:23     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 14/28] fstests: further separate sourcing common/rc and common/config from initialisation Dave Chinner
2025-05-10 14:08   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 15/28] check-parallel: de-batch test execution Dave Chinner
2025-05-09 13:16   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 16/28] check-parallel: run sections directly Dave Chinner
2025-05-09 14:03   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 17/28] check-parallel: rebuild test list when FSTYP changes Dave Chinner
2025-05-09 16:00   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 18/28] check-parallel: create a "results-latest" symlink Dave Chinner
2025-05-10 13:12   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 19/28] check: factor test running Dave Chinner
2025-05-12 13:57   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 20/28] [RFC] check-parallel: run tests directly without using check Dave Chinner
2025-05-13 14:48   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 21/28] generic/531: limit max files per CPU Dave Chinner
2025-05-10 13:15   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 22/28] fsync-tester.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  9:08   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 23/28] open-by-handle.c: " Dave Chinner
2025-04-30  9:02   ` Nirjhar Roy (IBM)
2025-05-21  2:32     ` Dave Chinner
2025-05-26  5:11       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 24/28] " Dave Chinner
2025-04-30  8:56   ` Nirjhar Roy (IBM)
2025-05-21  2:30     ` Dave Chinner
2025-05-26  4:56       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 25/28] bulkstat_unlink_test_modified.c: remove unused test code Dave Chinner
2025-04-30  8:47   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 26/28] stale-handle.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  8:34   ` Nirjhar Roy (IBM)
2025-05-21  2:24     ` Dave Chinner
2025-04-17  3:01 ` [PATCH 27/28] scaleread: remove dead test code Dave Chinner
2025-04-30  8:10   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 28/28] xfs/259: no need to call sync Dave Chinner
2025-04-30  7:56   ` Nirjhar Roy (IBM)

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:cb5d6aed dfblob:0649a417 )
 OR (
bs:"[PATCH 07/28] check-parallel: adjust concurrency according to CPU count" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250417031208.1852171-8-david@fromorbit.com \
    --to=david@fromorbit.com \
    --cc=fstests@vger.kernel.org \
    --cc=zlang@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.