From: Dave Chinner <david@fromorbit.com>
To: "Nirjhar Roy (IBM)" <nirjhar.roy.lists@gmail.com>
Cc: fstests@vger.kernel.org, zlang@kernel.org
Subject: Re: [PATCH 07/28] check-parallel: adjust concurrency according to CPU count
Date: Wed, 21 May 2025 14:32:27 +1000 [thread overview]
Message-ID: <aC1XWzTFPUa_7WS_@dread.disaster.area> (raw)
In-Reply-To: <e434a271402602fa7ec5efbcd9d92329ecdd4614.camel@gmail.com>
On Wed, May 07, 2025 at 12:15:09PM +0530, Nirjhar Roy (IBM) wrote:
> On Thu, 2025-04-17 at 13:00 +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > Concurrency is currently hard coded at 64 worker threads. This is
> > too many for small CPU count machines; the idea is to create a
> > sustained load of roughly one test per CPU as they are mostly single
> > threaded/single process tests. The number "64" was chosen because
> > I've been developing this functionality on a 64p VM.
.....
> > diff --git a/check-parallel b/check-parallel
> > index cb5d6aedf..0649a417f 100755
> > --- a/check-parallel
> > +++ b/check-parallel
> > @@ -10,7 +10,7 @@
> > # the loop devices.
> >
> > basedir=""
> > -runners=64
> > +runners=$(getconf _NPROCESSORS_CONF)
> Minor: Not related to this change. Maybe we should have a _get_nproc()
> just like
> _get_page_size()
> {
> echo $(getconf PAGE_SIZE)
> }
> _get_nproc()
> {
> echo $(getconf _NPROCESSORS_CONF)
> }
> and replace all the $(getconf _NPROCESSORS_CONF) with $(_get_nproc)
I have thoughts on that.
I think determining test scaling based on the number of CPUs in the
machine is wrong. If I run in a cgroup that limits tests to 4p on a
64p machine, then fstests will think it is running on a 64p machine
rahter than on 4p. That's ... not ideal.
I think there should be a global variable that defines the
concurrency that tests should use to scale load/processes rather
than use the CPU count. Then check/check-parallel can set the
concurrency as they desire and we no longer have to worry about
tests creating excesively huge loads on high CPU count machines
because they were sized to load a small, low concurrency test
system.
i.e. I intend to separate the concurrency with which check-parallel
runs from the concurrency that individual tests use to scale. For
check-parallel, I am thinking of fixing test concurrency scaling at
something like min(nr_cpus, 8), whilst the check-parallel harness
itself uses nr_cpus to determine how many concurrent runners to
instantiate.
> > runtimes=()
> > show_test_list=
> > @@ -30,6 +30,7 @@ usage()
> >
> > check options
> > -D <dir> Directory to run in
> > + -t <n> Number of concurrent tests to run
> Minor: Maybe we should mention the valid range of <n> i.e, 0 to 1024?
> > -n Output test list, do not run tests
>
> Nit: Maybe there is some spacing issue here? "Number of concurrent ..."
> and "Output test..." don't begin together.
Not that I can see, I think this is just email quoting and tabs
doing funky stuff.
> > -r randomize test order
> > --exact-order run tests in the exact order specified
> > @@ -81,6 +82,7 @@ while [ $# -gt 0 ]; do
> > -\? | -h | --help) usage ;;
> >
> > -D) basedir=$2; shift ;;
> > + -t) runners=$2; shift ;;
> > -g) _tl_setup_group $2 ; shift ;;
> > -e) _tl_setup_exclude_tests $2 ; shift ;;
> > -E) _tl_setup_exclude_file $2 ; shift ;;
> > @@ -111,6 +113,11 @@ if [ ! -d "$basedir" ]; then
> > echo "Invalid basedir specification"
> > usage
> > fi
> > +if [[ $runners -le 0 || $runners -gt 1024 ]]; then
> > + echo "Invalid thread specificaton: $runners"
> Minor: Maybe we should mention the valid range of "runners" in this
> error message (0 to 1024)?
/me shrugs.
It's an arbitrary "check-parallel won't ever scale past this" limit
because the diminishing returns from added concurrency that Amdahl's
Law define is already kicking in at 64 threads. If you are running
check-parallel on a >1024p machine, you already know enough to be
able to look at the source code to determine why the error is being
emitted. :)
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2025-05-21 4:32 UTC|newest]
Thread overview: 80+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-17 3:00 [PATCH 00/28] check-parallel: Running tests without check Dave Chinner
2025-04-17 3:00 ` [PATCH 01/28] fstests: remove support for non-numeric test names Dave Chinner
2025-04-30 9:17 ` Nirjhar Roy (IBM)
2025-05-21 2:39 ` Dave Chinner
2025-05-26 5:14 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 02/28] _scratch_mkfs_sized: obey USE_EXTERNAL for XFS filesystems Dave Chinner
2025-05-05 6:14 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 03/28] fstests: move test exit functions to common/exit Dave Chinner
2025-04-17 3:00 ` [PATCH 04/28] check-parallel: report how many tests were _notrun Dave Chinner
2025-05-05 9:58 ` Nirjhar Roy (IBM)
2025-05-21 2:53 ` Dave Chinner
2025-05-26 6:09 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 05/28] check: factor out test list building code Dave Chinner
2025-05-06 11:32 ` Nirjhar Roy (IBM)
2025-05-21 3:55 ` Dave Chinner
2025-05-26 6:48 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 06/28] check-parallel: use common group list parsing code Dave Chinner
2025-05-06 15:56 ` Nirjhar Roy (IBM)
2025-05-21 4:13 ` Dave Chinner
2025-05-26 6:58 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 07/28] check-parallel: adjust concurrency according to CPU count Dave Chinner
2025-05-07 6:45 ` Nirjhar Roy (IBM)
2025-05-21 4:32 ` Dave Chinner [this message]
2025-05-26 8:50 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 08/28] check-parallel: add logwrite device support Dave Chinner
2025-05-07 8:18 ` Nirjhar Roy (IBM)
2025-05-21 10:07 ` Dave Chinner
2025-05-26 8:59 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 09/28] check-parallel: allow FSTYP selection from the CLI Dave Chinner
2025-05-07 8:49 ` Nirjhar Roy (IBM)
2025-05-21 10:17 ` Dave Chinner
2025-05-26 9:00 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 10/28] check-parallel: use PID namespaces for runner process isolation Dave Chinner
2025-05-07 9:02 ` Nirjhar Roy (IBM)
2025-05-21 10:19 ` Dave Chinner
2025-05-26 9:04 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 11/28] check-parallel: initial support for specifying device sizes Dave Chinner
2025-05-07 10:05 ` Nirjhar Roy (IBM)
2025-05-21 11:11 ` Dave Chinner
2025-04-17 3:00 ` [PATCH 12/28] config: move config section code to it's own file Dave Chinner
2025-05-09 6:09 ` Nirjhar Roy
2025-05-21 11:28 ` Dave Chinner
2025-04-17 3:00 ` [PATCH 13/28] check-parallel: introduce config file support Dave Chinner
2025-05-09 12:01 ` Nirjhar Roy
2025-05-21 12:23 ` Dave Chinner
2025-04-17 3:00 ` [PATCH 14/28] fstests: further separate sourcing common/rc and common/config from initialisation Dave Chinner
2025-05-10 14:08 ` Nirjhar Roy (IBM)
2025-04-17 3:00 ` [PATCH 15/28] check-parallel: de-batch test execution Dave Chinner
2025-05-09 13:16 ` Nirjhar Roy
2025-04-17 3:00 ` [PATCH 16/28] check-parallel: run sections directly Dave Chinner
2025-05-09 14:03 ` Nirjhar Roy
2025-04-17 3:00 ` [PATCH 17/28] check-parallel: rebuild test list when FSTYP changes Dave Chinner
2025-05-09 16:00 ` Nirjhar Roy
2025-04-17 3:00 ` [PATCH 18/28] check-parallel: create a "results-latest" symlink Dave Chinner
2025-05-10 13:12 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 19/28] check: factor test running Dave Chinner
2025-05-12 13:57 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 20/28] [RFC] check-parallel: run tests directly without using check Dave Chinner
2025-05-13 14:48 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 21/28] generic/531: limit max files per CPU Dave Chinner
2025-05-10 13:15 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 22/28] fsync-tester.c: use syncfs() rather than sync() Dave Chinner
2025-04-30 9:08 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 23/28] open-by-handle.c: " Dave Chinner
2025-04-30 9:02 ` Nirjhar Roy (IBM)
2025-05-21 2:32 ` Dave Chinner
2025-05-26 5:11 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 24/28] " Dave Chinner
2025-04-30 8:56 ` Nirjhar Roy (IBM)
2025-05-21 2:30 ` Dave Chinner
2025-05-26 4:56 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 25/28] bulkstat_unlink_test_modified.c: remove unused test code Dave Chinner
2025-04-30 8:47 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 26/28] stale-handle.c: use syncfs() rather than sync() Dave Chinner
2025-04-30 8:34 ` Nirjhar Roy (IBM)
2025-05-21 2:24 ` Dave Chinner
2025-04-17 3:01 ` [PATCH 27/28] scaleread: remove dead test code Dave Chinner
2025-04-30 8:10 ` Nirjhar Roy (IBM)
2025-04-17 3:01 ` [PATCH 28/28] xfs/259: no need to call sync Dave Chinner
2025-04-30 7:56 ` Nirjhar Roy (IBM)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aC1XWzTFPUa_7WS_@dread.disaster.area \
--to=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=nirjhar.roy.lists@gmail.com \
--cc=zlang@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.