Re: [PATCH 07/28] check-parallel: adjust concurrency according to CPU count

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Nirjhar Roy (IBM)" <nirjhar.roy.lists@gmail.com>
To: Dave Chinner <david@fromorbit.com>, fstests@vger.kernel.org
Cc: zlang@kernel.org
Subject: Re: [PATCH 07/28] check-parallel: adjust concurrency according to CPU count
Date: Wed, 07 May 2025 12:15:09 +0530	[thread overview]
Message-ID: <e434a271402602fa7ec5efbcd9d92329ecdd4614.camel@gmail.com> (raw)
In-Reply-To: <20250417031208.1852171-8-david@fromorbit.com>

On Thu, 2025-04-17 at 13:00 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Concurrency is currently hard coded at 64 worker threads. This is
> too many for small CPU count machines; the idea is to create a
> sustained load of roughly one test per CPU as they are mostly single
> threaded/single process tests. The number "64" was chosen because
> I've been developing this functionality on a 64p VM.
> 
> Rather than hard coding the concurrency, probe the number of CPUs
> available and create that many running contexts as the default
> concurrency to use.
> 
> Further, add a CLI option to specify the number of threads to run so
> that we can over- or under-commit the CPU resources to enable direct
> benchmarking of performance with different levels of concurrency.
> 
> Let's use that capability to show how much check-parallel can
> benefit small systems. Using a single check execution thread for all
> tests inside a 4p control group to limit maximum CPU usage to the
> equivalent of a small 4p machine:
> 
> $ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 1 -g quick -s xfs -x dump -X generic/531
> Runner 0 Failures:  generic/504
> Tests run: 921
> Tests _notrun: 272
> Failure count: 2
> .....
> 
> real    61m31.362s
> user    0m0.029s
> sys     0m0.059s
> 
> the quick group on XFS takes *over an hour* to run.
> 
> If we use the same 4p control group setup and run with 8 test
> execution threads to ensure the 4 CPUs are fully utilised for most
> of the test run:
> 
> $ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 8 -g quick -s xfs -x dump -X generic/531
> Runner 7 Failures:  generic/504
> Tests run: 921
> Tests _notrun: 145
> Failure count: 1
> .....
> 
> real    17m33.124s
> user    0m0.009s
> sys     0m0.017s
> 
> The same test run takes only 17m33s. The same number of tests were
> run, the same failures occurred. [ Ignore the differences in
> notrun/failure count - the multi-file aggregation currently doesn't
> work correctly for the single log file case. ]
> 
> That's a reduction in test runtime of ~72% for a 4 CPU system. Or,
> if we want to measure it the other way, we get a ~3.5x improvement
> in runtime scalability. i.e. going from 1 -> 4 CPUs being used for
> test execution (4x increase) we get a 3.5x improvement in
> scalability when we go from check to check-parallel.
The functionality looks useful to me and the implementation is also
fine as far as I understand. I have some minor/nit comments below.
Other than that:

Reviewed-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>

> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  check-parallel | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/check-parallel b/check-parallel
> index cb5d6aedf..0649a417f 100755
> --- a/check-parallel
> +++ b/check-parallel
> @@ -10,7 +10,7 @@
>  # the loop devices.
>  
>  basedir=""
> -runners=64
> +runners=$(getconf _NPROCESSORS_CONF)
Minor: Not related to this change. Maybe we should have a _get_nproc()
just like 
_get_page_size()
{
	echo $(getconf PAGE_SIZE)
}
_get_nproc()
{
	echo $(getconf _NPROCESSORS_CONF)
}
and replace all the $(getconf _NPROCESSORS_CONF) with $(_get_nproc)
>  runner_list=()
>  runtimes=()
>  show_test_list=
> @@ -30,6 +30,7 @@ usage()
>  
>  check options
>      -D <dir>		Directory to run in
> +    -t <n>		Number of concurrent tests to  run
Minor: Maybe we should mention the valid range of <n> i.e, 0 to 1024?
>      -n			Output test list, do not run tests

Nit: Maybe there is some spacing issue here? "Number of concurrent ..."
and "Output test..." don't begin together.
--NR
>      -r			randomize test order
>      --exact-order	run tests in the exact order specified
> @@ -81,6 +82,7 @@ while [ $# -gt 0 ]; do
>  	-\? | -h | --help) usage ;;
>  
>  	-D)	basedir=$2; shift ;;
> +	-t)	runners=$2; shift ;;
>  	-g)	_tl_setup_group $2 ; shift ;;
>  	-e)	_tl_setup_exclude_tests $2 ; shift ;;
>  	-E)	_tl_setup_exclude_file $2 ; shift ;;
> @@ -111,6 +113,11 @@ if [ ! -d "$basedir" ]; then
>  	echo "Invalid basedir specification"
>  	usage
>  fi
> +if [[ $runners -le 0 || $runners -gt 1024 ]]; then
> +	echo "Invalid thread specificaton: $runners"
Minor: Maybe we should mention the valid range of "runners" in this
error message (0 to 1024)?
> +	usage
> +fi
> +
>  if [ -d "$basedir/runner-0/" ]; then
>  	prev_results=`ls -tr $basedir/runner-0/ | grep results | tail -1`
>  fi

next prev parent reply	other threads:[~2025-05-07  6:45 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-17  3:00 [PATCH 00/28] check-parallel: Running tests without check Dave Chinner
2025-04-17  3:00 ` [PATCH 01/28] fstests: remove support for non-numeric test names Dave Chinner
2025-04-30  9:17   ` Nirjhar Roy (IBM)
2025-05-21  2:39     ` Dave Chinner
2025-05-26  5:14       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 02/28] _scratch_mkfs_sized: obey USE_EXTERNAL for XFS filesystems Dave Chinner
2025-05-05  6:14   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 03/28] fstests: move test exit functions to common/exit Dave Chinner
2025-04-17  3:00 ` [PATCH 04/28] check-parallel: report how many tests were _notrun Dave Chinner
2025-05-05  9:58   ` Nirjhar Roy (IBM)
2025-05-21  2:53     ` Dave Chinner
2025-05-26  6:09       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 05/28] check: factor out test list building code Dave Chinner
2025-05-06 11:32   ` Nirjhar Roy (IBM)
2025-05-21  3:55     ` Dave Chinner
2025-05-26  6:48       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 06/28] check-parallel: use common group list parsing code Dave Chinner
2025-05-06 15:56   ` Nirjhar Roy (IBM)
2025-05-21  4:13     ` Dave Chinner
2025-05-26  6:58       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 07/28] check-parallel: adjust concurrency according to CPU count Dave Chinner
2025-05-07  6:45   ` Nirjhar Roy (IBM) [this message]
2025-05-21  4:32     ` Dave Chinner
2025-05-26  8:50       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 08/28] check-parallel: add logwrite device support Dave Chinner
2025-05-07  8:18   ` Nirjhar Roy (IBM)
2025-05-21 10:07     ` Dave Chinner
2025-05-26  8:59       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 09/28] check-parallel: allow FSTYP selection from the CLI Dave Chinner
2025-05-07  8:49   ` Nirjhar Roy (IBM)
2025-05-21 10:17     ` Dave Chinner
2025-05-26  9:00       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 10/28] check-parallel: use PID namespaces for runner process isolation Dave Chinner
2025-05-07  9:02   ` Nirjhar Roy (IBM)
2025-05-21 10:19     ` Dave Chinner
2025-05-26  9:04       ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 11/28] check-parallel: initial support for specifying device sizes Dave Chinner
2025-05-07 10:05   ` Nirjhar Roy (IBM)
2025-05-21 11:11     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 12/28] config: move config section code to it's own file Dave Chinner
2025-05-09  6:09   ` Nirjhar Roy
2025-05-21 11:28     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 13/28] check-parallel: introduce config file support Dave Chinner
2025-05-09 12:01   ` Nirjhar Roy
2025-05-21 12:23     ` Dave Chinner
2025-04-17  3:00 ` [PATCH 14/28] fstests: further separate sourcing common/rc and common/config from initialisation Dave Chinner
2025-05-10 14:08   ` Nirjhar Roy (IBM)
2025-04-17  3:00 ` [PATCH 15/28] check-parallel: de-batch test execution Dave Chinner
2025-05-09 13:16   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 16/28] check-parallel: run sections directly Dave Chinner
2025-05-09 14:03   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 17/28] check-parallel: rebuild test list when FSTYP changes Dave Chinner
2025-05-09 16:00   ` Nirjhar Roy
2025-04-17  3:00 ` [PATCH 18/28] check-parallel: create a "results-latest" symlink Dave Chinner
2025-05-10 13:12   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 19/28] check: factor test running Dave Chinner
2025-05-12 13:57   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 20/28] [RFC] check-parallel: run tests directly without using check Dave Chinner
2025-05-13 14:48   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 21/28] generic/531: limit max files per CPU Dave Chinner
2025-05-10 13:15   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 22/28] fsync-tester.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  9:08   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 23/28] open-by-handle.c: " Dave Chinner
2025-04-30  9:02   ` Nirjhar Roy (IBM)
2025-05-21  2:32     ` Dave Chinner
2025-05-26  5:11       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 24/28] " Dave Chinner
2025-04-30  8:56   ` Nirjhar Roy (IBM)
2025-05-21  2:30     ` Dave Chinner
2025-05-26  4:56       ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 25/28] bulkstat_unlink_test_modified.c: remove unused test code Dave Chinner
2025-04-30  8:47   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 26/28] stale-handle.c: use syncfs() rather than sync() Dave Chinner
2025-04-30  8:34   ` Nirjhar Roy (IBM)
2025-05-21  2:24     ` Dave Chinner
2025-04-17  3:01 ` [PATCH 27/28] scaleread: remove dead test code Dave Chinner
2025-04-30  8:10   ` Nirjhar Roy (IBM)
2025-04-17  3:01 ` [PATCH 28/28] xfs/259: no need to call sync Dave Chinner
2025-04-30  7:56   ` Nirjhar Roy (IBM)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e434a271402602fa7ec5efbcd9d92329ecdd4614.camel@gmail.com \
    --to=nirjhar.roy.lists@gmail.com \
    --cc=david@fromorbit.com \
    --cc=fstests@vger.kernel.org \
    --cc=zlang@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.