public inbox for linux-kernel@vger.kernel.org
From: Mel Gorman <mgorman@techsingularity.net>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/4] Mitigate inconsistent NUMA imbalance behaviour
Date: Fri, 20 May 2022 11:18:12 +0100	[thread overview]
Message-ID: <20220520101812.GW3441@techsingularity.net> (raw)
In-Reply-To: <f6b2eba0-2c28-b41b-3900-8834abbd6575@amd.com>

On Fri, May 20, 2022 at 10:28:02AM +0530, K Prateek Nayak wrote:
> Hello Mel,
> 
> We tested the patch series on our systems.
> 
> tl;dr
> 
> Results of testing:
> - Benefits short running Stream tasks in NPS2 and NPS4 mode.
> - Benefits seen for tbench in NPS1 mode for 8-128 worker count.
> - Regression in Hackbench with 16 groups in NPS1 mode. A rerun for
>   the same data point suggested run-to-run variation on the patched kernel.
> - Regression in case of tbench with 32 and 64 workers in NPS2 mode.
>   Patched kernel however seems to report more stable value for 64
>   worker count compared to tip.
> - Slight regression in schbench in NPS2 and NPS4 mode for large
>   worker count but we did spot some run to run variation with
>   both tip and patched kernel.
> 
> Below are all the detailed numbers for the benchmarks.
> 

Thanks!

I looked through the results but I do not see anything that is very
alarming. Some notes.

o Hackbench with 16 groups on NPS1, that would likely be 640 tasks
  communicating unless other parameters are used. I expect it to be
  variable and it's a heavily overloaded scenario. Initial placement is
  not necessarily critical as migrations are likely to be very high.
  On NPS1, there is going to be random luck given that the latency
  to individual CPUs and the physical topology is hidden.
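  As a side note, the 640 figure assumes hackbench's default of 20
  sender/receiver pairs per group; the arithmetic is just:

```python
# Back-of-the-envelope task count for hackbench with 16 groups,
# assuming the default of 20 sender/receiver pairs per group
# (40 communicating tasks per group).
pairs_per_group = 20                   # hackbench default (assumption)
tasks_per_group = pairs_per_group * 2  # each pair is a sender + a receiver
groups = 16

total_tasks = groups * tasks_per_group
print(total_tasks)  # 640
```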

o NPS2 with 128 workers. That's at the threshold where load is
  potentially evenly split between the two sockets but not perfectly
  split due to migrate-on-wakeup being a little unpredictable. Might
  be worth checking the variability there.

o Same observations for tbench. I looked at my own results for NPS1
  on Zen3 and what I see is that there is a small blip there, but
  the mpstat heat map indicates that the nodes are being more evenly
  used than without the patch, which is expected.
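  For what it's worth, the heat map itself is nothing fancy, just
  per-node utilisation samples (aggregated from mpstat -P ALL) binned
  into shade characters. A rough sketch of the idea, with made-up
  figures rather than numbers from the actual runs:

```python
def heatmap_row(samples, shades=" .:-=+*#%@"):
    """Map utilisation percentages (0-100) to one shade character per
    sample interval; denser characters mean a busier node."""
    step = 100.0 / (len(shades) - 1)
    return "".join(shades[min(int(s / step), len(shades) - 1)]
                   for s in samples)

# Hypothetical per-node utilisation over five mpstat intervals.
node0 = [95, 90, 88, 92, 97]
node1 = [20, 35, 60, 72, 85]
print("node0", heatmap_row(node0))
print("node1", heatmap_row(node1))
```

  An unevenly used pair of nodes shows up immediately as one dense row
  and one sparse row.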

o STREAM is interesting in that there are large differences between
  10 runs and 100 runs. It indicates that, without pinning, STREAM
  can be a bit variable. The problem might be similar to NAS as
  reported in the leader mail, with the variability due to commit
  c6f886546cb8 for unknown reasons.

> > 
> >  kernel/sched/fair.c     | 59 ++++++++++++++++++++++++++---------------
> >  kernel/sched/topology.c | 23 ++++++++++------
> >  2 files changed, 53 insertions(+), 29 deletions(-)
> > 
> 
> Please let me know if you would like me to get some additional
> data on the test system.

Other than checking variability, the min, max and range, I don't need
additional data. I suspect in some cases like what I observed with NAS
that there is wide variability for reasons independent of this series.
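For the variability check, something simple along these lines is
enough to judge whether a delta is signal or noise; the throughput
figures below are made up for illustration:

```python
import statistics

def spread(runs):
    """Summarise run-to-run variability: min, max, range and the
    coefficient of variation (stdev as a fraction of the mean)."""
    lo, hi = min(runs), max(runs)
    mean = statistics.mean(runs)
    cv = statistics.stdev(runs) / mean
    return {"min": lo, "max": hi, "range": hi - lo, "cv": cv}

# Hypothetical tbench throughput (MB/sec) for tip vs patched kernels.
tip     = [4410.2, 4388.7, 4512.9, 4295.1, 4450.3]
patched = [4391.5, 4402.8, 4387.0, 4410.6, 4398.2]

for name, runs in (("tip", tip), ("patched", patched)):
    s = spread(runs)
    print(f"{name:8s} min={s['min']:.1f} max={s['max']:.1f} "
          f"range={s['range']:.1f} cv={s['cv']:.3%}")
```

A patched kernel whose range sits inside the baseline's range is a
good hint the "regression" is just noise.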

I'm of the opinion though that your results are not a barrier for
merging. Do you agree?

-- 
Mel Gorman
SUSE Labs


Thread overview: 21+ messages
2022-05-11 14:30 [PATCH 0/4] Mitigate inconsistent NUMA imbalance behaviour Mel Gorman
2022-05-11 14:30 ` [PATCH 1/4] sched/numa: Initialise numa_migrate_retry Mel Gorman
2022-05-11 14:30 ` [PATCH 2/4] sched/numa: Do not swap tasks between nodes when spare capacity is available Mel Gorman
2022-05-11 14:30 ` [PATCH 3/4] sched/numa: Apply imbalance limitations consistently Mel Gorman
2022-05-18  9:24   ` [sched/numa] bb2dee337b: unixbench.score -11.2% regression kernel test robot
2022-05-18 15:22     ` Mel Gorman
2022-05-19  7:54       ` ying.huang
2022-05-20  6:44         ` [LKP] " Ying Huang
2022-05-18  9:31   ` [PATCH 3/4] sched/numa: Apply imbalance limitations consistently Peter Zijlstra
2022-05-18 10:46     ` Mel Gorman
2022-05-18 13:59       ` Peter Zijlstra
2022-05-18 15:39         ` Mel Gorman
2022-05-11 14:30 ` [PATCH 4/4] sched/numa: Adjust imb_numa_nr to a better approximation of memory channels Mel Gorman
2022-05-18  9:41   ` Peter Zijlstra
2022-05-18 11:15     ` Mel Gorman
2022-05-18 14:05       ` Peter Zijlstra
2022-05-18 17:06         ` Mel Gorman
2022-05-19  9:29           ` Mel Gorman
2022-05-20  4:58 ` [PATCH 0/4] Mitigate inconsistent NUMA imbalance behaviour K Prateek Nayak
2022-05-20 10:18   ` Mel Gorman [this message]
2022-05-20 15:17     ` K Prateek Nayak
