public inbox for linux-kernel@vger.kernel.org
From: Mel Gorman <mgorman@techsingularity.net>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Phil Auld <pauld@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <Morten.Rasmussen@arm.com>,
	Hillf Danton <hdanton@sina.com>, Parth Shah <parth@linux.ibm.com>,
	Rik van Riel <riel@surriel.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small load imbalance between low utilisation SD_NUMA domains v4
Date: Mon, 20 Jan 2020 08:33:54 +0000	[thread overview]
Message-ID: <20200120083354.GT3466@techsingularity.net> (raw)
In-Reply-To: <20200120080935.GD20112@linux.vnet.ibm.com>

On Mon, Jan 20, 2020 at 01:39:35PM +0530, Srikar Dronamraju wrote:
> * Mel Gorman <mgorman@techsingularity.net> [2020-01-17 21:58:53]:
> 
> > On Fri, Jan 17, 2020 at 11:26:31PM +0530, Srikar Dronamraju wrote:
> > > * Mel Gorman <mgorman@techsingularity.net> [2020-01-14 10:13:20]:
> > > 
> > > We certainly are seeing better results than v1.
> > > However numa02, numa03, numa05, numa09 and numa10 still seem to be
> > > regressing, while the others are improving.
> > > 
> > > While numa04 improves by 14%, numa02 regresses by around 12%.
> > > 
> 
> > Ok, so it's both a win and a loss. It is curious that this patch
> > would be the primary factor, given that the logic only triggers when
> > the local group has spare capacity and the busiest group is nearly
> > idle. The test cases you describe should have fairly busy local groups.
> > 
> 
> Right, your code only seems to take effect when the local group has
> spare capacity and busiest->sum_nr_running <= 2.
> 

And this is why I'm curious that your workload is affected at all,
given that it uses many tasks. I stopped allowing an imbalance at
higher task counts partly on the basis of your previous report.

> > This one is more interesting. False sharing shouldn't be an issue so the
> > threads should be independent.
> > 
> > > numa03 is a single process with 256 threads; each thread does 50
> > > loops of operations on 3GB of process-shared memory.
> > > 
> > 
> > Similar.
> 
> This is similar to numa01, except that now all threads belong to a
> single process.
> 

My concern is that the shared-memory mode means NUMA balancing and
false sharing can dominate and hide any impact of the patch itself.
Whether the results are good or bad may be partly down to luck.

> > 
> > > numa05 is a set of 16 process (twice as many nodes) each running 16 threads;
> > > each thread doing 50 loops on 3GB process shared memory operations.
> > > 
> > 
> > > Details below:
> > 
> > How many iterations did you run for each test?
> 
> I run 5 iterations. Want me to run with more iterations?
> 

5 should be enough for now. I'm more interested in hearing whether the
regression/gain is consistent when the patch is applied, and in a
confirmation that the patch really makes a difference to this set of
workloads.

> > 
> > 
> > > ./numa02.sh      Real:  78.87      82.31      80.59      1.72     -12.7187%
> > > ./numa02.sh      Sys:   81.18      85.07      83.12      1.94     -35.0337%
> > > ./numa02.sh      User:  16303.70   17122.14   16712.92   409.22   -12.5182%
> > 
> > Before range: 58 to 72
> > After range: 78 to 82
> > 
> > This one is more interesting in general. Can you add trace_printks to
> > the check for SD_NUMA that the patch introduces and dump sum_nr_running
> > for both local and busiest when the imbalance is ignored, please? That
> > might give some hint as to the improper conditions under which the
> > imbalance is ignored.
> 
> Can be done. Will get back with the results. But do let me know if you
> want me to run more iterations or rerun the tests.
> 

The results of this will be interesting in themselves. I'm particularly
interested in seeing what the traces look like for both a good and a
bad result.

> > 
> > However, knowing the number of iterations would be helpful. Can you
> > also tell me whether this is consistent between boots, or whether it
> > is always roughly a 12% regression regardless of the number of
> > iterations?
> > 
> 
> I have only measured for 5 iterations and I haven't repeated to see if the
> numbers are consistent.
> 

Ok, that is quite a problem, as the assertion at the moment is that the
patch causes a mix of regressions and gains. It's currently unclear to
me why the patch would have a major impact on this workload at all,
given the number of active tasks and the nature of the patch. I'm
concerned that the workload may be naturally unstable, but tracing may
be able to help.

-- 
Mel Gorman
SUSE Labs


Thread overview: 24+ messages
2020-01-14 10:13 [PATCH] sched, fair: Allow a small load imbalance between low utilisation SD_NUMA domains v4 Mel Gorman
2020-01-16 16:35 ` Mel Gorman
2020-01-17 13:08   ` Vincent Guittot
2020-01-17 14:15     ` Mel Gorman
2020-01-17 14:32       ` Phil Auld
2020-01-17 14:23     ` Phil Auld
2020-01-17 14:37   ` Valentin Schneider
2020-01-17 13:16 ` Vincent Guittot
2020-01-17 14:26   ` Mel Gorman
2020-01-17 14:29     ` Vincent Guittot
2020-01-17 15:09 ` Vincent Guittot
2020-01-17 15:11   ` Peter Zijlstra
2020-01-17 15:21 ` Phil Auld
2020-01-17 17:56 ` Srikar Dronamraju
2020-01-17 21:58   ` Mel Gorman
2020-01-20  8:09     ` Srikar Dronamraju
2020-01-20  8:33       ` Mel Gorman [this message]
2020-01-20 17:27         ` Srikar Dronamraju
2020-01-20 18:21           ` Mel Gorman
2020-01-21  8:55             ` Srikar Dronamraju
2020-01-21  9:11               ` Mel Gorman
2020-01-21 10:42                 ` Peter Zijlstra
2020-01-21  9:59 ` Srikar Dronamraju
2020-01-29 11:32 ` [tip: sched/core] sched/fair: Allow a small load imbalance between low utilisation SD_NUMA domains tip-bot2 for Mel Gorman
