linux-kernel.vger.kernel.org archive mirror
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
Cc: Mel Gorman <mgorman@suse.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"juri.lelli@redhat.com" <juri.lelli@redhat.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
	"bsegall@google.com" <bsegall@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linuxarm <linuxarm@huawei.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Phil Auld <pauld@redhat.com>, Hillf Danton <hdanton@sina.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
Date: Mon, 7 Sep 2020 18:01:06 +0530
Message-ID: <20200907123106.GA28232@linux.vnet.ibm.com>
In-Reply-To: <9c8e148805bc4da1a2bfdd72f1c76f06@hisilicon.com>

> > 
> > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > > Something is wrong. In find_busiest_group(), we are checking if src has
> > > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > > will have higher load after balancing. It seems it is not sensible to
> > > check src.
> > > It may cause a wrong imbalance value. For example, if
> > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > > src_running = env->src_stats.nr_running - 1 results in 1,
> > > the current code treats the imbalance as 0 since src_running is smaller
> > > than 2.
> > > This is inconsistent with the load balancer.
> > >

I have observed behaviour similar to what Barry Song has documented, using a
simple ebizzy run with few threads on a 2-node system:

ebizzy -t 6 -S 100

We see a couple of ebizzy threads moving back and forth between the 2 nodes
because the NUMA balancer and the load balancer are trying to do exact
opposites.

However, with Barry's patch, a couple of tests regress heavily (any NUMA
workload that has shared NUMA faults).
For example:
perf bench numa mem --no-data_rand_walk -p 1 -t 6 -G 0 -P 3072 -T 0 -l 50 -c

I also don't understand the rationale behind checking dst_running in the NUMA
balancer path. This almost means no NUMA balancing in lightly loaded scenarios.

So I agree with Mel that we should probably test more scenarios before
we accept this patch.

> > 
> > It checks the conditions if the move was to happen. Have you evaluated
> > this for a NUMA balancing load and confirmed it a) balances properly and
> > b) does not increase the scan rate trying to "fix" the problem?
> 
> I think the original code was trying to check if the numa migration
> would lead to new imbalance in load balancer. In case src is A, dst is B, and
> both of them have nr_running as 2. A moves one task to B, then A
> will have 1, B will have 3. In load balancer, A will try to pull task
> from B since B's nr_running is larger than min_imbalance. But the code
> is saying imbalance=0 by finding A's nr_running is smaller than
> min_imbalance.
> 
> Will share more test data if you need.
> 
> > 
> > --
> > Mel Gorman
> > SUSE Labs
> 
> Thanks
> Barry
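
The A/B example quoted above can be sketched in user-space C. This is only an
illustration of the arithmetic under discussion, not the kernel code: the
helper is modelled on the kernel's adjust_numa_imbalance() as described in this
thread, while IMBALANCE_MIN, imbalance_after_move() and the use_dst switch are
hypothetical names introduced here. The threshold of 2 is taken from the
"smaller than 2" remark in the patch description.

```c
/* Assumed threshold below which a small imbalance is tolerated. */
#define IMBALANCE_MIN 2

/*
 * Modelled on the helper discussed in the thread: report no imbalance
 * when the group being checked runs IMBALANCE_MIN tasks or fewer.
 */
static int adjust_numa_imbalance(int imbalance, int nr_running)
{
	if (nr_running <= IMBALANCE_MIN)
		return 0;
	return imbalance;
}

/*
 * Hypothetical wrapper: compute the imbalance that would exist after
 * moving one task from src to dst. use_dst == 0 checks src_running
 * (the behaviour the patch description objects to), use_dst != 0
 * checks dst_running (the behaviour the patch proposes).
 */
static int imbalance_after_move(int src_nr_running, int dst_nr_running,
				int use_dst)
{
	int src_running = src_nr_running - 1;
	int dst_running = dst_nr_running + 1;
	int imbalance = dst_running - src_running;

	if (imbalance < 0)
		imbalance = 0;

	return adjust_numa_imbalance(imbalance,
				     use_dst ? dst_running : src_running);
}
```

With A and B both at nr_running = 2, imbalance_after_move(2, 2, 0) returns 0
(src_running is 1, so the imbalance is ignored), while
imbalance_after_move(2, 2, 1) returns 2 (dst_running is 3, above the
threshold). The second form reports an imbalance more often when few tasks are
running, which is the concern raised above for lightly loaded systems.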

-- 
Thanks and Regards
Srikar Dronamraju

Thread overview: 8+ messages
2020-09-07  7:27 [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer Barry Song
2020-09-07  9:27 ` Mel Gorman
2020-09-07  9:44   ` Song Bao Hua (Barry Song)
2020-09-07 10:44     ` Mel Gorman
2020-09-07 12:31     ` Srikar Dronamraju [this message]
     [not found]       ` <20200908010717.12436-1-hdanton@sina.com>
2020-09-10 21:50         ` Jirka Hladky
2020-09-21 11:02           ` Mel Gorman
2020-09-21 16:02             ` Jirka Hladky
