From: Mel Gorman <mgorman@techsingularity.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Phil Auld <pauld@redhat.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
Quentin Perret <quentin.perret@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Morten Rasmussen <Morten.Rasmussen@arm.com>,
Hillf Danton <hdanton@sina.com>, Parth Shah <parth@linux.ibm.com>,
Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH v4 04/11] sched/fair: rework load_balance
Date: Thu, 31 Oct 2019 11:40:20 +0000
Message-ID: <20191031114020.GQ3016@techsingularity.net>
In-Reply-To: <CAKfTPtByO7oLQZxF_+-FxZ9u1JhO24-rujW3j-QDqr+PFDOQ=Q@mail.gmail.com>
On Thu, Oct 31, 2019 at 12:13:09PM +0100, Vincent Guittot wrote:
> > > > On the last one, spreading tasks evenly across NUMA domains is not
> > > > necessarily a good idea. If I have 2 tasks running on a 2-socket machine
> > > > with 24 logical CPUs per socket, it should not automatically mean that
> > > > one task should move cross-node and I have definitely observed this
> > > > happening. It's probably bad in terms of locality no matter what but it's
> > > > especially bad if the 2 tasks happened to be communicating because then
> > > > load balancing will pull apart the tasks while wake_affine will push
> > > > them together (and potentially NUMA balancing as well). Note that this
> > > > also applies for some IO workloads because, depending on the filesystem,
> > > > the task may be communicating with workqueues (XFS) or a kernel thread
> > > > (ext4 with jbd2).
> > >
> > > This rework doesn't touch the NUMA_BALANCING part and NUMA balancing
> > > still gives guidance with fbq_classify_group/queue.
> >
> > I know the NUMA_BALANCING part is not touched, I'm talking about load
> > balancing across SD_NUMA domains which happens independently of
> > NUMA_BALANCING. In fact, there is logic in NUMA_BALANCING that tries to
> > override the load balancer when it moves tasks away from the preferred
> > node.
>
> Yes, this patchset relies on this override for now to prevent moving tasks away.
Fair enough, netperf hits the corner case where it does not work but
that is also true without your series.
> I agree that additional patches are probably needed to improve load
> balance at NUMA level and I expect that this rework will make it
> simpler to add.
> I just wanted to get the output of some real use cases before defining
> more NUMA-level-specific conditions. Some want to spread across their NUMA
> nodes but others want to keep everything together. The preferred node
> and fbq_classify_group were the only sensible metrics to me when I
> wrote this patchset but changes can be added if they make sense.
>
That's fair. While it was possible to address the case before your
series, it was a hatchet job. If the changelog simply notes that some
special casing may still be required for SD_NUMA but it's outside the
scope of the series, then I'd be happy. At least then there is a good
chance that any follow-up work won't be interpreted as an attempt to
reintroduce hacky heuristics.
> >
> > > But the latter could also take advantage of the new type of group. For
> > > example, what I did in the fix for find_idlest_group: checking
> > > numa_preferred_nid when the group has capacity and keeping the task on
> > > the preferred node if possible. Similar behavior could also be beneficial
> > > in the periodic load_balance case.
> > >
> >
> > And this is the catch -- numa_preferred_nid is not guaranteed to be set at
> > all. NUMA balancing might be disabled, the task may not have been running
> > long enough to pick a preferred NID or NUMA balancing might be unable to
> > pick a preferred NID. The decision to avoid unnecessary migrations across
> > NUMA domains should be made independently of NUMA balancing. The netperf
> > configuration from mmtests is great at illustrating the point because it'll
> > also say what the average local/remote access ratio is. 2 communicating
> > tasks running on an otherwise idle NUMA machine should not have the load
> > balancer move the server to one node and the client to another.
>
> I'm going to give it a try on my setup to see the results.
>
Thanks.
--
Mel Gorman
SUSE Labs