From: Rik van Riel <riel@redhat.com>
To: Preeti Murthy <preeti.lkml@gmail.com>, umgwanakikbuti@gmail.com
Cc: LKML <linux-kernel@vger.kernel.org>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
george.mccollister@gmail.com, ktkhai@parallels.com,
Preeti U Murthy <preeti@linux.vnet.ibm.com>
Subject: Re: [PATCH RFC/TEST] sched: make sync affine wakeups work
Date: Sun, 04 May 2014 08:41:09 -0400
Message-ID: <53663565.9080306@redhat.com>
In-Reply-To: <CAM4v1pPwe4B0K8MPpf183LqabRoRKRPi_R7n8-Y02aR43M8iQQ@mail.gmail.com>
On 05/04/2014 07:44 AM, Preeti Murthy wrote:
> Hi Rik, Mike
>
> On Fri, May 2, 2014 at 12:00 PM, Rik van Riel <riel@redhat.com> wrote:
>> On 05/02/2014 02:13 AM, Mike Galbraith wrote:
>>> On Fri, 2014-05-02 at 00:42 -0400, Rik van Riel wrote:
>>>
>>>> Whether or not this is the right thing to do remains to be seen,
>>>> but it does allow us to verify whether or not the wake_affine
>>>> strategy of always doing affine wakeups and only disabling them
>>>> in a specific circumstance is sound, or needs rethinking...
>>>
>>> Yes, it needs rethinking.
>>>
>>> I know why you want to try this, yes, select_idle_sibling() is very much
>>> a two faced little bitch.
>>
>> My biggest problem with select_idle_sibling and wake_affine in
>> general is that it will override NUMA placement, even when
>> processes only wake each other up infrequently...
>
> As far as my understanding goes, the logic in select_task_rq_fair()
> does wake_affine() or calls select_idle_sibling() only at those
> levels of sched domains where the flag SD_WAKE_AFFINE is set.
> This flag is not set at the numa domain and hence they will not be
> balancing across numa nodes. So I don't understand how
> *these functions* are affecting NUMA placements.
Even on 8-node DL980 systems, the NUMA distance in the
SLIT table is less than RECLAIM_DISTANCE, and we will
do wake_affine across the entire system.
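To illustrate the point with a toy model (not kernel code): assume, as described above, that affine wakeups stay enabled across a NUMA level as long as the SLIT distance is below the threshold. Then a machine whose largest remote distance is small never loses them anywhere. The threshold value, the helper name toy_wake_affine_system_wide() and the distance tables in the note below are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for the kernel's RECLAIM_DISTANCE; the value is illustrative. */
#define TOY_RECLAIM_DISTANCE 30

/*
 * Hypothetical helper: returns true when every node-to-node distance in
 * the row-major nr_nodes x nr_nodes table is below the threshold, i.e.
 * when affine wakeups would be allowed across the whole system under
 * the assumption stated above.
 */
static bool toy_wake_affine_system_wide(const int *slit, size_t nr_nodes)
{
	assert(nr_nodes > 0);

	for (size_t i = 0; i < nr_nodes * nr_nodes; i++) {
		if (slit[i] >= TOY_RECLAIM_DISTANCE)
			return false;
	}
	return true;
}
```

With a made-up 2-node table of {10, 21, 21, 10} the helper returns true, so nothing confines affine wakeups to a single node. Raising the remote distance to 40 makes it return false.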
> The wake_affine() and select_idle_sibling() will shuttle tasks
> within a NUMA node as far as I can see, i.e. if the cpu that the task
> previously ran on and the waker cpu belong to the same node.
> Else they are not called.
That is what I first hoped, too. I was wrong.
> If the prev_cpu and the waker cpu are on different NUMA nodes
> then naturally the tasks will get shuttled across NUMA nodes but
> the culprits are the find_idlest* functions.
> They do a top-down search for the idlest group and cpu, starting
> at the NUMA domain *attached to the waker and not the prev_cpu*.
> This means that the task will end up on a different NUMA node.
> Looks to me that the problem lies here and not in the wake_affine()
> and select_idle_sibling().
I have a patch for find_idlest_group that takes the NUMA
distance between each group and the task's preferred node
into account.
However, as long as the wake_affine stuff still gets to
override it, that does not make much difference :)
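The general idea of weighing NUMA distance in the idlest-group search can be sketched as follows. This is not the patch mentioned above, only a toy model of one possible heuristic: each group's load is scaled by the distance between its node and the task's preferred node, so a slightly busier group close to the preferred node can beat a slightly idler remote one. The struct, the function name and the scoring formula are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a scheduling group. */
struct toy_group {
	int node;           /* NUMA node the group belongs to */
	unsigned long load; /* aggregate load of the group */
};

/*
 * Returns the index of the group with the lowest distance-weighted load.
 * dist_from_preferred[n] is the assumed distance from the task's preferred
 * node to node n (10 meaning local). The "+ 1" keeps fully idle groups
 * comparable, so that a nearer idle group wins over a farther idle one.
 */
static size_t toy_find_idlest_group(const struct toy_group *groups, size_t nr,
				    const int *dist_from_preferred)
{
	size_t best = 0;
	unsigned long best_score = 0;

	assert(nr > 0);

	for (size_t i = 0; i < nr; i++) {
		unsigned long dist =
			(unsigned long)dist_from_preferred[groups[i].node];
		unsigned long score = (groups[i].load + 1) * dist;

		if (i == 0 || score < best_score) {
			best = i;
			best_score = score;
		}
	}
	return best;
}
```

With a local group at load 100 and a remote group (distance 21) at load 90, this picks the local group even though the remote one is idler. A remote group at load 10 still wins, so the search can leave the preferred node when the imbalance is large.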
--
All rights reversed