From: Brendan Jackman <brendan.jackman@arm.com>
To: Joel Fernandes <joelaf@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Andres Oportus <andresoportus@google.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Josef Bacik <josef@toxicpanda.com>,
Mike Galbraith <efault@gmx.de>,
Matt Fleming <matt@codeblueprint.co.uk>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Morten Rasmussen <morten.rasmussen@arm.com>
Subject: Re: [RFC] sched/fair: Use wake_q length as a hint for wake_wide
Date: Wed, 20 Sep 2017 10:33:07 +0100 [thread overview]
Message-ID: <87poal1zqk.fsf@arm.com> (raw)
In-Reply-To: <CAJWu+oqqroSv=GO3FZgDa0FUTn5OuVjZxnBrgYJHVrzd0b2RrA@mail.gmail.com>
On Wed, Sep 20 2017 at 05:06, Joel Fernandes wrote:
>> On Tue, Sep 19, 2017 at 3:05 AM, Brendan Jackman
>> <brendan.jackman@arm.com> wrote:
>>> On Mon, Sep 18 2017 at 22:15, Joel Fernandes wrote:
> [..]
>>>>> IIUC, if wake_affine() behaves correctly this trick wouldn't be
>>>>> necessary on SMP systems, so it might be best guarded by the presence
>>>>
>>>> Actually wake_affine doesn't check for balance if previous/next cpu
>>>> are within the same shared cache domain. The difference is some time
>>>> ago it would return true for shared cache but now it returns false as
>>>> of 4.14-rc1:
>>>> http://elixir.free-electrons.com/linux/v4.14-rc1/source/kernel/sched/fair.c#L5466
>>>>
>>>> Since it would return false in the above wake up cases for task 1 and
>>>> 2, it would then run select_idle_sibling on the previous CPU which is
>>>> also within the big cluster, so I don't think it will make a
>>>> difference in this case... Infact what it returns probably doesn't
>>>> matter.
>>>
>>> So my paragraph here was making a leap in reasoning, let me try to fill
>>> the gap: On SMP these tasks never need to move around. If by some chance
>>> they did get coscheduled, the first load balance would spread them out and
>>> then every time they wake up from then on, prev_cpu is the sensible
>>> choice. So it will look something like:
>>>
>>> v CPU v ->time->
>>>
>>> -------------
>>> { 0 (SAME) 11111111111
>>> cache { -------------
>>> { 1 (SAME) 222222222222|
>>> -------------
>>> { 2 (SAME) 33333333333
>>> cache { -------------
>>> { 3 (SAME) 44444444444
>>> -------------
>>>
>>> So here, task 2 wakes up the other guys and when it's doing tasks 3 and
>>> 4, prev_cpu and smp_processor_id() don't share a cache, so IIUC its'
>>> basically wake_affine's job to decide between prev_cpu and
>>> smp_processor_id(). So "if wake_affine is behaving correctly" the
>>> problem that this patch aims to solve (i.e. the fact that we overload
>>> the waker's LLC domain because of bias towards prev_cpu) does not arise
>>> on SMP.
>>>
>>
>> Yes SMP, but your patch is for solving a problem for non-SMP. So your
>> original statement about wake_affine solving any problem for SMP is
>> not relevant I feel :-P. I guess you can just kill this para from the
>> commit message to prevent confusion.
>
> Ok I take that back, you were talking about guarding this feature by
> the SD_ASYM_CPUCAPACITY flag.
>
> I don't think that protection would be helpful because you can have
> the same issue if the tasks do different amount of work on SMP. So in
> that case some threads might still complete before the others and you
> run into the same thing.
Well assuming we're still talking about one task per CPU, if you have
tasks doing different amount of work there's still no reason to move the
longer-running threads around. The only reason that happens in my
example is because of the asym capacity.
There might be a case on SMP where this would have a benefit, but the
reason I suggested disabling it for SMP is because the cost of this
patch is that we use the slow path a whole load more for certain usage
patterns. On big.LITTLE etc. my thinking is that's the price we have to
pay to get task placement right; on SMP I don't think we'd want to pay
that price unless we see a good reason.
next prev parent reply other threads:[~2017-09-20 9:33 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-11 9:45 [RFC] sched/fair: Use wake_q length as a hint for wake_wide Brendan Jackman
2017-09-19 5:15 ` Joel Fernandes
2017-09-19 10:05 ` Brendan Jackman
2017-09-20 4:39 ` Joel Fernandes
2017-09-20 5:06 ` Joel Fernandes
2017-09-20 9:33 ` Brendan Jackman [this message]
2017-09-20 20:23 ` Joel Fernandes
2017-09-20 21:17 ` Atish Patra
2017-09-21 5:50 ` Joel Fernandes
2017-09-21 16:51 ` Atish Patra
2017-09-20 9:41 ` Brendan Jackman
2017-09-20 19:36 ` Atish Patra
2017-09-29 17:05 ` Brendan Jackman
2017-10-02 10:46 ` Brendan Jackman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87poal1zqk.fsf@arm.com \
--to=brendan.jackman@arm.com \
--cc=andresoportus@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=efault@gmx.de \
--cc=joelaf@google.com \
--cc=josef@toxicpanda.com \
--cc=linux-kernel@vger.kernel.org \
--cc=matt@codeblueprint.co.uk \
--cc=mingo@redhat.com \
--cc=morten.rasmussen@arm.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox