From: Valentin Schneider <valentin.schneider@arm.com>
To: Quentin Perret <qperret@google.com>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	peterz@infradead.org, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, morten.rasmussen@arm.com
Subject: Re: [PATCH v3 2/7] sched/topology: Define and assign sched_domain flag metadata
Date: Thu, 02 Jul 2020 17:25:41 +0100
Message-ID: <jhjfta9994q.mognet@arm.com>
In-Reply-To: <20200702154514.GA1072702@google.com>

On 02/07/20 16:45, Quentin Perret wrote:
> On Thursday 02 Jul 2020 at 15:31:07 (+0100), Valentin Schneider wrote:
>> There's an "interesting" quirk of asym_cpu_capacity_level() in that it does
>> something slightly different than what it says on the tin: it detects
>> the lowest topology level where *the biggest* CPU capacity is visible by
>> all CPUs. That works just fine on big.LITTLE, but there are questionable
>> DynamIQ topologies that could hit some issues.
>>
>> Consider:
>>
>>   DIE [                      ]
>>   MC  [              ][      ] <- sd_asym_cpucapacity
>>        0   1   2   3   4   5
>>        L   L   B   B   B   B
>>
>> asym_cpu_capacity_level() would pick MC as the asymmetric topology level,
>> and you can argue either way: it should be DIE, because that's where CPUs 4
>> and 5 can see a LITTLE, or it should be MC, at least for CPUs 0-3 because
>> there they see all CPU capacities.
>
> Right, I am not looking forward to these topologies...

I'll try my best to prevent those from seeing the light of day, but you
know how this works...

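To make the quirk in the first diagram concrete, here is a toy model of it. It is only a sketch of the behaviour as described in the quoted text, not the kernel implementation: the capacity values (512 and 1024), the group tables and the helper names are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   6
#define NR_LEVELS 2	/* 0 = MC, 1 = DIE */

/* Assumed capacities: CPUs 0-1 are LITTLE (512), CPUs 2-5 are big (1024). */
static const int capacity[NR_CPUS] = { 512, 512, 1024, 1024, 1024, 1024 };

/* Group id of each CPU at each level: MC [0-3][4-5], DIE [0-5]. */
static const int group[NR_LEVELS][NR_CPUS] = {
	{ 0, 0, 0, 0, 1, 1 },
	{ 0, 0, 0, 0, 0, 0 },
};

/* Does @cpu see a CPU of capacity @cap within its span at @level? */
static bool sees_capacity(int cpu, int level, int cap)
{
	assert(cpu >= 0 && cpu < NR_CPUS && level >= 0 && level < NR_LEVELS);

	for (int i = 0; i < NR_CPUS; i++)
		if (group[level][i] == group[level][cpu] && capacity[i] == cap)
			return true;
	return false;
}

/* Lowest level at which @cpu sees capacity @cap, or -1 if it never does. */
static int lowest_level_seeing(int cpu, int cap)
{
	for (int level = 0; level < NR_LEVELS; level++)
		if (sees_capacity(cpu, level, cap))
			return level;
	return -1;
}

/* Lowest level at which every CPU sees capacity @cap, or -1 if none. */
static int lowest_level_all_see(int cap)
{
	int level = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		int l = lowest_level_seeing(cpu, cap);

		if (l < 0)
			return -1;
		if (l > level)
			level = l;
	}
	return level;
}
```

In this model lowest_level_all_see(1024) returns 0 (MC), since every CPU already has a big CPU in its MC span, while lowest_level_seeing(4, 512) returns 1 (DIE), since CPUs 4 and 5 only see a LITTLE at DIE. That gap between the two answers is the "you can argue either way" part.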
>> Say there are two clusters in the system, one with a lone big CPU and the
>> other with a mix of big and LITTLE CPUs:
>>
>>   DIE [                   ]
>>   MC  [              ][   ]
>>        0   1   2   3   4
>>        L   L   B   B   B
>>
>> asym_cpu_capacity_level() will figure out that the MC level is the one
>> where all CPUs can see a CPU of max capacity, and we will thus set
>> SD_ASYM_CPUCAPACITY at MC level for all CPUs.
>>
>> That lone big CPU will degenerate its MC domain, since it would be alone in
>> there, and will end up with just a DIE domain. Since the flag was only set
>> at MC, this CPU ends up not seeing any SD with the flag set, which is
>> broken.
>
> +1
>
>> Rather than clearing dflags at every topology level, clear it before
>> entering the topology level loop. This will properly propagate upwards
>> flags that are set starting from a certain level.
>
> I'm feeling a bit nervous about that asymmetry -- in your example
> select_idle_capacity() on, say, CPU3 will see fewer CPUs than on CPU4.
> So, you might get fun side-effects where all tasks migrated to CPUs 0-3
> will be 'stuck' there while CPU 4 stays mostly idle.
>

It's actually pretty close to what happens with the LLC domain on SMP -
select_idle_sibling() doesn't look outside of it. The wake_affine() stuff
might steer the task towards a different LLC, but that's about it for
wakeups. We rely on load balancing (fork/exec, newidle, nohz and periodic)
to spread this further - and we would here too.

It gets "funny" for EAS when we aren't overutilized and thus can't rely on
load balancing; at least misfit ought to still work. It *is* a weird
topology, for sure.

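For the second diagram (MC [0-3][4]), the effect of clearing dflags once per CPU rather than at every level, and the span asymmetry you point out, can be sketched with a toy model. This is not the topology code itself: TOY_SD_ASYM, the group table and the helpers are stand-ins, and "a domain spanning a single CPU degenerates" is a simplification of the real degeneration rules.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS     5
#define NR_LEVELS   2	/* 0 = MC, 1 = DIE */
#define ASYM_LEVEL  0	/* level picked in the example: MC */
#define TOY_SD_ASYM 0x1	/* stand-in for SD_ASYM_CPUCAPACITY */

/* Group id of each CPU at each level: MC [0-3][4], DIE [0-4]. */
static const int group[NR_LEVELS][NR_CPUS] = {
	{ 0, 0, 0, 0, 1 },
	{ 0, 0, 0, 0, 0 },
};

/* Number of CPUs sharing @cpu's domain at @level. */
static int span_weight(int cpu, int level)
{
	int w = 0;

	assert(cpu >= 0 && cpu < NR_CPUS && level >= 0 && level < NR_LEVELS);

	for (int i = 0; i < NR_CPUS; i++)
		w += group[level][i] == group[level][cpu];
	return w;
}

/*
 * Span weight of the lowest surviving domain of @cpu that carries the toy
 * flag, or 0 if no such domain exists. Single-CPU domains are treated as
 * degenerate and skipped.
 */
static int asym_span_weight(int cpu, bool clear_once)
{
	int dflags = 0;

	for (int level = 0; level < NR_LEVELS; level++) {
		if (!clear_once)
			dflags = 0;	/* old behaviour: reset at every level */
		if (level == ASYM_LEVEL)
			dflags |= TOY_SD_ASYM;
		if (span_weight(cpu, level) > 1 && (dflags & TOY_SD_ASYM))
			return span_weight(cpu, level);
	}
	return 0;
}
```

With the per-level reset, asym_span_weight(4, false) is 0: CPU 4 loses its MC domain and DIE never got the flag. With the flags cleared once, asym_span_weight(4, true) is 5 while asym_span_weight(3, true) is 4, which is the "CPU3 sees fewer CPUs than CPU4" asymmetry.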
> I have a few ideas to avoid that (e.g. looking at the rd span in
> select_idle_capacity() instead of sd_asym_cpucapacity) but all this is
> theoretical, so I'm happy to wait for a real platform to be released
> before we worry too much about it.
>
> In the meantime:
>
> Reviewed-by: Quentin Perret <qperret@google.com>

Thanks!