From: Vincent Guittot <vincent.guittot@linaro.org>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Qais Yousef <qais.yousef@arm.com>,
Quentin Perret <qperret@google.com>,
Pavan Kondeti <pkondeti@codeaurora.org>,
Rik van Riel <riel@surriel.com>,
Lingutla Chandrasekhar <clingutla@codeaurora.org>
Subject: Re: [PATCH 2/2] sched/fair: Relax task_hot() for misfit tasks
Date: Fri, 16 Apr 2021 15:51:13 +0200
Message-ID: <20210416135113.GA16445@vingu-book>
In-Reply-To: <20210415175846.494385-3-valentin.schneider@arm.com>
On Thursday 15 April 2021 at 18:58:46 (+0100), Valentin Schneider wrote:
> Consider the following topology:
>
>   DIE [          ]
>   MC  [    ][    ]
>        0  1  2  3
>
> capacity_orig_of(x \in {0-1}) < capacity_orig_of(x \in {2-3})
>
> w/ CPUs 2-3 idle and CPUs 0-1 running CPU hogs (util_avg=1024).
>
> When CPU2 goes through load_balance() (via periodic / NOHZ balance), it
> should pull one CPU hog from either CPU0 or CPU1 (this is misfit task
> upmigration). However, should a e.g. pcpu kworker awake on CPU0 just before
> this load_balance() happens and preempt the CPU hog running there, we would
> have, for the [0-1] group at CPU2's DIE level:
>
> o sgs->sum_nr_running > sgs->group_weight
> o sgs->group_capacity * 100 < sgs->group_util * imbalance_pct
>
> IOW, this group is group_overloaded.
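
For reference, the two conditions quoted above can be worked through with
numbers. This is only a minimal userspace sketch: struct sg_stats, the
group_overloaded() helper and every value in it (capacity 400 per little
CPU, imbalance_pct of 117) are assumptions made for illustration, and it
models only the two checks listed in the changelog, not the kernel's full
classification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the kernel's struct sg_lb_stats. */
struct sg_stats {
	unsigned int sum_nr_running;	/* runnable tasks in the group */
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned long group_capacity;	/* summed CPU capacity */
	unsigned long group_util;	/* summed CPU utilization */
};

/* Models only the two conditions from the changelog. */
static bool group_overloaded(const struct sg_stats *sgs,
			     unsigned int imbalance_pct)
{
	/* More tasks than CPUs is a precondition. */
	if (sgs->sum_nr_running <= sgs->group_weight)
		return false;

	/* Utilization exceeds capacity once the margin is applied. */
	return sgs->group_capacity * 100 < sgs->group_util * imbalance_pct;
}

/* Assumed scenario: CPUs 0-1, two CPU hogs plus the freshly woken kworker. */
void example_overloaded(void)
{
	struct sg_stats little = {
		.sum_nr_running = 3,
		.group_weight = 2,
		.group_capacity = 800,
		.group_util = 800,
	};

	/* 3 > 2 and 800 * 100 < 800 * 117, so both conditions hold. */
	assert(group_overloaded(&little, 117));
}
```

Without the kworker, sum_nr_running would equal group_weight and the first
check alone would keep the group out of group_overloaded in this sketch.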
>
> Considering CPU0 is picked by find_busiest_queue(), we would then visit the
> preempted CPU hog in detach_tasks(). However, given it has just been
> preempted by this pcpu kworker, task_hot() will prevent it from being
> detached. We then leave load_balance() without having done anything.
>
> Long story short, preempted misfit tasks are affected by task_hot(), while
> currently running misfit tasks are intentionally preempted by the stopper
> task to migrate them over to a higher-capacity CPU.
>
> Align detach_tasks() with the active-balance logic and let it pick a
> cache-hot misfit task when the destination CPU can provide a capacity
> uplift.
>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
> kernel/sched/fair.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d2d1a69d7aa7..43fc98d34276 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7493,6 +7493,7 @@ struct lb_env {
>  	enum fbq_type fbq_type;
>  	enum migration_type migration_type;
>  	enum group_type src_grp_type;
> +	enum group_type dst_grp_type;
>  	struct list_head tasks;
>  };
>
> @@ -7533,6 +7534,31 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
>  	return delta < (s64)sysctl_sched_migration_cost;
>  }
>
> +
> +/*
> + * What does migrating this task do to our capacity-aware scheduling criterion?
> + *
> + * Returns 1, if the task needs more capacity than the dst CPU can provide.
> + * Returns 0, if the task needs the extra capacity provided by the dst CPU
> + * Returns -1, if the task isn't impacted by the migration wrt capacity.
> + */
> +static int migrate_degrades_capacity(struct task_struct *p, struct lb_env *env)
> +{
> +	if (!(env->sd->flags & SD_ASYM_CPUCAPACITY))
> +		return -1;
> +
> +	if (!task_fits_capacity(p, capacity_of(env->src_cpu))) {
> +		if (cpu_capacity_greater(env->dst_cpu, env->src_cpu))
> +			return 0;
> +		else if (cpu_capacity_greater(env->src_cpu, env->dst_cpu))
> +			return 1;
> +		else
> +			return -1;
> +	}

Reaching this point means that the task fits src_cpu's capacity, so why test p against dst_cpu?

> +
> +	return task_fits_capacity(p, capacity_of(env->dst_cpu)) ? -1 : 1;
> +}

I prefer the version below, which is easier to read because the same variables are used everywhere, and it lets you remove cpu_capacity_greater():

static int migrate_degrades_capacity(struct task_struct *p, struct lb_env *env)
{
	unsigned long src_capacity, dst_capacity;

	if (!(env->sd->flags & SD_ASYM_CPUCAPACITY))
		return -1;

	src_capacity = capacity_of(env->src_cpu);
	dst_capacity = capacity_of(env->dst_cpu);

	if (!task_fits_capacity(p, src_capacity)) {
		if (capacity_greater(dst_capacity, src_capacity))
			return 0;
		else if (capacity_greater(src_capacity, dst_capacity))
			return 1;
		else
			return -1;
	}

	return task_fits_capacity(p, dst_capacity) ? -1 : 1;
}
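
To make the three return values concrete, here is a minimal userspace sketch
of the same decision. It is not the kernel code: degrades_capacity() is a
hypothetical stand-in that takes the task utilization and both capacities as
plain numbers, and the two macros are assumptions modeled on the margins the
kernel helpers use (roughly 20% headroom for a fit, roughly 5% for "greater").
The capacities 446 and 1024 are made-up example values.

```c
#include <assert.h>

/* Assumed margins, modeled on the kernel helpers of the same name. */
#define fits_capacity(util, cap)	((util) * 1280 < (cap) * 1024)
#define capacity_greater(cap1, cap2)	((cap1) * 1024 > (cap2) * 1078)

/*
 * Hypothetical stand-in for migrate_degrades_capacity():
 *   1  the task needs more capacity than the destination provides
 *   0  the task needs the extra capacity the destination provides
 *  -1  capacity plays no role in this migration
 */
static int degrades_capacity(unsigned long task_util,
			     unsigned long src_capacity,
			     unsigned long dst_capacity,
			     int asym)
{
	if (!asym)
		return -1;

	if (!fits_capacity(task_util, src_capacity)) {
		if (capacity_greater(dst_capacity, src_capacity))
			return 0;
		if (capacity_greater(src_capacity, dst_capacity))
			return 1;
		return -1;
	}

	return fits_capacity(task_util, dst_capacity) ? -1 : 1;
}

/* Assumed capacities: 446 for a little CPU, 1024 for a big one. */
void example_degrades_capacity(void)
{
	/* Misfit on little, big destination: uplift. */
	assert(degrades_capacity(400, 446, 1024, 1) == 0);
	/* Fits on big, would not fit on little: degradation. */
	assert(degrades_capacity(400, 1024, 446, 1) == 1);
	/* Small task fits everywhere: capacity is irrelevant. */
	assert(degrades_capacity(100, 446, 1024, 1) == -1);
}
```

Only the 0 case relaxes task_hot() in the patch, since the caller tests
!migrate_degrades_capacity(p, env).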
> +
> #ifdef CONFIG_NUMA_BALANCING
> /*
> * Returns 1, if task migration degrades locality
> @@ -7672,6 +7698,15 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  	if (tsk_cache_hot == -1)
>  		tsk_cache_hot = task_hot(p, env);
>
> +	/*
> +	 * On a (sane) asymmetric CPU capacity system, the increase in compute
> +	 * capacity should offset any potential performance hit caused by a
> +	 * migration.
> +	 */
> +	if ((env->dst_grp_type == group_has_spare) &&

Shouldn't this be env->src_grp_type == group_misfit_task, so that it only
covers the misfit task case, as stated in $subject?

> +	    !migrate_degrades_capacity(p, env))
> +		tsk_cache_hot = 0;
> +
>  	if (tsk_cache_hot <= 0 ||
>  	    env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
>  		if (tsk_cache_hot == 1) {
> @@ -9310,6 +9345,7 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
>  	if (!sds.busiest)
>  		goto out_balanced;
>
> +	env->dst_grp_type = local->group_type;
>  	env->src_grp_type = busiest->group_type;
>
>  	/* Misfit tasks should be dealt with regardless of the avg load */
> --
> 2.25.1
>
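
To compare the condition as posted in can_migrate_task() with the alternative
raised in the review, here is a small userspace sketch. Both helpers and the
reduced enum are hypothetical and exist only for illustration; "degrades"
stands for the return value of migrate_degrades_capacity().

```c
#include <assert.h>

/* Reduced, illustrative copy of the group types named in this thread. */
enum group_type {
	group_has_spare,
	group_fully_busy,
	group_misfit_task,
	group_overloaded,
};

/* As posted: relax cache hotness when the local group has spare capacity. */
static int relax_as_posted(int tsk_cache_hot, enum group_type dst_grp_type,
			   int degrades)
{
	if (dst_grp_type == group_has_spare && !degrades)
		return 0;
	return tsk_cache_hot;
}

/* As asked in the review: key on the busiest group being misfit instead. */
static int relax_as_suggested(int tsk_cache_hot, enum group_type src_grp_type,
			      int degrades)
{
	if (src_grp_type == group_misfit_task && !degrades)
		return 0;
	return tsk_cache_hot;
}

void example_relax(void)
{
	/* Cache-hot task, capacity uplift (degrades == 0), idle destination. */
	assert(relax_as_posted(1, group_has_spare, 0) == 0);
	/* Same task, source group classified as misfit. */
	assert(relax_as_suggested(1, group_misfit_task, 0) == 0);
	/* Same task, source group classified as overloaded. */
	assert(relax_as_suggested(1, group_overloaded, 0) == 1);
}
```

With these inputs the two variants agree when the source group is
group_misfit_task and differ when it is group_overloaded, which is how the
changelog classifies the [0-1] group in its example.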