All of lore.kernel.org
 help / color / mirror / Atom feed
From: Valentin Schneider <valentin.schneider@arm.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched/fair: handle case of task_h_load() returning 0
Date: Thu, 02 Jul 2020 17:11:15 +0100	[thread overview]
Message-ID: <jhjh7up99ss.mognet@arm.com> (raw)
In-Reply-To: <20200702144258.19326-1-vincent.guittot@linaro.org>


On 02/07/20 15:42, Vincent Guittot wrote:
> task_h_load() can return 0 in some situations like running stress-ng
> mmapfork, which forks thousands of threads, in a sched group on a 224 cores
> system. The load balance doesn't handle this correctly because
> env->imbalance never decreases and it will stop pulling tasks only after
> reaching loop_max, which can be equal to the number of running tasks of
> the cfs. Make sure that imbalance will be decreased by at least 1.
>
> misfit task is the other feature that doesn't handle correctly such
> situation although it's probably more difficult to face the problem
> because of the smaller number of CPUs and running tasks on heterogenous
> system.
>
> We can't simply ensure that task_h_load() returns at least one because it
> would imply to handle underrun in other places.

Nasty one, that...

Random thought: isn't that the kind of thing we have scale_load() and
scale_load_down() for? There's more uses of task_h_load() than I would like
for this, but if we upscale its output (or introduce an upscaled variant),
we could do something like:

---
detach_tasks()
{
        long imbalance = env->imbalance;

        if (env->migration_type == migrate_load)
                imbalance = scale_load(imbalance);

        while (!list_empty(tasks)) {
                /* ... */
                switch (env->migration_type) {
                case migrate_load:
                        load = task_h_load_upscaled(p);
                        /* ... usual bits here ...*/
                        lsub_positive(&env->imbalance, load);
                        break;
                        /* ... */
                }

                if (!scale_load_down(env->imbalance))
                        break;
        }
}
---

It's not perfect, and there's still the misfit situation to sort out -
still, do you think this is something we could go towards?

>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6fab1d17c575..62747c24aa9e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4049,7 +4049,13 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>               return;
>       }
>
> -	rq->misfit_task_load = task_h_load(p);
> +	/*
> +	 * Make sure that misfit_task_load will not be null even if
> +	 * task_h_load() returns 0. misfit_task_load is only used to select
> +	 * rq with highest load so adding 1 will not modify the result
> +	 * of the comparison.
> +	 */
> +	rq->misfit_task_load = task_h_load(p) + 1;
>  }
>
>  #else /* CONFIG_SMP */
> @@ -7664,6 +7670,16 @@ static int detach_tasks(struct lb_env *env)
>                           env->sd->nr_balance_failed <= env->sd->cache_nice_tries)
>                               goto next;
>
> +			/*
> +			 * Depending of the number of CPUs and tasks and the
> +			 * cgroup hierarchy, task_h_load() can return a null
> +			 * value. Make sure that env->imbalance decreases
> +			 * otherwise detach_tasks() will stop only after
> +			 * detaching up to loop_max tasks.
> +			 */
> +			if (!load)
> +				load = 1;
> +
>                       env->imbalance -= load;
>                       break;

  reply	other threads:[~2020-07-02 16:11 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-02 14:42 [PATCH] sched/fair: handle case of task_h_load() returning 0 Vincent Guittot
2020-07-02 16:11 ` Valentin Schneider [this message]
2020-07-02 16:28   ` Vincent Guittot
2020-07-07 13:30     ` Vincent Guittot
2020-07-08 10:34       ` Valentin Schneider
2020-07-08  9:44 ` Dietmar Eggemann
2020-07-08  9:47   ` Vincent Guittot
2020-07-09 13:34     ` Dietmar Eggemann
2020-07-09 13:52       ` Vincent Guittot
2020-07-09 13:06 ` Valentin Schneider
2020-07-09 13:51   ` Vincent Guittot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jhjh7up99ss.mognet@arm.com \
    --to=valentin.schneider@arm.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.