Re: [PATCH v4 2/2] sched/fair: update scale invariance of PELT

From: Pavan Kondeti <pkondeti@codeaurora.org>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: peterz@infradead.org, mingo@kernel.org,
	linux-kernel@vger.kernel.org, rjw@rjwysocki.net,
	dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com,
	patrick.bellasi@arm.com, pjt@google.com, bsegall@google.com,
	thara.gopinath@linaro.org
Subject: Re: [PATCH v4 2/2] sched/fair: update scale invariance of PELT
Date: Tue, 23 Oct 2018 11:29:37 +0530
Message-ID: <20181023055937.GC27587@codeaurora.org>
In-Reply-To: <1539965871-22410-3-git-send-email-vincent.guittot@linaro.org>

Hi Vincent,

On Fri, Oct 19, 2018 at 06:17:51PM +0200, Vincent Guittot wrote:
>  
>  /*
> + * The clock_pelt scales the time to reflect the effective amount of
> + * computation done during the running delta time but then sync back to
> + * clock_task when rq is idle.
> + *
> + *
> + * absolute time   | 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16
> + * @ max capacity  ------******---------------******---------------
> + * @ half capacity ------************---------************---------
> + * clock pelt      | 1| 2|    3|    4| 7| 8| 9|   10|   11|14|15|16
> + *
> + */
> +void update_rq_clock_pelt(struct rq *rq, s64 delta)
> +{
> +
> +	if (is_idle_task(rq->curr)) {
> +		u32 divider = (LOAD_AVG_MAX - 1024 + rq->cfs.avg.period_contrib) << SCHED_CAPACITY_SHIFT;
> +		u32 overload = rq->cfs.avg.util_sum + LOAD_AVG_MAX;
> +		overload += rq->avg_rt.util_sum;
> +		overload += rq->avg_dl.util_sum;
> +
> +		/*
> +		 * Reflecting some stolen time makes sense only if the idle
> +		 * phase would be present at max capacity. As soon as the
> +		 * utilization of a rq has reached the maximum value, it is
> +		 * considered as an always running rq without idle time to
> +		 * steal. This potential idle time is considered as lost in
> +		 * this case. We keep track of this lost idle time compared to
> +		 * rq's clock_task.
> +		 */
> +		if (overload >= divider)
> +			rq->lost_idle_time += rq_clock_task(rq) - rq->clock_pelt;
> +

I am trying to understand this better. I believe we run into this scenario when
the frequency is limited due to thermal/userspace constraints. Let's say the
frequency is limited to Fmax/2. A task that is 50% at Fmax becomes 100% when
running at Fmax/2. The utilization builds up to 100% after several periods.
The clock_pelt runs at 1/2 the speed of clock_task. We are losing the idle time
all along. What happens when the CPU enters idle for a short duration and comes
back to run this 100% utilization task?
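
To put numbers on the Fmax/2 case, here is a minimal userspace sketch (not
kernel code). The helpers toy_wall_runtime() and toy_idle_left() are
hypothetical names, and the 16ms period with 8ms of work is an assumed
example, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1 << SCHED_CAPACITY_SHIFT)

/*
 * Hypothetical helper: wall-clock time needed to finish an amount of work
 * that takes work_at_max_ns at full capacity, when the CPU only delivers
 * freq_cap out of SCHED_CAPACITY_SCALE.
 */
static uint64_t toy_wall_runtime(uint64_t work_at_max_ns, uint64_t freq_cap)
{
	assert(freq_cap > 0 && freq_cap <= SCHED_CAPACITY_SCALE);
	return (work_at_max_ns << SCHED_CAPACITY_SHIFT) / freq_cap;
}

/* Hypothetical helper: idle time left in one period at that capacity. */
static uint64_t toy_idle_left(uint64_t period_ns, uint64_t work_at_max_ns,
			      uint64_t freq_cap)
{
	uint64_t run = toy_wall_runtime(work_at_max_ns, freq_cap);

	/* The task cannot finish within the period: no idle time remains. */
	return run >= period_ns ? 0 : period_ns - run;
}
```

With a 16ms period and 8ms of work, toy_idle_left() leaves 8ms of idle time
at capacity 1024 and none at capacity 512, which is the "50% becomes 100%"
situation described above.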

If the above block is not present, i.e. lost_idle_time is not tracked, we
stretch the idle time (since clock_pelt is synced to clock_task) and the
utilization drops. Right?

With the above block, we don't stretch the idle time. In fact, we don't
consider the idle time at all, because:

idle_time = now - last_time

idle_time = (rq->clock_pelt - rq->lost_idle_time) - last_time
idle_time = (rq->clock_task - (rq->clock_task - rq->clock_pelt_old)) - last_time
idle_time = rq->clock_pelt_old - last_time

Here rq->clock_pelt_old is the value of rq->clock_pelt just before it is synced
to clock_task, and last_time is the last snapshot of rq->clock_pelt, taken when
the task went to sleep and the CPU consequently entered idle.
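
The same arithmetic as a small userspace sketch. struct toy_rq and both
helpers are hypothetical stand-ins, and I am assuming the PELT "now" is
clock_pelt minus lost_idle_time, as in the derivation above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the three fields used here. */
struct toy_rq {
	uint64_t clock_task;
	uint64_t clock_pelt;
	uint64_t lost_idle_time;
};

/* Mirrors the idle branch of the quoted hunk; overloaded is passed in. */
static void toy_sync_on_idle(struct toy_rq *rq, int overloaded)
{
	assert(rq->clock_task >= rq->clock_pelt);
	if (overloaded)
		rq->lost_idle_time += rq->clock_task - rq->clock_pelt;
	rq->clock_pelt = rq->clock_task;
}

/* Assumed definition of the time base seen by PELT after the sync. */
static uint64_t toy_pelt_now(const struct toy_rq *rq)
{
	return rq->clock_pelt - rq->lost_idle_time;
}
```

Starting from clock_task = 2000 and clock_pelt = 1000, the overloaded case
leaves toy_pelt_now() at 1000 (the old clock_pelt), while the non-overloaded
case jumps it to 2000.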

Can you please explain the significance of the above block with an example?
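
For reference, the condition guarding the block can be evaluated on its own.
This is only a userspace sketch of the quoted lines: toy_rq_overloaded() is a
hypothetical name, the util_sum inputs are made-up values, and 47742 is
LOAD_AVG_MAX as I understand it from the kernel's PELT constants.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX         47742	/* assumed kernel PELT constant */
#define SCHED_CAPACITY_SHIFT 10

/* Returns non-zero when the quoted "overload >= divider" test would pass. */
static int toy_rq_overloaded(uint32_t cfs_util_sum, uint32_t rt_util_sum,
			     uint32_t dl_util_sum, uint32_t period_contrib)
{
	uint32_t divider, overload;

	assert(period_contrib < 1024);
	divider = (LOAD_AVG_MAX - 1024 + period_contrib) << SCHED_CAPACITY_SHIFT;
	overload = cfs_util_sum + LOAD_AVG_MAX;
	overload += rt_util_sum;
	overload += dl_util_sum;

	return overload >= divider;
}
```

With period_contrib = 0 the divider is 47839232, so a cfs util_sum of
20000000 alone does not trip the check, while 30000000 from cfs plus 18000000
from rt does.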

> +
> +		/* The rq is idle, we can sync to clock_task */
> +		rq->clock_pelt  = rq_clock_task(rq);
> +
> +
> +	} else {
> +		/*
> +		 * When a rq runs at a lower compute capacity, it will need
> +		 * more time to do the same amount of work than at max
> +		 * capacity: either because it takes more time to compute the
> +		 * same amount of work or because taking more time means
> +		 * sharing more often the CPU between entities.
> +		 * In order to be invariant, we scale the delta to reflect how
> +		 * much work has been really done.
> +		 * Running at lower capacity also means running longer to do
> +		 * the same amount of work and this results in stealing some
> +		 * idle time that will disturb the load signal compared to
> +		 * max capacity; This stolen idle time will be automatically
> +		 * reflected when the rq will be idle and the clock will be
> +		 * synced with rq_clock_task.
> +		 */
> +
> +		/*
> +		 * scale the elapsed time to reflect the real amount of
> +		 * computation
> +		 */
> +		delta = cap_scale(delta, arch_scale_freq_capacity(cpu_of(rq)));
> +		delta = cap_scale(delta, arch_scale_cpu_capacity(NULL, cpu_of(rq)));
> +
> +		rq->clock_pelt += delta;

AFAICT, the rq->clock_pelt is used for both utilization and load. So the load
also becomes a function of CPU uarch now. Is this intentional?
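
A small userspace sketch of what I mean. toy_cap_scale() follows the usual
(v * s) >> SCHED_CAPACITY_SHIFT form, toy_pelt_delta() is a hypothetical
helper, and the capacity of 512 for a smaller CPU is an assumed value.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1 << SCHED_CAPACITY_SHIFT)

/* Userspace stand-in for the capacity scaling used in the quoted hunk. */
static int64_t toy_cap_scale(int64_t delta, int64_t cap)
{
	assert(cap >= 0 && cap <= SCHED_CAPACITY_SCALE);
	return (delta * cap) >> SCHED_CAPACITY_SHIFT;
}

/*
 * Hypothetical helper: advance of clock_pelt for a running delta, scaled
 * first by frequency capacity and then by CPU (uarch) capacity.
 */
static int64_t toy_pelt_delta(int64_t delta_ns, int64_t freq_cap,
			      int64_t cpu_cap)
{
	delta_ns = toy_cap_scale(delta_ns, freq_cap);
	return toy_cap_scale(delta_ns, cpu_cap);
}
```

The same 4ms of running at max frequency advances clock_pelt by 4ms on a CPU
of capacity 1024 but only by 2ms on one of capacity 512, and any signal
computed from clock_pelt sees that difference.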

Thanks,
Pavan
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

Thread overview: 13+ messages
2018-10-19 16:17 [PATCH v4 0/2] sched/fair: update scale invariance of PELT Vincent Guittot
2018-10-19 16:17 ` [PATCH 1/2] sched/fair: move rq_of helper function Vincent Guittot
2018-10-20  0:44   ` kbuild test robot
2018-10-19 16:17 ` [PATCH v4 2/2] sched/fair: update scale invariance of PELT Vincent Guittot
2018-10-23  5:59   ` Pavan Kondeti [this message]
2018-10-23 12:15     ` Vincent Guittot
2018-10-24  4:53       ` Pavan Kondeti
2018-10-24  9:07         ` Vincent Guittot
2018-10-23 10:00   ` Peter Zijlstra
2018-10-23 12:15     ` Vincent Guittot
2018-10-25 10:35   ` Dietmar Eggemann
2018-10-25 10:43     ` Vincent Guittot
2018-10-25 11:08       ` Dietmar Eggemann
