All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Vishal Chourasia <vishalc@linux.ibm.com>,
	Chuyi Zhou <zhouchuyi@bytedance.com>
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
	chengming.zhou@linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] sched/fair: Sync se's load_avg with cfs_rq in reweight_task
Date: Tue, 23 Jul 2024 23:10:56 +0200	[thread overview]
Message-ID: <da5b5683-ed22-4e65-9e89-e22ff7f6cf2b@arm.com> (raw)
In-Reply-To: <Zp_QzS-DUiE934X2@linux.ibm.com>

On 23/07/2024 17:48, Vishal Chourasia wrote:
> On Tue, Jul 23, 2024 at 07:42:47PM +0800, Chuyi Zhou wrote:
>> In reweight_task(), there are two situations:
>>
>> 1. The task was on_rq, then the task's load_avg is accurate because we
>> synchronized it with cfs_rq through update_load_avg() in dequeue_task().
>>
>> 2. The task is sleeping, its load_avg might not have been updated for some
>> time, which can result in inaccurate dequeue_load_avg() in
>> reweight_entity().
>>
>> This patch solves this by using sync_entity_load_avg() to synchronize the
>> load_avg of se with cfs_rq before dequeue_load_avg() in reweight_entity().
>> For tasks were on_rq, since we already update load_avg to accurate values
>> in dequeue_task(), this change will not have other effects due to the short
>> time interval between the two updates.
>>
>> Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
>> ---
>> Changes in v3:
>> - use sync_entity_load_avg() rather than update_load_avg() to sync the
>> sleeping task with its cfs_rq suggested by Dietmar.
>> - Link t0 v2: https://lore.kernel.org/lkml/20240720051248.59608-1-zhouchuyi@bytedance.com/
>> Changes in v2:
>> - change the description in commit log.
>> - use update_load_avg() in reweight_task() rather than in reweight_entity
>> suggested by chengming.
>> - Link to v1: https://lore.kernel.org/lkml/20240716150840.23061-1-zhouchuyi@bytedance.com/
>> ---
>>  kernel/sched/fair.c | 43 ++++++++++++++++++++++++-------------------
>>  1 file changed, 24 insertions(+), 19 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 9057584ec06d..da3cdd86ab2e 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3669,11 +3669,32 @@ dequeue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>  	cfs_rq->avg.load_sum = max_t(u32, cfs_rq->avg.load_sum,
>>  					  cfs_rq->avg.load_avg * PELT_MIN_DIVIDER);
>>  }
>> +
>> +static inline u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
>> +{
>> +	return u64_u32_load_copy(cfs_rq->avg.last_update_time,
>> +				 cfs_rq->last_update_time_copy);
>> +}
>> +
>> +/*
>> + * Synchronize entity load avg of dequeued entity without locking
>> + * the previous rq.
>> + */
>> +static void sync_entity_load_avg(struct sched_entity *se)
>> +{
>> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
>> +	u64 last_update_time;
>> +
>> +	last_update_time = cfs_rq_last_update_time(cfs_rq);
>> +	__update_load_avg_blocked_se(last_update_time, se);
>> +}
>> +
> The difference between using update_load_avg() and
> sync_entity_load_avg() is:
> 1. update_load_avg() uses the updated PELT clock value from the rq
>    structure.  
> 2. sync_entity_load_avg() uses the last updated time of
>    the cfs_rq where the scheduling entity (se) is attached.
> 
> Won't this affect the entity load sync?

Not sure what you mean exactly by entity load sync here.

The task has been sleeping for a long time, i.e. its PELT values haven't
been updated or a long time (its last_update_time (lut) value is pretty
old).

In this meantime the task's cfs_rq has potentially seen other PELT
updates due to PELT updates of other non-sleeping tasks related to this
cfs_rq. I.e. the cfs_rq lut is much more recent.

What we want to do here is to sync the sleeping task with its cfs_rq. If
the task was sleeping for more than 1us (1024ns) and we cross a 1ms PELT
period (1024us) when we use cfs_rq's lut as the 'now' value for
__update_load_avg_blocked_se() then we will see the task PELT values decay.

We rely on sync_entity_load_avg() for instance in EAS wakeup where the
task's util_avg influences on which CPU type the task will run next. So
we sync the wakee with its cfs_rq to be able to work with a current task
util_avg.

  reply	other threads:[~2024-07-23 21:11 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-23 11:42 [PATCH v3] sched/fair: Sync se's load_avg with cfs_rq in reweight_task Chuyi Zhou
2024-07-23 15:48 ` Vishal Chourasia
2024-07-23 21:10   ` Dietmar Eggemann [this message]
2024-07-23 23:00     ` Vishal Chourasia
2024-07-24  2:12       ` Chengming Zhou
2024-07-24  9:06         ` Dietmar Eggemann
2024-07-24  2:08 ` Chengming Zhou
2024-07-24  6:20 ` Vishal Chourasia
2024-07-29  7:56 ` Dietmar Eggemann
2024-07-29  8:20   ` Chuyi Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da5b5683-ed22-4e65-9e89-e22ff7f6cf2b@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=bsegall@google.com \
    --cc=chengming.zhou@linux.dev \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vishalc@linux.ibm.com \
    --cc=vschneid@redhat.com \
    --cc=zhouchuyi@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.