public inbox for linux-arm-kernel@lists.infradead.org
From: Dengjun Su <dengjun.su@mediatek.com>
To: <peterz@infradead.org>
Cc: <angelogioacchino.delregno@collabora.com>, <bsegall@google.com>,
	<dengjun.su@mediatek.com>, <dietmar.eggemann@arm.com>,
	<haiqiang.gong@mediatek.com>, <juri.lelli@redhat.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-mediatek@lists.infradead.org>, <matthias.bgg@gmail.com>,
	<mgorman@suse.de>, <mike.zhang@mediatek.com>, <mingo@redhat.com>,
	<peijun.huang@mediatek.com>, <rostedt@goodmis.org>,
	<vincent.guittot@linaro.org>, <vschneid@redhat.com>
Subject: Re: [PATCH] sched/rt: fix incorrect schedstats for rt thread
Date: Fri, 9 Jan 2026 15:24:47 +0800
Message-ID: <20260109072451.2843331-1-dengjun.su@mediatek.com>
In-Reply-To: <20260108111632.GH272712@noisy.programming.kicks-ass.net>

On Thu, 2026-01-08 at 12:16 +0100, Peter Zijlstra wrote:
> On Thu, Jan 08, 2026 at 11:13:07AM +0800, Dengjun Su wrote:
> > For RT thread, only 'set_next_task_rt' will call
> > 'update_stats_wait_end_rt' to update schedstats information.
> > However, during the RT migration process,
> > 'update_stats_wait_start_rt' will be called twice, which
> > will cause the values of wait_max and wait_sum to be incorrect.
> 
> Right, that looses time. Also note that I think dl has the same
> issue.

Hi Peter,

Thanks for the feedback. Yes, sorry for missing the dl class;
I will update it in v2.

> 
> > The specific output as follows:
> > $ cat /proc/6046/task/6046/sched | grep wait
> > wait_start                                   :             0.000000
> > wait_max                                     :        496717.080029
> > wait_sum                                     :       7921540.776553
> > 
> > Add 'update_stats_wait_end_rt' in 'update_stats_dequeue_rt' to
> > update schedstats information when dequeue_task.
> 
> This needs a few more words on why this is correct -- notably it took me
> a little time to find the 'task_on_rq_migrating()' case in
> __update_stats_wait_end() which makes this not actually 'end'.
> 
> But then the corresponding clause in __update_stats_wait_start() gives
> me a headache:
> 
>  'wait_start > prev_wait_start'
> 
> I mean, wtf. Should that not equally be using task_on_rq_migrating()?
> 
> Can you please take a hard look at all that and fix up things
> all-round?
> 

For a migration, the complete schedstats update flow should be:
__update_stats_wait_start() [enter queue A, stage 1] ->
__update_stats_wait_end()   [leave queue A, stage 2] ->
__update_stats_wait_start() [enter queue B, stage 3] ->
__update_stats_wait_end()   [start running on queue B, stage 4]

    Stage 1: prev_wait_start is 0, so wait_start records the time of
    entering queue A.
    Stage 2: task_on_rq_migrating(p) is true, so wait_start is set to the
    time spent waiting on queue A.
    Stage 3: prev_wait_start is the time spent waiting on queue A and
    wait_start is the time of entering queue B, which is expected to be
    greater than prev_wait_start. In that case, wait_start is set to
    (time of entering queue B) - (waiting time on queue A).
    Stage 4: the final wait time = (time when starting to run on queue B)
    - (time of entering queue B) + (waiting time on queue A) = waiting
    time on queue B + waiting time on queue A.
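
The four stages above can be replayed with a small user-space model. This
is only a sketch of the arithmetic described here, not the kernel code:
struct model_stats and the model_*() helpers are hypothetical names, and
the timestamps are plain integers standing in for rq_clock() values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the schedstats fields discussed above. */
struct model_stats {
	uint64_t wait_start;
	uint64_t wait_sum;
	uint64_t wait_max;
};

/* Models __update_stats_wait_start(): used for stage 1 and stage 3. */
static void model_wait_start(struct model_stats *s, uint64_t now)
{
	uint64_t wait_start = now;
	uint64_t prev_wait_start = s->wait_start;

	/* Stage 1: prev is 0. Stage 3: prev is the wait time on queue A. */
	if (wait_start > prev_wait_start)
		wait_start -= prev_wait_start;

	s->wait_start = wait_start;
}

/* Models __update_stats_wait_end(): stage 2 (migrating) and stage 4. */
static void model_wait_end(struct model_stats *s, uint64_t now, bool migrating)
{
	uint64_t delta = now - s->wait_start;

	if (migrating) {
		/* Stage 2: carry the elapsed wait over to the next queue. */
		s->wait_start = delta;
		return;
	}

	/* Stage 4: account the accumulated wait and reset wait_start. */
	if (delta > s->wait_max)
		s->wait_max = delta;
	s->wait_sum += delta;
	s->wait_start = 0;
}

/* Runs stages 1 to 4 and returns the wait time that gets accounted. */
static uint64_t model_migration_wait(uint64_t enter_a, uint64_t leave_a,
				     uint64_t enter_b, uint64_t run_b)
{
	struct model_stats s = { 0 };

	assert(enter_a <= leave_a && leave_a <= enter_b && enter_b <= run_b);

	model_wait_start(&s, enter_a);		/* stage 1 */
	model_wait_end(&s, leave_a, true);	/* stage 2 */
	model_wait_start(&s, enter_b);		/* stage 3 */
	model_wait_end(&s, run_b, false);	/* stage 4 */

	return s.wait_sum;
}
```

With enter_a = 1000, leave_a = 1300, enter_b = 1350 and run_b = 1500, the
model accounts 450: 300 spent waiting on queue A plus 150 on queue B.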

The current problem is that the RT class does not call
__update_stats_wait_end() at stage 2, so wait_start still holds the time of
entering queue A when stage 3 runs. The final computed wait time then
becomes (waiting time on queue B) + (time of entering queue A), which makes
wait_max and wait_sum incorrect.
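
To make the error concrete, here is a minimal sketch of the same arithmetic
with stage 2 left out. The function name and the integer timestamps are
assumptions for illustration only; it is not the kernel code.

```c
#include <stdint.h>

/*
 * Hypothetical model of the flow when stage 2 is skipped: wait_start
 * still holds the absolute time of entering queue A when the task is
 * enqueued on queue B.
 */
static uint64_t wait_without_stage2(uint64_t enter_a, uint64_t enter_b,
				    uint64_t run_b)
{
	uint64_t prev_wait_start;
	uint64_t wait_start;

	/* Stage 1: record the time of entering queue A. */
	wait_start = enter_a;

	/* Stage 2 is missing, so wait_start is not converted to a duration. */

	/* Stage 3: subtract the stale absolute timestamp. */
	prev_wait_start = wait_start;
	wait_start = enter_b;
	if (wait_start > prev_wait_start)
		wait_start -= prev_wait_start;

	/* Stage 4: the accounted wait now includes enter_a itself. */
	return run_b - wait_start;
}
```

With enter_a = 1000, enter_b = 1350 and run_b = 1500, this returns 1150,
i.e. the 150 spent on queue B plus the absolute timestamp 1000, which is
how a clock-sized value can end up in wait_max and wait_sum.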

__update_stats_wait_end() needs task_on_rq_migrating(p) to distinguish
stage 2 from stage 4 because the two are handled differently.
__update_stats_wait_start() does not need to distinguish stage 1 from
stage 3: prev_wait_start is 0 in stage 1, so the same subtraction covers
both cases.

As for the wait_start > prev_wait_start condition, I think it is more of a
safeguard against skewed statistics when the timestamps are inconsistent.
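
As a rough illustration of that reading (my interpretation, not something
the code documents), the sketch below isolates the guarded subtraction.
The function name is hypothetical and the values are plain u64 integers.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the guarded subtraction: only subtract
 * the previous value when the new timestamp is larger. Without the
 * guard, the unsigned subtraction would wrap around to a huge value.
 */
static uint64_t guarded_wait_start(uint64_t now, uint64_t prev_wait_start)
{
	if (now > prev_wait_start)
		return now - prev_wait_start;

	/* Inconsistent input: fall back to the raw timestamp. */
	return now;
}
```

For example, guarded_wait_start(1350, 300) yields 1050, while
guarded_wait_start(100, 300) yields 100 rather than a wrapped-around
value close to 2^64.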

Thanks



Thread overview: 5+ messages
2026-01-08  3:13 [PATCH] sched/rt: fix incorrect schedstats for rt thread Dengjun Su
2026-01-08 11:16 ` Peter Zijlstra
2026-01-09  7:24   ` Dengjun Su [this message]
2026-01-12 16:38     ` Peter Zijlstra
2026-01-14 11:55       ` Dengjun Su
