From: Dengjun Su <dengjun.su@mediatek.com>
To: <peterz@infradead.org>
Cc: <angelogioacchino.delregno@collabora.com>, <bsegall@google.com>,
<dengjun.su@mediatek.com>, <dietmar.eggemann@arm.com>,
<haiqiang.gong@mediatek.com>, <juri.lelli@redhat.com>,
<linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>,
<linux-mediatek@lists.infradead.org>, <matthias.bgg@gmail.com>,
<mgorman@suse.de>, <mike.zhang@mediatek.com>, <mingo@redhat.com>,
<peijun.huang@mediatek.com>, <rostedt@goodmis.org>,
<vincent.guittot@linaro.org>, <vschneid@redhat.com>
Subject: Re: [PATCH] sched/rt: fix incorrect schedstats for rt thread
Date: Fri, 9 Jan 2026 15:24:47 +0800
Message-ID: <20260109072451.2843331-1-dengjun.su@mediatek.com>
In-Reply-To: <20260108111632.GH272712@noisy.programming.kicks-ass.net>
On Thu, 2026-01-08 at 12:16 +0100, Peter Zijlstra wrote:
> On Thu, Jan 08, 2026 at 11:13:07AM +0800, Dengjun Su wrote:
> > For RT thread, only 'set_next_task_rt' will call
> > 'update_stats_wait_end_rt' to update schedstats information.
> > However, during the RT migration process,
> > 'update_stats_wait_start_rt' will be called twice, which
> > will cause the values of wait_max and wait_sum to be incorrect.
>
> Right, that loses time. Also note that I think dl has the same
> issue.

Hi Peter,

Thanks for the feedback. Yes, sorry for missing the dl class;
I will update it in v2.

>
> > The specific output as follows:
> > $ cat /proc/6046/task/6046/sched | grep wait
> > wait_start : 0.000000
> > wait_max : 496717.080029
> > wait_sum : 7921540.776553
> >
> > Add 'update_stats_wait_end_rt' in 'update_stats_dequeue_rt' to
> > update schedstats information when dequeue_task.
>
> This needs a few more words on why this is correct -- notably it took
> me a little time to find the 'task_on_rq_migrating()' case in
> __update_stats_wait_end() which makes this not actually 'end'.
>
> But then the corresponding clause in __update_stats_wait_start()
> gives me a headache:
>
> 'wait_start > prev_wait_start'
>
> I mean, wtf. Should that not equally be using task_on_rq_migrating()?
>
> Can you please take a hard look at all that and fix up things
> all-round?
>

A complete schedstats update flow for a migration should be:

  __update_stats_wait_start() [enter queue A, stage 1] ->
  __update_stats_wait_end()   [leave queue A, stage 2] ->
  __update_stats_wait_start() [enter queue B, stage 3] ->
  __update_stats_wait_end()   [start running on queue B, stage 4]

Stage 1: prev_wait_start is 0, so wait_start ends up recording the
time of entering queue A.

Stage 2: task_on_rq_migrating(p) is true, so wait_start is overwritten
with the time already spent waiting on queue A.

Stage 3: prev_wait_start is the waiting time on queue A, and the new
wait_start is the time of entering queue B, which is expected to be
greater than prev_wait_start. Under this condition, wait_start is
updated to (time of entering queue B) - (waiting time on queue A).

Stage 4: the final wait time = (time when starting to run on queue B)
- (time of entering queue B) + (waiting time on queue A) = waiting
time on queue B + waiting time on queue A.

The current problem is that stage 2 does not call
__update_stats_wait_end() to update wait_start. As a result, the final
computed wait time = waiting time on queue B + the time of entering
queue A, which makes wait_max and wait_sum incorrect.
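
To make the arithmetic above concrete, here is a small user-space C
model of the two helpers. It is only a simplified sketch and not the
kernel code: struct wait_stats, model_wait_start(), model_wait_end() and
model_migration() are made-up names, 'now' stands in for rq_clock(rq),
'migrating' stands in for task_on_rq_migrating(p), and the timestamps
are arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the wait fields of struct sched_statistics. */
struct wait_stats {
	uint64_t wait_start;
	uint64_t wait_max;
	uint64_t wait_sum;
	uint64_t wait_count;
};

/* Simplified model of __update_stats_wait_start(). */
static void model_wait_start(struct wait_stats *s, uint64_t now)
{
	uint64_t wait_start = now;
	uint64_t prev_wait_start = s->wait_start;

	/* Stage 3: fold the wait time preserved at stage 2 into wait_start. */
	if (wait_start > prev_wait_start)
		wait_start -= prev_wait_start;

	s->wait_start = wait_start;
}

/* Simplified model of __update_stats_wait_end(). */
static void model_wait_end(struct wait_stats *s, uint64_t now, bool migrating)
{
	uint64_t delta = now - s->wait_start;

	if (migrating) {
		/* Stage 2: keep the time waited so far instead of accounting it. */
		s->wait_start = delta;
		return;
	}

	/* Stage 4: account the full wait and reset wait_start. */
	if (delta > s->wait_max)
		s->wait_max = delta;
	s->wait_count++;
	s->wait_sum += delta;
	s->wait_start = 0;
}

/*
 * Replay one migration: enter queue A at t=1000, move to queue B at
 * t=1030, start running at t=1050. Returns the accounted wait_sum.
 */
static uint64_t model_migration(bool call_wait_end_on_dequeue)
{
	struct wait_stats s = { 0 };

	model_wait_start(&s, 1000);             /* stage 1 */
	if (call_wait_end_on_dequeue)
		model_wait_end(&s, 1030, true); /* stage 2 */
	model_wait_start(&s, 1030);             /* stage 3 */
	model_wait_end(&s, 1050, false);        /* stage 4 */

	return s.wait_sum;
}
```

With stage 2 present, model_migration(true) accounts 50, i.e. 30 on
queue A plus 20 on queue B. With stage 2 skipped, model_migration(false)
accounts 1020, i.e. 20 on queue B plus the timestamp 1000 of entering
queue A.
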

__update_stats_wait_end() needs task_on_rq_migrating(p) to distinguish
stage 2 from stage 4, because the two stages are handled differently.
__update_stats_wait_start() does not need to distinguish stage 1 from
stage 3, because the same subtraction covers both: at stage 1
prev_wait_start is simply 0.

As for the condition wait_start > prev_wait_start, I think it is more
of a safeguard against skewed statistics when the timestamps are
inconsistent.
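
As a rough illustration of that reading (my assumption about the intent
of the check, not something taken from the code comments), the sketch
below compares the guarded subtraction with an unguarded one.
guarded_wait_start() and unguarded_wait_start() are hypothetical helper
names, and the values are arbitrary.

```c
#include <stdint.h>

/* Mirrors the guarded update: subtract only when 'now' is ahead. */
static uint64_t guarded_wait_start(uint64_t now, uint64_t prev_wait_start)
{
	if (now > prev_wait_start)
		return now - prev_wait_start;

	/* Inconsistent timestamps: keep 'now' as-is rather than underflow. */
	return now;
}

/* Without the guard, an unsigned subtraction wraps modulo 2^64. */
static uint64_t unguarded_wait_start(uint64_t now, uint64_t prev_wait_start)
{
	return now - prev_wait_start;
}
```

For example, with now = 20 and prev_wait_start = 30 the guarded version
keeps 20, while the unguarded one wraps around to a value close to
UINT64_MAX, which would later distort the delta computed at stage 4.
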
Thanks