From: "Aaron Lu" <ziqianlu@bytedance.com>
To: "xupengbo" <xupengbo1029@163.com>,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>
Cc: "Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>,
"Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"David Vernet" <void@manifault.com>,
<linux-kernel@vger.kernel.org>, <cgroups@vger.kernel.org>
Subject: Re: [PATCH v5] sched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task migrates out.
Date: Fri, 28 Nov 2025 19:54:45 +0800 [thread overview]
Message-ID: <20251128115445.GA1526246@bytedance.com> (raw)
In-Reply-To: <20250827022208.14487-1-xupengbo@oppo.com>
Hello,
On Wed, Aug 27, 2025 at 10:22:07AM +0800, xupengbo wrote:
> When a task is migrated out, there is a probability that the tg->load_avg
> value will become abnormal. The reason is as follows.
>
> 1. Due to the 1ms update period limitation in update_tg_load_avg(), there
> is a possibility that the reduced load_avg is not updated to tg->load_avg
> when a task migrates out.
> 2. Even though __update_blocked_fair() traverses the leaf_cfs_rq_list and
> calls update_tg_load_avg() for cfs_rqs that are not fully decayed, the key
> function cfs_rq_is_decayed() does not check whether
> cfs->tg_load_avg_contrib is null. Consequently, in some cases,
> __update_blocked_fair() removes cfs_rqs whose avg.load_avg has not been
> updated to tg->load_avg.
>
> Add a check of cfs_rq->tg_load_avg_contrib in cfs_rq_is_decayed(),
> which fixes the case (2.) mentioned above.
>
> Fixes: 1528c661c24b ("sched/fair: Ratelimit update to tg->load_avg")
> Tested-by: Aaron Lu <ziqianlu@bytedance.com>
> Reviewed-by: Aaron Lu <ziqianlu@bytedance.com>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: xupengbo <xupengbo@oppo.com>
I wonder if there are any more concerns about this patch? If no, I hope
this fix can be merged. It's a rare case but it does happen for some
specific setup.
Sorry if this is a bad timing, but I just hit an oncall where this exact
problem occurred so I suppose it's worth a ping :)
Best regards,
Aaron
next prev parent reply other threads:[~2025-11-28 11:55 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-27 2:22 [PATCH v5] sched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task migrates out xupengbo
2025-11-28 11:54 ` Aaron Lu [this message]
2025-11-28 13:40 ` Peter Zijlstra
2025-11-28 14:15 ` Aaron Lu
2025-12-03 18:25 ` [tip: sched/urgent] " tip-bot2 for xupengbo
2025-12-03 18:31 ` tip-bot2 for xupengbo
2025-12-06 9:10 ` tip-bot2 for xupengbo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251128115445.GA1526246@bytedance.com \
--to=ziqianlu@bytedance.com \
--cc=bsegall@google.com \
--cc=cgroups@vger.kernel.org \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=vincent.guittot@linaro.org \
--cc=void@manifault.com \
--cc=vschneid@redhat.com \
--cc=xupengbo1029@163.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox