From: Doug Berger <opendmb@gmail.com>
To: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Florian Fainelli <florian.fainelli@broadcom.com>,
linux-kernel@vger.kernel.org, Doug Berger <opendmb@gmail.com>
Subject: [RFC PATCH] sched/topology: clear freecpu bit on detach
Date: Tue, 14 Jan 2025 15:04:04 -0800
Message-ID: <20250114230404.661569-1-opendmb@gmail.com>
There is a hazard in the deadline scheduler where an offlined CPU
can have its free_cpus bit left set in the def_root_domain when
the schedutil cpufreq governor is used. This can allow a deadline
thread to be pushed to the runqueue of a powered-down CPU, which
breaks scheduling. The details can be found here:
https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com

The free_cpus mask is expected to be cleared by set_rq_offline();
however, the hazard occurs before the root domain is made online
during CPU hotplug, so that function is not invoked for the CPU
that is being made active.

This commit works around the issue by ensuring the free_cpus bit
for a CPU is always cleared when the CPU is removed from a
root_domain. This likely makes the call to cpudl_clear_freecpu()
in rq_offline_dl() fully redundant, but I have not removed it
here because I am not certain of all flows.

It seems likely that a better solution is possible from someone
more familiar with the scheduler implementation, but this
approach is minimally invasive coming from someone who is not.

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Signed-off-by: Doug Berger <opendmb@gmail.com>
---
kernel/sched/topology.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index da33ec9e94ab..3cbc14953c36 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -499,6 +499,7 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
 			set_rq_offline(rq);
 
 		cpumask_clear_cpu(rq->cpu, old_rd->span);
+		cpudl_clear_freecpu(&old_rd->cpudl, rq->cpu);
 
 		/*
 		 * If we don't want to free the old_rd yet then
--
2.34.1
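
For readers following the thread, the stale-bit hazard described in the commit message can be sketched in user space. This is only a toy model under stated assumptions, not the kernel implementation: struct toy_rd and every toy_* helper are hypothetical names, the two cpumasks are reduced to 64-bit words, and attaching a CPU is assumed to mark it free, which simplifies how the bit actually gets set in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct root_domain: one bit per CPU, at most 64 CPUs. */
struct toy_rd {
	uint64_t span;      /* CPUs currently attached to this root domain */
	uint64_t free_cpus; /* CPUs advertised as free for deadline tasks */
};

static inline uint64_t toy_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return UINT64_C(1) << cpu;
}

/* Hypothetical helper: attach a CPU and advertise it as free. */
static void toy_attach(struct toy_rd *rd, unsigned int cpu)
{
	rd->span |= toy_bit(cpu);
	rd->free_cpus |= toy_bit(cpu);
}

/*
 * Hypothetical helper: detach a CPU from the domain.
 * clear_free == false models the flow where the offline path was
 * skipped, so nothing clears the free bit.
 * clear_free == true models the patched flow, where the free bit is
 * always cleared on detach.
 */
static void toy_detach(struct toy_rd *rd, unsigned int cpu, bool clear_free)
{
	rd->span &= ~toy_bit(cpu);
	if (clear_free)
		rd->free_cpus &= ~toy_bit(cpu);
}

/* True when a CPU outside the span is still advertised as free. */
static bool toy_has_stale_free_cpu(const struct toy_rd *rd)
{
	return (rd->free_cpus & ~rd->span) != 0;
}
```

In this model, attaching CPU 3 and then calling toy_detach() with clear_free set to false leaves toy_has_stale_free_cpu() returning true, which corresponds to a deadline task being pushed toward a CPU that is no longer in the domain. Repeating the sequence with clear_free set to true leaves no stale bit.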