From: Peter Zijlstra <peterz@infradead.org>
To: Shashank Balaji <shashank.mahadasyam@sony.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
linux-kernel@vger.kernel.org, Alexandre Ghiti <alex@ghiti.fr>,
linux-riscv@lists.infradead.org,
Sia Jee Heng <jeeheng.sia@starfivetech.com>,
James Morse <james.morse@arm.com>,
Nicholas Piggin <npiggin@gmail.com>,
linux-arm-kernel@lists.infradead.org,
Rahul Bukte <rahul.bukte@sony.com>,
Daniel Palmer <daniel.palmer@sony.com>,
Shinya Takumi <shinya.takumi@sony.com>
Subject: Re: [RFC PATCH] kernel/cpu: in freeze_secondary_cpus() ensure primary cpu is of domain type
Date: Mon, 30 Jun 2025 10:48:08 +0200 [thread overview]
Message-ID: <20250630084808.GH1613376@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <20250630082103.829352-1-shashank.mahadasyam@sony.com>
On Mon, Jun 30, 2025 at 05:20:59PM +0900, Shashank Balaji wrote:
> On an x86 machine, when cpu 0 is isolated with "isolcpus=", on initiating
> suspend to memory, a warning is triggered, followed by a kernel crash. This is
> on a defconfig + CONFIG_ENERGY_MODEL kernel:
> This happens because in order to offline the last secondary cpu, i.e. cpu 1,
> build_sched_domains() ends up being passed an empty cpumask, since the only remaining
> cpu (cpu 0) is isolated. It warns and fails, after which perf domains are
> are attempted to be built, which crashes the kernel. The same problem occurs
> during cpu hotplug, but that was fixed by
> commit 38685e2a0476127d ("cpu/hotplug: Don't offline the last non-isolated CPU").
>
> Fix this by ensuring that the primary cpu, the last standing cpu, is of domain
> type, so that build_sched_domains() is not passed an empty cpumask.
>
> Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
> Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
> ---
> kernel/cpu.c | 27 ++++++++++++++++++++++-----
> 1 file changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a59e009e0be4..d9167b0559a5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1902,12 +1902,28 @@ int freeze_secondary_cpus(int primary)
>
> cpu_maps_update_begin();
> if (primary == -1) {
> - primary = cpumask_first(cpu_online_mask);
> - if (!housekeeping_cpu(primary, HK_TYPE_TIMER))
> - primary = housekeeping_any_cpu(HK_TYPE_TIMER);
> + primary = cpumask_first_and_and(cpu_online_mask,
> + housekeeping_cpumask(HK_TYPE_TIMER),
> + housekeeping_cpumask(HK_TYPE_DOMAIN));
That's terrible indenting, please align after the opening bracket like:
primary = cpumask_first_and_and(cpu_online_mask,
housekeeping_cpumask(HK_TYPE_TIMER),
housekeeping_cpumask(HK_TYPE_DOMAIN));
Also, IIRC HK_TYPE_HRTIMER is deprecated and should be something like
HK_TYPE_NOISE or somesuch. Frederic?
> + if (primary >= nr_cpu_ids) {
> + error = -ENODEV;
> + pr_err("No suitable primary CPU found. Ensure at least one non-isolated, non-nohz_full CPU is online\n");
> + goto abort;
> + }
> } else {
> - if (!cpu_online(primary))
> - primary = cpumask_first(cpu_online_mask);
> + if (!cpu_online(primary)) {
> + primary = cpumask_first_and(cpu_online_mask,
> + housekeeping_cpumask(HK_TYPE_DOMAIN));
Indenting again.
> + if (primary >= nr_cpu_ids) {
> + error = -ENODEV;
> + pr_err("No suitable primary CPU found. Ensure at least one non-isolated CPU is online\n");
> + goto abort;
> + }
> + } else if (!housekeeping_cpu(primary, HK_TYPE_DOMAIN)) {
> + error = -ENODEV;
> + pr_err("Primary CPU %d should not be isolated\n", primary);
> + goto abort;
> + }
> }
>
> /*
> @@ -1943,6 +1959,7 @@ int freeze_secondary_cpus(int primary)
> else
> pr_err("Non-boot CPUs are not disabled\n");
>
> +abort:
> /*
> * Make sure the CPUs won't be enabled by someone else. We need to do
> * this even in case of failure as all freeze_secondary_cpus() users are
Also; doesn't the above boil down to something like:
if (primary == -1) {
primary = cpumask_first_and_and(cpu_online_mask,
housekeeping_cpumask(HK_TYPE_TIMER),
housekeeping_cpumask(HK_TYPE_DOMAIN));
} if (!cpu_online(primary)) {
primary = cpumask_first_and(cpu_online_mask,
housekeeping_cpumask(HK_TYPE_DOMAIN));
}
if (primary >= nr_cpu_ids || !housekeeping_cpu(primary, HK_TYPE_DOMAIN)) {
error = -ENODEV;
pr_err("Primary CPU %d should not be isolated\n", primary);
goto abort;
}
Yes, this has less error string variation, but the code is simpler.
next prev parent reply other threads:[~2025-06-30 8:54 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-30 8:20 [RFC PATCH] kernel/cpu: in freeze_secondary_cpus() ensure primary cpu is of domain type Shashank Balaji
2025-06-30 8:48 ` Peter Zijlstra [this message]
2025-06-30 10:29 ` Shashank Balaji
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250630084808.GH1613376@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=alex@ghiti.fr \
--cc=aou@eecs.berkeley.edu \
--cc=catalin.marinas@arm.com \
--cc=daniel.palmer@sony.com \
--cc=james.morse@arm.com \
--cc=jeeheng.sia@starfivetech.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=npiggin@gmail.com \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=rahul.bukte@sony.com \
--cc=shashank.mahadasyam@sony.com \
--cc=shinya.takumi@sony.com \
--cc=tglx@linutronix.de \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox