From: Catalin Marinas <catalin.marinas@arm.com>
To: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Will Deacon <will@kernel.org>,
corbet@lwn.net, skhan@linuxfoundation.org,
punit.agrawal@oss.qualcomm.com, jic23@kernel.org,
osama.abdelkader@gmail.com, chenl311@chinatelecom.cn,
fengchengwen@huawei.com, suzuki.poulose@arm.com, maz@kernel.org,
lpieralisi@kernel.org, timothy.hayes@arm.com,
sascha.bischoff@arm.com, arnd@arndb.de,
mrigendra.chaubey@gmail.com, pierre.gondois@arm.com,
dietmar.eggemann@arm.com, yangyicong@hisilicon.com,
sudeep.holla@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
Date: Tue, 9 Jun 2026 18:58:22 +0100 [thread overview]
Message-ID: <aihUPsuEytsM6Dly@arm.com> (raw)
In-Reply-To: <02932ef7-5819-4cf5-8e78-8fd3fd40274f@huawei.com>
Hi Jinjie,
On Wed, Jun 03, 2026 at 02:38:11PM +0800, Jinjie Ruan wrote:
> On 6/2/2026 7:09 PM, Will Deacon wrote:
> > On Wed, May 20, 2026 at 10:20:23AM +0800, Jinjie Ruan wrote:
> >> When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
> >> enumerated CPUs as "present" regardless of their status in the MADT. This
> >> causes issues with SMT hotplug control. For instance, with QEMU's
> >> "-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
> >> follows: the first four CPUs are marked Enabled while the remaining four
> >> are marked Online Capable to support potential hot-plugging.
> >>
> >> Fix this by:
> >>
> >> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
> >> entry before calling set_cpu_present() during SMP initialization.
> >>
> >> 2. Properly managing the present mask in acpi_map_cpu() and
> >> acpi_unmap_cpu() to support actual CPU hotplug events, This aligns with
> >> other architectures like x86 and LoongArch.
> >>
> >> 3. Update the arm64 CPU hotplug documentation to no longer state that all
> >> online-capable vCPUs are marked as present by the kernel at boot time.
> >>
> >> This ensures that only physically available or explicitly enabled CPUs
> >> are in the present mask, keeping the SMT control logic consistent with
> >> the actual hardware state.
> >
> > Please can you check the Sashiko review comment?
> >
> > https://sashiko.dev/#/patchset/20260520022023.126670-1-ruanjinjie@huawei.com
>
> I think commit eba4675008a6 ("arm64: arch_register_cpu() variant to
> check if an ACPI handle is now available.") introduced this bug.
>
> It introduced an architectural safety block inside
> arch_unregister_cpu(). If a hot-unplug operation is determined to be a
> physical hardware removal (where _STA evaluates to
> !ACPI_STA_DEVICE_PRESENT), it aborts the unregistration transaction
> early to protect unreadied arm64 infrastructure, thereby skipping
> unregister_cpu().
>
> However, the generic ACPI processor driver path in
> acpi_processor_post_eject() currently treats arch_unregister_cpu() as
> an unconditional void operation. When arch_unregister_cpu() bails out
> early, the subsequent cleanup flow blindly proceeds to call
> acpi_unmap_cpu(), clears global per-cpu processor arrays, and
> unconditionally free the 'struct acpi_processor' object.
>
> I think we can fix this by:
>
> 1. Refactoring arch_unregister_cpu() to return an integer
> transaction status. It returns -EOPNOTSUPP when aborting due to physical
> hot-remove blocking, -EINVAL/-EIO on firmware failures, and 0 only upon
> successful unregistration.
>
> 2. Guarding the downstream execution flow in
> acpi_processor_post_eject(). If arch_unregister_cpu() returns a error
> code, the hot-unplug transaction is considered aborted.
I wonder whether we need all this guarding. In the worst case, we could
rewrite the function, something like below, to always unregister and
only warn:
void arch_unregister_cpu(int cpu)
{
acpi_handle acpi_handle = acpi_get_processor_handle(cpu);
struct cpu *c = &per_cpu(cpu_devices, cpu);
acpi_status status;
unsigned long long sta;
if (!acpi_handle) {
pr_err_once("Removing a CPU without associated ACPI handle\n");
} else {
status = acpi_evaluate_integer(acpi_handle, "_STA", NULL, &sta);
if (!ACPI_FAILURE(status) &&
cpu_present(cpu) && !(sta & ACPI_STA_DEVICE_PRESENT))
pr_err_once("Changing CPU present bit is not supported\n");
}
unregister_cpu(c);
}
However, on the first condition, can we actually trigger !acpi_handle?
If not, we could just drop it. I tried to look up the paths and I don't
think we'd ever end up in this function with !acpi_handle. So this
leaves us with the next checks.
On the second/third conditions, it's more about preventing physical CPU
hotplug as we haven't properly defined it for arm yet but we could just
add a WARN_ONCE() to make it more visible and still proceed with the
unregistering. I think with your proposal, we don't fully unroll the
state anyway just by returning an error in arch_unregister_cpu(), so I'd
rather continue here.
What does firmware do for virtual CPU hotplug w.r.t. _STA? I noticed a
slight change in wording in the cpu-hotplug.rst doc with your patch from
On virtual systems the _STA method must always report the CPU as
``present``
to
On virtual systems the _STA method must report the CPU as ``present``
when it is activated by the firmware
Was your intention that _STA.PRESENT can become 0 when hot-unplugging
virtual CPUs?
--
Catalin
prev parent reply other threads:[~2026-06-09 17:58 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-20 2:20 [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Jinjie Ruan
2026-06-02 11:09 ` Will Deacon
2026-06-02 12:14 ` Jinjie Ruan
2026-06-02 12:47 ` Will Deacon
2026-06-03 6:38 ` Jinjie Ruan
2026-06-09 17:58 ` Catalin Marinas [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aihUPsuEytsM6Dly@arm.com \
--to=catalin.marinas@arm.com \
--cc=arnd@arndb.de \
--cc=chenl311@chinatelecom.cn \
--cc=corbet@lwn.net \
--cc=dietmar.eggemann@arm.com \
--cc=fengchengwen@huawei.com \
--cc=jic23@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=maz@kernel.org \
--cc=mrigendra.chaubey@gmail.com \
--cc=osama.abdelkader@gmail.com \
--cc=pierre.gondois@arm.com \
--cc=punit.agrawal@oss.qualcomm.com \
--cc=ruanjinjie@huawei.com \
--cc=sascha.bischoff@arm.com \
--cc=skhan@linuxfoundation.org \
--cc=sudeep.holla@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=timothy.hayes@arm.com \
--cc=will@kernel.org \
--cc=yangyicong@hisilicon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox