From: Catalin Marinas <catalin.marinas@arm.com>
To: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Will Deacon <will@kernel.org>,
corbet@lwn.net, skhan@linuxfoundation.org,
punit.agrawal@oss.qualcomm.com, jic23@kernel.org,
osama.abdelkader@gmail.com, chenl311@chinatelecom.cn,
fengchengwen@huawei.com, suzuki.poulose@arm.com, maz@kernel.org,
lpieralisi@kernel.org, timothy.hayes@arm.com,
sascha.bischoff@arm.com, arnd@arndb.de,
mrigendra.chaubey@gmail.com, pierre.gondois@arm.com,
dietmar.eggemann@arm.com, yangyicong@hisilicon.com,
sudeep.holla@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
Date: Tue, 9 Jun 2026 18:58:22 +0100 [thread overview]
Message-ID: <aihUPsuEytsM6Dly@arm.com> (raw)
In-Reply-To: <02932ef7-5819-4cf5-8e78-8fd3fd40274f@huawei.com>
Hi Jinjie,
On Wed, Jun 03, 2026 at 02:38:11PM +0800, Jinjie Ruan wrote:
> On 6/2/2026 7:09 PM, Will Deacon wrote:
> > On Wed, May 20, 2026 at 10:20:23AM +0800, Jinjie Ruan wrote:
> >> When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
> >> enumerated CPUs as "present" regardless of their status in the MADT. This
> >> causes issues with SMT hotplug control. For instance, with QEMU's
> >> "-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
> >> follows: the first four CPUs are marked Enabled while the remaining four
> >> are marked Online Capable to support potential hot-plugging.
> >>
> >> Fix this by:
> >>
> >> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
> >> entry before calling set_cpu_present() during SMP initialization.
> >>
> >> 2. Properly managing the present mask in acpi_map_cpu() and
> >> acpi_unmap_cpu() to support actual CPU hotplug events, This aligns with
> >> other architectures like x86 and LoongArch.
> >>
> >> 3. Update the arm64 CPU hotplug documentation to no longer state that all
> >> online-capable vCPUs are marked as present by the kernel at boot time.
> >>
> >> This ensures that only physically available or explicitly enabled CPUs
> >> are in the present mask, keeping the SMT control logic consistent with
> >> the actual hardware state.
> >
> > Please can you check the Sashiko review comment?
> >
> > https://sashiko.dev/#/patchset/20260520022023.126670-1-ruanjinjie@huawei.com
>
> I think commit eba4675008a6 ("arm64: arch_register_cpu() variant to
> check if an ACPI handle is now available.") introduced this bug.
>
> It introduced an architectural safety block inside
> arch_unregister_cpu(). If a hot-unplug operation is determined to be a
> physical hardware removal (where _STA evaluates to
> !ACPI_STA_DEVICE_PRESENT), it aborts the unregistration transaction
> early to protect unreadied arm64 infrastructure, thereby skipping
> unregister_cpu().
>
> However, the generic ACPI processor driver path in
> acpi_processor_post_eject() currently treats arch_unregister_cpu() as
> an unconditional void operation. When arch_unregister_cpu() bails out
> early, the subsequent cleanup flow blindly proceeds to call
> acpi_unmap_cpu(), clears global per-cpu processor arrays, and
> unconditionally free the 'struct acpi_processor' object.
>
> I think we can fix this by:
>
> 1. Refactoring arch_unregister_cpu() to return an integer
> transaction status. It returns -EOPNOTSUPP when aborting due to physical
> hot-remove blocking, -EINVAL/-EIO on firmware failures, and 0 only upon
> successful unregistration.
>
> 2. Guarding the downstream execution flow in
> acpi_processor_post_eject(). If arch_unregister_cpu() returns a error
> code, the hot-unplug transaction is considered aborted.
I wonder whether we need all this guarding. In the worst case, we could
rewrite the function, something like below, to always unregister and
only warn:
void arch_unregister_cpu(int cpu)
{
acpi_handle acpi_handle = acpi_get_processor_handle(cpu);
struct cpu *c = &per_cpu(cpu_devices, cpu);
acpi_status status;
unsigned long long sta;
if (!acpi_handle) {
pr_err_once("Removing a CPU without associated ACPI handle\n");
} else {
status = acpi_evaluate_integer(acpi_handle, "_STA", NULL, &sta);
if (!ACPI_FAILURE(status) &&
cpu_present(cpu) && !(sta & ACPI_STA_DEVICE_PRESENT))
pr_err_once("Changing CPU present bit is not supported\n");
}
unregister_cpu(c);
}
However, on the first condition, can we actually trigger !acpi_handle?
If not, we could just drop it. I tried to look up the paths and I don't
think we'd ever end up in this function with !acpi_handle. So this
leaves us with the next checks.
On the second/third conditions, it's more about preventing physical CPU
hotplug as we haven't properly defined it for arm yet but we could just
add a WARN_ONCE() to make it more visible and still proceed with the
unregistering. I think with your proposal, we don't fully unroll the
state anyway just by returning an error in arch_unregister_cpu(), so I'd
rather continue here.
What does firmware do for virtual CPU hotplug w.r.t. _STA? I noticed a
slight change in wording in the cpu-hotplug.rst doc with your patch from
On virtual systems the _STA method must always report the CPU as
``present``
to
On virtual systems the _STA method must report the CPU as ``present``
when it is activated by the firmware
Was your intention that _STA.PRESENT can become 0 when hot-unplugging
virtual CPUs?
--
Catalin
prev parent reply other threads:[~2026-06-09 17:58 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-20 2:20 [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Jinjie Ruan
2026-06-02 11:09 ` Will Deacon
2026-06-02 12:14 ` Jinjie Ruan
2026-06-02 12:47 ` Will Deacon
2026-06-03 6:38 ` Jinjie Ruan
2026-06-09 17:58 ` Catalin Marinas [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aihUPsuEytsM6Dly@arm.com \
--to=catalin.marinas@arm.com \
--cc=arnd@arndb.de \
--cc=chenl311@chinatelecom.cn \
--cc=corbet@lwn.net \
--cc=dietmar.eggemann@arm.com \
--cc=fengchengwen@huawei.com \
--cc=jic23@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=maz@kernel.org \
--cc=mrigendra.chaubey@gmail.com \
--cc=osama.abdelkader@gmail.com \
--cc=pierre.gondois@arm.com \
--cc=punit.agrawal@oss.qualcomm.com \
--cc=ruanjinjie@huawei.com \
--cc=sascha.bischoff@arm.com \
--cc=skhan@linuxfoundation.org \
--cc=sudeep.holla@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=timothy.hayes@arm.com \
--cc=will@kernel.org \
--cc=yangyicong@hisilicon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.