Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Will Deacon <will@kernel.org>,
	corbet@lwn.net, skhan@linuxfoundation.org,
	punit.agrawal@oss.qualcomm.com, jic23@kernel.org,
	osama.abdelkader@gmail.com, chenl311@chinatelecom.cn,
	fengchengwen@huawei.com, suzuki.poulose@arm.com, maz@kernel.org,
	lpieralisi@kernel.org, timothy.hayes@arm.com,
	sascha.bischoff@arm.com, arnd@arndb.de,
	mrigendra.chaubey@gmail.com, pierre.gondois@arm.com,
	dietmar.eggemann@arm.com, yangyicong@hisilicon.com,
	sudeep.holla@kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
Date: Tue, 9 Jun 2026 18:58:22 +0100	[thread overview]
Message-ID: <aihUPsuEytsM6Dly@arm.com> (raw)
In-Reply-To: <02932ef7-5819-4cf5-8e78-8fd3fd40274f@huawei.com>

Hi Jinjie,

On Wed, Jun 03, 2026 at 02:38:11PM +0800, Jinjie Ruan wrote:
> On 6/2/2026 7:09 PM, Will Deacon wrote:
> > On Wed, May 20, 2026 at 10:20:23AM +0800, Jinjie Ruan wrote:
> >> When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
> >> enumerated CPUs as "present" regardless of their status in the MADT. This
> >> causes issues with SMT hotplug control. For instance, with QEMU's
> >> "-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
> >> follows: the first four CPUs are marked Enabled while the remaining four
> >> are marked Online Capable to support potential hot-plugging.
> >>
> >> Fix this by:
> >>
> >> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
> >>    entry before calling set_cpu_present() during SMP initialization.
> >>
> >> 2. Properly managing the present mask in acpi_map_cpu() and
> >>    acpi_unmap_cpu() to support actual CPU hotplug events, This aligns with
> >>    other architectures like x86 and LoongArch.
> >>
> >> 3. Update the arm64 CPU hotplug documentation to no longer state that all
> >>    online-capable vCPUs are marked as present by the kernel at boot time.
> >>
> >> This ensures that only physically available or explicitly enabled CPUs
> >> are in the present mask, keeping the SMT control logic consistent with
> >> the actual hardware state.
> > 
> > Please can you check the Sashiko review comment?
> > 
> > https://sashiko.dev/#/patchset/20260520022023.126670-1-ruanjinjie@huawei.com
> 
> I think commit eba4675008a6 ("arm64: arch_register_cpu() variant to
> check if an ACPI handle is now available.") introduced this bug.
> 
> It introduced an architectural safety block inside
> arch_unregister_cpu(). If a hot-unplug operation is determined to be a
> physical hardware removal (where _STA evaluates to
> !ACPI_STA_DEVICE_PRESENT), it aborts the unregistration transaction
> early to protect unreadied arm64 infrastructure, thereby skipping
> unregister_cpu().
> 
> However, the generic ACPI processor driver path in
> acpi_processor_post_eject() currently treats arch_unregister_cpu() as
> an unconditional void operation. When arch_unregister_cpu() bails out
> early, the subsequent cleanup flow blindly proceeds to call
> acpi_unmap_cpu(), clears global per-cpu processor arrays, and
> unconditionally free the 'struct acpi_processor' object.
> 
> I think we can fix this by:
> 
>     1. Refactoring arch_unregister_cpu() to return an integer
> transaction status. It returns -EOPNOTSUPP when aborting due to physical
> hot-remove blocking, -EINVAL/-EIO on firmware failures, and 0 only upon
> successful unregistration.
> 
>     2. Guarding the downstream execution flow in
> acpi_processor_post_eject(). If arch_unregister_cpu() returns a error
> code, the hot-unplug transaction is considered aborted.

I wonder whether we need all this guarding. In the worst case, we could
rewrite the function, something like below, to always unregister and
only warn:

void arch_unregister_cpu(int cpu)
{
	acpi_handle acpi_handle = acpi_get_processor_handle(cpu);
	struct cpu *c = &per_cpu(cpu_devices, cpu);
	acpi_status status;
	unsigned long long sta;

	if (!acpi_handle) {
		pr_err_once("Removing a CPU without associated ACPI handle\n");
	} else {
		status = acpi_evaluate_integer(acpi_handle, "_STA", NULL, &sta);
		if (!ACPI_FAILURE(status) &&
		    cpu_present(cpu) && !(sta & ACPI_STA_DEVICE_PRESENT))
			pr_err_once("Changing CPU present bit is not supported\n");
	}

	unregister_cpu(c);
}

However, on the first condition, can we actually trigger !acpi_handle?
If not, we could just drop it. I tried to look up the paths and I don't
think we'd ever end up in this function with !acpi_handle. So this
leaves us with the next checks.

On the second/third conditions, it's more about preventing physical CPU
hotplug as we haven't properly defined it for arm yet but we could just
add a WARN_ONCE() to make it more visible and still proceed with the
unregistering. I think with your proposal, we don't fully unroll the
state anyway just by returning an error in arch_unregister_cpu(), so I'd
rather continue here.

What does firmware do for virtual CPU hotplug w.r.t. _STA? I noticed a
slight change in wording in the cpu-hotplug.rst doc with your patch from

  On virtual systems the _STA method must always report the CPU as
  ``present``

to

  On virtual systems the _STA method must report the CPU as ``present``
  when it is activated by the firmware

Was your intention that _STA.PRESENT can become 0 when hot-unplugging
virtual CPUs?

-- 
Catalin


      reply	other threads:[~2026-06-09 17:58 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20  2:20 [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Jinjie Ruan
2026-06-02 11:09 ` Will Deacon
2026-06-02 12:14   ` Jinjie Ruan
2026-06-02 12:47     ` Will Deacon
2026-06-03  6:38   ` Jinjie Ruan
2026-06-09 17:58     ` Catalin Marinas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aihUPsuEytsM6Dly@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=arnd@arndb.de \
    --cc=chenl311@chinatelecom.cn \
    --cc=corbet@lwn.net \
    --cc=dietmar.eggemann@arm.com \
    --cc=fengchengwen@huawei.com \
    --cc=jic23@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=maz@kernel.org \
    --cc=mrigendra.chaubey@gmail.com \
    --cc=osama.abdelkader@gmail.com \
    --cc=pierre.gondois@arm.com \
    --cc=punit.agrawal@oss.qualcomm.com \
    --cc=ruanjinjie@huawei.com \
    --cc=sascha.bischoff@arm.com \
    --cc=skhan@linuxfoundation.org \
    --cc=sudeep.holla@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=timothy.hayes@arm.com \
    --cc=will@kernel.org \
    --cc=yangyicong@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox