* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
From: Jinjie Ruan @ 2026-05-11 3:17 UTC
To: catalin.marinas, will, punit.agrawal, rafael.j.wysocki,
fengchengwen, chenl311, suzuki.poulose, maz, timothy.hayes,
lpieralisi, mrigendra.chaubey, arnd, sudeep.holla, yangyicong,
jic23, pierre.gondois, linux-arm-kernel, linux-kernel,
james.morse
On 4/27/2026 10:35 AM, Jinjie Ruan wrote:
> On arm64, when booting with `maxcpus` greater than the number of present
> CPUs (e.g., QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present'
> but have not yet been registered via register_cpu(). Consequently,
> the per-cpu device objects for these CPUs are not yet initialized.
>
> In cpuhp_smt_enable(), the code iterates over all present CPUs. Calling
> _cpu_up() for these unregistered CPUs eventually leads to
> sysfs_create_group() being called with a NULL kobject (or a kobject
> without a directory), triggering the following warning in
> fs/sysfs/group.c:
>
> if (WARN_ON(!kobj || (!update && !kobj->sd)))
> return -EINVAL;
>
> When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
> enumerated CPUs as "present" regardless of their status in the MADT. This
> causes issues with SMT hotplug control. For instance, with QEMU's
> "-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
> follows: the first four CPUs are marked Enabled while the remaining four
> are marked Online Capable to support potential hot-plugging.
>
> Fix this by:
>
> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
> entry before calling set_cpu_present() during SMP initialization.
>
> 2. Properly managing the present mask in acpi_map_cpu() and
> acpi_unmap_cpu() to support actual CPU hotplug events. This aligns with
> other architectures like x86 and LoongArch.
>
> This ensures that only physically available or explicitly enabled CPUs
> are in the present mask, keeping the SMT control logic consistent with
> the actual hardware state.
>
> How to reproduce:
>
> 1. echo off > /sys/devices/system/cpu/smt/control
> psci: CPU1 killed (polled 0 ms)
> psci: CPU3 killed (polled 0 ms)
>
> 2. echo 2 > /sys/devices/system/cpu/smt/control
>
> Detected PIPT I-cache on CPU1
> GICv3: CPU1: found redistributor 1 region 0:0x00000000080c0000
> CPU1: Booted secondary processor 0x0000000001 [0x410fd082]
> Detected PIPT I-cache on CPU3
> GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
> CPU3: Booted secondary processor 0x0000000003 [0x410fd082]
> ------------[ cut here ]------------
> WARNING: fs/sysfs/group.c:137 at internal_create_group+0x41c/0x4bc, CPU#2: sh/181
> Modules linked in:
> CPU: 2 UID: 0 PID: 181 Comm: sh Not tainted 7.0.0-rc1-00010-g8d13386c7624 #142 PREEMPT
> Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : internal_create_group+0x41c/0x4bc
> lr : sysfs_create_group+0x18/0x24
> sp : ffff80008078ba40
> x29: ffff80008078ba40 x28: ffff296c980ad000 x27: ffff00007fb94128
> x26: 0000000000000054 x25: ffffd693e845f3f0 x24: 0000000000000001
> x23: 0000000000000001 x22: 0000000000000004 x21: 0000000000000000
> x20: ffffd693e845fc10 x19: 0000000000000004 x18: 00000000ffffffff
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> x14: 0000000000000358 x13: 0000000000000007 x12: 0000000000000350
> x11: 0000000000000008 x10: 0000000000000407 x9 : 0000000000000400
> x8 : ffff00007fbf3b60 x7 : 0000000000000000 x6 : ffffd693e845f3f0
> x5 : ffff00007fb94128 x4 : 0000000000000000 x3 : ffff000000f4eac0
> x2 : ffffd693e7095a08 x1 : 0000000000000000 x0 : 0000000000000000
> Call trace:
> internal_create_group+0x41c/0x4bc (P)
> sysfs_create_group+0x18/0x24
> topology_add_dev+0x1c/0x28
> cpuhp_invoke_callback+0x104/0x20c
> __cpuhp_invoke_callback_range+0x94/0x11c
> _cpu_up+0x200/0x37c
> cpuhp_smt_enable+0xbc/0x114
> control_store+0xe8/0x1d4
> dev_attr_store+0x18/0x2c
> sysfs_kf_write+0x7c/0x94
> kernfs_fop_write_iter+0x128/0x1b8
> vfs_write+0x2b0/0x354
> ksys_write+0x68/0xfc
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x48/0x10c
> el0_svc_common.constprop.0+0x40/0xe8
> do_el0_svc+0x20/0x2c
> el0_svc+0x34/0x124
> el0t_64_sync_handler+0xa0/0xe4
> el0t_64_sync+0x198/0x19c
> ---[ end trace 0000000000000000 ]---
Hi, just a gentle ping on this v2 patch. It’s been about two weeks, and
I was wondering if there are any further comments or if there's anything
else I should address? Thanks!
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jonathan Cameron <jic23@kernel.org>
> Cc: James Morse <james.morse@arm.com>
> Cc: Yicong Yang <yangyicong@hisilicon.com>
> Cc: stable@vger.kernel.org
> Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> Fixes: eed4583bcf9a6 ("arm64: Kconfig: Enable HOTPLUG_SMT")
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/kernel/acpi.c | 2 ++
> arch/arm64/kernel/smp.c | 12 +++++++++++-
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index 5891f92c2035..681aa2bbc399 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
> return *pcpu;
> }
>
> + set_cpu_present(*pcpu, true);
> return 0;
> }
> EXPORT_SYMBOL(acpi_map_cpu);
>
> int acpi_unmap_cpu(int cpu)
> {
> + set_cpu_present(cpu, false);
> return 0;
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..5932e5b30b71 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
> }
> EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
>
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
> +}
> +
> /*
> * acpi_map_gic_cpu_interface - parse processor MADT entry
> *
> @@ -670,6 +675,10 @@ static void __init acpi_parse_and_init_cpus(void)
> early_map_cpu_to_node(i, acpi_numa_get_nid(i));
> }
> #else
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return false;
> +}
> #define acpi_parse_and_init_cpus(...) do { } while (0)
> #endif
>
> @@ -808,7 +817,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> if (err)
> continue;
>
> - set_cpu_present(cpu, true);
> + if (acpi_disabled || acpi_cpu_is_present(cpu))
> + set_cpu_present(cpu, true);
> numa_store_cpu_info(cpu);
> }
> }
* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
From: Catalin Marinas @ 2026-05-11 9:55 UTC
To: Jinjie Ruan
Cc: will, punit.agrawal, rafael.j.wysocki, fengchengwen, chenl311,
suzuki.poulose, maz, timothy.hayes, lpieralisi, mrigendra.chaubey,
arnd, sudeep.holla, yangyicong, jic23, pierre.gondois,
linux-arm-kernel, linux-kernel, james.morse
On Mon, May 11, 2026 at 11:17:43AM +0800, Jinjie Ruan wrote:
> On 4/27/2026 10:35 AM, Jinjie Ruan wrote:
> > [...]
>
> Hi, just a gentle ping on this v2 patch. It’s been about two weeks, and
> I was wondering if there are any further comments or if there's anything
> else I should address? Thanks!
No comments from me but I'd like James Morse to review, in case he
recalls the decisions made at the time. I think Jonathan already said
that he doesn't fully remember.
Given that it's just a warning rather than a kernel panic, I'm happy
to wait until the upcoming merge window (for 7.2). In the meantime:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
From: Catalin Marinas @ 2026-05-11 17:37 UTC
To: Jinjie Ruan
Cc: will, punit.agrawal, rafael.j.wysocki, fengchengwen, chenl311,
suzuki.poulose, maz, timothy.hayes, lpieralisi, mrigendra.chaubey,
arnd, sudeep.holla, yangyicong, jic23, pierre.gondois,
linux-arm-kernel, linux-kernel, james.morse
On Mon, Apr 27, 2026 at 10:35:07AM +0800, Jinjie Ruan wrote:
> On arm64, when booting with `maxcpus` greater than the number of present
> CPUs (e.g., QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present'
> but have not yet been registered via register_cpu(). Consequently,
> the per-cpu device objects for these CPUs are not yet initialized.
[...]
> Fix this by:
>
> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
> entry before calling set_cpu_present() during SMP initialization.
>
> 2. Properly managing the present mask in acpi_map_cpu() and
> acpi_unmap_cpu() to support actual CPU hotplug events. This aligns with
> other architectures like x86 and LoongArch.
I had a chat with James earlier and IIUC the decision was to mark all
CPUs present and the GIC must be fully initialised. But digging through
the GICv3 code, I don't see it depending on cpu_present_mask but rather
on the "always on" MADT GICR description. So I think it should be safe
as long as we don't rely on the GICC gicr_base_address. But we should
update Documentation/arch/arm64/cpu-hotplug.rst to no longer state that
all online-capable vCPUs are marked as present by the kernel.
(or maybe I misunderstood all this)
--
Catalin