* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
[not found] <20260417075534.3745793-1-ruanjinjie@huawei.com>
@ 2026-04-18 11:55 ` Catalin Marinas
2026-04-18 15:05 ` Catalin Marinas
2026-04-23 10:08 ` Thomas Gleixner
0 siblings, 2 replies; 9+ messages in thread
From: Catalin Marinas @ 2026-04-18 11:55 UTC (permalink / raw)
To: Jinjie Ruan
Cc: tglx, peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
+ James Morse, linux-arm-kernel
On Fri, Apr 17, 2026 at 03:55:34PM +0800, Jinjie Ruan wrote:
> When booting with `maxcpus` greater than the number of present CPUs (e.g.,
> QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present' but have not
> yet been registered via register_cpu(). Consequently, the per-cpu device
> objects for these CPUs are not yet initialized.
>
> In cpuhp_smt_enable(), the code iterates over all present CPUs. Calling
> _cpu_up() for these unregistered CPUs eventually leads to
> sysfs_create_group() being called with a NULL kobject (or a kobject
> without a directory), triggering the following warning in
> fs/sysfs/group.c:
>
> if (WARN_ON(!kobj || (!update && !kobj->sd)))
> return -EINVAL;
>
> Fix this by adding a check for get_cpu_device(cpu). This ensures that
> SMT is only enabled for CPUs that have successfully completed their
> base device registration in sysfs.
>
> How to reproduce:
>
> 1. echo off > /sys/devices/system/cpu/smt/control
> psci: CPU1 killed (polled 0 ms)
> psci: CPU3 killed (polled 0 ms)
>
> 2. echo 2 > /sys/devices/system/cpu/smt/control
>
> Detected PIPT I-cache on CPU1
> GICv3: CPU1: found redistributor 1 region 0:0x00000000080c0000
> CPU1: Booted secondary processor 0x0000000001 [0x410fd082]
> Detected PIPT I-cache on CPU3
> GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
> CPU3: Booted secondary processor 0x0000000003 [0x410fd082]
> ------------[ cut here ]------------
> WARNING: fs/sysfs/group.c:137 at internal_create_group+0x41c/0x4bc, CPU#2: sh/181
> Modules linked in:
> CPU: 2 UID: 0 PID: 181 Comm: sh Not tainted 7.0.0-rc1-00010-g8d13386c7624 #142 PREEMPT
> Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : internal_create_group+0x41c/0x4bc
> lr : sysfs_create_group+0x18/0x24
> sp : ffff80008078ba40
> x29: ffff80008078ba40 x28: ffff296c980ad000 x27: ffff00007fb94128
> x26: 0000000000000054 x25: ffffd693e845f3f0 x24: 0000000000000001
> x23: 0000000000000001 x22: 0000000000000004 x21: 0000000000000000
> x20: ffffd693e845fc10 x19: 0000000000000004 x18: 00000000ffffffff
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> x14: 0000000000000358 x13: 0000000000000007 x12: 0000000000000350
> x11: 0000000000000008 x10: 0000000000000407 x9 : 0000000000000400
> x8 : ffff00007fbf3b60 x7 : 0000000000000000 x6 : ffffd693e845f3f0
> x5 : ffff00007fb94128 x4 : 0000000000000000 x3 : ffff000000f4eac0
> x2 : ffffd693e7095a08 x1 : 0000000000000000 x0 : 0000000000000000
> Call trace:
> internal_create_group+0x41c/0x4bc (P)
> sysfs_create_group+0x18/0x24
> topology_add_dev+0x1c/0x28
> cpuhp_invoke_callback+0x104/0x20c
> __cpuhp_invoke_callback_range+0x94/0x11c
> _cpu_up+0x200/0x37c
> cpuhp_smt_enable+0xbc/0x114
> control_store+0xe8/0x1d4
> dev_attr_store+0x18/0x2c
> sysfs_kf_write+0x7c/0x94
> kernfs_fop_write_iter+0x128/0x1b8
> vfs_write+0x2b0/0x354
> ksys_write+0x68/0xfc
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x48/0x10c
> el0_svc_common.constprop.0+0x40/0xe8
> do_el0_svc+0x20/0x2c
> el0_svc+0x34/0x124
> el0t_64_sync_handler+0xa0/0xe4
> el0t_64_sync+0x198/0x19c
> ---[ end trace 0000000000000000 ]---
>
> Cc: stable@vger.kernel.org
> Fixes: eed4583bcf9a6 ("arm64: Kconfig: Enable HOTPLUG_SMT")
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> kernel/cpu.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index bc4f7a9ba64e..df725d92ad4f 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2706,6 +2706,13 @@ int cpuhp_smt_enable(void)
> cpu_maps_update_begin();
> cpu_smt_control = CPU_SMT_ENABLED;
> for_each_present_cpu(cpu) {
> + /*
> + * Skip CPUs that have not been registered in sysfs yet.
> + * This avoids triggering NULL kobject warnings for maxcpus.
> + */
> + if (!get_cpu_device(cpu))
> + continue;
> +
> /* Skip online CPUs and CPUs on offline nodes */
> if (cpu_online(cpu) || !node_online(cpu_to_node(cpu)))
> continue;
I spent some time trying to understand how we can get into this
situation. It seems to be an ACPI only thing.
Since commit eba4675008a6 ("arm64: arch_register_cpu() variant to check
if an ACPI handle is now available.") as part of the virtual CPU hotplug
series, arch_register_cpu() can defer registering the CPU devices. IIUC
this is done later via acpi_processor_hotadd_init(). smp_prepare_cpus(),
however, still marks them as present.
cpuhp_smt_enable(), if triggered later, walks the present cpus and
attempts _cpu_up(). cpuhp_up_callbacks() will call topology_add_dev()
(registered as a CPUHP_TOPOLOGY_PREPARE callback) and warn since
get_cpu_device() returns NULL. I think this can't happen during early
_cpu_up() calls since smp_init() is called before the topology callback
has been registered (slightly later during do_basic_setup()).
I'm not sure what the best fix should be. If we go with something along
the lines of the above, I wonder whether we should instead return an
error in the topology_add_dev() callback if get_cpu_device() returns
NULL. It feels a bit wrong to add a check for sysfs in
cpuhp_smt_enable() just because of some callbacks attempted by
_cpu_up(). There's precedent in cpu_capacity_sysctl_add() returning
-ENOENT. However, this messes up the smt control enabling since
cpuhp_smt_enable() will bail out early and return an error.
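To make the trade-off above concrete, here is a minimal user-space model,
not kernel code. The names model_cpu, model_cpu_up() and model_smt_enable()
are made up for illustration, and 'registered' merely stands in for
get_cpu_device() returning non-NULL. It assumes the enable loop stops at
the first error, as described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8

/* Hypothetical per-CPU state; not a kernel structure. */
struct model_cpu {
	bool present;		/* bit set in the present mask */
	bool registered;	/* stands in for get_cpu_device(cpu) != NULL */
	bool online;
};

/* Models a bring-up whose callback rejects CPUs without a device. */
static int model_cpu_up(struct model_cpu *c)
{
	if (!c->registered)
		return -ENOENT;
	c->online = true;
	return 0;
}

/*
 * Models the enable loop. With skip_unregistered set, unregistered CPUs
 * are skipped in the loop itself; otherwise the error from the callback
 * ends the loop early and is returned to the caller.
 */
static int model_smt_enable(struct model_cpu *cpus, int n, bool skip_unregistered)
{
	int ret = 0;

	assert(n <= MODEL_NR_CPUS);
	for (int i = 0; i < n; i++) {
		if (!cpus[i].present || cpus[i].online)
			continue;
		if (skip_unregistered && !cpus[i].registered)
			continue;
		ret = model_cpu_up(&cpus[i]);
		if (ret)
			break;
	}
	return ret;
}
```

With CPUs 0-3 registered and CPUs 4-7 present but unregistered, the
skip_unregistered=false variant onlines the registered siblings and then
returns -ENOENT at CPU 4, while skip_unregistered=true returns 0.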
Another option would have been to avoid marking such CPUs present but I
think this will break other things. Yet another option is to register
all CPU devices even if they never come up (like maxcpus greater than
actual CPUs).
Opinions? It might be an arm64+ACPI-only thing.
--
Catalin
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-18 11:55 ` [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Catalin Marinas
@ 2026-04-18 15:05 ` Catalin Marinas
2026-04-20 1:29 ` Jinjie Ruan
2026-04-23 12:46 ` Jinjie Ruan
2026-04-23 10:08 ` Thomas Gleixner
1 sibling, 2 replies; 9+ messages in thread
From: Catalin Marinas @ 2026-04-18 15:05 UTC (permalink / raw)
To: Jinjie Ruan
Cc: tglx, peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
On Sat, Apr 18, 2026 at 12:55:22PM +0100, Catalin Marinas wrote:
> On Fri, Apr 17, 2026 at 03:55:34PM +0800, Jinjie Ruan wrote:
> > When booting with `maxcpus` greater than the number of present CPUs (e.g.,
> > QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present' but have not
> > yet been registered via register_cpu(). Consequently, the per-cpu device
> > objects for these CPUs are not yet initialized.
[...]
> Another option would have been to avoid marking such CPUs present but I
> think this will break other things. Yet another option is to register
> all CPU devices even if they never come up (like maxcpus greater than
> actual CPUs).
Something like below, untested (and I don't claim I properly understand
this code; just lots of tokens used trying to make sense of it ;))
------------------------8<-------------------------
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index a9d884fd1d00..4c0a5ed906ea 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
return *pcpu;
}
+ set_cpu_present(*pcpu, true);
return 0;
}
EXPORT_SYMBOL(acpi_map_cpu);
int acpi_unmap_cpu(int cpu)
{
+ set_cpu_present(cpu, false);
return 0;
}
EXPORT_SYMBOL(acpi_unmap_cpu);
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1aa324104afb..751a74d997e1 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -510,8 +510,10 @@ int arch_register_cpu(int cpu)
struct cpu *c = &per_cpu(cpu_devices, cpu);
if (!acpi_disabled && !acpi_handle &&
- IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+ IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU)) {
+ set_cpu_present(cpu, false);
return -EPROBE_DEFER;
+ }
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* For now block anything that looks like physical CPU Hotplug */
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-18 15:05 ` Catalin Marinas
@ 2026-04-20 1:29 ` Jinjie Ruan
2026-04-23 12:46 ` Jinjie Ruan
1 sibling, 0 replies; 9+ messages in thread
From: Jinjie Ruan @ 2026-04-20 1:29 UTC (permalink / raw)
To: Catalin Marinas
Cc: tglx, peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
On 4/18/2026 11:05 PM, Catalin Marinas wrote:
> On Sat, Apr 18, 2026 at 12:55:22PM +0100, Catalin Marinas wrote:
>> On Fri, Apr 17, 2026 at 03:55:34PM +0800, Jinjie Ruan wrote:
>>> When booting with `maxcpus` greater than the number of present CPUs (e.g.,
>>> QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present' but have not
>>> yet been registered via register_cpu(). Consequently, the per-cpu device
>>> objects for these CPUs are not yet initialized.
> [...]
>> Another option would have been to avoid marking such CPUs present but I
>> think this will break other things. Yet another option is to register
>> all CPU devices even if they never come up (like maxcpus greater than
>> actual CPUs).
>
> Something like below, untested (and I don't claim I properly understand
> this code; just lots of tokens used trying to make sense of it ;))
>
> ------------------------8<-------------------------
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index a9d884fd1d00..4c0a5ed906ea 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
> return *pcpu;
> }
>
> + set_cpu_present(*pcpu, true);
> return 0;
> }
> EXPORT_SYMBOL(acpi_map_cpu);
>
> int acpi_unmap_cpu(int cpu)
> {
> + set_cpu_present(cpu, false);
> return 0;
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..751a74d997e1 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -510,8 +510,10 @@ int arch_register_cpu(int cpu)
> struct cpu *c = &per_cpu(cpu_devices, cpu);
>
> if (!acpi_disabled && !acpi_handle &&
> - IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
> + IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU)) {
> + set_cpu_present(cpu, false);
> return -EPROBE_DEFER;
> + }
I have verified this patch in my local environment, and it passes the test.
>
> #ifdef CONFIG_ACPI_HOTPLUG_CPU
> /* For now block anything that looks like physical CPU Hotplug */
>
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-18 11:55 ` [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Catalin Marinas
2026-04-18 15:05 ` Catalin Marinas
@ 2026-04-23 10:08 ` Thomas Gleixner
2026-04-23 12:32 ` Jinjie Ruan
1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2026-04-23 10:08 UTC (permalink / raw)
To: Catalin Marinas, Jinjie Ruan
Cc: peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
On Sat, Apr 18 2026 at 12:55, Catalin Marinas wrote:
> Another option would have been to avoid marking such CPUs present but I
> think this will break other things. Yet another option is to register
> all CPU devices even if they never come up (like maxcpus greater than
> actual CPUs).
>
> Opinions? It might be an arm64+ACPI-only thing.
I think so. The proper thing to do is to apply sane limits:
1) The possible CPUs enumerated by firmware N_POSSIBLE_FW
2) The maxcpus limit on the command line N_MAXCPUS_CL
So the actual possible CPUs evaluates to:
num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS);
The evaluation of the firmware should not mark CPUs present which are
actually not. ACPI gives you that information. See:
5.2.12.14 GIC CPU Interface (GICC) Structure
in the ACPI spec. That has two related bits:
Enabled:
If this bit is set, the processor is ready for use. If this bit is
clear and the Online Capable bit is set, the system supports enabling
this processor during OS runtime. If this bit is clear and the Online
Capable bit is also clear, this processor is unusable, and the
operating system support will not attempt to use it.
Online Capable:
The information conveyed by this bit depends on the value of the
Enabled bit. If the Enabled bit is set, this bit is reserved and must
be zero. Otherwise, if this bit is set, the system supports enabling
this processor later during OS runtime
So the combination of those gives you the right answer:
Enabled  Online Capable
0        0               Not present, not possible
0        1               Not present, but possible to "hotplug" later
1        0               Present
1        1               Invalid
The kernel sizes everything on the number of possible CPUs and the
present CPU mask is only there to figure out which CPUs are actually
usable and can be brought up.
The runtime physical hotplug mechanics use acpi_[un]map_cpu() to toggle
the present bit.
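As a minimal user-space sketch of the table and the min() rule above (not
kernel code): the macro, enum and function names below are illustrative
stand-ins, and the bit positions assume Enabled is bit 0 and Online Capable
is bit 3 of the GICC flags.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the GICC flag bits; names are illustrative. */
#define GICC_ENABLED		(1u << 0)
#define GICC_ONLINE_CAPABLE	(1u << 3)

enum cpu_fw_state {
	CPU_NOT_POSSIBLE,	/* Enabled=0, Online Capable=0 */
	CPU_HOTPLUGGABLE,	/* Enabled=0, Online Capable=1 */
	CPU_PRESENT,		/* Enabled=1, Online Capable=0 */
	CPU_INVALID,		/* Enabled=1, Online Capable=1 */
};

/* Decode the two bits exactly as laid out in the table. */
static enum cpu_fw_state classify_gicc_flags(uint32_t flags)
{
	int enabled = !!(flags & GICC_ENABLED);
	int capable = !!(flags & GICC_ONLINE_CAPABLE);

	if (enabled && capable)
		return CPU_INVALID;
	if (enabled)
		return CPU_PRESENT;
	if (capable)
		return CPU_HOTPLUGGABLE;
	return CPU_NOT_POSSIBLE;
}

static unsigned int min3u(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a < b ? a : b;

	return m < c ? m : c;
}

/* num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS) */
static unsigned int sketch_num_possible(unsigned int n_possible_fw,
					unsigned int n_maxcpus_cl,
					unsigned int config_nr_cpus)
{
	assert(config_nr_cpus > 0);
	return min3u(n_possible_fw, n_maxcpus_cl, config_nr_cpus);
}
```

Only CPU_PRESENT entries would go into the present mask at boot; a
CPU_HOTPLUGGABLE entry counts towards the possible CPUs and becomes present
later through acpi_map_cpu().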
Thanks,
tglx
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-23 10:08 ` Thomas Gleixner
@ 2026-04-23 12:32 ` Jinjie Ruan
2026-04-23 20:11 ` Catalin Marinas
0 siblings, 1 reply; 9+ messages in thread
From: Jinjie Ruan @ 2026-04-23 12:32 UTC (permalink / raw)
To: Thomas Gleixner, Catalin Marinas
Cc: peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
On 4/23/2026 6:08 PM, Thomas Gleixner wrote:
> On Sat, Apr 18 2026 at 12:55, Catalin Marinas wrote:
>> Another option would have been to avoid marking such CPUs present but I
>> think this will break other things. Yet another option is to register
>> all CPU devices even if they never come up (like maxcpus greater than
>> actual CPUs).
>>
>> Opinions? It might be an arm64+ACPI-only thing.
>
> I think so. The proper thing to do is to apply sane limits:
>
> 1) The possible CPUs enumerated by firmware N_POSSIBLE_FW
>
> 2) The maxcpus limit on the command line N_MAXCPUS_CL
>
> So the actual possible CPUs evaluates to:
>
> num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS);
>
> The evaluation of the firmware should not mark CPUs present which are
> actually not. ACPI gives you that information. See:
>
> 5.2.12.14 GIC CPU Interface (GICC) Structure
>
> in the ACPI spec. That has two related bits:
>
> Enabled:
>
> If this bit is set, the processor is ready for use. If this bit is
> clear and the Online Capable bit is set, the system supports enabling
> this processor during OS runtime. If this bit is clear and the Online
> Capable bit is also clear, this processor is unusable, and the
> operating system support will not attempt to use it.
>
> Online Capable:
>
> The information conveyed by this bit depends on the value of the
> Enabled bit. If the Enabled bit is set, this bit is reserved and must
> be zero. Otherwise, if this bit is set, the system supports enabling
> this processor later during OS runtime
>
> So the combination of those gives you the right answer:
>
> Enabled  Online Capable
> 0        0               Not present, not possible
> 0        1               Not present, but possible to "hotplug" later
> 1        0               Present
> 1        1               Invalid
On x86, it seems that all CPUs with the ACPI_MADT_ENABLED bit set will
be marked as present.
acpi_parse_x2apic()
-> enabled = processor->lapic_flags & ACPI_MADT_ENABLED
-> topology_register_apic(enabled)
-> topo_register_apic(enabled)
-> set_cpu_present(cpu, true)
>
> The kernel sizes everything on the number of possible CPUs and the
> present CPU mask is only there to figure out which CPUs are actually
> usable and can be brought up.
>
> The runtime physical hotplug mechanics use acpi_[un]map_cpu() to toggle
> the present bit.
>
> Thanks,
>
> tglx
>
>
>
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-18 15:05 ` Catalin Marinas
2026-04-20 1:29 ` Jinjie Ruan
@ 2026-04-23 12:46 ` Jinjie Ruan
1 sibling, 0 replies; 9+ messages in thread
From: Jinjie Ruan @ 2026-04-23 12:46 UTC (permalink / raw)
To: Catalin Marinas
Cc: tglx, peterz, sudeep.holla, yangyicong, dietmar.eggemann,
Jonathan.Cameron, linux-kernel, James Morse, linux-arm-kernel
On 4/18/2026 11:05 PM, Catalin Marinas wrote:
> On Sat, Apr 18, 2026 at 12:55:22PM +0100, Catalin Marinas wrote:
>> On Fri, Apr 17, 2026 at 03:55:34PM +0800, Jinjie Ruan wrote:
>>> When booting with `maxcpus` greater than the number of present CPUs (e.g.,
>>> QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present' but have not
>>> yet been registered via register_cpu(). Consequently, the per-cpu device
>>> objects for these CPUs are not yet initialized.
> [...]
>> Another option would have been to avoid marking such CPUs present but I
>> think this will break other things. Yet another option is to register
>> all CPU devices even if they never come up (like maxcpus greater than
>> actual CPUs).
>
> Something like below, untested (and I don't claim I properly understand
> this code; just lots of tokens used trying to make sense of it ;))
>
> ------------------------8<-------------------------
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index a9d884fd1d00..4c0a5ed906ea 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
> return *pcpu;
> }
>
> + set_cpu_present(*pcpu, true);
> return 0;
> }
> EXPORT_SYMBOL(acpi_map_cpu);
>
> int acpi_unmap_cpu(int cpu)
> {
> + set_cpu_present(cpu, false);
This logic, where we set 'present' in acpi_map_cpu() and clear it in
acpi_unmap_cpu(), seems to align with how x86 does it.
acpi_map_cpu()
-> topology_hotplug_apic()
-> topo_set_cpuids()
-> set_cpu_present(cpu, true)
acpi_unmap_cpu()
-> topology_hotunplug_apic(cpu)
-> set_cpu_present(cpu, false)
Should we consider moving the setting/clearing of the 'present' bit into
the generic ACPI code (e.g., within the success path of acpi_map_cpu)?
This would ensure consistency across architectures and prevent new
implementations from missing these critical state updates.
> return 0;
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..751a74d997e1 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -510,8 +510,10 @@ int arch_register_cpu(int cpu)
> struct cpu *c = &per_cpu(cpu_devices, cpu);
>
> if (!acpi_disabled && !acpi_handle &&
> - IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
> + IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU)) {
> + set_cpu_present(cpu, false);
> return -EPROBE_DEFER;
> + }
>
> #ifdef CONFIG_ACPI_HOTPLUG_CPU
> /* For now block anything that looks like physical CPU Hotplug */
>
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-23 12:32 ` Jinjie Ruan
@ 2026-04-23 20:11 ` Catalin Marinas
2026-04-24 1:56 ` Jinjie Ruan
2026-04-24 2:47 ` Jinjie Ruan
0 siblings, 2 replies; 9+ messages in thread
From: Catalin Marinas @ 2026-04-23 20:11 UTC (permalink / raw)
To: Jinjie Ruan
Cc: Thomas Gleixner, peterz, sudeep.holla, yangyicong,
dietmar.eggemann, Jonathan.Cameron, linux-kernel, James Morse,
linux-arm-kernel
On Thu, Apr 23, 2026 at 08:32:34PM +0800, Jinjie Ruan wrote:
> On 4/23/2026 6:08 PM, Thomas Gleixner wrote:
> > On Sat, Apr 18 2026 at 12:55, Catalin Marinas wrote:
> >> Another option would have been to avoid marking such CPUs present but I
> >> think this will break other things. Yet another option is to register
> >> all CPU devices even if they never come up (like maxcpus greater than
> >> actual CPUs).
> >>
> >> Opinions? It might be an arm64+ACPI-only thing.
> >
> > I think so. The proper thing to do is to apply sane limits:
> >
> > 1) The possible CPUs enumerated by firmware N_POSSIBLE_FW
> >
> > 2) The maxcpus limit on the command line N_MAXCPUS_CL
> >
> > So the actual possible CPUs evaluates to:
> >
> > num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS);
> >
> > The evaluation of the firmware should not mark CPUs present which are
> > actually not. ACPI gives you that information. See:
> >
> > 5.2.12.14 GIC CPU Interface (GICC) Structure
> >
> > in the ACPI spec. That has two related bits:
> >
> > Enabled:
> >
> > If this bit is set, the processor is ready for use. If this bit is
> > clear and the Online Capable bit is set, the system supports enabling
> > this processor during OS runtime. If this bit is clear and the Online
> > Capable bit is also clear, this processor is unusable, and the
> > operating system support will not attempt to use it.
> >
> > Online Capable:
> >
> > The information conveyed by this bit depends on the value of the
> > Enabled bit. If the Enabled bit is set, this bit is reserved and must
> > be zero. Otherwise, if this bit is set, the system supports enabling
> > this processor later during OS runtime
> >
> > So the combination of those gives you the right answer:
> >
> > Enabled  Online Capable
> > 0        0               Not present, not possible
> > 0        1               Not present, but possible to "hotplug" later
> > 1        0               Present
> > 1        1               Invalid
>
> On x86, it seems that all CPUs with the ACPI_MADT_ENABLED bit set will
> be marked as present.
>
> acpi_parse_x2apic()
> -> enabled = processor->lapic_flags & ACPI_MADT_ENABLED
> -> topology_register_apic(enabled)
> -> topo_register_apic(enabled)
> -> set_cpu_present(cpu, true)
Yes but arm64 marks all CPUs present even if !ACPI_MADT_ENABLED as we
don't have the notion of hardware CPU hotplug.
I need to dig some more into the original vCPU hotplug support and why
we ended up with all CPUs marked as present even if not calling
register_cpu():
https://lore.kernel.org/linux-arm-kernel/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
What's the MADT GICC provided by qemu with "-smp cpus=4,maxcpus=8"? If
it says Enabled for the first 4 and Online Capable for the rest, maybe
we can try something like below:
----------------------8<-----------------
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 5891f92c2035..681aa2bbc399 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
return *pcpu;
}
+ set_cpu_present(*pcpu, true);
return 0;
}
EXPORT_SYMBOL(acpi_map_cpu);
int acpi_unmap_cpu(int cpu)
{
+ set_cpu_present(cpu, false);
return 0;
}
EXPORT_SYMBOL(acpi_unmap_cpu);
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1aa324104afb..6421027669fc 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
}
EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
+static bool acpi_cpu_is_present(int cpu)
+{
+ return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
+}
+
/*
* acpi_map_gic_cpu_interface - parse processor MADT entry
*
@@ -670,6 +675,11 @@ static void __init acpi_parse_and_init_cpus(void)
early_map_cpu_to_node(i, acpi_numa_get_nid(i));
}
#else
+static bool acpi_cpu_is_present(int cpu)
+{
+ return false;
+}
+
#define acpi_parse_and_init_cpus(...) do { } while (0)
#endif
@@ -808,7 +818,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
if (err)
continue;
- set_cpu_present(cpu, true);
+ if (acpi_disabled || acpi_cpu_is_present(cpu))
+ set_cpu_present(cpu, true);
numa_store_cpu_info(cpu);
}
}
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-23 20:11 ` Catalin Marinas
@ 2026-04-24 1:56 ` Jinjie Ruan
2026-04-24 2:47 ` Jinjie Ruan
1 sibling, 0 replies; 9+ messages in thread
From: Jinjie Ruan @ 2026-04-24 1:56 UTC (permalink / raw)
To: Catalin Marinas
Cc: Thomas Gleixner, peterz, sudeep.holla, yangyicong,
dietmar.eggemann, Jonathan.Cameron, linux-kernel, James Morse,
linux-arm-kernel
On 4/24/2026 4:11 AM, Catalin Marinas wrote:
> On Thu, Apr 23, 2026 at 08:32:34PM +0800, Jinjie Ruan wrote:
>> On 4/23/2026 6:08 PM, Thomas Gleixner wrote:
>>> On Sat, Apr 18 2026 at 12:55, Catalin Marinas wrote:
>>>> Another option would have been to avoid marking such CPUs present but I
>>>> think this will break other things. Yet another option is to register
>>>> all CPU devices even if they never come up (like maxcpus greater than
>>>> actual CPUs).
>>>>
>>>> Opinions? It might be an arm64+ACPI-only thing.
>>>
>>> I think so. The proper thing to do is to apply sane limits:
>>>
>>> 1) The possible CPUs enumerated by firmware N_POSSIBLE_FW
>>>
>>> 2) The maxcpus limit on the command line N_MAXCPUS_CL
>>>
>>> So the actual possible CPUs evaluates to:
>>>
>>> num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS);
>>>
>>> The evaluation of the firmware should not mark CPUs present which are
>>> actually not. ACPI gives you that information. See:
>>>
>>> 5.2.12.14 GIC CPU Interface (GICC) Structure
>>>
>>> in the ACPI spec. That has two related bits:
>>>
>>> Enabled:
>>>
>>> If this bit is set, the processor is ready for use. If this bit is
>>> clear and the Online Capable bit is set, the system supports enabling
>>> this processor during OS runtime. If this bit is clear and the Online
>>> Capable bit is also clear, this processor is unusable, and the
>>> operating system support will not attempt to use it.
>>>
>>> Online Capable:
>>>
>>> The information conveyed by this bit depends on the value of the
>>> Enabled bit. If the Enabled bit is set, this bit is reserved and must
>>> be zero. Otherwise, if this bit is set, the system supports enabling
>>> this processor later during OS runtime
>>>
>>> So the combination of those gives you the right answer:
>>>
>>> Enabled  Online Capable
>>> 0        0               Not present, not possible
>>> 0        1               Not present, but possible to "hotplug" later
>>> 1        0               Present
>>> 1        1               Invalid
>>
>> On x86, it seems that all CPUs with the ACPI_MADT_ENABLED bit set will
>> be marked as present.
>>
>> acpi_parse_x2apic()
>> -> enabled = processor->lapic_flags & ACPI_MADT_ENABLED
>> -> topology_register_apic(enabled)
>> -> topo_register_apic(enabled)
>> -> set_cpu_present(cpu, true)
>
> Yes but arm64 marks all CPUs present even if !ACPI_MADT_ENABLED as we
> don't have the notion of hardware CPU hotplug.
>
> I need to dig some more into the original vCPU hotplug support and why
> we ended up with all CPUs marked as present even if not calling
> register_cpu():
>
> https://lore.kernel.org/linux-arm-kernel/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
>
> What's the MADT GICC provided by qemu with "-smp cpus=4,maxcpus=8"? If
> it says Enabled for the first 4 and Online Capable for the rest, maybe
> we can try something like below:
Yes, you are absolutely right: Enabled for the first 4 (with GIC Flags:
0x1, bit 0 set) and Online Capable for the rest (with GIC Flags: 0x8, bit 3
set). The ACPI MADT disassembly is as follows:
Link:
https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
# cat /sys/firmware/acpi/tables/APIC > madt.bin
# iasl -d madt.bin
[048h 0072 4] CPU Interface Number : 00000000
[030h 0048 4] Local GIC Hardware ID : 00000000
...
[04Ch 0076 4] Processor UID : 00000000
[050h 0080 4] Flags (decoded below) : 00000001
Processor Enabled : 1
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[098h 0152 4] CPU Interface Number : 00000001
[09Ch 0156 4] Processor UID : 00000001
[0A0h 0160 4] Flags (decoded below) : 00000001
Processor Enabled : 1
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[0E8h 0232 4] CPU Interface Number : 00000002
[0ECh 0236 4] Processor UID : 00000002
[0F0h 0240 4] Flags (decoded below) : 00000001
Processor Enabled : 1
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[138h 0312 4] CPU Interface Number : 00000003
[13Ch 0316 4] Processor UID : 00000003
[140h 0320 4] Flags (decoded below) : 00000001
Processor Enabled : 1
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[188h 0392 4] CPU Interface Number : 00000004
[18Ch 0396 4] Processor UID : 00000004
[190h 0400 4] Flags (decoded below) : 00000008
Processor Enabled : 0
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[1D8h 0472 4] CPU Interface Number : 00000005
[1DCh 0476 4] Processor UID : 00000005
[1E0h 0480 4] Flags (decoded below) : 00000008
Processor Enabled : 0
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[228h 0552 4] CPU Interface Number : 00000006
[22Ch 0556 4] Processor UID : 00000006
[230h 0560 4] Flags (decoded below) : 00000008
Processor Enabled : 0
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
[278h 0632 4] CPU Interface Number : 00000007
[27Ch 0636 4] Processor UID : 00000007
[280h 0640 4] Flags (decoded below) : 00000008
Processor Enabled : 0
Performance Interrupt Trigger Mode : 0
Virtual GIC Interrupt Trigger Mode : 0
...
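As a quick sanity check of the proposed acpi_cpu_is_present() condition
against these flags, here is a minimal user-space sketch. It is not kernel
code: present_mask_from_gicc() and GICC_ENABLED are illustrative names, and
it assumes Enabled is bit 0 of the GICC flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the Enabled bit of the GICC flags. */
#define GICC_ENABLED	(1u << 0)

/*
 * Build a present mask from per-CPU GICC flags, mirroring the condition
 * "acpi_disabled || Enabled bit set". Supports at most 64 CPUs.
 */
static uint64_t present_mask_from_gicc(const uint32_t *flags, size_t n,
				       bool acpi_disabled)
{
	uint64_t mask = 0;

	assert(n <= 64);
	for (size_t cpu = 0; cpu < n; cpu++) {
		if (acpi_disabled || (flags[cpu] & GICC_ENABLED))
			mask |= (uint64_t)1 << cpu;
	}
	return mask;
}
```

Fed the eight Flags values from the dump (0x1 for CPUs 0-3, 0x8 for CPUs
4-7), this yields a present mask of 0x0f, i.e. only the first four CPUs;
with acpi_disabled set it yields 0xff.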
>
> ----------------------8<-----------------
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index 5891f92c2035..681aa2bbc399 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
> return *pcpu;
> }
>
> + set_cpu_present(*pcpu, true);
> return 0;
> }
> EXPORT_SYMBOL(acpi_map_cpu);
>
> int acpi_unmap_cpu(int cpu)
> {
> + set_cpu_present(cpu, false);
> return 0;
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..6421027669fc 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
> }
> EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
>
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
> +}
> +
> /*
> * acpi_map_gic_cpu_interface - parse processor MADT entry
> *
> @@ -670,6 +675,11 @@ static void __init acpi_parse_and_init_cpus(void)
> early_map_cpu_to_node(i, acpi_numa_get_nid(i));
> }
> #else
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return false;
> +}
> +
> #define acpi_parse_and_init_cpus(...) do { } while (0)
> #endif
>
> @@ -808,7 +818,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> if (err)
> continue;
>
> - set_cpu_present(cpu, true);
> + if (acpi_disabled || acpi_cpu_is_present(cpu))
> + set_cpu_present(cpu, true);
> numa_store_cpu_info(cpu);
> }
> }
>
* Re: [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
2026-04-23 20:11 ` Catalin Marinas
2026-04-24 1:56 ` Jinjie Ruan
@ 2026-04-24 2:47 ` Jinjie Ruan
1 sibling, 0 replies; 9+ messages in thread
From: Jinjie Ruan @ 2026-04-24 2:47 UTC (permalink / raw)
To: Catalin Marinas
Cc: Thomas Gleixner, peterz, sudeep.holla, yangyicong,
dietmar.eggemann, Jonathan.Cameron, linux-kernel, James Morse,
linux-arm-kernel
On 4/24/2026 4:11 AM, Catalin Marinas wrote:
> On Thu, Apr 23, 2026 at 08:32:34PM +0800, Jinjie Ruan wrote:
>> On 4/23/2026 6:08 PM, Thomas Gleixner wrote:
>>> On Sat, Apr 18 2026 at 12:55, Catalin Marinas wrote:
>>>> Another option would have been to avoid marking such CPUs present but I
>>>> think this will break other things. Yet another option is to register
>>>> all CPU devices even if they never come up (like maxcpus greater than
>>>> actual CPUs).
>>>>
>>>> Opinions? It might be an arm64+ACPI-only thing.
>>>
>>> I think so. The proper thing to do is to apply sane limits:
>>>
>>> 1) The possible CPUs enumerated by firmware N_POSSIBLE_FW
>>>
>>> 2) The maxcpus limit on the command line N_MAXCPUS_CL
>>>
>>> So the actual possible CPUs evaluates to:
>>>
>>> num_possible = min(N_POSSIBLE_FW, N_MAXCPUS_CL, CONFIG_NR_CPUS);
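
As an illustration only (not part of any posted patch), the clamp described above can be sketched in plain userspace C. The helper name effective_possible_cpus() and its parameter names are hypothetical; they simply mirror N_POSSIBLE_FW, N_MAXCPUS_CL and CONFIG_NR_CPUS from the formula.

```c
/*
 * Hypothetical helper, not a kernel API: return the smallest of the
 * three limits named in the formula above.
 */
static unsigned int effective_possible_cpus(unsigned int n_possible_fw,
					    unsigned int n_maxcpus_cl,
					    unsigned int config_nr_cpus)
{
	unsigned int n = n_possible_fw;

	/* The command-line maxcpus= limit can only lower the count. */
	if (n_maxcpus_cl < n)
		n = n_maxcpus_cl;

	/* The build-time limit caps whatever remains. */
	if (config_nr_cpus < n)
		n = config_nr_cpus;

	return n;
}
```

For example, with 8 CPUs enumerated by firmware, maxcpus=4 on the command line and a build-time limit of 256, the helper would return 4.
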
>>>
>>> The evaluation of the firmware should not mark CPUs present which are
>>> actually not. ACPI gives you that information. See:
>>>
>>> 5.2.12.14 GIC CPU Interface (GICC) Structure
>>>
>>> in the ACPI spec. That has two related bits:
>>>
>>> Enabled:
>>>
>>> If this bit is set, the processor is ready for use. If this bit is
>>> clear and the Online Capable bit is set, the system supports enabling
>>> this processor during OS runtime. If this bit is clear and the Online
>>> Capable bit is also clear, this processor is unusable, and the
>>> operating system support will not attempt to use it.
>>>
>>> Online Capable:
>>>
>>> The information conveyed by this bit depends on the value of the
>>> Enabled bit. If the Enabled bit is set, this bit is reserved and must
>>> be zero. Otherwise, if this bit is set, the system supports enabling
>>> this processor later during OS runtime
>>>
>>> So the combination of those gives you the right answer:
>>>
>>>     Enabled   Online Capable
>>>        0            0          Not present, not possible
>>>        0            1          Not present, but possible to "hotplug" later
>>>        1            0          Present
>>>        1            1          Invalid
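
As an illustration only (not part of any posted patch), the four combinations in the table above could be decoded as in the userspace sketch below. The enum, the macro names and classify_gicc_flags() are hypothetical. The bit positions (bit 0 for Enabled, bit 3 for Online Capable) are an assumption taken from the GICC flags layout in the ACPI specification and should be checked against the spec revision in use.

```c
#include <stdint.h>

/* Assumed GICC flag bits: bit 0 = Enabled, bit 3 = Online Capable. */
#define GICC_FLAG_ENABLED		(UINT32_C(1) << 0)
#define GICC_FLAG_ONLINE_CAPABLE	(UINT32_C(1) << 3)

/* Hypothetical classification mirroring the four rows of the table. */
enum gicc_cpu_state {
	GICC_CPU_NOT_POSSIBLE,	/* Enabled=0, Online Capable=0 */
	GICC_CPU_HOTPLUGGABLE,	/* Enabled=0, Online Capable=1 */
	GICC_CPU_PRESENT,	/* Enabled=1, Online Capable=0 */
	GICC_CPU_INVALID,	/* Enabled=1, Online Capable=1 */
};

static enum gicc_cpu_state classify_gicc_flags(uint32_t flags)
{
	int enabled = !!(flags & GICC_FLAG_ENABLED);
	int online_capable = !!(flags & GICC_FLAG_ONLINE_CAPABLE);

	if (enabled)
		return online_capable ? GICC_CPU_INVALID : GICC_CPU_PRESENT;

	return online_capable ? GICC_CPU_HOTPLUGGABLE : GICC_CPU_NOT_POSSIBLE;
}
```

Under this reading, only the GICC_CPU_PRESENT case would be marked present at boot, and GICC_CPU_HOTPLUGGABLE entries would stay possible but not present until they are actually enabled.
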
>>
>> On x86, it seems that all CPUs with the ACPI_MADT_ENABLED bit set will
>> be marked as present.
>>
>> acpi_parse_x2apic()
>> -> enabled = processor->lapic_flags & ACPI_MADT_ENABLED
>> -> topology_register_apic(enabled)
>> -> topo_register_apic(enabled)
>> -> set_cpu_present(cpu, true)
>
> Yes but arm64 marks all CPUs present even if !ACPI_MADT_ENABLED as we
> don't have the notion of hardware CPU hotplug.
>
> I need to dig some more into the original vCPU hotplug support and why
> we ended up with all CPUs marked as present even if not calling
> register_cpu():
>
> https://lore.kernel.org/linux-arm-kernel/20240529133446.28446-1-Jonathan.Cameron@huawei.com/
>
> What's the MADT GICC provided by qemu with "-smp cpus=4,maxcpus=8"? If
> it says Enabled for the first 4 and Online Capable for the rest, maybe
> we can try something like below:
>
> ----------------------8<-----------------
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index 5891f92c2035..681aa2bbc399 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
> return *pcpu;
> }
>
> + set_cpu_present(*pcpu, true);
> return 0;
> }
> EXPORT_SYMBOL(acpi_map_cpu);
>
> int acpi_unmap_cpu(int cpu)
> {
> + set_cpu_present(cpu, false);
> return 0;
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..6421027669fc 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
> }
> EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
>
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
> +}
> +
> /*
> * acpi_map_gic_cpu_interface - parse processor MADT entry
> *
> @@ -670,6 +675,11 @@ static void __init acpi_parse_and_init_cpus(void)
> early_map_cpu_to_node(i, acpi_numa_get_nid(i));
> }
> #else
> +static bool acpi_cpu_is_present(int cpu)
> +{
> + return false;
> +}
> +
> #define acpi_parse_and_init_cpus(...) do { } while (0)
> #endif
>
> @@ -808,7 +818,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> if (err)
> continue;
>
> - set_cpu_present(cpu, true);
> + if (acpi_disabled || acpi_cpu_is_present(cpu))
> + set_cpu_present(cpu, true);
Hi, Catalin
It also passes the test in my local QEMU-KVM environment, where the ACPI
issue occurs, and this looks like the cleanest fix.
Best regards,
Jinjie
> numa_store_cpu_info(cpu);
> }
> }
>
Thread overview: 9+ messages
[not found] <20260417075534.3745793-1-ruanjinjie@huawei.com>
2026-04-18 11:55 ` [PATCH] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Catalin Marinas
2026-04-18 15:05 ` Catalin Marinas
2026-04-20 1:29 ` Jinjie Ruan
2026-04-23 12:46 ` Jinjie Ruan
2026-04-23 10:08 ` Thomas Gleixner
2026-04-23 12:32 ` Jinjie Ruan
2026-04-23 20:11 ` Catalin Marinas
2026-04-24 1:56 ` Jinjie Ruan
2026-04-24 2:47 ` Jinjie Ruan