The Linux Kernel Mailing List
* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
       [not found] <20260427023507.1247418-1-ruanjinjie@huawei.com>
@ 2026-05-11  3:17 ` Jinjie Ruan
  2026-05-11  9:55   ` Catalin Marinas
  2026-05-11 17:37 ` Catalin Marinas
  1 sibling, 1 reply; 3+ messages in thread
From: Jinjie Ruan @ 2026-05-11  3:17 UTC (permalink / raw)
  To: catalin.marinas, will, punit.agrawal, rafael.j.wysocki,
	fengchengwen, chenl311, suzuki.poulose, maz, timothy.hayes,
	lpieralisi, mrigendra.chaubey, arnd, sudeep.holla, yangyicong,
	jic23, pierre.gondois, linux-arm-kernel, linux-kernel,
	james.morse



On 4/27/2026 10:35 AM, Jinjie Ruan wrote:
> On arm64, when booting with `maxcpus` greater than the number of present
> CPUs (e.g., QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present'
> but have not yet been registered via register_cpu(). Consequently,
> the per-cpu device objects for these CPUs are not yet initialized.
> 
> In cpuhp_smt_enable(), the code iterates over all present CPUs. Calling
> _cpu_up() for these unregistered CPUs eventually leads to
> sysfs_create_group() being called with a NULL kobject (or a kobject
> without a directory), triggering the following warning in
> fs/sysfs/group.c:
> 
> 	if (WARN_ON(!kobj || (!update && !kobj->sd)))
> 		return -EINVAL;
> 
> When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
> enumerated CPUs as "present" regardless of their status in the MADT. This
> causes issues with SMT hotplug control. For instance, with QEMU's
> "-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
> follows: the first four CPUs are marked Enabled while the remaining four
> are marked Online Capable to support potential hot-plugging.
> 
> Fix this by:
> 
> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
>    entry before calling set_cpu_present() during SMP initialization.
> 
> 2. Properly managing the present mask in acpi_map_cpu() and
>    acpi_unmap_cpu() to support actual CPU hotplug events. This aligns with
>    other architectures like x86 and LoongArch.
> 
> This ensures that only physically available or explicitly enabled CPUs
> are in the present mask, keeping the SMT control logic consistent with
> the actual hardware state.
> 
> How to reproduce:
> 
> 	1. echo off > /sys/devices/system/cpu/smt/control
> 		psci: CPU1 killed (polled 0 ms)
> 		psci: CPU3 killed (polled 0 ms)
> 
> 	2. echo 2 > /sys/devices/system/cpu/smt/control
> 
> 	Detected PIPT I-cache on CPU1
> 	GICv3: CPU1: found redistributor 1 region 0:0x00000000080c0000
> 	CPU1: Booted secondary processor 0x0000000001 [0x410fd082]
> 	Detected PIPT I-cache on CPU3
> 	GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
> 	CPU3: Booted secondary processor 0x0000000003 [0x410fd082]
> 	------------[ cut here ]------------
> 	WARNING: fs/sysfs/group.c:137 at internal_create_group+0x41c/0x4bc, CPU#2: sh/181
> 	Modules linked in:
> 	CPU: 2 UID: 0 PID: 181 Comm: sh Not tainted 7.0.0-rc1-00010-g8d13386c7624 #142 PREEMPT
> 	Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> 	pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> 	pc : internal_create_group+0x41c/0x4bc
> 	lr : sysfs_create_group+0x18/0x24
> 	sp : ffff80008078ba40
> 	x29: ffff80008078ba40 x28: ffff296c980ad000 x27: ffff00007fb94128
> 	x26: 0000000000000054 x25: ffffd693e845f3f0 x24: 0000000000000001
> 	x23: 0000000000000001 x22: 0000000000000004 x21: 0000000000000000
> 	x20: ffffd693e845fc10 x19: 0000000000000004 x18: 00000000ffffffff
> 	x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> 	x14: 0000000000000358 x13: 0000000000000007 x12: 0000000000000350
> 	x11: 0000000000000008 x10: 0000000000000407 x9 : 0000000000000400
> 	x8 : ffff00007fbf3b60 x7 : 0000000000000000 x6 : ffffd693e845f3f0
> 	x5 : ffff00007fb94128 x4 : 0000000000000000 x3 : ffff000000f4eac0
> 	x2 : ffffd693e7095a08 x1 : 0000000000000000 x0 : 0000000000000000
> 	Call trace:
> 	 internal_create_group+0x41c/0x4bc (P)
> 	 sysfs_create_group+0x18/0x24
> 	 topology_add_dev+0x1c/0x28
> 	 cpuhp_invoke_callback+0x104/0x20c
> 	 __cpuhp_invoke_callback_range+0x94/0x11c
> 	 _cpu_up+0x200/0x37c
> 	 cpuhp_smt_enable+0xbc/0x114
> 	 control_store+0xe8/0x1d4
> 	 dev_attr_store+0x18/0x2c
> 	 sysfs_kf_write+0x7c/0x94
> 	 kernfs_fop_write_iter+0x128/0x1b8
> 	 vfs_write+0x2b0/0x354
> 	 ksys_write+0x68/0xfc
> 	 __arm64_sys_write+0x1c/0x28
> 	 invoke_syscall+0x48/0x10c
> 	 el0_svc_common.constprop.0+0x40/0xe8
> 	 do_el0_svc+0x20/0x2c
> 	 el0_svc+0x34/0x124
> 	 el0t_64_sync_handler+0xa0/0xe4
> 	 el0t_64_sync+0x198/0x19c
> 	---[ end trace 0000000000000000 ]---
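
The failure path described above can be sketched in user space. This is a
minimal model, not kernel code: the model_* names, the fixed NR_CPUS of 8 and
the stand-in structs are all assumptions made for illustration. It only shows
why a CPU that is present but has no registered device trips the quoted
check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8	/* assumption: matches the maxcpus=8 example */

/* Hypothetical stand-ins for struct kobject / struct device. */
struct model_kobject { void *sd; };	/* sd: sysfs directory, NULL until added */
struct model_device { struct model_kobject kobj; };

static struct model_device dev_storage[NR_CPUS];
static struct model_device *cpu_devs[NR_CPUS];	/* NULL = never registered */
static bool cpu_present[NR_CPUS];
static int dummy_dir;

/* Same condition as the quoted check: NULL kobject, or no directory yet. */
static int model_create_group(struct model_kobject *kobj, bool update)
{
	if (!kobj || (!update && !kobj->sd))
		return -EINVAL;
	return 0;
}

/* Register the first nr_registered CPUs, mark the first nr_present present. */
static void model_setup(int nr_registered, int nr_present)
{
	assert(nr_registered <= NR_CPUS && nr_present <= NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_present[cpu] = cpu < nr_present;
		if (cpu < nr_registered) {
			dev_storage[cpu].kobj.sd = &dummy_dir;
			cpu_devs[cpu] = &dev_storage[cpu];
		} else {
			cpu_devs[cpu] = NULL;
		}
	}
}

/* Walk every present CPU and count how many would hit the warning. */
static int model_smt_enable(void)
{
	int warnings = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct model_kobject *kobj;

		if (!cpu_present[cpu])
			continue;
		kobj = cpu_devs[cpu] ? &cpu_devs[cpu]->kobj : NULL;
		if (model_create_group(kobj, false) == -EINVAL)
			warnings++;
	}
	return warnings;
}
```

In this model, model_setup(4, 8) followed by model_smt_enable() counts one
failure for each present-but-unregistered CPU, while model_setup(4, 4) counts
none.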

Hi, just a gentle ping on this v2 patch. It’s been about two weeks, and
I was wondering if there are any further comments or if there's anything
else I should address? Thanks!

> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jonathan Cameron <jic23@kernel.org>
> Cc: James Morse <james.morse@arm.com>
> Cc: Yicong Yang <yangyicong@hisilicon.com>
> Cc: stable@vger.kernel.org
> Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> Fixes: eed4583bcf9a6 ("arm64: Kconfig: Enable HOTPLUG_SMT")
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
>  arch/arm64/kernel/acpi.c |  2 ++
>  arch/arm64/kernel/smp.c  | 12 +++++++++++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index 5891f92c2035..681aa2bbc399 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
>  		return *pcpu;
>  	}
>  
> +	set_cpu_present(*pcpu, true);
>  	return 0;
>  }
>  EXPORT_SYMBOL(acpi_map_cpu);
>  
>  int acpi_unmap_cpu(int cpu)
>  {
> +	set_cpu_present(cpu, false);
>  	return 0;
>  }
>  EXPORT_SYMBOL(acpi_unmap_cpu);
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..5932e5b30b71 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
>  }
>  EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
>  
> +static bool acpi_cpu_is_present(int cpu)
> +{
> +	return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
> +}
> +
>  /*
>   * acpi_map_gic_cpu_interface - parse processor MADT entry
>   *
> @@ -670,6 +675,10 @@ static void __init acpi_parse_and_init_cpus(void)
>  		early_map_cpu_to_node(i, acpi_numa_get_nid(i));
>  }
>  #else
> +static bool acpi_cpu_is_present(int cpu)
> +{
> +	return false;
> +}
>  #define acpi_parse_and_init_cpus(...)	do { } while (0)
>  #endif
>  
> @@ -808,7 +817,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  		if (err)
>  			continue;
>  
> -		set_cpu_present(cpu, true);
> +		if (acpi_disabled || acpi_cpu_is_present(cpu))
> +			set_cpu_present(cpu, true);
>  		numa_store_cpu_info(cpu);
>  	}
>  }
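
The effect of the patch on the present mask can be sketched as a small
user-space model. It is an illustration under stated assumptions rather than
the kernel implementation: the model_* helpers and the MODEL_GICC_* names are
hypothetical, and the bit positions (Enabled at bit 0, Online Capable at
bit 3) follow my reading of the GICC flags in the ACPI 6.5 link above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* assumption: matches "-smp 4,maxcpus=8" */

/* Assumed GICC flag bits; names are local to this sketch. */
#define MODEL_GICC_ENABLED		(1u << 0)
#define MODEL_GICC_ONLINE_CAPABLE	(1u << 3)

static uint32_t gicc_flags[NR_CPUS];
static bool cpu_present[NR_CPUS];

/* First nr_enabled CPUs are Enabled, the rest Online Capable. */
static void model_load_madt(int nr_enabled)
{
	assert(nr_enabled >= 0 && nr_enabled <= NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		gicc_flags[cpu] = cpu < nr_enabled ? MODEL_GICC_ENABLED
						   : MODEL_GICC_ONLINE_CAPABLE;
		cpu_present[cpu] = false;
	}
}

/* Counterpart of acpi_cpu_is_present(): only the Enabled bit counts. */
static bool model_cpu_is_present(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return gicc_flags[cpu] & MODEL_GICC_ENABLED;
}

/* Boot-time pass using the condition added to smp_prepare_cpus(). */
static void model_prepare_cpus(bool acpi_disabled)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (acpi_disabled || model_cpu_is_present(cpu))
			cpu_present[cpu] = true;
}

/* Hotplug paths: acpi_map_cpu() sets the bit, acpi_unmap_cpu() clears it. */
static void model_map_cpu(int cpu)   { cpu_present[cpu] = true; }
static void model_unmap_cpu(int cpu) { cpu_present[cpu] = false; }

static int model_count_present(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		n += cpu_present[cpu];
	return n;
}
```

With model_load_madt(4) and model_prepare_cpus(false), only the four Enabled
CPUs end up present; a later model_map_cpu(4) adds the hot-added CPU and
model_unmap_cpu(4) removes it again. Passing true for acpi_disabled keeps the
pre-patch behaviour of marking every enumerated CPU present.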



* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
  2026-05-11  3:17 ` [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Jinjie Ruan
@ 2026-05-11  9:55   ` Catalin Marinas
  0 siblings, 0 replies; 3+ messages in thread
From: Catalin Marinas @ 2026-05-11  9:55 UTC (permalink / raw)
  To: Jinjie Ruan
  Cc: will, punit.agrawal, rafael.j.wysocki, fengchengwen, chenl311,
	suzuki.poulose, maz, timothy.hayes, lpieralisi, mrigendra.chaubey,
	arnd, sudeep.holla, yangyicong, jic23, pierre.gondois,
	linux-arm-kernel, linux-kernel, james.morse

On Mon, May 11, 2026 at 11:17:43AM +0800, Jinjie Ruan wrote:
> On 4/27/2026 10:35 AM, Jinjie Ruan wrote:
> > [...]
> 
> Hi, just a gentle ping on this v2 patch. It’s been about two weeks, and
> I was wondering if there are any further comments or if there's anything
> else I should address? Thanks!

No comments from me but I'd like James Morse to review, in case he
recalls the decisions made at the time. I think Jonathan already said
that he doesn't fully remember.

Given that it's just a warning rather than some kernel panic, I'm happy
to wait until the upcoming merge window (for 7.2). In the meantime:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
       [not found] <20260427023507.1247418-1-ruanjinjie@huawei.com>
  2026-05-11  3:17 ` [PATCH v2] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable() Jinjie Ruan
@ 2026-05-11 17:37 ` Catalin Marinas
  1 sibling, 0 replies; 3+ messages in thread
From: Catalin Marinas @ 2026-05-11 17:37 UTC (permalink / raw)
  To: Jinjie Ruan
  Cc: will, punit.agrawal, rafael.j.wysocki, fengchengwen, chenl311,
	suzuki.poulose, maz, timothy.hayes, lpieralisi, mrigendra.chaubey,
	arnd, sudeep.holla, yangyicong, jic23, pierre.gondois,
	linux-arm-kernel, linux-kernel, james.morse

On Mon, Apr 27, 2026 at 10:35:07AM +0800, Jinjie Ruan wrote:
> On arm64, when booting with `maxcpus` greater than the number of present
> CPUs (e.g., QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present'
> but have not yet been registered via register_cpu(). Consequently,
> the per-cpu device objects for these CPUs are not yet initialized.
[...]
> Fix this by:
> 
> 1. When booting with ACPI, checking the ACPI_MADT_ENABLED flag in the GICC
>    entry before calling set_cpu_present() during SMP initialization.
> 
> 2. Properly managing the present mask in acpi_map_cpu() and
>    acpi_unmap_cpu() to support actual CPU hotplug events. This aligns with
>    other architectures like x86 and LoongArch.

I had a chat with James earlier and IIUC the decision was to mark all
CPUs present and the GIC must be fully initialised. But digging through
the GICv3 code, I don't see it depending on cpu_present_mask but rather
on the "always on" MADT GICR description. So I think it should be safe
as long as we don't rely on the GICC gicr_base_address. But we should
update Documentation/arch/arm64/cpu-hotplug.rst to no longer state that
all online-capable vCPUs are marked as present by the kernel.

(or maybe I misunderstood all this)

-- 
Catalin
