* [PATCH 6.1 v2 00/10] fix invalid sleeping in detect_cache_attributes()
@ 2025-10-20 17:36 Wen Yang
From: Wen Yang @ 2025-10-20 17:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jon Hunter; +Cc: stable, linux-kernel, Wen Yang

Commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
in the CPU hotplug path") adds a call to detect_cache_attributes() to
populate the cacheinfo before updating the siblings mask.
detect_cache_attributes() allocates memory and can take the PPTT mutex
(on ACPI platforms). On PREEMPT_RT kernels, on secondary CPUs, this
triggers a
  'BUG: sleeping function called from invalid context'
because the code is executed with preemption and interrupts disabled:

 | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
 | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
 | preempt_count: 1, expected: 0
 | RCU nest depth: 1, expected: 1
 | 3 locks held by swapper/111/0:
 |  #0:  (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
 |  #1:  (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
 |  #2:  (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
 | irq event stamp: 0
 | hardirqs last  enabled at (0):  0x0
 | hardirqs last disabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last  enabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last disabled at (0):  0x0
 | Preemption disabled at:
 |  migrate_enable+0x30/0x130
 | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-rt6-[...]
 | Call trace:
 |  __kmalloc+0xbc/0x1e8
 |  detect_cache_attributes+0x2d4/0x5f0
 |  update_siblings_masks+0x30/0x368
 |  store_cpu_topology+0x78/0xb8
 |  secondary_start_kernel+0xd0/0x198
 |  __secondary_switched+0xb0/0xb4
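
The failure mode in the trace above can be modelled in plain userspace C.
This is only a sketch under stated assumptions, not the kernel
implementation: every name in it (cpu_caches, sleeping_alloc(),
prealloc_all(), populate_on_secondary(), in_atomic_ctx) is hypothetical,
and calloc() stands in for a kernel allocation that may sleep. It
illustrates the general idea suggested by the patch title "arch_topology:
Build cacheinfo from primary CPU": allocate the per-CPU storage once from
a context that may sleep, so that the secondary CPU bring-up path only
fills in memory that already exists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for per-CPU cacheinfo storage. */
struct cache_leaf { unsigned int level; };
struct cpu_cache { struct cache_leaf *leaves; unsigned int num_leaves; };

static struct cpu_cache cpu_caches[NR_CPUS];
static bool in_atomic_ctx; /* models "preemption and interrupts disabled" */

/* Models a sleeping allocator: it must not be used in atomic context. */
void *sleeping_alloc(size_t n, size_t size)
{
    if (in_atomic_ctx)
        return NULL; /* a PREEMPT_RT kernel would report the BUG here */
    return calloc(n, size);
}

/* Primary CPU, sleepable context: allocate storage for every CPU up front. */
int prealloc_all(unsigned int num_leaves)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu_caches[cpu].leaves)
            continue; /* already allocated */
        cpu_caches[cpu].leaves = sleeping_alloc(num_leaves,
                                                sizeof(struct cache_leaf));
        if (!cpu_caches[cpu].leaves)
            return -1;
        cpu_caches[cpu].num_leaves = num_leaves;
    }
    return 0;
}

/* Secondary CPU bring-up, atomic context: fill preallocated storage only. */
int populate_on_secondary(unsigned int cpu)
{
    int ret = -1;

    in_atomic_ctx = true;
    if (cpu < NR_CPUS && cpu_caches[cpu].leaves) {
        for (unsigned int i = 0; i < cpu_caches[cpu].num_leaves; i++)
            cpu_caches[cpu].leaves[i].level = i + 1;
        ret = 0;
    }
    in_atomic_ctx = false;
    return ret;
}
```

In this model, calling populate_on_secondary() before prealloc_all()
fails because no storage exists and none may be allocated in atomic
context; after prealloc_all() has run, the same call succeeds without
allocating anything.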


Pierre fixed this issue upstream in 6.3; the original series is as follows:
https://lore.kernel.org/all/167404285593.885445.6219705651301997538.b4-ty@arm.com/

We encountered the same issue on the 6.1 stable branch and need to backport this series:
cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
cacheinfo: Return error code in init_of_cache_level()
cacheinfo: Check 'cache-unified' property to count cache leaves
ACPI: PPTT: Remove acpi_find_cache_levels()
ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
arch_topology: Build cacheinfo from primary CPU

There were also a non-trivial number of follow-on fixes for patches in this
series, as pointed out by Greg in the 6.1.156-RC1 review:
cacheinfo: Initialize variables in fetch_cache_info()
cacheinfo: Fix LLC is not exported through sysfs
drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug

Finally, Jon discovered an issue on the Tegra platform caused by these patches:
https://lore.kernel.org/all/046f08cb-0610-48c9-af24-4804367df177@nvidia.com/
So we also need to backport the following patch:
arm64: tegra: Update cache properties

K Prateek Nayak (1):
  drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug

Pierre Gondois (8):
  cacheinfo: Use RISC-V's init_cache_level() as generic OF
    implementation
  cacheinfo: Return error code in init_of_cache_level()
  cacheinfo: Check 'cache-unified' property to count cache leaves
  ACPI: PPTT: Remove acpi_find_cache_levels()
  ACPI: PPTT: Update acpi_find_last_cache_level() to
    acpi_get_cache_info()
  arch_topology: Build cacheinfo from primary CPU
  cacheinfo: Initialize variables in fetch_cache_info()
  arm64: tegra: Update cache properties

Yicong Yang (1):
  cacheinfo: Fix LLC is not exported through sysfs

 arch/arm64/boot/dts/nvidia/tegra194.dtsi |  15 +++
 arch/arm64/boot/dts/nvidia/tegra210.dtsi |   1 +
 arch/arm64/boot/dts/nvidia/tegra234.dtsi |  33 +++++
 arch/arm64/kernel/cacheinfo.c            |  11 +-
 arch/riscv/kernel/cacheinfo.c            |  42 ------
 drivers/acpi/pptt.c                      |  93 ++++++++------
 drivers/base/arch_topology.c             |  12 +-
 drivers/base/cacheinfo.c                 | 156 +++++++++++++++++++----
 include/linux/cacheinfo.h                |  11 +-
 9 files changed, 262 insertions(+), 112 deletions(-)

-- 
2.25.1

