Re: [PATCH 6.1 0/6] fix invalid sleeping in detect_cache_attributes()

public inbox for linux-kernel@vger.kernel.org

From: Wen Yang <wen.yang@linux.dev>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6.1 0/6] fix invalid sleeping in detect_cache_attributes()
Date: Thu, 16 Oct 2025 01:23:29 +0800
Message-ID: <f2bac92d-53ea-4db3-a96b-460eb64d7863@linux.dev>
In-Reply-To: <2025101509-bucktooth-reawake-5176@gregkh>



On 10/15/25 16:43, Greg Kroah-Hartman wrote:
> On Wed, Oct 01, 2025 at 01:27:25AM +0800, Wen Yang wrote:
>> commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
>> in the CPU hotplug path")
>> adds a call to detect_cache_attributes() to populate the cacheinfo
>> before updating the siblings mask. detect_cache_attributes() allocates
>> memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
>> kernels, on secondary CPUs, this triggers a:
>>    'BUG: sleeping function called from invalid context'
>> as the code is executed with preemption and interrupts disabled:
>>
>>   | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
>>   | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
>>   | preempt_count: 1, expected: 0
>>   | RCU nest depth: 1, expected: 1
>>   | 3 locks held by swapper/111/0:
>>   |  #0:  (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
>>   |  #1:  (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
>>   |  #2:  (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
>>   | irq event stamp: 0
>>   | hardirqs last  enabled at (0):  0x0
>>   | hardirqs last disabled at (0):  copy_process+0x5dc/0x1ab8
>>   | softirqs last  enabled at (0):  copy_process+0x5dc/0x1ab8
>>   | softirqs last disabled at (0):  0x0
>>   | Preemption disabled at:
>>   |  migrate_enable+0x30/0x130
>>   | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-rt6-[...]
>>   | Call trace:
>>   |  __kmalloc+0xbc/0x1e8
>>   |  detect_cache_attributes+0x2d4/0x5f0
>>   |  update_siblings_masks+0x30/0x368
>>   |  store_cpu_topology+0x78/0xb8
>>   |  secondary_start_kernel+0xd0/0x198
>>   |  __secondary_switched+0xb0/0xb4
>>
>>
>> Pierre fixed this issue upstream in 6.3; the original series is as follows:
>> https://lore.kernel.org/all/167404285593.885445.6219705651301997538.b4-ty@arm.com/
>>
>> We also encountered the same issue on the 6.1 stable branch and need to backport this series.
>>
>> Pierre Gondois (6):
>>    cacheinfo: Use RISC-V's init_cache_level() as generic OF
>>      implementation
>>    cacheinfo: Return error code in init_of_cache_level()
>>    cacheinfo: Check 'cache-unified' property to count cache leaves
>>    ACPI: PPTT: Remove acpi_find_cache_levels()
>>    ACPI: PPTT: Update acpi_find_last_cache_level() to
>>      acpi_get_cache_info()
>>    arch_topology: Build cacheinfo from primary CPU
> 
> This series seems to have broken existing systems, as reported here:
> 	https://lore.kernel.org/r/046f08cb-0610-48c9-af24-4804367df177@nvidia.com
> 
> so I'm going to drop it from the queue at this point in time.  Please
> work to resolve this before resubmitting it.
> 

Hi Jon,

Thank you for testing. The root cause here is that this series has
exposed previously hidden bugs (such as those in
arch/arm64/boot/dts/nvidia/tegra194.dtsi).

We may need to further backport the following patches:

commit 27f1568b1d5fe35014074f92717b250afbe67031
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Mon Nov 7 16:57:08 2022 +0100

     arm64: tegra: Update cache properties

     The DeviceTree Specification v0.3 specifies that the cache node
     'compatible' and 'cache-level' properties are 'required'. Cf.
     s3.8 Multi-level and Shared Cache Nodes
     The 'cache-unified' property should be present if one of the
     properties for unified cache is present ('cache-size', ...).


But I don't have a Tegra device right now. Could you please apply the 
patch above and verify it again?
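
To illustrate why a cache node without 'compatible' or 'cache-level' may
start to matter once init_of_cache_level() returns an error code, here is
a toy userspace model of the checks described in the commit message
above. It is only a sketch: the struct and every model_* name are
hypothetical, it is not the kernel implementation, and the fallback of
counting two leaves for a node that carries neither a size property nor
'cache-unified' is an assumption of this model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DT cache node; NOT a kernel structure. */
struct model_cache_node {
	bool compatible_cache;	/* node has compatible = "cache" */
	int cache_level;	/* value of 'cache-level', 0 if absent */
	bool cache_unified;	/* node has 'cache-unified' */
	bool cache_size;	/* node has 'cache-size' */
	bool i_cache_size;	/* node has 'i-cache-size' */
	bool d_cache_size;	/* node has 'd-cache-size' */
};

/* Count the cache leaves contributed by a single node. */
static unsigned int model_count_leaves(const struct model_cache_node *n)
{
	unsigned int leaves = 0;

	if (n->cache_size)
		leaves++;
	if (n->i_cache_size)
		leaves++;
	if (n->d_cache_size)
		leaves++;

	/* No size property: assume unified -> 1 leaf, split I/D -> 2 leaves. */
	if (!leaves)
		return n->cache_unified ? 1 : 2;

	return leaves;
}

/*
 * Walk the nodes from the lowest level upwards. Returns 0 and fills
 * *levels / *leaves on success, or -EINVAL if a node lacks a required
 * property or the levels are not strictly increasing.
 */
static int model_init_cache_level(const struct model_cache_node *nodes,
				  size_t count, unsigned int *levels,
				  unsigned int *leaves)
{
	unsigned int lv = 0, lf = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct model_cache_node *n = &nodes[i];

		if (!n->compatible_cache)
			return -EINVAL;
		if (n->cache_level <= 0)
			return -EINVAL;
		if ((unsigned int)n->cache_level <= lv)
			return -EINVAL;

		lf += model_count_leaves(n);
		lv = (unsigned int)n->cache_level;
	}

	*levels = lv;
	*leaves = lf;
	return 0;
}
```

In this model, a chain of nodes that all carry compatible = "cache" and
an increasing 'cache-level' is accepted, while a single node missing
either property makes the whole walk fail with -EINVAL. The CPU node's
own level-1 caches are left out to keep the sketch short.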

--
Best wishes,
Wen
