From: Dev Jain <dev.jain@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Guenter Roeck <linux@roeck-us.net>,
	Yang Shi <yang@os.amperecomputing.com>
Cc: catalin.marinas@arm.com, will@kernel.org,
	akpm@linux-foundation.org, david@redhat.com,
	lorenzo.stoakes@oracle.com, ardb@kernel.org,
	scott@os.amperecomputing.com, cl@gentwo.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, nd@arm.com
Subject: Re: [PATCH v8 3/5] arm64: mm: support large block mapping when rodata=full
Date: Mon, 3 Nov 2025 11:23:38 +0530
Message-ID: <4bc562ea-2fba-4484-9548-c606e254bc00@arm.com>
In-Reply-To: <bee6b93d-aa2e-4335-9801-89f02eb3eccc@arm.com>

>>>>
>>> With lock debugging enabled, we see a large number of "BUG: sleeping
>>> function called from invalid context at kernel/locking/mutex.c:580"
>>> and "BUG: Invalid wait context:" backtraces when running v6.18-rc3.
>>> Please see example below.
>>>
>>> Bisect points to this patch.
>>>
>>> Please let me know if there is anything I can do to help tracking
>>> down the problem.
>> Thanks for the report - ouch!
>>
>> I expect you're running on a system that supports BBML2_NOABORT, based on the
>> stack trace, I expect you have CONFIG_DEBUG_PAGEALLOC enabled? That will cause
>> permission tricks to be played on the linear map at page allocation and free
>> time, which can happen in non-sleepable contexts. And with this patch we are
>> taking pgtable_split_lock (a mutex) in split_kernel_leaf_mapping(), which is
>> called as a result of the permission change request.
>>
>> However, when CONFIG_DEBUG_PAGEALLOC is enabled we always force-map the linear map
>> by PTE so split_kernel_leaf_mapping() is actually unnecessary and will return
>> without actually having to split anything. So we could add an early "if
>> (force_pte_mapping()) return 0;" to bypass the function entirely in this case,
>> and I *think* that should solve it.
>>
>> But I'm also concerned about KFENCE. I can't remember its exact semantics off
>> the top of my head, so I'm concerned we could see similar problems there (where
>> we only force pte mapping for the KFENCE pool).
>>
>> I'll investigate fully tomorrow and hopefully provide a fix.
> Here's a proposed fix, although I can't get access to a system with BBML2 until
> tomorrow at the earliest. Guenter, I wonder if you could check that this
> resolves your issue?
>
> ---8<---
> commit 602ec2db74e5abfb058bd03934475ead8558eb72
> Author: Ryan Roberts <ryan.roberts@arm.com>
> Date:   Sun Nov 2 11:45:18 2025 +0000
>
>      arm64: mm: Don't attempt to split known pte-mapped regions
>      
>      It has been reported that split_kernel_leaf_mapping() is trying to sleep
>      in non-sleepable context. It does this when acquiring the
>      pgtable_split_lock mutex, when either CONFIG_DEBUG_ALLOC or
>      CONFIG_KFENCE are enabled, which change linear map permissions within
>      softirq context during memory allocation and/or freeing.
>      
>      But it turns out that the memory for which these features may attempt to
>      modify the permissions is always mapped by pte, so there is no need to
>      attempt to split the mapping. So let's exit early in these cases and
>      avoid attempting to take the mutex.
>      
>      Closes: https://lore.kernel.org/all/f24b9032-0ec9-47b1-8b95-c0eeac7a31c5@roeck-us.net/
>      Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full")
>      Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index b8d37eb037fc..6e26f070bb49 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -708,6 +708,16 @@ static int split_kernel_leaf_mapping_locked(unsigned long addr)
>   	return ret;
>   }
>   
> +static inline bool force_pte_mapping(void)
> +{
> +	bool bbml2 = system_capabilities_finalized() ?
> +		system_supports_bbml2_noabort() : cpu_supports_bbml2_noabort();
> +
> +	return (!bbml2 && (rodata_full || arm64_kfence_can_set_direct_map() ||
> +			   is_realm_world())) ||
> +		debug_pagealloc_enabled();
> +}
> +
>   static DEFINE_MUTEX(pgtable_split_lock);
>   
>   int split_kernel_leaf_mapping(unsigned long start, unsigned long end)
> @@ -723,6 +733,16 @@ int split_kernel_leaf_mapping(unsigned long start, unsigned long end)
>   	if (!system_supports_bbml2_noabort())
>   		return 0;
>   
> +	/*
> +	 * If the region is within a pte-mapped area, there is no need to try to
> +	 * split. Additionally, CONFIG_DEBUG_ALLOC and CONFIG_KFENCE may change

Nit: s/CONFIG_DEBUG_ALLOC/CONFIG_DEBUG_PAGEALLOC/. The commit message has the
same typo.

> +	 * permissions from softirq context so for those cases (which are always
> +	 * pte-mapped), we must not go any further because taking the mutex
> +	 * below may sleep.
> +	 */
> +	if (force_pte_mapping() || is_kfence_address((void *)start))
> +		return 0;
> +
>   	/*
>   	 * Ensure start and end are at least page-aligned since this is the
>   	 * finest granularity we can split to.
> @@ -1009,16 +1029,6 @@ static inline void arm64_kfence_map_pool(phys_addr_t kfence_pool, pgd_t *pgdp) {
>   
>   #endif /* CONFIG_KFENCE */
>   
> -static inline bool force_pte_mapping(void)
> -{
> -	bool bbml2 = system_capabilities_finalized() ?
> -		system_supports_bbml2_noabort() : cpu_supports_bbml2_noabort();
> -
> -	return (!bbml2 && (rodata_full || arm64_kfence_can_set_direct_map() ||
> -			   is_realm_world())) ||
> -		debug_pagealloc_enabled();
> -}
> -

Otherwise LGTM.

Reviewed-by: Dev Jain <dev.jain@arm.com>

>   static void __init map_mem(pgd_t *pgdp)
>   {
>   	static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
> ---8<---
>
> Thanks,
> Ryan
>
>> Yang Shi, Do you have any additional thoughts?
>>
>> Thanks,
>> Ryan
>>
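
For anyone following the thread, here is a minimal user-space sketch of the
control flow discussed above. It is not kernel code: the struct, the model_*
names, the with_fix switch and the violation counter are all hypothetical
stand-ins, and the only part taken from the quoted diff is the boolean
expression in force_pte_mapping() and the position of the early return. It
assumes that taking the mutex from a non-sleepable caller is the only
problem being modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; these are not the real kernel APIs. */
struct model_state {
	bool bbml2_noabort;      /* models system_supports_bbml2_noabort() */
	bool rodata_full;        /* models the rodata_full flag */
	bool kfence_direct_map;  /* models arm64_kfence_can_set_direct_map() */
	bool realm_world;        /* models is_realm_world() */
	bool debug_pagealloc;    /* models debug_pagealloc_enabled() */
	bool in_atomic;          /* caller must not sleep, e.g. softirq */
	int sleep_violations;    /* times a sleeping lock was taken while atomic */
};

/* Same boolean expression as force_pte_mapping() in the quoted diff. */
bool model_force_pte_mapping(const struct model_state *s)
{
	return (!s->bbml2_noabort &&
		(s->rodata_full || s->kfence_direct_map || s->realm_world)) ||
	       s->debug_pagealloc;
}

/* Models mutex_lock(&pgtable_split_lock): record misuse instead of sleeping. */
void model_mutex_lock(struct model_state *s)
{
	if (s->in_atomic)
		s->sleep_violations++;
}

/*
 * Models split_kernel_leaf_mapping(). with_fix selects whether the early
 * exit from the proposed patch is present; in_kfence_pool stands in for
 * is_kfence_address((void *)start).
 */
int model_split(struct model_state *s, bool in_kfence_pool, bool with_fix)
{
	assert(s != NULL);

	if (!s->bbml2_noabort)
		return 0;

	if (with_fix && (model_force_pte_mapping(s) || in_kfence_pool))
		return 0;	/* already pte-mapped: never reach the mutex */

	model_mutex_lock(s);
	/* The real function would split the block mapping here. */
	return 0;
}
```

In this model, a caller with bbml2_noabort, debug_pagealloc and in_atomic all
set records one violation when with_fix is false and none when it is true. The
same holds for an address flagged as in_kfence_pool.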

