All of lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] arm64/mm: Ensure lazy_mmu_mode never nests
Date: Tue, 10 Jun 2025 13:00:00 +0100	[thread overview]
Message-ID: <aEgeQCCzRt-B8_nW@arm.com> (raw)
In-Reply-To: <20250606135654.178300-1-ryan.roberts@arm.com>

On Fri, Jun 06, 2025 at 02:56:52PM +0100, Ryan Roberts wrote:
> Commit 1ef3095b1405 ("arm64/mm: Permit lazy_mmu_mode to be nested")
> provided a quick fix to ensure that lazy_mmu_mode continues to work when
> CONFIG_DEBUG_PAGEALLOC is enabled, which can cause lazy_mmu_mode to
> nest.
> 
> The solution in that patch is the make the implementation tolerant to

s/is the make/is to make/

> nesting; when the inner nest exits lazy_mmu_mode, we exit then the outer
> exit becomes a nop. But this sacrifices the optimization opportunity for
> the remainder of the outer user.
[...]
> I wonder if you might be willing to take this for v6.16? I think its a neater
> solution then my first attempt - Commit 1ef3095b1405 ("arm64/mm: Permit
> lazy_mmu_mode to be nested") - which is already in Linus's master.
> 
> To be clear, the current solution is safe, I just think this is much neater.

Maybe better, though I wouldn't say much neater. One concern I have is
about whether we'll get other such nesting in the future and we need to
fix them in generic code. Here we control __kernel_map_pages() but we
may not for other cases.

Is it the fault of the arch code that uses apply_to_page_range() via
__kernel_map_pages()? It feels like it shouldn't care about the lazy
mode as that's some detail of the apply_to_page_range() implementation.
Maybe this API should just allow nesting.

> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 88db8a0c0b37..9f387337ccc3 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -83,21 +83,11 @@ static inline void queue_pte_barriers(void)
>  #define  __HAVE_ARCH_ENTER_LAZY_MMU_MODE
>  static inline void arch_enter_lazy_mmu_mode(void)
>  {
> -	/*
> -	 * lazy_mmu_mode is not supposed to permit nesting. But in practice this
> -	 * does happen with CONFIG_DEBUG_PAGEALLOC, where a page allocation
> -	 * inside a lazy_mmu_mode section (such as zap_pte_range()) will change
> -	 * permissions on the linear map with apply_to_page_range(), which
> -	 * re-enters lazy_mmu_mode. So we tolerate nesting in our
> -	 * implementation. The first call to arch_leave_lazy_mmu_mode() will
> -	 * flush and clear the flag such that the remainder of the work in the
> -	 * outer nest behaves as if outside of lazy mmu mode. This is safe and
> -	 * keeps tracking simple.
> -	 */
> -
>  	if (in_interrupt())
>  		return;
> 
> +	VM_WARN_ON(test_thread_flag(TIF_LAZY_MMU));

This warning is good to have back.

> +
>  	set_thread_flag(TIF_LAZY_MMU);
>  }
> 
> @@ -119,6 +109,14 @@ static inline void arch_leave_lazy_mmu_mode(void)
>  	clear_thread_flag(TIF_LAZY_MMU);
>  }
> 
> +static inline bool arch_in_lazy_mmu_mode(void)
> +{
> +	if (in_interrupt())
> +		return false;
> +
> +	return test_thread_flag(TIF_LAZY_MMU);
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
> 
> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
> index 04d4a8f676db..4da7a847d5f3 100644
> --- a/arch/arm64/mm/pageattr.c
> +++ b/arch/arm64/mm/pageattr.c
> @@ -293,18 +293,29 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
>  }
> 
>  #ifdef CONFIG_DEBUG_PAGEALLOC
> -/*
> - * This is - apart from the return value - doing the same
> - * thing as the new set_direct_map_valid_noflush() function.
> - *
> - * Unify? Explain the conceptual differences?
> - */
>  void __kernel_map_pages(struct page *page, int numpages, int enable)
>  {
> +	bool lazy_mmu;
> +
>  	if (!can_set_direct_map())
>  		return;
> 
> +	/*
> +	 * This is called during page alloc or free, and maybe called while in
> +	 * lazy mmu mode. Since set_memory_valid() may also enter lazy mmu mode,
> +	 * this would cause nesting which is not supported; the inner call to
> +	 * exit the mode would exit, meaning that the outer lazy mmu mode is no
> +	 * longer benefiting from the optimization. So temporarily leave lazy
> +	 * mmu mode for the duration of the call.
> +	 */
> +	lazy_mmu = arch_in_lazy_mmu_mode();
> +	if (lazy_mmu)
> +		arch_leave_lazy_mmu_mode();
> +
>  	set_memory_valid((unsigned long)page_address(page), numpages, enable);
> +
> +	if (lazy_mmu)
> +		arch_enter_lazy_mmu_mode();
>  }
>  #endif /* CONFIG_DEBUG_PAGEALLOC */

So basically you are flattening the enter/leave_lazy_mmu_mode() regions.
Ideally this could have been done by the nesting
arch_enter_lazy_mmu_mode() automatically but that means this function
returning the current mode and arch_leave_lazy_mmu_mode() taking an
argument - more like the irq saving/restoring (even better renaming it
to arch_restore_lazy_mmu_mode()). I guess this won't go well with the mm
folk who don't seem willing to changes in this area.

FWIW, this patch is correct.

-- 
Catalin


  reply	other threads:[~2025-06-10 12:36 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-06 13:56 [PATCH v1] arm64/mm: Ensure lazy_mmu_mode never nests Ryan Roberts
2025-06-10 12:00 ` Catalin Marinas [this message]
2025-06-10 13:41   ` Ryan Roberts
2025-06-10 15:07     ` Catalin Marinas
2025-06-10 16:37       ` Ryan Roberts
2025-06-12 14:59         ` Will Deacon
2025-06-12 16:00           ` Ryan Roberts
2025-06-12 16:21             ` Will Deacon
2025-08-01 11:22             ` Kevin Brodsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEgeQCCzRt-B8_nW@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.