From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Alexandre Ghiti <alexghiti@rivosinc.com>,
	Kevin Brodsky <kevin.brodsky@arm.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel mappings
Date: Tue, 15 Apr 2025 11:51:45 +0100
Message-ID: <Z_46QUFXVI69zRZR@arm.com>
In-Reply-To: <aabc9fb1-4e74-409a-b25b-8e844e65c502@arm.com>

On Mon, Apr 14, 2025 at 07:28:46PM +0100, Ryan Roberts wrote:
> On 14/04/2025 18:38, Catalin Marinas wrote:
> > On Tue, Mar 04, 2025 at 03:04:41PM +0000, Ryan Roberts wrote:
> >> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> >> index 1898c3069c43..149df945c1ab 100644
> >> --- a/arch/arm64/include/asm/pgtable.h
> >> +++ b/arch/arm64/include/asm/pgtable.h
> >> @@ -40,6 +40,55 @@
> >>  #include <linux/sched.h>
> >>  #include <linux/page_table_check.h>
> >>  
> >> +static inline void emit_pte_barriers(void)
> >> +{
> >> +	/*
> >> +	 * These barriers are emitted under certain conditions after a pte entry
> >> +	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
> >> +	 * visible to the table walker. The isb ensures that any previous
> >> +	 * speculative "invalid translation" marker that is in the CPU's
> >> +	 * pipeline gets cleared, so that any access to that address after
> >> +	 * setting the pte to valid won't cause a spurious fault. If the thread
> >> +	 * gets preempted after storing to the pgtable but before emitting these
> >> +	 * barriers, __switch_to() emits a dsb which ensures the walker gets to
> >> +	 * see the store. There is no guarantee of an isb being issued though.
> >> +	 * This is safe because it will still get issued (albeit on a
> >> +	 * potentially different CPU) when the thread starts running again,
> >> +	 * before any access to the address.
> >> +	 */
> >> +	dsb(ishst);
> >> +	isb();
> >> +}
> >> +
> >> +static inline void queue_pte_barriers(void)
> >> +{
> >> +	if (test_thread_flag(TIF_LAZY_MMU))
> >> +		set_thread_flag(TIF_LAZY_MMU_PENDING);
> > 
> > As we can have lots of calls here, it might be slightly cheaper to test
> > TIF_LAZY_MMU_PENDING and avoid setting it unnecessarily.
> 
> Yes, good point.
> 
> > I haven't checked - does the compiler generate multiple mrs from sp_el0
> > for subsequent test_thread_flag()?
> 
> It emits a single mrs but it loads from the pointer twice.

It's not that bad if we only do the set_thread_flag() once.

> I think v3 is the version we want?
> 
> 
> void TEST_queue_pte_barriers_v1(void)
> {
> 	if (test_thread_flag(TIF_LAZY_MMU))
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }
> 
> void TEST_queue_pte_barriers_v2(void)
> {
> 	if (test_thread_flag(TIF_LAZY_MMU) &&
> 	    !test_thread_flag(TIF_LAZY_MMU_PENDING))
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }
> 
> void TEST_queue_pte_barriers_v3(void)
> {
> 	unsigned long flags = read_thread_flags();
> 
> 	if ((flags & (_TIF_LAZY_MMU | _TIF_LAZY_MMU_PENDING)) == _TIF_LAZY_MMU)
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }

Doesn't v3 emit barriers once _TIF_LAZY_MMU_PENDING has been set? We
need something like:

	if (flags & _TIF_LAZY_MMU) {
		if (!(flags & _TIF_LAZY_MMU_PENDING))
			set_thread_flag(TIF_LAZY_MMU_PENDING);
	} else {
		emit_pte_barriers();
	}
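
To make the difference concrete, here is a minimal userspace sketch of the
two variants. It is only a model, not kernel code: the MODEL_* bits, struct
model_thread and the model_* helpers are hypothetical stand-ins for the
thread flags, set_thread_flag() and emit_pte_barriers(), and "emitting" a
barrier just bumps a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for _TIF_LAZY_MMU and _TIF_LAZY_MMU_PENDING. */
#define MODEL_TIF_LAZY_MMU		(1UL << 0)
#define MODEL_TIF_LAZY_MMU_PENDING	(1UL << 1)

/* Models one thread: its flags word plus a count of emitted barriers. */
struct model_thread {
	unsigned long flags;
	unsigned int barriers;
};

/* Stand-in for emit_pte_barriers(): only counts the call. */
static void model_emit_pte_barriers(struct model_thread *t)
{
	t->barriers++;
}

/* Stand-in for set_thread_flag(TIF_LAZY_MMU_PENDING). */
static void model_set_pending(struct model_thread *t)
{
	t->flags |= MODEL_TIF_LAZY_MMU_PENDING;
}

/* Same condition as v3: falls into the else branch once PENDING is set. */
static void model_queue_v3(struct model_thread *t)
{
	unsigned long flags = t->flags;

	if ((flags & (MODEL_TIF_LAZY_MMU | MODEL_TIF_LAZY_MMU_PENDING)) ==
	    MODEL_TIF_LAZY_MMU)
		model_set_pending(t);
	else
		model_emit_pte_barriers(t);
}

/* Nested form: barriers are emitted only when not in lazy MMU mode. */
static void model_queue_nested(struct model_thread *t)
{
	unsigned long flags = t->flags;

	if (flags & MODEL_TIF_LAZY_MMU) {
		if (!(flags & MODEL_TIF_LAZY_MMU_PENDING))
			model_set_pending(t);
	} else {
		model_emit_pte_barriers(t);
	}
}

/* Run 'calls' back-to-back updates and report how many barriers fired. */
static unsigned int model_count_barriers(void (*queue)(struct model_thread *),
					 bool lazy, unsigned int calls)
{
	struct model_thread t = {
		.flags = lazy ? MODEL_TIF_LAZY_MMU : 0,
	};

	for (unsigned int i = 0; i < calls; i++)
		queue(&t);

	return t.barriers;
}
```

In this model, three calls in lazy mode give two barriers with
model_queue_v3() (every call after the first one) and none with
model_queue_nested(). Outside lazy mode both emit a barrier on every call.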

-- 
Catalin


