From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Marc Zyngier <maz@kernel.org>, "Dev Jain" <dev.jain@arm.com>,
	Linu Cherian <Linu.Cherian@arm.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
Date: Tue, 3 Mar 2026 17:34:33 +0000
Message-ID: <20260303173433.0000031f@huawei.com>
In-Reply-To: <28c7506f-a57e-4c59-b3ef-4596d8e1b11e@arm.com>

On Tue, 3 Mar 2026 13:54:33 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:

> On 03/03/2026 09:57, Jonathan Cameron wrote:
> > On Mon, 2 Mar 2026 13:55:58 +0000
> > Ryan Roberts <ryan.roberts@arm.com> wrote:
> >   
> >> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> >> a single __always_inline implementation that takes flags and rely on
> >> constant folding to select the parts that are actually needed at any
> >> given callsite, based on the provided flags.
> >>
> >> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> >> to provide the strongest semantics (i.e. evict from walk cache,
> >> broadcast, synchronise and notify). Each flag reduces the strength in
> >> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> >> complement the existing TLBF_NOWALKCACHE.
> >>
> >> There are no users that require TLBF_NOBROADCAST without
> >> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> >> introduce dead code for vae1 invalidations.
> >>
> >> The result is a clearer, simpler, more powerful API.  
> > Hi Ryan,
> > 
> > There is one subtle change to the rounding that should at least be called out.
> 
> Thanks for the review. I'm confident that there isn't actually a change to the
> rounding here, but the responsibility has moved to the caller. See below...
> 
> > 
> > Might even be worth pulling it out into a precursor patch where you can add an
> > explanation of why the original code was rounding to a larger value than was
> > ever needed.
> > 
> > Jonathan
> > 
> >   
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>  
> > 
> >   
> >>  static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >>  				     unsigned long stride, int tlb_level,
> >>  				     tlbf_t flags)
> >>  {
> >> -	__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> >> -				 tlb_level, flags);
> >> -	__tlbi_sync_s1ish();
> >> -}
> >> -
> >> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> >> -					   unsigned long addr)
> >> -{
> >> -	unsigned long asid;
> >> -
> >> -	addr = round_down(addr, CONT_PTE_SIZE);  
> > See below.  
> >> -
> >> -	dsb(nshst);
> >> -	asid = ASID(vma->vm_mm);
> >> -	__flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> >> -	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> >> -						    addr + CONT_PTE_SIZE);
> >> -	dsb(nsh);
> >> +	start = round_down(start, stride);  
> > See below.  
> >> +	end = round_up(end, stride);
> >> +	__do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
> >>  }  
> >   
> >>  
> >>  static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> >> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> >> index 681f22fac52a1..3f1a3e86353de 100644
> >> --- a/arch/arm64/mm/contpte.c
> >> +++ b/arch/arm64/mm/contpte.c  
> > ...
> >   
> >> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
> >>  			__ptep_set_access_flags(vma, addr, ptep, entry, 0);
> >>  
> >>  		if (dirty)
> >> -			local_flush_tlb_contpte(vma, start_addr);
> >> +			__flush_tlb_range(vma, start_addr,
> >> +					  start_addr + CONT_PTE_SIZE,
> >> +					  PAGE_SIZE, 3,  
> > 
> > This results in a different stride being used for the round down.
> > local_flush_tlb_contpte() did
> > addr = round_down(addr, CONT_PTE_SIZE);
> > 
> > With this call we have
> > start = round_down(start, stride); where stride is PAGE_SIZE.
> > 
> > I'm too lazy to figure out if that matters.  
> 
> contpte_ptep_set_access_flags() is operating on a contpte block of ptes, and as
> such, start_addr has already been rounded down to the start of the block, which
> is always bigger than (and perfectly divisible by) PAGE_SIZE.
> 
> Previously, local_flush_tlb_contpte() allowed passing any VA within the
> contpte block and the function would automatically round it down to the start of
> the block and invalidate the full block.
> 
> After the change, we are explicitly passing the already aligned block;
> start_addr is already guaranteed to be at the start of the block and "start_addr
> + CONT_PTE_SIZE" is the end.
> 
> So in both cases, the rounding down that is done by local_flush_tlb_contpte() /
> __flush_tlb_range() doesn't actually change the value.

Ah ok, so the key is that the round down in local_flush_tlb_contpte() never
did anything in practice, because the only caller is
contpte_ptep_set_access_flags() and that already aligns the address down a
couple of lines before the call.  I should have spent a few seconds looking! :(
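
Spelling it out for my own benefit (the helper names below are my guess at
what contpte.c actually uses, not a quote of the code):

	/* contpte.c aligns to the block a couple of lines before the flush */
	start_addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);	/* block start */

	/* old local_flush_tlb_contpte(): */
	addr = round_down(start_addr, CONT_PTE_SIZE);	/* no-op, already aligned */

	/* new __flush_tlb_range(vma, start_addr, start_addr + CONT_PTE_SIZE,
	 * PAGE_SIZE, ...): */
	start = round_down(start_addr, PAGE_SIZE);		/* also a no-op */
	end = round_up(start_addr + CONT_PTE_SIZE, PAGE_SIZE);	/* likewise */

Since CONT_PTE_SIZE is an exact multiple of PAGE_SIZE, none of these
roundings change the values.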

Maybe, if you are respinning, just throw in a one-line comment on this in the
commit description.
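
FWIW, the way I read the new call against the old local_flush_tlb_contpte()
body (a sketch of my understanding only, not the actual implementation):

	/* old open-coded path, from the hunk being removed above */
	dsb(nshst);
	asid = ASID(vma->vm_mm);
	__flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
						    addr + CONT_PTE_SIZE);
	dsb(nsh);

	/* new path: the same semantics selected by flags and, per the commit
	 * message, folded down at compile time */
	__flush_tlb_range(vma, start_addr, start_addr + CONT_PTE_SIZE,
			  PAGE_SIZE, 3,
			  TLBF_NOWALKCACHE |	/* last-level only (vale1) */
			  TLBF_NOBROADCAST);	/* local (nsh), not broadcast */

If I've read the flags right, that keeps the vale1 + nsh behaviour of the old
helper.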

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

> 
> Thanks,
> Ryan
> 
> 
> > 
> >   
> >> +					  TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
> >>  	} else {
> >>  		__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
> >>  		__ptep_set_access_flags(vma, addr, ptep, entry, dirty);  
> >   
> 
> 


Thread overview: 22+ messages
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 01/13] arm64: mm: Re-implement the __tlbi_level macro as a C function Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 02/13] arm64: mm: Introduce a C wrapper for by-range TLB invalidation Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 03/13] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 04/13] arm64: mm: Push __TLBI_VADDR() into __tlbi_level() Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 05/13] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 06/13] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 07/13] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 08/13] arm64: mm: Simplify __flush_tlb_range_limit_excess() Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 09/13] arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 10/13] arm64: mm: Refactor __flush_tlb_range() to take flags Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range() Ryan Roberts
2026-03-03  9:57   ` Jonathan Cameron
2026-03-03 13:54     ` Ryan Roberts
2026-03-03 17:34       ` Jonathan Cameron [this message]
2026-03-02 13:55 ` [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() Ryan Roberts
2026-03-03  9:59   ` Jonathan Cameron
2026-03-02 13:56 ` [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page() Ryan Roberts
2026-03-02 14:42   ` Mark Rutland
2026-03-02 17:39     ` Ryan Roberts
2026-03-02 17:56       ` Mark Rutland
2026-03-13 19:43 ` [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Catalin Marinas
