public inbox for linux-kernel@vger.kernel.org
From: "Mikołaj Lenczewski" <miko.lenczewski@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: ryan.roberts@arm.com, suzuki.poulose@arm.com,
	yang@os.amperecomputing.com, catalin.marinas@arm.com,
	will@kernel.org, joro@8bytes.org, jean-philippe@linaro.org,
	mark.rutland@arm.com, joey.gouly@arm.com, oliver.upton@linux.dev,
	james.morse@arm.com, broonie@kernel.org, maz@kernel.org,
	akpm@linux-foundation.org, jgg@ziepe.ca, nicolinc@nvidia.com,
	mshavit@google.com, jsnitsel@redhat.com, smostafa@google.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev
Subject: Re: [PATCH v2 3/4] arm64/mm: Elide tlbi in contpte_convert() under BBML2
Date: Mon, 3 Mar 2025 10:55:47 +0000
Message-ID: <20250303105539.GA74129@e133081.arm.com>
In-Reply-To: <7e987f17-ffcb-45e0-8588-2d569d90f776@redhat.com>

On Mon, Mar 03, 2025 at 10:57:21AM +0100, David Hildenbrand wrote:
> On 03.03.25 10:49, Mikołaj Lenczewski wrote:
> > Hi David,
> > 
> > Thanks for taking the time to review.
> > 
> > On Mon, Mar 03, 2025 at 10:17:12AM +0100, David Hildenbrand wrote:
> > > On 28.02.25 19:24, Mikołaj Lenczewski wrote:
> > > > If we support bbml2 without conflict aborts, we can avoid the final
> > > > flush and have hardware manage the tlb entries for us. Avoiding flushes
> > > > is a win.
> > > > 
> > > > Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> > > > Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> > > > ---
> > > >    arch/arm64/mm/contpte.c | 3 ---
> > > >    1 file changed, 3 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> > > > index 145530f706a9..77ed03b30b72 100644
> > > > --- a/arch/arm64/mm/contpte.c
> > > > +++ b/arch/arm64/mm/contpte.c
> > > > @@ -72,9 +72,6 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
> > > >    		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> > > >    	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
> > > > -
> > > > -	if (system_supports_bbml2_noabort())
> > > > -		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> > > >    }
> > > >    void __contpte_try_fold(struct mm_struct *mm, unsigned long addr,
> > > 
> > > What's the point of not squashing this into #2? :)
> > > 
> > > If this split was requested during earlier review, at least seeing patch #2
> > > on its own confused me.
> > 
> > This split is a holdover from an earlier patchset, where it was still
> > unknown whether the removal of the second flush was permitted with
> > BBML2. Partly this was due to us being worried about conflict aborts
> > after the removal, and partly this was because the "delay" is a separate
> > optimisation that we could apply even if it turned out the final patch
> > was not architecturally sound.
> > 
> > Now that we do not handle conflict aborts (preferring only systems that
> > handle BBML2 without ever raising aborts), the first issue is not a
> > problem. The reasoning behind the second patch is also a little bit
> > outdated, but I can see the logical split between a tlbi reorder, and
> > the removal of the tlbi. If this is truly redundant though, I would be
> > happy to squash the two into a single patch.
> 
> Thanks for the information.
> 
> Does patch #2 (reordering the tlbi) have any benefit on its own? I read
> "other threads will not see an invalid pagetable entry", but I am not sure
> that is correct. A concurrent HW page table walker would still find the
> invalid PTE? It's just a matter of TLB state.

I think I understand what you mean. I agree that it is possible for a
concurrent walk to see invalid TLB state, if it is on the same TLB
that the repaint is happening on. For other TLBs, the flush has not yet
propagated our invalidated PTEs (from `__ptep_get_and_clear()`) though?
That invalidation will only be seen by other TLBs after the
`__flush_tlb_range()`, so we should save a few faults because only
"local" threads will ever see the invalid entry, as opposed to all
threads that try to read our modified range? Or is it the case that I
have misunderstood something basic here, or that I have misinterpreted
what you have written?
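
To make the orderings under discussion concrete, here is a toy C sketch.
It only records the order of the three steps in contpte_convert() for
each variant. It does not model TLBs or hardware walkers, so it says
nothing about which observers can see the invalid entry. The enum and
function names are hypothetical (they are not kernel identifiers), and
the assumption that patch 2 skips the early flush under BBML2 is
inferred from the indentation of the quoted diff context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step labels for this toy model; not kernel identifiers. */
enum step {
	STEP_CLEAR_PTES,	/* stands in for the __ptep_get_and_clear() loop */
	STEP_TLBI_RANGE,	/* stands in for __flush_tlb_range() */
	STEP_SET_PTES,		/* stands in for __set_ptes() */
};

enum variant {
	VARIANT_BASELINE,	/* no BBML2: clear, flush, set */
	VARIANT_DELAY,		/* patch 2/4 under BBML2 (assumed): clear, set, flush */
	VARIANT_ELIDE,		/* patch 3/4 under BBML2: clear, set */
};

/* Write the ordered steps into out (room for 3) and return the count. */
size_t convert_steps(enum variant v, enum step out[3])
{
	size_t n = 0;

	assert(out != NULL);

	out[n++] = STEP_CLEAR_PTES;
	if (v == VARIANT_BASELINE)
		out[n++] = STEP_TLBI_RANGE;
	out[n++] = STEP_SET_PTES;
	if (v == VARIANT_DELAY)
		out[n++] = STEP_TLBI_RANGE;

	return n;
}

/* True if a flush is issued between the clear and the set. */
bool tlbi_inside_invalid_window(enum variant v)
{
	enum step s[3];
	size_t n = convert_steps(v, s);
	bool cleared = false;

	for (size_t i = 0; i < n; i++) {
		if (s[i] == STEP_CLEAR_PTES)
			cleared = true;
		else if (s[i] == STEP_SET_PTES)
			cleared = false;
		else if (cleared)	/* STEP_TLBI_RANGE while PTEs are clear */
			return true;
	}
	return false;
}
```

In this model, convert_steps(VARIANT_DELAY, s) fills s with clear, set,
flush, whereas VARIANT_ELIDE yields only clear, set; only the baseline
issues its flush while the PTEs are still cleared.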

> If there is no benefit in having patch #2 independently, I'd just squash
> them. Reordering to then remove is more complicated than just removing it
> IMHO.
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 

-- 
Kind regards,
Mikołaj Lenczewski
