Re: [PATCH v4 01/12] arm64/mm: Update non-range tlb invalidation routines for FEAT_LPA2

From: Marc Zyngier <maz@kernel.org>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	James Morse <james.morse@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev
Subject: Re: [PATCH v4 01/12] arm64/mm: Update non-range tlb invalidation routines for FEAT_LPA2
Date: Fri, 20 Oct 2023 14:41:29 +0100
Message-ID: <86pm19o1h2.wl-maz@kernel.org>
In-Reply-To: <59a62837-1adf-43a5-8716-8068ddbdb7cf@arm.com>

On Fri, 20 Oct 2023 14:21:39 +0100,
Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
> On 20/10/2023 14:02, Marc Zyngier wrote:
> > On Fri, 20 Oct 2023 13:39:47 +0100,
> > Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> On 20/10/2023 09:05, Marc Zyngier wrote:
> >>> Maybe. There is something to be said about making the range rework
> >>> (decreasing scale) an independent patch, as it is a significant change
> >>> on its own. But maybe the rest of the plumbing can be grouped
> >>> together.
> >>
> >> But that's effectively the split I have now, isn't it? The first patch
> >> introduces TLBI_TTL_UNKNOWN to enable use of 0 as a ttl hint. Then the second
> >> patch reworks the range stuff. I don't quite follow what you are suggesting.
> > 
> > Not quite.
> > 
> > What I'm proposing is that you pull the scale changes in their own
> > patch, and preferably without any change to the external API (i.e. no
> > change to the signature of the helper). Then any extra change, such as
> > the TTL rework can go separately.
> > 
> > So while this is similar to your existing split, I'd like to see it
> > without any churn around the calling convention. Which means turning
> > the ordering around, and making use of a static key in the various
> > helpers that need to know about LPA2.
> 
> I don't think we can embed the static key usage directly inside
> __flush_tlb_range_op() (if that's what you were suggesting), because this macro
> is used by both the kernel (for its stage 1) and the hypervisor (for stage 2).
> And the kernel doesn't support LPA2 (until Ard's work is merged). So I think
> this needs to be an argument to the macro.

I can see two outcomes here:

- either you create separate helpers that abstract the LPA2-ness for
  KVM and stick to non-LPA2 for the kernel (until Ard's series makes
  it in)

- or you leave the whole thing disabled until we have full LPA2
  support.

Eventually, you replace the whole extra parameter with a static key,
and nobody sees any churn.
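
To illustrate the shape of the first option, here is a minimal userspace
sketch. Every name in it is hypothetical and not taken from the series: a
plain bool stands in for a real static key, and the "alignment shift" is only
a placeholder for whatever LPA2-dependent behaviour the range helper needs.
The point is that callers use wrappers with a stable signature, so the inner
parameter can later disappear without touching them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a static key; a real kernel would use the
 * static key infrastructure rather than a plain bool. */
static bool lpa2_enabled_key;

/* Hypothetical predicate hiding how LPA2-ness is detected. */
static bool kvm_lpa2_is_enabled_sketch(void)
{
	return lpa2_enabled_key;
}

/* Inner helper: still takes the extra parameter for now.
 * Assumption for illustration: with LPA2 the range base must be
 * 64K aligned (shift 16), otherwise page aligned. */
static unsigned int range_align_shift(unsigned int page_shift, bool lpa2)
{
	return lpa2 ? 16u : page_shift;
}

/* Kernel stage 1 wrapper: sticks to non-LPA2 behaviour. */
static unsigned int kernel_range_align_shift(unsigned int page_shift)
{
	return range_align_shift(page_shift, false);
}

/* KVM stage 2 wrapper: abstracts the LPA2-ness behind the key. */
static unsigned int kvm_range_align_shift(unsigned int page_shift)
{
	return range_align_shift(page_shift, kvm_lpa2_is_enabled_sketch());
}
```

Once both users can rely on the same key, the `lpa2` argument of the inner
helper can be dropped and the wrappers keep their signatures.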

> Or are you asking that I make the scale change universally, even if LPA2 is not
> in use? I could do that as its own change (which I could benchmark), then
> add the rest in a separate change. But my thinking was that we would not want to
> change the algorithm for !LPA2 since it is not as efficient (due to the LPA2 64K
> alignment requirement).

I'm all for simplicity. If having an extra 15 potential TLBIs is
acceptable from a performance perspective, I won't complain. But I can
imagine that NV would be suffering from that (TLBIs on S2 have to
trap).
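
For reference, a rough sketch of where the figure of 15 comes from. It assumes
4K pages and assumes that the leading part of a range which is not yet 64K
aligned is invalidated one page at a time before range operations can start.
The helper name is hypothetical and this is not the code from the series.

```c
#include <stdint.h>

/* 64K base alignment required for range TLBIs when LPA2 is in use. */
#define LPA2_RANGE_ALIGN	(UINT64_C(1) << 16)

/*
 * Hypothetical helper: number of single-page TLBIs needed before a
 * page-aligned 'start' reaches the next 64K boundary.
 */
static unsigned int leading_single_page_tlbis(uint64_t start,
					      unsigned int page_shift)
{
	uint64_t misalign = start & (LPA2_RANGE_ALIGN - 1);

	if (misalign == 0)
		return 0;

	return (unsigned int)((LPA2_RANGE_ALIGN - misalign) >> page_shift);
}
```

With 4K pages (page_shift of 12) the worst case is a start address one page
past a 64K boundary, which gives 15 single-page invalidations. With 16K pages
the worst case under the same assumptions is 3.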

	M.

-- 
Without deviation from the norm, progress is not possible.