From: Oliver Upton <oliver.upton@linux.dev>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Joey Gouly <joey.gouly@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Christoffer Dall <christoffer.dall@arm.com>,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Subject: Re: [PATCH v2 13/16] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
Date: Tue, 4 Jun 2024 17:49:29 +0000 [thread overview]
Message-ID: <Zl9TqVRqNo-Cm90N@linux.dev> (raw)
In-Reply-To: <87frtt2l56.wl-maz@kernel.org>
On Tue, Jun 04, 2024 at 08:59:49AM +0100, Marc Zyngier wrote:
> On Mon, 03 Jun 2024 19:36:57 +0100,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> >
> > On Wed, May 29, 2024 at 03:56:25PM +0100, Marc Zyngier wrote:
>
> [...]
>
> > > +/*
> > > + * Compute the equivalent of the TTL field by parsing the shadow PT. The
> > > + * granule size is extracted from the cached VTCR_EL2.TG0 while the level is
> > > + * retrieved from first entry carrying the level as a tag.
> > > + */
> > > +static u8 get_guest_mapping_ttl(struct kvm_s2_mmu *mmu, u64 addr)
> > > +{
> >
> > Can you add a lockdep assertion that the MMU lock is held for write
> > here? At least for me this is far enough away from the 'real' page table
> > walk that it wasn't clear what locks were held at this point.
>
> Sure thing.
>
> >
> > > + u64 tmp, sz = 0, vtcr = mmu->tlb_vtcr;
> > > + kvm_pte_t pte;
> > > + u8 ttl, level;
> > > +
> > > + switch (vtcr & VTCR_EL2_TG0_MASK) {
> > > + case VTCR_EL2_TG0_4K:
> > > + ttl = (TLBI_TTL_TG_4K << 2);
> > > + break;
> > > + case VTCR_EL2_TG0_16K:
> > > + ttl = (TLBI_TTL_TG_16K << 2);
> > > + break;
> > > + case VTCR_EL2_TG0_64K:
> > > + default: /* IMPDEF: treat any other value as 64k */
> > > + ttl = (TLBI_TTL_TG_64K << 2);
> > > + break;
> > > + }
> > > +
> > > + tmp = addr;
> > > +
> > > +again:
> > > + /* Iteratively compute the block sizes for a particular granule size */
> > > + switch (vtcr & VTCR_EL2_TG0_MASK) {
> > > + case VTCR_EL2_TG0_4K:
> > > + if (sz < SZ_4K) sz = SZ_4K;
> > > + else if (sz < SZ_2M) sz = SZ_2M;
> > > + else if (sz < SZ_1G) sz = SZ_1G;
> > > + else sz = 0;
> > > + break;
> > > + case VTCR_EL2_TG0_16K:
> > > + if (sz < SZ_16K) sz = SZ_16K;
> > > + else if (sz < SZ_32M) sz = SZ_32M;
> > > + else sz = 0;
> > > + break;
> > > + case VTCR_EL2_TG0_64K:
> > > + default: /* IMPDEF: treat any other value as 64k */
> > > + if (sz < SZ_64K) sz = SZ_64K;
> > > + else if (sz < SZ_512M) sz = SZ_512M;
> > > + else sz = 0;
> > > + break;
> > > + }
> > > +
> > > + if (sz == 0)
> > > + return 0;
> > > +
> > > + tmp &= ~(sz - 1);
> > > + if (kvm_pgtable_get_leaf(mmu->pgt, tmp, &pte, NULL))
> > > + goto again;
> >
> > Assuming we're virtualizing a larger TG than what's in use at L0 this
> > may not actually find a valid leaf that exists within the span of a
> > single virtual TLB entry.
> >
> > For example, if we're using 4K at L0 and 16K at L1, we could have:
> >
> > [ ----- valid 16K entry ------- ]
> >
> > mapped as:
> >
> > [ ----- | ----- | valid | ----- ]
> >
> > in the shadow S2. kvm_pgtable_get_leaf() will always return the first
> > splintered page, which could be invalid.
> >
> > What I'm getting at is: should this use a bespoke table walker that
> > scans for a valid TTL in the range of [addr, addr + sz)? It may make
> > sense to back off a bit more aggressively and switch to a conservative,
> > unscoped TLBI to avoid visiting too many PTEs.
>
> I had something along those lines at some point (circa 2019), and
> quickly dropped it as it had a horrible "look-around" behaviour,
> specially if the L1 S2 granule size is much larger than L0's (64k vs
> 4k). As you pointed out, it needs heuristics to limit the look-around,
> which I don't find very satisfying.
>
> Which is why the current code limits the search to be in depth only,
> hoping for the head descriptor to be valid, and quickly backs off to
> do a level-0 invalidation.
>
> My preferred option would be to allow the use of non-valid entries to
> cache the level (always using the first L0 entry that would map the L1
> descriptor), but this opens another can of worms: you could end-up
> with page table pages containing only invalid descriptors, except for
> the presence of a level annotation, which screws the refcounting. I'd
> very much like to see this rather than the look-around option.
Yeah, this seems to be a better solution, albeit more complex on the map
path. Having said that, such an improvement can obviously be layered on
top after the fact, so I don't view this as a requirement for getting
these patches merged.
> Now, it is important to consider how useful this is. I expect modern
> hypervisors to use either TTL-hinted (which we emulate even if the HW
> doesn't support it) or Range-based invalidation in the vast majority
> of the cases, so this would only help SW that hasn't got on with the
> program.
"enterprise edition" hypervisors could be an annoyance, but you do have
a good point that there are tools available to the L1 to obviate TTL
caching as a requirement for good perf.
--
Thanks,
Oliver
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-06-04 17:49 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 14:56 [PATCH v2 00/16] KVM: arm64: nv: Shadow stage-2 page table handling Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 01/16] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 02/16] KVM: arm64: nv: Implement nested Stage-2 page table walk logic Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 03/16] KVM: arm64: nv: Handle shadow stage 2 page faults Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 04/16] KVM: arm64: nv: Unmap/flush shadow stage 2 page tables Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 05/16] KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 06/16] KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 07/16] KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1 Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 08/16] KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 09/16] KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 10/16] KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 11/16] KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 12/16] KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level Marc Zyngier
2024-06-03 18:05 ` Oliver Upton
2024-06-05 7:56 ` Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 13/16] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information Marc Zyngier
2024-06-03 18:36 ` Oliver Upton
2024-06-04 7:59 ` Marc Zyngier
2024-06-04 17:49 ` Oliver Upton [this message]
2024-05-29 14:56 ` [PATCH v2 14/16] KVM: arm64: nv: Add handling of outer-shareable TLBI operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 15/16] KVM: arm64: nv: Add handling of range-based " Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 16/16] KVM: arm64: nv: Add handling of NXS-flavoured " Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zl9TqVRqNo-Cm90N@linux.dev \
--to=oliver.upton@linux.dev \
--cc=alexandru.elisei@arm.com \
--cc=christoffer.dall@arm.com \
--cc=gankulkarni@os.amperecomputing.com \
--cc=james.morse@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).