linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Joey Gouly <joey.gouly@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Subject: Re: [PATCH v2 13/16] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
Date: Tue, 04 Jun 2024 08:59:49 +0100	[thread overview]
Message-ID: <87frtt2l56.wl-maz@kernel.org> (raw)
In-Reply-To: <Zl4NScV0E_YV7GR2@linux.dev>

On Mon, 03 Jun 2024 19:36:57 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Wed, May 29, 2024 at 03:56:25PM +0100, Marc Zyngier wrote:

[...]

> > +/*
> > + * Compute the equivalent of the TTL field by parsing the shadow PT.  The
> > + * granule size is extracted from the cached VTCR_EL2.TG0 while the level is
> > + * retrieved from first entry carrying the level as a tag.
> > + */
> > +static u8 get_guest_mapping_ttl(struct kvm_s2_mmu *mmu, u64 addr)
> > +{
> 
> Can you add a lockdep assertion that the MMU lock is held for write
> here? At least for me this is far enough away from the 'real' page table
> walk that it wasn't clear what locks were held at this point.

Sure thing.

> 
> > +	u64 tmp, sz = 0, vtcr = mmu->tlb_vtcr;
> > +	kvm_pte_t pte;
> > +	u8 ttl, level;
> > +
> > +	switch (vtcr & VTCR_EL2_TG0_MASK) {
> > +	case VTCR_EL2_TG0_4K:
> > +		ttl = (TLBI_TTL_TG_4K << 2);
> > +		break;
> > +	case VTCR_EL2_TG0_16K:
> > +		ttl = (TLBI_TTL_TG_16K << 2);
> > +		break;
> > +	case VTCR_EL2_TG0_64K:
> > +	default:	    /* IMPDEF: treat any other value as 64k */
> > +		ttl = (TLBI_TTL_TG_64K << 2);
> > +		break;
> > +	}
> > +
> > +	tmp = addr;
> > +
> > +again:
> > +	/* Iteratively compute the block sizes for a particular granule size */
> > +	switch (vtcr & VTCR_EL2_TG0_MASK) {
> > +	case VTCR_EL2_TG0_4K:
> > +		if	(sz < SZ_4K)	sz = SZ_4K;
> > +		else if (sz < SZ_2M)	sz = SZ_2M;
> > +		else if (sz < SZ_1G)	sz = SZ_1G;
> > +		else			sz = 0;
> > +		break;
> > +	case VTCR_EL2_TG0_16K:
> > +		if	(sz < SZ_16K)	sz = SZ_16K;
> > +		else if (sz < SZ_32M)	sz = SZ_32M;
> > +		else			sz = 0;
> > +		break;
> > +	case VTCR_EL2_TG0_64K:
> > +	default:	    /* IMPDEF: treat any other value as 64k */
> > +		if	(sz < SZ_64K)	sz = SZ_64K;
> > +		else if (sz < SZ_512M)	sz = SZ_512M;
> > +		else			sz = 0;
> > +		break;
> > +	}
> > +
> > +	if (sz == 0)
> > +		return 0;
> > +
> > +	tmp &= ~(sz - 1);
> > +	if (kvm_pgtable_get_leaf(mmu->pgt, tmp, &pte, NULL))
> > +		goto again;
> 
> Assuming we're virtualizing a larger TG than what's in use at L0 this
> may not actually find a valid leaf that exists within the span of a
> single virtual TLB entry.
> 
> For example, if we're using 4K at L0 and 16K at L1, we could have:
> 
> 	[ ----- valid 16K entry ------- ]
> 
> mapped as:
> 
> 	[ ----- | ----- | valid | ----- ]
> 
> in the shadow S2. kvm_pgtable_get_leaf() will always return the first
> splintered page, which could be invalid.
> 
> What I'm getting at is: should this use a bespoke table walker that
> scans for a valid TTL in the range of [addr, addr + sz)? It may make
> sense to back off a bit more aggressively and switch to a conservative,
> unscoped TLBI to avoid visiting too many PTEs.

I had something along those lines at some point (circa 2019), and
quickly dropped it as it had a horrible "look-around" behaviour,
specially if the L1 S2 granule size is much larger than L0's (64k vs
4k). As you pointed out, it needs heuristics to limit the look-around,
which I don't find very satisfying.

Which is why the current code limits the search to be in depth only,
hoping for the head descriptor to be valid, and quickly backs off to
do a level-0 invalidation.

My preferred option would be to allow the use of non-valid entries to
cache the level (always using the first L0 entry that would map the L1
descriptor), but this opens another can of worms: you could end-up
with page table pages containing only invalid descriptors, except for
the presence of a level annotation, which screws the refcounting. I'd
very much like to see this rather than the look-around option.

Now, it is important to consider how useful this is. I expect modern
hypervisors to use either TTL-hinted (which we emulate even if the HW
doesn't support it) or Range-based invalidation in the vast majority
of the cases, so this would only help SW that hasn't got on with the
program.

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-06-04  8:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 14:56 [PATCH v2 00/16] KVM: arm64: nv: Shadow stage-2 page table handling Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 01/16] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 02/16] KVM: arm64: nv: Implement nested Stage-2 page table walk logic Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 03/16] KVM: arm64: nv: Handle shadow stage 2 page faults Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 04/16] KVM: arm64: nv: Unmap/flush shadow stage 2 page tables Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 05/16] KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 06/16] KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 07/16] KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1 Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 08/16] KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 09/16] KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 10/16] KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 11/16] KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 12/16] KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level Marc Zyngier
2024-06-03 18:05   ` Oliver Upton
2024-06-05  7:56     ` Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 13/16] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information Marc Zyngier
2024-06-03 18:36   ` Oliver Upton
2024-06-04  7:59     ` Marc Zyngier [this message]
2024-06-04 17:49       ` Oliver Upton
2024-05-29 14:56 ` [PATCH v2 14/16] KVM: arm64: nv: Add handling of outer-shareable TLBI operations Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 15/16] KVM: arm64: nv: Add handling of range-based " Marc Zyngier
2024-05-29 14:56 ` [PATCH v2 16/16] KVM: arm64: nv: Add handling of NXS-flavoured " Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87frtt2l56.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=alexandru.elisei@arm.com \
    --cc=christoffer.dall@arm.com \
    --cc=gankulkarni@os.amperecomputing.com \
    --cc=james.morse@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).