From: Oliver Upton <oliver.upton@linux.dev>
To: Raghavendra Rao Ananta <rananta@google.com>, h@linux.dev
Cc: Oliver Upton <oupton@google.com>, Marc Zyngier <maz@kernel.org>,
Ricardo Koller <ricarkol@google.com>,
Reiji Watanabe <reijiw@google.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Will Deacon <will@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Jing Zhang <jingzhangos@google.com>,
Colton Lewis <coltonlewis@google.com>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v2 7/7] KVM: arm64: Create a fast stage-2 unmap path
Date: Tue, 4 Apr 2023 19:19:28 +0000
Message-ID: <ZCx4QCs+cjr4nYev@linux.dev>
In-Reply-To: <CAJHc60xvSFpUs+o84fR14Rghd6rruBJkCMBtroeCeLDtjJg=gw@mail.gmail.com>
On Tue, Apr 04, 2023 at 10:52:01AM -0700, Raghavendra Rao Ananta wrote:
> On Wed, Mar 29, 2023 at 5:42 PM Oliver Upton <oliver.upton@linux.dev> wrote:
> >
> > On Mon, Feb 06, 2023 at 05:23:40PM +0000, Raghavendra Rao Ananta wrote:
> > > The current implementation of the stage-2 unmap walker
> > > traverses the entire page-table to clear and flush the TLBs
> > > for each entry. This could be very expensive, especially if
> > > the VM is not backed by hugepages. The unmap operation could be
> > > made efficient by disconnecting the table at the very
> > > top (level at which the largest block mapping can be hosted)
> > > and do the rest of the unmapping using free_removed_table().
> > > If the system supports FEAT_TLBIRANGE, flush the entire range
> > > that has been disconnected from the rest of the page-table.
> > >
> > > Suggested-by: Ricardo Koller <ricarkol@google.com>
> > > Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
> > > ---
> > > arch/arm64/kvm/hyp/pgtable.c | 44 ++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 44 insertions(+)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > index 0858d1fa85d6b..af3729d0971f2 100644
> > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > @@ -1017,6 +1017,49 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> > > return 0;
> > > }
> > >
> > > +/*
> > > + * The fast walker executes only if the unmap size is exactly equal to the
> > > + * largest block mapping supported (i.e. at KVM_PGTABLE_MIN_BLOCK_LEVEL),
> > > + * such that the underneath hierarchy at KVM_PGTABLE_MIN_BLOCK_LEVEL can
> > > + * be disconnected from the rest of the page-table without the need to
> > > + * traverse all the PTEs, at all the levels, and unmap each and every one
> > > + * of them. The disconnected table is freed using free_removed_table().
> > > + */
> > > +static int fast_stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> > > + enum kvm_pgtable_walk_flags visit)
> > > +{
> > > + struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
> > > + kvm_pte_t *childp = kvm_pte_follow(ctx->old, mm_ops);
> > > + struct kvm_s2_mmu *mmu = ctx->arg;
> > > +
> > > + if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MIN_BLOCK_LEVEL)
> > > + return 0;
> > > +
> > > + if (!stage2_try_break_pte(ctx, mmu))
> > > + return -EAGAIN;
> > > +
> > > + /*
> > > + * Gain back a reference for stage2_unmap_walker() to free
> > > + * this table entry from KVM_PGTABLE_MIN_BLOCK_LEVEL - 1.
> > > + */
> > > + mm_ops->get_page(ctx->ptep);
> >
> > Doesn't this run the risk of a potential UAF if the refcount was 1 before
> > calling stage2_try_break_pte()? IOW, stage2_try_break_pte() will drop
> > the refcount to 0 on the page before this ever gets called.
> >
> > Also, AFAICT this misses the CMOs that are required on systems w/o
> > FEAT_FWB. Without them it is possible that the host will read something
> > other than what was most recently written by the guest if it is using
> > noncacheable memory attributes at stage-1.
> >
> > I imagine the actual bottleneck is the DSB required after every
> > CMO/TLBI. Theoretically, the unmap path could be updated to:
> >
> > - Perform the appropriate CMOs for every valid leaf entry *without*
> > issuing a DSB.
> >
> > - Elide TLBIs entirely that take place in the middle of the walk
> >
> > - After the walk completes, dsb(ish) to guarantee that the CMOs have
> > completed and the invalid PTEs are made visible to the hardware
> > walkers. This should be done implicitly by the TLBI implementation
> >
> > - Invalidate the [addr, addr + size) range of IPAs
> >
> > This would also avoid over-invalidating stage-1 since we blast the
> > entire stage-1 context for every stage-2 invalidation. Thoughts?
> >
> Correct me if I'm wrong, but if we invalidate the TLB after the walk
> is complete, don't you think there's a risk of race if the guest can
> hit in the TLB even though the page was unmapped?
Yeah, we'd need to do the CMOs _after_ making the translation invalid in
the page tables and completing the TLB invalidation. Apologies.
Otherwise, the only requirement we need to uphold w/ either the MMU
notifiers or userspace is that the translation has been invalidated at
the time of return.
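
To make that ordering concrete, here is a toy userspace model in C. It is only a sketch under stated assumptions, not kernel code: `struct toy_s2`, `toy_unmap_range()` and the `TOY_EV_*` events are hypothetical names invented for illustration, and the "barriers", "TLBI" and "CMOs" are just entries in an event log. It shows the sequence discussed above: invalidate the PTEs without per-entry TLBIs, issue one ranged invalidation, and only then perform the CMOs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none is a kernel API. */
#define TOY_NR_PAGES 8

enum toy_event {
	TOY_EV_CLEAR_PTE,	/* stage-2 leaf entry made invalid */
	TOY_EV_DSB,		/* stands in for a barrier */
	TOY_EV_TLBI_RANGE,	/* stands in for one ranged invalidation */
	TOY_EV_CMO,		/* stands in for a cache maintenance op */
};

struct toy_s2 {
	bool pte_valid[TOY_NR_PAGES];	/* modelled stage-2 leaf entries */
	bool tlb_valid[TOY_NR_PAGES];	/* modelled cached translations */
	enum toy_event log[4 * TOY_NR_PAGES + 4];
	size_t nr_events;
};

static void toy_log(struct toy_s2 *s2, enum toy_event ev)
{
	s2->log[s2->nr_events++] = ev;
}

/* Unmap pages [start, start + nr); returns 0 on success, -1 if out of range. */
static int toy_unmap_range(struct toy_s2 *s2, size_t start, size_t nr)
{
	bool was_valid[TOY_NR_PAGES] = { false };
	size_t i;

	if (start > TOY_NR_PAGES || nr > TOY_NR_PAGES - start)
		return -1;

	/* 1. Invalidate the entries; no TLBI or barrier per entry. */
	for (i = start; i < start + nr; i++) {
		if (!s2->pte_valid[i])
			continue;
		was_valid[i] = true;
		s2->pte_valid[i] = false;
		toy_log(s2, TOY_EV_CLEAR_PTE);
	}

	/* 2. One barrier, one ranged invalidation, one completing barrier. */
	toy_log(s2, TOY_EV_DSB);
	for (i = start; i < start + nr; i++)
		s2->tlb_valid[i] = false;
	toy_log(s2, TOY_EV_TLBI_RANGE);
	toy_log(s2, TOY_EV_DSB);

	/* 3. CMOs only once the guest can no longer hit in the TLB. */
	for (i = start; i < start + nr; i++) {
		if (was_valid[i])
			toy_log(s2, TOY_EV_CMO);
	}
	toy_log(s2, TOY_EV_DSB);

	return 0;
}
```

With two valid pages in the range, the log in this model reads: two clears, a barrier, the ranged invalidation, a barrier, two CMOs, and a final barrier. On return no entry in the range is valid in either the modelled page table or the modelled TLB, which is the property the MMU notifiers and userspace rely on.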
--
Thanks,
Oliver