From: Oliver Upton <oliver.upton@linux.dev>
To: Sean Christopherson <seanjc@google.com>
Cc: Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
Reiji Watanabe <reijiw@google.com>,
Ricardo Koller <ricarkol@google.com>,
David Matlack <dmatlack@google.com>,
Quentin Perret <qperret@google.com>,
Ben Gardon <bgardon@google.com>, Gavin Shan <gshan@redhat.com>,
Peter Xu <peterx@redhat.com>, Will Deacon <will@kernel.org>,
kvmarm@lists.linux.dev
Subject: Re: [PATCH v2 07/15] KVM: arm64: Use an opaque type for pteps
Date: Thu, 20 Oct 2022 11:32:28 +0300 [thread overview]
Message-ID: <Y1EHnFN2Goj2eLkE@google.com> (raw)
In-Reply-To: <Y1CFl8sLllXm4seK@google.com>
On Wed, Oct 19, 2022 at 11:17:43PM +0000, Sean Christopherson wrote:
> On Fri, Oct 07, 2022, Oliver Upton wrote:
> > Use an opaque type for pteps and require visitors explicitly dereference
> > the pointer before using. Protecting page table memory with RCU requires
> > that KVM dereferences RCU-annotated pointers before using. However, RCU
> > is not available for use in the nVHE hypervisor and the opaque type can
> > be conditionally annotated with RCU for the stage-2 MMU.
> >
> > Call the type a 'pteref' to avoid a naming collision with raw pteps. No
> > functional change intended.
> >
> > Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> > ---
> > arch/arm64/include/asm/kvm_pgtable.h | 9 ++++++++-
> > arch/arm64/kvm/hyp/pgtable.c | 23 ++++++++++++-----------
> > 2 files changed, 20 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > index c33edcf36b5b..beb89eac155c 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -25,6 +25,13 @@ static inline u64 kvm_get_parange(u64 mmfr0)
> >
> > typedef u64 kvm_pte_t;
> >
> > +typedef kvm_pte_t *kvm_pteref_t;
> > +
> > +static inline kvm_pte_t *kvm_dereference_pteref(kvm_pteref_t pteref, bool shared)
> > +{
> > + return pteref;
>
> Returning the pointer is unsafe (when it becomes RCU-protected). The full
> dereference of the data needs to occur under RCU protection, not just the retrieval
> of the pointer. E.g. this (straw man) would be broken
>
> bool table = kvm_pte_table(ctx.old, level);
>
> rcu_read_lock();
> ptep = kvm_dereference_pteref(pteref, flags & KVM_PGTABLE_WALK_SHARED);
> rcu_read_unlock();
>
> if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE))
> ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_PRE);
>
> if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) {
> ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF);
> ctx.old = READ_ONCE(*ptep);
> table = kvm_pte_table(ctx.old, level);
> }
>
> as the read of the entry pointed at by ptep could be to a page table that is freed
> in an RCU callback.
>
> The naming collision you are trying to avoid is a symptom of this bad pattern,
> as there should never be "raw" pteps floating around, at least not in non-pKVM
> contexts that utilize RCU.
Fair enough, this was mostly from an attempt to avoid churn in the
visitor callbacks by adding more layering for reads/writes through RCU
pointers. The lifetime of the raw pointer is 'safe' as it sits on the
stack and we go behind the rcu lock much further up.
> > +}
> > +
> > #define KVM_PTE_VALID BIT(0)
> >
> > #define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT)
> > @@ -170,7 +177,7 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end,
> > struct kvm_pgtable {
> > u32 ia_bits;
> > u32 start_level;
> > - kvm_pte_t *pgd;
> > + kvm_pteref_t pgd;
> > struct kvm_pgtable_mm_ops *mm_ops;
> >
> > /* Stage-2 only */
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index 02c33fccb178..6b6e1ed7ee2f 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -175,13 +175,14 @@ static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data,
> > }
> >
> > static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
> > - struct kvm_pgtable_mm_ops *mm_ops, kvm_pte_t *pgtable, u32 level);
> > + struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level);
> >
> > static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > struct kvm_pgtable_mm_ops *mm_ops,
> > - kvm_pte_t *ptep, u32 level)
> > + kvm_pteref_t pteref, u32 level)
> > {
> > enum kvm_pgtable_walk_flags flags = data->walker->flags;
> > + kvm_pte_t *ptep = kvm_dereference_pteref(pteref, false);
> > struct kvm_pgtable_visit_ctx ctx = {
> > .ptep = ptep,
> > .old = READ_ONCE(*ptep),
>
> This is where you want the protection to kick in, e.g.
>
> typedef kvm_pte_t __rcu *kvm_ptep_t;
>
> static inline kvm_pte_t kvm_read_pte(kvm_ptep_t ptep)
> {
> return READ_ONCE(*rcu_dereference(ptep));
> }
>
> .old = kvm_read_pte(ptep),
>
> In other words, the pointer itself isn't that's protected, it's PTE that the
> pointer points at that's protected.
Right, but practically speaking it is the boundary at which we assert
that protection.
Anyhow, I'll look at abstracting the actual memory accesses in the
visitors without too much mess.
--
Thanks,
Oliver
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-10-20 8:33 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-07 23:28 [PATCH v2 00/15] KVM: arm64: Parallel stage-2 fault handling Oliver Upton
2022-10-07 23:28 ` [PATCH v2 01/15] KVM: arm64: Combine visitor arguments into a context structure Oliver Upton
2022-10-07 23:28 ` [PATCH v2 02/15] KVM: arm64: Stash observed pte value in visitor context Oliver Upton
2022-10-07 23:28 ` [PATCH v2 03/15] KVM: arm64: Pass mm_ops through the " Oliver Upton
2022-10-07 23:28 ` [PATCH v2 04/15] KVM: arm64: Don't pass kvm_pgtable through kvm_pgtable_walk_data Oliver Upton
2022-10-07 23:28 ` [PATCH v2 05/15] KVM: arm64: Add a helper to tear down unlinked stage-2 subtrees Oliver Upton
2022-10-07 23:28 ` [PATCH v2 06/15] KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make Oliver Upton
2022-10-28 18:41 ` Ricardo Koller
2022-10-28 18:43 ` Ricardo Koller
2022-10-28 18:53 ` Ricardo Koller
2022-10-07 23:28 ` [PATCH v2 07/15] KVM: arm64: Use an opaque type for pteps Oliver Upton
2022-10-19 23:17 ` Sean Christopherson
2022-10-20 8:32 ` Oliver Upton [this message]
2022-10-27 22:31 ` Oliver Upton
2022-10-07 23:28 ` [PATCH v2 08/15] KVM: arm64: Protect stage-2 traversal with RCU Oliver Upton
2022-10-19 23:29 ` Sean Christopherson
2022-10-20 8:34 ` Oliver Upton
2022-10-07 23:28 ` [PATCH v2 09/15] KVM: arm64: Free removed stage-2 tables in RCU callback Oliver Upton
2022-10-07 23:28 ` [PATCH v2 10/15] KVM: arm64: Atomically update stage 2 leaf attributes in parallel walks Oliver Upton
2022-10-07 23:30 ` [PATCH v2 11/15] KVM: arm64: Split init and set for table PTE Oliver Upton
2022-10-07 23:31 ` [PATCH v2 12/15] KVM: arm64: Make block->table PTE changes parallel-aware Oliver Upton
2022-10-07 23:32 ` [PATCH v2 13/15] KVM: arm64: Make leaf->leaf " Oliver Upton
2022-10-07 23:32 ` [PATCH v2 14/15] KVM: arm64: Make table->block " Oliver Upton
2022-10-07 23:32 ` [PATCH v2 15/15] KVM: arm64: Handle stage-2 faults in parallel Oliver Upton
[not found] ` <202210081008.A9PLyQx2-lkp@intel.com>
2022-10-08 3:14 ` Oliver Upton
2022-10-19 23:32 ` Sean Christopherson
2022-10-20 8:35 ` Oliver Upton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1EHnFN2Goj2eLkE@google.com \
--to=oliver.upton@linux.dev \
--cc=alexandru.elisei@arm.com \
--cc=bgardon@google.com \
--cc=dmatlack@google.com \
--cc=gshan@redhat.com \
--cc=james.morse@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=peterx@redhat.com \
--cc=qperret@google.com \
--cc=reijiw@google.com \
--cc=ricarkol@google.com \
--cc=seanjc@google.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).