From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 53F71C4332F for ; Thu, 20 Oct 2022 08:33:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=kBTwtMJlBHRujbsbvfbg5TvnU0eCDgL4isMLUwXlYfg=; b=4TrXrwerWqyDE2 vtXH/AmXRE87mQXS6jnbe3iQJP71mLHK5wuBMxZGcMWAs+qsWAbJEEvK9TqVe9SfQh/F3RgZqB2SN CgRdX+cau6sn+wiJAP5XY8Jnj9q7iKEdYHSJMYuYNEVdCW7kEhUPIZyQAhfrDJW5TmiYywkSovkCz Kth2HM/3oFuFmz3CIib6PB+5wOHYfnSgU3ROSDlq3zFPJMTkaT1HWIJmClDB+JfnsbLwr2h1+tI4P wxxyRU5X1Qi3CFhlHCKLXS4KRAZuklK+GwCQWkDjWTf0K647xeS8z/FtCVOkeI/ne1sFGY8EgancE hKXZU2UNLPfYytYZ9Thg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1olQyn-00CFFi-P6; Thu, 20 Oct 2022 08:32:41 +0000 Received: from out0.migadu.com ([94.23.1.103]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1olQyj-00CFC8-EK for linux-arm-kernel@lists.infradead.org; Thu, 20 Oct 2022 08:32:39 +0000 Date: Thu, 20 Oct 2022 11:32:28 +0300 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1666254751; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=ctJTA03g4S212w2fsHSI6DaGgRDXgMSNX17PNK39G/0=; b=pgyjDfj23srvir/0JWPCeon9S2icBTGfjwie/jgTm29lfHFUB5MWMUQC2NOCYHM+Ev6Kwu vYfDqUZn9trZqo950qlzCnTvt9p28mC3423MnKGH9g/cZ6eb9s50soaCKfJQLzQJkzCN7N sIVBDEL8fDY+TEzFGGrstuX4JGlSOfY= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Oliver Upton To: Sean Christopherson Cc: Marc Zyngier , James Morse , Alexandru Elisei , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, Reiji Watanabe , Ricardo Koller , David Matlack , Quentin Perret , Ben Gardon , Gavin Shan , Peter Xu , Will Deacon , kvmarm@lists.linux.dev Subject: Re: [PATCH v2 07/15] KVM: arm64: Use an opaque type for pteps Message-ID: References: <20221007232818.459650-1-oliver.upton@linux.dev> <20221007232818.459650-8-oliver.upton@linux.dev> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: X-Migadu-Flow: FLOW_OUT X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221020_013238_243892_2A772BAD X-CRM114-Status: GOOD ( 27.09 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Wed, Oct 19, 2022 at 11:17:43PM +0000, Sean Christopherson wrote: > On Fri, Oct 07, 2022, Oliver Upton wrote: > > Use an opaque type for pteps and require visitors explicitly dereference > > the pointer before using. Protecting page table memory with RCU requires > > that KVM dereferences RCU-annotated pointers before using. However, RCU > > is not available for use in the nVHE hypervisor and the opaque type can > > be conditionally annotated with RCU for the stage-2 MMU. > > > > Call the type a 'pteref' to avoid a naming collision with raw pteps. No > > functional change intended. > > > > Signed-off-by: Oliver Upton > > --- > > arch/arm64/include/asm/kvm_pgtable.h | 9 ++++++++- > > arch/arm64/kvm/hyp/pgtable.c | 23 ++++++++++++----------- > > 2 files changed, 20 insertions(+), 12 deletions(-) > > > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h > > index c33edcf36b5b..beb89eac155c 100644 > > --- a/arch/arm64/include/asm/kvm_pgtable.h > > +++ b/arch/arm64/include/asm/kvm_pgtable.h > > @@ -25,6 +25,13 @@ static inline u64 kvm_get_parange(u64 mmfr0) > > > > typedef u64 kvm_pte_t; > > > > +typedef kvm_pte_t *kvm_pteref_t; > > + > > +static inline kvm_pte_t *kvm_dereference_pteref(kvm_pteref_t pteref, bool shared) > > +{ > > + return pteref; > > Returning the pointer is unsafe (when it becomes RCU-protected). The full > dereference of the data needs to occur under RCU protection, not just the retrieval > of the pointer. E.g. this (straw man) would be broken > > bool table = kvm_pte_table(ctx.old, level); > > rcu_read_lock(); > ptep = kvm_dereference_pteref(pteref, flags & KVM_PGTABLE_WALK_SHARED); > rcu_read_unlock(); > > if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE)) > ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_PRE); > > if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) { > ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF); > ctx.old = READ_ONCE(*ptep); > table = kvm_pte_table(ctx.old, level); > } > > as the read of the entry pointed at by ptep could be to a page table that is freed > in an RCU callback. > > The naming collision you are trying to avoid is a symptom of this bad pattern, > as there should never be "raw" pteps floating around, at least not in non-pKVM > contexts that utilize RCU. Fair enough, this was mostly from an attempt to avoid churn in the visitor callbacks by adding more layering for reads/writes through RCU pointers. The lifetime of the raw pointer is 'safe' as it sits on the stack and we go behind the rcu lock much further up. > > +} > > + > > #define KVM_PTE_VALID BIT(0) > > > > #define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT) > > @@ -170,7 +177,7 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end, > > struct kvm_pgtable { > > u32 ia_bits; > > u32 start_level; > > - kvm_pte_t *pgd; > > + kvm_pteref_t pgd; > > struct kvm_pgtable_mm_ops *mm_ops; > > > > /* Stage-2 only */ > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c > > index 02c33fccb178..6b6e1ed7ee2f 100644 > > --- a/arch/arm64/kvm/hyp/pgtable.c > > +++ b/arch/arm64/kvm/hyp/pgtable.c > > @@ -175,13 +175,14 @@ static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, > > } > > > > static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, > > - struct kvm_pgtable_mm_ops *mm_ops, kvm_pte_t *pgtable, u32 level); > > + struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level); > > > > static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, > > struct kvm_pgtable_mm_ops *mm_ops, > > - kvm_pte_t *ptep, u32 level) > > + kvm_pteref_t pteref, u32 level) > > { > > enum kvm_pgtable_walk_flags flags = data->walker->flags; > > + kvm_pte_t *ptep = kvm_dereference_pteref(pteref, false); > > struct kvm_pgtable_visit_ctx ctx = { > > .ptep = ptep, > > .old = READ_ONCE(*ptep), > > This is where you want the protection to kick in, e.g. > > typedef kvm_pte_t __rcu *kvm_ptep_t; > > static inline kvm_pte_t kvm_read_pte(kvm_ptep_t ptep) > { > return READ_ONCE(*rcu_dereference(ptep)); > } > > .old = kvm_read_pte(ptep), > > In other words, the pointer itself isn't that's protected, it's PTE that the > pointer points at that's protected. Right, but practically speaking it is the boundary at which we assert that protection. Anyhow, I'll look at abstracting the actual memory accesses in the visitors without too much mess. -- Thanks, Oliver _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel