Date: Wed, 19 Oct 2022 23:17:43 +0000
From: Sean Christopherson
To: Oliver Upton
Cc: Marc Zyngier, James Morse, Alexandru Elisei,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
	kvm@vger.kernel.org, Reiji Watanabe, Ricardo Koller, David Matlack,
	Quentin Perret, Ben Gardon, Gavin Shan, Peter Xu, Will Deacon,
	kvmarm@lists.linux.dev
Subject: Re: [PATCH v2 07/15] KVM: arm64: Use an opaque type for pteps
References: <20221007232818.459650-1-oliver.upton@linux.dev>
	<20221007232818.459650-8-oliver.upton@linux.dev>
In-Reply-To: <20221007232818.459650-8-oliver.upton@linux.dev>

On Fri, Oct 07, 2022, Oliver Upton wrote:
> Use an opaque type for pteps and require visitors explicitly dereference
> the pointer before using. Protecting page table memory with RCU requires
> that KVM dereferences RCU-annotated pointers before using. However, RCU
> is not available for use in the nVHE hypervisor and the opaque type can
> be conditionally annotated with RCU for the stage-2 MMU.
>
> Call the type a 'pteref' to avoid a naming collision with raw pteps. No
> functional change intended.
>
> Signed-off-by: Oliver Upton
> ---
>  arch/arm64/include/asm/kvm_pgtable.h |  9 ++++++++-
>  arch/arm64/kvm/hyp/pgtable.c         | 23 ++++++++++++-----------
>  2 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index c33edcf36b5b..beb89eac155c 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -25,6 +25,13 @@ static inline u64 kvm_get_parange(u64 mmfr0)
>
>  typedef u64 kvm_pte_t;
>
> +typedef kvm_pte_t *kvm_pteref_t;
> +
> +static inline kvm_pte_t *kvm_dereference_pteref(kvm_pteref_t pteref, bool shared)
> +{
> +	return pteref;

Returning the pointer is unsafe (when it becomes RCU-protected).  The full
dereference of the data needs to occur under RCU protection, not just the
retrieval of the pointer.  E.g. this (straw man) would be broken

	bool table = kvm_pte_table(ctx.old, level);

	rcu_read_lock();
	ptep = kvm_dereference_pteref(pteref, flags & KVM_PGTABLE_WALK_SHARED);
	rcu_read_unlock();

	if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE))
		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_PRE);

	if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) {
		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF);
		ctx.old = READ_ONCE(*ptep);
		table = kvm_pte_table(ctx.old, level);
	}

as the read of the entry pointed at by ptep could be to a page table that is
freed in an RCU callback.

The naming collision you are trying to avoid is a symptom of this bad pattern,
as there should never be "raw" pteps floating around, at least not in non-pKVM
contexts that utilize RCU.
> +}
> +
>  #define KVM_PTE_VALID			BIT(0)
>
>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
> @@ -170,7 +177,7 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end,
>  struct kvm_pgtable {
>  	u32					ia_bits;
>  	u32					start_level;
> -	kvm_pte_t				*pgd;
> +	kvm_pteref_t				pgd;
>  	struct kvm_pgtable_mm_ops		*mm_ops;
>
>  	/* Stage-2 only */
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 02c33fccb178..6b6e1ed7ee2f 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -175,13 +175,14 @@ static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data,
>  }
>
>  static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
> -			      struct kvm_pgtable_mm_ops *mm_ops, kvm_pte_t *pgtable, u32 level);
> +			      struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level);
>
>  static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
>  				      struct kvm_pgtable_mm_ops *mm_ops,
> -				      kvm_pte_t *ptep, u32 level)
> +				      kvm_pteref_t pteref, u32 level)
>  {
>  	enum kvm_pgtable_walk_flags flags = data->walker->flags;
> +	kvm_pte_t *ptep = kvm_dereference_pteref(pteref, false);
>  	struct kvm_pgtable_visit_ctx ctx = {
>  		.ptep	= ptep,
>  		.old	= READ_ONCE(*ptep),

This is where you want the protection to kick in, e.g.

  typedef kvm_pte_t __rcu *kvm_ptep_t;

  static inline kvm_pte_t kvm_read_pte(kvm_ptep_t ptep)
  {
	return READ_ONCE(*rcu_dereference(ptep));
  }

		.old	= kvm_read_pte(ptep),

In other words, the pointer itself isn't what's protected, it's the PTE that
the pointer points at that's protected.  rcu_dereference() has no overhead when
CONFIG_PROVE_RCU=n, i.e. there's no reason to "optimize" dereferences.
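
To make the ordering concrete, here is a minimal userspace sketch of the
accessor pattern suggested above.  It is not kernel code: __rcu, READ_ONCE(),
rcu_dereference(), rcu_read_lock() and rcu_read_unlock() are simplified
stand-ins defined locally (assumptions, providing no real RCU protection), and
rcu_depth and visit_one() are hypothetical names used only for illustration.
The point is that the load of the entry happens inside the read-side section
and only the copied value escapes it.

```c
/*
 * Userspace sketch only.  The macros and lock helpers below are simplified
 * stand-ins (assumptions) for the kernel primitives of the same name and
 * provide no real RCU protection.  Builds with GCC or Clang (__typeof__).
 */
#include <assert.h>
#include <stdint.h>

typedef uint64_t kvm_pte_t;

#define __rcu				/* sparse annotation, a no-op here */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define rcu_dereference(p)	(p)	/* stand-in: no lockdep, no ordering */

static int rcu_depth;			/* hypothetical: read-side nesting count */
static inline void rcu_read_lock(void)   { rcu_depth++; }
static inline void rcu_read_unlock(void) { rcu_depth--; }

typedef kvm_pte_t __rcu *kvm_ptep_t;

/* Read the entry itself, not just the pointer, through the accessor. */
static inline kvm_pte_t kvm_read_pte(kvm_ptep_t ptep)
{
	assert(rcu_depth > 0);		/* caller must be in a read-side section */
	return READ_ONCE(*rcu_dereference(ptep));
}

/* Hypothetical visitor: snapshot the PTE while still under protection. */
static kvm_pte_t visit_one(kvm_ptep_t ptep)
{
	kvm_pte_t old;

	rcu_read_lock();
	old = kvm_read_pte(ptep);	/* the load happens inside the section */
	rcu_read_unlock();

	return old;			/* only the copied value escapes */
}
```

Contrast this with the straw man earlier in the thread, where the pointer is
fetched under the lock but *ptep is read after rcu_read_unlock().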