Date: Fri, 1 Apr 2022 14:33:08 +0000
From: Sean Christopherson
To: Mingwei Zhang
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed,
    Ben Gardon, David Matlack, Jing Zhang, Peter Xu
Subject: Re: [PATCH v3 3/6] KVM: x86/mmu: explicitly check nx_hugepage in disallowed_hugepage_adjust()
References: <20220401063636.2414200-1-mizhang@google.com> <20220401063636.2414200-4-mizhang@google.com>
In-Reply-To: <20220401063636.2414200-4-mizhang@google.com>

On Fri, Apr 01, 2022, Mingwei Zhang wrote:
> Explicitly check if a NX huge page is disallowed when determining if a page
> fault needs to be forced to use a smaller sized page. KVM incorrectly
> assumes that the NX huge page mitigation is the only scenario where KVM
> will create a shadow page instead of a huge page. Any scenario that causes
> KVM to zap leaf SPTEs may result in having a SP that can be made huge
> without violating the NX huge page mitigation. E.g. disabling of dirty
> logging, zapping from mmu_notifier due to page migration, guest MTRR
> changes that affect the viability of a huge page, etc...
>
> Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
>
> Reviewed-by: Ben Gardon

Reviewed-by (and Acked-by and Tested-by) tags should be dropped when there is
a non-trivial, functional, and/or semantic change between versions.  It's
often a judgment call, but in this case I would definitely drop Ben's review.
It's usually better to err on the side of caution for this rule/guideline.

> Signed-off-by: Mingwei Zhang
> ---
>  arch/x86/kvm/mmu/mmu.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 5cb845fae56e..033609e8b332 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2896,6 +2896,16 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
>  	    cur_level == fault->goal_level &&
>  	    is_shadow_present_pte(spte) &&
>  	    !is_large_pte(spte)) {
> +		struct kvm_mmu_page *sp;
> +		u64 page_mask;
> +
> +		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> +
> +		/* Prevent lpage_disallowed read from moving ahead. */
> +		smp_rmb();

smp_rmb() needs to be paired with e.g. smp_wmb(), and the comment is supposed
to tie the two together (the reason checkpatch warns about comments is to try
and prevent sprinkling bogus barriers everywhere).

These types of barriers always need to be paired with something, as both the
reader(s) and the writer need to have specific, coordinated ordering.  E.g.
reading lpage_disallowed after checking for a present SPTE is meaningless if
the writer doesn't set lpage_disallowed before setting the SPTE present (which
is the purpose of patch 1).

I didn't add the smp_wmb() in patch 1 because it's not necessary as of that
patch; it's only when the concurrent reader comes along in this patch that the
barrier becomes necessary.
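Purely for illustration, the pattern being described is the usual
publish/consume ordering.  This is a generic sketch, not the KVM code;
"struct foo", publish(), consume(), compute_payload() and use() are made-up
names:

	struct foo {
		int payload;
		bool present;
	};

	/* Writer: fill in the payload _before_ marking the object present. */
	static void publish(struct foo *f)
	{
		f->payload = compute_payload();
		smp_wmb();	/* order the payload store before the "present" store */
		WRITE_ONCE(f->present, true);
	}

	/* Reader: only consume the payload after seeing the object as present. */
	static void consume(struct foo *f)
	{
		if (!READ_ONCE(f->present))
			return;
		smp_rmb();	/* pairs with the smp_wmb() in publish() */
		use(f->payload);
	}

Applied to this patch, the comments should tie the reader in
disallowed_hugepage_adjust() to the writer in kvm_tdp_mmu_map(), e.g.: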
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9f73c175845e..041b1ab838f0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2901,7 +2901,12 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
 
-		/* Prevent lpage_disallowed read from moving ahead. */
+		/*
+		 * Ensure lpage_disallowed is read after checking for a present
+		 * SPTE.  A different vCPU may be concurrently installing the
+		 * shadow page if mmu_lock is held for read.  Pairs with the
+		 * smp_wmb() in kvm_tdp_mmu_map().
+		 */
 		smp_rmb();
 
 		if (!sp->lpage_disallowed)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1f44826cf794..2422684f1dc3 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1186,6 +1186,13 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 
 		sp->lpage_disallowed = account_nx;
 
+		/*
+		 * Ensure lpage_disallowed is visible before the SP is
+		 * marked present, as mmu_lock is held for read.  Pairs
+		 * with the smp_rmb() in disallowed_hugepage_adjust().
+		 */
+		smp_wmb();
+
 		if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
 			tdp_mmu_free_sp(sp);
 			break;