From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
David Matlack <dmatlack@google.com>,
Yan Zhao <yan.y.zhao@intel.com>, Ben Gardon <bgardon@google.com>
Subject: Re: [PATCH v3 2/8] KVM: x86/mmu: Tag disallowed NX huge pages even if they're not tracked
Date: Sun, 14 Aug 2022 00:53:51 +0000
Message-ID: <YvhHn50lQmRRST8N@google.com>
In-Reply-To: <20220805230513.148869-3-seanjc@google.com>

On Fri, Aug 05, 2022, Sean Christopherson wrote:
> Tag shadow pages that cannot be replaced with an NX huge page regardless
> of whether or not zapping the page would allow KVM to immediately create
> a huge page, e.g. because something else prevents creating a huge page.
>
> I.e. track pages that are disallowed from being NX huge pages regardless
> of whether or not the page could have been huge at the time of fault.
> KVM currently tracks pages that were disallowed from being huge due to
> the NX workaround if and only if the page could otherwise be huge. But
> that fails to handle the scenario where whatever restriction prevented
> KVM from installing a huge page goes away, e.g. if dirty logging is
> disabled, the host mapping level changes, etc...
>
> Failure to tag shadow pages appropriately could theoretically lead to
> false negatives, e.g. if a fetch fault requests a small page and thus
> isn't tracked, and a read/write fault later requests a huge page, KVM
> will not reject the huge page as it should.
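
To make the false negative above concrete, here is a toy userspace model
of the tagging decision before and after this patch. Everything in it is
an assumption made for illustration: struct toy_sp, the toy_* helpers and
the TOY_LEVEL_* constants are hypothetical stand-ins, and the "later
fault" check is reduced to reading a single flag. It is a sketch of the
reasoning in the changelog, not KVM code.

```c
#include <stdbool.h>

/* Hypothetical page-size levels for this sketch only. */
enum { TOY_LEVEL_4K = 1, TOY_LEVEL_2M = 2 };

/* Hypothetical stand-in for the one field of interest in a shadow page. */
struct toy_sp {
	bool lpage_disallowed;
};

/* Old rule: tag only if the fault could otherwise have used a huge page. */
static void toy_tag_old(struct toy_sp *sp, bool huge_page_disallowed,
			int req_level, int cur_level)
{
	if (huge_page_disallowed && req_level >= cur_level)
		sp->lpage_disallowed = true;
}

/* New rule: tag whenever the NX workaround disallowed a huge page. */
static void toy_tag_new(struct toy_sp *sp, bool huge_page_disallowed)
{
	if (huge_page_disallowed)
		sp->lpage_disallowed = true;
}

/* Simplified model of a later fault asking whether a huge page is OK. */
static bool toy_huge_page_allowed(const struct toy_sp *sp)
{
	return !sp->lpage_disallowed;
}
```

With a fetch fault that requests TOY_LEVEL_4K while linking a page at
TOY_LEVEL_2M, the old rule leaves the flag clear, so a later read/write
fault that requests a huge page is not rejected. The new rule sets the
flag in the same situation.
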
>
> To avoid yet another flag, initialize the list_head and use list_empty()
> to determine whether or not a page is on the list of NX huge pages that
> should be recovered.
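
For readers less familiar with the idiom, a minimal userspace sketch of
"self-linked node means not on any list" follows. It assumes a
hand-rolled circular list that only mimics the kernel's struct list_head;
the toy_* names are hypothetical and this is not <linux/list.h>. The
point it illustrates is that the membership test is only valid if the
node is initialized up front and re-initialized on removal, which is why
the patch pairs INIT_LIST_HEAD() with list_del_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's circular doubly linked list. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* A node that points at itself is an empty list / an unlinked entry. */
static void toy_init_list_head(struct toy_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool toy_list_empty(const struct toy_list_head *h)
{
	return h->next == h;
}

static void toy_list_add_tail(struct toy_list_head *entry,
			      struct toy_list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink and re-initialize so the emptiness test keeps working. */
static void toy_list_del_init(struct toy_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	toy_init_list_head(entry);
}

/* Hypothetical page whose list linkage doubles as an "is tracked" flag. */
struct toy_page {
	struct toy_list_head link;
};

static bool toy_page_is_tracked(const struct toy_page *p)
{
	return !toy_list_empty(&p->link);
}

/* Walks through init -> add -> delete and checks the flag at each step. */
int toy_demo(void)
{
	struct toy_list_head tracked;
	struct toy_page page;

	toy_init_list_head(&tracked);
	toy_init_list_head(&page.link);
	assert(!toy_page_is_tracked(&page));

	toy_list_add_tail(&page.link, &tracked);
	assert(toy_page_is_tracked(&page));

	toy_list_del_init(&page.link);
	assert(!toy_page_is_tracked(&page));
	return 0;
}
```

A plain unlink that does not re-initialize the entry would leave stale
pointers behind, and the emptiness test on that entry would no longer
mean "not tracked".
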
>
> Note, the TDP MMU accounting is still flawed as fixing the TDP MMU is
> more involved due to mmu_lock being held for read. This will be
> addressed in a future commit.
>
> Fixes: 5bcaf3e1715f ("KVM: x86/mmu: Account NX huge page disallowed iff huge page was requested")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Mingwei Zhang <mizhang@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c          | 27 +++++++++++++++++++--------
>  arch/x86/kvm/mmu/mmu_internal.h | 10 +++++++++-
>  arch/x86/kvm/mmu/paging_tmpl.h  |  6 +++---
>  arch/x86/kvm/mmu/tdp_mmu.c      |  4 +++-
>  4 files changed, 34 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 36b898dbde91..55dac44f3397 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -802,15 +802,20 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
> kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
> }
>
> -void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
> +void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> +			  bool nx_huge_page_possible)
>  {
> -	if (KVM_BUG_ON(sp->lpage_disallowed, kvm))
> +	if (KVM_BUG_ON(!list_empty(&sp->lpage_disallowed_link), kvm))
> +		return;
> +
> +	sp->lpage_disallowed = true;
> +
> +	if (!nx_huge_page_possible)
>  		return;
>
>  	++kvm->stat.nx_lpage_splits;
>  	list_add_tail(&sp->lpage_disallowed_link,
>  		      &kvm->arch.lpage_disallowed_mmu_pages);
> -	sp->lpage_disallowed = true;
>  }
>
> static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
> @@ -832,9 +837,13 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
>
>  void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
> -	--kvm->stat.nx_lpage_splits;
>  	sp->lpage_disallowed = false;
> -	list_del(&sp->lpage_disallowed_link);
> +
> +	if (list_empty(&sp->lpage_disallowed_link))
> +		return;
> +
> +	--kvm->stat.nx_lpage_splits;
> +	list_del_init(&sp->lpage_disallowed_link);
>  }
>
> static struct kvm_memory_slot *
> @@ -2115,6 +2124,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm,
>
> set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
>
> + INIT_LIST_HEAD(&sp->lpage_disallowed_link);
> +
> /*
> * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
> * depends on valid pages being added to the head of the list. See
> @@ -3112,9 +3123,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> continue;
>
> link_shadow_page(vcpu, it.sptep, sp);
> - if (fault->is_tdp && fault->huge_page_disallowed &&
> - fault->req_level >= it.level)
> - account_huge_nx_page(vcpu->kvm, sp);
> + if (fault->is_tdp && fault->huge_page_disallowed)
> + account_huge_nx_page(vcpu->kvm, sp,
> + fault->req_level >= it.level);
> }
>
> if (WARN_ON_ONCE(it.level != fault->goal_level))
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 582def531d4d..cca1ad75d096 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -100,6 +100,13 @@ struct kvm_mmu_page {
> };
> };
>
> + /*
> + * Tracks shadow pages that, if zapped, would allow KVM to create an NX
> + * huge page. A shadow page will have lpage_disallowed set but not be
> + * on the list if a huge page is disallowed for other reasons, e.g.
> + * because KVM is shadowing a PTE at the same gfn, the memslot isn't
> + * properly aligned, etc...
> + */
> struct list_head lpage_disallowed_link;
> #ifdef CONFIG_X86_32
> /*
> @@ -315,7 +322,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
>
> void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
>
> -void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
> +void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> + bool nx_huge_page_possible);
> void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
>
> #endif /* __KVM_X86_MMU_INTERNAL_H */
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index f5958071220c..e450f49f2225 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -713,9 +713,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> continue;
>
> link_shadow_page(vcpu, it.sptep, sp);
> - if (fault->huge_page_disallowed &&
> - fault->req_level >= it.level)
> - account_huge_nx_page(vcpu->kvm, sp);
> + if (fault->huge_page_disallowed)
> + account_huge_nx_page(vcpu->kvm, sp,
> + fault->req_level >= it.level);
> }
>
> if (WARN_ON_ONCE(it.level != fault->goal_level))
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index bf2ccf9debca..903d0d3497b6 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -284,6 +284,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
> static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
> gfn_t gfn, union kvm_mmu_page_role role)
> {
> + INIT_LIST_HEAD(&sp->lpage_disallowed_link);
> +
> set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
>
> sp->role = role;
> @@ -1130,7 +1132,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
> spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
> if (account_nx)
> - account_huge_nx_page(kvm, sp);
> + account_huge_nx_page(kvm, sp, true);
> spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>
> return 0;
> --
> 2.37.1.559.g78731f0fdb-goog
>