From: Mingwei Zhang <mizhang@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	David Hildenbrand <david@redhat.com>,
	David Matlack <dmatlack@google.com>,
	Ben Gardon <bgardon@google.com>
Subject: Re: [PATCH v4 03/30] KVM: x86/mmu: Formalize TDP MMU's (unintended?) deferred TLB flush logic
Date: Thu, 3 Mar 2022 23:39:30 +0000
Message-ID: <YiFRskA4p1pwNAwS@google.com>
In-Reply-To: <20220303193842.370645-4-pbonzini@redhat.com>
On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> Explicitly ignore the result of zap_gfn_range() when putting the last
> reference to a TDP MMU root, and add a pile of comments to formalize the
> TDP MMU's behavior of deferring TLB flushes to alloc/reuse. Note, this
> only affects the !shared case, as zap_gfn_range() subtly never returns
> true for "flush" as the flush is handled by tdp_mmu_zap_spte_atomic().
>
> Putting the root without a flush is ok because even if there are stale
> references to the root in the TLB, they are unreachable because KVM will
> not run the guest with the same ASID without first flushing (where ASID
> in this context refers to both SVM's explicit ASID and Intel's implicit
> ASID that is constructed from VPID+PCID+EPT4A+etc...).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Message-Id: <20220226001546.360188-5-seanjc@google.com>
> Reviewed-by: Mingwei Zhang <mizhang@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c     |  8 ++++++++
>  arch/x86/kvm/mmu/tdp_mmu.c | 10 +++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 32c041ed65cb..9a6df2d02777 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5083,6 +5083,14 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
>  	kvm_mmu_sync_roots(vcpu);
>
>  	kvm_mmu_load_pgd(vcpu);
> +
> +	/*
> +	 * Flush any TLB entries for the new root, the provenance of the root
> +	 * is unknown.  Even if KVM ensures there are no stale TLB entries
> +	 * for a freed root, in theory another hypervisor could have left
> +	 * stale entries.  Flushing on alloc also allows KVM to skip the TLB
> +	 * flush when freeing a root (see kvm_tdp_mmu_put_root()).
> +	 */
>  	static_call(kvm_x86_flush_tlb_current)(vcpu);
>  out:
>  	return r;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index b97a4125feac..921fa386df99 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -93,7 +93,15 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
>  	list_del_rcu(&root->link);
>  	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>
> -	zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
> +	/*
> +	 * A TLB flush is not necessary as KVM performs a local TLB flush when
> +	 * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
> +	 * to a different pCPU.  Note, the local TLB flush on reuse also
> +	 * invalidates any paging-structure-cache entries, i.e. TLB entries for
> +	 * intermediate paging structures, that may be zapped, as such entries
> +	 * are associated with the ASID on both VMX and SVM.
> +	 */
> +	(void)zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);

Discussed this offline with Sean. I am now comfortable with the MMU model
of multiple 'roots' and with leaving the TLB unflushed for invalidated
roots.

One minor improvement to the comment could be:

"A TLB flush is not necessary as each vCPU performs a local TLB flush
when allocating or assigning a new root (see kvm_mmu_load()), and when
migrating to a different pCPU."

I think this wording is better because "KVM performs a local TLB flush"
may leave readers wondering why the 'remote' TLB flushes are missing.
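
To illustrate the per-vCPU wording, here is a toy userspace model of the
deferred-flush rule. It is only a sketch, not kernel code: every toy_*
name is hypothetical, the "TLB" is a single cached root pointer per pCPU,
and the flush-on-load and flush-on-migration behavior is assumed from the
patch comment rather than taken from the real VMX/SVM code paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PCPUS 2

/* Hypothetical stand-in for a TDP MMU root; not the kernel's struct. */
struct toy_root {
	int id;
};

/* One cached root pointer per pCPU models a (very small) hardware TLB. */
struct toy_pcpu {
	const struct toy_root *tlb_root;
};

struct toy_vcpu {
	const struct toy_root *root;	/* root currently loaded, or NULL */
	int pcpu;			/* pCPU this vCPU runs on */
};

static struct toy_pcpu toy_pcpus[TOY_NR_PCPUS];

/* Local flush: drops cached translations on one pCPU only. */
static void toy_flush_tlb_local(int pcpu)
{
	assert(pcpu >= 0 && pcpu < TOY_NR_PCPUS);
	toy_pcpus[pcpu].tlb_root = NULL;
}

/* Models kvm_mmu_load(): the vCPU flushes locally when it gets a root. */
static void toy_vcpu_load_root(struct toy_vcpu *vcpu,
			       const struct toy_root *root)
{
	vcpu->root = root;
	toy_flush_tlb_local(vcpu->pcpu);
}

/* Models migration: the vCPU flushes on the pCPU it moves to. */
static void toy_vcpu_migrate(struct toy_vcpu *vcpu, int new_pcpu)
{
	vcpu->pcpu = new_pcpu;
	toy_flush_tlb_local(new_pcpu);
}

/* Models guest entry: running caches entries for the loaded root. */
static void toy_vcpu_run(struct toy_vcpu *vcpu)
{
	toy_pcpus[vcpu->pcpu].tlb_root = vcpu->root;
}

/* Models kvm_tdp_mmu_put_root(): no flush at all, local or remote. */
static void toy_vcpu_put_root(struct toy_vcpu *vcpu)
{
	vcpu->root = NULL;
}

/* True if the pCPU still caches entries derived from the given root. */
static bool toy_tlb_has_root(int pcpu, const struct toy_root *root)
{
	return toy_pcpus[pcpu].tlb_root == root;
}
```

In this model, putting a root leaves its entry cached on the pCPU, but
the vCPU cannot run again until toy_vcpu_load_root() gives it a root,
and that call flushes first. No other vCPU has to be interrupted, which
is why no 'remote' flush shows up anywhere.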
>
>  	call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> --
> 2.31.1
>
>