From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
David Hildenbrand <david@redhat.com>,
David Matlack <dmatlack@google.com>,
Ben Gardon <bgardon@google.com>,
Mingwei Zhang <mizhang@google.com>
Subject: Re: [PATCH v4 21/30] KVM: x86/mmu: Zap invalidated roots via asynchronous worker
Date: Thu, 3 Mar 2022 21:06:14 +0000
Message-ID: <YiEtxt6pQHtemFkm@google.com>
In-Reply-To: <YiErEoIMDZy94HIH@google.com>
On Thu, Mar 03, 2022, Sean Christopherson wrote:
> On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> > +	root->tdp_mmu_async_data = kvm;
> > +	INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
> > +	queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
> > +}
> > +
> > +static inline bool kvm_tdp_root_mark_invalid(struct kvm_mmu_page *page)
> > +{
> > +	union kvm_mmu_page_role role = page->role;
> > +	role.invalid = true;
> > +
> > +	/* No need to use cmpxchg, only the invalid bit can change. */
> > +	role.word = xchg(&page->role.word, role.word);
> > +	return role.invalid;
>
> This helper is unused. It _could_ be used here, but I think it belongs in the
> next patch. Critically, until zapping defunct roots creates the invariant that
> invalid roots are _always_ zapped via worker, kvm_tdp_mmu_invalidate_all_roots()
> must not assume that an invalid root is queued for zapping. I.e. doing this
> before the "Zap defunct roots" would be wrong:
>
> 	list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
> 		if (kvm_tdp_root_mark_invalid(root))
> 			continue;
>
> 		if (WARN_ON_ONCE(!kvm_tdp_mmu_get_root(root)))
> 			continue;
>
> 		tdp_mmu_schedule_zap_root(kvm, root);
> 	}

Gah, lost my train of thought and forgot that this _can_ re-queue a root even in
this patch, it just can't re-queue a root that is _currently_ queued.

The re-queue scenario happens if a root is queued and zapped, but is kept alive
by a vCPU that hasn't yet put its reference. If another memslot update comes along
before the (sleeping) vCPU drops its reference, this will re-queue the root.

It's not a major problem in this patch as it's a small amount of wasted effort,
but it will be an issue when the "put" path starts using the queue, as that will
create a scenario where a memslot update (or NX toggle) can come along while a
defunct root is in the zap queue.

Checking for role.invalid is wrong (as above), so for this patch I think the
easiest thing is to use tdp_mmu_async_data as a sentinel that the root was zapped
in the past and doesn't need to be re-zapped.
/*
 * Mark each TDP MMU root as invalid to prevent vCPUs from reusing a root that
 * is about to be zapped, e.g. in response to a memslots update.  The actual
 * zapping is performed asynchronously, so a reference is taken on all roots.
 * Using a separate workqueue makes it easy to ensure that the destruction is
 * performed before the "fast zap" completes, without keeping a separate list
 * of invalidated roots; the list is effectively the list of work items in
 * the workqueue.
 *
 * Skip roots that were already queued for zapping, the "fast zap" path is the
 * only user of the zap queue and always flushes the queue under slots_lock,
 * i.e. the queued zap is guaranteed to have completed already.
 *
 * Because mmu_lock is held for write, it should be impossible to observe a
 * root with zero refcount, i.e. the list of roots cannot be stale.
 *
 * This has essentially the same effect for the TDP MMU
 * as updating mmu_valid_gen does for the shadow MMU.
 */
void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
{
	struct kvm_mmu_page *root;

	lockdep_assert_held_write(&kvm->mmu_lock);
	list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
		if (root->tdp_mmu_async_data)
			continue;

		if (WARN_ON_ONCE(!kvm_tdp_mmu_get_root(root)))
			continue;

		root->role.invalid = true;
		tdp_mmu_schedule_zap_root(kvm, root);
	}
}
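
To make the re-queue scenario concrete, here is a rough userspace model of the
loop above. It is only a sketch and not kernel code: struct model_root, struct
model_vm and every model_*() helper are hypothetical stand-ins invented for
illustration, the "workqueue" is a plain array, and there is no locking or
concurrency at all. The one thing it tries to show is that a non-NULL
async_data pointer acts as the "was queued before" sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_QUEUED 8

/* Hypothetical stand-in for struct kvm_mmu_page; NOT the kernel type. */
struct model_root {
	int refcount;		/* 0 == root is already being torn down */
	bool invalid;		/* models role.invalid */
	void *async_data;	/* models tdp_mmu_async_data (the sentinel) */
};

/* Hypothetical stand-in for the per-VM zap workqueue. */
struct model_vm {
	struct model_root *queue[MODEL_MAX_QUEUED];
	size_t nr_queued;
};

/* Models kvm_tdp_mmu_get_root(): fails if the refcount already hit zero. */
static bool model_get_root(struct model_root *root)
{
	if (root->refcount == 0)
		return false;
	root->refcount++;
	return true;
}

/* Models tdp_mmu_schedule_zap_root(): set the sentinel, then queue. */
static void model_schedule_zap(struct model_vm *vm, struct model_root *root)
{
	assert(vm->nr_queued < MODEL_MAX_QUEUED);
	root->async_data = vm;
	vm->queue[vm->nr_queued++] = root;
}

/*
 * Models flushing the queue: every queued zap "runs" and drops the
 * reference that was taken when the root was queued.  The sentinel is
 * deliberately left set.  Returns the number of zaps that ran.
 */
static size_t model_flush_queue(struct model_vm *vm)
{
	size_t zapped = vm->nr_queued;

	for (size_t i = 0; i < vm->nr_queued; i++)
		vm->queue[i]->refcount--;
	vm->nr_queued = 0;
	return zapped;
}

/* Same control flow as the loop above, minus locking and WARNs. */
static void model_invalidate_all_roots(struct model_vm *vm,
				       struct model_root *roots,
				       size_t nr_roots)
{
	for (size_t i = 0; i < nr_roots; i++) {
		struct model_root *root = &roots[i];

		if (root->async_data)
			continue;	/* queued (and flushed) in the past */

		if (!model_get_root(root))
			continue;

		root->invalid = true;
		model_schedule_zap(vm, root);
	}
}
```

With one root whose only remaining reference is held by a sleeping vCPU, the
first model_invalidate_all_roots() + model_flush_queue() pair zaps it once. A
second invalidation before the vCPU puts its reference then skips that root
and queues only roots that have never been queued.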