From: Sean Christopherson <seanjc@google.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Vipin Sharma <vipinsh@google.com>,
pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches
Date: Wed, 18 Jan 2023 17:36:19 +0000 [thread overview]
Message-ID: <Y8guE+d22xf0RePz@google.com> (raw)
In-Reply-To: <CAL715W+UjRqqs4aJHoGDf+1c1OMHp8LhhSNgtjkGD5TbFVW_ng@mail.gmail.com>
On Tue, Jan 03, 2023, Mingwei Zhang wrote:
> On Tue, Jan 3, 2023 at 5:00 PM Vipin Sharma <vipinsh@google.com> wrote:
> > > I think the mmu_cache allocation and deallocation may force the use
> > > of GFP_ATOMIC (as observed by other reviewers as well). Adding a new
> > > lock definitely sounds like a plan, but I think it might affect
> > > performance. Alternatively, I am wondering if we could use an
> > > mmu_cache_sequence similar to mmu_notifier_seq to help avoid the
> > > concurrency issue?
> > >
> >
> > Can you explain more about the performance impact? Each vcpu will have
> > its own mutex, so the only contention will be with the mmu_shrinker. The
> > shrinker will use mutex_trylock(), which will not block waiting for
> > the lock; it will just move on to the next vcpu. While the shrinker is
> > holding the lock, the vcpu will be blocked in the page fault path, but I
> > think that should not have a huge impact considering the shrinker will
> > run rarely and hold the lock only briefly.
> >
> > > Similar to mmu_notifier_seq, mmu_cache_sequence should be protected by
> > > the mmu write lock. In the page fault path, each vcpu has to collect a
> > > snapshot of mmu_cache_sequence before calling into
> > > mmu_topup_memory_caches() and check the value again while holding the
> > > mmu lock. If the value is different, that means the mmu_shrinker has
> > > removed the cache objects, and the vcpu should retry.
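For reference, that snapshot-and-recheck idea could look roughly like this in userspace C (C11 atomics standing in for the kernel's locking; fault_begin()/fault_need_retry() are hypothetical names, not KVM functions):

```c
#include <stdatomic.h>

static atomic_ulong mmu_cache_seq;	/* bumped by the shrinker under the mmu write lock */
static int cache_objs;			/* stand-in for a vCPU's topped-up cache */

/* Shrinker side: purge the caches and bump the sequence so faulting vCPUs retry. */
static void shrinker_purge(void)
{
	cache_objs = 0;
	atomic_fetch_add(&mmu_cache_seq, 1);
}

/* vCPU fault path, stage 1: snapshot the sequence, then top up the caches. */
static unsigned long fault_begin(void)
{
	unsigned long seq = atomic_load(&mmu_cache_seq);

	cache_objs = 40;		/* stands in for mmu_topup_memory_caches() */
	return seq;
}

/* vCPU fault path, stage 2: recheck under the mmu lock; nonzero means retry. */
static int fault_need_retry(unsigned long seq)
{
	return seq != atomic_load(&mmu_cache_seq);
}
```

The cost of this scheme is an extra retry loop in the fault path plus the memory-ordering reasoning that comes with any lockless sequence counter.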
> > >
> >
> > Yeah, this can be one approach. I think it will come down to the
> > performance impact of using a mutex, which I don't think should be a
> > concern.
>
> hmm, I think you are right that there is no performance overhead from
> adding a mutex and letting the shrinker use mutex_trylock(). The
> point of using a sequence counter is to avoid the new lock, since
> introducing a new lock will increase the maintenance burden.
No, more locks don't necessarily mean higher maintenance cost. More complexity
definitely means more maintenance, but additional locks don't necessarily equate
to increased complexity.
Lockless algorithms are almost always more difficult to reason about, i.e. trying
to use a sequence counter for this case would be more complex than using a per-vCPU
mutex. The only complexity in adding another mutex is understanding why an additional
lock is necessary, and IMO that's fairly easy to explain/understand (the shrinker would
almost never succeed if it had to wait for vcpu->mutex to be dropped).
> So unless it is necessary, we probably should choose a simple solution first.
>
> In this case, I think we do have such a choice, since a similar
> mechanism has already been used by the mmu_notifiers.
The mmu_notifier case is very different: the invalidations affect the entire VM,
the notifiers _must_ succeed, they may or may not be allowed to sleep, the readers
(vCPUs) effectively need protection while running in the guest, and practically
speaking, holding a per-VM (or global) lock of any kind while a vCPU is running in
the guest is not viable, e.g. even holding kvm->srcu is disallowed.
In other words, using a traditional locking scheme to serialize guest accesses
with host-initiated page table (or memslot) updates is simply not an option.