public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: David Matlack <dmatlack@google.com>,
	pbonzini@redhat.com, bgardon@google.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches
Date: Wed, 18 Jan 2023 17:43:46 +0000
Message-ID: <Y8gv0srYi+6PvJml@google.com>
In-Reply-To: <CAHVum0cZtDYZN2bD3TZgUNpcWiy2-Qkw1mb40syut_2kkR=Agg@mail.gmail.com>

@all, trim your replies!

On Tue, Jan 03, 2023, Vipin Sharma wrote:
> On Tue, Jan 3, 2023 at 10:01 AM Vipin Sharma <vipinsh@google.com> wrote:
> >
> > On Thu, Dec 29, 2022 at 1:55 PM David Matlack <dmatlack@google.com> wrote:
> > > > @@ -6646,66 +6690,49 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
> > > >  static unsigned long
> > > >  mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > >  {
> > > > -     struct kvm *kvm;
> > > > -     int nr_to_scan = sc->nr_to_scan;
> > > > +     struct kvm_mmu_memory_cache *cache;
> > > > +     struct kvm *kvm, *first_kvm = NULL;
> > > >       unsigned long freed = 0;
> > > > +     /* spinlock for memory cache */
> > > > +     spinlock_t *cache_lock;
> > > > +     struct kvm_vcpu *vcpu;
> > > > +     unsigned long i;
> > > >
> > > >       mutex_lock(&kvm_lock);
> > > >
> > > >       list_for_each_entry(kvm, &vm_list, vm_list) {
> > > > -             int idx;
> > > > -             LIST_HEAD(invalid_list);
> > > > -
> > > > -             /*
> > > > -              * Never scan more than sc->nr_to_scan VM instances.
> > > > -              * Will not hit this condition practically since we do not try
> > > > -              * to shrink more than one VM and it is very unlikely to see
> > > > -              * !n_used_mmu_pages so many times.
> > > > -              */
> > > > -             if (!nr_to_scan--)
> > > > +             if (first_kvm == kvm)
> > > >                       break;
> > > > -             /*
> > > > -              * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> > > > -              * here. We may skip a VM instance errorneosly, but we do not
> > > > -              * want to shrink a VM that only started to populate its MMU
> > > > -              * anyway.
> > > > -              */
> > > > -             if (!kvm->arch.n_used_mmu_pages &&
> > > > -                 !kvm_has_zapped_obsolete_pages(kvm))
> > > > -                     continue;
> > > > +             if (!first_kvm)
> > > > +                     first_kvm = kvm;
> > > > +             list_move_tail(&kvm->vm_list, &vm_list);
> > > >
> > > > -             idx = srcu_read_lock(&kvm->srcu);
> > > > -             write_lock(&kvm->mmu_lock);
> > > > +             kvm_for_each_vcpu(i, vcpu, kvm) {
> > >
> > > What protects this from racing with vCPU creation/deletion?
> > >
> 
> vCPU deletion:
> We take kvm_lock in mmu_shrink_scan(), and the same lock is taken in
> kvm_destroy_vm() to remove a VM from vm_list. So, once we are
> iterating vm_list we will not see any VM removal, which means no
> vCPU removal either.
> 
> I didn't find any other code for vCPU deletion except failures during
> VM and vCPU setup. A VM is only added to vm_list after successful
> creation.

Yep, KVM doesn't support destroying/freeing a vCPU after it's been added.
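To make that concrete, here is a minimal, hypothetical sketch of the teardown
side (the function names below are invented for illustration; the real logic
lives in kvm_destroy_vm()), showing why holding kvm_lock across the vm_list
walk pins every VM on the list:

  /*
   * Hypothetical sketch only: the VM is unlinked from vm_list while
   * holding kvm_lock, so a walker that holds kvm_lock for the whole
   * walk cannot race with the VM (or its vCPUs) being freed.
   */
  static void kvm_destroy_vm_sketch(struct kvm *kvm)
  {
          mutex_lock(&kvm_lock);
          list_del(&kvm->vm_list);   /* VM leaves vm_list under kvm_lock */
          mutex_unlock(&kvm_lock);

          /* vCPUs and MMU state are torn down only after the unlink above. */
          kvm_destroy_vcpus(kvm);
  }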

> vCPU creation:
> I think it will work.
> 
> kvm_vm_ioctl_create_vcpu() initializes the vCPU and adds it to
> kvm->vcpu_array, which is an xarray managed via RCU; only after that
> is online_vcpus incremented. So kvm_for_each_vcpu(), which uses RCU
> to read entries, will also see all of the vCPU's initialization if it
> sees the incremented online_vcpus value.

Yep.  The shrinker may race with a vCPU creation, e.g. not process a just-created
vCPU, but that's totally ok in this case since the shrinker path is best effort
(and purging the caches of a newly created vCPU is likely pointless).
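A simplified, hypothetical sketch of the ordering (the *_sketch() helpers are
invented; the real code is kvm_vm_ioctl_create_vcpu() and the
kvm_for_each_vcpu() macro, and the exact xarray call may differ):

  /* Creation side (sketch): publish the vCPU, then make it visible. */
  static void create_vcpu_sketch(struct kvm *kvm, struct kvm_vcpu *vcpu)
  {
          /* vcpu is fully initialized before it is published */
          xa_store(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);

          /* readers that see the new count also see the initialized vCPU */
          atomic_inc(&kvm->online_vcpus);
  }

  /* Reader side (sketch): kvm_for_each_vcpu() is bounded by online_vcpus. */
  static void purge_vcpu_caches_sketch(struct kvm *kvm)
  {
          struct kvm_vcpu *vcpu;
          unsigned long i;

          kvm_for_each_vcpu(i, vcpu, kvm) {
                  /* a vCPU published after online_vcpus was read is skipped */
          }
  }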

> @Sean, Paolo
> 
> Is the above explanation correct, i.e. is kvm_for_each_vcpu() safe without any lock?

Well, in this case, you do need to hold kvm_lock ;-)

But yes, iterating over vCPUs without holding the per-VM kvm->lock is safe; the
caller just needs to ensure the VM can't be destroyed, i.e. it either needs to
hold a reference to the VM or hold kvm_lock.
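Putting the two together, the pattern the shrinker relies on looks roughly
like the following sketch (the helper name is hypothetical; the actual
implementation is the patch quoted above):

  static void walk_all_vcpus_sketch(void)
  {
          struct kvm_vcpu *vcpu;
          struct kvm *kvm;
          unsigned long i;

          mutex_lock(&kvm_lock);
          list_for_each_entry(kvm, &vm_list, vm_list) {
                  /* kvm_lock pins the VM, so kvm->lock is not needed here */
                  kvm_for_each_vcpu(i, vcpu, kvm) {
                          /* best-effort purge of this vCPU's MMU caches */
                  }
          }
          mutex_unlock(&kvm_lock);
  }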

Thread overview: 46+ messages
2022-12-22  2:34 [Patch v3 0/9] NUMA aware page table's pages allocation Vipin Sharma
2022-12-22  2:34 ` [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches Vipin Sharma
2022-12-27 18:37   ` Ben Gardon
2022-12-28 22:07     ` Vipin Sharma
2022-12-29 21:15       ` David Matlack
2023-01-03 17:38         ` Vipin Sharma
2022-12-29 21:54   ` David Matlack
2023-01-03 18:01     ` Vipin Sharma
2023-01-04  0:25       ` Vipin Sharma
2023-01-18 17:43         ` Sean Christopherson [this message]
2023-01-03 19:32   ` Mingwei Zhang
2023-01-04  1:00     ` Vipin Sharma
2023-01-04  6:29       ` Mingwei Zhang
2023-01-04  6:57         ` Mingwei Zhang
2023-01-18 17:36         ` Sean Christopherson
2022-12-22  2:34 ` [Patch v3 2/9] KVM: x86/mmu: Remove zapped_obsolete_pages from struct kvm_arch{} Vipin Sharma
2022-12-29 21:59   ` David Matlack
2022-12-22  2:34 ` [Patch v3 3/9] KVM: x86/mmu: Shrink split_shadow_page_cache via KVM MMU shrinker Vipin Sharma
2022-12-22  2:34 ` [Patch v3 4/9] KVM: Add module param to make page tables NUMA aware Vipin Sharma
2022-12-29 22:05   ` David Matlack
2022-12-22  2:34 ` [Patch v3 5/9] KVM: x86/mmu: Allocate TDP page table's page on correct NUMA node on split Vipin Sharma
2022-12-27 19:02   ` Ben Gardon
2022-12-28 22:07     ` Vipin Sharma
2022-12-29 22:30   ` David Matlack
2023-01-03 18:26     ` Vipin Sharma
2022-12-22  2:34 ` [Patch v3 6/9] KVM: Provide NUMA node support to kvm_mmu_memory_cache{} Vipin Sharma
2022-12-27 19:09   ` Ben Gardon
2022-12-28 22:07     ` Vipin Sharma
2022-12-29 18:22       ` Ben Gardon
2023-01-03 17:36         ` Vipin Sharma
2022-12-29 23:08   ` David Matlack
2022-12-29 23:11     ` David Matlack
2023-01-03 18:45       ` Vipin Sharma
2023-01-03 18:55         ` David Matlack
2022-12-22  2:34 ` [Patch v3 7/9] KVM: x86/mmu: Allocate page table's pages on NUMA node of the underlying pages Vipin Sharma
2022-12-27 19:34   ` Ben Gardon
2022-12-28 22:08     ` Vipin Sharma
2022-12-29 18:20       ` Ben Gardon
2022-12-22  2:34 ` [Patch v3 8/9] KVM: x86/mmu: Make split_shadow_page_cache NUMA aware Vipin Sharma
2022-12-27 19:42   ` Ben Gardon
2022-12-28 22:08     ` Vipin Sharma
2022-12-29 23:18   ` David Matlack
2023-01-03 18:49     ` Vipin Sharma
2022-12-22  2:34 ` [Patch v3 9/9] KVM: x86/mmu: Reduce default cache size in KVM from 40 to PT64_ROOT_MAX_LEVEL Vipin Sharma
2022-12-27 19:52   ` Ben Gardon
2022-12-28 22:08     ` Vipin Sharma
