From: David Matlack <dmatlack@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com,
	jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally
Date: Thu, 9 Mar 2023 16:55:58 -0800
Message-ID: <ZAqAHiaCz0b2OKJF@google.com>
In-Reply-To: <CAHVum0eQzmLXDxMy3+LpmGxVU7YsT1wRNYkFq9o7sfR2uNK-xA@mail.gmail.com>

On Thu, Mar 09, 2023 at 04:28:10PM -0800, Vipin Sharma wrote:
> On Thu, Mar 9, 2023 at 3:53 PM David Matlack <dmatlack@google.com> wrote:
> >
> > On Mon, Mar 06, 2023 at 02:41:12PM -0800, Vipin Sharma wrote:
> > > Create a global counter for total number of pages available
> > > in MMU page caches across all VMs. Add mmu_shadow_page_cache
> > > pages to this counter.
> >
> > I think I prefer counting the objects on-demand in mmu_shrink_count(),
> > instead of keeping track of the count. Keeping track of the count adds
> > complexity to the topup/alloc paths for the sole benefit of the
> > shrinker. I'd rather contain that complexity to the shrinker code unless
> > there is a compelling reason to optimize mmu_shrink_count().
> >
> > IIRC we discussed this at one point. Was there a reason to take this
> > approach that I'm just forgetting?
> 
> To count on demand, we first need to lock on kvm_lock and then for
> each VMs iterate through each vCPU, take a lock, and sum the objects
> count in caches. When the NUMA support will be introduced in this
> series then it means we have to iterate even more caches. We
> can't/shouldn't use mutex_trylock() as it will not give the correct
> picture and when shrink_scan is called count can be totally different.

Yeah good point. Hm, do we need to take the cache mutex to calculate the
count though? mmu_shrink_count() is inherently racy (something could get
freed or allocated in between count() and scan()). I don't think holding
the mutex buys us anything over just reading the count without the
mutex.

e.g.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index df8dcb7e5de7..c80a5c52f0ea 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6739,10 +6739,20 @@ static unsigned long mmu_shrink_scan(struct shrinker *shrink,
 static unsigned long mmu_shrink_count(struct shrinker *shrink,
                                      struct shrink_control *sc)
 {
-       s64 count = percpu_counter_sum(&kvm_total_unused_cached_pages);
+       struct kvm *kvm, *next_kvm;
+       unsigned long count = 0;

-       WARN_ON(count < 0);
-       return count <= 0 ? SHRINK_EMPTY : count;
+       mutex_lock(&kvm_lock);
+       list_for_each_entry_safe(kvm, next_kvm, &vm_list, vm_list) {
+               struct kvm_vcpu *vcpu;
+               unsigned long i;
+
+               kvm_for_each_vcpu(i, vcpu, kvm)
+                       count += READ_ONCE(vcpu->arch.mmu_shadow_page_cache.nobjs);
+       }
+       mutex_unlock(&kvm_lock);
+
+       return count == 0 ? SHRINK_EMPTY : count;

 }

Then the only concern is an additional acquire of kvm_lock. But it
should be fairly quick (quicker than mmu_shrink_scan()). If we can
tolerate the kvm_lock overhead of mmu_shrink_scan(), then we should be
able to tolerate some here.
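
For anyone who wants to poke at the idea outside the kernel, here is a
rough userspace model of the on-demand count. It is only a sketch under
assumptions: every model_* name is made up for illustration, a pthread
mutex stands in for kvm_lock, a relaxed atomic load stands in for
READ_ONCE(), and it returns 0 where the kernel version would return
SHRINK_EMPTY.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_VCPUS 4

/* Hypothetical stand-in for one vCPU's mmu_shadow_page_cache. */
struct model_cache {
	atomic_int nobjs;	/* written by the owning vCPU, read locklessly here */
};

/* Hypothetical stand-in for struct kvm, linked into a global VM list. */
struct model_vm {
	struct model_cache vcpu_cache[MODEL_MAX_VCPUS];
	size_t nr_vcpus;
	struct model_vm *next;
};

/* Stand-ins for kvm_lock and vm_list. */
static pthread_mutex_t model_vm_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_vm *model_vm_list;

/* Register a VM with empty per-vCPU caches. */
static void model_vm_add(struct model_vm *vm, size_t nr_vcpus)
{
	assert(nr_vcpus <= MODEL_MAX_VCPUS);

	vm->nr_vcpus = nr_vcpus;
	for (size_t i = 0; i < nr_vcpus; i++)
		atomic_init(&vm->vcpu_cache[i].nobjs, 0);

	pthread_mutex_lock(&model_vm_lock);
	vm->next = model_vm_list;
	model_vm_list = vm;
	pthread_mutex_unlock(&model_vm_lock);
}

/*
 * Sum the cached object counts on demand. Only the list lock is held;
 * the per-cache counts are sampled without the cache's own lock, so the
 * result is a snapshot that may already be stale when it is returned.
 */
static unsigned long model_shrink_count(void)
{
	unsigned long count = 0;

	pthread_mutex_lock(&model_vm_lock);
	for (struct model_vm *vm = model_vm_list; vm; vm = vm->next)
		for (size_t i = 0; i < vm->nr_vcpus; i++)
			count += (unsigned long)atomic_load_explicit(
					&vm->vcpu_cache[i].nobjs,
					memory_order_relaxed);
	pthread_mutex_unlock(&model_vm_lock);

	return count;
}
```

The point of the model is that the topup/alloc side needs no extra
bookkeeping at all: it only ever touches its own nobjs, and the counting
cost lands entirely on the caller of model_shrink_count().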

> 
> scan_count() API comment says to not do any deadlock check (I don't
> know what does that mean) and percpu_counter is very fast when we are
> adding/subtracting pages so the effect of using it to keep global
> count is very minimal. Since, there is not much impact to using
> percpu_count compared to previous one, we ended our discussion with
> keeping this per cpu counter.

Yeah, it's just the code complexity of maintaining
kvm_total_unused_cached_pages that I'm hoping to avoid. We have to
create the counter, destroy it, and keep it up to date. Some
kvm_mmu_memory_caches have to update the counter, and others don't. It
just adds a lot of bookkeeping code that I'm not convinced is worth
it.

