public inbox for linux-kernel@vger.kernel.org
From: David Matlack <dmatlack@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com,
	jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally
Date: Thu, 9 Mar 2023 16:55:58 -0800	[thread overview]
Message-ID: <ZAqAHiaCz0b2OKJF@google.com> (raw)
In-Reply-To: <CAHVum0eQzmLXDxMy3+LpmGxVU7YsT1wRNYkFq9o7sfR2uNK-xA@mail.gmail.com>

On Thu, Mar 09, 2023 at 04:28:10PM -0800, Vipin Sharma wrote:
> On Thu, Mar 9, 2023 at 3:53 PM David Matlack <dmatlack@google.com> wrote:
> >
> > On Mon, Mar 06, 2023 at 02:41:12PM -0800, Vipin Sharma wrote:
> > > Create a global counter for the total number of pages available
> > > in MMU page caches across all VMs. Add mmu_shadow_page_cache
> > > pages to this counter.
> >
> > I think I prefer counting the objects on-demand in mmu_shrink_count(),
> > instead of keeping track of the count. Keeping track of the count adds
> > complexity to the topup/alloc paths for the sole benefit of the
> > shrinker. I'd rather contain that complexity to the shrinker code unless
> > there is a compelling reason to optimize mmu_shrink_count().
> >
> > IIRC we discussed this at one point. Was there a reason to take this
> > approach that I'm just forgetting?
> 
> To count on demand, we first need to take kvm_lock and then, for
> each VM, iterate through each vCPU, take its cache lock, and sum the
> object counts in the caches. When NUMA support is introduced later in
> this series, we will have to iterate over even more caches. We
> can't/shouldn't use mutex_trylock() as it will not give an accurate
> picture, and by the time shrink_scan is called the count can be
> totally different.

Yeah good point. Hm, do we need to take the cache mutex to calculate the
count though? mmu_shrink_count() is inherently racy (something could get
freed or allocated in between count() and scan()). I don't think holding
the mutex buys us anything over just reading the count without the
mutex.

e.g.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index df8dcb7e5de7..c80a5c52f0ea 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6739,10 +6739,20 @@ static unsigned long mmu_shrink_scan(struct shrinker *shrink,
 static unsigned long mmu_shrink_count(struct shrinker *shrink,
                                      struct shrink_control *sc)
 {
-       s64 count = percpu_counter_sum(&kvm_total_unused_cached_pages);
+       struct kvm *kvm, *next_kvm;
+       unsigned long count = 0;

-       WARN_ON(count < 0);
-       return count <= 0 ? SHRINK_EMPTY : count;
+       mutex_lock(&kvm_lock);
+       list_for_each_entry_safe(kvm, next_kvm, &vm_list, vm_list) {
+               struct kvm_vcpu *vcpu;
+               unsigned long i;
+
+               kvm_for_each_vcpu(i, vcpu, kvm)
+                       count += READ_ONCE(vcpu->arch.mmu_shadow_page_cache.nobjs);
+       }
+       mutex_unlock(&kvm_lock);
+
+       return count == 0 ? SHRINK_EMPTY : count;

 }

Then the only concern is an additional acquire of kvm_lock. But it
should be fairly quick (quicker than mmu_shrink_scan()). If we can
tolerate the kvm_lock overhead of mmu_shrink_scan(), then we should be
able to tolerate some here.

> 
> The shrinker's count_objects() API comment says not to do any
> deadlock checks (I don't know what that means), and percpu_counter is
> very fast when adding/subtracting pages, so the overhead of using it
> to keep a global count is minimal. Since there is not much impact
> compared to the previous approach, we ended our discussion by keeping
> this per-CPU counter.

Yeah, it's just the code complexity of maintaining
kvm_total_unused_cached_pages that I'm hoping to avoid. We have to
create the counter, destroy it, and keep it up to date. Some
kvm_mmu_memory_caches have to update the counter, and others don't. It
just adds a lot of bookkeeping code that I'm not convinced is worth
it.


Thread overview: 65+ messages
2023-03-06 22:41 [Patch v4 00/18] NUMA aware page table allocation Vipin Sharma
2023-03-06 22:41 ` [Patch v4 01/18] KVM: x86/mmu: Change KVM mmu shrinker to no-op Vipin Sharma
2023-03-06 22:41 ` [Patch v4 02/18] KVM: x86/mmu: Remove zapped_obsolete_pages from struct kvm_arch{} Vipin Sharma
2023-03-06 22:41 ` [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally Vipin Sharma
2023-03-07 11:32   ` kernel test robot
2023-03-07 19:13     ` Vipin Sharma
2023-03-07 20:18       ` Sean Christopherson
2023-03-07 12:13   ` kernel test robot
2023-03-08 20:33   ` Zhi Wang
2023-03-08 22:16     ` Vipin Sharma
2023-03-09  5:18       ` Mingwei Zhang
2023-03-09 12:52         ` Zhi Wang
2023-03-09 19:52           ` Vipin Sharma
2023-03-09 15:37   ` Zhi Wang
2023-03-09 18:19     ` Vipin Sharma
2023-03-09 23:53   ` David Matlack
2023-03-10  0:28     ` Vipin Sharma
2023-03-10  0:55       ` David Matlack [this message]
2023-03-10  1:09         ` Vipin Sharma
2023-03-10  0:22   ` David Matlack
2023-03-10  0:36     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 04/18] KVM: x86/mmu: Shrink shadow page caches via MMU shrinker Vipin Sharma
2023-03-06 22:41 ` [Patch v4 05/18] KVM: x86/mmu: Add split_shadow_page_cache pages to global count of MMU cache pages Vipin Sharma
2023-03-09 15:58   ` Zhi Wang
2023-03-09 19:59     ` Vipin Sharma
2023-03-10  0:05       ` David Matlack
2023-03-10  0:06         ` David Matlack
2023-03-06 22:41 ` [Patch v4 06/18] KVM: x86/mmu: Shrink split_shadow_page_cache via MMU shrinker Vipin Sharma
2023-03-09 16:01   ` Zhi Wang
2023-03-09 19:59     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 07/18] KVM: x86/mmu: Unconditionally count allocations from MMU page caches Vipin Sharma
2023-03-09 16:03   ` Zhi Wang
2023-03-06 22:41 ` [Patch v4 08/18] KVM: x86/mmu: Track unused mmu_shadowed_info_cache pages count via global counter Vipin Sharma
2023-03-30  4:53   ` Yang, Weijiang
2023-04-03 23:02     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 09/18] KVM: x86/mmu: Shrink mmu_shadowed_info_cache via MMU shrinker Vipin Sharma
2023-03-06 22:41 ` [Patch v4 10/18] KVM: x86/mmu: Add per VM NUMA aware page table capability Vipin Sharma
2023-03-06 22:41 ` [Patch v4 11/18] KVM: x86/mmu: Add documentation of " Vipin Sharma
2023-03-23 21:59   ` David Matlack
2023-03-28 16:47     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 12/18] KVM: x86/mmu: Allocate NUMA aware page tables on TDP huge page splits Vipin Sharma
2023-03-23 22:15   ` David Matlack
2023-03-28 17:12     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 13/18] KVM: mmu: Add common initialization logic for struct kvm_mmu_memory_cache{} Vipin Sharma
2023-03-23 22:23   ` David Matlack
2023-03-28 17:16     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 14/18] KVM: mmu: Initialize kvm_mmu_memory_cache.gfp_zero to __GFP_ZERO by default Vipin Sharma
2023-03-23 22:28   ` David Matlack
2023-03-28 17:31     ` Vipin Sharma
2023-03-28 23:13       ` David Matlack
2023-03-06 22:41 ` [Patch v4 15/18] KVM: mmu: Add NUMA node support in struct kvm_mmu_memory_cache{} Vipin Sharma
2023-03-23 22:30   ` David Matlack
2023-03-28 17:50     ` Vipin Sharma
2023-03-28 23:24       ` David Matlack
2023-04-03 22:57         ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 16/18] KVM: x86/mmu: Allocate numa aware page tables during page fault Vipin Sharma
2023-03-29  0:21   ` David Matlack
2023-03-29  0:28     ` David Matlack
2023-03-29 19:03     ` David Matlack
2023-04-03 22:54       ` Vipin Sharma
2023-04-03 22:50     ` Vipin Sharma
2023-03-06 22:41 ` [Patch v4 17/18] KVM: x86/mmu: Allocate shadow mmu page table on huge page split on the same NUMA node Vipin Sharma
2023-03-06 22:41 ` [Patch v4 18/18] KVM: x86/mmu: Reduce default mmu memory cache size Vipin Sharma
2023-03-07 18:19 ` [Patch v4 00/18] NUMA aware page table allocation Mingwei Zhang
2023-03-07 18:33   ` Vipin Sharma
