From: Sean Christopherson <seanjc@google.com>
To: Peter Xu <peterx@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] KVM: x86/mmu: Shrink pte_list_desc size when KVM is using TDP
Date: Thu, 14 Jul 2022 18:43:50 +0000
Message-ID: <YtBj5noXqagqYBVs@google.com>
In-Reply-To: <Ys4Qx1RxmWrtQ8it@xz-m1.local>

On Tue, Jul 12, 2022, Peter Xu wrote:
> On Tue, Jul 12, 2022 at 10:53:48PM +0000, Sean Christopherson wrote:
> > On Tue, Jul 12, 2022, Peter Xu wrote:
> > > On Fri, Jun 24, 2022 at 11:27:34PM +0000, Sean Christopherson wrote:
> > > Sorry to start by asking questions, but if we know that pte_list_desc is
> > > probably not going to be used, could we simply skip the cache layer as a
> > > whole?  IOW, instead of making the "array size of pte list desc" dynamic,
> > > we make the whole "pte list desc cache layer" dynamic.  Is that possible?
> >
> > Not really?  It's theoretically possible, but it'd require pre-checking that
> > there aren't aliases, and to do that race free we'd have to do it under
> > mmu_lock, which means having to support bailing from the page fault to topup
> > the cache.  The memory overhead for the cache isn't so significant that it's
> > worth that level of complexity.
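
To illustrate the constraint (rough sketch only, not verbatim kernel code, and
the function names below are made up): the cache is filled before mmu_lock is
taken, and rmap insertion pops from it with the lock held.  Skipping the topup
based on an unlocked "no aliases" check would mean detecting the alias under
mmu_lock, dropping the lock, topping up, and retrying the fault.

  static int fault_prepare_sketch(struct kvm_vcpu *vcpu)
  {
  	/* May sleep: runs before mmu_lock is taken. */
  	return kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache,
  					  1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
  }

  static void rmap_insert_sketch(struct kvm_vcpu *vcpu, u64 *spte)
  {
  	struct pte_list_desc *desc;

  	lockdep_assert_held_write(&vcpu->kvm->mmu_lock);

  	/* No sleeping here; pop a prefilled object instead of allocating. */
  	desc = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_pte_list_desc_cache);

  	/* Grossly simplified: the real code only needs a desc for the 2nd+ sptes. */
  	desc->spte_count = 1;
  	desc->sptes[0] = spte;
  }
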
>
> Ah, okay..
>
> So the other question is that I'm curious how much this extra complexity can
> fundamentally help us save space.
>
> The thing is, IIUC SLUB works in page-sized slabs, so each slab cache eats at
> least one page (4096 bytes) anyway.  In our case, if 40 objects were allocated
> with the 14-entry array, are you sure it'll still be 40 objects, just smaller
> ones?

Definitely not 100% positive.
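
Back-of-the-envelope, and treat both layouts below as assumptions rather than
measurements (the shrunk TDP array size in particular is just a placeholder):

  /* Roughly today's layout (with patch 1 of this series), PTE_LIST_EXT == 14: */
  struct pte_list_desc {
  	struct pte_list_desc *more;	/*   8 bytes on x86-64 */
  	unsigned long spte_count;	/*   8 bytes */
  	u64 *sptes[14];			/* 112 bytes */
  };	/* 128 bytes total => 32 objects per 4KiB slab page */

  /* Hypothetical shrunk variant for tdp=1: */
  struct pte_list_desc_tdp {
  	struct pte_list_desc_tdp *more;	/*  8 bytes */
  	unsigned long spte_count;	/*  8 bytes */
  	u64 *sptes[2];			/* 16 bytes */
  };	/* 32 bytes total => 128 objects per 4KiB slab page */

So the per-vCPU topup count (the 40 objects) wouldn't change, but in the best
case those 40 objects drop from ~5KiB to ~1.25KiB of slab memory, and each 4KiB
slab page serves four times as many descriptors across vCPUs.  Whether SLUB
actually packs them that tightly is exactly what I need to measure.
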
> I'd have thought that after the change each object is smaller, but SLUB could
> end up caching more objects, since the minimum slab size is 4k on x86.
>
> I don't remember the details of the eager split work on having per-vcpu

The eager split logic uses a single per-VM cache, but it's large (513 entries).

> caches, but I'm also wondering, if we can't drop the whole cache layer,
> whether we can selectively use SLUB in this case; then we could cache much
> less, assuming we'll also use much less.
>
> Currently:
>
>     r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache,
>                                    1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
>
> We could have the pte list desc cache layer be managed manually (e.g. using
> kmalloc()?) for tdp=1; then we'd at least be in control of how many objects
> we cache.  And with a limited number of objects, the wasted memory is much
> reduced too.

I suspect that, without implementing something that looks an awful lot like the
kmem caches, manually handling allocations would degrade performance for shadow
paging and nested MMUs.
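
E.g. even a bare-bones manual version ends up being a prefilled array plus an
index, i.e. a rough re-implementation of the existing cache (sketch only, the
names below are made up):

  /* Hypothetical manual pool; effectively a stripped-down memory cache. */
  struct pte_list_desc_pool {
  	int nobjs;
  	struct pte_list_desc *objs[PTE_PREFETCH_NUM + 1];
  };

  static int pool_topup(struct pte_list_desc_pool *pool)
  {
  	/* Must run before mmu_lock is taken; kmalloc() may sleep. */
  	while (pool->nobjs < ARRAY_SIZE(pool->objs)) {
  		struct pte_list_desc *d = kmalloc(sizeof(*d), GFP_KERNEL_ACCOUNT);

  		if (!d)
  			return -ENOMEM;
  		pool->objs[pool->nobjs++] = d;
  	}
  	return 0;
  }

  static struct pte_list_desc *pool_alloc(struct pte_list_desc_pool *pool)
  {
  	/* Called with mmu_lock held, so no allocation allowed here. */
  	BUG_ON(!pool->nobjs);
  	return pool->objs[--pool->nobjs];
  }

At that point it seems simpler to keep the common kvm_mmu_memory_cache code and
just shrink the object itself.
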
> I think I'm fine with the current approach too, but only if it really helps
> reduce the memory footprint as we expect.

Yeah, I'll get numbers before sending v2 (which will be quite some time at this
point).

Thread overview: 14+ messages
2022-06-24 23:27 [PATCH 0/4] KVM: x86/mmu: pte_list_desc fix and cleanups Sean Christopherson
2022-06-24 23:27 ` [PATCH 1/4] KVM: x86/mmu: Track the number entries in a pte_list_desc with a ulong Sean Christopherson
2022-06-24 23:27 ` [PATCH 2/4] KVM: x86/mmu: Defer "full" MMU setup until after vendor hardware_setup() Sean Christopherson
2022-06-25  0:16   ` David Matlack
2022-06-27 15:40     ` Sean Christopherson
2022-06-27 22:50       ` David Matlack
2022-07-12 21:56   ` Peter Xu
2022-07-14 18:23     ` Sean Christopherson
2022-06-24 23:27 ` [PATCH 3/4] KVM: x86/mmu: Shrink pte_list_desc size when KVM is using TDP Sean Christopherson
2022-07-12 22:35   ` Peter Xu
2022-07-12 22:53     ` Sean Christopherson
2022-07-13  0:24       ` Peter Xu
2022-07-14 18:43         ` Sean Christopherson [this message]
2022-06-24 23:27 ` [PATCH 4/4] KVM: x86/mmu: Topup pte_list_desc cache iff VM is using rmaps Sean Christopherson