From: Peter Xu <peterx@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] KVM: x86/mmu: Shrink pte_list_desc size when KVM is using TDP
Date: Tue, 12 Jul 2022 18:35:50 -0400
Message-ID: <Ys33RtxeDz0egEM0@xz-m1.local>
In-Reply-To: <20220624232735.3090056-4-seanjc@google.com>
On Fri, Jun 24, 2022 at 11:27:34PM +0000, Sean Christopherson wrote:
> Dynamically size struct pte_list_desc's array of sptes based on whether
> or not KVM is using TDP. Commit dc1cff969101 ("KVM: X86: MMU: Tune
> PTE_LIST_EXT to be bigger") bumped the number of entries in order to
> improve performance when using shadow paging, but its analysis that the
> larger size would not affect TDP was wrong. Consuming pte_list_desc
> objects for nested TDP is indeed rare, but _allocating_ objects is not,
> as KVM allocates 40 objects for each per-vCPU cache. Reducing the size
> from 128 bytes to 32 bytes reduces that per-vCPU cost from 5120 bytes to
> 1280, and also provides similar savings when eager page splitting for
> nested MMUs kicks in.
>
> The per-vCPU overhead could be further reduced by using a custom, smaller
> capacity for the per-vCPU caches, but that's more of an "and" than
> an "or" change, e.g. it wouldn't help the eager page split use case.
>
> Set the list size to the bare minimum without completely defeating the
> purpose of an array (and because pte_list_add() assumes the array is at
> least two entries deep). A larger size, e.g. 4, would reduce the number
> of "allocations", but those "allocations" only become allocations in
> truth if a single vCPU depletes its cache to where a topup is needed,
> i.e. if a single vCPU "allocates" 30+ lists. Conversely, those 2 extra
> entries consume 16 bytes * 40 * nr_vcpus in the caches the instant nested
> TDP is used.
>
> In the unlikely event that performance of aliased gfns for nested TDP
> really is (or becomes) a priority for oddball workloads, KVM could add a
> knob to let the admin tune the array size for their environment.
>
> Note, KVM also unnecessarily tops up the per-vCPU caches even when not
> using rmaps; this can also be addressed separately.
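For reference, the size figures in the quoted description can be re-derived with a small userspace sketch. It assumes a 64-bit host and a descriptor made of one 8-byte "next" pointer, one 8-byte entry count, and an array of 8-byte spte pointers. The old array size of 14 is inferred from the 128-byte figure, not stated in the description, and all names below are hypothetical rather than KVM's own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed layout on a 64-bit host (not taken from the patch itself):
 * one 8-byte "next" pointer, one 8-byte entry count, then N 8-byte
 * spte pointers.
 */
#define DESC_HEADER_BYTES 16
#define SPTE_PTR_BYTES     8
#define CACHE_CAPACITY    40  /* objects per per-vCPU cache, per the description */

/* Size of one descriptor holding nr_sptes entries. */
static size_t desc_bytes(size_t nr_sptes)
{
	return DESC_HEADER_BYTES + nr_sptes * SPTE_PTR_BYTES;
}

/* Memory held by one fully topped-up per-vCPU cache. */
static size_t cache_bytes(size_t nr_sptes)
{
	return desc_bytes(nr_sptes) * CACHE_CAPACITY;
}

/* Re-derive the figures quoted in the patch description. */
static void check_quoted_figures(void)
{
	assert(desc_bytes(14) == 128);   /* assumed old array size of 14 */
	assert(desc_bytes(2) == 32);     /* new minimum of two entries */
	assert(cache_bytes(14) == 5120);
	assert(cache_bytes(2) == 1280);
	/* Going from 2 to 4 entries costs 16 bytes * 40 per vCPU. */
	assert(cache_bytes(4) - cache_bytes(2) == 16 * CACHE_CAPACITY);
}
```

Calling check_quoted_figures() from a test harness passes silently if the arithmetic in the description holds under these assumptions.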

The only way pte_list_desc can be used when tdp=1 is if the hypervisor
maps the same host pages at different GPAs, right? And we don't really
have a real use case for that, or.. do we?

Sorry to start with questions, but if we know that pte_list_desc is
probably not going to be used, could we simply skip the cache layer as a
whole? IOW, rather than making the "array size of pte list desc" dynamic,
we'd make the whole "pte list desc cache layer" dynamic. Is that possible?

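To make the question concrete, here is a minimal userspace sketch of what a "dynamic cache layer" could look like: the cache is only topped up when the caller says descriptors may actually be needed. This is an illustration under stated assumptions, not KVM code; obj_cache, cache_topup(), cache_topup_if_needed() and the descs_possible flag are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_CAPACITY 40  /* per-vCPU capacity quoted in the patch description */

/* Hypothetical stand-in for a per-vCPU object cache; not KVM's real type. */
struct obj_cache {
	size_t obj_size;
	int nobjs;
	void *objs[CACHE_CAPACITY];
};

/* Fill the cache to capacity; returns 0 on success, -1 if allocation fails. */
static int cache_topup(struct obj_cache *c)
{
	assert(c->nobjs >= 0 && c->nobjs <= CACHE_CAPACITY);

	while (c->nobjs < CACHE_CAPACITY) {
		void *obj = calloc(1, c->obj_size);

		if (!obj)
			return -1;
		c->objs[c->nobjs++] = obj;
	}
	return 0;
}

/* Skip the topup entirely when descriptors cannot be consumed. */
static int cache_topup_if_needed(struct obj_cache *c, bool descs_possible)
{
	if (!descs_possible)
		return 0;
	return cache_topup(c);
}

/* Release every object still held by the cache. */
static void cache_free(struct obj_cache *c)
{
	while (c->nobjs > 0)
		free(c->objs[--c->nobjs]);
}
```

With descs_possible false the cache stays empty and costs nothing beyond the struct itself; whether such a condition can be determined cheaply and safely inside KVM is exactly the open question.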
--
Peter Xu