Linux Kernel Selftest development
From: Sean Christopherson <seanjc@google.com>
To: Fred Griffoul <griffoul@gmail.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, vkuznets@redhat.com,
	shuah@kernel.org, dwmw@amazon.co.uk,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Fred Griffoul <fgriffo@amazon.co.uk>
Subject: Re: [PATCH v4 08/10] KVM: x86: Add nested context management
Date: Mon, 11 May 2026 17:13:15 -0700	[thread overview]
Message-ID: <agJwm8Vog6cSkNna@google.com> (raw)
In-Reply-To: <20260102142429.896101-9-griffoul@gmail.com>

On Fri, Jan 02, 2026, Fred Griffoul wrote:
> From: Fred Griffoul <fgriffo@amazon.co.uk>
> 
> Add infrastructure to persist nested virtualization state when L2 vCPUs

Please be more transparent with what exactly is being persisted.

> are switched on an L1 vCPU or migrated between L1 vCPUs.
> 
> The nested context table uses a hash table for fast lookup by nested
> control block GPA (VMPTR for VMX, VMCB for SVM) and maintains a free
> list for context management.
> 
> The kvm_nested_context_load() function searches for a context indexed by
> the target GPA; if not found, it allocates a new context up to the
> configured maximum. If at capacity, it recycles the oldest context from
> the free list.
> 
> The oversubscription limit is hardcoded at 8 L2 vCPUs per L1 vCPU.
> 
> The kvm_nested_context_clear() function moves the context to the free
> list while keeping it in the hash table for potential reuse.
> 
> This allows nested hypervisors to multiplex multiple L2 vCPUs on L1
> vCPUs without losing cached nested state, significantly improving
> performance for workloads with frequent L2 context switches.
> 
> This patch adds the basic infrastructure. Subsequent patches will add
> the nested VMX and SVM specific support to populate and utilize the
> cached nested state.
> 
> Signed-off-by: Fred Griffoul <fgriffo@amazon.co.uk>
> ---
>  arch/x86/include/asm/kvm_host.h |  31 +++++
>  arch/x86/include/uapi/asm/kvm.h |   2 +
>  arch/x86/kvm/Makefile           |   2 +-
>  arch/x86/kvm/nested.c           | 199 ++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c              |   5 +-
>  5 files changed, 237 insertions(+), 2 deletions(-)

Please provide concrete performance numbers.  They need to be isolated from the
switch to gpcs, and need to show how much benefit is provided for a per-VM hash
table vs. (much) simpler approaches, e.g. versus a stupid simple per-vCPU LRU
cache, a la KVM's pgd caching.

There also needs to be an analysis of the downsides of the performance gains.
If I'm putting the pieces together correctly, quoting a snippet from the cover
letter, the performance benefits come from:

  The pfncache infrastructure maintains persistent mappings as long as the
  page GPA does not change, eliminating the memremap/memunmap overhead on
  every VM entry/exit cycle. 

Which means that this caching effectively eliminates the security value added by
removing memory from the kernel's direct map.  If, in the long term, we're
collectively moving towards guest_memfd (for setups that don't want all of the
overcommit goodness provided by mm/), then the performance gained by this approach
is directly at odds with the efforts to remove guest_memfd memory from the direct
map for added security.

E.g. if the ratio of L2:L1 contexts is pushed high enough, it would be possible
to have the majority of guest memory mapped into the host kernel.

That then raises the question of whether or not we are optimizing the right thing.
E.g. if we can somehow make map+unmap blazing fast for "all" real world usage that
matters, then maybe we don't need this type of caching.

In general, this needs a _lot_ more justification on the design decisions.  A lot,
a lot, a _lot_ more.  This is too much code and complexity for me to even start
reviewing without hard data.


Thread overview: 16+ messages
2026-01-02 14:24 [PATCH v4 00/10] KVM: nVMX: Improve performance for unmanaged guest memory Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 01/10] KVM: nVMX: Implement cache for L1 MSR bitmap Fred Griffoul
2026-05-11 23:08   ` Sean Christopherson
2026-01-02 14:24 ` [PATCH v4 02/10] KVM: pfncache: Restore guest-uses-pfn support Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 03/10] KVM: x86: Add nested state validation for pfncache support Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 04/10] KVM: nVMX: Implement cache for L1 APIC pages Fred Griffoul
2026-05-11 23:35   ` Sean Christopherson
2026-01-02 14:24 ` [PATCH v4 05/10] KVM: selftests: Add nested VMX APIC cache invalidation test Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 06/10] KVM: nVMX: Cache evmcs fields to ensure consistency during VM-entry Fred Griffoul
2026-01-02 15:40   ` Vitaly Kuznetsov
2026-01-02 14:24 ` [PATCH v4 07/10] KVM: nVMX: Replace evmcs kvm_host_map with pfncache Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 08/10] KVM: x86: Add nested context management Fred Griffoul
2026-05-12  0:13   ` Sean Christopherson [this message]
2026-01-02 14:24 ` [PATCH v4 09/10] KVM: nVMX: Use nested context for pfncache persistence Fred Griffoul
2026-01-02 14:24 ` [PATCH v4 10/10] KVM: selftests: Add L2 vcpu context switch test Fred Griffoul
2026-05-11 23:56 ` [PATCH v4 00/10] KVM: nVMX: Improve performance for unmanaged guest memory Sean Christopherson
