public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Lai Jiangshan <jiangshan.ljs@antgroup.com>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	 Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H. Peter Anvin" <hpa@zytor.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH 2/2] KVM: x86/mmu: KVM: x86/mmu: Skip unsync when large pages are allowed
Date: Thu, 12 Mar 2026 10:07:57 -0700
Message-ID: <abLy7cEDz7VlWtWS@google.com>
In-Reply-To: <20260123090304.32286-2-jiangshanlai@gmail.com>

On Fri, Jan 23, 2026, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> 
> Use the large-page metadata to avoid pointless attempts to search SP.
> 
> If the target GFN falls within a range where a large page is allowed,
> then there cannot be a shadow page for that GFN; a shadow page in the
> range would itself disallow using a large page. In that case, there
> is nothing to unsync and mmu_try_to_unsync_pages() can return
> immediately.
> 
> This is always true for TDP MMU without nested TDP,

I wouldn't expect this to be much of a performance optimization for this case
though, as kvm_get_mmu_page_hash() will return an empty list, i.e.
for_each_gfn_valid_sp_with_gptes() won't do meaningful work anyways.

> and holds for a significant fraction of cases with shadow paging even when
> all SPs are 4K.
> 
> For shadow paging, this optimization theoretically avoids work for about
> 1/e ~= 37% of GFNs, assuming one guest page table per 2M of memory and
> that each GPT falls randomly into the 2M memory buckets. In a simple
> test setup, it skipped unsync in a much higher percentage of cases,
> mainly because the guest buddy allocator clusters GPTs into fewer
> buckets.
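
For reference, the 1/e figure in the changelog can be sanity-checked with a
small userspace sketch.  It assumes, as the changelog does, that each GPT lands
independently and uniformly in one of the 2M buckets; the helper name
empty_bucket_fraction() is hypothetical and is not kernel code.

```c
#include <stddef.h>

/*
 * Probability that a given 2M bucket holds no guest page table when
 * `gpts` page tables each fall independently and uniformly into one of
 * `buckets` buckets, i.e. (1 - 1/buckets)^gpts.  Requires buckets > 0.
 *
 * With one GPT per 2M of memory (gpts == buckets), the result approaches
 * 1/e as the number of buckets grows.
 */
static double empty_bucket_fraction(size_t buckets, size_t gpts)
{
	double p = 1.0;
	size_t i;

	for (i = 0; i < gpts; i++)
		p *= 1.0 - 1.0 / (double)buckets;

	return p;
}
```

E.g. empty_bucket_fraction(4096, 4096) models an 8GiB guest with one GPT per
2M and lands between 0.367 and 0.368.  Clustering of GPTs, as observed in the
test setup, breaks the uniformity assumption and pushes the real number higher.
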
> 
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4535d2836004..555075fb63d9 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2932,6 +2932,14 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
>  	struct kvm_mmu_page *sp;
>  	bool locked = false;
>  
> +	/*
> +	 * If large page is allowed, there is no shadow page in the GFN range,
> +	 * because the presence of a shadow page in that range would prevent
> +	 * using a large page.
> +	 */
> +	if (!lpage_info_slot(gfn, slot, PG_LEVEL_2M)->disallow_lpage)
> +		return 0;

Hmm, I'd like to move this to after the write-tracking check, even though as
implemented in code today, the two are mutually exclusive.  Specifically, I don't
want to rely on KVM not supporting write-tracking at 2MiB granularity, and I also
want to avoid confusing readers.  E.g. a shallow read of account_shadowed() would
lead people to believe this code is wrong:

	/* the non-leaf shadow pages are keeping readonly. */
	if (sp->role.level > PG_LEVEL_4K)
		return __kvm_write_track_add_gfn(kvm, slot, gfn);

	kvm_mmu_gfn_disallow_lpage(slot, gfn);

if they didn't follow __kvm_write_track_add_gfn() to see:

	/*
	 * new track stops large page mapping for the
	 * tracked page.
	 */
	kvm_mmu_gfn_disallow_lpage(slot, gfn);

From a performance perspective, kvm_gfn_is_write_tracked() is O(1) time, and
should be very fast for the "pure" TDP MMU case, so I don't think that's a
concern.

This is what I have locally, please holler if you object to landing the code
after the write-tracked check.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 363967a17069..3d0e0c1b5332 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2940,6 +2940,15 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
        if (kvm_gfn_is_write_tracked(kvm, slot, gfn))
                return -EPERM;
 
+       /*
+        * Only 4KiB mappings can become unsync, and KVM disallows hugepages
+        * for unsync gfns.  Upper-level gPTEs (leaf or non-leaf) are always
+        * write-protected (see above), thus if the gfn can be mapped with a
+        * hugepage and isn't write-tracked, it can't be unsync.
+        */
+       if (!lpage_info_slot(gfn, slot, PG_LEVEL_2M)->disallow_lpage)
+               return 0;
+
        /*
         * The page is not write-tracked, mark existing shadow pages unsync
         * unless KVM is synchronizing an unsync SP.  In that case, KVM must


>  	/*
>  	 * Force write-protection if the page is being tracked.  Note, the page
>  	 * track machinery is used to write-protect upper-level shadow pages,
> -- 
> 2.19.1.6.gb485710b
> 
