public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Kevin Cheng <chengkev@google.com>
Cc: kvm@vger.kernel.org, jmattson@google.com, pbonzini@redhat.com
Subject: Re: [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits
Date: Tue, 18 Nov 2025 15:20:50 -0800	[thread overview]
Message-ID: <aRz_UkBPM7FcfmpP@google.com> (raw)
In-Reply-To: <20250820162951.3499017-1-chengkev@google.com>

On Wed, Aug 20, 2025, Kevin Cheng wrote:
> The nVMX tests already have coverage for TDP A/D bits. Add a
> similar test for nSVM to improve test parity between nSVM and nVMX.

Please write a more verbose changelog that explains _exactly_ what you intend to
test.  Partly because I don't entirely trust the EPT A/D tests, but mostly
because I think it will help you fully understand the double walks that need to
be done.  E.g. explain what all NPT entries will be touched if the guest writes
using a virtual address.

> Signed-off-by: Kevin Cheng <chengkev@google.com>
> ---
>  x86/svm.c     | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  x86/svm.h     |  5 +++
>  x86/svm_npt.c | 46 +++++++++++++++++++++++++

The {check,clear}_npt_ad() helpers belong in svm_npt.c, not svm.c

>  3 files changed, 144 insertions(+)
> 
> diff --git a/x86/svm.c b/x86/svm.c
> index e715e270..53b78d16 100644
> --- a/x86/svm.c
> +++ b/x86/svm.c
> @@ -14,6 +14,8 @@
>  #include "isr.h"
>  #include "apic.h"
>  
> +#include <asm/page.h>
> +
>  /* for the nested page table*/
>  u64 *pml4e;
>  
> @@ -43,6 +45,97 @@ u64 *npt_get_pml4e(void)
>  	return pml4e;
>  }
>  
> +void clear_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> +		  unsigned long guest_addr)

GPAs are of type phys_addr_t, not "unsigned long".  The tests pass because they're
64-bit only, but that's ugly and shouldn't be relied upon.

And pml4e, a.k.a. npt_get_pml4e(), is a u64 *.  Actually, pml4e isn't even used,
which makes me question whether or not this test does what it intended to do.

Ah, guest_addr is actually a _virtual_ address.  Please name it "gva" to remove
ambiguity (and to help clarify that guest_cr3 really is CR3, not nCR3).

> +{
> +	int l;
> +	unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> +	u64 *npt_pte, pte, offset_in_page;
> +	unsigned offset;

Please don't use bare "unsigned", and use reverse fir-tree.  "offset" can simply
be an "int" because we're 100% relying on it not to be greater than 511.  I.e.

	unsigned long *pt = (unsigned long *)guest_cr3, gpa;
	u64 *npt_pte, pte, offset_in_page;
	int l, offset;

> +
> +	for (l = PAGE_LEVEL; ; --l) {
> +		offset = PGDIR_OFFSET(guest_addr, l);
> +		npt_pte = npt_get_pte((u64) &pt[offset]);

This is retrieving a PTE from a non-NPT page table (pt == guest_cr3).

> +		if(!npt_pte) {
> +			printf("NPT - guest level %d page table is not mapped.\n", l);

Why printf instead of report_fail()?

> +			return;
> +		}
> +
> +		*npt_pte &= ~(PT_AD_MASK);

And so here you're clearing bits in the non-nested page tables.  The test passes
because check_npt_ad() also checks the non-nested page tables, but the test isn't
doing what it purports to do.

> +
> +		pte = pt[offset];
> +		if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> +			break;
> +		if (!(pte & PT_PRESENT_MASK))
> +			return;
> +		pt = (unsigned long *)(pte & PT_ADDR_MASK);
> +	}
> +
> +	offset = PGDIR_OFFSET(guest_addr, l);
> +	offset_in_page = guest_addr &  ((1 << PGDIR_BITS(l)) - 1);
> +	gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> +	npt_pte = npt_get_pte(gpa);

Ah, the nested NPT tables are checked, but only for the leaf PTE of the final GPA.
This fails to validate address.  This fails to validate Accessed bits on non-leaf
NPT entries, and fails to validate A/D bits on NPT entries used to translate GPAs
accessed while walking the guest GVA=>GPA page tables.

> +	*npt_pte &= ~(PT_AD_MASK);
> +}
> +
> +void check_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> +	unsigned long guest_addr, int expected_gpa_ad,
> +	int expected_pt_ad)

Align indentation.

> +{
> +	int l;
> +	unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> +	u64 *npt_pte, pte, offset_in_page;
> +	unsigned offset;
> +	bool bad_pt_ad = false;
> +
> +	for (l = PAGE_LEVEL; ; --l) {
> +		offset = PGDIR_OFFSET(guest_addr, l);
> +		npt_pte = npt_get_pte((u64) &pt[offset]);
> +
> +		if(!npt_pte) {
> +			printf("NPT - guest level %d page table is not mapped.\n", l);

Why printf instead of report_fail()?

> +			return;
> +		}
> +
> +		if (!bad_pt_ad) {
> +			bad_pt_ad |= (*npt_pte & PT_AD_MASK) != expected_pt_ad;
> +			if(bad_pt_ad)

Unnecessary nesting.  E.g. ignoring the below feedback, this could be:

		if (!bad_pt_ad && (*npt_pte & PT_AD_MASK) != expected_pt_ad) {
			bad_pt_ad = true
			report_fail(...)
		}

> +				report_fail("NPT - received guest level %d page table A=%d/D=%d",
> +					    l,
> +					    !!(expected_pt_ad & PT_ACCESSED_MASK),
> +					    !!(expected_pt_ad & PT_DIRTY_MASK));

Print both expected and actual, otherwise debugging is no fun.  Actually, looking
at the caller, passing in @expected_pt_ad is quite silly.  The only time A/D bits
are expected to be '0' are for the initial check, immediately after A/D bits are
zeroed.  That's honestly uninteresting, it's basically verifying KVM hasn't
randomly corrupted memory.  If someone _really_ wants to verify that A/D bits are
cleared, just read back immediately after writing.

Dropping @expected_pt_ad will make it easier to expand coverage, e.g. for reads
vs. writes, and of intermediate GPAs.

E.g. 

> +		}
> +
> +		pte = pt[offset];
> +		if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> +			break;
> +		if (!(pte & PT_PRESENT_MASK))
> +			return;
> +		pt = (unsigned long *)(pte & PT_ADDR_MASK);
> +	}
> +
> +	if (!bad_pt_ad)
> +		report_pass("NPT - guest page table structures A=%d/D=%d",
> +			    !!(expected_pt_ad & PT_ACCESSED_MASK),
> +			    !!(expected_pt_ad & PT_DIRTY_MASK));
> +
> +	offset = PGDIR_OFFSET(guest_addr, l);
> +	offset_in_page = guest_addr &  ((1 << PGDIR_BITS(l)) - 1);
> +	gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> +
> +	npt_pte = npt_get_pte(gpa);
> +
> +	if (!npt_pte) {
> +		report_fail("NPT - guest physical address is not mapped");
> +		return;
> +	}
> +	report((*npt_pte & PT_AD_MASK) == expected_gpa_ad,
> +	       "NPT - guest physical address A=%d/D=%d",
> +	       !!(expected_gpa_ad & PT_ACCESSED_MASK),
> +	       !!(expected_gpa_ad & PT_DIRTY_MASK));
> +}

  parent reply	other threads:[~2025-11-18 23:20 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-20 16:29 [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits Kevin Cheng
2025-09-11 19:07 ` Kevin Cheng
2025-11-18 23:20 ` Sean Christopherson [this message]
2025-12-19 23:02   ` Kevin Cheng
2025-12-19 23:01 ` Kevin Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aRz_UkBPM7FcfmpP@google.com \
    --to=seanjc@google.com \
    --cc=chengkev@google.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox