From: Sean Christopherson <seanjc@google.com>
To: Kevin Cheng <chengkev@google.com>
Cc: kvm@vger.kernel.org, jmattson@google.com, pbonzini@redhat.com
Subject: Re: [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits
Date: Tue, 18 Nov 2025 15:20:50 -0800 [thread overview]
Message-ID: <aRz_UkBPM7FcfmpP@google.com> (raw)
In-Reply-To: <20250820162951.3499017-1-chengkev@google.com>
On Wed, Aug 20, 2025, Kevin Cheng wrote:
> The nVMX tests already have coverage for TDP A/D bits. Add a
> similar test for nSVM to improve test parity between nSVM and nVMX.
Please write a more verbose changelog that explains _exactly_ what you intend to
test. Partly because I don't entirely trust the EPT A/D tests, but mostly
because I think it will help you fully understand the double walks that need to
be done. E.g. explain what all NPT entries will be touched if the guest writes
using a virtual address.
> Signed-off-by: Kevin Cheng <chengkev@google.com>
> ---
> x86/svm.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
> x86/svm.h | 5 +++
> x86/svm_npt.c | 46 +++++++++++++++++++++++++
The {check,clear}_npt_ad() helpers belong in svm_npt.c, not svm.c
> 3 files changed, 144 insertions(+)
>
> diff --git a/x86/svm.c b/x86/svm.c
> index e715e270..53b78d16 100644
> --- a/x86/svm.c
> +++ b/x86/svm.c
> @@ -14,6 +14,8 @@
> #include "isr.h"
> #include "apic.h"
>
> +#include <asm/page.h>
> +
> /* for the nested page table*/
> u64 *pml4e;
>
> @@ -43,6 +45,97 @@ u64 *npt_get_pml4e(void)
> return pml4e;
> }
>
> +void clear_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> + unsigned long guest_addr)
GPAs are of type phys_addr_t, not "unsigned long". The tests pass because they're
64-bit only, but that's ugly and shouldn't be relied upon.
And pml4e, a.k.a. npt_get_pml4e(), is a u64 *. Actually, pml4e isn't even used,
which makes me question whether or not this test does what it intended to do.
Ah, guest_addr is actually a _virtual_ address. Please name it "gva" to remove
ambiguity (and to help clarify that guest_cr3 really is CR3, not nCR3).
> +{
> + int l;
> + unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> + u64 *npt_pte, pte, offset_in_page;
> + unsigned offset;
Please don't use bare "unsigned", and use reverse fir-tree. "offset" can simply
be an "int" because we're 100% relying on it not to be greater than 511. I.e.
unsigned long *pt = (unsigned long *)guest_cr3, gpa;
u64 *npt_pte, pte, offset_in_page;
int l, offset;
> +
> + for (l = PAGE_LEVEL; ; --l) {
> + offset = PGDIR_OFFSET(guest_addr, l);
> + npt_pte = npt_get_pte((u64) &pt[offset]);
This is retrieving a PTE from a non-NPT page table (pt == guest_cr3).
> + if(!npt_pte) {
> + printf("NPT - guest level %d page table is not mapped.\n", l);
Why printf instead of report_fail()?
> + return;
> + }
> +
> + *npt_pte &= ~(PT_AD_MASK);
And so here you're clearing bits in the non-nested page tables. The test passes
because check_npt_ad() also checks the non-nested page tables, but the test isn't
doing what it purports to do.
> +
> + pte = pt[offset];
> + if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> + break;
> + if (!(pte & PT_PRESENT_MASK))
> + return;
> + pt = (unsigned long *)(pte & PT_ADDR_MASK);
> + }
> +
> + offset = PGDIR_OFFSET(guest_addr, l);
> + offset_in_page = guest_addr & ((1 << PGDIR_BITS(l)) - 1);
> + gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> + npt_pte = npt_get_pte(gpa);
Ah, the nested NPT tables are checked, but only for the leaf PTE of the final GPA.
This fails to validate address. This fails to validate Accessed bits on non-leaf
NPT entries, and fails to validate A/D bits on NPT entries used to translate GPAs
accessed while walking the guest GVA=>GPA page tables.
> + *npt_pte &= ~(PT_AD_MASK);
> +}
> +
> +void check_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> + unsigned long guest_addr, int expected_gpa_ad,
> + int expected_pt_ad)
Align indentation.
> +{
> + int l;
> + unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> + u64 *npt_pte, pte, offset_in_page;
> + unsigned offset;
> + bool bad_pt_ad = false;
> +
> + for (l = PAGE_LEVEL; ; --l) {
> + offset = PGDIR_OFFSET(guest_addr, l);
> + npt_pte = npt_get_pte((u64) &pt[offset]);
> +
> + if(!npt_pte) {
> + printf("NPT - guest level %d page table is not mapped.\n", l);
Why printf instead of report_fail()?
> + return;
> + }
> +
> + if (!bad_pt_ad) {
> + bad_pt_ad |= (*npt_pte & PT_AD_MASK) != expected_pt_ad;
> + if(bad_pt_ad)
Unnecessary nesting. E.g. ignoring the below feedback, this could be:
if (!bad_pt_ad && (*npt_pte & PT_AD_MASK) != expected_pt_ad) {
bad_pt_ad = true
report_fail(...)
}
> + report_fail("NPT - received guest level %d page table A=%d/D=%d",
> + l,
> + !!(expected_pt_ad & PT_ACCESSED_MASK),
> + !!(expected_pt_ad & PT_DIRTY_MASK));
Print both expected and actual, otherwise debugging is no fun. Actually, looking
at the caller, passing in @expected_pt_ad is quite silly. The only time A/D bits
are expected to be '0' are for the initial check, immediately after A/D bits are
zeroed. That's honestly uninteresting, it's basically verifying KVM hasn't
randomly corrupted memory. If someone _really_ wants to verify that A/D bits are
cleared, just read back immediately after writing.
Dropping @expected_pt_ad will make it easier to expand coverage, e.g. for reads
vs. writes, and of intermediate GPAs.
E.g.
> + }
> +
> + pte = pt[offset];
> + if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> + break;
> + if (!(pte & PT_PRESENT_MASK))
> + return;
> + pt = (unsigned long *)(pte & PT_ADDR_MASK);
> + }
> +
> + if (!bad_pt_ad)
> + report_pass("NPT - guest page table structures A=%d/D=%d",
> + !!(expected_pt_ad & PT_ACCESSED_MASK),
> + !!(expected_pt_ad & PT_DIRTY_MASK));
> +
> + offset = PGDIR_OFFSET(guest_addr, l);
> + offset_in_page = guest_addr & ((1 << PGDIR_BITS(l)) - 1);
> + gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> +
> + npt_pte = npt_get_pte(gpa);
> +
> + if (!npt_pte) {
> + report_fail("NPT - guest physical address is not mapped");
> + return;
> + }
> + report((*npt_pte & PT_AD_MASK) == expected_gpa_ad,
> + "NPT - guest physical address A=%d/D=%d",
> + !!(expected_gpa_ad & PT_ACCESSED_MASK),
> + !!(expected_gpa_ad & PT_DIRTY_MASK));
> +}
next prev parent reply other threads:[~2025-11-18 23:20 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-20 16:29 [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits Kevin Cheng
2025-09-11 19:07 ` Kevin Cheng
2025-11-18 23:20 ` Sean Christopherson [this message]
2025-12-19 23:02 ` Kevin Cheng
2025-12-19 23:01 ` Kevin Cheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRz_UkBPM7FcfmpP@google.com \
--to=seanjc@google.com \
--cc=chengkev@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox