From: Sean Christopherson <seanjc@google.com>
To: Kevin Cheng <chengkev@google.com>
Cc: kvm@vger.kernel.org, jmattson@google.com, pbonzini@redhat.com
Subject: Re: [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits
Date: Tue, 18 Nov 2025 15:20:50 -0800 [thread overview]
Message-ID: <aRz_UkBPM7FcfmpP@google.com> (raw)
In-Reply-To: <20250820162951.3499017-1-chengkev@google.com>
On Wed, Aug 20, 2025, Kevin Cheng wrote:
> The nVMX tests already have coverage for TDP A/D bits. Add a
> similar test for nSVM to improve test parity between nSVM and nVMX.
Please write a more verbose changelog that explains _exactly_ what you intend to
test. Partly because I don't entirely trust the EPT A/D tests, but mostly
because I think it will help you fully understand the double walks that need to
be done. E.g. explain what all NPT entries will be touched if the guest writes
using a virtual address.
> Signed-off-by: Kevin Cheng <chengkev@google.com>
> ---
> x86/svm.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
> x86/svm.h | 5 +++
> x86/svm_npt.c | 46 +++++++++++++++++++++++++
The {check,clear}_npt_ad() helpers belong in svm_npt.c, not svm.c
> 3 files changed, 144 insertions(+)
>
> diff --git a/x86/svm.c b/x86/svm.c
> index e715e270..53b78d16 100644
> --- a/x86/svm.c
> +++ b/x86/svm.c
> @@ -14,6 +14,8 @@
> #include "isr.h"
> #include "apic.h"
>
> +#include <asm/page.h>
> +
> /* for the nested page table*/
> u64 *pml4e;
>
> @@ -43,6 +45,97 @@ u64 *npt_get_pml4e(void)
> return pml4e;
> }
>
> +void clear_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> + unsigned long guest_addr)
GPAs are of type phys_addr_t, not "unsigned long". The tests pass because they're
64-bit only, but that's ugly and shouldn't be relied upon.
And pml4e, a.k.a. npt_get_pml4e(), is a u64 *. Actually, pml4e isn't even used,
which makes me question whether or not this test does what it intended to do.
Ah, guest_addr is actually a _virtual_ address. Please name it "gva" to remove
ambiguity (and to help clarify that guest_cr3 really is CR3, not nCR3).
> +{
> + int l;
> + unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> + u64 *npt_pte, pte, offset_in_page;
> + unsigned offset;
Please don't use bare "unsigned", and use reverse fir-tree. "offset" can simply
be an "int" because we're 100% relying on it not to be greater than 511. I.e.
unsigned long *pt = (unsigned long *)guest_cr3, gpa;
u64 *npt_pte, pte, offset_in_page;
int l, offset;
> +
> + for (l = PAGE_LEVEL; ; --l) {
> + offset = PGDIR_OFFSET(guest_addr, l);
> + npt_pte = npt_get_pte((u64) &pt[offset]);
This is retrieving a PTE from a non-NPT page table (pt == guest_cr3).
> + if(!npt_pte) {
> + printf("NPT - guest level %d page table is not mapped.\n", l);
Why printf instead of report_fail()?
> + return;
> + }
> +
> + *npt_pte &= ~(PT_AD_MASK);
And so here you're clearing bits in the non-nested page tables. The test passes
because check_npt_ad() also checks the non-nested page tables, but the test isn't
doing what it purports to do.
> +
> + pte = pt[offset];
> + if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> + break;
> + if (!(pte & PT_PRESENT_MASK))
> + return;
> + pt = (unsigned long *)(pte & PT_ADDR_MASK);
> + }
> +
> + offset = PGDIR_OFFSET(guest_addr, l);
> + offset_in_page = guest_addr & ((1 << PGDIR_BITS(l)) - 1);
> + gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> + npt_pte = npt_get_pte(gpa);
Ah, the nested NPT tables are checked, but only for the leaf PTE of the final GPA.
This fails to validate address. This fails to validate Accessed bits on non-leaf
NPT entries, and fails to validate A/D bits on NPT entries used to translate GPAs
accessed while walking the guest GVA=>GPA page tables.
> + *npt_pte &= ~(PT_AD_MASK);
> +}
> +
> +void check_npt_ad(unsigned long *pml4e, u64 guest_cr3,
> + unsigned long guest_addr, int expected_gpa_ad,
> + int expected_pt_ad)
Align indentation.
> +{
> + int l;
> + unsigned long *pt = (unsigned long *)guest_cr3, gpa;
> + u64 *npt_pte, pte, offset_in_page;
> + unsigned offset;
> + bool bad_pt_ad = false;
> +
> + for (l = PAGE_LEVEL; ; --l) {
> + offset = PGDIR_OFFSET(guest_addr, l);
> + npt_pte = npt_get_pte((u64) &pt[offset]);
> +
> + if(!npt_pte) {
> + printf("NPT - guest level %d page table is not mapped.\n", l);
Why printf instead of report_fail()?
> + return;
> + }
> +
> + if (!bad_pt_ad) {
> + bad_pt_ad |= (*npt_pte & PT_AD_MASK) != expected_pt_ad;
> + if(bad_pt_ad)
Unnecessary nesting. E.g. ignoring the below feedback, this could be:
if (!bad_pt_ad && (*npt_pte & PT_AD_MASK) != expected_pt_ad) {
bad_pt_ad = true
report_fail(...)
}
> + report_fail("NPT - received guest level %d page table A=%d/D=%d",
> + l,
> + !!(expected_pt_ad & PT_ACCESSED_MASK),
> + !!(expected_pt_ad & PT_DIRTY_MASK));
Print both expected and actual, otherwise debugging is no fun. Actually, looking
at the caller, passing in @expected_pt_ad is quite silly. The only time A/D bits
are expected to be '0' are for the initial check, immediately after A/D bits are
zeroed. That's honestly uninteresting, it's basically verifying KVM hasn't
randomly corrupted memory. If someone _really_ wants to verify that A/D bits are
cleared, just read back immediately after writing.
Dropping @expected_pt_ad will make it easier to expand coverage, e.g. for reads
vs. writes, and of intermediate GPAs.
E.g.
> + }
> +
> + pte = pt[offset];
> + if (l == 1 || (l < 4 && (pte & PT_PAGE_SIZE_MASK)))
> + break;
> + if (!(pte & PT_PRESENT_MASK))
> + return;
> + pt = (unsigned long *)(pte & PT_ADDR_MASK);
> + }
> +
> + if (!bad_pt_ad)
> + report_pass("NPT - guest page table structures A=%d/D=%d",
> + !!(expected_pt_ad & PT_ACCESSED_MASK),
> + !!(expected_pt_ad & PT_DIRTY_MASK));
> +
> + offset = PGDIR_OFFSET(guest_addr, l);
> + offset_in_page = guest_addr & ((1 << PGDIR_BITS(l)) - 1);
> + gpa = (pt[offset] & PT_ADDR_MASK) | (guest_addr & offset_in_page);
> +
> + npt_pte = npt_get_pte(gpa);
> +
> + if (!npt_pte) {
> + report_fail("NPT - guest physical address is not mapped");
> + return;
> + }
> + report((*npt_pte & PT_AD_MASK) == expected_gpa_ad,
> + "NPT - guest physical address A=%d/D=%d",
> + !!(expected_gpa_ad & PT_ACCESSED_MASK),
> + !!(expected_gpa_ad & PT_DIRTY_MASK));
> +}
next prev parent reply other threads:[~2025-11-18 23:20 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-20 16:29 [kvm-unit-tests PATCH] x86: nSVM: Add test for EPT A/D bits Kevin Cheng
2025-09-11 19:07 ` Kevin Cheng
2025-11-18 23:20 ` Sean Christopherson [this message]
2025-12-19 23:02 ` Kevin Cheng
2025-12-19 23:01 ` Kevin Cheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRz_UkBPM7FcfmpP@google.com \
--to=seanjc@google.com \
--cc=chengkev@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.