* [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support
@ 2026-01-05 6:36 Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 1/8] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
` (7 more replies)
0 siblings, 8 replies; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
This series implements Page Modification Logging (PML) for guests, bringing
hardware-assisted dirty logging support. PML tracks guest-modified memory
pages, enabling the hypervisor to identify which pages in a guest's memory
have been modified since the last checkpoint or during live migration.
The PML feature uses two new VMCB fields (PML_ADDR and PML_INDEX) and
generates a VMEXIT when the 4KB log buffer becomes full.
The feature is enabled by default when hardware support is detected and
can be disabled via the 'pml' module parameter.
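To make the cover letter's description concrete, here is a minimal userspace
sketch (not KVM code; names and the wrap behavior are simplified) of how the
log index moves: entries are written from index 511 toward 0, and filling
the last slot corresponds to the point where hardware raises the PML-full
VMEXIT:

```c
#include <assert.h>
#include <stdint.h>

#define PML_NR_ENTRIES 512
#define PML_HEAD_INDEX (PML_NR_ENTRIES - 1)

struct pml_state {
	uint64_t buf[PML_NR_ENTRIES]; /* stands in for the 4KB PML page */
	int index;                    /* stands in for the PML_INDEX field */
};

/*
 * Log one dirty GPA the way the series describes the hardware doing it:
 * write at the current index, then move toward 0. Returns 1 when the
 * buffer has filled, i.e. when a PML-full VMEXIT would be raised.
 */
static int pml_log_gpa(struct pml_state *s, uint64_t gpa)
{
	s->buf[s->index] = gpa & ~0xfffULL; /* hardware clears bits 11:0 */
	if (s->index == 0)
		return 1;
	s->index--;
	return 0;
}
```

After the exit, the hypervisor drains the buffer into the dirty log and
resets the index back to 511 before resuming the guest.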
Changelog:
v5:
* Rebased on latest kvm/next
* Use EXPORT_SYMBOL_FOR_KVM_INTERNAL (Kai Huang)
* Use cpu_dirty_log_size instead of enable_pml for PML checks; with this
  patch, moving enable_pml to common code is no longer needed (Kai Huang)
* Added SEV PML self test
v4: https://lore.kernel.org/kvm/20251013062515.3712430-1-nikunj@amd.com/
* Added a couple of patches to move enable_pml and nested CPU dirty logging
  to common code (Kai Huang)
* Rebased to latest kvm/next
v3: https://lore.kernel.org/kvm/20250925101052.1868431-1-nikunj@amd.com/
* Update comments with nested details (Kai Huang)
* Added nested.update_vmcb01_cpu_dirty_logging to update L1 PML (Kai Huang)
* Added patch to use BIT_ULL() instead of BIT() for 64-bit nested_ctl
v2: https://lore.kernel.org/kvm/20250915085938.639049-1-nikunj@amd.com/
* Rebased on latest kvm/next
* Added patch to move pml_pg field from struct vcpu_vmx to struct kvm_vcpu_arch
to share the PML page. (Kai Huang)
* Dropped the SNP safe allocation optimization patch, will submit it separately.
* Updated commit message to explicitly mention that AMD PML follows VMX
  behavior (Kai Huang)
* Updated SNP erratum comment to include PML buffer alongside VMCB, VMSA, and
AVIC pages. (Kai Huang)
RFC: https://lore.kernel.org/kvm/20250825152009.3512-1-nikunj@amd.com/
Kai Huang (1):
KVM: x86: Move nested CPU dirty logging logic to common code
Nikunj A Dadhania (7):
KVM: x86: Carve out PML flush routine
KVM: x86: Move PML page to common vcpu arch structure
KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks
x86/cpufeatures: Add Page modification logging
KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions
KVM: SVM: Add Page modification logging support
selftests: KVM: x86: Add SEV PML dirty logging test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 5 +-
arch/x86/include/asm/svm.h | 12 +-
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kernel/cpu/scattered.c | 1 +
arch/x86/kvm/kvm_cache_regs.h | 7 +
arch/x86/kvm/svm/nested.c | 9 +-
arch/x86/kvm/svm/sev.c | 2 +-
arch/x86/kvm/svm/svm.c | 85 +++++++-
arch/x86/kvm/svm/svm.h | 3 +
arch/x86/kvm/vmx/main.c | 4 +-
arch/x86/kvm/vmx/nested.c | 5 -
arch/x86/kvm/vmx/vmx.c | 71 ++----
arch/x86/kvm/vmx/vmx.h | 10 +-
arch/x86/kvm/vmx/x86_ops.h | 2 +-
arch/x86/kvm/x86.c | 53 ++++-
arch/x86/kvm/x86.h | 8 +
tools/testing/selftests/kvm/Makefile.kvm | 1 +
tools/testing/selftests/kvm/include/x86/sev.h | 4 +
tools/testing/selftests/kvm/lib/x86/sev.c | 18 +-
.../testing/selftests/kvm/x86/sev_pml_test.c | 203 ++++++++++++++++++
21 files changed, 422 insertions(+), 84 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86/sev_pml_test.c
base-commit: 0499add8efd72456514c6218c062911ccc922a99
--
2.48.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v5 1/8] KVM: x86: Carve out PML flush routine
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-12 10:02 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
` (6 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Move the PML (Page Modification Logging) buffer flushing logic from
VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
code duplication.
The AMD SVM PML implementation shares the same behavior as VMX PML:
1) The PML buffer is a 4K page with 512 entries
2) Hardware records dirty GPAs in reverse order (from index 511 to 0)
3) Hardware clears bits 11:0 when recording GPAs
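The three shared properties above determine the flush loop being moved to
common code. A userspace mirror of that logic (illustrative only; the real
routine is kvm_flush_pml_buffer() in the diff below) shows how the tail
index is derived from the hardware index, including the overflow case:

```c
#include <assert.h>
#include <stdint.h>

#define PML_LOG_NR_ENTRIES 512
#define PML_HEAD_INDEX (PML_LOG_NR_ENTRIES - 1)

/*
 * Walk valid entries from index 511 down to the tail, oldest first,
 * so consumers see GPAs in the order the CPU wrote them. Returns the
 * number of entries consumed.
 */
static int flush_pml(const uint64_t *pml_buf, uint16_t pml_idx,
		     uint64_t *out_gfns)
{
	/* On overflow the index wraps past 0, so every slot is valid. */
	uint16_t tail = (pml_idx >= PML_LOG_NR_ENTRIES) ? 0 : pml_idx + 1;
	int n = 0;

	for (int i = PML_HEAD_INDEX; i >= tail; i--)
		out_gfns[n++] = pml_buf[i] >> 12; /* bits 11:0 already clear */
	return n;
}
```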
The PML constants (PML_LOG_NR_ENTRIES and PML_HEAD_INDEX) are moved from
vmx.h to x86.h to make them available to both VMX and SVM.
No functional change intended for VMX, except toning down the WARN_ON() to
WARN_ON_ONCE() for the page alignment check. If hardware exhibits this
behavior once, it is likely to occur repeatedly, so WARN_ON_ONCE() avoids
log flooding while still capturing the unexpected condition.
The refactoring prepares for SVM to leverage the same PML flushing
implementation.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/kvm/vmx/vmx.c | 26 ++------------------------
arch/x86/kvm/vmx/vmx.h | 5 -----
arch/x86/kvm/x86.c | 31 +++++++++++++++++++++++++++++++
arch/x86/kvm/x86.h | 8 ++++++++
4 files changed, 41 insertions(+), 29 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6b96f7aea20b..c152c8590374 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6342,37 +6342,15 @@ static void vmx_destroy_pml_buffer(struct vcpu_vmx *vmx)
static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- u16 pml_idx, pml_tail_index;
- u64 *pml_buf;
- int i;
+ u16 pml_idx;
pml_idx = vmcs_read16(GUEST_PML_INDEX);
/* Do nothing if PML buffer is empty */
if (pml_idx == PML_HEAD_INDEX)
return;
- /*
- * PML index always points to the next available PML buffer entity
- * unless PML log has just overflowed.
- */
- pml_tail_index = (pml_idx >= PML_LOG_NR_ENTRIES) ? 0 : pml_idx + 1;
- /*
- * PML log is written backwards: the CPU first writes the entry 511
- * then the entry 510, and so on.
- *
- * Read the entries in the same order they were written, to ensure that
- * the dirty ring is filled in the same order the CPU wrote them.
- */
- pml_buf = page_address(vmx->pml_pg);
-
- for (i = PML_HEAD_INDEX; i >= pml_tail_index; i--) {
- u64 gpa;
-
- gpa = pml_buf[i];
- WARN_ON(gpa & (PAGE_SIZE - 1));
- kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
- }
+ kvm_flush_pml_buffer(vcpu, vmx->pml_pg, pml_idx);
/* reset PML index */
vmcs_write16(GUEST_PML_INDEX, PML_HEAD_INDEX);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index bc3ed3145d7e..e1602db0d3a4 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -272,11 +272,6 @@ struct vcpu_vmx {
unsigned int ple_window;
bool ple_window_dirty;
- /* Support for PML */
-#define PML_LOG_NR_ENTRIES 512
- /* PML is written backwards: this is the first entry written by the CPU */
-#define PML_HEAD_INDEX (PML_LOG_NR_ENTRIES-1)
-
struct page *pml_pg;
/* apic deadline value in host tsc */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff8812f3a129..fec4f5c94510 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6696,6 +6696,37 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
kvm_vcpu_kick(vcpu);
}
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_page, u16 pml_idx)
+{
+ u16 pml_tail_index;
+ u64 *pml_buf;
+ int i;
+
+ /*
+ * PML index always points to the next available PML buffer entity
+ * unless PML log has just overflowed.
+ */
+ pml_tail_index = (pml_idx >= PML_LOG_NR_ENTRIES) ? 0 : pml_idx + 1;
+
+ /*
+ * PML log is written backwards: the CPU first writes the entry 511
+ * then the entry 510, and so on.
+ *
+ * Read the entries in the same order they were written, to ensure that
+ * the dirty ring is filled in the same order the CPU wrote them.
+ */
+ pml_buf = page_address(pml_page);
+
+ for (i = PML_HEAD_INDEX; i >= pml_tail_index; i--) {
+ u64 gpa;
+
+ gpa = pml_buf[i];
+ WARN_ON_ONCE(gpa & (PAGE_SIZE - 1));
+ kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
+ }
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_flush_pml_buffer);
+
int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
struct kvm_enable_cap *cap)
{
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index fdab0ad49098..24aee9d99787 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -749,4 +749,12 @@ static inline bool kvm_is_valid_u_s_cet(struct kvm_vcpu *vcpu, u64 data)
return true;
}
+
+/* Support for PML */
+#define PML_LOG_NR_ENTRIES 512
+/* PML is written backwards: this is the first entry written by the CPU */
+#define PML_HEAD_INDEX (PML_LOG_NR_ENTRIES-1)
+
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_pg, u16 pml_idx);
+
#endif
--
2.48.1
* [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 1/8] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-12 10:07 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
` (5 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Move the PML page pointer from VMX-specific vcpu_vmx structure to the
common kvm_vcpu_arch structure to enable sharing between VMX and SVM
implementations. Only the page pointer is moved to x86 common code while
keeping allocation logic vendor-specific, since AMD requires
snp_safe_alloc_page() for PML buffer allocation.
Update all VMX references accordingly, and simplify the
kvm_flush_pml_buffer() interface by removing the page parameter since it
can now access the page directly from the vcpu structure.
No functional change; this restructuring prepares for SVM PML support.
Suggested-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++------------
arch/x86/kvm/vmx/vmx.h | 2 --
arch/x86/kvm/x86.c | 4 ++--
arch/x86/kvm/x86.h | 2 +-
5 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5a3bfa293e8b..123b4d0a8297 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -861,6 +861,8 @@ struct kvm_vcpu_arch {
*/
struct kvm_mmu_memory_cache mmu_external_spt_cache;
+ struct page *pml_page;
+
/*
* QEMU userspace and the guest each have their own FPU state.
* In vcpu_run, we switch between the user and guest FPU contexts.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c152c8590374..bd244b46068f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4805,7 +4805,8 @@ int vmx_vcpu_precreate(struct kvm *kvm)
static void init_vmcs(struct vcpu_vmx *vmx)
{
- struct kvm *kvm = vmx->vcpu.kvm;
+ struct kvm_vcpu *vcpu = &vmx->vcpu;
+ struct kvm *kvm = vcpu->kvm;
struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
if (nested)
@@ -4896,7 +4897,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
vmcs_write64(XSS_EXIT_BITMAP, VMX_XSS_EXIT_BITMAP);
if (enable_pml) {
- vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
+ vmcs_write64(PML_ADDRESS, page_to_phys(vcpu->arch.pml_page));
vmcs_write16(GUEST_PML_INDEX, PML_HEAD_INDEX);
}
@@ -6331,17 +6332,16 @@ void vmx_get_entry_info(struct kvm_vcpu *vcpu, u32 *intr_info, u32 *error_code)
*error_code = 0;
}
-static void vmx_destroy_pml_buffer(struct vcpu_vmx *vmx)
+static void vmx_destroy_pml_buffer(struct kvm_vcpu *vcpu)
{
- if (vmx->pml_pg) {
- __free_page(vmx->pml_pg);
- vmx->pml_pg = NULL;
+ if (vcpu->arch.pml_page) {
+ __free_page(vcpu->arch.pml_page);
+ vcpu->arch.pml_page = NULL;
}
}
static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
u16 pml_idx;
pml_idx = vmcs_read16(GUEST_PML_INDEX);
@@ -6350,7 +6350,7 @@ static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
if (pml_idx == PML_HEAD_INDEX)
return;
- kvm_flush_pml_buffer(vcpu, vmx->pml_pg, pml_idx);
+ kvm_flush_pml_buffer(vcpu, pml_idx);
/* reset PML index */
vmcs_write16(GUEST_PML_INDEX, PML_HEAD_INDEX);
@@ -7545,7 +7545,7 @@ void vmx_vcpu_free(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
if (enable_pml)
- vmx_destroy_pml_buffer(vmx);
+ vmx_destroy_pml_buffer(vcpu);
free_vpid(vmx->vpid);
nested_vmx_free_vcpu(vcpu);
free_loaded_vmcs(vmx->loaded_vmcs);
@@ -7574,8 +7574,8 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
* for the guest), etc.
*/
if (enable_pml) {
- vmx->pml_pg = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
- if (!vmx->pml_pg)
+ vcpu->arch.pml_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+ if (!vcpu->arch.pml_page)
goto free_vpid;
}
@@ -7646,7 +7646,7 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
free_vmcs:
free_loaded_vmcs(vmx->loaded_vmcs);
free_pml:
- vmx_destroy_pml_buffer(vmx);
+ vmx_destroy_pml_buffer(vcpu);
free_vpid:
free_vpid(vmx->vpid);
return err;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e1602db0d3a4..c9b6760d7a2d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -272,8 +272,6 @@ struct vcpu_vmx {
unsigned int ple_window;
bool ple_window_dirty;
- struct page *pml_pg;
-
/* apic deadline value in host tsc */
u64 hv_deadline_tsc;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fec4f5c94510..7e299c4b9bf7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6696,7 +6696,7 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
kvm_vcpu_kick(vcpu);
}
-void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_page, u16 pml_idx)
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, u16 pml_idx)
{
u16 pml_tail_index;
u64 *pml_buf;
@@ -6715,7 +6715,7 @@ void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_page, u16 pml_
* Read the entries in the same order they were written, to ensure that
* the dirty ring is filled in the same order the CPU wrote them.
*/
- pml_buf = page_address(pml_page);
+ pml_buf = page_address(vcpu->arch.pml_page);
for (i = PML_HEAD_INDEX; i >= pml_tail_index; i--) {
u64 gpa;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 24aee9d99787..105d9e9ad99c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -755,6 +755,6 @@ static inline bool kvm_is_valid_u_s_cet(struct kvm_vcpu *vcpu, u64 data)
/* PML is written backwards: this is the first entry written by the CPU */
#define PML_HEAD_INDEX (PML_LOG_NR_ENTRIES-1)
-void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_pg, u16 pml_idx);
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, u16 pml_idx);
#endif
--
2.48.1
* [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 1/8] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-05 6:49 ` Gupta, Pankaj
2026-01-05 6:36 ` [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code Nikunj A Dadhania
` (4 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Replace the enable_pml check with cpu_dirty_log_size in VMX PML code
to determine whether PML is enabled on a per-VM basis. The enable_pml
module parameter is a global setting that doesn't reflect per-VM
capabilities, whereas cpu_dirty_log_size accurately indicates whether
a specific VM has PML enabled.
For example, TDX VMs don't yet support PML. Using cpu_dirty_log_size
ensures the check correctly reflects this, while enable_pml would
incorrectly indicate PML is available.
This also improves consistency with kvm_mmu_update_cpu_dirty_logging(),
which already uses cpu_dirty_log_size to determine PML enablement.
Suggested-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/kvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bd244b46068f..91e3cd30a147 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8242,7 +8242,7 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- if (WARN_ON_ONCE(!enable_pml))
+ if (WARN_ON_ONCE(!vcpu->kvm->arch.cpu_dirty_log_size))
return;
if (is_guest_mode(vcpu)) {
--
2.48.1
* [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
` (2 preceding siblings ...)
2026-01-05 6:36 ` [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-12 10:08 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 5/8] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
` (3 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
From: Kai Huang <kai.huang@intel.com>
Move nested PML dirty logging update logic from VMX-specific code to common
x86 infrastructure. Both VMX and SVM share identical logic: defer CPU dirty
logging updates when running in L2, then process pending updates when
exiting to L1.
No functional change.
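The deferral pattern described above can be modeled in a few lines of
userspace C (illustrative only; field and function names mirror the patch
but the structs are stand-ins): while L2 runs, the update is recorded as
pending; leaving guest mode applies it to the L1 control structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the defer-until-L1 pattern. */
struct vcpu {
	bool guest_mode;     /* running L2? */
	bool update_pending; /* arch.update_cpu_dirty_logging_pending */
	bool pml_enabled;    /* vendor PML enable bit in vmcs01/vmcb01 */
};

static void vendor_update(struct vcpu *v, bool enable)
{
	v->pml_enabled = enable; /* set/clear the vendor PML control */
}

static void update_cpu_dirty_logging(struct vcpu *v, bool enable)
{
	if (v->guest_mode) {  /* can't touch vmcs01/vmcb01 while L2 runs */
		v->update_pending = true;
		return;
	}
	vendor_update(v, enable);
}

static void leave_guest_mode(struct vcpu *v, bool enable)
{
	v->guest_mode = false;
	if (v->update_pending) { /* apply the deferred update for L1 */
		v->update_pending = false;
		vendor_update(v, enable);
	}
}
```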
Signed-off-by: Kai Huang <kai.huang@intel.com>
Co-developed-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/kvm_cache_regs.h | 7 +++++++
arch/x86/kvm/vmx/main.c | 4 ++--
arch/x86/kvm/vmx/nested.c | 5 -----
arch/x86/kvm/vmx/vmx.c | 23 ++++-------------------
arch/x86/kvm/vmx/vmx.h | 3 +--
arch/x86/kvm/vmx/x86_ops.h | 2 +-
arch/x86/kvm/x86.c | 22 +++++++++++++++++++++-
8 files changed, 38 insertions(+), 31 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 123b4d0a8297..4bd4c647aaaa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -862,6 +862,7 @@ struct kvm_vcpu_arch {
struct kvm_mmu_memory_cache mmu_external_spt_cache;
struct page *pml_page;
+ bool update_cpu_dirty_logging_pending;
/*
* QEMU userspace and the guest each have their own FPU state.
@@ -1879,7 +1880,7 @@ struct kvm_x86_ops {
struct x86_exception *exception);
void (*handle_exit_irqoff)(struct kvm_vcpu *vcpu);
- void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
+ void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu, bool enable);
const struct kvm_x86_nested_ops *nested_ops;
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 8ddb01191d6f..0c4a832a9dab 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -238,6 +238,13 @@ static inline void leave_guest_mode(struct kvm_vcpu *vcpu)
kvm_make_request(KVM_REQ_LOAD_EOI_EXITMAP, vcpu);
}
+ /* Also see kvm_vcpu_update_cpu_dirty_logging() */
+ if (vcpu->arch.update_cpu_dirty_logging_pending) {
+ vcpu->arch.update_cpu_dirty_logging_pending = false;
+ kvm_x86_call(update_cpu_dirty_logging)(vcpu,
+ atomic_read(&vcpu->kvm->nr_memslots_dirty_logging));
+ }
+
vcpu->stat.guest_mode = 0;
}
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index a46ccd670785..7235913ca58f 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -103,7 +103,7 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
vmx_vcpu_load(vcpu, cpu);
}
-static void vt_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
+static void vt_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable)
{
/*
* Basic TDX does not support feature PML. KVM does not enable PML in
@@ -112,7 +112,7 @@ static void vt_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
if (WARN_ON_ONCE(is_td_vcpu(vcpu)))
return;
- vmx_update_cpu_dirty_logging(vcpu);
+ vmx_update_cpu_dirty_logging(vcpu, enable);
}
static void vt_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6137e5307d0f..920a925bb46f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5152,11 +5152,6 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
vmx_set_virtual_apic_mode(vcpu);
}
- if (vmx->nested.update_vmcs01_cpu_dirty_logging) {
- vmx->nested.update_vmcs01_cpu_dirty_logging = false;
- vmx_update_cpu_dirty_logging(vcpu);
- }
-
nested_put_vmcs12_pages(vcpu);
if (vmx->nested.reload_vmcs01_apic_access_page) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 91e3cd30a147..6c3ffaa8ce1a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8238,27 +8238,12 @@ void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
}
#endif
-void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
+void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable)
{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (WARN_ON_ONCE(!vcpu->kvm->arch.cpu_dirty_log_size))
- return;
-
- if (is_guest_mode(vcpu)) {
- vmx->nested.update_vmcs01_cpu_dirty_logging = true;
- return;
- }
-
- /*
- * Note, nr_memslots_dirty_logging can be changed concurrent with this
- * code, but in that case another update request will be made and so
- * the guest will never run with a stale PML value.
- */
- if (atomic_read(&vcpu->kvm->nr_memslots_dirty_logging))
- secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_ENABLE_PML);
+ if (enable)
+ secondary_exec_controls_setbit(to_vmx(vcpu), SECONDARY_EXEC_ENABLE_PML);
else
- secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_ENABLE_PML);
+ secondary_exec_controls_clearbit(to_vmx(vcpu), SECONDARY_EXEC_ENABLE_PML);
}
void vmx_setup_mce(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c9b6760d7a2d..5dff2fa213f5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -133,7 +133,6 @@ struct nested_vmx {
bool change_vmcs01_virtual_apic_mode;
bool reload_vmcs01_apic_access_page;
- bool update_vmcs01_cpu_dirty_logging;
bool update_vmcs01_apicv_status;
bool update_vmcs01_hwapic_isr;
@@ -400,7 +399,7 @@ u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
-void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable);
u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated);
bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated);
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index d09abeac2b56..f4e1cb6d8ada 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -112,7 +112,7 @@ u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
void vmx_write_tsc_offset(struct kvm_vcpu *vcpu);
void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu);
-void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable);
#ifdef CONFIG_X86_64
int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
bool *expired);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7e299c4b9bf7..5154fa8924cf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -148,6 +148,7 @@ struct kvm_x86_ops kvm_x86_ops __read_mostly;
#include <asm/kvm-x86-ops.h>
EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
EXPORT_STATIC_CALL_GPL(kvm_x86_cache_reg);
+EXPORT_STATIC_CALL_GPL(kvm_x86_update_cpu_dirty_logging);
static bool __read_mostly ignore_msrs = 0;
module_param(ignore_msrs, bool, 0644);
@@ -11066,6 +11067,25 @@ static void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
kvm_x86_call(set_apic_access_page_addr)(vcpu);
}
+static void kvm_vcpu_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
+{
+ if (WARN_ON_ONCE(!vcpu->kvm->arch.cpu_dirty_log_size))
+ return;
+
+ if (is_guest_mode(vcpu)) {
+ vcpu->arch.update_cpu_dirty_logging_pending = true;
+ return;
+ }
+
+ /*
+ * Note, nr_memslots_dirty_logging can be changed concurrently with this
+ * code, but in that case another update request will be made and so the
+ * guest will never run with a stale PML value.
+ */
+ kvm_x86_call(update_cpu_dirty_logging)(vcpu,
+ atomic_read(&vcpu->kvm->nr_memslots_dirty_logging));
+}
+
/*
* Called within kvm->srcu read side.
* Returns 1 to let vcpu_run() continue the guest execution loop without
@@ -11232,7 +11252,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
kvm_x86_call(recalc_intercepts)(vcpu);
if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
- kvm_x86_call(update_cpu_dirty_logging)(vcpu);
+ kvm_vcpu_update_cpu_dirty_logging(vcpu);
if (kvm_check_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu)) {
kvm_vcpu_reset(vcpu, true);
--
2.48.1
* [PATCH v5 5/8] x86/cpufeatures: Add Page modification logging
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
` (3 preceding siblings ...)
2026-01-05 6:36 ` [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 6/8] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions Nikunj A Dadhania
` (2 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Page Modification Logging (PML) is a hardware feature designed to track
guest-modified memory pages. PML enables the hypervisor to identify which
pages in a guest's memory have been changed since the last checkpoint or
during live migration.
The PML feature is advertised via CPUID leaf 0x8000000A, ECX bit 4.
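The bit position matches the scattered.c entry in the diff below; a trivial
userspace check of that encoding (illustrative helper name, not a kernel
API) looks like:

```c
#include <assert.h>
#include <stdint.h>

/* PML is reported in CPUID Fn8000_000A ECX, bit 4 (per this patch). */
#define AMD_PML_ECX_BIT 4

static int cpuid_has_pml(uint32_t ecx)
{
	return (ecx >> AMD_PML_ECX_BIT) & 1;
}
```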
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index c3b53beb1300..235e4745c6f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -228,6 +228,7 @@
#define X86_FEATURE_PVUNLOCK ( 8*32+20) /* PV unlock function */
#define X86_FEATURE_VCPUPREEMPT ( 8*32+21) /* PV vcpu_is_preempted function */
#define X86_FEATURE_TDX_GUEST ( 8*32+22) /* "tdx_guest" Intel Trust Domain Extensions Guest */
+#define X86_FEATURE_PML ( 8*32+23) /* AMD Page Modification logging */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* "fsgsbase" RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 42c7eac0c387..cdda4e72c5e6 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -53,6 +53,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_AMD_FAST_CPPC, CPUID_EDX, 15, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
+ { X86_FEATURE_PML, CPUID_ECX, 4, 0x8000000a, 0 },
{ X86_FEATURE_X2AVIC_EXT, CPUID_ECX, 6, 0x8000000a, 0 },
{ X86_FEATURE_COHERENCY_SFW_NO, CPUID_EBX, 31, 0x8000001f, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
--
2.48.1
* [PATCH v5 6/8] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
` (4 preceding siblings ...)
2026-01-05 6:36 ` [PATCH v5 5/8] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 7/8] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test Nikunj A Dadhania
7 siblings, 0 replies; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Replace BIT() with BIT_ULL() for SVM nested control bit definitions
since nested_ctl is a 64-bit field in the VMCB control area structure.
No functional change intended.
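For illustration, a userspace equivalent of the macro change (the kernel's
BIT() expands to a long-sized constant, which is only 32 bits wide on
32-bit builds, while BIT_ULL() is always 64 bits, matching the 64-bit
nested_ctl field):

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() from <linux/bits.h>. */
#define BIT_ULL(nr) (1ULL << (nr))

/* The nested_ctl definitions touched by this patch. */
#define SVM_NESTED_CTL_NP_ENABLE     BIT_ULL(0)
#define SVM_NESTED_CTL_SEV_ENABLE    BIT_ULL(1)
#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT_ULL(2)
```

With BIT_ULL(), definitions for bit positions at or above 32 (should any
be added later) stay correct regardless of the width of long.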
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/svm.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 56aa99503dc4..751da7cbabed 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -236,9 +236,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define SVM_IOIO_SIZE_MASK (7 << SVM_IOIO_SIZE_SHIFT)
#define SVM_IOIO_ASIZE_MASK (7 << SVM_IOIO_ASIZE_SHIFT)
-#define SVM_NESTED_CTL_NP_ENABLE BIT(0)
-#define SVM_NESTED_CTL_SEV_ENABLE BIT(1)
-#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2)
+#define SVM_NESTED_CTL_NP_ENABLE BIT_ULL(0)
+#define SVM_NESTED_CTL_SEV_ENABLE BIT_ULL(1)
+#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT_ULL(2)
#define SVM_TSC_RATIO_RSVD 0xffffff0000000000ULL
--
2.48.1
* [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
` (5 preceding siblings ...)
2026-01-05 6:36 ` [PATCH v5 6/8] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-12 10:24 ` Huang, Kai
2026-01-14 22:48 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test Nikunj A Dadhania
7 siblings, 2 replies; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Currently, dirty logging relies on write protecting guest memory and
marking dirty GFNs during subsequent write faults. This method works but
incurs overhead due to additional write faults for each dirty GFN.
Implement support for the Page Modification Logging (PML) feature, a
hardware-assisted method for efficient dirty logging. PML automatically
logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
VMCB fields are used: PML_ADDR and PML_INDEX. PML_INDEX is initialized to
511 (the 4K buffer holds 512 8-byte GPA entries), and the CPU decrements it
after logging each GPA. When the PML buffer is full, a PML_FULL VMEXIT
(exit code 0x407) is generated.
Disable PML for nested guests.
PML is enabled by default when supported and can be disabled via the 'pml'
module parameter.
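The expected shape of the exit handling (a sketch under assumed names; the
actual handler lives in the svm.c hunk of this patch, which is not shown in
full here) is: on SVM_EXIT_PML_FULL, drain the buffer into the dirty log,
rearm PML_INDEX at 511, and resume the guest.

```c
#include <assert.h>
#include <stdint.h>

#define SVM_EXIT_PML_FULL 0x407
#define PML_HEAD_INDEX 511

/* Illustrative subset of the VMCB control fields added by this patch. */
struct vmcb_control {
	uint64_t pml_addr;
	uint16_t pml_index;
};

static int flushed; /* stands in for kvm_flush_pml_buffer() being called */

static int handle_exit(struct vmcb_control *ctl, uint32_t exit_code)
{
	switch (exit_code) {
	case SVM_EXIT_PML_FULL:
		flushed++;                       /* drain into dirty log */
		ctl->pml_index = PML_HEAD_INDEX; /* rearm at entry 511 */
		return 1;                        /* resume the guest */
	default:
		return 0;                        /* not handled here */
	}
}
```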
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
arch/x86/include/asm/svm.h | 6 ++-
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/nested.c | 9 +++-
arch/x86/kvm/svm/sev.c | 2 +-
arch/x86/kvm/svm/svm.c | 85 ++++++++++++++++++++++++++++++++-
arch/x86/kvm/svm/svm.h | 3 ++
6 files changed, 102 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 751da7cbabed..6bf88fe8ac7c 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -165,7 +165,10 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
u8 reserved_9[22];
u64 allowed_sev_features; /* Offset 0x138 */
u64 guest_sev_features; /* Offset 0x140 */
- u8 reserved_10[664];
+ u8 reserved_10[128];
+ u64 pml_addr; /* Offset 0x1c8 */
+ u16 pml_index; /* Offset 0x1d0 */
+ u8 reserved_11[526];
/*
* Offset 0x3e0, 32 bytes reserved
* for use by hypervisor/software.
@@ -239,6 +242,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define SVM_NESTED_CTL_NP_ENABLE BIT_ULL(0)
#define SVM_NESTED_CTL_SEV_ENABLE BIT_ULL(1)
#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT_ULL(2)
+#define SVM_NESTED_CTL_PML_ENABLE BIT_ULL(11)
#define SVM_TSC_RATIO_RSVD 0xffffff0000000000ULL
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 650e3256ea7d..6c41b019d553 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -101,6 +101,7 @@
#define SVM_EXIT_AVIC_INCOMPLETE_IPI 0x401
#define SVM_EXIT_AVIC_UNACCELERATED_ACCESS 0x402
#define SVM_EXIT_VMGEXIT 0x403
+#define SVM_EXIT_PML_FULL 0x407
/* SEV-ES software-defined VMGEXIT events */
#define SVM_VMGEXIT_MMIO_READ 0x80000001
@@ -236,6 +237,7 @@
{ SVM_EXIT_AVIC_INCOMPLETE_IPI, "avic_incomplete_ipi" }, \
{ SVM_EXIT_AVIC_UNACCELERATED_ACCESS, "avic_unaccelerated_access" }, \
{ SVM_EXIT_VMGEXIT, "vmgexit" }, \
+ { SVM_EXIT_PML_FULL, "pml_full" }, \
{ SVM_VMGEXIT_MMIO_READ, "vmgexit_mmio_read" }, \
{ SVM_VMGEXIT_MMIO_WRITE, "vmgexit_mmio_write" }, \
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ba0f11c68372..c1eb64fcc254 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -748,12 +748,19 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
V_NMI_BLOCKING_MASK);
}
- /* Copied from vmcb01. msrpm_base can be overwritten later. */
+ /* Copied from vmcb01. msrpm_base/nested_ctl can be overwritten later. */
vmcb02->control.nested_ctl = vmcb01->control.nested_ctl;
vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
vmcb_mark_dirty(vmcb02, VMCB_PERM_MAP);
+ /* Disable PML for nested guest as the A/D update is emulated by MMU */
+ if (pml) {
+ vmcb02->control.nested_ctl &= ~SVM_NESTED_CTL_PML_ENABLE;
+ vmcb02->control.pml_addr = 0;
+ vmcb02->control.pml_index = -1;
+ }
+
/*
* Stash vmcb02's counter if the guest hasn't moved past the guilty
* instruction; otherwise, reset the counter to '0'.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f59c65abe3cf..ffcc7e28d109 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4785,7 +4785,7 @@ struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
* Allocate an SNP-safe page to workaround the SNP erratum where
* the CPU will incorrectly signal an RMP violation #PF if a
* hugepage (2MB or 1GB) collides with the RMP entry of a
- * 2MB-aligned VMCB, VMSA, or AVIC backing page.
+ * 2MB-aligned VMCB, VMSA, PML, or AVIC backing page.
*
* Allocate one extra page, choose a page which is not
* 2MB-aligned, and free the other.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 24d59ccfa40d..920c7dc52470 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -170,6 +170,9 @@ module_param(intercept_smi, bool, 0444);
bool vnmi = true;
module_param(vnmi, bool, 0444);
+bool pml = true;
+module_param(pml, bool, 0444);
+
static bool svm_gp_erratum_intercept = true;
static u8 rsm_ins_bytes[] = "\x0f\xaa";
@@ -1156,6 +1159,16 @@ static void init_vmcb(struct kvm_vcpu *vcpu, bool init_event)
if (vcpu->kvm->arch.bus_lock_detection_enabled)
svm_set_intercept(svm, INTERCEPT_BUSLOCK);
+ if (pml) {
+ /*
+ * Populate the PML page address and index here; PML itself is
+ * enabled only when dirty logging is enabled on the memslot,
+ * via svm_update_cpu_dirty_logging().
+ */
+ control->pml_addr = (u64)__sme_set(page_to_phys(vcpu->arch.pml_page));
+ control->pml_index = PML_HEAD_INDEX;
+ }
+
if (sev_guest(vcpu->kvm))
sev_init_vmcb(svm, init_event);
@@ -1220,9 +1233,15 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
if (!vmcb01_page)
goto out;
+ if (pml) {
+ vcpu->arch.pml_page = snp_safe_alloc_page();
+ if (!vcpu->arch.pml_page)
+ goto error_free_vmcb_page;
+ }
+
err = sev_vcpu_create(vcpu);
if (err)
- goto error_free_vmcb_page;
+ goto error_free_pml_page;
err = avic_init_vcpu(svm);
if (err)
@@ -1247,6 +1266,9 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
error_free_sev:
sev_free_vcpu(vcpu);
+error_free_pml_page:
+ if (vcpu->arch.pml_page)
+ __free_page(vcpu->arch.pml_page);
error_free_vmcb_page:
__free_page(vmcb01_page);
out:
@@ -1264,6 +1286,9 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
sev_free_vcpu(vcpu);
+ if (pml)
+ __free_page(vcpu->arch.pml_page);
+
__free_page(__sme_pa_to_page(svm->vmcb01.pa));
svm_vcpu_free_msrpm(svm->msrpm);
}
@@ -3156,6 +3181,42 @@ static int bus_lock_exit(struct kvm_vcpu *vcpu)
return 0;
}
+void svm_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (enable)
+ svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_PML_ENABLE;
+ else
+ svm->vmcb->control.nested_ctl &= ~SVM_NESTED_CTL_PML_ENABLE;
+}
+
+static void svm_flush_pml_buffer(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+ struct vmcb_control_area *control = &svm->vmcb->control;
+
+ /* Do nothing if PML buffer is empty */
+ if (control->pml_index == PML_HEAD_INDEX)
+ return;
+
+ kvm_flush_pml_buffer(vcpu, control->pml_index);
+
+ /* Reset the PML index */
+ control->pml_index = PML_HEAD_INDEX;
+}
+
+static int pml_full_interception(struct kvm_vcpu *vcpu)
+{
+ trace_kvm_pml_full(vcpu->vcpu_id);
+
+ /*
+ * PML buffer is already flushed at the beginning of svm_handle_exit().
+ * Nothing to do here.
+ */
+ return 1;
+}
+
static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[SVM_EXIT_READ_CR0] = cr_interception,
[SVM_EXIT_READ_CR3] = cr_interception,
@@ -3232,6 +3293,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
#ifdef CONFIG_KVM_AMD_SEV
[SVM_EXIT_VMGEXIT] = sev_handle_vmgexit,
#endif
+ [SVM_EXIT_PML_FULL] = pml_full_interception,
};
static void dump_vmcb(struct kvm_vcpu *vcpu)
@@ -3280,8 +3342,10 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
pr_err("%-20s%016llx\n", "exit_info2:", control->exit_info_2);
pr_err("%-20s%08x\n", "exit_int_info:", control->exit_int_info);
pr_err("%-20s%08x\n", "exit_int_info_err:", control->exit_int_info_err);
- pr_err("%-20s%lld\n", "nested_ctl:", control->nested_ctl);
+ pr_err("%-20s%llx\n", "nested_ctl:", control->nested_ctl);
pr_err("%-20s%016llx\n", "nested_cr3:", control->nested_cr3);
+ pr_err("%-20s%016llx\n", "pml_addr:", control->pml_addr);
+ pr_err("%-20s%04x\n", "pml_index:", control->pml_index);
pr_err("%-20s%016llx\n", "avic_vapic_bar:", control->avic_vapic_bar);
pr_err("%-20s%016llx\n", "ghcb:", control->ghcb_gpa);
pr_err("%-20s%08x\n", "event_inj:", control->event_inj);
@@ -3518,6 +3582,14 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
struct kvm_run *kvm_run = vcpu->run;
u32 exit_code = svm->vmcb->control.exit_code;
+ /*
+ * Opportunistically flush the PML buffer on VM exit. This keeps the
+ * dirty bitmap current by processing logged GPAs rather than
+ * waiting for a PML_FULL exit.
+ */
+ if (vcpu->kvm->arch.cpu_dirty_log_size && !is_guest_mode(vcpu))
+ svm_flush_pml_buffer(vcpu);
+
/* SEV-ES guests must use the CR write traps to track CR registers. */
if (!sev_es_guest(vcpu->kvm)) {
if (!svm_is_intercept(svm, INTERCEPT_CR0_WRITE))
@@ -5003,6 +5075,9 @@ static int svm_vm_init(struct kvm *kvm)
return ret;
}
+ if (pml)
+ kvm->arch.cpu_dirty_log_size = PML_LOG_NR_ENTRIES;
+
svm_srso_vm_init();
return 0;
}
@@ -5157,6 +5232,8 @@ struct kvm_x86_ops svm_x86_ops __initdata = {
.gmem_prepare = sev_gmem_prepare,
.gmem_invalidate = sev_gmem_invalidate,
.gmem_max_mapping_level = sev_gmem_max_mapping_level,
+
+ .update_cpu_dirty_logging = svm_update_cpu_dirty_logging,
};
/*
@@ -5380,6 +5457,10 @@ static __init int svm_hardware_setup(void)
nrips = nrips && boot_cpu_has(X86_FEATURE_NRIPS);
+ pml = pml && npt_enabled && cpu_feature_enabled(X86_FEATURE_PML);
+ if (pml)
+ pr_info("Page modification logging supported\n");
+
if (lbrv) {
if (!boot_cpu_has(X86_FEATURE_LBRV))
lbrv = false;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 01be93a53d07..c1e4e9a0c6d7 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -50,6 +50,7 @@ extern int vgif;
extern bool intercept_smi;
extern bool vnmi;
extern int lbrv;
+extern bool pml;
extern int tsc_aux_uret_slot __ro_after_init;
@@ -718,6 +719,8 @@ static inline void svm_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
svm_set_intercept_for_msr(vcpu, msr, type, true);
}
+void svm_update_cpu_dirty_logging(struct kvm_vcpu *vcpu, bool enable);
+
/* nested.c */
#define NESTED_EXIT_HOST 0 /* Exit handled on host level */
--
2.48.1
* [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
` (6 preceding siblings ...)
2026-01-05 6:36 ` [PATCH v5 7/8] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
@ 2026-01-05 6:36 ` Nikunj A Dadhania
2026-01-14 11:36 ` Huang, Kai
7 siblings, 1 reply; 22+ messages in thread
From: Nikunj A Dadhania @ 2026-01-05 6:36 UTC (permalink / raw)
To: seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj,
kai.huang
Add a KVM selftest to verify Page Modification Logging (PML) functionality
with AMD SEV/SEV-ES/SEV-SNP guests. The test validates that
hardware-assisted dirty page tracking works correctly across different SEV
guest types.
Test methodology:
- Create SEV guest with additional memory slot for dirty logging
- Guest continuously writes to random pages within the test memory region
- Host periodically retrieves dirty log bitmap via KVM_GET_DIRTY_LOG
- Verify dirty pages match actual guest writes
Introduce vm_sev_create_with_one_vcpu_extramem() to allow specifying extra
memory pages during VM creation.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
tools/testing/selftests/kvm/Makefile.kvm | 1 +
tools/testing/selftests/kvm/include/x86/sev.h | 4 +
tools/testing/selftests/kvm/lib/x86/sev.c | 18 +-
.../testing/selftests/kvm/x86/sev_pml_test.c | 203 ++++++++++++++++++
4 files changed, 223 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86/sev_pml_test.c
diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index ba5c2b643efa..746c79713c8d 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -134,6 +134,7 @@ TEST_GEN_PROGS_x86 += x86/xen_vmcall_test
TEST_GEN_PROGS_x86 += x86/sev_init2_tests
TEST_GEN_PROGS_x86 += x86/sev_migrate_tests
TEST_GEN_PROGS_x86 += x86/sev_smoke_test
+TEST_GEN_PROGS_x86 += x86/sev_pml_test
TEST_GEN_PROGS_x86 += x86/amx_test
TEST_GEN_PROGS_x86 += x86/max_vcpuid_cap_test
TEST_GEN_PROGS_x86 += x86/triple_fault_event_test
diff --git a/tools/testing/selftests/kvm/include/x86/sev.h b/tools/testing/selftests/kvm/include/x86/sev.h
index 008b4169f5e2..b06583b91447 100644
--- a/tools/testing/selftests/kvm/include/x86/sev.h
+++ b/tools/testing/selftests/kvm/include/x86/sev.h
@@ -53,8 +53,12 @@ void snp_vm_launch_start(struct kvm_vm *vm, uint64_t policy);
void snp_vm_launch_update(struct kvm_vm *vm);
void snp_vm_launch_finish(struct kvm_vm *vm);
+struct kvm_vm *_vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
+ struct kvm_vcpu **cpu, uint64_t npages);
struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
struct kvm_vcpu **cpu);
+struct kvm_vm *vm_sev_create_with_one_vcpu_extramem(uint32_t type, void *guest_code,
+ struct kvm_vcpu **cpu, uint64_t npages);
void vm_sev_launch(struct kvm_vm *vm, uint64_t policy, uint8_t *measurement);
kvm_static_assert(SEV_RET_SUCCESS == 0);
diff --git a/tools/testing/selftests/kvm/lib/x86/sev.c b/tools/testing/selftests/kvm/lib/x86/sev.c
index c3a9838f4806..20d67d01c997 100644
--- a/tools/testing/selftests/kvm/lib/x86/sev.c
+++ b/tools/testing/selftests/kvm/lib/x86/sev.c
@@ -158,8 +158,8 @@ void snp_vm_launch_finish(struct kvm_vm *vm)
vm_sev_ioctl(vm, KVM_SEV_SNP_LAUNCH_FINISH, &launch_finish);
}
-struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
- struct kvm_vcpu **cpu)
+struct kvm_vm *_vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
+ struct kvm_vcpu **cpu, uint64_t npages)
{
struct vm_shape shape = {
.mode = VM_MODE_DEFAULT,
@@ -168,12 +168,24 @@ struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
struct kvm_vm *vm;
struct kvm_vcpu *cpus[1];
- vm = __vm_create_with_vcpus(shape, 1, 0, guest_code, cpus);
+ vm = __vm_create_with_vcpus(shape, 1, npages, guest_code, cpus);
*cpu = cpus[0];
return vm;
}
+struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code,
+ struct kvm_vcpu **cpu)
+{
+ return _vm_sev_create_with_one_vcpu(type, guest_code, cpu, 0);
+}
+
+struct kvm_vm *vm_sev_create_with_one_vcpu_extramem(uint32_t type, void *guest_code,
+ struct kvm_vcpu **cpu, uint64_t npages)
+{
+ return _vm_sev_create_with_one_vcpu(type, guest_code, cpu, npages);
+}
+
void vm_sev_launch(struct kvm_vm *vm, uint64_t policy, uint8_t *measurement)
{
if (is_sev_snp_vm(vm)) {
diff --git a/tools/testing/selftests/kvm/x86/sev_pml_test.c b/tools/testing/selftests/kvm/x86/sev_pml_test.c
new file mode 100644
index 000000000000..b1114f5a67f8
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86/sev_pml_test.c
@@ -0,0 +1,203 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/bitmap.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "sev.h"
+
+#define GUEST_NR_PAGES (1024)
+#define DEFAULT_GUEST_TEST_MEM 0xC0000000
+#define TEST_MEM_SLOT_INDEX 1
+
+/*
+ * Guest/Host shared variables.
+ */
+static uint64_t guest_page_size;
+static uint64_t guest_num_pages;
+
+/* Points to the test VM memory region on which we track dirty logs */
+static void *host_test_mem;
+
+/* Host variables */
+static pthread_t vcpu_thread;
+static bool vcpu_thread_done;
+
+/*
+ * Guest physical memory offset of the testing memory slot.
+ * This will be set to the topmost valid physical address minus
+ * the test memory size.
+ */
+static uint64_t guest_test_phys_mem;
+
+/*
+ * Guest virtual memory offset of the testing memory slot.
+ * Must not conflict with identity mapped test code.
+ */
+static uint64_t guest_test_virt_mem = DEFAULT_GUEST_TEST_MEM;
+
+/*
+ * Continuously write to the first 8 bytes of random pages within
+ * the testing memory region.
+ */
+static void guest_pml_code(void)
+{
+ uint64_t addr;
+ int write = 0;
+
+ while (write++ != (guest_num_pages * 10)) {
+ addr = guest_test_virt_mem;
+ addr += (guest_random_u64(&guest_rng) % guest_num_pages) * guest_page_size;
+
+ vcpu_arch_put_guest(*(uint64_t *)addr, 0xAA);
+ }
+}
+
+static void guest_pml_sev_code(void)
+{
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
+
+ guest_pml_code();
+
+ GUEST_DONE();
+}
+
+static void guest_pml_sev_es_code(void)
+{
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ES_ENABLED);
+
+ guest_pml_code();
+
+ wrmsr(MSR_AMD64_SEV_ES_GHCB, GHCB_MSR_TERM_REQ);
+ vmgexit();
+}
+
+static void guest_pml_sev_snp_code(void)
+{
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ES_ENABLED);
+ GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_SNP_ENABLED);
+
+ guest_pml_code();
+
+ wrmsr(MSR_AMD64_SEV_ES_GHCB, GHCB_MSR_TERM_REQ);
+ vmgexit();
+}
+
+static unsigned long *bmap;
+static void *vcpu_worker(void *data)
+{
+ struct kvm_vcpu *vcpu = data;
+ struct kvm_vm *vm;
+ struct ucall uc;
+
+ vm = vcpu->vm;
+ while (1) {
+ /* Let the guest dirty the random pages */
+ vcpu_run(vcpu);
+
+ if (is_sev_es_vm(vm)) {
+ TEST_ASSERT(vcpu->run->exit_reason == KVM_EXIT_SYSTEM_EVENT,
+ "Wanted SYSTEM_EVENT, got %s",
+ exit_reason_str(vcpu->run->exit_reason));
+ TEST_ASSERT_EQ(vcpu->run->system_event.type, KVM_SYSTEM_EVENT_SEV_TERM);
+ TEST_ASSERT_EQ(vcpu->run->system_event.ndata, 1);
+ TEST_ASSERT_EQ(vcpu->run->system_event.data[0], GHCB_MSR_TERM_REQ);
+ break;
+ }
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_SYNC:
+ continue;
+ case UCALL_DONE:
+ goto exit_done;
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ default:
+ TEST_FAIL("Unexpected exit: %s", exit_reason_str(vcpu->run->exit_reason));
+ }
+ }
+
+exit_done:
+ WRITE_ONCE(vcpu_thread_done, true);
+ return NULL;
+}
+
+static void vm_dirty_log_verify(void)
+{
+ uint64_t page, nr_dirty_pages = 0, nr_clean_pages = 0;
+
+ for (page = 0; page < guest_num_pages; page++) {
+ uint64_t val = *(uint64_t *)(host_test_mem + page * guest_page_size);
+ bool bmap_dirty = __test_and_clear_bit(page, bmap);
+
+ if (bmap_dirty && val == 0xAA)
+ nr_dirty_pages++;
+ else
+ nr_clean_pages++;
+ }
+ pr_debug("Dirty pages %ld clean pages %ld\n", nr_dirty_pages, nr_clean_pages);
+}
+
+void test_pml(void *guest_code, uint32_t type, uint64_t policy)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = vm_sev_create_with_one_vcpu_extramem(type, guest_code, &vcpu, 2 * GUEST_NR_PAGES);
+
+ guest_page_size = vm->page_size;
+ guest_num_pages = GUEST_NR_PAGES;
+ guest_test_phys_mem = (vm->max_gfn - guest_num_pages) * guest_page_size;
+
+ bmap = bitmap_zalloc(guest_num_pages);
+
+ /* Add an extra memory slot for testing dirty logging */
+ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
+ guest_test_phys_mem,
+ TEST_MEM_SLOT_INDEX,
+ guest_num_pages,
+ KVM_MEM_LOG_DIRTY_PAGES);
+
+ /* Do mapping for the dirty track memory slot */
+ virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages);
+ host_test_mem = addr_gpa2hva(vm, (vm_paddr_t)guest_test_phys_mem);
+
+ /* Export the shared variables to the guest */
+ sync_global_to_guest(vm, guest_page_size);
+ sync_global_to_guest(vm, guest_test_virt_mem);
+ sync_global_to_guest(vm, guest_num_pages);
+
+ WRITE_ONCE(vcpu_thread_done, false);
+ vm_sev_launch(vm, policy, NULL);
+
+ pthread_create(&vcpu_thread, NULL, vcpu_worker, vcpu);
+ while (!READ_ONCE(vcpu_thread_done)) {
+ usleep(1000);
+ kvm_vm_get_dirty_log(vcpu->vm, TEST_MEM_SLOT_INDEX, bmap);
+ }
+ pthread_join(vcpu_thread, NULL);
+
+ vm_dirty_log_verify();
+ free(bmap);
+
+ kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+ TEST_REQUIRE(get_kvm_amd_param_bool("pml"));
+
+ if (kvm_cpu_has(X86_FEATURE_SEV))
+ test_pml(guest_pml_sev_code, KVM_X86_SEV_VM, SEV_POLICY_NO_DBG);
+
+ if (kvm_cpu_has(X86_FEATURE_SEV_ES))
+ test_pml(guest_pml_sev_es_code, KVM_X86_SEV_ES_VM,
+ SEV_POLICY_ES | SEV_POLICY_NO_DBG);
+
+ if (kvm_cpu_has(X86_FEATURE_SEV_SNP))
+ test_pml(guest_pml_sev_snp_code, KVM_X86_SNP_VM,
+ snp_default_policy() | SNP_POLICY_DBG);
+
+ return 0;
+}
--
2.48.1
* Re: [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks
2026-01-05 6:36 ` [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
@ 2026-01-05 6:49 ` Gupta, Pankaj
0 siblings, 0 replies; 22+ messages in thread
From: Gupta, Pankaj @ 2026-01-05 6:49 UTC (permalink / raw)
To: Nikunj A Dadhania, seanjc, pbonzini
Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins,
kai.huang
> Replace the enable_pml check with cpu_dirty_log_size in VMX PML code
> to determine whether PML is enabled on a per-VM basis. The enable_pml
> module parameter is a global setting that doesn't reflect per-VM
> capabilities, whereas cpu_dirty_log_size accurately indicates whether
> a specific VM has PML enabled.
>
> For example, TDX VMs don't yet support PML. Using cpu_dirty_log_size
> ensures the check correctly reflects this, while enable_pml would
> incorrectly indicate PML is available.
>
> This also improves consistency with kvm_mmu_update_cpu_dirty_logging(),
> which already uses cpu_dirty_log_size to determine PML enablement.
>
> Suggested-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index bd244b46068f..91e3cd30a147 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8242,7 +8242,7 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> - if (WARN_ON_ONCE(!enable_pml))
> + if (WARN_ON_ONCE(!vcpu->kvm->arch.cpu_dirty_log_size))
> return;
>
> if (is_guest_mode(vcpu)) {
* Re: [PATCH v5 1/8] KVM: x86: Carve out PML flush routine
2026-01-05 6:36 ` [PATCH v5 1/8] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
@ 2026-01-12 10:02 ` Huang, Kai
2026-01-14 13:57 ` Nikunj A. Dadhania
0 siblings, 1 reply; 22+ messages in thread
From: Huang, Kai @ 2026-01-12 10:02 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> Move the PML (Page Modification Logging) buffer flushing logic from
> VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
> code duplication.
>
> The AMD SVM PML implementation shares the same behavior as VMX PML:
> 1) The PML buffer is a 4K page with 512 entries
> 2) Hardware records dirty GPAs in reverse order (from index 511 to 0)
> 3) Hardware clears bits 11:0 when recording GPAs
>
> The PML constants (PML_LOG_NR_ENTRIES and PML_HEAD_INDEX) are moved from
> vmx.h to x86.h to make them available to both VMX and SVM.
Nit:
If a new version is needed, you can use imperative mode for the above
paragraph:
Move PML constants (...) from vmx.h to x86.h to ...
Or IMHO you can just remove this paragraph, because the new
kvm_flush_pml_buffer() in x86.c uses both PML constants so the move is
implied actually.
>
> No functional change intended for VMX, except tone down the WARN_ON() to
> WARN_ON_ONCE() for the page alignment check. If hardware exhibits this
> behavior once, it's likely to occur repeatedly, so use WARN_ON_ONCE() to
> avoid log flooding while still capturing the unexpected condition.
>
> The refactoring prepares for SVM to leverage the same PML flushing
> implementation.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
* Re: [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure
2026-01-05 6:36 ` [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
@ 2026-01-12 10:07 ` Huang, Kai
0 siblings, 0 replies; 22+ messages in thread
From: Huang, Kai @ 2026-01-12 10:07 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> Move the PML page pointer from VMX-specific vcpu_vmx structure to the
> common kvm_vcpu_arch structure to enable sharing between VMX and SVM
> implementations. Only the page pointer is moved to x86 common code while
> keeping allocation logic vendor-specific, since AMD requires
> snp_safe_alloc_page() for PML buffer allocation.
>
> Update all VMX references accordingly, and simplify the
> kvm_flush_pml_buffer() interface by removing the page parameter since it
> can now access the page directly from the vcpu structure.
>
> No functional change, restructuring to prepare for SVM PML support.
>
> Suggested-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
Reviewed-by: Kai Huang <kai.huang@intel.com>
* Re: [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code
2026-01-05 6:36 ` [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code Nikunj A Dadhania
@ 2026-01-12 10:08 ` Huang, Kai
0 siblings, 0 replies; 22+ messages in thread
From: Huang, Kai @ 2026-01-12 10:08 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> From: Kai Huang <kai.huang@intel.com>
>
> Move nested PML dirty logging update logic from VMX-specific code to common
> x86 infrastructure. Both VMX and SVM share identical logic: defer CPU dirty
> logging updates when running in L2, then process pending updates when
> exiting to L1.
>
> No functional change.
>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Co-developed-by: Nikunj A Dadhania <nikunj@amd.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
If it helps/is needed:
Reviewed-by: Kai Huang <kai.huang@intel.com>
* Re: [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-05 6:36 ` [PATCH v5 7/8] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
@ 2026-01-12 10:24 ` Huang, Kai
2026-01-14 14:03 ` Nikunj A. Dadhania
2026-01-14 22:48 ` Huang, Kai
1 sibling, 1 reply; 22+ messages in thread
From: Huang, Kai @ 2026-01-12 10:24 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> Currently, dirty logging relies on write protecting guest memory and
> marking dirty GFNs during subsequent write faults. This method works but
> incurs overhead due to additional write faults for each dirty GFN.
>
> Implement support for the Page Modification Logging (PML) feature, a
> hardware-assisted method for efficient dirty logging. PML automatically
> logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
> VMCB fields are utilized: PML_ADDR and PML_INDEX. The PML_INDEX is
> initialized to 511 (8 bytes per GPA entry), and the CPU decreases the
> PML_INDEX after logging each GPA. When the PML buffer is full, a
> VMEXIT(PML_FULL) with exit code 0x407 is generated.
>
> Disable PML for nested guests.
>
> PML is enabled by default when supported and can be disabled via the 'pml'
> module parameter.
Nit:
If a new version is needed, use imperative mode:
Add a new module parameter to enable/disable PML, and enable it by
default when supported.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
It's a bit weird for me to review, but I did anyway and it seems fine to
me, so:
Acked-by: Kai Huang <kai.huang@intel.com>
One minor thing below ...
[...]
> @@ -748,12 +748,19 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
> V_NMI_BLOCKING_MASK);
> }
>
> - /* Copied from vmcb01. msrpm_base can be overwritten later. */
> + /* Copied from vmcb01. msrpm_base/nested_ctl can be overwritten later. */
> vmcb02->control.nested_ctl = vmcb01->control.nested_ctl;
> vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
> vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
> vmcb_mark_dirty(vmcb02, VMCB_PERM_MAP);
>
> + /* Disable PML for nested guest as the A/D update is emulated by MMU */
This comment isn't accurate to me. I think the key reason is that, for L2,
if PML were enabled the recorded GPAs would be L2 GPAs, not L1 GPAs.
Please update the comment if a new version is needed?
> + if (pml) {
> + vmcb02->control.nested_ctl &= ~SVM_NESTED_CTL_PML_ENABLE;
> + vmcb02->control.pml_addr = 0;
> + vmcb02->control.pml_index = -1;
> + }
> +
>
* Re: [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test
2026-01-05 6:36 ` [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test Nikunj A Dadhania
@ 2026-01-14 11:36 ` Huang, Kai
2026-01-14 14:27 ` Nikunj A. Dadhania
0 siblings, 1 reply; 22+ messages in thread
From: Huang, Kai @ 2026-01-14 11:36 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> Add a KVM selftest to verify Page Modification Logging (PML) functionality
> with AMD SEV/SEV-ES/SEV-SNP guests. The test validates that
> hardware-assisted dirty page tracking works correctly across different SEV
> guest types.
Hi Nikunj,
Perhaps a dumb question -- why do we need a specific selftest case for SEV*
guests?
In terms of GPA logging, my understanding is there's no difference between
normal AMD SVM guests and SEV* guests from hardware's point of view. So
if PML works for normal AMD SVM guests, it should work for SEV* guests,
no?
FWIW, a more reasonable selftest case is that we probably need an AMD version
of vmx_dirty_log_test.c[*], which verifies PML is indeed not enabled when L2
runs.
[*]: see commit 094444204570 ("selftests: kvm: add test for dirty logging
inside nested guests").
* Re: [PATCH v5 1/8] KVM: x86: Carve out PML flush routine
2026-01-12 10:02 ` Huang, Kai
@ 2026-01-14 13:57 ` Nikunj A. Dadhania
0 siblings, 0 replies; 22+ messages in thread
From: Nikunj A. Dadhania @ 2026-01-14 13:57 UTC (permalink / raw)
To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On 1/12/2026 3:32 PM, Huang, Kai wrote:
> On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
>> Move the PML (Page Modification Logging) buffer flushing logic from
>> VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
>> code duplication.
>>
>> The AMD SVM PML implementation shares the same behavior as VMX PML:
>> 1) The PML buffer is a 4K page with 512 entries
>> 2) Hardware records dirty GPAs in reverse order (from index 511 to 0)
>> 3) Hardware clears bits 11:0 when recording GPAs
>>
>> The PML constants (PML_LOG_NR_ENTRIES and PML_HEAD_INDEX) are moved from
>> vmx.h to x86.h to make them available to both VMX and SVM.
>
> Nit:
>
> If a new version is needed, you can use imperative mode for the above
> paragraph:
>
> Move PML constants (...) from vmx.h to x86.h to ...
>
> Or IMHO you can just remove this paragraph, because the new
> kvm_flush_pml_buffer() in x86.c uses both PML constants so the move is
> implied actually.
Sure, will remove the paragraph as it is implied.
>
>>
>> No functional change intended for VMX, except tone down the WARN_ON() to
>> WARN_ON_ONCE() for the page alignment check. If hardware exhibits this
>> behavior once, it's likely to occur repeatedly, so use WARN_ON_ONCE() to
>> avoid log flooding while still capturing the unexpected condition.
>>
>> The refactoring prepares for SVM to leverage the same PML flushing
>> implementation.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
Thanks
Nikunj
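For readers following along, the flush behavior described in the commit message above (512-entry 4K buffer, filled in reverse from index 511, bits 11:0 cleared by hardware) can be sketched as a small user-space C model. This is an illustrative simulation, not the kernel's kvm_flush_pml_buffer(); the helper names and the counter are invented for demonstration.

```c
#include <stdint.h>
#include <string.h>

#define PML_LOG_NR_ENTRIES 512                       /* 4K buffer / 8-byte entries */
#define PML_HEAD_INDEX     (PML_LOG_NR_ENTRIES - 1)  /* hardware starts at 511 */

/* Demonstration stand-in for KVM's dirty tracking: just count the GPAs. */
unsigned int flushed_gpas;

void mark_gpa_dirty(uint64_t gpa)
{
    flushed_gpas++;                                  /* a real flush marks the GFN dirty */
}

/*
 * Flush the valid entries of a PML buffer. Hardware fills the buffer in
 * reverse order from index 511 and leaves 'pml_index' pointing one below
 * the last entry it wrote, so the valid entries occupy [pml_index + 1, 511].
 * An out-of-range index (0xffff after underflow) means the whole buffer is
 * valid; an index of 511 means the buffer is empty.
 */
void flush_pml_buffer(const uint64_t *pml_buf, uint16_t pml_index)
{
    unsigned int i;

    if (pml_index == PML_HEAD_INDEX)
        return;                                      /* nothing logged */

    i = (pml_index >= PML_LOG_NR_ENTRIES) ? 0 : pml_index + 1;

    for (; i < PML_LOG_NR_ENTRIES; i++)
        mark_gpa_dirty(pml_buf[i] & ~0xfffULL);      /* bits 11:0 cleared by HW */
}
```

The reverse iteration order does not matter for correctness here, since every valid entry is handed to the dirty tracker exactly once regardless of traversal direction.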
* Re: [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-12 10:24 ` Huang, Kai
@ 2026-01-14 14:03 ` Nikunj A. Dadhania
2026-01-14 23:10 ` Huang, Kai
0 siblings, 1 reply; 22+ messages in thread
From: Nikunj A. Dadhania @ 2026-01-14 14:03 UTC (permalink / raw)
To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On 1/12/2026 3:54 PM, Huang, Kai wrote:
> On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
>> Currently, dirty logging relies on write protecting guest memory and
>> marking dirty GFNs during subsequent write faults. This method works but
>> incurs overhead due to additional write faults for each dirty GFN.
>>
>> Implement support for the Page Modification Logging (PML) feature, a
>> hardware-assisted method for efficient dirty logging. PML automatically
>> logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
>> VMCB fields are utilized: PML_ADDR and PML_INDEX. The PML_INDEX is
>> initialized to 511 (8 bytes per GPA entry), and the CPU decreases the
>> PML_INDEX after logging each GPA. When the PML buffer is full, a
>> VMEXIT(PML_FULL) with exit code 0x407 is generated.
>>
>> Disable PML for nested guests.
>>
>> PML is enabled by default when supported and can be disabled via the 'pml'
>> module parameter.
>
> Nit:
>
> If a new version is needed, use imperative mode:
>
> Add a new module parameter to enable/disable PML, and enable it by
> default when supported.
Ack
>
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>
> It's a bit weird for me to review, but I did anyway and it seems fine to
> me, so:
Thank you for taking the time to review the patches and for the detailed feedback
throughout this series. Your insights have been very helpful.
>
> Acked-by: Kai Huang <kai.huang@intel.com>
>
> One minor thing below ...
>
> [...]
>
>> @@ -748,12 +748,19 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>> V_NMI_BLOCKING_MASK);
>> }
>>
>> - /* Copied from vmcb01. msrpm_base can be overwritten later. */
>> + /* Copied from vmcb01. msrpm_base/nested_ctl can be overwritten later. */
>> vmcb02->control.nested_ctl = vmcb01->control.nested_ctl;
>> vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
>> vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
>> vmcb_mark_dirty(vmcb02, VMCB_PERM_MAP);
>>
>> + /* Disable PML for nested guest as the A/D update is emulated by MMU */
>
> This comment isn't accurate to me. I think the key reason is, for L2 if
> PML enabled the recorded GPA will be L2's GPA, but not L1's.
>
> Please update the comment if a new version is needed?
How about the below:
+ /*
+ * Disable PML for nested guests. When L2 runs with PML enabled, the
+ * CPU logs L2 GPAs rather than L1 GPAs, breaking dirty page tracking
+ * for the L0 hypervisor.
+ */
Regards
Nikunj
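The PML_INDEX mechanics quoted in this message (index initialized to 511, decremented after each logged GPA, VMEXIT(PML_FULL) with exit code 0x407 when the buffer fills) can be illustrated with a toy model of the hardware side. This is a hedged sketch: the struct and function names are invented for illustration and only the constants come from the commit message.

```c
#include <stdint.h>

#define PML_LOG_NR_ENTRIES 512
#define SVM_EXIT_PML_FULL  0x407        /* exit code from the commit message */

struct pml_state {
    uint64_t buf[PML_LOG_NR_ENTRIES];   /* the 4K log buffer (PML_ADDR)   */
    uint16_t index;                     /* PML_INDEX, set to 511 by the VMM */
};

void pml_reset(struct pml_state *s)
{
    s->index = PML_LOG_NR_ENTRIES - 1;
}

/*
 * Model one hardware logging event: when the CPU sets an NPT D-bit it
 * records GPA[51:12] at buf[index] and decrements the index. Returns
 * SVM_EXIT_PML_FULL once the last free slot (index 0) has been consumed,
 * 0 otherwise.
 */
int pml_log_gpa(struct pml_state *s, uint64_t gpa)
{
    s->buf[s->index] = gpa & ~0xfffULL; /* bits 11:0 are cleared on record */
    if (s->index == 0)
        return SVM_EXIT_PML_FULL;       /* buffer full -> VMEXIT */
    s->index--;
    return 0;
}
```

On a PML_FULL exit the hypervisor flushes all 512 entries, resets the index to 511, and resumes the guest.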
* Re: [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test
2026-01-14 11:36 ` Huang, Kai
@ 2026-01-14 14:27 ` Nikunj A. Dadhania
2026-01-14 22:44 ` Huang, Kai
0 siblings, 1 reply; 22+ messages in thread
From: Nikunj A. Dadhania @ 2026-01-14 14:27 UTC (permalink / raw)
To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com, yosry.ahmed
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On 1/14/2026 5:06 PM, Huang, Kai wrote:
> On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
>> Add a KVM selftest to verify Page Modification Logging (PML) functionality
>> with AMD SEV/SEV-ES/SEV-SNP guests. The test validates that
>> hardware-assisted dirty page tracking works correctly across different SEV
>> guest types.
>
> Hi Nikunj,
>
> Perhaps a dumb question -- Why do we need a specific selftest case for
> SEV* guests?
Since you had asked to check whether SEV supports PML, I wrote this test
and decided to send it as part of this series.
> In terms of GPA logging, my understanding is there's no difference between
> normal AMD SVM guests and SEV* guests from hardware's point of view. So
> if PML works for normal AMD SVM guests, it should work for SEV* guests,
> no?
Yes, that is correct.
>
> FWIW, a more reasonable selftest case would be an AMD version of
> vmx_dirty_log_test.c[*], which verifies that PML is indeed not enabled
> when L2 runs.
Yosry has added the support here [1].
>
> [*]: see commit 094444204570 ("selftests: kvm: add test for dirty logging
> inside nested guests").
Regards
Nikunj
1. https://lore.kernel.org/kvm/20251230230150.4150236-19-seanjc@google.com/
* Re: [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test
2026-01-14 14:27 ` Nikunj A. Dadhania
@ 2026-01-14 22:44 ` Huang, Kai
0 siblings, 0 replies; 22+ messages in thread
From: Huang, Kai @ 2026-01-14 22:44 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com,
yosry.ahmed@linux.dev
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Wed, 2026-01-14 at 19:57 +0530, Nikunj A. Dadhania wrote:
>
> On 1/14/2026 5:06 PM, Huang, Kai wrote:
> > On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> > > Add a KVM selftest to verify Page Modification Logging (PML) functionality
> > > with AMD SEV/SEV-ES/SEV-SNP guests. The test validates that
> > > hardware-assisted dirty page tracking works correctly across different SEV
> > > guest types.
> >
> > Hi Nikunj,
> >
> > Perhaps a dumb question -- Why do we need a specific selftest case for
> > SEV* guests?
>
> As there was a request from you to check if SEV supports PML,
> I wrote this test and thought to send it as part of this series.
Oh, by that I only meant to ask whether PML works for SEV* at the hardware
level, not to ask for a selftest to confirm it. Sorry about the confusion :-(
I had this question because TDX guests don't support PML, and we needed to
only set kvm->arch.cpu_dirty_log_size for VMX guests.
Since your patch (IIRC) always sets cpu_dirty_log_size for all AMD VMs (in
svm_vm_init()):
+ if (pml)
+ kvm->arch.cpu_dirty_log_size = PML_LOG_NR_ENTRIES;
+
I thought it would be better to check whether SEV* guests work with PML at
the hardware level; otherwise we would need TDX-like treatment on the AMD
side, i.e., clearing cpu_dirty_log_size for SEV* guests.
>
> > In terms of GPA logging, my understanding is there's no difference between
> > normal AMD SVM guests and SEV* guests from hardware's point of view. So
> > if PML works for normal AMD SVM guests, it should work for SEV* guests,
> > no?
>
> Yes, that is correct.
>
> >
> > FWIW, a more reasonable selftest case would be an AMD version of
> > vmx_dirty_log_test.c[*], which verifies that PML is indeed not enabled
> > when L2 runs.
>
> Yosry has added the support here [1].
Thanks for the info.
Then I don't think we need this new selftest case specifically for SEV*
(assuming [1] will be merged, if it has not been done already).
Again, sorry for the confusion.
* Re: [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-05 6:36 ` [PATCH v5 7/8] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
2026-01-12 10:24 ` Huang, Kai
@ 2026-01-14 22:48 ` Huang, Kai
2026-01-16 4:12 ` Nikunj A. Dadhania
1 sibling, 1 reply; 22+ messages in thread
From: Huang, Kai @ 2026-01-14 22:48 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
> Currently, dirty logging relies on write protecting guest memory and
> marking dirty GFNs during subsequent write faults. This method works but
> incurs overhead due to additional write faults for each dirty GFN.
>
> Implement support for the Page Modification Logging (PML) feature, a
> hardware-assisted method for efficient dirty logging. PML automatically
> logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
> VMCB fields are utilized: PML_ADDR and PML_INDEX. The PML_INDEX is
> initialized to 511 (8 bytes per GPA entry), and the CPU decreases the
> PML_INDEX after logging each GPA. When the PML buffer is full, a
> VMEXIT(PML_FULL) with exit code 0x407 is generated.
>
>
Could you add a sentence or two clarifying that PML works for SEV* guests
at the hardware level in the same way as for normal SVM guests?
That would justify code changes like always setting cpu_dirty_log_size for
all AMD VMs when PML is enabled in svm_vm_init() (IIRC); otherwise you
would need code to make sure it's cleared for SEV* guests.
* Re: [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-14 14:03 ` Nikunj A. Dadhania
@ 2026-01-14 23:10 ` Huang, Kai
0 siblings, 0 replies; 22+ messages in thread
From: Huang, Kai @ 2026-01-14 23:10 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On Wed, 2026-01-14 at 19:33 +0530, Nikunj A. Dadhania wrote:
> > > @@ -748,12 +748,19 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
> > > V_NMI_BLOCKING_MASK);
> > > }
> > >
> > > - /* Copied from vmcb01. msrpm_base can be overwritten later. */
> > > + /* Copied from vmcb01. msrpm_base/nested_ctl can be overwritten later. */
> > > vmcb02->control.nested_ctl = vmcb01->control.nested_ctl;
> > > vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
> > > vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
> > > vmcb_mark_dirty(vmcb02, VMCB_PERM_MAP);
> > >
> > > + /* Disable PML for nested guest as the A/D update is emulated by MMU */
> >
> > This comment isn't accurate to me. I think the key reason is, for L2 if
> > PML enabled the recorded GPA will be L2's GPA, but not L1's.
> >
> > Please update the comment if a new version is needed?
>
> How about the below:
>
> + /*
> + * Disable PML for nested guests. When L2 runs with PML enabled, the
> + * CPU logs L2 GPAs rather than L1 GPAs, breaking dirty page tracking
> + * for the L0 hypervisor.
> + */
LGTM.
* Re: [PATCH v5 7/8] KVM: SVM: Add Page modification logging support
2026-01-14 22:48 ` Huang, Kai
@ 2026-01-16 4:12 ` Nikunj A. Dadhania
0 siblings, 0 replies; 22+ messages in thread
From: Nikunj A. Dadhania @ 2026-01-16 4:12 UTC (permalink / raw)
To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de
On 1/15/2026 4:18 AM, Huang, Kai wrote:
> On Mon, 2026-01-05 at 06:36 +0000, Nikunj A Dadhania wrote:
>> Currently, dirty logging relies on write protecting guest memory and
>> marking dirty GFNs during subsequent write faults. This method works but
>> incurs overhead due to additional write faults for each dirty GFN.
>>
>> Implement support for the Page Modification Logging (PML) feature, a
>> hardware-assisted method for efficient dirty logging. PML automatically
>> logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
>> VMCB fields are utilized: PML_ADDR and PML_INDEX. The PML_INDEX is
>> initialized to 511 (8 bytes per GPA entry), and the CPU decreases the
>> PML_INDEX after logging each GPA. When the PML buffer is full, a
>> VMEXIT(PML_FULL) with exit code 0x407 is generated.
>>
>>
>
> Could you add a sentence or two clarifying that PML works for SEV* guests
> at the hardware level in the same way as for normal SVM guests?
>
> That would justify code changes like always setting cpu_dirty_log_size for
> all AMD VMs when PML is enabled in svm_vm_init() (IIRC); otherwise you
> would need code to make sure it's cleared for SEV* guests.
Ack.
end of thread, other threads:[~2026-01-16 4:12 UTC | newest]
Thread overview: 22+ messages
2026-01-05 6:36 [PATCH v5 0/8] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 1/8] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
2026-01-12 10:02 ` Huang, Kai
2026-01-14 13:57 ` Nikunj A. Dadhania
2026-01-05 6:36 ` [PATCH v5 2/8] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
2026-01-12 10:07 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 3/8] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
2026-01-05 6:49 ` Gupta, Pankaj
2026-01-05 6:36 ` [PATCH v5 4/8] KVM: x86: Move nested CPU dirty logging logic to common code Nikunj A Dadhania
2026-01-12 10:08 ` Huang, Kai
2026-01-05 6:36 ` [PATCH v5 5/8] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 6/8] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions Nikunj A Dadhania
2026-01-05 6:36 ` [PATCH v5 7/8] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
2026-01-12 10:24 ` Huang, Kai
2026-01-14 14:03 ` Nikunj A. Dadhania
2026-01-14 23:10 ` Huang, Kai
2026-01-14 22:48 ` Huang, Kai
2026-01-16 4:12 ` Nikunj A. Dadhania
2026-01-05 6:36 ` [PATCH v5 8/8] selftests: KVM: x86: Add SEV PML dirty logging test Nikunj A Dadhania
2026-01-14 11:36 ` Huang, Kai
2026-01-14 14:27 ` Nikunj A. Dadhania
2026-01-14 22:44 ` Huang, Kai