* [RFC PATCH 0/4] KVM: SVM: Add Page Modification Logging (PML) support
@ 2025-08-25 15:20 Nikunj A Dadhania
  2025-08-25 15:20 ` [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Nikunj A Dadhania @ 2025-08-25 15:20 UTC (permalink / raw)
  To: seanjc, pbonzini
  Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj

This series implements Page Modification Logging (PML) for SVM guests,
bringing hardware-assisted dirty logging support. PML tracks guest-modified
memory pages, enabling the hypervisor to identify which pages in a guest's
memory have been modified since the last checkpoint or during live
migration.

The PML feature uses two new VMCB fields (PML_ADDR and PML_INDEX) and
generates a VMEXIT when the 4KB log buffer becomes full.
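
For reference, the shared flush logic carved out in patch 1 boils down to
roughly the following (a simplified sketch, not the exact patch code):

	/*
	 * Simplified sketch of the common flush helper: the CPU logs entry
	 * 511 first and decrements the index after each GPA, so entries
	 * [idx + 1 .. 511] hold the logged GPAs; idx == 511 means the log
	 * is empty, idx >= 512 means it just overflowed.
	 */
	static void flush_pml_sketch(struct kvm_vcpu *vcpu, struct page *pml_page, u16 idx)
	{
		u64 *pml_buf = page_address(pml_page);
		u16 tail = (idx >= PML_LOG_NR_ENTRIES) ? 0 : idx + 1;
		int i;

		for (i = PML_HEAD_INDEX; i >= tail; i--)
			kvm_vcpu_mark_page_dirty(vcpu, pml_buf[i] >> PAGE_SHIFT);
	}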

Patch breakdown:
1. Refactor the existing VMX PML flush code to be shared between VMX and SVM
2. Skip the SNP-safe allocation workaround when HvInUseWrAllowed is supported
3. Add the AMD PML CPUID feature bit
4. Implement SVM PML support using the shared infrastructure

The feature is enabled by default when hardware support is detected and
can be disabled via the 'pml' module parameter.
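
For example, assuming the standard kvm_amd module name, that would be:

	# disable PML when loading the module
	modprobe kvm_amd pml=0

	# or on the kernel command line
	kvm_amd.pml=0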


Nikunj A Dadhania (4):
  KVM: x86: Carve out PML flush routine
  KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported
  x86/cpufeatures: Add Page modification logging
  KVM: SVM: Add Page modification logging support

 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/svm.h         |  6 +-
 arch/x86/include/uapi/asm/svm.h    |  2 +
 arch/x86/kernel/cpu/scattered.c    |  1 +
 arch/x86/kvm/svm/sev.c             |  3 +-
 arch/x86/kvm/svm/svm.c             | 99 +++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.h             |  4 ++
 arch/x86/kvm/vmx/vmx.c             | 26 +-------
 arch/x86/kvm/vmx/vmx.h             |  5 --
 arch/x86/kvm/x86.c                 | 31 ++++++++++
 arch/x86/kvm/x86.h                 |  7 +++
 11 files changed, 151 insertions(+), 34 deletions(-)


base-commit: 196d9e72c4b0bd68b74a4ec7f52d248f37d0f030
--
2.43.0



* [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine
  2025-08-25 15:20 [RFC PATCH 0/4] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
@ 2025-08-25 15:20 ` Nikunj A Dadhania
  2025-08-26 10:06   ` Huang, Kai
  2025-08-25 15:20 ` [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported Nikunj A Dadhania
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Nikunj A Dadhania @ 2025-08-25 15:20 UTC (permalink / raw)
  To: seanjc, pbonzini
  Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj

Move the PML (Page Modification Logging) buffer flushing logic from
VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
code duplication.

The PML constants (PML_LOG_NR_ENTRIES and PML_HEAD_INDEX) are moved from
vmx.h to x86.h to make them available to both VMX and SVM.

No functional change intended for VMX, except that the WARN_ON() for the
page alignment check is toned down to WARN_ON_ONCE(). If hardware exhibits
this behavior once, it is likely to occur repeatedly, so use WARN_ON_ONCE()
to avoid log flooding while still capturing the unexpected condition.

The refactoring prepares for SVM to leverage the same PML flushing
implementation.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kvm/vmx/vmx.c | 26 ++------------------------
 arch/x86/kvm/vmx/vmx.h |  5 -----
 arch/x86/kvm/x86.c     | 31 +++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.h     |  7 +++++++
 4 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aa157fe5b7b3..a0955155d7ca 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6107,37 +6107,15 @@ static void vmx_destroy_pml_buffer(struct vcpu_vmx *vmx)
 static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	u16 pml_idx, pml_tail_index;
-	u64 *pml_buf;
-	int i;
+	u16 pml_idx;
 
 	pml_idx = vmcs_read16(GUEST_PML_INDEX);
 
 	/* Do nothing if PML buffer is empty */
 	if (pml_idx == PML_HEAD_INDEX)
 		return;
-	/*
-	 * PML index always points to the next available PML buffer entity
-	 * unless PML log has just overflowed.
-	 */
-	pml_tail_index = (pml_idx >= PML_LOG_NR_ENTRIES) ? 0 : pml_idx + 1;
 
-	/*
-	 * PML log is written backwards: the CPU first writes the entry 511
-	 * then the entry 510, and so on.
-	 *
-	 * Read the entries in the same order they were written, to ensure that
-	 * the dirty ring is filled in the same order the CPU wrote them.
-	 */
-	pml_buf = page_address(vmx->pml_pg);
-
-	for (i = PML_HEAD_INDEX; i >= pml_tail_index; i--) {
-		u64 gpa;
-
-		gpa = pml_buf[i];
-		WARN_ON(gpa & (PAGE_SIZE - 1));
-		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
-	}
+	kvm_flush_pml_buffer(vcpu, vmx->pml_pg, pml_idx);
 
 	/* reset PML index */
 	vmcs_write16(GUEST_PML_INDEX, PML_HEAD_INDEX);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d3389baf3ab3..4494c253727f 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -269,11 +269,6 @@ struct vcpu_vmx {
 	unsigned int ple_window;
 	bool ple_window_dirty;
 
-	/* Support for PML */
-#define PML_LOG_NR_ENTRIES	512
-	/* PML is written backwards: this is the first entry written by the CPU */
-#define PML_HEAD_INDEX		(PML_LOG_NR_ENTRIES-1)
-
 	struct page *pml_pg;
 
 	/* apic deadline value in host tsc */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1c49bc681c4..054ba09d3737 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6417,6 +6417,37 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 		kvm_vcpu_kick(vcpu);
 }
 
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_page, u16 pml_idx)
+{
+	u16 pml_tail_index;
+	u64 *pml_buf;
+	int i;
+
+	/*
+	 * PML index always points to the next available PML buffer entity
+	 * unless PML log has just overflowed.
+	 */
+	pml_tail_index = (pml_idx >= PML_LOG_NR_ENTRIES) ? 0 : pml_idx + 1;
+
+	/*
+	 * PML log is written backwards: the CPU first writes the entry 511
+	 * then the entry 510, and so on.
+	 *
+	 * Read the entries in the same order they were written, to ensure that
+	 * the dirty ring is filled in the same order the CPU wrote them.
+	 */
+	pml_buf = page_address(pml_page);
+
+	for (i = PML_HEAD_INDEX; i >= pml_tail_index; i--) {
+		u64 gpa;
+
+		gpa = pml_buf[i];
+		WARN_ON_ONCE(gpa & (PAGE_SIZE - 1));
+		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_flush_pml_buffer);
+
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			    struct kvm_enable_cap *cap)
 {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index bcfd9b719ada..23c188c0a24b 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -699,4 +699,11 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, int cpl,
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
+/* Support for PML */
+#define PML_LOG_NR_ENTRIES	512
+/* PML is written backwards: this is the first entry written by the CPU */
+#define PML_HEAD_INDEX		(PML_LOG_NR_ENTRIES-1)
+
+void kvm_flush_pml_buffer(struct kvm_vcpu *vcpu, struct page *pml_pg, u16 pml_idx);
+
 #endif
-- 
2.43.0



* [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported
  2025-08-25 15:20 [RFC PATCH 0/4] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
  2025-08-25 15:20 ` [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
@ 2025-08-25 15:20 ` Nikunj A Dadhania
  2025-08-26 10:07   ` Huang, Kai
  2025-08-25 15:20 ` [RFC PATCH 3/4] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
  2025-08-25 15:20 ` [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
  3 siblings, 1 reply; 19+ messages in thread
From: Nikunj A Dadhania @ 2025-08-25 15:20 UTC (permalink / raw)
  To: seanjc, pbonzini
  Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj

When HvInUseWrAllowed (CPUID leaf 0x8000001F, EAX bit 30) is supported, the
CPU allows hypervisor writes to in-use pages without RMP violations, making
the 2MB alignment workaround unnecessary. Check for this capability to avoid
the allocation overhead when it is not needed.

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kvm/svm/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2fbdebf79fbb..c5477efc90b9 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4666,7 +4666,8 @@ struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
 	unsigned long pfn;
 	struct page *p;
 
-	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP) ||
+	    cpu_feature_enabled(X86_FEATURE_HV_INUSE_WR_ALLOWED))
 		return alloc_pages_node(node, gfp | __GFP_ZERO, 0);
 
 	/*
-- 
2.43.0



* [RFC PATCH 3/4] x86/cpufeatures: Add Page modification logging
  2025-08-25 15:20 [RFC PATCH 0/4] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
  2025-08-25 15:20 ` [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
  2025-08-25 15:20 ` [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported Nikunj A Dadhania
@ 2025-08-25 15:20 ` Nikunj A Dadhania
  2025-08-25 15:25   ` Borislav Petkov
  2025-08-25 15:20 ` [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
  3 siblings, 1 reply; 19+ messages in thread
From: Nikunj A Dadhania @ 2025-08-25 15:20 UTC (permalink / raw)
  To: seanjc, pbonzini
  Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj

Page modification logging (PML) is a hardware feature designed to track
guest-modified memory pages. PML enables the hypervisor to identify which
pages in a guest's memory have been changed since the last checkpoint or
during live migration.

The PML feature is advertised via CPUID leaf 0x8000000A, ECX bit 4.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/scattered.c    | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 286d509f9363..069c0e17113a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -227,6 +227,7 @@
 #define X86_FEATURE_PVUNLOCK		( 8*32+20) /* PV unlock function */
 #define X86_FEATURE_VCPUPREEMPT		( 8*32+21) /* PV vcpu_is_preempted function */
 #define X86_FEATURE_TDX_GUEST		( 8*32+22) /* "tdx_guest" Intel Trust Domain Extensions Guest */
+#define X86_FEATURE_PML			( 8*32+23) /* AMD Page Modification logging */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
 #define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* "fsgsbase" RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index b4a1f6732a3a..02fc16b28bc9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -48,6 +48,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_PROC_FEEDBACK,		CPUID_EDX, 11, 0x80000007, 0 },
 	{ X86_FEATURE_AMD_FAST_CPPC,		CPUID_EDX, 15, 0x80000007, 0 },
 	{ X86_FEATURE_MBA,			CPUID_EBX,  6, 0x80000008, 0 },
+	{ X86_FEATURE_PML,			CPUID_ECX,  4, 0x8000000A, 0 },
 	{ X86_FEATURE_SMBA,			CPUID_EBX,  2, 0x80000020, 0 },
 	{ X86_FEATURE_BMEC,			CPUID_EBX,  3, 0x80000020, 0 },
 	{ X86_FEATURE_TSA_SQ_NO,		CPUID_ECX,  1, 0x80000021, 0 },
-- 
2.43.0



* [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-25 15:20 [RFC PATCH 0/4] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2025-08-25 15:20 ` [RFC PATCH 3/4] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
@ 2025-08-25 15:20 ` Nikunj A Dadhania
  2025-08-27 23:44   ` Huang, Kai
  3 siblings, 1 reply; 19+ messages in thread
From: Nikunj A Dadhania @ 2025-08-25 15:20 UTC (permalink / raw)
  To: seanjc, pbonzini
  Cc: kvm, thomas.lendacky, santosh.shukla, bp, joao.m.martins, nikunj

Currently, dirty logging relies on write protecting guest memory and
marking dirty GFNs during subsequent write faults. This method works but
incurs overhead due to additional write faults for each dirty GFN.

Implement support for the Page Modification Logging (PML) feature, a
hardware-assisted method for efficient dirty logging. PML automatically
logs dirty GPA[51:12] to a 4K buffer when the CPU sets NPT D-bits. Two new
VMCB fields are used: PML_ADDR and PML_INDEX. PML_INDEX is initialized to
511 (8 bytes per GPA entry), and the CPU decrements it after logging each
GPA. When the PML buffer is full, a PML_FULL VMEXIT with exit code 0x407 is
generated.

PML is enabled by default when supported and can be disabled via the 'pml'
module parameter.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/include/asm/svm.h      |  6 +-
 arch/x86/include/uapi/asm/svm.h |  2 +
 arch/x86/kvm/svm/svm.c          | 99 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.h          |  4 ++
 4 files changed, 107 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index ffc27f676243..9fbada95afd5 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -165,7 +165,10 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 	u8 reserved_9[22];
 	u64 allowed_sev_features;	/* Offset 0x138 */
 	u64 guest_sev_features;		/* Offset 0x140 */
-	u8 reserved_10[664];
+	u8 reserved_10[128];
+	u64 pml_addr;			/* Offset 0x1c8 */
+	u16 pml_index;			/* Offset 0x1d0 */
+	u8 reserved_11[526];
 	/*
 	 * Offset 0x3e0, 32 bytes reserved
 	 * for use by hypervisor/software.
@@ -239,6 +242,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define SVM_NESTED_CTL_NP_ENABLE	BIT(0)
 #define SVM_NESTED_CTL_SEV_ENABLE	BIT(1)
 #define SVM_NESTED_CTL_SEV_ES_ENABLE	BIT(2)
+#define SVM_NESTED_CTL_PML_ENABLE	BIT(11)
 
 
 #define SVM_TSC_RATIO_RSVD	0xffffff0000000000ULL
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 9c640a521a67..f329dca167de 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -101,6 +101,7 @@
 #define SVM_EXIT_AVIC_INCOMPLETE_IPI		0x401
 #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS	0x402
 #define SVM_EXIT_VMGEXIT       0x403
+#define SVM_EXIT_PML_FULL	0x407
 
 /* SEV-ES software-defined VMGEXIT events */
 #define SVM_VMGEXIT_MMIO_READ			0x80000001
@@ -232,6 +233,7 @@
 	{ SVM_EXIT_AVIC_INCOMPLETE_IPI,		"avic_incomplete_ipi" }, \
 	{ SVM_EXIT_AVIC_UNACCELERATED_ACCESS,   "avic_unaccelerated_access" }, \
 	{ SVM_EXIT_VMGEXIT,		"vmgexit" }, \
+	{ SVM_EXIT_PML_FULL,		"pml_full" }, \
 	{ SVM_VMGEXIT_MMIO_READ,	"vmgexit_mmio_read" }, \
 	{ SVM_VMGEXIT_MMIO_WRITE,	"vmgexit_mmio_write" }, \
 	{ SVM_VMGEXIT_NMI_COMPLETE,	"vmgexit_nmi_complete" }, \
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9931c6c4bc6..f46680a3c44f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -178,6 +178,9 @@ module_param(intercept_smi, bool, 0444);
 bool vnmi = true;
 module_param(vnmi, bool, 0444);
 
+bool pml = true;
+module_param(pml, bool, 0444);
+
 static bool svm_gp_erratum_intercept = true;
 
 static u8 rsm_ins_bytes[] = "\x0f\xaa";
@@ -1220,6 +1223,16 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	if (vcpu->kvm->arch.bus_lock_detection_enabled)
 		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
 
+	if (pml) {
+		/*
+		 * Populate the page address and index here, PML is enabled
+		 * when dirty logging is enabled on the memslot through
+		 * svm_update_cpu_dirty_logging()
+		 */
+		control->pml_addr = (u64)__sme_set(page_to_pfn(svm->pml_page) << PAGE_SHIFT);
+		control->pml_index = PML_HEAD_INDEX;
+	}
+
 	if (sev_guest(vcpu->kvm))
 		sev_init_vmcb(svm);
 
@@ -1296,14 +1309,20 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 			goto error_free_vmcb_page;
 	}
 
+	if (pml) {
+		svm->pml_page = snp_safe_alloc_page();
+		if (!svm->pml_page)
+			goto error_free_vmsa_page;
+	}
+
 	err = avic_init_vcpu(svm);
 	if (err)
-		goto error_free_vmsa_page;
+		goto error_free_pml_page;
 
 	svm->msrpm = svm_vcpu_alloc_msrpm();
 	if (!svm->msrpm) {
 		err = -ENOMEM;
-		goto error_free_vmsa_page;
+		goto error_free_pml_page;
 	}
 
 	svm->x2avic_msrs_intercepted = true;
@@ -1319,6 +1338,9 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 
 	return 0;
 
+error_free_pml_page:
+	if (svm->pml_page)
+		__free_page(svm->pml_page);
 error_free_vmsa_page:
 	if (vmsa_page)
 		__free_page(vmsa_page);
@@ -1339,6 +1361,9 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 
 	sev_free_vcpu(vcpu);
 
+	if (pml)
+		__free_page(svm->pml_page);
+
 	__free_page(__sme_pa_to_page(svm->vmcb01.pa));
 	svm_vcpu_free_msrpm(svm->msrpm);
 }
@@ -3206,6 +3231,53 @@ static int bus_lock_exit(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+void svm_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (WARN_ON_ONCE(!pml))
+		return;
+
+	if (is_guest_mode(vcpu))
+		return;
+
+	/*
+	 * Note, nr_memslots_dirty_logging can be changed concurrently with this
+	 * code, but in that case another update request will be made and so the
+	 * guest will never run with a stale PML value.
+	 */
+	if (atomic_read(&vcpu->kvm->nr_memslots_dirty_logging))
+		svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_PML_ENABLE;
+	else
+		svm->vmcb->control.nested_ctl &= ~SVM_NESTED_CTL_PML_ENABLE;
+}
+
+static void svm_flush_pml_buffer(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb_control_area *control = &svm->vmcb->control;
+
+	/* Do nothing if PML buffer is empty */
+	if (control->pml_index == PML_HEAD_INDEX)
+		return;
+
+	kvm_flush_pml_buffer(vcpu, svm->pml_page, control->pml_index);
+
+	/* Reset the PML index */
+	control->pml_index = PML_HEAD_INDEX;
+}
+
+static int pml_full_interception(struct kvm_vcpu *vcpu)
+{
+	trace_kvm_pml_full(vcpu->vcpu_id);
+
+	/*
+	 * PML buffer is already flushed at the beginning of svm_handle_exit().
+	 * Nothing to do here.
+	 */
+	return 1;
+}
+
 static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_READ_CR0]			= cr_interception,
 	[SVM_EXIT_READ_CR3]			= cr_interception,
@@ -3282,6 +3354,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 #ifdef CONFIG_KVM_AMD_SEV
 	[SVM_EXIT_VMGEXIT]			= sev_handle_vmgexit,
 #endif
+	[SVM_EXIT_PML_FULL]			= pml_full_interception,
 };
 
 static void dump_vmcb(struct kvm_vcpu *vcpu)
@@ -3330,8 +3403,10 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
 	pr_err("%-20s%016llx\n", "exit_info2:", control->exit_info_2);
 	pr_err("%-20s%08x\n", "exit_int_info:", control->exit_int_info);
 	pr_err("%-20s%08x\n", "exit_int_info_err:", control->exit_int_info_err);
-	pr_err("%-20s%lld\n", "nested_ctl:", control->nested_ctl);
+	pr_err("%-20s%llx\n", "nested_ctl:", control->nested_ctl);
 	pr_err("%-20s%016llx\n", "nested_cr3:", control->nested_cr3);
+	pr_err("%-20s%016llx\n", "pml_addr:", control->pml_addr);
+	pr_err("%-20s%04x\n", "pml_index:", control->pml_index);
 	pr_err("%-20s%016llx\n", "avic_vapic_bar:", control->avic_vapic_bar);
 	pr_err("%-20s%016llx\n", "ghcb:", control->ghcb_gpa);
 	pr_err("%-20s%08x\n", "event_inj:", control->event_inj);
@@ -3562,6 +3637,15 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	struct kvm_run *kvm_run = vcpu->run;
 	u32 exit_code = svm->vmcb->control.exit_code;
 
+	/*
+	 * Opportunistically flush the PML buffer on VM exit. This keeps the
+	 * dirty bitmap current by processing logged GPAs rather than waiting for
+	 * PML_FULL exit.
+	 */
+	if (pml && !is_guest_mode(vcpu))
+		svm_flush_pml_buffer(vcpu);
+
+
 	/* SEV-ES guests must use the CR write traps to track CR registers. */
 	if (!sev_es_guest(vcpu->kvm)) {
 		if (!svm_is_intercept(svm, INTERCEPT_CR0_WRITE))
@@ -5028,6 +5112,9 @@ static int svm_vm_init(struct kvm *kvm)
 			return ret;
 	}
 
+	if (pml)
+		kvm->arch.cpu_dirty_log_size = PML_LOG_NR_ENTRIES;
+
 	svm_srso_vm_init();
 	return 0;
 }
@@ -5181,6 +5268,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.gmem_prepare = sev_gmem_prepare,
 	.gmem_invalidate = sev_gmem_invalidate,
 	.private_max_mapping_level = sev_private_max_mapping_level,
+
+	.update_cpu_dirty_logging = svm_update_cpu_dirty_logging,
 };
 
 /*
@@ -5382,6 +5471,10 @@ static __init int svm_hardware_setup(void)
 
 	nrips = nrips && boot_cpu_has(X86_FEATURE_NRIPS);
 
+	pml = pml && npt_enabled && cpu_feature_enabled(X86_FEATURE_PML);
+	if (pml)
+		pr_info("Page modification logging supported\n");
+
 	if (lbrv) {
 		if (!boot_cpu_has(X86_FEATURE_LBRV))
 			lbrv = false;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 58b9d168e0c8..6c2be19d8d46 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -335,6 +335,8 @@ struct vcpu_svm {
 
 	/* Guest GIF value, used when vGIF is not enabled */
 	bool guest_gif;
+
+	struct page *pml_page;
 };
 
 struct svm_cpu_data {
@@ -717,6 +719,8 @@ static inline void svm_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
 	svm_set_intercept_for_msr(vcpu, msr, type, true);
 }
 
+void svm_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+
 /* nested.c */
 
 #define NESTED_EXIT_HOST	0	/* Exit handled on host level */
-- 
2.43.0



* Re: [RFC PATCH 3/4] x86/cpufeatures: Add Page modification logging
  2025-08-25 15:20 ` [RFC PATCH 3/4] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
@ 2025-08-25 15:25   ` Borislav Petkov
  0 siblings, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2025-08-25 15:25 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: seanjc, pbonzini, kvm, thomas.lendacky, santosh.shukla,
	joao.m.martins

On Mon, Aug 25, 2025 at 03:20:08PM +0000, Nikunj A Dadhania wrote:
> Page modification logging(PML) is a hardware feature designed to track
> guest modified memory pages. PML enables the hypervisor to identify which
> pages in a guest's memory have been changed since the last checkpoint or
> during live migration.
> 
> The PML feature is advertised via CPUID leaf 0x8000000A ECX[4] bit.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/scattered.c    | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 286d509f9363..069c0e17113a 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -227,6 +227,7 @@
>  #define X86_FEATURE_PVUNLOCK		( 8*32+20) /* PV unlock function */
>  #define X86_FEATURE_VCPUPREEMPT		( 8*32+21) /* PV vcpu_is_preempted function */
>  #define X86_FEATURE_TDX_GUEST		( 8*32+22) /* "tdx_guest" Intel Trust Domain Extensions Guest */
> +#define X86_FEATURE_PML			( 8*32+23) /* AMD Page Modification logging */
>  
>  /* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
>  #define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* "fsgsbase" RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index b4a1f6732a3a..02fc16b28bc9 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -48,6 +48,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>  	{ X86_FEATURE_PROC_FEEDBACK,		CPUID_EDX, 11, 0x80000007, 0 },
>  	{ X86_FEATURE_AMD_FAST_CPPC,		CPUID_EDX, 15, 0x80000007, 0 },
>  	{ X86_FEATURE_MBA,			CPUID_EBX,  6, 0x80000008, 0 },
> +	{ X86_FEATURE_PML,			CPUID_ECX,  4, 0x8000000A, 0 },
>  	{ X86_FEATURE_SMBA,			CPUID_EBX,  2, 0x80000020, 0 },
>  	{ X86_FEATURE_BMEC,			CPUID_EBX,  3, 0x80000020, 0 },
>  	{ X86_FEATURE_TSA_SQ_NO,		CPUID_ECX,  1, 0x80000021, 0 },
> -- 

Acked-by: Borislav Petkov (AMD) <bp@alien8.de>

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine
  2025-08-25 15:20 ` [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
@ 2025-08-26 10:06   ` Huang, Kai
  2025-08-26 14:58     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-26 10:06 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de

On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> Move the PML (Page Modification Logging) buffer flushing logic from
> VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
> code duplication.

Looking at the code change, IIUC the PML code that is moved to x86 common
assumes AMD's PML also follows VMX's behaviour:

 1) The PML buffer is a 4K page;
 2) The hardware records the dirty GPAs backwards into the PML buffer

Could we point this out in the changelog?

[...]

> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -269,11 +269,6 @@ struct vcpu_vmx {
>  	unsigned int ple_window;
>  	bool ple_window_dirty;
>  
> -	/* Support for PML */
> -#define PML_LOG_NR_ENTRIES	512
> -	/* PML is written backwards: this is the first entry written by the CPU */
> -#define PML_HEAD_INDEX		(PML_LOG_NR_ENTRIES-1)
> -
>  	struct page *pml_pg;

Can we share the 'pml_pg' as well?


* Re: [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported
  2025-08-25 15:20 ` [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported Nikunj A Dadhania
@ 2025-08-26 10:07   ` Huang, Kai
  2025-08-26 15:07     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-26 10:07 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de

On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> When HvInUseWrAllowed (CPUID 8000001F EAX[30]) is supported, the CPU allows
> hypervisor writes to in-use pages without RMP violations, making the 2MB
> alignment workaround unnecessary. Check for this capability to avoid the
> allocation overhead when it's not needed.

Could you add some text to explain why this is related to PML?

> 
> Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 2fbdebf79fbb..c5477efc90b9 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4666,7 +4666,8 @@ struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
>  	unsigned long pfn;
>  	struct page *p;
>  
> -	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> +	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP) ||
> +	    cpu_feature_enabled(X86_FEATURE_HV_INUSE_WR_ALLOWED))
>  		return alloc_pages_node(node, gfp | __GFP_ZERO, 0);
>  
>  	/*


* Re: [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine
  2025-08-26 10:06   ` Huang, Kai
@ 2025-08-26 14:58     ` Nikunj A. Dadhania
  2025-08-27  9:53       ` Huang, Kai
  0 siblings, 1 reply; 19+ messages in thread
From: Nikunj A. Dadhania @ 2025-08-26 14:58 UTC (permalink / raw)
  To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de



On 8/26/2025 3:36 PM, Huang, Kai wrote:
> On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
>> Move the PML (Page Modification Logging) buffer flushing logic from
>> VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
>> code duplication.
> 
> Looking at the code change, IIUC the PML code that is moved to x86 common
> assumes AMD's PML also follows VMX's behaviour:
> 
>  1) The PML buffer is a 4K page;
>  2) The hardware records the dirty GPA in backwards to the PML buffer
> 
> Could we point this out in the changelog?

Ack, will add in the next revision

> 
> [...]
> 
>> --- a/arch/x86/kvm/vmx/vmx.h
>> +++ b/arch/x86/kvm/vmx/vmx.h
>> @@ -269,11 +269,6 @@ struct vcpu_vmx {
>>  	unsigned int ple_window;
>>  	bool ple_window_dirty;
>>  
>> -	/* Support for PML */
>> -#define PML_LOG_NR_ENTRIES	512
>> -	/* PML is written backwards: this is the first entry written by the CPU */
>> -#define PML_HEAD_INDEX		(PML_LOG_NR_ENTRIES-1)
>> -
>>  	struct page *pml_pg;
> 
> Can we share the 'pml_pg' as well?

Sure, is struct kvm_vcpu_arch the right place?

Regards,
Nikunj



* Re: [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported
  2025-08-26 10:07   ` Huang, Kai
@ 2025-08-26 15:07     ` Nikunj A. Dadhania
  2025-08-27  9:54       ` Huang, Kai
  0 siblings, 1 reply; 19+ messages in thread
From: Nikunj A. Dadhania @ 2025-08-26 15:07 UTC (permalink / raw)
  To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de



On 8/26/2025 3:37 PM, Huang, Kai wrote:
> On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
>> When HvInUseWrAllowed (CPUID 8000001F EAX[30]) is supported, the CPU allows
>> hypervisor writes to in-use pages without RMP violations, making the 2MB
>> alignment workaround unnecessary. Check for this capability to avoid the
>> allocation overhead when it's not needed.
> 
> Could you add some text to explain why this is related to PML?

PML works fine without this and the change is not linked to PML. This can go
in as a separate fix.

While working on PML, which uses snp_safe_alloc_page(), Tom suggested
applying the alignment workaround only on CPUs without HvInUseWrAllowed.

> 
>>
>> Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> ---
>>  arch/x86/kvm/svm/sev.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index 2fbdebf79fbb..c5477efc90b9 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -4666,7 +4666,8 @@ struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
>>  	unsigned long pfn;
>>  	struct page *p;
>>  
>> -	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>> +	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP) ||
>> +	    cpu_feature_enabled(X86_FEATURE_HV_INUSE_WR_ALLOWED))
>>  		return alloc_pages_node(node, gfp | __GFP_ZERO, 0);
>>  
>>  	/*
Regards,
Nikunj



* Re: [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine
  2025-08-26 14:58     ` Nikunj A. Dadhania
@ 2025-08-27  9:53       ` Huang, Kai
  2025-08-28  6:30         ` Nikunj A. Dadhania
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-27  9:53 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de

On Tue, 2025-08-26 at 20:28 +0530, Nikunj A. Dadhania wrote:
> 
> On 8/26/2025 3:36 PM, Huang, Kai wrote:
> > On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> > > Move the PML (Page Modification Logging) buffer flushing logic from
> > > VMX-specific code to common x86 KVM code to enable reuse by SVM and avoid
> > > code duplication.
> > 
> > Looking at the code change, IIUC the PML code that is moved to x86 common
> > assumes AMD's PML also follows VMX's behaviour:
> > 
> >  1) The PML buffer is a 4K page;
> >  2) The hardware records the dirty GPA in backwards to the PML buffer
> > 
> > Could we point this out in the changelog?
> 
> Ack, will add in the next revision

Thanks.  Maybe one more to point out:

  3) AMD PML also clears bits 11:0 when recording the GPA.

> 
> > 
> > [...]
> > 
> > > --- a/arch/x86/kvm/vmx/vmx.h
> > > +++ b/arch/x86/kvm/vmx/vmx.h
> > > @@ -269,11 +269,6 @@ struct vcpu_vmx {
> > >  	unsigned int ple_window;
> > >  	bool ple_window_dirty;
> > >  
> > > -	/* Support for PML */
> > > -#define PML_LOG_NR_ENTRIES	512
> > > -	/* PML is written backwards: this is the first entry written by the CPU */
> > > -#define PML_HEAD_INDEX		(PML_LOG_NR_ENTRIES-1)
> > > -
> > >  	struct page *pml_pg;
> > 
> > Can we share the 'pml_pg' as well?
> 
> Sure, Is struct kvm_vcpu_arch the right place?
> 

Yeah I think so.


* Re: [RFC PATCH 2/4] KVM: SEV: Skip SNP-safe allocation when HvInUseWrAllowed is supported
  2025-08-26 15:07     ` Nikunj A. Dadhania
@ 2025-08-27  9:54       ` Huang, Kai
  0 siblings, 0 replies; 19+ messages in thread
From: Huang, Kai @ 2025-08-27  9:54 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de

On Tue, 2025-08-26 at 20:37 +0530, Nikunj A. Dadhania wrote:
> 
> On 8/26/2025 3:37 PM, Huang, Kai wrote:
> > On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> > > When HvInUseWrAllowed (CPUID 8000001F EAX[30]) is supported, the CPU allows
> > > hypervisor writes to in-use pages without RMP violations, making the 2MB
> > > alignment workaround unnecessary. Check for this capability to avoid the
> > > allocation overhead when it's not needed.
> > 
> > Could you add some text to explain why this is related to PML?
> 
> PML works fine without this and the change is not linked to PML. This can go
> as a separate fix.

I suppose it's better to send it out as a separate one if PML doesn't need
it to work.


* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-25 15:20 ` [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
@ 2025-08-27 23:44   ` Huang, Kai
  2025-08-28  6:37     ` Nikunj A. Dadhania
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-27 23:44 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com, nikunj@amd.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de

On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> +	if (pml) {
> +		svm->pml_page = snp_safe_alloc_page();
> +		if (!svm->pml_page)
> +			goto error_free_vmsa_page;
> +	}

I didn't see this yesterday.  Is it mandatory for AMD PML to use
snp_safe_alloc_page() to allocate the PML buffer, or can we also use the
normal page allocation API?

VMX PML just uses alloc_pages().  I was thinking the page allocation/free
code could be moved to x86 common as shared code too.
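
To illustrate, one possible shape for such shared code (the helper and the
kvm_x86_ops hook below are hypothetical, not something this series adds):

	/* hypothetical common helper, e.g. in arch/x86/kvm/x86.c */
	static struct page *kvm_alloc_pml_page(void)
	{
		/* let the vendor module override when it has special needs */
		if (kvm_x86_ops.alloc_pml_page)
			return kvm_x86_ops.alloc_pml_page();

		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
	}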


* Re: [RFC PATCH 1/4] KVM: x86: Carve out PML flush routine
  2025-08-27  9:53       ` Huang, Kai
@ 2025-08-28  6:30         ` Nikunj A. Dadhania
  0 siblings, 0 replies; 19+ messages in thread
From: Nikunj A. Dadhania @ 2025-08-28  6:30 UTC (permalink / raw)
  To: Huang, Kai, pbonzini@redhat.com, seanjc@google.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org,
	joao.m.martins@oracle.com, santosh.shukla@amd.com, bp@alien8.de



On 8/27/2025 3:23 PM, Huang, Kai wrote:
> On Tue, 2025-08-26 at 20:28 +0530, Nikunj A. Dadhania wrote:
>>
>> On 8/26/2025 3:36 PM, Huang, Kai wrote:
>>> On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
>>>
>>> Looking at the code change, IIUC the PML code that is moved to x86 common
>>> assumes AMD's PML also follows VMX's behaviour:
>>>
>>>  1) The PML buffer is a 4K page;
>>>  2) The hardware records the dirty GPA in backwards to the PML buffer
>>>
>>> Could we point this out in the changelog?
>>
>> Ack, will add in the next revision
> 
> Thanks.  Maybe one more to point out:
> 
>   3) AMD PML also clears bit 11:0 when recording the GPA.

Sure.



* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-27 23:44   ` Huang, Kai
@ 2025-08-28  6:37     ` Nikunj A. Dadhania
  2025-08-28 10:33       ` Huang, Kai
  0 siblings, 1 reply; 19+ messages in thread
From: Nikunj A. Dadhania @ 2025-08-28  6:37 UTC (permalink / raw)
  To: Huang, Kai, thomas.lendacky@amd.com
  Cc: kvm@vger.kernel.org, joao.m.martins@oracle.com,
	santosh.shukla@amd.com, bp@alien8.de, seanjc@google.com,
	pbonzini@redhat.com



On 8/28/2025 5:14 AM, Huang, Kai wrote:
> On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
>> +	if (pml) {
>> +		svm->pml_page = snp_safe_alloc_page();
>> +		if (!svm->pml_page)
>> +			goto error_free_vmsa_page;
>> +	}
> 
> I didn't see this yesterday.  Is it mandatory for AMD PML to use
> snp_safe_alloc_page() to allocate the PML buffer, or we can also use
> normal page allocation API?

As it is dependent on HvInUseWrAllowed, I need to use snp_safe_alloc_page().

Tom?

> VMX PML just uses alloc_pages().  I was thinking the page allocation/free
> code could be moved to x86 common as shared code too.

Got that. Because of the above requirement, I was going to share the variable
(pml_page) but do the allocation in the VMX/SVM code.

Regards,
Nikunj


* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-28  6:37     ` Nikunj A. Dadhania
@ 2025-08-28 10:33       ` Huang, Kai
  2025-08-28 10:43         ` Nikunj A. Dadhania
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-28 10:33 UTC (permalink / raw)
  To: thomas.lendacky@amd.com, nikunj@amd.com
  Cc: kvm@vger.kernel.org, joao.m.martins@oracle.com,
	pbonzini@redhat.com, seanjc@google.com, santosh.shukla@amd.com,
	bp@alien8.de

On Thu, 2025-08-28 at 12:07 +0530, Nikunj A. Dadhania wrote:
> 
> On 8/28/2025 5:14 AM, Huang, Kai wrote:
> > On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> > > +	if (pml) {
> > > +		svm->pml_page = snp_safe_alloc_page();
> > > +		if (!svm->pml_page)
> > > +			goto error_free_vmsa_page;
> > > +	}
> > 
> > I didn't see this yesterday.  Is it mandatory for AMD PML to use
> > snp_safe_alloc_page() to allocate the PML buffer, or we can also use
> > normal page allocation API?
> 
> As it is dependent on HvInUseWrAllowed, I need to use snp_safe_alloc_page().

So the patch 2 is actually a dependent for PML?

> 
> Tom?
> 
> > VMX PML just uses alloc_pages().  I was thinking the page allocation/free
> > code could be moved to x86 common as shared code too.
> Got that, because of the above requirement, I was going to share the variable
> (pml_page) but do the allocation in the vmx/svm code.
> 

If AMD and VMX PML cannot share PML buffer allocation/free code, my desire
to share 'pml_pg' is becoming less, but I guess I kinda still think it's
better to share the buffer pointer. :-)


* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-28 10:33       ` Huang, Kai
@ 2025-08-28 10:43         ` Nikunj A. Dadhania
  2025-08-28 11:09           ` Huang, Kai
  0 siblings, 1 reply; 19+ messages in thread
From: Nikunj A. Dadhania @ 2025-08-28 10:43 UTC (permalink / raw)
  To: Huang, Kai, thomas.lendacky@amd.com
  Cc: kvm@vger.kernel.org, joao.m.martins@oracle.com,
	pbonzini@redhat.com, seanjc@google.com, santosh.shukla@amd.com,
	bp@alien8.de



On 8/28/2025 4:03 PM, Huang, Kai wrote:
> On Thu, 2025-08-28 at 12:07 +0530, Nikunj A. Dadhania wrote:
>>
>> On 8/28/2025 5:14 AM, Huang, Kai wrote:
>>> On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
>>>> +	if (pml) {
>>>> +		svm->pml_page = snp_safe_alloc_page();
>>>> +		if (!svm->pml_page)
>>>> +			goto error_free_vmsa_page;
>>>> +	}
>>>
>>> I didn't see this yesterday.  Is it mandatory for AMD PML to use
>>> snp_safe_alloc_page() to allocate the PML buffer, or we can also use
>>> normal page allocation API?
>>
>> As it is dependent on HvInUseWrAllowed, I need to use snp_safe_alloc_page().
> 
> So the patch 2 is actually a dependent for PML?

Not really; if patch 2 is not there, the 2MB alignment workaround will be
applied to the PML page allocation.

> 
>>
>> Tom?
>>
>>> VMX PML just uses alloc_pages().  I was thinking the page allocation/free
>>> code could be moved to x86 common as shared code too.
>> Got that, because of the above requirement, I was going to share the variable
>> (pml_page) but do the allocation in the vmx/svm code.
>>
> 
> If AMD and VMX PML cannot share PML buffer allocation/free code, my desire
> to share 'pml_pg' is becoming less, but I guess I kinda still think it's
> better to share the buffer pointer. :-)
Sure, will add in my next revision.

Regards
Nikunj


* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-28 10:43         ` Nikunj A. Dadhania
@ 2025-08-28 11:09           ` Huang, Kai
  2025-08-28 11:17             ` Huang, Kai
  0 siblings, 1 reply; 19+ messages in thread
From: Huang, Kai @ 2025-08-28 11:09 UTC (permalink / raw)
  To: thomas.lendacky@amd.com, nikunj@amd.com
  Cc: kvm@vger.kernel.org, joao.m.martins@oracle.com,
	pbonzini@redhat.com, seanjc@google.com, santosh.shukla@amd.com,
	bp@alien8.de

On Thu, 2025-08-28 at 16:13 +0530, Nikunj A. Dadhania wrote:
> On 8/28/2025 4:03 PM, Huang, Kai wrote:
> > On Thu, 2025-08-28 at 12:07 +0530, Nikunj A. Dadhania wrote:
> > > 
> > > On 8/28/2025 5:14 AM, Huang, Kai wrote:
> > > > On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> > > > > +	if (pml) {
> > > > > +		svm->pml_page = snp_safe_alloc_page();
> > > > > +		if (!svm->pml_page)
> > > > > +			goto error_free_vmsa_page;
> > > > > +	}
> > > > 
> > > > I didn't see this yesterday.  Is it mandatory for AMD PML to use
> > > > snp_safe_alloc_page() to allocate the PML buffer, or we can also use
> > > > normal page allocation API?
> > > 
> > > As it is dependent on HvInUseWrAllowed, I need to use snp_safe_alloc_page().
> > 
> > So the patch 2 is actually a dependent for PML?
> 
> Not really, if the patch 2 is not there, the 2MB alignment workaround will be
> applied to PML page allocation.

Sounds like they are related, at least.

I don't intend to judge whether patch 2 should be in this series or not,
nor whether snp_safe_alloc_page_node() is the right place to work around
the 2MB alignment for the PML buffer.

I just think it's good to see some text explaining why patch 2 is needed
for PML if you eventually decide to keep it in this series.


* Re: [RFC PATCH 4/4] KVM: SVM: Add Page modification logging support
  2025-08-28 11:09           ` Huang, Kai
@ 2025-08-28 11:17             ` Huang, Kai
  0 siblings, 0 replies; 19+ messages in thread
From: Huang, Kai @ 2025-08-28 11:17 UTC (permalink / raw)
  To: thomas.lendacky@amd.com, nikunj@amd.com
  Cc: kvm@vger.kernel.org, joao.m.martins@oracle.com,
	pbonzini@redhat.com, seanjc@google.com, santosh.shukla@amd.com,
	bp@alien8.de

On Thu, 2025-08-28 at 11:09 +0000, Huang, Kai wrote:
> On Thu, 2025-08-28 at 16:13 +0530, Nikunj A. Dadhania wrote:
> > On 8/28/2025 4:03 PM, Huang, Kai wrote:
> > > On Thu, 2025-08-28 at 12:07 +0530, Nikunj A. Dadhania wrote:
> > > > 
> > > > On 8/28/2025 5:14 AM, Huang, Kai wrote:
> > > > > On Mon, 2025-08-25 at 15:20 +0000, Nikunj A Dadhania wrote:
> > > > > > +	if (pml) {
> > > > > > +		svm->pml_page = snp_safe_alloc_page();
> > > > > > +		if (!svm->pml_page)
> > > > > > +			goto error_free_vmsa_page;
> > > > > > +	}
> > > > > 
> > > > > I didn't see this yesterday.  Is it mandatory for AMD PML to use
> > > > > snp_safe_alloc_page() to allocate the PML buffer, or we can also use
> > > > > normal page allocation API?
> > > > 
> > > > As it is dependent on HvInUseWrAllowed, I need to use snp_safe_alloc_page().
> > > 
> > > So the patch 2 is actually a dependent for PML?
> > 
> > Not really, if the patch 2 is not there, the 2MB alignment workaround will be
> > applied to PML page allocation.
> 
> Sounds they are related, at least.
> 
> I don't have intention to judge whether patch 2 should be in this series
> or not, nor whether snp_safe_alloc_page_node() is the right place to
> workaround the 2MB alignment for PML buffer.
> 
> I just think it's good to see some text explaining why patch 2 is needed
> for PML if eventually you decide to keep it in this series.

Btw, there's one big comment in snp_safe_alloc_page_node():

        /*
         * Allocate an SNP-safe page to workaround the SNP erratum where 
         * the CPU will incorrectly signal an RMP violation #PF if a
         * hugepage (2MB or 1GB) collides with the RMP entry of a
         * 2MB-aligned VMCB, VMSA, or AVIC backing page.  
	 * ...
	 */

You might want to include the PML buffer to the list too.


