* [PATCH v8 1/7] x86/cpufeatures: Add X86_FEATURE_RMPOPT feature flag
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
@ 2026-06-15 19:48 ` Ashish Kalra
2026-06-15 19:48 ` [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
` (5 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:48 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
Add a flag indicating whether the RMPOPT instruction is supported.
RMPOPT is a new instruction that reduces the performance overhead of
RMP checks for the hypervisor and non-SNP guests by allowing those
checks to be skipped when 1-GB memory regions are known to contain no
SEV-SNP guest memory.
For more information on the RMPOPT instruction, see the AMD64 RMPOPT
technical documentation.
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/kernel/cpu/scattered.c | 1 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
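Note (not part of the commit message): the new flag is a plain (word, bit)
encoding, and the scattered.c entry in this patch takes the bit from CPUID
leaf 0x80000025, EDX bit 0. A minimal userspace sketch of both follows. The
helper names feature_word(), feature_bit() and rmpopt_from_edx() are
hypothetical and exist only for illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the cpufeatures.h entry added by this patch. */
#define X86_FEATURE_RMPOPT (3 * 32 + 7)

/* Hypothetical helpers: split a feature number into word and bit. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/*
 * Hypothetical helper: 'edx' is assumed to be the EDX value returned by
 * CPUID leaf 0x80000025, subleaf 0. Bit 0 matches the scattered.c entry.
 */
static bool rmpopt_from_edx(uint32_t edx)
{
	return (edx & (UINT32_C(1) << 0)) != 0;
}
```

In the kernel itself the check is simply cpu_feature_enabled(X86_FEATURE_RMPOPT),
as used later in this series.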
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1d506e5d6f46..794cc96b8493 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -76,7 +76,7 @@
#define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */
#define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
#define X86_FEATURE_ZEN6 ( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
-/* Free ( 3*32+ 7) */
+#define X86_FEATURE_RMPOPT ( 3*32+ 7) /* Support for AMD RMPOPT instruction */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */
#define X86_FEATURE_ART ( 3*32+10) /* "art" Always running timer (ART) */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 937129ce6a96..021c0bf22de2 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -67,6 +67,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ X86_FEATURE_AMD_LBR_V2, CPUID_EAX, 1, 0x80000022, 0 },
{ X86_FEATURE_AMD_LBR_PMC_FREEZE, CPUID_EAX, 2, 0x80000022, 0 },
+ { X86_FEATURE_RMPOPT, CPUID_EDX, 0, 0x80000025, 0 },
{ X86_FEATURE_AMD_HTR_CORES, CPUID_EAX, 30, 0x80000026, 0 },
{ 0, 0, 0, 0, 0 }
};
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 86d17b195e79..7ce681af1dd7 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -76,7 +76,7 @@
#define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */
#define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
#define X86_FEATURE_ZEN6 ( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
-/* Free ( 3*32+ 7) */
+#define X86_FEATURE_RMPOPT ( 3*32+ 7) /* Support for AMD RMPOPT instruction */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */
#define X86_FEATURE_ART ( 3*32+10) /* "art" Always running timer (ART) */
--
2.43.0
* [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
2026-06-15 19:48 ` [PATCH v8 1/7] x86/cpufeatures: Add X86_FEATURE_RMPOPT feature flag Ashish Kalra
@ 2026-06-15 19:48 ` Ashish Kalra
2026-06-15 20:14 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active Ashish Kalra
` (4 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:48 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
The new RMPOPT instruction helps manage per-CPU RMP optimization
structures inside the CPU. It takes a 1GB-aligned physical address
and either returns the status of the optimizations or tries to enable
the optimizations.
Per-CPU RMPOPT tables support at most 2 TB of addressable memory for
RMP optimizations.
Initialize the per-CPU RMPOPT table base to the starting physical
address. This enables RMP optimization for up to 2 TB of system RAM on
all CPUs.
Additionally, add support to set up and enable RMPOPT once SNP is
enabled and initialized.
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/coco/core.c | 2 +
arch/x86/include/asm/msr-index.h | 3 ++
arch/x86/include/asm/sev.h | 4 ++
arch/x86/virt/svm/sev.c | 70 ++++++++++++++++++++++++++++++++
drivers/crypto/ccp/sev-dev.c | 3 ++
5 files changed, 82 insertions(+)
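Note (not part of the commit message): the value written to RMPOPT_BASE is
the lowest RAM address aligned down to 1GB, with the enable bit (bit 0) set.
A minimal userspace sketch of that arithmetic follows. The helper names
align_down_1g() and rmpopt_base_value() are hypothetical, and the input is
assumed to be the physical address of the lowest RAM page.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1G		(UINT64_C(1) << 30)
/* Mirrors MSR_AMD64_RMPOPT_ENABLE (bit 0) from this patch. */
#define RMPOPT_ENABLE	(UINT64_C(1) << 0)

/* Hypothetical helper: round a physical address down to a 1GB boundary. */
static uint64_t align_down_1g(uint64_t pa)
{
	return pa & ~(SZ_1G - 1);
}

/* Hypothetical helper: compose the value written to RMPOPT_BASE. */
static uint64_t rmpopt_base_value(uint64_t lowest_ram_pa)
{
	return align_down_1g(lowest_ram_pa) | RMPOPT_ENABLE;
}
```

Because the base is 1GB aligned, its low 30 bits are zero, so OR-ing in the
enable bit cannot disturb the address bits.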
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 989ca9f72ba3..8c1393ddc5df 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -16,6 +16,7 @@
#include <asm/archrandom.h>
#include <asm/coco.h>
#include <asm/processor.h>
+#include <asm/sev.h>
enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
SYM_PIC_ALIAS(cc_vendor);
@@ -172,6 +173,7 @@ static void amd_cc_platform_clear(enum cc_attr attr)
switch (attr) {
case CC_ATTR_HOST_SEV_SNP:
cc_flags.host_sev_snp = 0;
+ snp_clear_rmpopt_configured();
break;
default:
break;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 86554de9a3f5..28540744f1eb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -761,6 +761,9 @@
#define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
#define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
#define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
+#define MSR_AMD64_RMPOPT_BASE 0xc0010139
+#define MSR_AMD64_RMPOPT_ENABLE_BIT 0
+#define MSR_AMD64_RMPOPT_ENABLE BIT_ULL(MSR_AMD64_RMPOPT_ENABLE_BIT)
#define MSR_SVSM_CAA 0xc001f000
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 594cfa19cbd4..0d662221615a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -662,6 +662,8 @@ static inline void snp_leak_pages(u64 pfn, unsigned int pages)
__snp_leak_pages(pfn, pages, true);
}
int snp_prepare(void);
+void snp_setup_rmpopt(void);
+void snp_clear_rmpopt_configured(void);
void snp_shutdown(void);
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
@@ -680,6 +682,8 @@ static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
static inline int snp_prepare(void) { return -ENODEV; }
+static inline void snp_setup_rmpopt(void) {}
+static inline void snp_clear_rmpopt_configured(void) {}
static inline void snp_shutdown(void) {}
#endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 8bcdce98f6dc..1b5c18408f0b 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -124,6 +124,10 @@ static void *rmp_bookkeeping __ro_after_init;
static u64 probed_rmp_base, probed_rmp_size;
+static cpumask_t rmpopt_cpumask;
+static phys_addr_t rmpopt_pa_start;
+static bool rmpopt_configured;
+
static LIST_HEAD(snp_leaked_pages_list);
static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
@@ -490,7 +494,12 @@ static bool __init setup_rmptable(void)
if (rmp_cfg & MSR_AMD64_SEG_RMP_ENABLED) {
if (!setup_segmented_rmptable())
return false;
+ rmpopt_configured = true;
} else {
+ /*
+ * RMPOPT requires a segmented RMP table, so leave
+ * rmpopt_configured clear on contiguous RMP systems.
+ */
if (!setup_contiguous_rmptable())
return false;
}
@@ -555,6 +564,21 @@ int snp_prepare(void)
}
EXPORT_SYMBOL_FOR_MODULES(snp_prepare, "ccp");
+static void rmpopt_cleanup(void)
+{
+ int cpu;
+
+ cpus_read_lock();
+
+ for_each_cpu(cpu, &rmpopt_cpumask)
+ WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, 0));
+
+ cpus_read_unlock();
+
+ cpumask_clear(&rmpopt_cpumask);
+ rmpopt_pa_start = 0;
+}
+
void snp_shutdown(void)
{
u64 syscfg;
@@ -563,11 +587,57 @@ void snp_shutdown(void)
if (syscfg & MSR_AMD64_SYSCFG_SNP_EN)
return;
+ rmpopt_cleanup();
+
clear_rmp();
on_each_cpu(mfd_reconfigure, NULL, 1);
}
EXPORT_SYMBOL_FOR_MODULES(snp_shutdown, "ccp");
+void snp_clear_rmpopt_configured(void)
+{
+ rmpopt_configured = false;
+}
+
+void snp_setup_rmpopt(void)
+{
+ u64 rmpopt_base;
+ int cpu;
+
+ if (!cpu_feature_enabled(X86_FEATURE_RMPOPT) || !rmpopt_configured)
+ return;
+
+ cpus_read_lock();
+
+ /*
+ * The RMPOPT_BASE MSR is per-core, so only one thread per core needs
+ * to set up the RMPOPT_BASE MSR.
+ *
+ * Note: only online primary threads are included. If a core's
+ * primary thread is offline, that core is not covered. CPU hotplug
+ * is not currently supported with SNP enabled.
+ */
+
+ for_each_online_cpu(cpu)
+ if (topology_is_primary_thread(cpu))
+ cpumask_set_cpu(cpu, &rmpopt_cpumask);
+
+ rmpopt_pa_start = ALIGN_DOWN(PFN_PHYS(min_low_pfn), SZ_1G);
+ rmpopt_base = rmpopt_pa_start | MSR_AMD64_RMPOPT_ENABLE;
+
+ /*
+ * Per-CPU RMPOPT tables support at most 2 TB of addressable memory
+ * for RMP optimizations. Initialize the per-CPU RMPOPT table base
+ * to the starting physical address to enable RMP optimizations for
+ * up to 2 TB of system RAM on all CPUs.
+ */
+ for_each_cpu(cpu, &rmpopt_cpumask)
+ WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, rmpopt_base));
+
+ cpus_read_unlock();
+}
+EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");
+
/*
* Do the necessary preparations which are verified by the firmware as
* described in the SNP_INIT_EX firmware command description in the SNP
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 78f98aee7a66..217b6b19802e 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -1478,6 +1478,9 @@ static int __sev_snp_init_locked(int *error, unsigned int max_snp_asid)
}
snp_hv_fixed_pages_state_update(sev, HV_FIXED);
+
+ snp_setup_rmpopt();
+
sev->snp_initialized = true;
dev_dbg(sev->dev, "SEV-SNP firmware initialized, SEV-TIO is %s\n",
data.tio_en ? "enabled" : "disabled");
--
2.43.0
* [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
2026-06-15 19:48 ` [PATCH v8 1/7] x86/cpufeatures: Add X86_FEATURE_RMPOPT feature flag Ashish Kalra
2026-06-15 19:48 ` [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
@ 2026-06-15 19:49 ` Ashish Kalra
2026-06-15 20:14 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
` (3 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:49 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
The SEV firmware enumerates the CPUs at SNP initialization and is not
aware of the OS bringing CPUs online or offline afterwards, so OS CPU
hotplug can diverge from the firmware's expectations and break SNP.
Disable CPU hotplug while SNP is active.
SNP is fully torn down only on the SNP_SHUTDOWN_EX path with
x86_snp_shutdown set; the legacy path leaves SNP enabled in hardware while clearing
snp_initialized, so __sev_snp_init_locked() can run again. Track the
disable with a flag so it is balanced by a matching enable rather than
stacked, and re-enable hotplug only on the x86_snp_shutdown path, after
snp_shutdown() has cleared the per-core RMPOPT_BASE MSRs with hotplug
still disabled.
This also keeps the CPU set stable for the asynchronous RMPOPT scan
added later in this series, and ensures cpus_read_lock() in the scan
is uncontended.
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
drivers/crypto/ccp/sev-dev.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
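Note (not part of the commit message): the flag exists so that repeated
init calls do not stack the hotplug disable count. A minimal userspace model
of that balancing follows. All names here are hypothetical stand-ins:
hotplug_disable_depth models the kernel's internal disable count, and the
panic case handled by the real shutdown path is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's cpu_hotplug_disable() nesting count. */
static int hotplug_disable_depth;
/* Mirrors snp_cpu_hotplug_disabled from this patch. */
static bool snp_hotplug_disabled;

/* Models the hotplug part of __sev_snp_init_locked(). */
static void model_snp_init(void)
{
	if (!snp_hotplug_disabled) {
		hotplug_disable_depth++;	/* cpu_hotplug_disable() */
		snp_hotplug_disabled = true;
	}
}

/* Models the hotplug part of __sev_snp_shutdown_locked(), non-panic case. */
static void model_snp_shutdown(bool x86_snp_shutdown)
{
	/* The legacy path (x86_snp_shutdown == false) leaves hotplug disabled. */
	if (x86_snp_shutdown && snp_hotplug_disabled) {
		hotplug_disable_depth--;	/* cpu_hotplug_enable() */
		snp_hotplug_disabled = false;
	}
}
```

With this model, init, legacy shutdown, init again leaves the depth at 1
rather than 2, and a single full shutdown brings it back to 0.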
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 217b6b19802e..c8c3c577463c 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -106,6 +106,9 @@ struct snp_hv_fixed_pages_entry {
static LIST_HEAD(snp_hv_fixed_pages);
+/* Set while SNP has CPU hotplug disabled. */
+static bool snp_cpu_hotplug_disabled;
+
/* Trusted Memory Region (TMR):
* The TMR is a 1MB area that must be 1MB aligned. Use the page allocator
* to allocate the memory, which will return aligned memory for the specified
@@ -1479,6 +1482,17 @@ static int __sev_snp_init_locked(int *error, unsigned int max_snp_asid)
snp_hv_fixed_pages_state_update(sev, HV_FIXED);
+ /*
+ * Disable CPU hotplug while SNP is active. Guard against stacking
+ * the disable count: the legacy SNP_SHUTDOWN_EX path clears
+ * snp_initialized without re-enabling hotplug, so this can run
+ * again while hotplug is already disabled.
+ */
+ if (!snp_cpu_hotplug_disabled) {
+ cpu_hotplug_disable();
+ snp_cpu_hotplug_disabled = true;
+ }
+
snp_setup_rmpopt();
sev->snp_initialized = true;
@@ -2083,8 +2097,21 @@ static int __sev_snp_shutdown_locked(int *error, bool panic)
}
if (data.x86_snp_shutdown) {
- if (!panic)
+ if (!panic) {
snp_shutdown();
+ /*
+ * snp_shutdown() fully tears SNP down (clear_rmp()) and
+ * has already cleared the per-core RMPOPT_BASE MSRs via
+ * rmpopt_cleanup() with hotplug still disabled. Re-enable
+ * CPU hotplug now. On the legacy path SNP stays
+ * enabled in hardware, so hotplug is correctly left
+ * disabled.
+ */
+ if (snp_cpu_hotplug_disabled) {
+ cpu_hotplug_enable();
+ snp_cpu_hotplug_disabled = false;
+ }
+ }
snp_hv_fixed_pages_state_update(sev, ALLOCATED);
} else {
/*
--
2.43.0
* [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
` (2 preceding siblings ...)
2026-06-15 19:49 ` [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active Ashish Kalra
@ 2026-06-15 19:49 ` Ashish Kalra
2026-06-15 20:07 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 5/7] x86/sev: Add interface to re-enable RMP optimizations Ashish Kalra
` (2 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:49 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
When SEV-SNP is enabled, all writes to memory are checked to ensure
integrity of SNP guest memory. This imposes performance overhead on the
whole system.
RMPOPT is a new instruction that minimizes the performance overhead of
RMP checks on the hypervisor and on non-SNP guests by allowing RMP
checks to be skipped for 1GB regions of memory that are known not to
contain any SEV-SNP guest memory.
Add support for performing RMP optimizations asynchronously using a
dedicated workqueue.
Enable RMPOPT optimizations for up to 2TB of system RAM starting from
the lowest physical memory address aligned down to a 1GB boundary at
RMP initialization time. RMP checks can initially be skipped for 1GB
memory ranges that do not contain SEV-SNP guest memory (excluding
preassigned pages such as the RMP table and firmware pages). As SNP
guests are launched, RMPUPDATE will disable the corresponding RMPOPT
optimizations.
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/virt/svm/sev.c | 230 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 227 insertions(+), 3 deletions(-)
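Note (not part of the commit message): the scan range is the lowest RAM
address aligned down to 1GB through the end of RAM aligned up to 1GB,
clamped to 2TB. A minimal userspace sketch of that computation follows. The
names struct rmpopt_range, rmpopt_scan_range() and rmpopt_region_count() are
hypothetical; min_pa and max_pa are assumed to correspond to
PFN_PHYS(min_low_pfn) and PFN_PHYS(max_pfn).

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1G	(UINT64_C(1) << 30)
#define SZ_2T	(UINT64_C(1) << 41)

/* Hypothetical container for [start, end) of the scanned range. */
struct rmpopt_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: compute the 1GB-aligned, 2TB-clamped scan range. */
static struct rmpopt_range rmpopt_scan_range(uint64_t min_pa, uint64_t max_pa)
{
	struct rmpopt_range r;

	r.start = min_pa & ~(SZ_1G - 1);		/* ALIGN_DOWN to 1GB */
	r.end = (max_pa + SZ_1G - 1) & ~(SZ_1G - 1);	/* ALIGN up to 1GB */

	/* Per-CPU RMPOPT tables cover at most 2TB. */
	if (r.end - r.start > SZ_2T)
		r.end = r.start + SZ_2T;

	return r;
}

/* Number of 1GB regions, i.e. RMPOPT invocations per core per scan. */
static uint64_t rmpopt_region_count(struct rmpopt_range r)
{
	return (r.end - r.start) / SZ_1G;
}
```

The clamp means a full scan issues at most 2048 RMPOPT instructions per core.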
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 1b5c18408f0b..b63b639bfc30 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -19,6 +19,7 @@
#include <linux/iommu.h>
#include <linux/amd-iommu.h>
#include <linux/nospec.h>
+#include <linux/workqueue.h>
#include <asm/sev.h>
#include <asm/processor.h>
@@ -125,9 +126,20 @@ static void *rmp_bookkeeping __ro_after_init;
static u64 probed_rmp_base, probed_rmp_size;
static cpumask_t rmpopt_cpumask;
-static phys_addr_t rmpopt_pa_start;
+static phys_addr_t rmpopt_pa_start, rmpopt_pa_end;
static bool rmpopt_configured;
+enum rmpopt_function {
+ RMPOPT_FUNC_VERIFY_AND_REPORT_STATUS,
+ RMPOPT_FUNC_REPORT_STATUS
+};
+
+#define RMPOPT_WORK_TIMEOUT 10000
+
+static struct workqueue_struct *rmpopt_wq;
+static struct delayed_work rmpopt_delayed_work;
+static DEFINE_MUTEX(rmpopt_wq_mutex);
+
static LIST_HEAD(snp_leaked_pages_list);
static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
@@ -568,6 +580,14 @@ static void rmpopt_cleanup(void)
{
int cpu;
+ guard(mutex)(&rmpopt_wq_mutex);
+
+ if (!rmpopt_wq)
+ return;
+
+ cancel_delayed_work_sync(&rmpopt_delayed_work);
+ destroy_workqueue(rmpopt_wq);
+
cpus_read_lock();
for_each_cpu(cpu, &rmpopt_cpumask)
@@ -576,7 +596,8 @@ static void rmpopt_cleanup(void)
cpus_read_unlock();
cpumask_clear(&rmpopt_cpumask);
- rmpopt_pa_start = 0;
+ rmpopt_pa_start = rmpopt_pa_end = 0;
+ rmpopt_wq = NULL;
}
void snp_shutdown(void)
@@ -599,6 +620,168 @@ void snp_clear_rmpopt_configured(void)
rmpopt_configured = false;
}
+/*
+ * RMPOPT: F2 0F 01 FC
+ * Input: RAX = system physical address (1GB aligned)
+ * RCX = operation type
+ * Output: CF set if the range was optimized
+ */
+static inline bool __rmpopt(u64 pa_start, u64 op_type)
+{
+ bool optimized;
+
+ asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfc"
+ : "=@ccc" (optimized)
+ : "a" (pa_start), "c" (op_type)
+ : "memory", "cc");
+
+ return optimized;
+}
+
+static void rmpopt(u64 pa)
+{
+ u64 pa_start = ALIGN_DOWN(pa, SZ_1G);
+ u64 op_type = RMPOPT_FUNC_VERIFY_AND_REPORT_STATUS;
+
+ __rmpopt(pa_start, op_type);
+}
+
+/*
+ * 'val' is a system physical address.
+ */
+static void rmpopt_smp(void *val)
+{
+ rmpopt((u64)val);
+}
+
+/*
+ * Leader function for work_on_cpu(): runs the full RMPOPT scan in
+ * process context on a CPU that has RMPOPT_BASE MSR programmed.
+ */
+static long rmpopt_leader_fn(void *arg)
+{
+ phys_addr_t pa;
+
+ for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
+ rmpopt(pa);
+ cond_resched();
+ }
+ return 0;
+}
+
+/*
+ * RMPOPT optimizations skip RMP checks at 1GB granularity if this
+ * range of memory does not contain any SNP guest memory.
+ */
+static void rmpopt_work_handler(struct work_struct *work)
+{
+ cpumask_var_t follower_mask;
+ phys_addr_t pa;
+ int this_cpu;
+
+ pr_info("Attempt RMP optimizations on physical address range @1GB alignment [0x%016llx - 0x%016llx]\n",
+ rmpopt_pa_start, rmpopt_pa_end);
+
+ if (!alloc_cpumask_var(&follower_mask, GFP_KERNEL))
+ return;
+
+ /*
+ * RMPOPT scans the RMP table and stores the result of the scan in
+ * reserved processor memory. The RMP scan is the most expensive
+ * part. If a second RMPOPT occurs, it can skip the expensive scan
+ * if it sees a cached result in the reserved processor memory.
+ *
+ * Do RMPOPT on one CPU alone. Then, follow that up with RMPOPT
+ * on every other primary thread. Followers are "designed to"
+ * skip the scan if they see the "cached" scan results.
+ */
+ cpumask_copy(follower_mask, &rmpopt_cpumask);
+
+ /*
+ * Pin the worker to the current CPU for the leader loop so that
+ * this_cpu remains valid and the RMPOPT instruction executes on
+ * the correct CPU.
+ *
+ * Use migrate_disable() rather than get_cpu() to prevent
+ * migration while still allowing preemption.
+ */
+ migrate_disable();
+ this_cpu = smp_processor_id();
+
+ if (cpumask_test_cpu(this_cpu, follower_mask)) {
+ /*
+ * Current CPU is a primary thread in rmpopt_cpumask.
+ * Run leader locally and remove from follower mask.
+ */
+ cpumask_clear_cpu(this_cpu, follower_mask);
+
+ for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
+ rmpopt(pa);
+ cond_resched();
+ }
+ } else if (cpumask_intersects(topology_sibling_cpumask(this_cpu),
+ follower_mask)) {
+ /*
+ * Current CPU is a sibling thread whose primary is in
+ * rmpopt_cpumask. RMPOPT_BASE MSR is per-core, so it
+ * is safe to run the leader locally. Remove the sibling's
+ * primary from the follower mask as this core is already
+ * covered by the leader.
+ */
+ cpumask_andnot(follower_mask, follower_mask,
+ topology_sibling_cpumask(this_cpu));
+
+ for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
+ rmpopt(pa);
+ cond_resched();
+ }
+ } else {
+ /*
+ * Current CPU does not have RMPOPT_BASE MSR programmed.
+ * Pick an explicit leader from the cpumask to avoid #UD.
+ * Use work_on_cpu() to run in process context on the leader,
+ * avoiding IPI latency.
+ */
+ int leader_cpu = cpumask_first(follower_mask);
+
+ if (WARN_ON_ONCE(leader_cpu >= nr_cpu_ids)) {
+ migrate_enable();
+ goto out;
+ }
+
+ cpumask_clear_cpu(leader_cpu, follower_mask);
+
+ /* Release migration pin before work_on_cpu(). */
+ migrate_enable();
+
+ work_on_cpu(leader_cpu, rmpopt_leader_fn, NULL);
+
+ goto followers;
+ }
+
+ migrate_enable();
+
+followers:
+ /*
+ * Followers: run RMPOPT on remaining cores.
+ * CPU hotplug is disabled while SNP is active
+ * (cpu_hotplug_disable() in __sev_snp_init_locked()),
+ * so cpus_read_lock() is uncontended.
+ */
+ cpus_read_lock();
+ for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
+ on_each_cpu_mask(follower_mask, rmpopt_smp,
+ (void *)pa, true);
+
+ /* Give a chance for other threads to run */
+ cond_resched();
+ }
+ cpus_read_unlock();
+
+out:
+ free_cpumask_var(follower_mask);
+}
+
void snp_setup_rmpopt(void)
{
u64 rmpopt_base;
@@ -607,11 +790,37 @@ void snp_setup_rmpopt(void)
if (!cpu_feature_enabled(X86_FEATURE_RMPOPT) || !rmpopt_configured)
return;
+ guard(mutex)(&rmpopt_wq_mutex);
+
+ /*
+ * Guard against re-initialization. When SNP_SHUTDOWN_EX is issued
+ * with x86_snp_shutdown=0, snp_shutdown() is not called and
+ * rmpopt_cleanup() is skipped, but snp_initialized is still cleared.
+ * A subsequent __sev_snp_init_locked() would call snp_setup_rmpopt()
+ * again, leaking the existing workqueue, delayed work, debugfs
+ * entries, and cpumask state.
+ */
+ if (rmpopt_wq)
+ return;
+
+ /*
+ * Create an RMPOPT-specific workqueue to avoid scheduling
+ * RMPOPT workitem on the global system workqueue.
+ */
+ rmpopt_wq = alloc_workqueue("rmpopt_wq", WQ_UNBOUND, 1);
+ if (!rmpopt_wq) {
+ pr_err("Failed to allocate RMPOPT workqueue\n");
+ return;
+ }
+
+ INIT_DELAYED_WORK(&rmpopt_delayed_work, rmpopt_work_handler);
+
cpus_read_lock();
/*
* The RMPOPT_BASE MSR is per-core, so only one thread per core needs
- * to set up the RMPOPT_BASE MSR.
+ * to set up the RMPOPT_BASE MSR. Likewise, only one thread per core
+ * needs to issue the RMPOPT instruction.
*
* Note: only online primary threads are included. If a core's
* primary thread is offline, that core is not covered. CPU hotplug
@@ -635,6 +844,21 @@ void snp_setup_rmpopt(void)
WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, rmpopt_base));
cpus_read_unlock();
+
+ rmpopt_pa_end = ALIGN(PFN_PHYS(max_pfn), SZ_1G);
+
+ /* Limit memory scanning to 2TB of RAM */
+ if ((rmpopt_pa_end - rmpopt_pa_start) > SZ_2T) {
+ pr_info("RMPOPT coverage limited to 2TB; memory above 0x%llx not optimized\n",
+ rmpopt_pa_start + SZ_2T);
+ rmpopt_pa_end = rmpopt_pa_start + SZ_2T;
+ }
+
+ /*
+ * Once all per-CPU RMPOPT tables have been configured, enable RMPOPT
+ * optimizations on all physical memory.
+ */
+ queue_delayed_work(rmpopt_wq, &rmpopt_delayed_work, 0);
}
EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");
--
2.43.0
* [PATCH v8 5/7] x86/sev: Add interface to re-enable RMP optimizations
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
` (3 preceding siblings ...)
2026-06-15 19:49 ` [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
@ 2026-06-15 19:49 ` Ashish Kalra
2026-06-15 19:50 ` [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
2026-06-15 19:50 ` [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT Ashish Kalra
6 siblings, 0 replies; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:49 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
The RMPOPT table is a per-CPU table that indicates whether 1GB regions of
physical memory are entirely hypervisor-owned.
When performing host memory accesses in hypervisor mode as well as
non-SNP guest mode, the processor may consult the RMPOPT table to
potentially skip an RMP access and improve performance.
Events such as RMPUPDATE can clear RMP optimizations. Add an interface
to re-enable those optimizations.
The interface uses mod_delayed_work() instead of queue_delayed_work()
so that the delay timer is reset on each call. This provides proper
batching semantics: re-optimization runs 10 seconds after the *last*
VM termination rather than after the first. mod_delayed_work() also
re-queues work that is already in-flight, so a re-scan request
during an active scan is not silently dropped.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/include/asm/sev.h | 2 ++
arch/x86/virt/svm/sev.c | 15 +++++++++++++++
2 files changed, 17 insertions(+)
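Note (not part of the commit message): the batching behaviour described
above comes from the difference between re-arming a pending timer and
leaving it alone. A minimal userspace model follows. All names are
hypothetical; the model only tracks the expiry time and assumes every call
arrives before the work has fired.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as RMPOPT_WORK_TIMEOUT in this series (milliseconds). */
#define RMPOPT_WORK_TIMEOUT_MS	10000

/* Hypothetical model of a delayed work item: pending flag plus expiry. */
struct model_delayed_work {
	int pending;
	uint64_t expires_ms;
};

/* mod_delayed_work()-like: always (re)arm the timer relative to 'now'. */
static void model_mod(struct model_delayed_work *w, uint64_t now_ms)
{
	w->pending = 1;
	w->expires_ms = now_ms + RMPOPT_WORK_TIMEOUT_MS;
}

/* queue_delayed_work()-like: do nothing if the work is already pending. */
static void model_queue(struct model_delayed_work *w, uint64_t now_ms)
{
	if (w->pending)
		return;
	w->pending = 1;
	w->expires_ms = now_ms + RMPOPT_WORK_TIMEOUT_MS;
}

/* Replay a series of calls and return the resulting expiry time. */
static uint64_t expiry_after_calls(int use_mod, const uint64_t *times_ms, int n)
{
	struct model_delayed_work w = { 0, 0 };

	for (int i = 0; i < n; i++) {
		if (use_mod)
			model_mod(&w, times_ms[i]);
		else
			model_queue(&w, times_ms[i]);
	}
	return w.expires_ms;
}
```

For guests terminating at 0s, 3s and 8s, the mod variant fires 10 seconds
after the last termination, while the queue variant fires 10 seconds after
the first.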
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 0d662221615a..a11306f25336 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -662,6 +662,7 @@ static inline void snp_leak_pages(u64 pfn, unsigned int pages)
__snp_leak_pages(pfn, pages, true);
}
int snp_prepare(void);
+void snp_rmpopt_all_physmem(void);
void snp_setup_rmpopt(void);
void snp_clear_rmpopt_configured(void);
void snp_shutdown(void);
@@ -682,6 +683,7 @@ static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
static inline int snp_prepare(void) { return -ENODEV; }
+static inline void snp_rmpopt_all_physmem(void) {}
static inline void snp_setup_rmpopt(void) {}
static inline void snp_clear_rmpopt_configured(void) {}
static inline void snp_shutdown(void) {}
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index b63b639bfc30..253a534b9a0d 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -782,6 +782,21 @@ static void rmpopt_work_handler(struct work_struct *work)
free_cpumask_var(follower_mask);
}
+void snp_rmpopt_all_physmem(void)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_RMPOPT) || !rmpopt_configured)
+ return;
+
+ guard(mutex)(&rmpopt_wq_mutex);
+
+ if (!rmpopt_wq)
+ return;
+
+ mod_delayed_work(rmpopt_wq, &rmpopt_delayed_work,
+ msecs_to_jiffies(RMPOPT_WORK_TIMEOUT));
+}
+EXPORT_SYMBOL_GPL(snp_rmpopt_all_physmem);
+
void snp_setup_rmpopt(void)
{
u64 rmpopt_base;
--
2.43.0
* [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
` (4 preceding siblings ...)
2026-06-15 19:49 ` [PATCH v8 5/7] x86/sev: Add interface to re-enable RMP optimizations Ashish Kalra
@ 2026-06-15 19:50 ` Ashish Kalra
2026-06-15 20:06 ` sashiko-bot
2026-06-15 19:50 ` [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT Ashish Kalra
6 siblings, 1 reply; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:50 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
Pages are converted from shared to private as SNP guests are launched.
This destroys existing RMPOPT optimizations in the regions where
pages are converted.
Conversely, guest pages are converted back to shared during SNP guest
termination and their region may become eligible for RMPOPT
optimization.
To take advantage of this, perform RMPOPT after guest termination.
Do it after a delay so that a single RMPOPT pass can be done if
multiple guests terminate in a short period of time.
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/kvm/svm/sev.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e107f368ed2d..29af6f6e603c 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3005,6 +3005,8 @@ void sev_vm_destroy(struct kvm *kvm)
*/
if (snp_decommission_context(kvm))
return;
+
+ snp_rmpopt_all_physmem();
} else {
sev_unbind_asid(kvm, sev->handle);
}
--
2.43.0
* [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
` (5 preceding siblings ...)
2026-06-15 19:50 ` [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
@ 2026-06-15 19:50 ` Ashish Kalra
2026-06-15 20:07 ` sashiko-bot
6 siblings, 1 reply; 12+ messages in thread
From: Ashish Kalra @ 2026-06-15 19:50 UTC
To: tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, peterz,
thomas.lendacky, herbert, davem, ardb
Cc: pbonzini, aik, Michael.Roth, KPrateek.Nayak, Tycho.Andersen,
Nathan.Fontenot, ackerleytng, jackyli, pgonda, rientjes, jacobhxu,
xin, pawan.kumar.gupta, babu.moger, dyoung, nikunj, john.allen,
darwi, linux-kernel, linux-crypto, kvm, linux-coco
From: Ashish Kalra <ashish.kalra@amd.com>
Add a debugfs interface to report per-CPU RMPOPT status across all
system RAM.
To dump the per-CPU RMPOPT status for all system RAM:
/sys/kernel/debug/rmpopt# cat rmpopt-table
Memory @ 0GB: CPU(s): none
Memory @ 1GB: CPU(s): none
Memory @ 2GB: CPU(s): 0-1023
Memory @ 3GB: CPU(s): 0-1023
Memory @ 4GB: CPU(s): none
Memory @ 5GB: CPU(s): 0-1023
Memory @ 6GB: CPU(s): 0-1023
Memory @ 7GB: CPU(s): 0-1023
...
Memory @1025GB: CPU(s): 0-1023
Memory @1026GB: CPU(s): 0-1023
Memory @1027GB: CPU(s): 0-1023
Memory @1028GB: CPU(s): 0-1023
Memory @1029GB: CPU(s): 0-1023
Memory @1030GB: CPU(s): 0-1023
Memory @1031GB: CPU(s): 0-1023
Memory @1032GB: CPU(s): 0-1023
Memory @1033GB: CPU(s): 0-1023
Memory @1034GB: CPU(s): 0-1023
Memory @1035GB: CPU(s): 0-1023
Memory @1036GB: CPU(s): 0-1023
Memory @1037GB: CPU(s): 0-1023
Memory @1038GB: CPU(s): none
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/virt/svm/sev.c | 128 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 128 insertions(+)
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 253a534b9a0d..b8b00c50ce41 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -20,6 +20,8 @@
#include <linux/amd-iommu.h>
#include <linux/nospec.h>
#include <linux/workqueue.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
#include <asm/sev.h>
#include <asm/processor.h>
@@ -145,6 +147,15 @@ static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
static unsigned long snp_nr_leaked_pages;
+/* All users of rmpopt_report_cpumask must hold rmpopt_show_mutex. */
+static cpumask_t rmpopt_report_cpumask;
+static struct dentry *rmpopt_debugfs;
+static DEFINE_MUTEX(rmpopt_show_mutex);
+
+struct seq_paddr {
+ phys_addr_t next_seq_paddr;
+};
+
#undef pr_fmt
#define pr_fmt(fmt) "SEV-SNP: " fmt
@@ -587,6 +598,8 @@ static void rmpopt_cleanup(void)
cancel_delayed_work_sync(&rmpopt_delayed_work);
destroy_workqueue(rmpopt_wq);
+ debugfs_remove_recursive(rmpopt_debugfs);
+ rmpopt_debugfs = NULL;
cpus_read_lock();
@@ -635,6 +648,10 @@ static inline bool __rmpopt(u64 pa_start, u64 op_type)
: "a" (pa_start), "c" (op_type)
: "memory", "cc");
+ if (op_type == RMPOPT_FUNC_REPORT_STATUS)
+ assign_cpu(smp_processor_id(), &rmpopt_report_cpumask,
+ optimized);
+
return optimized;
}
@@ -669,6 +686,115 @@ static long rmpopt_leader_fn(void *arg)
return 0;
}
+/*
+ * 'val' is a system physical address.
+ */
+static void rmpopt_report_status(void *val)
+{
+ u64 pa_start = ALIGN_DOWN((u64)val, SZ_1G);
+ u64 op_type = RMPOPT_FUNC_REPORT_STATUS;
+
+ __rmpopt(pa_start, op_type);
+}
+
+/*
+ * start() can be called multiple times if the allocated buffer has
+ * overflowed and a bigger buffer is allocated.
+ */
+static void *rmpopt_table_seq_start(struct seq_file *seq, loff_t *pos)
+{
+ phys_addr_t end_paddr = rmpopt_pa_end;
+ struct seq_paddr *p = seq->private;
+
+ if (*pos == 0) {
+ p->next_seq_paddr = rmpopt_pa_start;
+ if (p->next_seq_paddr >= end_paddr)
+ return NULL;
+ return &p->next_seq_paddr;
+ }
+
+ if (p->next_seq_paddr >= end_paddr)
+ return NULL;
+
+ return &p->next_seq_paddr;
+}
+
+static void *rmpopt_table_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+ phys_addr_t end_paddr = rmpopt_pa_end;
+ phys_addr_t *curr_paddr = v;
+
+ (*pos)++;
+ *curr_paddr += SZ_1G;
+ if (*curr_paddr >= end_paddr)
+ return NULL;
+
+ return curr_paddr;
+}
+
+static void rmpopt_table_seq_stop(struct seq_file *seq, void *v)
+{
+}
+
+static int rmpopt_table_seq_show(struct seq_file *seq, void *v)
+{
+ phys_addr_t *curr_paddr = v;
+
+ guard(mutex)(&rmpopt_show_mutex);
+
+ seq_printf(seq, "Memory @%3lluGB: ",
+ *curr_paddr >> (get_order(SZ_1G) + PAGE_SHIFT));
+
+ /*
+ * Query all online CPUs rather than just rmpopt_cpumask (primary
+ * threads only). The RMPOPT instruction only needs to run on one
+ * thread per core for the optimization to take effect, but debugfs
+ * reporting requires the RMPOPT status across all CPUs.
+ * Performance is not a concern for this diagnostic interface.
+ *
+ * This is safe because RMPOPT_BASE MSR is per-core and
+ * snp_prepare() ensures all CPUs are online when the MSR is
+ * programmed during snp_setup_rmpopt().
+ */
+ cpumask_clear(&rmpopt_report_cpumask);
+ on_each_cpu_mask(cpu_online_mask, rmpopt_report_status,
+ (void *)*curr_paddr, true);
+
+ if (cpumask_empty(&rmpopt_report_cpumask))
+ seq_puts(seq, "CPU(s): none\n");
+ else
+ seq_printf(seq, "CPU(s): %*pbl\n", cpumask_pr_args(&rmpopt_report_cpumask));
+
+ return 0;
+}
+
+static const struct seq_operations rmpopt_table_seq_ops = {
+ .start = rmpopt_table_seq_start,
+ .next = rmpopt_table_seq_next,
+ .stop = rmpopt_table_seq_stop,
+ .show = rmpopt_table_seq_show
+};
+
+static int rmpopt_table_open(struct inode *inode, struct file *file)
+{
+ return seq_open_private(file, &rmpopt_table_seq_ops, sizeof(struct seq_paddr));
+}
+
+static const struct file_operations rmpopt_table_fops = {
+ .open = rmpopt_table_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release_private,
+};
+
+static void rmpopt_debugfs_setup(void)
+{
+ rmpopt_debugfs = debugfs_create_dir("rmpopt", arch_debugfs_dir);
+
+ debugfs_create_file("rmpopt-table", 0400, rmpopt_debugfs,
+ NULL, &rmpopt_table_fops);
+}
+
/*
* RMPOPT optimizations skip RMP checks at 1GB granularity if this
* range of memory does not contain any SNP guest memory.
@@ -874,6 +1000,8 @@ void snp_setup_rmpopt(void)
* optimizations on all physical memory.
*/
queue_delayed_work(rmpopt_wq, &rmpopt_delayed_work, 0);
+
+ rmpopt_debugfs_setup();
}
EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");
--
2.43.0
* Re: [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown
2026-06-15 19:50 ` [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
@ 2026-06-15 20:06 ` sashiko-bot
0 siblings, 0 replies; 12+ messages in thread
From: sashiko-bot @ 2026-06-15 20:06 UTC (permalink / raw)
To: Ashish Kalra; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] RMPOPT optimization is skipped for large SNP VMs because the 10-second delay starts before memory is converted to shared.
--
commit b9cfe7528db8b373ee160983c730e4dd4a7d1b8a
Author: Ashish Kalra <ashish.kalra@amd.com>
KVM: SEV: Perform RMP optimizations on SNP guest shutdown
This commit introduces an optimization to perform RMPOPT after an SNP guest
terminates. It calls snp_rmpopt_all_physmem() in sev_vm_destroy(), which
queues a scan after a 10-second delay so multiple terminations can be batched.
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4272293478766..0e94f81907ec2 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2992,6 +2992,8 @@ void sev_vm_destroy(struct kvm *kvm)
> */
> if (snp_decommission_context(kvm))
> return;
> +
> + snp_rmpopt_all_physmem();
[Severity: High]
Does this delay start too early for large VMs?
In VFS, when the last reference to a guest_memfd is dropped, __fput() calls
kvm_gmem_release(). This drops the KVM reference, triggering sev_vm_destroy()
which starts the 10-second delay here.
After kvm_gmem_release() returns, VFS drops the inode, triggering
evict_inode(). This frees folios via sev_gmem_invalidate():
arch/x86/kvm/svm/sev.c:sev_gmem_invalidate() {
...
for (pfn = start; pfn < end;) {
...
rc = rmp_make_shared(...);
...
clflush_cache_range(...);
...
}
}
For large VMs, sequentially converting millions of pages back to shared
can take significantly longer than 10 seconds. The RMPOPT scan would then
execute while sev_gmem_invalidate() is still running, observe the remaining
pages as private, and skip optimizing them.
Will this result in permanently missing the optimization opportunity for those
pages on large guests?
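To put rough numbers on the concern, here is a minimal back-of-the-envelope
model in plain userspace C. It is not kernel code: the function name, the
constant conversion rate, and the assumption that guest memory is released
sequentially in whole 1-GB regions are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: all rates below are assumptions. */
#define PAGES_PER_GB (1ULL << 18)	/* 4 KiB pages in one 1-GB region */

/*
 * Number of 1-GB regions that still contain at least one private page
 * when the delayed scan fires, assuming pages are converted back to
 * shared sequentially at a constant rate starting at time 0.
 */
static uint64_t regions_still_private(uint64_t total_pages,
				      uint64_t pages_per_sec,
				      uint64_t delay_sec)
{
	uint64_t converted = pages_per_sec * delay_sec;
	uint64_t total_regions, clean_regions;

	if (converted >= total_pages)
		return 0;

	total_regions = (total_pages + PAGES_PER_GB - 1) / PAGES_PER_GB;
	clean_regions = converted / PAGES_PER_GB;	/* fully converted regions */
	return total_regions - clean_regions;
}
```

Under an assumed rate of one million pages per second, a 1-TB guest would
leave most of its 1-GB regions unoptimized by a scan that fires after 10
seconds, while a 4-GB guest would be fully converted in time. The real
conversion rate depends on hardware and is not stated in the patch.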
> } else {
> sev_unbind_asid(kvm, sev->handle);
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781419998.git.ashish.kalra@amd.com?part=6
* Re: [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously
2026-06-15 19:49 ` [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
@ 2026-06-15 20:07 ` sashiko-bot
0 siblings, 0 replies; 12+ messages in thread
From: sashiko-bot @ 2026-06-15 20:07 UTC (permalink / raw)
To: Ashish Kalra; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Unsynchronized MSR teardown causes #UD panic when reading RMPOPT debugfs file.
--
commit f092a06fad8eb0834da2fe3bbc80d3b029232484
Author: Ashish Kalra <ashish.kalra@amd.com>
x86/sev: Add support to perform RMP optimizations asynchronously
When SEV-SNP is enabled, all writes to memory are checked to ensure
integrity of SNP guest memory. This imposes performance overhead on the
whole system. This patch adds support for performing RMP optimizations
asynchronously using a dedicated workqueue.
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 1b5c18408f0bf..b63b639bfc30b 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
[ ... ]
> @@ -568,6 +580,14 @@ static void rmpopt_cleanup(void)
> {
> int cpu;
>
> + guard(mutex)(&rmpopt_wq_mutex);
> +
> + if (!rmpopt_wq)
> + return;
> +
> + cancel_delayed_work_sync(&rmpopt_delayed_work);
> + destroy_workqueue(rmpopt_wq);
> +
> cpus_read_lock();
>
> for_each_cpu(cpu, &rmpopt_cpumask)
> @@ -576,7 +596,8 @@ static void rmpopt_cleanup(void)
> cpus_read_unlock();
>
> cpumask_clear(&rmpopt_cpumask);
> - rmpopt_pa_start = 0;
> + rmpopt_pa_start = rmpopt_pa_end = 0;
> + rmpopt_wq = NULL;
> }
[Severity: High]
Could this MSR teardown cause a #UD panic if it races with the debugfs
interface introduced later in this series?
rmpopt_table_seq_show() unconditionally executes the RMPOPT instruction on
all online CPUs to display their optimization status, protected only by
rmpopt_show_mutex.
If system operations like module unload or kexec trigger snp_shutdown(),
it invokes rmpopt_cleanup() which tears down the feature by writing 0 to
the MSR_AMD64_RMPOPT_BASE MSR on all CPUs:
arch/x86/virt/svm/sev.c:rmpopt_cleanup() {
...
for_each_cpu(cpu, &rmpopt_cpumask)
WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, 0));
...
}
However, rmpopt_cleanup() does not acquire rmpopt_show_mutex, and the call
to debugfs_remove_recursive() does not block or abort in-progress read()
operations.
If rmpopt_cleanup() zeroes the MSR while rmpopt_table_seq_show() is still
executing on_each_cpu_mask(), could any CPU executing RMPOPT with the MSR
disabled generate an unhandled #UD (Invalid Opcode) exception?
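One possible way to close such a window is to have teardown and the reader
serialize on the same lock and to re-check an "enabled" state under it. The
following is a userspace sketch of that pattern only; the names
model_seq_show(), model_cleanup(), and rmpopt_enabled are hypothetical, and a
pthread mutex stands in for rmpopt_show_mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model only; nothing here is kernel API. */
static pthread_mutex_t show_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool rmpopt_enabled = true;	/* stands in for a programmed RMPOPT_BASE */

/* Reader side: only "execute RMPOPT" while the feature is still enabled. */
static int model_seq_show(void)
{
	int ret = 0;

	pthread_mutex_lock(&show_mutex);
	if (!rmpopt_enabled)
		ret = -ENODEV;	/* teardown already ran: skip the instruction */
	/* else: the per-CPU status query would run here, under the lock */
	pthread_mutex_unlock(&show_mutex);
	return ret;
}

/* Teardown side: take the same lock before "zeroing the MSR". */
static void model_cleanup(void)
{
	pthread_mutex_lock(&show_mutex);
	rmpopt_enabled = false;
	pthread_mutex_unlock(&show_mutex);
}
```

With this ordering a reader either finishes before teardown starts or
observes the disabled state and returns an error. Whether the same lock is
appropriate in the kernel depends on lock ordering against rmpopt_wq_mutex,
which this sketch does not model.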
--
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781419998.git.ashish.kalra@amd.com?part=4
* Re: [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT
2026-06-15 19:50 ` [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT Ashish Kalra
@ 2026-06-15 20:07 ` sashiko-bot
0 siblings, 0 replies; 12+ messages in thread
From: sashiko-bot @ 2026-06-15 20:07 UTC (permalink / raw)
To: Ashish Kalra; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Broadcasting the RMPOPT instruction to cpu_online_mask in debugfs can trigger a kernel panic (#UD/#GP) if a core's primary thread is offline.
--
commit bd48c96c2b885c0673d20119de172ea1dd26cdf2
Author: Ashish Kalra <ashish.kalra@amd.com>
x86/sev: Add debugfs support for RMPOPT
This commit adds a debugfs interface to report per-CPU RMPOPT status across
all system RAM.
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
[ ... ]
> @@ -669,6 +686,115 @@ static long rmpopt_leader_fn(void *arg)
[ ... ]
> +static int rmpopt_table_seq_show(struct seq_file *seq, void *v)
> +{
> + phys_addr_t *curr_paddr = v;
> +
> + guard(mutex)(&rmpopt_show_mutex);
> +
> + seq_printf(seq, "Memory @%3lluGB: ",
> + *curr_paddr >> (get_order(SZ_1G) + PAGE_SHIFT));
> +
> + /*
> + * Query all online CPUs rather than just rmpopt_cpumask (primary
> + * threads only). The RMPOPT instruction only needs to run on one
> + * thread per core for the optimization to take effect, but debugfs
> + * reporting requires the RMPOPT status across all CPUs.
> + * Performance is not a concern for this diagnostic interface.
> + *
> + * This is safe because RMPOPT_BASE MSR is per-core and
> + * snp_prepare() ensures all CPUs are online when the MSR is
> + * programmed during snp_setup_rmpopt().
> + */
> + cpumask_clear(&rmpopt_report_cpumask);
> + on_each_cpu_mask(cpu_online_mask, rmpopt_report_status,
> + (void *)*curr_paddr, true);
[Severity: High]
Could broadcasting to cpu_online_mask here trigger a kernel panic
if a core's primary thread is offline?
In snp_setup_rmpopt(), the base MSR is only programmed on CPUs in
rmpopt_cpumask, which is strictly limited to online primary threads:
if (topology_is_primary_thread(cpu))
cpumask_set_cpu(cpu, &rmpopt_cpumask);
If a core's primary thread is offline but its secondary thread is online,
the secondary thread will be included in cpu_online_mask, but the MSR for
its core will remain unprogrammed.
When this debugfs file is read, that secondary thread will receive the IPI
and execute the RMPOPT instruction in rmpopt_report_status(). Because its
core's MSR was never enabled, won't executing the instruction raise
a fatal invalid opcode exception and panic the kernel?
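One conceivable mitigation is to restrict the query to online CPUs whose core
actually had its base MSR programmed. The sketch below illustrates that
filtering with plain bitmasks in userspace C. It is a toy model: the function
name is hypothetical, and the rule "two threads per core, even sibling is
primary" is an assumption made only for illustration.

```c
#include <stdint.h>

/*
 * Toy model: 64 CPUs as bits, two SMT threads per core, and the even
 * sibling treated as the primary thread. Real topology comes from
 * topology_is_primary_thread() and the sibling masks, not from this rule.
 */
static uint64_t report_safe_mask(uint64_t online, uint64_t programmed)
{
	uint64_t safe = 0;

	for (unsigned int cpu = 0; cpu < 64; cpu++) {
		unsigned int primary = cpu & ~1u;	/* assumed primary sibling */

		if (!(online & (1ULL << cpu)))
			continue;
		/* Only query CPUs whose core had its base MSR programmed. */
		if (programmed & (1ULL << primary))
			safe |= 1ULL << cpu;
	}
	return safe;
}
```

For example, with CPU 0 offline, CPUs 1-3 online, and only CPU 2 programmed
as a primary thread, the resulting mask contains CPUs 2 and 3 and leaves out
CPU 1, whose core was never set up.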
--
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781419998.git.ashish.kalra@amd.com?part=7
* Re: [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs
2026-06-15 19:48 ` [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
@ 2026-06-15 20:14 ` sashiko-bot
0 siblings, 0 replies; 12+ messages in thread
From: sashiko-bot @ 2026-06-15 20:14 UTC (permalink / raw)
To: Ashish Kalra; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] MSR_AMD64_RMPOPT_BASE is not restored after suspend/resume, silently disabling the RMPOPT feature and leading to unhandled faults.
- [High] wrmsrq_on_cpu() silently swallows hardware faults, making WARN_ON_ONCE() checks ineffective and masking MSR setup failures.
--
commit 205a673f128c777438687610b84781bd7b8a3533
Author: Ashish Kalra <ashish.kalra@amd.com>
x86/sev: Initialize RMPOPT configuration MSRs
The new RMPOPT instruction helps manage per-CPU RMP optimization
structures inside the CPU. Initialize the per-CPU RMPOPT table base to
the starting physical address to enable RMP optimization for up to 2 TB
of system RAM on all CPUs.
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 8bcdce98f6dce..1b5c18408f0bf 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
[ ... ]
> @@ -555,6 +564,21 @@ int snp_prepare(void)
> }
> EXPORT_SYMBOL_FOR_MODULES(snp_prepare, "ccp");
>
> +static void rmpopt_cleanup(void)
> +{
> + int cpu;
> +
> + cpus_read_lock();
> +
> + for_each_cpu(cpu, &rmpopt_cpumask)
> + WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, 0));
[Severity: High]
Does wrmsrq_on_cpu() return an error if the MSR write causes a hardware
fault?
Looking at wrmsrq_on_cpu(), it uses wrmsr() which relies on the exception
table handler to log the error and resume execution, and always returns 0
if the IPI succeeds.
If the MSR write causes a fault, it seems this WARN_ON_ONCE() won't trigger,
masking a potential disablement failure. Should wrmsrq_safe_on_cpu() be used
here instead?
> +
> + cpus_read_unlock();
> +
> + cpumask_clear(&rmpopt_cpumask);
> + rmpopt_pa_start = 0;
> +}
> +
[ ... ]
> +void snp_setup_rmpopt(void)
> +{
> + u64 rmpopt_base;
> + int cpu;
[ ... ]
> + /*
> + * Per-CPU RMPOPT tables support at most 2 TB of addressable memory
> + * for RMP optimizations. Initialize the per-CPU RMPOPT table base
> + * to the starting physical address to enable RMP optimizations for
> + * up to 2 TB of system RAM on all CPUs.
> + */
> + for_each_cpu(cpu, &rmpopt_cpumask)
> + WARN_ON_ONCE(wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, rmpopt_base));
[Severity: High]
If a system goes into a sleep state like suspend-to-RAM (S3), CPU MSRs are
typically reset.
Since this MSR is initialized here during boot, how is it restored when the
system wakes up?
Without a power management callback like syscore_ops resume hooks or CPU PM
notifiers, will the RMPOPT feature be silently disabled in hardware? While
a later patch disables CPU hotplug, S3 suspend via freeze_secondary_cpus()
explicitly bypasses the cpu_hotplug_disabled check, making this a reachable
path.
Also, similar to rmpopt_cleanup(), if this MSR setup silently fails because
wrmsrq_on_cpu() swallows hardware faults, could subsequent RMPOPT instructions
cause unhandled faults and lead to a kernel panic?
> +
> + cpus_read_unlock();
> +}
> +EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");
--
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781419998.git.ashish.kalra@amd.com?part=2
* Re: [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active
2026-06-15 19:49 ` [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active Ashish Kalra
@ 2026-06-15 20:14 ` sashiko-bot
0 siblings, 0 replies; 12+ messages in thread
From: sashiko-bot @ 2026-06-15 20:14 UTC (permalink / raw)
To: Ashish Kalra; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Medium] The `snp_cpu_hotplug_disabled` flag is a module-local static variable used to track the global `cpu_hotplug_disabled` counter. On systems with legacy firmware, CPU hotplug is intentionally left disabled when shutting down SNP. However, unloading and reloading the `ccp` module resets this static flag, causing subsequent SNP initializations to call `cpu_hotplug_disable()` again. This unintentionally stacks the global counter, leading to a reference leak over multiple load/unload cycles.
Pre-existing issues:
- [High] If `SEV_CMD_SNP_DF_FLUSH` fails during SNP initialization, the firmware state of HV_FIXED pages is leaked, potentially leading to RMP faults and host crashes when the pages are freed.
--
commit f12f8db5e522836adb88644cf2b682abf1545290
Author: Ashish Kalra <ashish.kalra@amd.com>
crypto/ccp: Disable CPU hotplug while SNP is active
This commit disables CPU hotplug during SNP execution to prevent
divergence between OS and firmware CPU expectations. It adds a static
flag to balance enable/disable calls and avoids re-enabling hotplug
on the legacy shutdown path.
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index c002a7ca26a8a..cdda30a7677e2 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
[ ... ]
> @@ -1478,6 +1481,17 @@ static int __sev_snp_init_locked(int *error, unsigned int max_snp_asid)
>
> snp_hv_fixed_pages_state_update(sev, HV_FIXED);
[Severity: High]
This isn't a bug introduced by this patch, but if the preceding
SEV_CMD_SNP_DF_FLUSH command fails, the function returns early and
bypasses updating the software state of the HV_FIXED pages.
If the caller handles the initialization failure and cleans up via
snp_free_hv_fixed_pages(), will it see the state as ALLOCATED instead
of HV_FIXED?
drivers/crypto/ccp/sev-dev.c:snp_free_hv_fixed_pages() {
...
if (entry->page_state == HV_FIXED) {
entry->free = true;
} else {
__free_pages(page, entry->order);
list_del(&entry->list);
kfree(entry);
}
...
}
Could this incorrectly free firmware-locked pages to the buddy allocator,
potentially leading to RMP violations and host crashes later when those
pages are accessed?
>
> + /*
> + * Disable CPU hotplug while SNP is active. Guard against stacking
> + * the disable count: the legacy SNP_SHUTDOWN_EX path clears
> + * snp_initialized without re-enabling hotplug, so this can run
> + * again while hotplug is already disabled.
> + */
> + if (!snp_cpu_hotplug_disabled) {
> + cpu_hotplug_disable();
> + snp_cpu_hotplug_disabled = true;
> + }
> +
> snp_setup_rmpopt();
[Severity: Medium]
Since snp_cpu_hotplug_disabled is a module-local static variable,
does it get destroyed and reset to false when the ccp module is
unloaded?
If the legacy shutdown path intentionally keeps hotplug disabled,
would unloading and reloading the module cause the
!snp_cpu_hotplug_disabled check to evaluate to true again?
If so, won't cpu_hotplug_disable() be called again, stacking the global
counter and causing a reference leak over multiple load/unload cycles?
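The suspected stacking can be illustrated with a small userspace model. This
is only a sketch: the model_* functions and the counter variable are
hypothetical stand-ins, and a module reload is approximated by resetting the
static flag while leaving the global counter untouched.

```c
#include <stdbool.h>

/* Userspace model; the names echo the patch but are not the kernel's code. */
static int cpu_hotplug_disabled_count;	/* stands in for the global counter */
static bool snp_cpu_hotplug_disabled;	/* module-local flag */

static void model_snp_init(void)
{
	if (!snp_cpu_hotplug_disabled) {
		cpu_hotplug_disabled_count++;	/* models cpu_hotplug_disable() */
		snp_cpu_hotplug_disabled = true;
	}
}

/* Legacy shutdown: hotplug intentionally stays disabled, flag stays set. */
static void model_legacy_shutdown(void)
{
}

/* Module reload: static storage is re-initialized, the global counter is not. */
static void model_module_reload(void)
{
	snp_cpu_hotplug_disabled = false;
}
```

In this model, repeated init calls within one module lifetime keep the
counter at 1, but each reload followed by an init increments it again, which
is the leak described above.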
--
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781419998.git.ashish.kalra@amd.com?part=3
Thread overview: 12+ messages
[not found] <cover.1781419998.git.ashish.kalra@amd.com>
2026-06-15 19:48 ` [PATCH v8 1/7] x86/cpufeatures: Add X86_FEATURE_RMPOPT feature flag Ashish Kalra
2026-06-15 19:48 ` [PATCH v8 2/7] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
2026-06-15 20:14 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 3/7] crypto/ccp: Disable CPU hotplug while SNP is active Ashish Kalra
2026-06-15 20:14 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
2026-06-15 20:07 ` sashiko-bot
2026-06-15 19:49 ` [PATCH v8 5/7] x86/sev: Add interface to re-enable RMP optimizations Ashish Kalra
2026-06-15 19:50 ` [PATCH v8 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
2026-06-15 20:06 ` sashiko-bot
2026-06-15 19:50 ` [PATCH v8 7/7] x86/sev: Add debugfs support for RMPOPT Ashish Kalra
2026-06-15 20:07 ` sashiko-bot