* [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP
@ 2024-09-30 15:22 Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
` (7 more replies)
0 siblings, 8 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
This series adds SEV-SNP support for a new instruction to read an RMP
entry and for a segmented RMP table.
The RMPREAD instruction is used to return information related to an RMP
entry in an architecturally defined format.
RMPREAD support is detected via CPUID 0x8000001f_EAX[21].
Segmented RMP support is a new way of representing the layout of an RMP
table. Initial RMP table support required the RMP table to be contiguous
in memory. RMP accesses from a NUMA node on which the RMP doesn't reside
can take longer than accesses from a NUMA node on which the RMP resides.
Segmented RMP support allows the RMP entries to be located on the same
node as the memory the RMP is covering, potentially reducing latency
associated with accessing an RMP entry associated with the memory. Each
RMP segment covers a specific range of system physical addresses.
Segmented RMP support is detected and established via CPUID and MSRs.
CPUID:
  - 0x8000001f_EAX[23]
    - Indicates support for segmented RMP
  - 0x80000025_EAX
    - [5:0]  : Minimum supported RMP segment size
    - [11:6] : Maximum supported RMP segment size
  - 0x80000025_EBX
    - [9:0]  : Number of cacheable RMP segment definitions
    - [10]   : Indicates if the number of cacheable RMP segments is
               a hard limit
MSR:
  - 0xc0010132 (RMP_BASE)
    - Is identical to current RMP support
  - 0xc0010133 (RMP_END)
    - Should be in reset state if segmented RMP support is active.
      For kernels that do not support segmented RMP, being in reset
      state allows the kernel to disable SNP support if the
      non-segmented RMP has not been allocated.
  - 0xc0010136 (RMP_CFG)
    - [0]    : Indicates if segmented RMP is enabled
    - [13:8] : Contains the size of memory covered by an RMP segment
               (expressed as a power of 2)
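As a standalone illustration only (not code from this series), the CPUID
bits listed above can be read from userspace with GCC's <cpuid.h> helpers;
the MSRs, of course, are only accessible from the kernel (or via the msr
driver):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 0x8000001f: bit 23 of EAX indicates segmented RMP support */
	if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx))
		return 1;
	printf("Segmented RMP supported: %u\n", (eax >> 23) & 1);

	/* Leaf 0x80000025: segment size limits and cacheable segment count */
	if (!__get_cpuid(0x80000025, &eax, &ebx, &ecx, &edx))
		return 1;
	printf("Min/max RMP segment size shift: %u/%u\n",
	       eax & 0x3f, (eax >> 6) & 0x3f);
	printf("Cacheable RMP segments: %u (hard limit: %u)\n",
	       ebx & 0x3ff, (ebx >> 10) & 1);

	return 0;
}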
The RMP segment size defined in the RMP_CFG MSR applies to all segments
of the RMP. Therefore each RMP segment covers a specific range of system
physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
the RMP segment coverage value is 0x24 => 36, meaning the size of memory
covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
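To make the arithmetic above concrete, here is a minimal standalone sketch
(not code from the series) of the RMP_CFG decode and segment lookup:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t rmp_cfg = 0x2401;                  /* example value from above */
	unsigned int shift = (rmp_cfg >> 8) & 0x3f; /* bits [13:8] => 0x24 = 36 */
	uint64_t seg_size = 1ULL << shift;          /* 64GB covered per segment */
	uint64_t pa = 0x1000000000ULL;              /* first byte of the second segment */

	printf("segment size: %#llx\n", (unsigned long long)seg_size);
	printf("pa %#llx -> RMP segment %llu, offset %#llx\n",
	       (unsigned long long)pa,
	       (unsigned long long)(pa >> shift),
	       (unsigned long long)(pa & (seg_size - 1)));

	return 0;
}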
When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
area as it does today (16K in size). However, instead of RMP entries
beginning immediately after the bookkeeping area, there is a 4K RMP
segment table. Each entry in the table is 8 bytes in size:
  - [19:0]  : Mapped size (in GB)
              The mapped size can be less than the defined segment size.
              A value of zero indicates that no RMP exists for the range
              of system physical addresses associated with this segment.
  - [51:20] : Segment physical address
              This address is left-shifted 20 bits (or just masked when
              read) to form the physical address of the segment (1MB
              alignment).
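Purely for illustration (the type and helper below are invented for this
example), decoding one 8-byte RMP segment table entry amounts to:

#include <stdint.h>

struct rst_entry_info {
	uint64_t mapped_bytes;	/* from bits [19:0], converted from GB to bytes */
	uint64_t segment_pa;	/* bits [51:20], a 1MB-aligned physical address */
};

struct rst_entry_info decode_rst_entry(uint64_t entry)
{
	struct rst_entry_info info;

	info.mapped_bytes = (entry & 0xfffffULL) << 30;		/* GB -> bytes; 0 = no RMP */
	info.segment_pa = entry & (0xffffffffULL << 20);	/* mask bits [51:20] */

	return info;
}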
The series is based on and tested against the tip tree:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
5b0c5f05fb2f ("Merge branch into tip/master: 'x86/splitlock'")
---
Changes in v3:
- Added RMP documentation
Changes in v2:
- Remove the self-describing check. The SEV firmware will ensure that
all RMP segments are covered by RMP entries.
- Do not include RMP segments above the maximum detected RAM address (64-bit
MMIO) in the system RAM coverage check.
- Adjust the new CPUID feature entries to match the updated convention for
whether or not they are presented to userspace.
Tom Lendacky (8):
x86/sev: Prepare for using the RMPREAD instruction to access the RMP
x86/sev: Add support for the RMPREAD instruction
x86/sev: Require the RMPREAD instruction after Fam19h
x86/sev: Move the SNP probe routine out of the way
x86/sev: Map only the RMP table entries instead of the full RMP range
x86/sev: Treat the contiguous RMP table as a single RMP segment
x86/sev: Add full support for a segmented RMP table
x86/sev/docs: Document the SNP Reverse Map Table (RMP)
.../arch/x86/amd-memory-encryption.rst | 118 ++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/msr-index.h | 9 +-
arch/x86/kernel/cpu/amd.c | 3 +-
arch/x86/virt/svm/sev.c | 628 +++++++++++++++---
5 files changed, 662 insertions(+), 98 deletions(-)
--
2.43.2
* [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-16 8:52 ` Nikunj A. Dadhania
2024-10-16 15:01 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
` (6 subsequent siblings)
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
The RMPREAD instruction returns an architecture defined format of an
RMP entry. This is the preferred method for examining RMP entries.
In preparation for using the RMPREAD instruction, convert the existing
code that directly accesses the RMP to map the raw RMP information into
the architecture defined format.
RMPREAD output returns a status bit for the 2MB region status. If the
input page address is 2MB aligned and any other pages within the 2MB
region are assigned, then 2MB region status will be set to 1. Otherwise,
the 2MB region status will be set to 0. For systems that do not support
RMPREAD, calculating this value would require looping over all of the RMP
table entries within that range until one is found with the assigned bit
set. Since this bit is not defined in the current format, and so not used
today, do not incur the overhead associated with calculating it.
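For reference only, and not part of this patch: computing the 2MB region
status by hand would require something along the lines of the sketch below,
where rmp_entry_assigned() stands in for a hypothetical per-PFN lookup. The
512-iteration loop is the overhead being avoided.

#include <stdbool.h>
#include <stdint.h>

#define PAGES_PER_2M	512	/* 4K pages per 2MB region */

/* Hypothetical helper: returns the assigned bit of the RMP entry for a PFN */
bool rmp_entry_assigned(uint64_t pfn);

bool hpage_region_has_assigned(uint64_t pfn_2m_aligned)
{
	uint64_t pfn;

	for (pfn = pfn_2m_aligned; pfn < pfn_2m_aligned + PAGES_PER_2M; pfn++) {
		if (rmp_entry_assigned(pfn))
			return true;
	}

	return false;
}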
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/virt/svm/sev.c | 141 ++++++++++++++++++++++++++++------------
1 file changed, 98 insertions(+), 43 deletions(-)
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 0ce17766c0e5..103a2dd6e81d 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -30,11 +30,27 @@
#include <asm/cmdline.h>
#include <asm/iommu.h>
+/*
+ * The RMP entry format as returned by the RMPREAD instruction.
+ */
+struct rmpentry {
+ u64 gpa;
+ u8 assigned :1,
+ rsvd1 :7;
+ u8 pagesize :1,
+ hpage_region_status :1,
+ rsvd2 :6;
+ u8 immutable :1,
+ rsvd3 :7;
+ u8 rsvd4;
+ u32 asid;
+} __packed;
+
/*
* The RMP entry format is not architectural. The format is defined in PPR
* Family 19h Model 01h, Rev B1 processor.
*/
-struct rmpentry {
+struct rmpentry_raw {
union {
struct {
u64 assigned : 1,
@@ -62,7 +78,7 @@ struct rmpentry {
#define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
static u64 probed_rmp_base, probed_rmp_size;
-static struct rmpentry *rmptable __ro_after_init;
+static struct rmpentry_raw *rmptable __ro_after_init;
static u64 rmptable_max_pfn __ro_after_init;
static LIST_HEAD(snp_leaked_pages_list);
@@ -247,8 +263,8 @@ static int __init snp_rmptable_init(void)
rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
- rmptable = (struct rmpentry *)rmptable_start;
- rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry) - 1;
+ rmptable = (struct rmpentry_raw *)rmptable_start;
+ rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
@@ -270,48 +286,77 @@ static int __init snp_rmptable_init(void)
*/
device_initcall(snp_rmptable_init);
-static struct rmpentry *get_rmpentry(u64 pfn)
+static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
{
- if (WARN_ON_ONCE(pfn > rmptable_max_pfn))
- return ERR_PTR(-EFAULT);
-
- return &rmptable[pfn];
-}
-
-static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
-{
- struct rmpentry *large_entry, *entry;
-
- if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+ if (!rmptable)
return ERR_PTR(-ENODEV);
- entry = get_rmpentry(pfn);
- if (IS_ERR(entry))
- return entry;
+ if (unlikely(pfn > rmptable_max_pfn))
+ return ERR_PTR(-EFAULT);
+
+ return rmptable + pfn;
+}
+
+static int get_rmpentry(u64 pfn, struct rmpentry *entry)
+{
+ struct rmpentry_raw *e;
+
+ e = __get_rmpentry(pfn);
+ if (IS_ERR(e))
+ return PTR_ERR(e);
+
+ /*
+ * Map the RMP table entry onto the RMPREAD output format.
+ * The 2MB region status indicator (hpage_region_status field) is not
+ * calculated, since the overhead could be significant and the field
+ * is not used.
+ */
+ memset(entry, 0, sizeof(*entry));
+ entry->gpa = e->gpa << PAGE_SHIFT;
+ entry->asid = e->asid;
+ entry->assigned = e->assigned;
+ entry->pagesize = e->pagesize;
+ entry->immutable = e->immutable;
+
+ return 0;
+}
+
+static int __snp_lookup_rmpentry(u64 pfn, struct rmpentry *entry, int *level)
+{
+ struct rmpentry large_entry;
+ int ret;
+
+ if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+ return -ENODEV;
+
+ ret = get_rmpentry(pfn, entry);
+ if (ret)
+ return ret;
/*
* Find the authoritative RMP entry for a PFN. This can be either a 4K
* RMP entry or a special large RMP entry that is authoritative for a
* whole 2M area.
*/
- large_entry = get_rmpentry(pfn & PFN_PMD_MASK);
- if (IS_ERR(large_entry))
- return large_entry;
+ ret = get_rmpentry(pfn & PFN_PMD_MASK, &large_entry);
+ if (ret)
+ return ret;
- *level = RMP_TO_PG_LEVEL(large_entry->pagesize);
+ *level = RMP_TO_PG_LEVEL(large_entry.pagesize);
- return entry;
+ return 0;
}
int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level)
{
- struct rmpentry *e;
+ struct rmpentry e;
+ int ret;
- e = __snp_lookup_rmpentry(pfn, level);
- if (IS_ERR(e))
- return PTR_ERR(e);
+ ret = __snp_lookup_rmpentry(pfn, &e, level);
+ if (ret)
+ return ret;
- *assigned = !!e->assigned;
+ *assigned = !!e.assigned;
return 0;
}
EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
@@ -324,20 +369,28 @@ EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
*/
static void dump_rmpentry(u64 pfn)
{
+ struct rmpentry_raw *e_raw;
u64 pfn_i, pfn_end;
- struct rmpentry *e;
- int level;
+ struct rmpentry e;
+ int level, ret;
- e = __snp_lookup_rmpentry(pfn, &level);
- if (IS_ERR(e)) {
- pr_err("Failed to read RMP entry for PFN 0x%llx, error %ld\n",
- pfn, PTR_ERR(e));
+ ret = __snp_lookup_rmpentry(pfn, &e, &level);
+ if (ret) {
+ pr_err("Failed to read RMP entry for PFN 0x%llx, error %d\n",
+ pfn, ret);
return;
}
- if (e->assigned) {
+ if (e.assigned) {
+ e_raw = __get_rmpentry(pfn);
+ if (IS_ERR(e_raw)) {
+ pr_err("Failed to read RMP contents for PFN 0x%llx, error %ld\n",
+ pfn, PTR_ERR(e_raw));
+ return;
+ }
+
pr_info("PFN 0x%llx, RMP entry: [0x%016llx - 0x%016llx]\n",
- pfn, e->lo, e->hi);
+ pfn, e_raw->lo, e_raw->hi);
return;
}
@@ -356,16 +409,18 @@ static void dump_rmpentry(u64 pfn)
pfn, pfn_i, pfn_end);
while (pfn_i < pfn_end) {
- e = __snp_lookup_rmpentry(pfn_i, &level);
- if (IS_ERR(e)) {
- pr_err("Error %ld reading RMP entry for PFN 0x%llx\n",
- PTR_ERR(e), pfn_i);
+ e_raw = __get_rmpentry(pfn_i);
+ if (IS_ERR(e_raw)) {
+ pr_err("Error %ld reading RMP contents for PFN 0x%llx\n",
+ PTR_ERR(e_raw), pfn_i);
pfn_i++;
continue;
}
- if (e->lo || e->hi)
- pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n", pfn_i, e->lo, e->hi);
+ if (e_raw->lo || e_raw->hi)
+ pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n",
+ pfn_i, e_raw->lo, e_raw->hi);
+
pfn_i++;
}
}
--
2.43.2
* [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-16 10:46 ` Nikunj A. Dadhania
` (3 more replies)
2024-09-30 15:22 ` [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h Tom Lendacky
` (5 subsequent siblings)
7 siblings, 4 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
The RMPREAD instruction returns an architecture defined format of an
RMP table entry. This is the preferred method for examining RMP entries.
The instruction is advertised in CPUID 0x8000001f_EAX[21]. Use this
instruction when available.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/virt/svm/sev.c | 11 +++++++++++
2 files changed, 12 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dd4682857c12..93620a4c5b15 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -447,6 +447,7 @@
#define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* Virtual TSC_AUX */
#define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */
#define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
+#define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
#define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
/* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 103a2dd6e81d..73d4f422829a 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -301,6 +301,17 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
{
struct rmpentry_raw *e;
+ if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
+ int ret;
+
+ asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
+ : "=a" (ret)
+ : "a" (pfn << PAGE_SHIFT), "c" (entry)
+ : "memory", "cc");
+
+ return ret;
+ }
+
e = __get_rmpentry(pfn);
if (IS_ERR(e))
return PTR_ERR(e);
--
2.43.2
* [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-09-30 17:03 ` Dave Hansen
2024-10-18 4:26 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way Tom Lendacky
` (4 subsequent siblings)
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
Limit use of the non-architectural RMP format to Fam19h processors.
Beyond Fam19h, the RMPREAD instruction, with its architecture-defined
output, is available and should be used for RMP access.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/kernel/cpu/amd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 015971adadfc..ddbb6dd75fb2 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -358,7 +358,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
* for which the RMP table entry format is currently defined for.
*/
if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
- c->x86 >= 0x19 && snp_probe_rmptable_info()) {
+ (c->x86 == 0x19 || cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
+ snp_probe_rmptable_info()) {
cc_platform_set(CC_ATTR_HOST_SEV_SNP);
} else {
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
--
2.43.2
* [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
` (2 preceding siblings ...)
2024-09-30 15:22 ` [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-16 11:05 ` Nikunj A. Dadhania
2024-10-18 4:28 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range Tom Lendacky
` (3 subsequent siblings)
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
To make patch review of the segmented RMP support easier, move the SNP
probe function out from between the initialization-related routines.
No functional change.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/virt/svm/sev.c | 60 ++++++++++++++++++++---------------------
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 73d4f422829a..31d1510ae119 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -132,36 +132,6 @@ static __init void snp_enable(void *arg)
__snp_enable(smp_processor_id());
}
-#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
-
-bool snp_probe_rmptable_info(void)
-{
- u64 rmp_sz, rmp_base, rmp_end;
-
- rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
- rdmsrl(MSR_AMD64_RMP_END, rmp_end);
-
- if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
- pr_err("Memory for the RMP table has not been reserved by BIOS\n");
- return false;
- }
-
- if (rmp_base > rmp_end) {
- pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
- return false;
- }
-
- rmp_sz = rmp_end - rmp_base + 1;
-
- probed_rmp_base = rmp_base;
- probed_rmp_size = rmp_sz;
-
- pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
- rmp_base, rmp_end);
-
- return true;
-}
-
static void __init __snp_fixup_e820_tables(u64 pa)
{
if (IS_ALIGNED(pa, PMD_SIZE))
@@ -286,6 +256,36 @@ static int __init snp_rmptable_init(void)
*/
device_initcall(snp_rmptable_init);
+#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
+
+bool snp_probe_rmptable_info(void)
+{
+ u64 rmp_sz, rmp_base, rmp_end;
+
+ rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
+ rdmsrl(MSR_AMD64_RMP_END, rmp_end);
+
+ if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
+ pr_err("Memory for the RMP table has not been reserved by BIOS\n");
+ return false;
+ }
+
+ if (rmp_base > rmp_end) {
+ pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
+ return false;
+ }
+
+ rmp_sz = rmp_end - rmp_base + 1;
+
+ probed_rmp_base = rmp_base;
+ probed_rmp_size = rmp_sz;
+
+ pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
+ rmp_base, rmp_end);
+
+ return true;
+}
+
static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
{
if (!rmptable)
--
2.43.2
* [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
` (3 preceding siblings ...)
2024-09-30 15:22 ` [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-16 11:25 ` [sos-linux-ext-patches] " Nikunj A. Dadhania
2024-10-18 4:38 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment Tom Lendacky
` (2 subsequent siblings)
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
In preparation for support of a segmented RMP table, map only the RMP
table entries. The RMP bookkeeping area is only ever accessed when
first enabling SNP and does not need to remain mapped. To accomplish
this, split the initialization of the RMP bookkeeping area and the
initialization of the RMP entry area. The RMP bookkeeping area will be
mapped only while it is being initialized.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/virt/svm/sev.c | 36 +++++++++++++++++++++++++++++++-----
1 file changed, 31 insertions(+), 5 deletions(-)
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 31d1510ae119..81e21d833cf0 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -168,6 +168,23 @@ void __init snp_fixup_e820_tables(void)
__snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
}
+static bool __init init_rmptable_bookkeeping(void)
+{
+ void *bk;
+
+ bk = memremap(probed_rmp_base, RMPTABLE_CPU_BOOKKEEPING_SZ, MEMREMAP_WB);
+ if (!bk) {
+ pr_err("Failed to map RMP bookkeeping area\n");
+ return false;
+ }
+
+ memset(bk, 0, RMPTABLE_CPU_BOOKKEEPING_SZ);
+
+ memunmap(bk);
+
+ return true;
+}
+
/*
* Do the necessary preparations which are verified by the firmware as
* described in the SNP_INIT_EX firmware command description in the SNP
@@ -205,12 +222,17 @@ static int __init snp_rmptable_init(void)
goto nosnp;
}
- rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
+ /* Map only the RMP entries */
+ rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
+ probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
+ MEMREMAP_WB);
if (!rmptable_start) {
pr_err("Failed to map RMP table\n");
goto nosnp;
}
+ rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
+
/*
* Check if SEV-SNP is already enabled, this can happen in case of
* kexec boot.
@@ -219,7 +241,14 @@ static int __init snp_rmptable_init(void)
if (val & MSR_AMD64_SYSCFG_SNP_EN)
goto skip_enable;
- memset(rmptable_start, 0, probed_rmp_size);
+ /* Zero out the RMP bookkeeping area */
+ if (!init_rmptable_bookkeeping()) {
+ memunmap(rmptable_start);
+ goto nosnp;
+ }
+
+ /* Zero out the RMP entries */
+ memset(rmptable_start, 0, rmptable_size);
/* Flush the caches to ensure that data is written before SNP is enabled. */
wbinvd_on_all_cpus();
@@ -230,9 +259,6 @@ static int __init snp_rmptable_init(void)
on_each_cpu(snp_enable, NULL, 1);
skip_enable:
- rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
- rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
-
rmptable = (struct rmpentry_raw *)rmptable_start;
rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
--
2.43.2
* [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
` (4 preceding siblings ...)
2024-09-30 15:22 ` [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-17 11:05 ` Nikunj A. Dadhania
2024-10-18 5:59 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP) Tom Lendacky
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
In preparation for support of a segmented RMP table, treat the contiguous
RMP table as a segmented RMP table with a single segment covering all
of memory. By treating a contiguous RMP table as a single segment, much
of the code that initializes and accesses the RMP can be re-used.
Segmented RMP tables can have up to 512 segment entries. Each segment
will have metadata associated with it to identify the segment location,
the segment size, etc. The segment data and the physical address are used
to determine the index of the segment within the table and then the RMP
entry within the segment. For an actual segmented RMP table environment,
much of the segment information will come from a configuration MSR. For
the contiguous RMP, though, much of the information will be statically
defined.
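As a side note, the reason a single segment can stand in for the whole
contiguous RMP is that the patch uses a segment shift of 52
(RMPTABLE_NON_SEGMENTED_SHIFT), so every valid physical address indexes
segment table entry 0. A trivial standalone check of that property:

#include <assert.h>
#include <stdint.h>

int main(void)
{
	uint64_t pa_max = (1ULL << 52) - 1;	/* largest supported physical address */

	/* Any address below 2^52 lands in segment table entry 0 */
	assert((0ULL >> 52) == 0);
	assert((pa_max >> 52) == 0);

	return 0;
}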
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/virt/svm/sev.c | 195 ++++++++++++++++++++++++++++++++++++----
1 file changed, 176 insertions(+), 19 deletions(-)
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 81e21d833cf0..ebfb924652f8 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -18,6 +18,7 @@
#include <linux/cpumask.h>
#include <linux/iommu.h>
#include <linux/amd-iommu.h>
+#include <linux/nospec.h>
#include <asm/sev.h>
#include <asm/processor.h>
@@ -74,12 +75,42 @@ struct rmpentry_raw {
*/
#define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
+/*
+ * For a non-segmented RMP table, use the maximum physical addressing as the
+ * segment size in order to always arrive at index 0 in the table.
+ */
+#define RMPTABLE_NON_SEGMENTED_SHIFT 52
+
+struct rmp_segment_desc {
+ struct rmpentry_raw *rmp_entry;
+ u64 max_index;
+ u64 size;
+};
+
+/*
+ * Segmented RMP Table support.
+ * - The segment size is used for two purposes:
+ * - Identify the amount of memory covered by an RMP segment
+ * - Quickly locate an RMP segment table entry for a physical address
+ *
+ * - The RMP segment table contains pointers to an RMP table that covers
+ * a specific portion of memory. There can be up to 512 8-byte entries,
+ * one page's worth.
+ */
+static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
+static unsigned int rst_max_index __ro_after_init = 512;
+
+static u64 rmp_segment_size_max;
+static unsigned int rmp_segment_coverage_shift;
+static unsigned long rmp_segment_coverage_size;
+static unsigned long rmp_segment_coverage_mask;
+#define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_coverage_shift)
+#define RMP_ENTRY_INDEX(x) PHYS_PFN((x) & rmp_segment_coverage_mask)
+
/* Mask to apply to a PFN to get the first PFN of a 2MB page */
#define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
static u64 probed_rmp_base, probed_rmp_size;
-static struct rmpentry_raw *rmptable __ro_after_init;
-static u64 rmptable_max_pfn __ro_after_init;
static LIST_HEAD(snp_leaked_pages_list);
static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
@@ -185,6 +216,92 @@ static bool __init init_rmptable_bookkeeping(void)
return true;
}
+static bool __init alloc_rmp_segment_desc(u64 segment_pa, u64 segment_size, u64 pa)
+{
+ struct rmp_segment_desc *desc;
+ unsigned long rst_index;
+ void *rmp_segment;
+
+ /* Validate the RMP segment size */
+ if (segment_size > rmp_segment_size_max) {
+ pr_err("Invalid RMP size (%#llx) for configured segment size (%#llx)\n",
+ segment_size, rmp_segment_size_max);
+ return false;
+ }
+
+ /* Validate the RMP segment table index */
+ rst_index = RST_ENTRY_INDEX(pa);
+ if (rst_index >= rst_max_index) {
+ pr_err("Invalid RMP segment base address (%#llx) for configured segment size (%#lx)\n",
+ pa, rmp_segment_coverage_size);
+ return false;
+ }
+ rst_index = array_index_nospec(rst_index, rst_max_index);
+
+ if (rmp_segment_table[rst_index]) {
+ pr_err("RMP segment descriptor already exists at index %lu\n", rst_index);
+ return false;
+ }
+
+ /* Map the RMP entries */
+ rmp_segment = memremap(segment_pa, segment_size, MEMREMAP_WB);
+ if (!rmp_segment) {
+ pr_err("Failed to map RMP segment addr 0x%llx size 0x%llx\n",
+ segment_pa, segment_size);
+ return false;
+ }
+
+ desc = kzalloc(sizeof(*desc), GFP_KERNEL);
+ if (!desc) {
+ memunmap(rmp_segment);
+ return false;
+ }
+
+ desc->rmp_entry = rmp_segment;
+ desc->max_index = segment_size / sizeof(*desc->rmp_entry);
+ desc->size = segment_size;
+
+ /* Add the segment descriptor to the table */
+ rmp_segment_table[rst_index] = desc;
+
+ return true;
+}
+
+static void __init free_rmp_segment_table(void)
+{
+ unsigned int i;
+
+ for (i = 0; i < rst_max_index; i++) {
+ struct rmp_segment_desc *desc;
+
+ desc = rmp_segment_table[i];
+ if (!desc)
+ continue;
+
+ memunmap(desc->rmp_entry);
+
+ kfree(desc);
+ }
+
+ free_page((unsigned long)rmp_segment_table);
+
+ rmp_segment_table = NULL;
+}
+
+static bool __init alloc_rmp_segment_table(void)
+{
+ struct page *page;
+
+ /* Allocate the table used to index into the RMP segments */
+ page = alloc_page(__GFP_ZERO);
+ if (!page)
+ return false;
+
+ rmp_segment_table = page_address(page);
+
+ return true;
+}
+
/*
* Do the necessary preparations which are verified by the firmware as
* described in the SNP_INIT_EX firmware command description in the SNP
@@ -192,8 +309,8 @@ static bool __init init_rmptable_bookkeeping(void)
*/
static int __init snp_rmptable_init(void)
{
- u64 max_rmp_pfn, calc_rmp_sz, rmptable_size, rmp_end, val;
- void *rmptable_start;
+ u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end, val;
+ unsigned int i;
if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
return 0;
@@ -222,17 +339,18 @@ static int __init snp_rmptable_init(void)
goto nosnp;
}
+ if (!alloc_rmp_segment_table())
+ goto nosnp;
+
/* Map only the RMP entries */
- rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
- probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
- MEMREMAP_WB);
- if (!rmptable_start) {
- pr_err("Failed to map RMP table\n");
+ rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
+ rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+ if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) {
+ free_rmp_segment_table();
goto nosnp;
}
- rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
-
/*
* Check if SEV-SNP is already enabled, this can happen in case of
* kexec boot.
@@ -243,12 +361,20 @@ static int __init snp_rmptable_init(void)
/* Zero out the RMP bookkeeping area */
if (!init_rmptable_bookkeeping()) {
- memunmap(rmptable_start);
+ free_rmp_segment_table();
goto nosnp;
}
/* Zero out the RMP entries */
- memset(rmptable_start, 0, rmptable_size);
+ for (i = 0; i < rst_max_index; i++) {
+ struct rmp_segment_desc *desc;
+
+ desc = rmp_segment_table[i];
+ if (!desc)
+ continue;
+
+ memset(desc->rmp_entry, 0, desc->size);
+ }
/* Flush the caches to ensure that data is written before SNP is enabled. */
wbinvd_on_all_cpus();
@@ -259,9 +385,6 @@ static int __init snp_rmptable_init(void)
on_each_cpu(snp_enable, NULL, 1);
skip_enable:
- rmptable = (struct rmpentry_raw *)rmptable_start;
- rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
-
cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
/*
@@ -282,6 +405,17 @@ static int __init snp_rmptable_init(void)
*/
device_initcall(snp_rmptable_init);
+static void set_rmp_segment_info(unsigned int segment_shift)
+{
+ rmp_segment_coverage_shift = segment_shift;
+ rmp_segment_coverage_size = 1UL << rmp_segment_coverage_shift;
+ rmp_segment_coverage_mask = rmp_segment_coverage_size - 1;
+
+ /* Calculate the maximum size an RMP can be (16 bytes/page mapped) */
+ rmp_segment_size_max = PHYS_PFN(rmp_segment_coverage_size);
+ rmp_segment_size_max <<= 4;
+}
+
#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
bool snp_probe_rmptable_info(void)
@@ -303,6 +437,11 @@ bool snp_probe_rmptable_info(void)
rmp_sz = rmp_end - rmp_base + 1;
+ /* Treat the contiguous RMP table as a single segment */
+ rst_max_index = 1;
+
+ set_rmp_segment_info(RMPTABLE_NON_SEGMENTED_SHIFT);
+
probed_rmp_base = rmp_base;
probed_rmp_size = rmp_sz;
@@ -314,13 +453,31 @@ bool snp_probe_rmptable_info(void)
static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
{
- if (!rmptable)
+ struct rmp_segment_desc *desc;
+ unsigned long rst_index;
+ unsigned long paddr;
+ u64 segment_index;
+
+ if (!rmp_segment_table)
return ERR_PTR(-ENODEV);
- if (unlikely(pfn > rmptable_max_pfn))
+ paddr = pfn << PAGE_SHIFT;
+
+ rst_index = RST_ENTRY_INDEX(paddr);
+ if (unlikely(rst_index >= rst_max_index))
+ return ERR_PTR(-EFAULT);
+ rst_index = array_index_nospec(rst_index, rst_max_index);
+
+ desc = rmp_segment_table[rst_index];
+ if (unlikely(!desc))
return ERR_PTR(-EFAULT);
- return rmptable + pfn;
+ segment_index = RMP_ENTRY_INDEX(paddr);
+ if (unlikely(segment_index >= desc->max_index))
+ return ERR_PTR(-EFAULT);
+ segment_index = array_index_nospec(segment_index, desc->max_index);
+
+ return desc->rmp_entry + segment_index;
}
static int get_rmpentry(u64 pfn, struct rmpentry *entry)
--
2.43.2
* [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
` (5 preceding siblings ...)
2024-09-30 15:22 ` [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-18 6:32 ` Nikunj A. Dadhania
2024-10-18 8:37 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP) Tom Lendacky
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
A segmented RMP table allows for improved locality of reference between
the memory protected by the RMP and the RMP entries themselves.
Add support to detect and initialize a segmented RMP table with multiple
segments as configured by the system BIOS. While the RMPREAD instruction
will be used to read an RMP entry in a segmented RMP, initialization and
debugging capabilities will require the mapping of the segments.
The RMP_CFG MSR indicates if segmented RMP support is enabled and, if
enabled, the amount of memory that an RMP segment covers. When segmented
RMP support is enabled, the RMP_BASE MSR points to the start of the RMP
bookkeeping area, which is 16K in size. The RMP Segment Table (RST) is
located immediately after the bookkeeping area and is 4K in size. The RST
contains up to 512 8-byte entries that identify the location of the RMP
segment and amount of memory mapped by the segment (which must be less
than or equal to the configured segment size). The physical address that
is covered by a segment is based on the segment size and the index of the
segment in the RST. The RMP entry for a physical address is based on the
offset within the segment.
For example, if the segment size is 64GB (0x1000000000 or 1 << 36), then
physical address 0x9000800000 is RST entry 9 (0x9000800000 >> 36) and
RST entry 9 covers physical memory 0x9000000000 to 0x9FFFFFFFFF.
The RMP entry index within the RMP segment is the physical address
AND-ed with the segment mask, 64GB - 1 (0xFFFFFFFFF), and then
right-shifted 12 bits or PHYS_PFN(0x9000800000 & 0xFFFFFFFFF), which
is 0x800.
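A standalone check of the arithmetic in this example (mirroring what the
RST_ENTRY_INDEX()/RMP_ENTRY_INDEX() macros in the patch compute):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	unsigned int shift = 36;		/* 64GB segments */
	uint64_t mask = (1ULL << shift) - 1;	/* 0xFFFFFFFFF */
	uint64_t pa = 0x9000800000ULL;

	printf("RST entry index: %llu\n",	/* 9 */
	       (unsigned long long)(pa >> shift));
	printf("RMP entry index: %#llx\n",	/* 0x800 */
	       (unsigned long long)((pa & mask) >> 12));

	return 0;
}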
CPUID 0x80000025_EBX[9:0] describes the number of RMP segments that can
be cached by the hardware. Additionally, if CPUID 0x80000025_EBX[10] is
set, then the number of actual RMP segments defined cannot exceed the
number of RMP segments that can be cached and can be used as a maximum
RST index.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 9 +-
arch/x86/virt/svm/sev.c | 231 ++++++++++++++++++++++++++---
3 files changed, 218 insertions(+), 23 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 93620a4c5b15..417cdc636a12 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -448,6 +448,7 @@
#define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */
#define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
#define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
+#define X86_FEATURE_SEGMENTED_RMP (19*32+23) /* Segmented RMP support */
#define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
/* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..8b57c4d1098f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -682,11 +682,14 @@
#define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
#define MSR_AMD64_SNP_RESV_BIT 18
#define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
-
-#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
-
#define MSR_AMD64_RMP_BASE 0xc0010132
#define MSR_AMD64_RMP_END 0xc0010133
+#define MSR_AMD64_RMP_CFG 0xc0010136
+#define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
+#define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
+#define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
+
+#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
#define MSR_SVSM_CAA 0xc001f000
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index ebfb924652f8..2f83772d3daa 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -97,6 +97,10 @@ struct rmp_segment_desc {
* a specific portion of memory. There can be up to 512 8-byte entries,
* one page's worth.
*/
+#define RST_ENTRY_MAPPED_SIZE(x) ((x) & GENMASK_ULL(19, 0))
+#define RST_ENTRY_SEGMENT_BASE(x) ((x) & GENMASK_ULL(51, 20))
+
+#define RMP_SEGMENT_TABLE_SIZE SZ_4K
static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
static unsigned int rst_max_index __ro_after_init = 512;
@@ -107,6 +111,9 @@ static unsigned long rmp_segment_coverage_mask;
#define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_coverage_shift)
#define RMP_ENTRY_INDEX(x) PHYS_PFN((x) & rmp_segment_coverage_mask)
+static u64 rmp_cfg;
+#define RMP_IS_SEGMENTED(x) ((x) & MSR_AMD64_SEG_RMP_ENABLED)
+
/* Mask to apply to a PFN to get the first PFN of a 2MB page */
#define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
@@ -196,7 +203,42 @@ static void __init __snp_fixup_e820_tables(u64 pa)
void __init snp_fixup_e820_tables(void)
{
__snp_fixup_e820_tables(probed_rmp_base);
- __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
+
+ if (RMP_IS_SEGMENTED(rmp_cfg)) {
+ unsigned long size;
+ unsigned int i;
+ u64 pa, *rst;
+
+ pa = probed_rmp_base;
+ pa += RMPTABLE_CPU_BOOKKEEPING_SZ;
+ pa += RMP_SEGMENT_TABLE_SIZE;
+ __snp_fixup_e820_tables(pa);
+
+ pa -= RMP_SEGMENT_TABLE_SIZE;
+ rst = early_memremap(pa, RMP_SEGMENT_TABLE_SIZE);
+ if (!rst)
+ return;
+
+ for (i = 0; i < rst_max_index; i++) {
+ pa = RST_ENTRY_SEGMENT_BASE(rst[i]);
+ size = RST_ENTRY_MAPPED_SIZE(rst[i]);
+ if (!size)
+ continue;
+
+ __snp_fixup_e820_tables(pa);
+
+ /* Mapped size in GB */
+ size *= (1UL << 30);
+ if (size > rmp_segment_coverage_size)
+ size = rmp_segment_coverage_size;
+
+ __snp_fixup_e820_tables(pa + size);
+ }
+
+ early_memunmap(rst, RMP_SEGMENT_TABLE_SIZE);
+ } else {
+ __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
+ }
}
static bool __init init_rmptable_bookkeeping(void)
@@ -302,24 +344,12 @@ static bool __init alloc_rmp_segment_table(void)
return true;
}
-/*
- * Do the necessary preparations which are verified by the firmware as
- * described in the SNP_INIT_EX firmware command description in the SNP
- * firmware ABI spec.
- */
-static int __init snp_rmptable_init(void)
+static bool __init contiguous_rmptable_setup(void)
{
- u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end, val;
- unsigned int i;
-
- if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
- return 0;
-
- if (!amd_iommu_snp_en)
- goto nosnp;
+ u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end;
if (!probed_rmp_size)
- goto nosnp;
+ return false;
rmp_end = probed_rmp_base + probed_rmp_size - 1;
@@ -336,11 +366,11 @@ static int __init snp_rmptable_init(void)
if (calc_rmp_sz > probed_rmp_size) {
pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
calc_rmp_sz, probed_rmp_size);
- goto nosnp;
+ return false;
}
if (!alloc_rmp_segment_table())
- goto nosnp;
+ return false;
/* Map only the RMP entries */
rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
@@ -348,9 +378,116 @@ static int __init snp_rmptable_init(void)
if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) {
free_rmp_segment_table();
- goto nosnp;
+ return false;
}
+ return true;
+}
+
+static bool __init segmented_rmptable_setup(void)
+{
+ u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max;
+ unsigned int i, max_index;
+
+ if (!probed_rmp_base)
+ return false;
+
+ if (!alloc_rmp_segment_table())
+ return false;
+
+ /* Map the RMP Segment Table */
+ rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
+ rst = memremap(rst_pa, RMP_SEGMENT_TABLE_SIZE, MEMREMAP_WB);
+ if (!rst) {
+ pr_err("Failed to map RMP segment table addr %#llx\n", rst_pa);
+ goto e_free;
+ }
+
+ /* Get the address for the end of system RAM */
+ ram_pa_max = max_pfn << PAGE_SHIFT;
+
+ /* Process each RMP segment */
+ max_index = 0;
+ ram_pa_end = 0;
+ for (i = 0; i < rst_max_index; i++) {
+ u64 rmp_segment, rmp_size, mapped_size;
+
+ mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]);
+ if (!mapped_size)
+ continue;
+
+ max_index = i;
+
+ /* Mapped size in GB */
+ mapped_size *= (1ULL << 30);
+ if (mapped_size > rmp_segment_coverage_size)
+ mapped_size = rmp_segment_coverage_size;
+
+ rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]);
+
+ rmp_size = PHYS_PFN(mapped_size);
+ rmp_size <<= 4;
+
+ pa = (u64)i << rmp_segment_coverage_shift;
+
+ /* Some segments may be for MMIO mapped above system RAM */
+ if (pa < ram_pa_max)
+ ram_pa_end = pa + mapped_size;
+
+ if (!alloc_rmp_segment_desc(rmp_segment, rmp_size, pa))
+ goto e_unmap;
+
+ pr_info("RMP segment %u physical address [%#llx - %#llx] covering [%#llx - %#llx]\n",
+ i, rmp_segment, rmp_segment + rmp_size - 1, pa, pa + mapped_size - 1);
+ }
+
+ if (ram_pa_max > ram_pa_end) {
+ pr_err("Segmented RMP does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
+ ram_pa_max, ram_pa_end);
+ goto e_unmap;
+ }
+
+ /* Adjust the maximum index based on the found segments */
+ rst_max_index = max_index + 1;
+
+ memunmap(rst);
+
+ return true;
+
+e_unmap:
+ memunmap(rst);
+
+e_free:
+ free_rmp_segment_table();
+
+ return false;
+}
+
+static bool __init rmptable_setup(void)
+{
+ return RMP_IS_SEGMENTED(rmp_cfg) ? segmented_rmptable_setup()
+ : contiguous_rmptable_setup();
+}
+
+/*
+ * Do the necessary preparations which are verified by the firmware as
+ * described in the SNP_INIT_EX firmware command description in the SNP
+ * firmware ABI spec.
+ */
+static int __init snp_rmptable_init(void)
+{
+ unsigned int i;
+ u64 val;
+
+ if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+ return 0;
+
+ if (!amd_iommu_snp_en)
+ goto nosnp;
+
+ if (!rmptable_setup())
+ goto nosnp;
+
/*
* Check if SEV-SNP is already enabled, this can happen in case of
* kexec boot.
@@ -418,7 +555,7 @@ static void set_rmp_segment_info(unsigned int segment_shift)
#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
-bool snp_probe_rmptable_info(void)
+static bool probe_contiguous_rmptable_info(void)
{
u64 rmp_sz, rmp_base, rmp_end;
@@ -451,6 +588,60 @@ bool snp_probe_rmptable_info(void)
return true;
}
+static bool probe_segmented_rmptable_info(void)
+{
+ unsigned int eax, ebx, segment_shift, segment_shift_min, segment_shift_max;
+ u64 rmp_base, rmp_end;
+
+ rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
+ rdmsrl(MSR_AMD64_RMP_END, rmp_end);
+
+ if (!(rmp_base & RMP_ADDR_MASK)) {
+ pr_err("Memory for the RMP table has not been reserved by BIOS\n");
+ return false;
+ }
+
+ WARN_ONCE(rmp_end & RMP_ADDR_MASK,
+ "Segmented RMP enabled but RMP_END MSR is non-zero\n");
+
+ /* Obtain the min and max supported RMP segment size */
+ eax = cpuid_eax(0x80000025);
+ segment_shift_min = eax & GENMASK(5, 0);
+ segment_shift_max = (eax & GENMASK(11, 6)) >> 6;
+
+ /* Verify the segment size is within the supported limits */
+ segment_shift = MSR_AMD64_RMP_SEGMENT_SHIFT(rmp_cfg);
+ if (segment_shift > segment_shift_max || segment_shift < segment_shift_min) {
+ pr_err("RMP segment size (%u) is not within advertised bounds (min=%u, max=%u)\n",
+ segment_shift, segment_shift_min, segment_shift_max);
+ return false;
+ }
+
+ /* Override the max supported RST index if a hardware limit exists */
+ ebx = cpuid_ebx(0x80000025);
+ if (ebx & BIT(10))
+ rst_max_index = ebx & GENMASK(9, 0);
+
+ set_rmp_segment_info(segment_shift);
+
+ probed_rmp_base = rmp_base;
+ probed_rmp_size = 0;
+
+ pr_info("RMP segment table physical address [0x%016llx - 0x%016llx]\n",
+ rmp_base, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
+
+ return true;
+}
+
+bool snp_probe_rmptable_info(void)
+{
+ if (cpu_feature_enabled(X86_FEATURE_SEGMENTED_RMP))
+ rdmsrl(MSR_AMD64_RMP_CFG, rmp_cfg);
+
+ return RMP_IS_SEGMENTED(rmp_cfg) ? probe_segmented_rmptable_info()
+ : probe_contiguous_rmptable_info();
+}
+
static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
{
struct rmp_segment_desc *desc;
--
2.43.2
* [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP)
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
` (6 preceding siblings ...)
2024-09-30 15:22 ` [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table Tom Lendacky
@ 2024-09-30 15:22 ` Tom Lendacky
2024-10-18 6:56 ` Nikunj A. Dadhania
2024-10-18 13:31 ` Neeraj Upadhyay
7 siblings, 2 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 15:22 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
Update the AMD memory encryption documentation to include information on
the Reverse Map Table (RMP) and the two table formats.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
.../arch/x86/amd-memory-encryption.rst | 118 ++++++++++++++++++
1 file changed, 118 insertions(+)
diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
index 6df3264f23b9..bd840df708ea 100644
--- a/Documentation/arch/x86/amd-memory-encryption.rst
+++ b/Documentation/arch/x86/amd-memory-encryption.rst
@@ -130,8 +130,126 @@ SNP feature support.
More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR
+Reverse Map Table (RMP)
+=======================
+
+The RMP is a structure in system memory that is used to ensure a one-to-one
+mapping between system physical addresses and guest physical addresses. Each
+page of memory that is potentially assignable to guests has one entry within
+the RMP.
+
+The RMP table can be either contiguous in memory or a collection of segments
+in memory.
+
+Contiguous RMP
+--------------
+
+Support for this form of the RMP is present when support for SEV-SNP is
+present, which can be determined using the CPUID instruction::
+
+ 0x8000001f[eax]:
+ Bit[4] indicates support for SEV-SNP
+
+The location of the RMP is identified to the hardware through two MSRs::
+
+ 0xc0010132 (RMP_BASE):
+ System physical address of the first byte of the RMP
+
+ 0xc0010133 (RMP_END):
+ System physical address of the last byte of the RMP
+
+Hardware requires that RMP_BASE and (RMP_END + 1) be 8KB aligned, but the
+SEV firmware increases the alignment requirement to 1MB.
+
+The RMP consists of a 16KB region used for processor bookkeeping followed
+by the RMP entries, which are 16 bytes in size. The size of the RMP
+determines the range of physical memory that the hypervisor can assign to
+SEV-SNP guests. The RMP covers the system physical address from::
+
+ 0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB.
+
+The current Linux support relies on BIOS to allocate/reserve the memory for
+the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR
+values to locate the RMP and determine the size of the RMP. The RMP must
+cover all of system memory in order for Linux to enable SEV-SNP.
+
+Segmented RMP
+-------------
+
+Segmented RMP support is a new way of representing the layout of an RMP.
+Initial RMP support required the RMP table to be contiguous in memory.
+RMP accesses from a NUMA node on which the RMP doesn't reside
+can take longer than accesses from a NUMA node on which the RMP resides.
+Segmented RMP support allows the RMP entries to be located on the same
+node as the memory the RMP is covering, potentially reducing latency
+associated with accessing an RMP entry associated with the memory. Each
+RMP segment covers a specific range of system physical addresses.
+
+Support for this form of the RMP can be determined using the CPUID
+instruction::
+
+ 0x8000001f[eax]:
+ Bit[23] indicates support for segmented RMP
+
+If supported, segmented RMP attributes can be found using the CPUID
+instruction::
+
+ 0x80000025[eax]:
+ Bits[5:0] minimum supported RMP segment size
+ Bits[11:6] maximum supported RMP segment size
+
+ 0x80000025[ebx]:
+ Bits[9:0] number of cacheable RMP segment definitions
+ Bit[10] indicates if the number of cacheable RMP segments
+ is a hard limit
+
+To enable a segmented RMP, a new MSR is available::
+
+ 0xc0010136 (RMP_CFG):
+ Bit[0] indicates if segmented RMP is enabled
+ Bits[13:8] contains the size of memory covered by an RMP
+ segment (expressed as a power of 2)
+
+The RMP segment size defined in the RMP_CFG MSR applies to all segments
+of the RMP. Therefore each RMP segment covers a specific range of system
+physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
+the RMP segment coverage value is 0x24 => 36, meaning the size of memory
+covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
+covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
+covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
+
+When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
+area as it does today (16K in size). However, instead of RMP entries
+beginning immediately after the bookkeeping area, there is a 4K RMP
+segment table (RST). Each entry in the RST is 8 bytes in size and represents
+an RMP segment::
+
+ Bits[19:0] mapped size (in GB)
+ The mapped size can be less than the defined segment size.
+ A value of zero indicates that no RMP exists for the range
+ of system physical addresses associated with this segment.
+ Bits[51:20] segment physical address
+ This address is left-shifted 20 bits (or just masked when
+ read) to form the physical address of the segment (1MB
+ alignment).
+
+The RST can hold 512 segment entries but can be limited in size to the number
+of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable
+RMP segments is a hard limit (CPUID 0x80000025_EBX[10]).
+
+The current Linux support relies on BIOS to allocate/reserve the memory for
+the segmented RMP (the bookkeeping area, RST, and all segments), build the RST
+and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR
+values to locate the RMP and determine the size and location of the RMP
+segments. The RMP must cover all of system memory in order for Linux to enable
+SEV-SNP.
+
+More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table",
+docID: 24593.
+
Secure VM Service Module (SVSM)
===============================
+
SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which
defines four privilege levels at which guest software can run. The most
privileged level is 0 and numerically higher numbers have lesser privileges.
--
2.43.2
* Re: [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-09-30 15:22 ` [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h Tom Lendacky
@ 2024-09-30 17:03 ` Dave Hansen
2024-09-30 18:59 ` Tom Lendacky
2024-10-18 4:26 ` Neeraj Upadhyay
1 sibling, 1 reply; 43+ messages in thread
From: Dave Hansen @ 2024-09-30 17:03 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/24 08:22, Tom Lendacky wrote:
> if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
> - c->x86 >= 0x19 && snp_probe_rmptable_info()) {
> + (c->x86 == 0x19 || cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
> + snp_probe_rmptable_info()) {
One humble suggestion (and not a NAK of the series): Could we please
start using #define'd names for these family numbers? We started doing
it for Intel models and I think it's been really successful. We used to
do greps like:
grep -r 'x86_model == 0x0f' arch/x86/
But that misses things like '>=' or macros that build the x86_model
comparison. But now we can do things like:
grep -r INTEL_CORE2_MEROM arch/x86
which does a much better job of finding references.
* Re: [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-09-30 17:03 ` Dave Hansen
@ 2024-09-30 18:59 ` Tom Lendacky
2024-10-18 13:06 ` Borislav Petkov
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lendacky @ 2024-09-30 18:59 UTC (permalink / raw)
To: Dave Hansen, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/24 12:03, Dave Hansen wrote:
> On 9/30/24 08:22, Tom Lendacky wrote:
>> if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
>> - c->x86 >= 0x19 && snp_probe_rmptable_info()) {
>> + (c->x86 == 0x19 || cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
>> + snp_probe_rmptable_info()) {
>
> One humble suggestion (and not a NAK of the series): Could we please
> start using #define'd names for these family numbers? We started doing
> it for Intel models and I think it's been really successful. We used to
> do greps like:
>
> grep -r 'x86_model == 0x0f' arch/x86/
>
> But that misses things like '>=' or macros that build the x86_model
> comparison. But now we can do things like:
>
> grep -r INTEL_CORE2_MEROM arch/x86
>
> which does a much better job of finding references.
The one issue we run into is that family 0x19 contains both Milan (zen3)
and Genoa (zen4), so I'm not sure what to use as a good #define name. We
have the same problem with family 0x17 which contains zen1 and zen2.
I might be able to change the if statement to something like:
if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
(cpu_feature_enabled(X86_FEATURE_ZEN3) ||
cpu_feature_enabled(X86_FEATURE_ZEN4) ||
cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
snp_probe_rmptable_info()) {
which might make the intent clearer.
But, yes, I get your point about making grepping much easier, along with
code readability. Maybe Boris and I can put our heads together to figure
something out.
Thanks,
Tom
* Re: [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
@ 2024-10-16 8:52 ` Nikunj A. Dadhania
2024-10-16 14:43 ` Tom Lendacky
2024-10-16 15:01 ` Neeraj Upadhyay
1 sibling, 1 reply; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-16 8:52 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> The RMPREAD instruction returns an architecture defined format of an
> RMP entry. This is the preferred method for examining RMP entries.
>
> In preparation for using the RMPREAD instruction, convert the existing
> code that directly accesses the RMP to map the raw RMP information into
> the architecture defined format.
>
> RMPREAD output returns a status bit for the 2MB region status. If the
> input page address is 2MB aligned and any other pages within the 2MB
> region are assigned, then 2MB region status will be set to 1. Otherwise,
> the 2MB region status will be set to 0. For systems that do not support
> RMPREAD, calculating this value would require looping over all of the RMP
> table entries within that range until one is found with the assigned bit
> set. Since this bit is not defined in the current format, and so not used
> today, do not incur the overhead associated with calculating it.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/virt/svm/sev.c | 141 ++++++++++++++++++++++++++++------------
> 1 file changed, 98 insertions(+), 43 deletions(-)
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 0ce17766c0e5..103a2dd6e81d 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -30,11 +30,27 @@
> #include <asm/cmdline.h>
> #include <asm/iommu.h>
>
> +/*
> + * The RMP entry format as returned by the RMPREAD instruction.
> + */
> +struct rmpentry {
> + u64 gpa;
> + u8 assigned :1,
> + rsvd1 :7;
> + u8 pagesize :1,
> + hpage_region_status :1,
> + rsvd2 :6;
> + u8 immutable :1,
> + rsvd3 :7;
> + u8 rsvd4;
> + u32 asid;
> +} __packed;
> +
> /*
> * The RMP entry format is not architectural. The format is defined in PPR
> * Family 19h Model 01h, Rev B1 processor.
> */
> -struct rmpentry {
> +struct rmpentry_raw {
> union {
> struct {
> u64 assigned : 1,
> @@ -62,7 +78,7 @@ struct rmpentry {
> #define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
>
> static u64 probed_rmp_base, probed_rmp_size;
> -static struct rmpentry *rmptable __ro_after_init;
> +static struct rmpentry_raw *rmptable __ro_after_init;
> static u64 rmptable_max_pfn __ro_after_init;
>
> static LIST_HEAD(snp_leaked_pages_list);
> @@ -247,8 +263,8 @@ static int __init snp_rmptable_init(void)
> rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
> rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
>
> - rmptable = (struct rmpentry *)rmptable_start;
> - rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry) - 1;
> + rmptable = (struct rmpentry_raw *)rmptable_start;
> + rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
>
> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
>
> @@ -270,48 +286,77 @@ static int __init snp_rmptable_init(void)
> */
> device_initcall(snp_rmptable_init);
>
> -static struct rmpentry *get_rmpentry(u64 pfn)
> +static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
The pfn type has changed from u64 to unsigned long; is this intentional?
> {
> - if (WARN_ON_ONCE(pfn > rmptable_max_pfn))
> - return ERR_PTR(-EFAULT);
> -
> - return &rmptable[pfn];
> -}
> -
> -static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
> -{
> - struct rmpentry *large_entry, *entry;
> -
> - if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> + if (!rmptable)
> return ERR_PTR(-ENODEV);
>
> - entry = get_rmpentry(pfn);
> - if (IS_ERR(entry))
> - return entry;
> + if (unlikely(pfn > rmptable_max_pfn))
> + return ERR_PTR(-EFAULT);
> +
> + return rmptable + pfn;
> +}
> +
> +static int get_rmpentry(u64 pfn, struct rmpentry *entry)
> +{
> + struct rmpentry_raw *e;
> +
> + e = __get_rmpentry(pfn);
> + if (IS_ERR(e))
> + return PTR_ERR(e);
> +
> + /*
> + * Map the RMP table entry onto the RMPREAD output format.
> + * The 2MB region status indicator (hpage_region_status field) is not
> + * calculated, since the overhead could be significant and the field
> + * is not used.
> + */
> + memset(entry, 0, sizeof(*entry));
> + entry->gpa = e->gpa << PAGE_SHIFT;
> + entry->asid = e->asid;
> + entry->assigned = e->assigned;
> + entry->pagesize = e->pagesize;
> + entry->immutable = e->immutable;
> +
> + return 0;
> +}
> +
> +static int __snp_lookup_rmpentry(u64 pfn, struct rmpentry *entry, int *level)
> +{
> + struct rmpentry large_entry;
> + int ret;
> +
> + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> + return -ENODEV;
Can we rely on the rmp_table check in __get_rmpentry() and remove the above check?
If rmp_table is NULL, CC_ATTR_HOST_SEV_SNP is always cleared.
> +
> + ret = get_rmpentry(pfn, entry);
> + if (ret)
> + return ret;
Regards
Nikunj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
@ 2024-10-16 10:46 ` Nikunj A. Dadhania
2024-10-17 15:26 ` Borislav Petkov
` (2 subsequent siblings)
3 siblings, 0 replies; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-16 10:46 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra, nikunj
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> The RMPREAD instruction returns an architecture defined format of an
> RMP table entry. This is the preferred method for examining RMP entries.
>
> The instruction is advertised in CPUID 0x8000001f_EAX[21]. Use this
> instruction when available.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/virt/svm/sev.c | 11 +++++++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index dd4682857c12..93620a4c5b15 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -447,6 +447,7 @@
> #define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* Virtual TSC_AUX */
> #define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */
> #define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
> +#define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
> #define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
>
> /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 103a2dd6e81d..73d4f422829a 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -301,6 +301,17 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
> {
> struct rmpentry_raw *e;
>
> + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
> + int ret;
> +
> + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
> + : "=a" (ret)
> + : "a" (pfn << PAGE_SHIFT), "c" (entry)
> + : "memory", "cc");
> +
> + return ret;
> + }
> +
> e = __get_rmpentry(pfn);
> if (IS_ERR(e))
> return PTR_ERR(e);
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way
2024-09-30 15:22 ` [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way Tom Lendacky
@ 2024-10-16 11:05 ` Nikunj A. Dadhania
2024-10-18 4:28 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-16 11:05 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> To make patch review easier for the segmented RMP support, move the SNP
> probe function out from in between the initialization-related routines.
>
> No functional change.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> arch/x86/virt/svm/sev.c | 60 ++++++++++++++++++++---------------------
> 1 file changed, 30 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 73d4f422829a..31d1510ae119 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -132,36 +132,6 @@ static __init void snp_enable(void *arg)
> __snp_enable(smp_processor_id());
> }
>
> -#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
> -
> -bool snp_probe_rmptable_info(void)
> -{
> - u64 rmp_sz, rmp_base, rmp_end;
> -
> - rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
> - rdmsrl(MSR_AMD64_RMP_END, rmp_end);
> -
> - if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
> - pr_err("Memory for the RMP table has not been reserved by BIOS\n");
> - return false;
> - }
> -
> - if (rmp_base > rmp_end) {
> - pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
> - return false;
> - }
> -
> - rmp_sz = rmp_end - rmp_base + 1;
> -
> - probed_rmp_base = rmp_base;
> - probed_rmp_size = rmp_sz;
> -
> - pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
> - rmp_base, rmp_end);
> -
> - return true;
> -}
> -
> static void __init __snp_fixup_e820_tables(u64 pa)
> {
> if (IS_ALIGNED(pa, PMD_SIZE))
> @@ -286,6 +256,36 @@ static int __init snp_rmptable_init(void)
> */
> device_initcall(snp_rmptable_init);
>
> +#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
> +
> +bool snp_probe_rmptable_info(void)
> +{
> + u64 rmp_sz, rmp_base, rmp_end;
> +
> + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
> + rdmsrl(MSR_AMD64_RMP_END, rmp_end);
> +
> + if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
> + pr_err("Memory for the RMP table has not been reserved by BIOS\n");
> + return false;
> + }
> +
> + if (rmp_base > rmp_end) {
> + pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
> + return false;
> + }
> +
> + rmp_sz = rmp_end - rmp_base + 1;
> +
> + probed_rmp_base = rmp_base;
> + probed_rmp_size = rmp_sz;
> +
> + pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
> + rmp_base, rmp_end);
> +
> + return true;
> +}
> +
> static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
> {
> if (!rmptable)
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [sos-linux-ext-patches] [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range
2024-09-30 15:22 ` [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range Tom Lendacky
@ 2024-10-16 11:25 ` Nikunj A. Dadhania
2024-10-18 4:38 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-16 11:25 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> In preparation for support of a segmented RMP table, map only the RMP
s/support of/supporting/
> table entries. The RMP bookkeeping area is only ever accessed when
> first enabling SNP and does not need to remain mapped. To accomplish
> this, split the initialization of the RMP bookkeeping area and the
> initialization of the RMP entry area. The RMP bookkeeping area will be
> mapped only while it is being initialized.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> arch/x86/virt/svm/sev.c | 36 +++++++++++++++++++++++++++++++-----
> 1 file changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 31d1510ae119..81e21d833cf0 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -168,6 +168,23 @@ void __init snp_fixup_e820_tables(void)
> __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
> }
>
> +static bool __init init_rmptable_bookkeeping(void)
> +{
> + void *bk;
> +
> + bk = memremap(probed_rmp_base, RMPTABLE_CPU_BOOKKEEPING_SZ, MEMREMAP_WB);
> + if (!bk) {
> + pr_err("Failed to map RMP bookkeeping area\n");
> + return false;
> + }
> +
> + memset(bk, 0, RMPTABLE_CPU_BOOKKEEPING_SZ);
> +
> + memunmap(bk);
> +
> + return true;
> +}
> +
> /*
> * Do the necessary preparations which are verified by the firmware as
> * described in the SNP_INIT_EX firmware command description in the SNP
> @@ -205,12 +222,17 @@ static int __init snp_rmptable_init(void)
> goto nosnp;
> }
>
> - rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
> + /* Map only the RMP entries */
> + rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
> + probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
> + MEMREMAP_WB);
> if (!rmptable_start) {
> pr_err("Failed to map RMP table\n");
> goto nosnp;
> }
>
> + rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
> +
> /*
> * Check if SEV-SNP is already enabled, this can happen in case of
> * kexec boot.
> @@ -219,7 +241,14 @@ static int __init snp_rmptable_init(void)
> if (val & MSR_AMD64_SYSCFG_SNP_EN)
> goto skip_enable;
>
> - memset(rmptable_start, 0, probed_rmp_size);
> + /* Zero out the RMP bookkeeping area */
> + if (!init_rmptable_bookkeeping()) {
> + memunmap(rmptable_start);
> + goto nosnp;
> + }
> +
> + /* Zero out the RMP entries */
> + memset(rmptable_start, 0, rmptable_size);
>
> /* Flush the caches to ensure that data is written before SNP is enabled. */
> wbinvd_on_all_cpus();
> @@ -230,9 +259,6 @@ static int __init snp_rmptable_init(void)
> on_each_cpu(snp_enable, NULL, 1);
>
> skip_enable:
> - rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
> - rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
> -
> rmptable = (struct rmpentry_raw *)rmptable_start;
> rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP
2024-10-16 8:52 ` Nikunj A. Dadhania
@ 2024-10-16 14:43 ` Tom Lendacky
2024-10-17 5:24 ` Nikunj A. Dadhania
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lendacky @ 2024-10-16 14:43 UTC (permalink / raw)
To: Nikunj A. Dadhania, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/16/24 03:52, Nikunj A. Dadhania wrote:
> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>> The RMPREAD instruction returns an architecture defined format of an
>> RMP entry. This is the preferred method for examining RMP entries.
>>
>> In preparation for using the RMPREAD instruction, convert the existing
>> code that directly accesses the RMP to map the raw RMP information into
>> the architecture defined format.
>>
>> RMPREAD output returns a status bit for the 2MB region status. If the
>> input page address is 2MB aligned and any other pages within the 2MB
>> region are assigned, then 2MB region status will be set to 1. Otherwise,
>> the 2MB region status will be set to 0. For systems that do not support
>> RMPREAD, calculating this value would require looping over all of the RMP
>> table entries within that range until one is found with the assigned bit
>> set. Since this bit is not defined in the current format, and so not used
>> today, do not incur the overhead associated with calculating it.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> arch/x86/virt/svm/sev.c | 141 ++++++++++++++++++++++++++++------------
>> 1 file changed, 98 insertions(+), 43 deletions(-)
>>
>> -static struct rmpentry *get_rmpentry(u64 pfn)
>> +static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
>
> The pfn type has changed from u64 to unsigned long; is this intentional?
No, not intentional, I'm just used to pfn's being unsigned longs... good
catch.
>
>> +static int __snp_lookup_rmpentry(u64 pfn, struct rmpentry *entry, int *level)
>> +{
>> + struct rmpentry large_entry;
>> + int ret;
>> +
>> + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>> + return -ENODEV;
>
> Can we rely on the rmp_table check in __get_rmpentry() and remove the above check?
> If rmp_table is NULL, CC_ATTR_HOST_SEV_SNP is always cleared.
I'm trying to not change the logic and just add the new struct usage.
Once RMPREAD is used there is no checking of the table address, and if
SNP is not enabled in the SYSCFG MSR, the instruction will #UD.
The table address check is just to ensure we don't accidentally call
this function without checking CC_ATTR_HOST_SEV_SNP in the future to
avoid a possible crash. If anything, I can remove the table address
check that I added here, but I would like to keep it just to be safe.
Thanks,
Tom
>
>> +
>> + ret = get_rmpentry(pfn, entry);
>> + if (ret)
>> + return ret;
>
> Regards
> Nikunj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
2024-10-16 8:52 ` Nikunj A. Dadhania
@ 2024-10-16 15:01 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-16 15:01 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
...
> +
> +static int get_rmpentry(u64 pfn, struct rmpentry *entry)
> +{
> + struct rmpentry_raw *e;
> +
> + e = __get_rmpentry(pfn);
> + if (IS_ERR(e))
> + return PTR_ERR(e);
> +
> + /*
> + * Map the RMP table entry onto the RMPREAD output format.
> + * The 2MB region status indicator (hpage_region_status field) is not
> + * calculated, since the overhead could be significant and the field
> + * is not used.
> + */
> + memset(entry, 0, sizeof(*entry));
> + entry->gpa = e->gpa << PAGE_SHIFT;
Nit: Do we need to use PAGE_SHIFT here or hard code the shift to 12?
- Neeraj
> + entry->asid = e->asid;
> + entry->assigned = e->assigned;
> + entry->pagesize = e->pagesize;
> + entry->immutable = e->immutable;
> +
> + return 0;
> +}
> +
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP
2024-10-16 14:43 ` Tom Lendacky
@ 2024-10-17 5:24 ` Nikunj A. Dadhania
0 siblings, 0 replies; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-17 5:24 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/16/2024 8:13 PM, Tom Lendacky wrote:
> On 10/16/24 03:52, Nikunj A. Dadhania wrote:
>> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>>> The RMPREAD instruction returns an architecture defined format of an
>>> RMP entry. This is the preferred method for examining RMP entries.
>>>
>>> +static int __snp_lookup_rmpentry(u64 pfn, struct rmpentry *entry, int *level)
>>> +{
>>> + struct rmpentry large_entry;
>>> + int ret;
>>> +
>>> + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>>> + return -ENODEV;
>>
>> Can we rely on the rmp_table check in __get_rmpentry() and remove the above check?
>> If rmp_table is NULL, CC_ATTR_HOST_SEV_SNP is always cleared.
>
> I'm trying to not change the logic and just add the new struct usage.
> Once RMPREAD is used there is no checking of the table address and if
> SNP is not enabled in the SYSCFG MSR the instruction will #UD.
Sure, makes sense.
> The table address check is just to ensure we don't accidentally call
> this function without checking CC_ATTR_HOST_SEV_SNP in the future to
> avoid a possible crash. If anything, I can remove the table address
> check that I added here, but I would like to keep it just to be safe.
Regards
Nikunj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment
2024-09-30 15:22 ` [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment Tom Lendacky
@ 2024-10-17 11:05 ` Nikunj A. Dadhania
2024-10-18 5:59 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-17 11:05 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> In preparation for support of a segmented RMP table, treat the contiguous
> RMP table as a segmented RMP table with a single segment covering all
> of memory. By treating a contiguous RMP table as a single segment, much
> of the code that initializes and accesses the RMP can be re-used.
>
> Segmented RMP tables can have up to 512 segment entries. Each segment
> will have metadata associated with it to identify the segment location,
> the segment size, etc. The segment data and the physical address are used
> to determine the index of the segment within the table and then the RMP
> entry within the segment. For an actual segmented RMP table environment,
> much of the segment information will come from a configuration MSR. For
> the contiguous RMP, though, much of the information will be statically
> defined.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
Regards,
Nikunj
> ---
> arch/x86/virt/svm/sev.c | 195 ++++++++++++++++++++++++++++++++++++----
> 1 file changed, 176 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 81e21d833cf0..ebfb924652f8 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -18,6 +18,7 @@
> #include <linux/cpumask.h>
> #include <linux/iommu.h>
> #include <linux/amd-iommu.h>
> +#include <linux/nospec.h>
>
> #include <asm/sev.h>
> #include <asm/processor.h>
> @@ -74,12 +75,42 @@ struct rmpentry_raw {
> */
> #define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
>
> +/*
> + * For a non-segmented RMP table, use the maximum physical addressing as the
> + * segment size in order to always arrive at index 0 in the table.
> + */
> +#define RMPTABLE_NON_SEGMENTED_SHIFT 52
> +
> +struct rmp_segment_desc {
> + struct rmpentry_raw *rmp_entry;
> + u64 max_index;
> + u64 size;
> +};
> +
> +/*
> + * Segmented RMP Table support.
> + * - The segment size is used for two purposes:
> + * - Identify the amount of memory covered by an RMP segment
> + * - Quickly locate an RMP segment table entry for a physical address
> + *
> + * - The RMP segment table contains pointers to an RMP table that covers
> + * a specific portion of memory. There can be up to 512 8-byte entries,
> + * one pages worth.
> + */
> +static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
> +static unsigned int rst_max_index __ro_after_init = 512;
> +
> +static u64 rmp_segment_size_max;
> +static unsigned int rmp_segment_coverage_shift;
> +static unsigned long rmp_segment_coverage_size;
> +static unsigned long rmp_segment_coverage_mask;
> +#define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_coverage_shift)
> +#define RMP_ENTRY_INDEX(x) PHYS_PFN((x) & rmp_segment_coverage_mask)
> +
> /* Mask to apply to a PFN to get the first PFN of a 2MB page */
> #define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
>
> static u64 probed_rmp_base, probed_rmp_size;
> -static struct rmpentry_raw *rmptable __ro_after_init;
> -static u64 rmptable_max_pfn __ro_after_init;
>
> static LIST_HEAD(snp_leaked_pages_list);
> static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
> @@ -185,6 +216,92 @@ static bool __init init_rmptable_bookkeeping(void)
> return true;
> }
>
> +static bool __init alloc_rmp_segment_desc(u64 segment_pa, u64 segment_size, u64 pa)
> +{
> + struct rmp_segment_desc *desc;
> + unsigned long rst_index;
> + void *rmp_segment;
> +
> + /* Validate the RMP segment size */
> + if (segment_size > rmp_segment_size_max) {
> + pr_err("Invalid RMP size (%#llx) for configured segment size (%#llx)\n",
> + segment_size, rmp_segment_size_max);
> + return false;
> + }
> +
> + /* Validate the RMP segment table index */
> + rst_index = RST_ENTRY_INDEX(pa);
> + if (rst_index >= rst_max_index) {
> + pr_err("Invalid RMP segment base address (%#llx) for configured segment size (%#lx)\n",
> + pa, rmp_segment_coverage_size);
> + return false;
> + }
> + rst_index = array_index_nospec(rst_index, rst_max_index);
> +
> + if (rmp_segment_table[rst_index]) {
> + pr_err("RMP segment descriptor already exists at index %lu\n", rst_index);
> + return false;
> + }
> +
> + /* Map the RMP entries */
> + rmp_segment = memremap(segment_pa, segment_size, MEMREMAP_WB);
> + if (!rmp_segment) {
> + pr_err("Failed to map RMP segment addr 0x%llx size 0x%llx\n",
> + segment_pa, segment_size);
> + return false;
> + }
> +
> + desc = kzalloc(sizeof(*desc), GFP_KERNEL);
> + if (!desc) {
> + memunmap(rmp_segment);
> + return false;
> + }
> +
> + desc->rmp_entry = rmp_segment;
> + desc->max_index = segment_size / sizeof(*desc->rmp_entry);
> + desc->size = segment_size;
> +
> + /* Add the segment descriptor to the table */
> + rmp_segment_table[rst_index] = desc;
> +
> + return true;
> +}
> +
> +static void __init free_rmp_segment_table(void)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < rst_max_index; i++) {
> + struct rmp_segment_desc *desc;
> +
> + desc = rmp_segment_table[i];
> + if (!desc)
> + continue;
> +
> + memunmap(desc->rmp_entry);
> +
> + kfree(desc);
> + }
> +
> + free_page((unsigned long)rmp_segment_table);
> +
> + rmp_segment_table = NULL;
> +}
> +
> +static bool __init alloc_rmp_segment_table(void)
> +{
> + struct page *page;
> +
> + /* Allocate the table used to index into the RMP segments */
> + page = alloc_page(__GFP_ZERO);
> + if (!page)
> + return false;
> +
> + rmp_segment_table = page_address(page);
> +
> + return true;
> +}
> +
> /*
> * Do the necessary preparations which are verified by the firmware as
> * described in the SNP_INIT_EX firmware command description in the SNP
> @@ -192,8 +309,8 @@ static bool __init init_rmptable_bookkeeping(void)
> */
> static int __init snp_rmptable_init(void)
> {
> - u64 max_rmp_pfn, calc_rmp_sz, rmptable_size, rmp_end, val;
> - void *rmptable_start;
> + u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end, val;
> + unsigned int i;
>
> if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> return 0;
> @@ -222,17 +339,18 @@ static int __init snp_rmptable_init(void)
> goto nosnp;
> }
>
> + if (!alloc_rmp_segment_table())
> + goto nosnp;
> +
> /* Map only the RMP entries */
> - rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
> - probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
> - MEMREMAP_WB);
> - if (!rmptable_start) {
> - pr_err("Failed to map RMP table\n");
> + rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
> + rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
> +
> + if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) {
> + free_rmp_segment_table();
> goto nosnp;
> }
>
> - rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
> -
> /*
> * Check if SEV-SNP is already enabled, this can happen in case of
> * kexec boot.
> @@ -243,12 +361,20 @@ static int __init snp_rmptable_init(void)
>
> /* Zero out the RMP bookkeeping area */
> if (!init_rmptable_bookkeeping()) {
> - memunmap(rmptable_start);
> + free_rmp_segment_table();
> goto nosnp;
> }
>
> /* Zero out the RMP entries */
> - memset(rmptable_start, 0, rmptable_size);
> + for (i = 0; i < rst_max_index; i++) {
> + struct rmp_segment_desc *desc;
> +
> + desc = rmp_segment_table[i];
> + if (!desc)
> + continue;
> +
> + memset(desc->rmp_entry, 0, desc->size);
> + }
>
> /* Flush the caches to ensure that data is written before SNP is enabled. */
> wbinvd_on_all_cpus();
> @@ -259,9 +385,6 @@ static int __init snp_rmptable_init(void)
> on_each_cpu(snp_enable, NULL, 1);
>
> skip_enable:
> - rmptable = (struct rmpentry_raw *)rmptable_start;
> - rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry_raw) - 1;
> -
> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
>
> /*
> @@ -282,6 +405,17 @@ static int __init snp_rmptable_init(void)
> */
> device_initcall(snp_rmptable_init);
>
> +static void set_rmp_segment_info(unsigned int segment_shift)
> +{
> + rmp_segment_coverage_shift = segment_shift;
> + rmp_segment_coverage_size = 1UL << rmp_segment_coverage_shift;
> + rmp_segment_coverage_mask = rmp_segment_coverage_size - 1;
> +
> + /* Calculate the maximum size an RMP can be (16 bytes/page mapped) */
> + rmp_segment_size_max = PHYS_PFN(rmp_segment_coverage_size);
> + rmp_segment_size_max <<= 4;
> +}
> +
> #define RMP_ADDR_MASK GENMASK_ULL(51, 13)
>
> bool snp_probe_rmptable_info(void)
> @@ -303,6 +437,11 @@ bool snp_probe_rmptable_info(void)
>
> rmp_sz = rmp_end - rmp_base + 1;
>
> + /* Treat the contiguous RMP table as a single segment */
> + rst_max_index = 1;
> +
> + set_rmp_segment_info(RMPTABLE_NON_SEGMENTED_SHIFT);
> +
> probed_rmp_base = rmp_base;
> probed_rmp_size = rmp_sz;
>
> @@ -314,13 +453,31 @@ bool snp_probe_rmptable_info(void)
>
> static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
> {
> - if (!rmptable)
> + struct rmp_segment_desc *desc;
> + unsigned long rst_index;
> + unsigned long paddr;
> + u64 segment_index;
> +
> + if (!rmp_segment_table)
> return ERR_PTR(-ENODEV);
>
> - if (unlikely(pfn > rmptable_max_pfn))
> + paddr = pfn << PAGE_SHIFT;
> +
> + rst_index = RST_ENTRY_INDEX(paddr);
> + if (unlikely(rst_index >= rst_max_index))
> + return ERR_PTR(-EFAULT);
> + rst_index = array_index_nospec(rst_index, rst_max_index);
> +
> + desc = rmp_segment_table[rst_index];
> + if (unlikely(!desc))
> return ERR_PTR(-EFAULT);
>
> - return rmptable + pfn;
> + segment_index = RMP_ENTRY_INDEX(paddr);
> + if (unlikely(segment_index >= desc->max_index))
> + return ERR_PTR(-EFAULT);
> + segment_index = array_index_nospec(segment_index, desc->max_index);
> +
> + return desc->rmp_entry + segment_index;
> }
>
> static int get_rmpentry(u64 pfn, struct rmpentry *entry)
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
2024-10-16 10:46 ` Nikunj A. Dadhania
@ 2024-10-17 15:26 ` Borislav Petkov
2024-10-17 16:24 ` Tom Lendacky
2024-10-18 4:21 ` Neeraj Upadhyay
2024-10-18 12:41 ` Borislav Petkov
3 siblings, 1 reply; 43+ messages in thread
From: Borislav Petkov @ 2024-10-17 15:26 UTC (permalink / raw)
To: Tom Lendacky
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On Mon, Sep 30, 2024 at 10:22:10AM -0500, Tom Lendacky wrote:
> + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
> + int ret;
> +
> + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
> + : "=a" (ret)
> + : "a" (pfn << PAGE_SHIFT), "c" (entry)
> + : "memory", "cc");
> +
> + return ret;
> + }
> +
> e = __get_rmpentry(pfn);
So dump_rmpentry() still calls this but it doesn't require the newly added
services of RMPREAD and so this is looking to be disambiguated: a function
which gives you the entry coming from RMPREAD, I guess the architectural one,
and the other one.
IOW, I am still unclear on the nomenclature:
The _raw* entries do not come from the insn but then what's the raw-ness about
them?
This convention sounds weird as it is now, I'd say.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-10-17 15:26 ` Borislav Petkov
@ 2024-10-17 16:24 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-17 16:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/17/24 10:26, Borislav Petkov wrote:
> On Mon, Sep 30, 2024 at 10:22:10AM -0500, Tom Lendacky wrote:
>> + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
>> + int ret;
>> +
>> + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
>> + : "=a" (ret)
>> + : "a" (pfn << PAGE_SHIFT), "c" (entry)
>> + : "memory", "cc");
>> +
>> + return ret;
>> + }
>> +
>> e = __get_rmpentry(pfn);
>
> So dump_rmpentry() still calls this but it doesn't require the newly added
> services of RMPREAD and so this is looking to be disambiguated: a function
> which gives you the entry coming from RMPREAD, I guess the architectural one,
> and the other one.
Right, because for debugging purposes we want to dump the raw RMP entry
that is in the RMP table, not just the information returned by RMPREAD
(since RMPREAD doesn't return everything defined in the RMP entry).
This is why dump_rmpentry() merely prints out the RMP entry as two u64
values.
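(Roughly along these lines; the helper name and exact format string are
illustrative, not lifted from the patch.)

	/* Illustrative: dump the raw 16-byte RMP entry as two u64 values */
	static void dump_raw_rmpentry(u64 pfn, struct rmpentry_raw *e)
	{
		u64 *e64 = (u64 *)e;

		pr_info("PFN 0x%llx, RMP entry: [0x%016llx 0x%016llx]\n",
			pfn, e64[0], e64[1]);
	}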
>
> IOW, I am still unclear on the nomenclature:
>
> The _raw* entries do not come from the insn but then what's the raw-ness about
> them?
The raw-ness is that it is the actual data in the RMP table. The reason
for RMPREAD is that there is no guarantee that the raw data won't be
reformatted in a future program, which is why we only allow access to
the RMP entry for Milan and Genoa, where the format is known and the same.
Thanks,
Tom
>
> This convention sounds weird as it is now, I'd say.
>
> Thx.
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
2024-10-16 10:46 ` Nikunj A. Dadhania
2024-10-17 15:26 ` Borislav Petkov
@ 2024-10-18 4:21 ` Neeraj Upadhyay
2024-10-18 12:41 ` Borislav Petkov
3 siblings, 0 replies; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 4:21 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> The RMPREAD instruction returns an architecture defined format of an
> RMP table entry. This is the preferred method for examining RMP entries.
>
> The instruction is advertised in CPUID 0x8000001f_EAX[21]. Use this
> instruction when available.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
- Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-09-30 15:22 ` [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h Tom Lendacky
2024-09-30 17:03 ` Dave Hansen
@ 2024-10-18 4:26 ` Neeraj Upadhyay
2024-10-18 13:30 ` Tom Lendacky
1 sibling, 1 reply; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 4:26 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> Limit usage of the non-architectural RMP format to Fam19h processors.
> The RMPREAD instruction, with its architecture defined output, is
> available, and should be used, for RMP access beyond Fam19h.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kernel/cpu/amd.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 015971adadfc..ddbb6dd75fb2 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -358,7 +358,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
> * for which the RMP table entry format is currently defined for.
> */
> if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
> - c->x86 >= 0x19 && snp_probe_rmptable_info()) {
> + (c->x86 == 0x19 || cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
Maybe update the comment above this if condition with information about the RMPREAD feature?
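e.g. something along these lines (wording is just a suggestion):

	/*
	 * The RMP table entry format is non-architectural and is defined
	 * only for processor family 0x19. Later processors must provide
	 * the RMPREAD instruction, which returns an architecture-defined
	 * format.
	 */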
- Neeraj
> + snp_probe_rmptable_info()) {
> cc_platform_set(CC_ATTR_HOST_SEV_SNP);
> } else {
> setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way
2024-09-30 15:22 ` [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way Tom Lendacky
2024-10-16 11:05 ` Nikunj A. Dadhania
@ 2024-10-18 4:28 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 4:28 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> To make patch review easier for the segmented RMP support, move the SNP
> probe function out from in between the initialization-related routines.
>
> No functional change.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
- Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range
2024-09-30 15:22 ` [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range Tom Lendacky
2024-10-16 11:25 ` [sos-linux-ext-patches] " Nikunj A. Dadhania
@ 2024-10-18 4:38 ` Neeraj Upadhyay
2024-10-18 13:32 ` Tom Lendacky
1 sibling, 1 reply; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 4:38 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
> /*
> * Do the necessary preparations which are verified by the firmware as
> * described in the SNP_INIT_EX firmware command description in the SNP
> @@ -205,12 +222,17 @@ static int __init snp_rmptable_init(void)
> goto nosnp;
> }
>
> - rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
> + /* Map only the RMP entries */
> + rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
> + probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
> + MEMREMAP_WB);
> if (!rmptable_start) {
> pr_err("Failed to map RMP table\n");
> goto nosnp;
> }
>
> + rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
> +
Nit: Move this assignment above 'rmptable_start = memremap(...)', so that
rmptable_size can be used there.
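i.e. something like (illustrative only):

	rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;

	/* Map only the RMP entries */
	rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
				  rmptable_size, MEMREMAP_WB);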
- Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment
2024-09-30 15:22 ` [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment Tom Lendacky
2024-10-17 11:05 ` Nikunj A. Dadhania
@ 2024-10-18 5:59 ` Neeraj Upadhyay
2024-10-18 13:56 ` Tom Lendacky
1 sibling, 1 reply; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 5:59 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> In preparation for support of a segmented RMP table, treat the contiguous
> RMP table as a segmented RMP table with a single segment covering all
> of memory. By treating a contiguous RMP table as a single segment, much
> of the code that initializes and accesses the RMP can be re-used.
>
> Segmented RMP tables can have up to 512 segment entries. Each segment
> will have metadata associated with it to identify the segment location,
> the segment size, etc. The segment data and the physical address are used
> to determine the index of the segment within the table and then the RMP
> entry within the segment. For an actual segmented RMP table environment,
> much of the segment information will come from a configuration MSR. For
> the contiguous RMP, though, much of the information will be statically
> defined.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/virt/svm/sev.c | 195 ++++++++++++++++++++++++++++++++++++----
> 1 file changed, 176 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 81e21d833cf0..ebfb924652f8 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -18,6 +18,7 @@
> #include <linux/cpumask.h>
> #include <linux/iommu.h>
> #include <linux/amd-iommu.h>
> +#include <linux/nospec.h>
>
> #include <asm/sev.h>
> #include <asm/processor.h>
> @@ -74,12 +75,42 @@ struct rmpentry_raw {
> */
> #define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
>
> +/*
> + * For a non-segmented RMP table, use the maximum physical addressing as the
> + * segment size in order to always arrive at index 0 in the table.
> + */
> +#define RMPTABLE_NON_SEGMENTED_SHIFT 52
> +
> +struct rmp_segment_desc {
> + struct rmpentry_raw *rmp_entry;
> + u64 max_index;
> + u64 size;
> +};
> +
> +/*
> + * Segmented RMP Table support.
> + * - The segment size is used for two purposes:
> + * - Identify the amount of memory covered by an RMP segment
> + * - Quickly locate an RMP segment table entry for a physical address
> + *
> + * - The RMP segment table contains pointers to an RMP table that covers
> + * a specific portion of memory. There can be up to 512 8-byte entries,
> + * one pages worth.
> + */
> +static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
> +static unsigned int rst_max_index __ro_after_init = 512;
> +
> +static u64 rmp_segment_size_max;
> +static unsigned int rmp_segment_coverage_shift;
> +static unsigned long rmp_segment_coverage_size;
> +static unsigned long rmp_segment_coverage_mask;
rmp_segment_size_max is of type u64, and rmp_segment_coverage_size is 1 << 52
for a single RMP segment. So maybe use u64 for rmp_segment_coverage_size
and rmp_segment_coverage_mask as well?
- Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table
2024-09-30 15:22 ` [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table Tom Lendacky
@ 2024-10-18 6:32 ` Nikunj A. Dadhania
2024-10-18 14:41 ` Tom Lendacky
2024-10-18 8:37 ` Neeraj Upadhyay
1 sibling, 1 reply; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-18 6:32 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> A segmented RMP table allows for improved locality of reference between
> the memory protected by the RMP and the RMP entries themselves.
>
> Add support to detect and initialize a segmented RMP table with multiple
> segments as configured by the system BIOS. While the RMPREAD instruction
> will be used to read an RMP entry in a segmented RMP, initialization and
> debugging capabilities will require the mapping of the segments.
>
> The RMP_CFG MSR indicates if segmented RMP support is enabled and, if
> enabled, the amount of memory that an RMP segment covers. When segmented
> RMP support is enabled, the RMP_BASE MSR points to the start of the RMP
> bookkeeping area, which is 16K in size. The RMP Segment Table (RST) is
> located immediately after the bookkeeping area and is 4K in size. The RST
> contains up to 512 8-byte entries that identify the location of the RMP
> segment and amount of memory mapped by the segment (which must be less
> than or equal to the configured segment size). The physical address that
> is covered by a segment is based on the segment size and the index of the
> segment in the RST. The RMP entry for a physical address is based on the
> offset within the segment.
>
> For example, if the segment size is 64GB (0x1000000000 or 1 << 36), then
> physical address 0x9000800000 is RST entry 9 (0x9000800000 >> 36) and
> RST entry 9 covers physical memory 0x9000000000 to 0x9FFFFFFFFF.
>
> The RMP entry index within the RMP segment is the physical address
> AND-ed with the segment mask, 64GB - 1 (0xFFFFFFFFF), and then
> right-shifted 12 bits or PHYS_PFN(0x9000800000 & 0xFFFFFFFFF), which
> is 0x800.
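(As a quick sanity check, the index math above boils down to something like
this; illustrative sketch only, using the example values from the commit
message.)

	u64 pa            = 0x9000800000ULL;
	u64 segment_shift = 36;				/* 64GB segments */
	u64 segment_mask  = (1ULL << segment_shift) - 1;	/* 0xFFFFFFFFF */

	u64 rst_index = pa >> segment_shift;		/* 9     -> RST entry 9     */
	u64 rmp_index = (pa & segment_mask) >> 12;	/* 0x800 -> RMP entry 0x800 */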
>
> CPUID 0x80000025_EBX[9:0] describes the number of RMP segments that can
> be cached by the hardware. Additionally, if CPUID 0x80000025_EBX[10] is
> set, then the number of actual RMP segments defined cannot exceed the
> number of RMP segments that can be cached and can be used as a maximum
> RST index.
If EBX[10] is not set, will we need to iterate over all 512 segment
entries?
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/msr-index.h | 9 +-
> arch/x86/virt/svm/sev.c | 231 ++++++++++++++++++++++++++---
> 3 files changed, 218 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 93620a4c5b15..417cdc636a12 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -448,6 +448,7 @@
> #define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */
> #define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
> #define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
> +#define X86_FEATURE_SEGMENTED_RMP (19*32+23) /* Segmented RMP support */
> #define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
>
> /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 3ae84c3b8e6d..8b57c4d1098f 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -682,11 +682,14 @@
> #define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
> #define MSR_AMD64_SNP_RESV_BIT 18
> #define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
> -
> -#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
> -
Moved accidentally?
> #define MSR_AMD64_RMP_BASE 0xc0010132
> #define MSR_AMD64_RMP_END 0xc0010133
> +#define MSR_AMD64_RMP_CFG 0xc0010136
> +#define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
> +#define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
> +#define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
> +
> +#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
>
> #define MSR_SVSM_CAA 0xc001f000
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index ebfb924652f8..2f83772d3daa 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -97,6 +97,10 @@ struct rmp_segment_desc {
> * a specific portion of memory. There can be up to 512 8-byte entries,
> * one pages worth.
> */
> +#define RST_ENTRY_MAPPED_SIZE(x) ((x) & GENMASK_ULL(19, 0))
> +#define RST_ENTRY_SEGMENT_BASE(x) ((x) & GENMASK_ULL(51, 20))
> +
> +#define RMP_SEGMENT_TABLE_SIZE SZ_4K
> static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
> static unsigned int rst_max_index __ro_after_init = 512;
>
> @@ -107,6 +111,9 @@ static unsigned long rmp_segment_coverage_mask;
> #define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_coverage_shift)
> #define RMP_ENTRY_INDEX(x) PHYS_PFN((x) & rmp_segment_coverage_mask)
>
> +static u64 rmp_cfg;
> +#define RMP_IS_SEGMENTED(x) ((x) & MSR_AMD64_SEG_RMP_ENABLED)
> +
> /* Mask to apply to a PFN to get the first PFN of a 2MB page */
> #define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
>
> @@ -196,7 +203,42 @@ static void __init __snp_fixup_e820_tables(u64 pa)
<skipped the e820 bits>
> @@ -302,24 +344,12 @@ static bool __init alloc_rmp_segment_table(void)
> return true;
> }
>
> -/*
> - * Do the necessary preparations which are verified by the firmware as
> - * described in the SNP_INIT_EX firmware command description in the SNP
> - * firmware ABI spec.
> - */
> -static int __init snp_rmptable_init(void)
> +static bool __init contiguous_rmptable_setup(void)
> {
> - u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end, val;
> - unsigned int i;
> -
> - if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> - return 0;
> -
> - if (!amd_iommu_snp_en)
> - goto nosnp;
> + u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end;
>
> if (!probed_rmp_size)
> - goto nosnp;
> + return false;
>
> rmp_end = probed_rmp_base + probed_rmp_size - 1;
>
If you don't mind, please fold in the comment update below for
contiguous_rmptable_setup(), which I found during review. If required, I can
send a separate patch.
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 2f83772d3daa..d5a9f8164672 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -354,7 +354,7 @@ static bool __init contiguous_rmptable_setup(void)
rmp_end = probed_rmp_base + probed_rmp_size - 1;
/*
- * Calculate the amount the memory that must be reserved by the BIOS to
+ * Calculate the amount of memory that must be reserved by the BIOS to
* address the whole RAM, including the bookkeeping area. The RMP itself
* must also be covered.
*/
> @@ -336,11 +366,11 @@ static int __init snp_rmptable_init(void)
> if (calc_rmp_sz > probed_rmp_size) {
> pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
> calc_rmp_sz, probed_rmp_size);
> - goto nosnp;
> + return false;
> }
>
> if (!alloc_rmp_segment_table())
> - goto nosnp;
> + return false;
>
> /* Map only the RMP entries */
> rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
> @@ -348,9 +378,116 @@ static int __init snp_rmptable_init(void)
>
> if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) {
> free_rmp_segment_table();
> - goto nosnp;
> + return false;
> }
>
> + return true;
> +}
> +
> +static bool __init segmented_rmptable_setup(void)
> +{
> + u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max;
> + unsigned int i, max_index;
> +
> + if (!probed_rmp_base)
> + return false;
> +
> + if (!alloc_rmp_segment_table())
> + return false;
> +
> + /* Map the RMP Segment Table */
> + rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
> + rst = memremap(rst_pa, RMP_SEGMENT_TABLE_SIZE, MEMREMAP_WB);
> + if (!rst) {
> + pr_err("Failed to map RMP segment table addr %#llx\n", rst_pa);
> + goto e_free;
> + }
> +
> + /* Get the address for the end of system RAM */
> + ram_pa_max = max_pfn << PAGE_SHIFT;
> +
> + /* Process each RMP segment */
> + max_index = 0;
> + ram_pa_end = 0;
> + for (i = 0; i < rst_max_index; i++) {
> + u64 rmp_segment, rmp_size, mapped_size;
> +
> + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]);
> + if (!mapped_size)
> + continue;
> +
> + max_index = i;
> +
> + /* Mapped size in GB */
> + mapped_size *= (1ULL << 30);
> + if (mapped_size > rmp_segment_coverage_size)
> + mapped_size = rmp_segment_coverage_size;
This seems to be an error in BIOS RST programming; a print/warning here
would probably help during debug.
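Something like this (illustrative only) would make the clamping visible:

	pr_warn("RST[%u]: mapped size %#llx exceeds configured segment size %#lx, clamping\n",
		i, mapped_size, rmp_segment_coverage_size);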
> +
> + rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]);
> +
> + rmp_size = PHYS_PFN(mapped_size);
> + rmp_size <<= 4;
A comment above this will help, as you are calculating 16 bytes/page.
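Maybe something like (comment wording is just a suggestion):

	/* Each 4K page covered by the segment needs a 16-byte RMP entry */
	rmp_size = PHYS_PFN(mapped_size);
	rmp_size <<= 4;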
> + pa = (u64)i << rmp_segment_coverage_shift;
> +
> + /* Some segments may be for MMIO mapped above system RAM */
Why will the RST have MMIO-mapped entries?
> + if (pa < ram_pa_max)
> + ram_pa_end = pa + mapped_size;
> +
> + if (!alloc_rmp_segment_desc(rmp_segment, rmp_size, pa))
> + goto e_unmap;
> +
> + pr_info("RMP segment %u physical address [%#llx - %#llx] covering [%#llx - %#llx]\n",
> + i, rmp_segment, rmp_segment + rmp_size - 1, pa, pa + mapped_size - 1);
> + }
> +
> + if (ram_pa_max > ram_pa_end) {
> + pr_err("Segmented RMP does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
> + ram_pa_max, ram_pa_end);
> + goto e_unmap;
> + }
> +
> + /* Adjust the maximum index based on the found segments */
> + rst_max_index = max_index + 1;
> +
> + memunmap(rst);
> +
> + return true;
> +
> +e_unmap:
> + memunmap(rst);
> +
> +e_free:
> + free_rmp_segment_table();
> +
> + return false;
> +}
> +
> +static bool __init rmptable_setup(void)
> +{
> + return RMP_IS_SEGMENTED(rmp_cfg) ? segmented_rmptable_setup()
> + : contiguous_rmptable_setup();
> +}
> +
> +/*
> + * Do the necessary preparations which are verified by the firmware as
> + * described in the SNP_INIT_EX firmware command description in the SNP
> + * firmware ABI spec.
> + */
> +static int __init snp_rmptable_init(void)
> +{
> + unsigned int i;
> + u64 val;
> +
> + if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> + return 0;
> +
> + if (!amd_iommu_snp_en)
> + goto nosnp;
> +
> + if (!rmptable_setup())
> + goto nosnp;
> +
> /*
> * Check if SEV-SNP is already enabled, this can happen in case of
> * kexec boot.
> @@ -418,7 +555,7 @@ static void set_rmp_segment_info(unsigned int segment_shift)
>
> #define RMP_ADDR_MASK GENMASK_ULL(51, 13)
>
> -bool snp_probe_rmptable_info(void)
> +static bool probe_contiguous_rmptable_info(void)
> {
> u64 rmp_sz, rmp_base, rmp_end;
>
> @@ -451,6 +588,60 @@ bool snp_probe_rmptable_info(void)
> return true;
> }
>
> +static bool probe_segmented_rmptable_info(void)
> +{
> + unsigned int eax, ebx, segment_shift, segment_shift_min, segment_shift_max;
> + u64 rmp_base, rmp_end;
> +
> + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
> + rdmsrl(MSR_AMD64_RMP_END, rmp_end);
> +
> + if (!(rmp_base & RMP_ADDR_MASK)) {
> + pr_err("Memory for the RMP table has not been reserved by BIOS\n");
> + return false;
> + }
> +
> + WARN_ONCE(rmp_end & RMP_ADDR_MASK,
> + "Segmented RMP enabled but RMP_END MSR is non-zero\n");
> +
> + /* Obtain the min and max supported RMP segment size */
> + eax = cpuid_eax(0x80000025);
> + segment_shift_min = eax & GENMASK(5, 0);
> + segment_shift_max = (eax & GENMASK(11, 6)) >> 6;
> +
> + /* Verify the segment size is within the supported limits */
> + segment_shift = MSR_AMD64_RMP_SEGMENT_SHIFT(rmp_cfg);
> + if (segment_shift > segment_shift_max || segment_shift < segment_shift_min) {
> + pr_err("RMP segment size (%u) is not within advertised bounds (min=%u, max=%u)\n",
> + segment_shift, segment_shift_min, segment_shift_max);
> + return false;
> + }
> +
> + /* Override the max supported RST index if a hardware limit exists */
> + ebx = cpuid_ebx(0x80000025);
> + if (ebx & BIT(10))
> + rst_max_index = ebx & GENMASK(9, 0);
> +
> + set_rmp_segment_info(segment_shift);
> +
> + probed_rmp_base = rmp_base;
> + probed_rmp_size = 0;
> +
> + pr_info("RMP segment table physical address [0x%016llx - 0x%016llx]\n",
> + rmp_base, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
> +
> + return true;
> +}
> +
> +bool snp_probe_rmptable_info(void)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_SEGMENTED_RMP))
> + rdmsrl(MSR_AMD64_RMP_CFG, rmp_cfg);
> +
> + return RMP_IS_SEGMENTED(rmp_cfg) ? probe_segmented_rmptable_info()
> + : probe_contiguous_rmptable_info();
> +}
> +
> static struct rmpentry_raw *__get_rmpentry(unsigned long pfn)
> {
> struct rmp_segment_desc *desc;
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP)
2024-09-30 15:22 ` [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP) Tom Lendacky
@ 2024-10-18 6:56 ` Nikunj A. Dadhania
2024-10-18 14:48 ` Tom Lendacky
2024-10-18 13:31 ` Neeraj Upadhyay
1 sibling, 1 reply; 43+ messages in thread
From: Nikunj A. Dadhania @ 2024-10-18 6:56 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> Update the AMD memory encryption documentation to include information on
> the Reverse Map Table (RMP) and the two table formats.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> .../arch/x86/amd-memory-encryption.rst | 118 ++++++++++++++++++
> 1 file changed, 118 insertions(+)
>
> diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
> index 6df3264f23b9..bd840df708ea 100644
> --- a/Documentation/arch/x86/amd-memory-encryption.rst
> +++ b/Documentation/arch/x86/amd-memory-encryption.rst
> @@ -130,8 +130,126 @@ SNP feature support.
>
> More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR
>
> +Reverse Map Table (RMP)
> +=======================
> +
> +The RMP is a structure in system memory that is used to ensure a one-to-one
> +mapping between system physical addresses and guest physical addresses. Each
> +page of memory that is potentially assignable to guests has one entry within
> +the RMP.
> +
> +The RMP table can be either contiguous in memory or a collection of segments
> +in memory.
> +
> +Contiguous RMP
> +--------------
> +
> +Support for this form of the RMP is present when support for SEV-SNP is
> +present, which can be determined using the CPUID instruction::
> +
> + 0x8000001f[eax]:
> + Bit[4] indicates support for SEV-SNP
> +
> +The location of the RMP is identified to the hardware through two MSRs::
> +
> + 0xc0010132 (RMP_BASE):
> + System physical address of the first byte of the RMP
> +
> + 0xc0010133 (RMP_END):
> + System physical address of the last byte of the RMP
> +
> +Hardware requires that RMP_BASE and (RPM_END + 1) be 8KB aligned, but SEV
> +firmware increases the alignment requirement to require a 1MB alignment.
> +
> +The RMP consists of a 16KB region used for processor bookkeeping followed
> +by the RMP entries, which are 16 bytes in size. The size of the RMP
> +determines the range of physical memory that the hypervisor can assign to
> +SEV-SNP guests. The RMP covers the system physical address from::
> +
> + 0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB.
> +
> +The current Linux support relies on BIOS to allocate/reserve the memory for
> +the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR
> +values to locate the RMP and determine the size of the RMP. The RMP must
> +cover all of system memory in order for Linux to enable SEV-SNP.
> +
> +Segmented RMP
> +-------------
> +
> +Segmented RMP support is a new way of representing the layout of an RMP.
> +Initial RMP support required the RMP table to be contiguous in memory.
> +RMP accesses from a NUMA node on which the RMP doesn't reside
> +can take longer than accesses from a NUMA node on which the RMP resides.
> +Segmented RMP support allows the RMP entries to be located on the same
> +node as the memory the RMP is covering, potentially reducing latency
> +associated with accessing an RMP entry associated with the memory. Each
> +RMP segment covers a specific range of system physical addresses.
> +
> +Support for this form of the RMP can be determined using the CPUID
> +instruction::
> +
> + 0x8000001f[eax]:
> + Bit[23] indicates support for segmented RMP
> +
> +If supported, segmented RMP attributes can be found using the CPUID
> +instruction::
> +
> + 0x80000025[eax]:
> + Bits[5:0] minimum supported RMP segment size
> + Bits[11:6] maximum supported RMP segment size
> +
> + 0x80000025[ebx]:
> + Bits[9:0] number of cacheable RMP segment definitions
> + Bit[10] indicates if the number of cacheable RMP segments
> + is a hard limit
> +
> +To enable a segmented RMP, a new MSR is available::
This may be more appropriate:
To discover segmented RMP support, a new MSR is available::
> +
> + 0xc0010136 (RMP_CFG):
> + Bit[0] indicates if segmented RMP is enabled
> + Bits[13:8] contains the size of memory covered by an RMP
> + segment (expressed as a power of 2)
> +
> +The RMP segment size defined in the RMP_CFG MSR applies to all segments
> +of the RMP. Therefore each RMP segment covers a specific range of system
> +physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
> +the RMP segment coverage value is 0x24 => 36, meaning the size of memory
> +covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
> +covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
> +covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
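(Illustrative aside, not part of the patch: the segment-coverage arithmetic above, using the same 0x2401 example; the values are made up.)

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint64_t rmp_cfg   = 0x2401;                     /* bit 0 set, segment shift 0x24 */
        unsigned int shift = (rmp_cfg >> 8) & 0x3f;      /* Bits[13:8] => 36 */
        uint64_t seg_size  = 1ULL << shift;              /* 64GB covered per segment */

        uint64_t pa        = 0x1000000000ULL;            /* 0x10_0000_0000 */
        uint64_t index     = pa >> shift;                /* segment 1 */
        uint64_t seg_start = index << shift;             /* 0x10_0000_0000 */
        uint64_t seg_end   = seg_start + seg_size - 1;   /* 0x1F_FFFF_FFFF */

        printf("pa %#llx -> segment %llu [%#llx - %#llx]\n",
               (unsigned long long)pa, (unsigned long long)index,
               (unsigned long long)seg_start, (unsigned long long)seg_end);
        return 0;
}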
> +
> +When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
> +area as it does today (16K in size). However, instead of RMP entries
> +beginning immediately after the bookkeeping area, there is a 4K RMP
> +segment table (RST). Each entry in the RST is 8-bytes in size and represents
> +an RMP segment::
> +
> + Bits[19:0] mapped size (in GB)
> + The mapped size can be less than the defined segment size.
> + A value of zero indicates that no RMP exists for the range
> + of system physical addresses associated with this segment.
> + Bits[51:20] segment physical address
> + This address is left-shifted 20 bits (or just masked when
> + read) to form the physical address of the segment (1MB
> + alignment).
> +
> +The RST can hold 512 segment entries but can be limited in size to the number
> +of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable
> +RMP segments is a hard limit (CPUID 0x80000025_EBX[10]).
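(Illustrative aside, not part of the patch: decoding an RST entry per the field layout above; the entry value is made up.)

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /* Hypothetical entry: segment at 1MB-aligned PA 0x12300000, 64GB mapped */
        uint64_t rst_entry = 0x12300000ULL | 64;

        uint64_t mapped_gb = rst_entry & 0xFFFFFULL;               /* Bits[19:0] */
        uint64_t seg_pa    = rst_entry & 0x000FFFFFFFF00000ULL;    /* Bits[51:20], masked => PA */

        /* mapped_gb == 0 would mean no RMP exists for this segment's address range */
        printf("segment PA %#llx, mapped size %llu GB\n",
               (unsigned long long)seg_pa, (unsigned long long)mapped_gb);
        return 0;
}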
> +
> +The current Linux support relies on BIOS to allocate/reserve the memory for
> +the segmented RMP (the bookkeeping area, RST, and all segments), build the RST
> +and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR
> +values to locate the RMP and determine the size and location of the RMP
> +segments. The RMP must cover all of system memory in order for Linux to enable
> +SEV-SNP.
> +
> +More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table",
> +docID: 24593.
> +
> Secure VM Service Module (SVSM)
> ===============================
> +
> SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which
> defines four privilege levels at which guest software can run. The most
> privileged level is 0 and numerically higher numbers have lesser privileges.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table
2024-09-30 15:22 ` [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table Tom Lendacky
2024-10-18 6:32 ` Nikunj A. Dadhania
@ 2024-10-18 8:37 ` Neeraj Upadhyay
2024-10-18 15:06 ` Tom Lendacky
1 sibling, 1 reply; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 8:37 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
>
> @@ -196,7 +203,42 @@ static void __init __snp_fixup_e820_tables(u64 pa)
> void __init snp_fixup_e820_tables(void)
> {
> __snp_fixup_e820_tables(probed_rmp_base);
> - __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
> +
> + if (RMP_IS_SEGMENTED(rmp_cfg)) {
> + unsigned long size;
> + unsigned int i;
> + u64 pa, *rst;
> +
> + pa = probed_rmp_base;
> + pa += RMPTABLE_CPU_BOOKKEEPING_SZ;
> + pa += RMP_SEGMENT_TABLE_SIZE;
> + __snp_fixup_e820_tables(pa);
> +
> + pa -= RMP_SEGMENT_TABLE_SIZE;
> + rst = early_memremap(pa, RMP_SEGMENT_TABLE_SIZE);
> + if (!rst)
> + return;
> +
> + for (i = 0; i < rst_max_index; i++) {
> + pa = RST_ENTRY_SEGMENT_BASE(rst[i]);
> + size = RST_ENTRY_MAPPED_SIZE(rst[i]);
> + if (!size)
> + continue;
> +
> + __snp_fixup_e820_tables(pa);
> +
> + /* Mapped size in GB */
> + size *= (1UL << 30);
nit: size <<= 30 ?
> + if (size > rmp_segment_coverage_size)
> + size = rmp_segment_coverage_size;
> +
> + __snp_fixup_e820_tables(pa + size);
I might have understood this wrong, but is this call meant to fix up the
segmented RMP table end? If so, is the below required?
size = PHYS_PFN(size);
size <<= 4;
__snp_fixup_e820_tables(pa + size);
> + }
> +
> + early_memunmap(rst, RMP_SEGMENT_TABLE_SIZE);
> + } else {
> + __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
> + }
> }
>
...
> +static bool __init segmented_rmptable_setup(void)
> +{
> + u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max;
> + unsigned int i, max_index;
> +
> + if (!probed_rmp_base)
> + return false;
> +
> + if (!alloc_rmp_segment_table())
> + return false;
> +
> + /* Map the RMP Segment Table */
> + rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
> + rst = memremap(rst_pa, RMP_SEGMENT_TABLE_SIZE, MEMREMAP_WB);
> + if (!rst) {
> + pr_err("Failed to map RMP segment table addr %#llx\n", rst_pa);
> + goto e_free;
> + }
> +
> + /* Get the address for the end of system RAM */
> + ram_pa_max = max_pfn << PAGE_SHIFT;
> +
> + /* Process each RMP segment */
> + max_index = 0;
> + ram_pa_end = 0;
> + for (i = 0; i < rst_max_index; i++) {
> + u64 rmp_segment, rmp_size, mapped_size;
> +
> + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]);
> + if (!mapped_size)
> + continue;
> +
> + max_index = i;
> +
> + /* Mapped size in GB */
> + mapped_size *= (1ULL << 30);
nit: mapped_size <<= 30 ?
> + if (mapped_size > rmp_segment_coverage_size)
> + mapped_size = rmp_segment_coverage_size;
> +
> + rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]);
> +
> + rmp_size = PHYS_PFN(mapped_size);
> + rmp_size <<= 4;
> +
> + pa = (u64)i << rmp_segment_coverage_shift;
> +
> + /* Some segments may be for MMIO mapped above system RAM */
> + if (pa < ram_pa_max)
> + ram_pa_end = pa + mapped_size;
> +
> + if (!alloc_rmp_segment_desc(rmp_segment, rmp_size, pa))
> + goto e_unmap;
> +
> + pr_info("RMP segment %u physical address [%#llx - %#llx] covering [%#llx - %#llx]\n",
> + i, rmp_segment, rmp_segment + rmp_size - 1, pa, pa + mapped_size - 1);
> + }
> +
> + if (ram_pa_max > ram_pa_end) {
> + pr_err("Segmented RMP does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
> + ram_pa_max, ram_pa_end);
> + goto e_unmap;
> + }
> +
> + /* Adjust the maximum index based on the found segments */
> + rst_max_index = max_index + 1;
> +
> + memunmap(rst);
> +
> + return true;
> +
> +e_unmap:
> + memunmap(rst);
> +
> +e_free:
> + free_rmp_segment_table();
> +
> + return false;
> +}
> +
...
>
> +static bool probe_segmented_rmptable_info(void)
> +{
> + unsigned int eax, ebx, segment_shift, segment_shift_min, segment_shift_max;
> + u64 rmp_base, rmp_end;
> +
> + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
> + rdmsrl(MSR_AMD64_RMP_END, rmp_end);
> +
> + if (!(rmp_base & RMP_ADDR_MASK)) {
> + pr_err("Memory for the RMP table has not been reserved by BIOS\n");
> + return false;
> + }
> +
> + WARN_ONCE(rmp_end & RMP_ADDR_MASK,
> + "Segmented RMP enabled but RMP_END MSR is non-zero\n");
> +
> + /* Obtain the min and max supported RMP segment size */
> + eax = cpuid_eax(0x80000025);
> + segment_shift_min = eax & GENMASK(5, 0);
> + segment_shift_max = (eax & GENMASK(11, 6)) >> 6;
> +
> + /* Verify the segment size is within the supported limits */
> + segment_shift = MSR_AMD64_RMP_SEGMENT_SHIFT(rmp_cfg);
> + if (segment_shift > segment_shift_max || segment_shift < segment_shift_min) {
> + pr_err("RMP segment size (%u) is not within advertised bounds (min=%u, max=%u)\n",
> + segment_shift, segment_shift_min, segment_shift_max);
> + return false;
> + }
> +
> + /* Override the max supported RST index if a hardware limit exists */
> + ebx = cpuid_ebx(0x80000025);
> + if (ebx & BIT(10))
> + rst_max_index = ebx & GENMASK(9, 0);
> +
> + set_rmp_segment_info(segment_shift);
> +
> + probed_rmp_base = rmp_base;
> + probed_rmp_size = 0;
> +
> + pr_info("RMP segment table physical address [0x%016llx - 0x%016llx]\n",
> + rmp_base, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
> +
rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
- Neeraj
> + return true;
> +}
> +
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
` (2 preceding siblings ...)
2024-10-18 4:21 ` Neeraj Upadhyay
@ 2024-10-18 12:41 ` Borislav Petkov
2024-10-18 15:14 ` Tom Lendacky
3 siblings, 1 reply; 43+ messages in thread
From: Borislav Petkov @ 2024-10-18 12:41 UTC (permalink / raw)
To: Tom Lendacky
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On Mon, Sep 30, 2024 at 10:22:10AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 103a2dd6e81d..73d4f422829a 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -301,6 +301,17 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
> {
> struct rmpentry_raw *e;
>
> + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
> + int ret;
> +
> + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
> + : "=a" (ret)
> + : "a" (pfn << PAGE_SHIFT), "c" (entry)
> + : "memory", "cc");
> +
> + return ret;
> + }
I think this should be:
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 73d9295dd013..5500c5d64cba 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -303,12 +303,11 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
struct rmpentry_raw *e;
if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
- int ret;
+ int ret = pfn << PAGE_SHIFT;
asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
- : "=a" (ret)
- : "a" (pfn << PAGE_SHIFT), "c" (entry)
- : "memory", "cc");
+ : "+a" (ret), "+c" (entry)
+ :: "memory", "cc");
return ret;
}
because "The RCX register provides the effective address of a 16-byte data
structure into which the RMP state is written."
So your %rcx is both an input and an output operand and you need to do the "+"
thing here too for that.
Same for %rax.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-09-30 18:59 ` Tom Lendacky
@ 2024-10-18 13:06 ` Borislav Petkov
0 siblings, 0 replies; 43+ messages in thread
From: Borislav Petkov @ 2024-10-18 13:06 UTC (permalink / raw)
To: Tom Lendacky
Cc: Dave Hansen, linux-kernel, x86, Thomas Gleixner, Ingo Molnar,
Dave Hansen, Michael Roth, Ashish Kalra
On Mon, Sep 30, 2024 at 01:59:54PM -0500, Tom Lendacky wrote:
> The one issue we run into is that family 0x19 contains both Milan (zen3)
> and Genoa (zen4), so I'm not sure what to use as a good #define name. We
> have the same problem with family 0x17 which contains zen1 and zen2.
>
> I might be able to change the if statement to something like:
>
> if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
> (cpu_feature_enabled(X86_FEATURE_ZEN3) ||
> cpu_feature_enabled(X86_FEATURE_ZEN4) ||
> cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
> snp_probe_rmptable_info()) {
>
> which might make the intent clearer.
>
> But, yes, I get your point about making grepping much easier, along with
> code readability. Maybe Boris and I can put our heads together to figure
> something out.
Right, that's why I'm adding the synthetic feature flags - for things like
that.
I think it is very readable this way and if this check needs to be repeated,
we can carve it out into a separate helper or so...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h
2024-10-18 4:26 ` Neeraj Upadhyay
@ 2024-10-18 13:30 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 13:30 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/17/24 23:26, Neeraj Upadhyay wrote:
>
>
> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>> Limit usage of the non-architectural RMP format to Fam19h processors.
>> The RMPREAD instruction, with its architecture defined output, is
>> available, and should be used, for RMP access beyond Fam19h.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> arch/x86/kernel/cpu/amd.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 015971adadfc..ddbb6dd75fb2 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -358,7 +358,8 @@ static void bsp_determine_snp(struct cpuinfo_x86 *c)
>> * for which the RMP table entry format is currently defined for.
>> */
>> if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
>> - c->x86 >= 0x19 && snp_probe_rmptable_info()) {
>> + (c->x86 == 0x19 || cpu_feature_enabled(X86_FEATURE_RMPREAD)) &&
>
> Maybe update the comment above this if condition with information about RMPREAD FEAT?
Yep.
Thanks,
Tom
>
>
> - Neeraj
>
>> + snp_probe_rmptable_info()) {
>> cc_platform_set(CC_ATTR_HOST_SEV_SNP);
>> } else {
>> setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP)
2024-09-30 15:22 ` [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP) Tom Lendacky
2024-10-18 6:56 ` Nikunj A. Dadhania
@ 2024-10-18 13:31 ` Neeraj Upadhyay
1 sibling, 0 replies; 43+ messages in thread
From: Neeraj Upadhyay @ 2024-10-18 13:31 UTC (permalink / raw)
To: Tom Lendacky, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 9/30/2024 8:52 PM, Tom Lendacky wrote:
> Update the AMD memory encryption documentation to include information on
> the Reverse Map Table (RMP) and the two table formats.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range
2024-10-18 4:38 ` Neeraj Upadhyay
@ 2024-10-18 13:32 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 13:32 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/17/24 23:38, Neeraj Upadhyay wrote:
>
>
>
>> /*
>> * Do the necessary preparations which are verified by the firmware as
>> * described in the SNP_INIT_EX firmware command description in the SNP
>> @@ -205,12 +222,17 @@ static int __init snp_rmptable_init(void)
>> goto nosnp;
>> }
>>
>> - rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
>> + /* Map only the RMP entries */
>> + rmptable_start = memremap(probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ,
>> + probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ,
>> + MEMREMAP_WB);
>> if (!rmptable_start) {
>> pr_err("Failed to map RMP table\n");
>> goto nosnp;
>> }
>>
>> + rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
>> +
>
> Nit: Move this assignment above 'rmptable_start = memremap(...)', so that
> rmptable_size can be used there.
I like the symmetry of the base and size adjustment in the memremap(). To
me it makes it very obvious what is occurring.
Thanks,
Tom
>
>
> - Neeraj
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment
2024-10-18 5:59 ` Neeraj Upadhyay
@ 2024-10-18 13:56 ` Tom Lendacky
2024-10-18 14:42 ` Tom Lendacky
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 13:56 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 00:59, Neeraj Upadhyay wrote:
> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>> In preparation for support of a segmented RMP table, treat the contiguous
>> RMP table as a segmented RMP table with a single segment covering all
>> of memory. By treating a contiguous RMP table as a single segment, much
>> of the code that initializes and accesses the RMP can be re-used.
>>
>> Segmented RMP tables can have up to 512 segment entries. Each segment
>> will have metadata associated with it to identify the segment location,
>> the segment size, etc. The segment data and the physical address are used
>> to determine the index of the segment within the table and then the RMP
>> entry within the segment. For an actual segmented RMP table environment,
>> much of the segment information will come from a configuration MSR. For
>> the contiguous RMP, though, much of the information will be statically
>> defined.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> arch/x86/virt/svm/sev.c | 195 ++++++++++++++++++++++++++++++++++++----
>> 1 file changed, 176 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
>> index 81e21d833cf0..ebfb924652f8 100644
>> --- a/arch/x86/virt/svm/sev.c
>> +++ b/arch/x86/virt/svm/sev.c
>> @@ -18,6 +18,7 @@
>> #include <linux/cpumask.h>
>> #include <linux/iommu.h>
>> #include <linux/amd-iommu.h>
>> +#include <linux/nospec.h>
>>
>> #include <asm/sev.h>
>> #include <asm/processor.h>
>> @@ -74,12 +75,42 @@ struct rmpentry_raw {
>> */
>> #define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
>>
>> +/*
>> + * For a non-segmented RMP table, use the maximum physical addressing as the
>> + * segment size in order to always arrive at index 0 in the table.
>> + */
>> +#define RMPTABLE_NON_SEGMENTED_SHIFT 52
>> +
>> +struct rmp_segment_desc {
>> + struct rmpentry_raw *rmp_entry;
>> + u64 max_index;
>> + u64 size;
>> +};
>> +
>> +/*
>> + * Segmented RMP Table support.
>> + * - The segment size is used for two purposes:
>> + * - Identify the amount of memory covered by an RMP segment
>> + * - Quickly locate an RMP segment table entry for a physical address
>> + *
>> + * - The RMP segment table contains pointers to an RMP table that covers
>> + * a specific portion of memory. There can be up to 512 8-byte entries,
>> + * one page's worth.
>> + */
>> +static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
>> +static unsigned int rst_max_index __ro_after_init = 512;
>> +
>> +static u64 rmp_segment_size_max;
>> +static unsigned int rmp_segment_coverage_shift;
>> +static unsigned long rmp_segment_coverage_size;
>> +static unsigned long rmp_segment_coverage_mask;
>
> rmp_segment_size_max is of type u64 and rmp_segment_coverage_size is 1 << 52
> for single RMP segment. So, maybe use u64 for rmp_segment_coverage_size
> and rmp_segment_coverage_mask also?
This is 64-bit only code where unsigned long is the same size as u64 and
is typically preferred when dealing with numbers like this, which is why I
use that here. It does get a bit confusing because of the use of u64 and
unsigned long but I tried to keep things in sync between usages of the
same type as much as possible.
Thanks,
Tom
>
>
> - Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table
2024-10-18 6:32 ` Nikunj A. Dadhania
@ 2024-10-18 14:41 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 14:41 UTC (permalink / raw)
To: Nikunj A. Dadhania, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 01:32, Nikunj A. Dadhania wrote:
> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>> A segmented RMP table allows for improved locality of reference between
>> the memory protected by the RMP and the RMP entries themselves.
>>
>> Add support to detect and initialize a segmented RMP table with multiple
>> segments as configured by the system BIOS. While the RMPREAD instruction
>> will be used to read an RMP entry in a segmented RMP, initialization and
>> debugging capabilities will require the mapping of the segments.
>>
>> The RMP_CFG MSR indicates if segmented RMP support is enabled and, if
>> enabled, the amount of memory that an RMP segment covers. When segmented
>> RMP support is enabled, the RMP_BASE MSR points to the start of the RMP
>> bookkeeping area, which is 16K in size. The RMP Segment Table (RST) is
>> located immediately after the bookkeeping area and is 4K in size. The RST
>> contains up to 512 8-byte entries that identify the location of the RMP
>> segment and amount of memory mapped by the segment (which must be less
>> than or equal to the configured segment size). The physical address that
>> is covered by a segment is based on the segment size and the index of the
>> segment in the RST. The RMP entry for a physical address is based on the
>> offset within the segment.
>>
>> For example, if the segment size is 64GB (0x1000000000 or 1 << 36), then
>> physical address 0x9000800000 is RST entry 9 (0x9000800000 >> 36) and
>> RST entry 9 covers physical memory 0x9000000000 to 0x9FFFFFFFFF.
>>
>> The RMP entry index within the RMP segment is the physical address
>> AND-ed with the segment mask, 64GB - 1 (0xFFFFFFFFF), and then
>> right-shifted 12 bits or PHYS_PFN(0x9000800000 & 0xFFFFFFFFF), which
>> is 0x800.
>>
>> CPUID 0x80000025_EBX[9:0] describes the number of RMP segments that can
>> be cached by the hardware. Additionally, if CPUID 0x80000025_EBX[10] is
>> set, then the number of actual RMP segments defined cannot exceed the
>> number of RMP segments that can be cached and can be used as a maximum
>> RST index.
>
> In case EBX[10] is not set, we will need to iterate over all the 512 segment
> entries?
Correct.
>
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> arch/x86/include/asm/cpufeatures.h | 1 +
>> arch/x86/include/asm/msr-index.h | 9 +-
>> arch/x86/virt/svm/sev.c | 231 ++++++++++++++++++++++++++---
>> 3 files changed, 218 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 93620a4c5b15..417cdc636a12 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -448,6 +448,7 @@
>> #define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */
>> #define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
>> #define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
>> +#define X86_FEATURE_SEGMENTED_RMP (19*32+23) /* Segmented RMP support */
>> #define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
>>
>> /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 3ae84c3b8e6d..8b57c4d1098f 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -682,11 +682,14 @@
>> #define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
>> #define MSR_AMD64_SNP_RESV_BIT 18
>> #define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
>
>> -
>> -#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
>> -
>
> Moved accidentally?
No, just didn't want that value in the middle of all the SNP related MSRs.
Really, I should have moved it above to keep everything in numerical order.
>
>> #define MSR_AMD64_RMP_BASE 0xc0010132
>> #define MSR_AMD64_RMP_END 0xc0010133
>> +#define MSR_AMD64_RMP_CFG 0xc0010136
>> +#define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
>> +#define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
>> +#define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
>> +
>> +#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
>>
>> #define MSR_SVSM_CAA 0xc001f000
>>
>> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
>> index ebfb924652f8..2f83772d3daa 100644
>> --- a/arch/x86/virt/svm/sev.c
>> +++ b/arch/x86/virt/svm/sev.c
>> @@ -97,6 +97,10 @@ struct rmp_segment_desc {
>> * a specific portion of memory. There can be up to 512 8-byte entries,
>> * one page's worth.
>> */
>> +#define RST_ENTRY_MAPPED_SIZE(x) ((x) & GENMASK_ULL(19, 0))
>> +#define RST_ENTRY_SEGMENT_BASE(x) ((x) & GENMASK_ULL(51, 20))
>> +
>> +#define RMP_SEGMENT_TABLE_SIZE SZ_4K
>> static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
>> static unsigned int rst_max_index __ro_after_init = 512;
>>
>> @@ -107,6 +111,9 @@ static unsigned long rmp_segment_coverage_mask;
>> #define RST_ENTRY_INDEX(x) ((x) >> rmp_segment_coverage_shift)
>> #define RMP_ENTRY_INDEX(x) PHYS_PFN((x) & rmp_segment_coverage_mask)
>>
>> +static u64 rmp_cfg;
>> +#define RMP_IS_SEGMENTED(x) ((x) & MSR_AMD64_SEG_RMP_ENABLED)
>> +
>> /* Mask to apply to a PFN to get the first PFN of a 2MB page */
>> #define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
>>
>
>> @@ -196,7 +203,42 @@ static void __init __snp_fixup_e820_tables(u64 pa)
>
> <skipped the e820 bits>
>
>> @@ -302,24 +344,12 @@ static bool __init alloc_rmp_segment_table(void)
>> return true;
>> }
>>
>> -/*
>> - * Do the necessary preparations which are verified by the firmware as
>> - * described in the SNP_INIT_EX firmware command description in the SNP
>> - * firmware ABI spec.
>> - */
>> -static int __init snp_rmptable_init(void)
>> +static bool __init contiguous_rmptable_setup(void)
>> {
>> - u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end, val;
>> - unsigned int i;
>> -
>> - if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>> - return 0;
>> -
>> - if (!amd_iommu_snp_en)
>> - goto nosnp;
>> + u64 max_rmp_pfn, calc_rmp_sz, rmptable_segment, rmptable_size, rmp_end;
>>
>> if (!probed_rmp_size)
>> - goto nosnp;
>> + return false;
>>
>> rmp_end = probed_rmp_base + probed_rmp_size - 1;
>>
>
> If you don't mind, please fold the below comment update into contiguous_rmptable_setup(),
> found during review. If required, I can send a separate patch.
Looks like there will be a v4, so I'll update it.
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 2f83772d3daa..d5a9f8164672 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -354,7 +354,7 @@ static bool __init contiguous_rmptable_setup(void)
> rmp_end = probed_rmp_base + probed_rmp_size - 1;
>
> /*
> - * Calculate the amount the memory that must be reserved by the BIOS to
> + * Calculate the amount of memory that must be reserved by the BIOS to
> * address the whole RAM, including the bookkeeping area. The RMP itself
> * must also be covered.
> */
>
>
>> @@ -336,11 +366,11 @@ static int __init snp_rmptable_init(void)
>> if (calc_rmp_sz > probed_rmp_size) {
>> pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
>> calc_rmp_sz, probed_rmp_size);
>> - goto nosnp;
>> + return false;
>> }
>>
>> if (!alloc_rmp_segment_table())
>> - goto nosnp;
>> + return false;
>>
>> /* Map only the RMP entries */
>> rmptable_segment = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
>> @@ -348,9 +378,116 @@ static int __init snp_rmptable_init(void)
>>
>> if (!alloc_rmp_segment_desc(rmptable_segment, rmptable_size, 0)) {
>> free_rmp_segment_table();
>> - goto nosnp;
>> + return false;
>> }
>>
>> + return true;
>> +}
>> +
>> +static bool __init segmented_rmptable_setup(void)
>> +{
>> + u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max;
>> + unsigned int i, max_index;
>> +
>> + if (!probed_rmp_base)
>> + return false;
>> +
>> + if (!alloc_rmp_segment_table())
>> + return false;
>> +
>> + /* Map the RMP Segment Table */
>> + rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
>> + rst = memremap(rst_pa, RMP_SEGMENT_TABLE_SIZE, MEMREMAP_WB);
>> + if (!rst) {
>> + pr_err("Failed to map RMP segment table addr %#llx\n", rst_pa);
>> + goto e_free;
>> + }
>> +
>> + /* Get the address for the end of system RAM */
>> + ram_pa_max = max_pfn << PAGE_SHIFT;
>> +
>> + /* Process each RMP segment */
>> + max_index = 0;
>> + ram_pa_end = 0;
>> + for (i = 0; i < rst_max_index; i++) {
>> + u64 rmp_segment, rmp_size, mapped_size;
>> +
>> + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]);
>> + if (!mapped_size)
>> + continue;
>> +
>> + max_index = i;
>> +
>> + /* Mapped size in GB */
>> + mapped_size *= (1ULL << 30);
>> + if (mapped_size > rmp_segment_coverage_size)
>> + mapped_size = rmp_segment_coverage_size;
>
> This seems to be an error in BIOS RST programming; a print/warning
> would probably help during debugging.
The segmented RMP support allows for this, but, yeah, should probably
print a message when it occurs.
>
>> +
>> + rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]);
>> +
>> + rmp_size = PHYS_PFN(mapped_size);
>> + rmp_size <<= 4;
>
> A comment above this will help, as you are calculating 16 bytes/page.
Sure.
>
>> + pa = (u64)i << rmp_segment_coverage_shift;
>> +
>> + /* Some segments may be for MMIO mapped above system RAM */
>
> Why will the RST have MMIO-mapped entries?
Trusted I/O. I'll add that to the comment.
Thanks,
Tom
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment
2024-10-18 13:56 ` Tom Lendacky
@ 2024-10-18 14:42 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 14:42 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 08:56, Tom Lendacky wrote:
> On 10/18/24 00:59, Neeraj Upadhyay wrote:
>> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>>> In preparation for support of a segmented RMP table, treat the contiguous
>>> RMP table as a segmented RMP table with a single segment covering all
>>> of memory. By treating a contiguous RMP table as a single segment, much
>>> of the code that initializes and accesses the RMP can be re-used.
>>>
>>> Segmented RMP tables can have up to 512 segment entries. Each segment
>>> will have metadata associated with it to identify the segment location,
>>> the segment size, etc. The segment data and the physical address are used
>>> to determine the index of the segment within the table and then the RMP
>>> entry within the segment. For an actual segmented RMP table environment,
>>> much of the segment information will come from a configuration MSR. For
>>> the contiguous RMP, though, much of the information will be statically
>>> defined.
>>>
>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>>> ---
>>> arch/x86/virt/svm/sev.c | 195 ++++++++++++++++++++++++++++++++++++----
>>> 1 file changed, 176 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
>>> index 81e21d833cf0..ebfb924652f8 100644
>>> --- a/arch/x86/virt/svm/sev.c
>>> +++ b/arch/x86/virt/svm/sev.c
>>> @@ -18,6 +18,7 @@
>>> #include <linux/cpumask.h>
>>> #include <linux/iommu.h>
>>> #include <linux/amd-iommu.h>
>>> +#include <linux/nospec.h>
>>>
>>> #include <asm/sev.h>
>>> #include <asm/processor.h>
>>> @@ -74,12 +75,42 @@ struct rmpentry_raw {
>>> */
>>> #define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
>>>
>>> +/*
>>> + * For a non-segmented RMP table, use the maximum physical addressing as the
>>> + * segment size in order to always arrive at index 0 in the table.
>>> + */
>>> +#define RMPTABLE_NON_SEGMENTED_SHIFT 52
>>> +
>>> +struct rmp_segment_desc {
>>> + struct rmpentry_raw *rmp_entry;
>>> + u64 max_index;
>>> + u64 size;
>>> +};
>>> +
>>> +/*
>>> + * Segmented RMP Table support.
>>> + * - The segment size is used for two purposes:
>>> + * - Identify the amount of memory covered by an RMP segment
>>> + * - Quickly locate an RMP segment table entry for a physical address
>>> + *
>>> + * - The RMP segment table contains pointers to an RMP table that covers
>>> + * a specific portion of memory. There can be up to 512 8-byte entries,
>>> + * one page's worth.
>>> + */
>>> +static struct rmp_segment_desc **rmp_segment_table __ro_after_init;
>>> +static unsigned int rst_max_index __ro_after_init = 512;
>>> +
>>> +static u64 rmp_segment_size_max;
>>> +static unsigned int rmp_segment_coverage_shift;
>>> +static unsigned long rmp_segment_coverage_size;
>>> +static unsigned long rmp_segment_coverage_mask;
>>
>> rmp_segment_size_max is of type u64 and rmp_segment_coverage_size is 1 << 52
>> for single RMP segment. So, maybe use u64 for rmp_segment_coverage_size
>> and rmp_segment_coverage_mask also?
>
> This is 64-bit only code where unsigned long is the same size as u64 and
> is typically preferred when dealing with numbers like this, which is why I
> use that here. It does get a bit confusing because of the use of u64 and
> unsigned long but I tried to keep things in sync between usages of the
> same type as much as possible.
But let me see what everything looks like if I unify all the fields to u64...
Thanks,
Tom
>
> Thanks,
> Tom
>
>>
>>
>> - Neeraj
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP)
2024-10-18 6:56 ` Nikunj A. Dadhania
@ 2024-10-18 14:48 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 14:48 UTC (permalink / raw)
To: Nikunj A. Dadhania, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 01:56, Nikunj A. Dadhania wrote:
> On 9/30/2024 8:52 PM, Tom Lendacky wrote:
>> Update the AMD memory encryption documentation to include information on
>> the Reverse Map Table (RMP) and the two table formats.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> .../arch/x86/amd-memory-encryption.rst | 118 ++++++++++++++++++
>> 1 file changed, 118 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
>> index 6df3264f23b9..bd840df708ea 100644
>> --- a/Documentation/arch/x86/amd-memory-encryption.rst
>> +++ b/Documentation/arch/x86/amd-memory-encryption.rst
>> @@ -130,8 +130,126 @@ SNP feature support.
>>
>> More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR
>>
>> +Reverse Map Table (RMP)
>> +=======================
>> +
>> +The RMP is a structure in system memory that is used to ensure a one-to-one
>> +mapping between system physical addresses and guest physical addresses. Each
>> +page of memory that is potentially assignable to guests has one entry within
>> +the RMP.
>> +
>> +The RMP table can be either contiguous in memory or a collection of segments
>> +in memory.
>> +
>> +Contiguous RMP
>> +--------------
>> +
>> +Support for this form of the RMP is present when support for SEV-SNP is
>> +present, which can be determined using the CPUID instruction::
>> +
>> + 0x8000001f[eax]:
>> + Bit[4] indicates support for SEV-SNP
>> +
>> +The location of the RMP is identified to the hardware through two MSRs::
>> +
>> + 0xc0010132 (RMP_BASE):
>> + System physical address of the first byte of the RMP
>> +
>> + 0xc0010133 (RMP_END):
>> + System physical address of the last byte of the RMP
>> +
>> +Hardware requires that RMP_BASE and (RMP_END + 1) be 8KB aligned, but SEV
>> +firmware increases the alignment requirement to 1MB.
>> +
>> +The RMP consists of a 16KB region used for processor bookkeeping followed
>> +by the RMP entries, which are 16 bytes in size. The size of the RMP
>> +determines the range of physical memory that the hypervisor can assign to
>> +SEV-SNP guests. The RMP covers the system physical address from::
>> +
>> + 0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB.
>> +
>> +The current Linux support relies on BIOS to allocate/reserve the memory for
>> +the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR
>> +values to locate the RMP and determine the size of the RMP. The RMP must
>> +cover all of system memory in order for Linux to enable SEV-SNP.
>> +
>> +Segmented RMP
>> +-------------
>> +
>> +Segmented RMP support is a new way of representing the layout of an RMP.
>> +Initial RMP support required the RMP table to be contiguous in memory.
>> +RMP accesses from a NUMA node on which the RMP doesn't reside
>> +can take longer than accesses from a NUMA node on which the RMP resides.
>> +Segmented RMP support allows the RMP entries to be located on the same
>> +node as the memory the RMP is covering, potentially reducing latency
>> +associated with accessing an RMP entry associated with the memory. Each
>> +RMP segment covers a specific range of system physical addresses.
>> +
>> +Support for this form of the RMP can be determined using the CPUID
>> +instruction::
>> +
>> + 0x8000001f[eax]:
>> + Bit[23] indicates support for segmented RMP
>> +
>> +If supported, segmented RMP attributes can be found using the CPUID
>> +instruction::
>> +
>> + 0x80000025[eax]:
>> + Bits[5:0] minimum supported RMP segment size
>> + Bits[11:6] maximum supported RMP segment size
>> +
>> + 0x80000025[ebx]:
>> + Bits[9:0] number of cacheable RMP segment definitions
>> + Bit[10] indicates if the number of cacheable RMP segments
>> + is a hard limit
>> +
>> +To enable a segmented RMP, a new MSR is available::
>
> This may be more appropriate:
>
> To discover segmented RMP support, a new MSR is available::
Not really. You discover the ability to use segmented RMP (and the
availability of the MSR) through CPUID and then enable it through the MSR.
It's just that Linux relies on BIOS to set everything up and then we look
at the MSR to see if BIOS built a segmented RMP (I allude to that a few
paragraphs below).
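A minimal sketch of that flow (not the actual patch code; it assumes the CPUID bit and MSR layout described in the documentation, and the header list is approximate):

#include <linux/bits.h>
#include <asm/msr.h>
#include <asm/processor.h>

static bool segmented_rmp_in_use(void)
{
        u64 rmp_cfg;

        /* CPUID 0x8000001f EAX bit 23: the CPU supports a segmented RMP */
        if (!(cpuid_eax(0x8000001f) & BIT(23)))
                return false;

        /* RMP_CFG bit 0: BIOS actually built and enabled a segmented RMP */
        rdmsrl(MSR_AMD64_RMP_CFG, rmp_cfg);

        return rmp_cfg & BIT_ULL(0);
}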
Thanks,
Tom
>
>> +
>> + 0xc0010136 (RMP_CFG):
>> + Bit[0] indicates if segmented RMP is enabled
>> + Bits[13:8] contains the size of memory covered by an RMP
>> + segment (expressed as a power of 2)
>> +
>> +The RMP segment size defined in the RMP_CFG MSR applies to all segments
>> +of the RMP. Therefore each RMP segment covers a specific range of system
>> +physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
>> +the RMP segment coverage value is 0x24 => 36, meaning the size of memory
>> +covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
>> +covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
>> +covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
>> +
>> +When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
>> +area as it does today (16K in size). However, instead of RMP entries
>> +beginning immediately after the bookkeeping area, there is a 4K RMP
>> +segment table (RST). Each entry in the RST is 8-bytes in size and represents
>> +an RMP segment::
>> +
>> + Bits[19:0] mapped size (in GB)
>> + The mapped size can be less than the defined segment size.
>> + A value of zero indicates that no RMP exists for the range
>> + of system physical addresses associated with this segment.
>> + Bits[51:20] segment physical address
>> + This address is left-shifted 20 bits (or just masked when
>> + read) to form the physical address of the segment (1MB
>> + alignment).
>> +
>> +The RST can hold 512 segment entries but can be limited in size to the number
>> +of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable
>> +RMP segments is a hard limit (CPUID 0x80000025_EBX[10]).
>> +
>> +The current Linux support relies on BIOS to allocate/reserve the memory for
>> +the segmented RMP (the bookkeeping area, RST, and all segments), build the RST
>> +and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR
>> +values to locate the RMP and determine the size and location of the RMP
>> +segments. The RMP must cover all of system memory in order for Linux to enable
>> +SEV-SNP.
>> +
>> +More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table",
>> +docID: 24593.
>> +
>> Secure VM Service Module (SVSM)
>> ===============================
>> +
>> SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which
>> defines four privilege levels at which guest software can run. The most
>> privileged level is 0 and numerically higher numbers have lesser privileges.
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table
2024-10-18 8:37 ` Neeraj Upadhyay
@ 2024-10-18 15:06 ` Tom Lendacky
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 15:06 UTC (permalink / raw)
To: Neeraj Upadhyay, linux-kernel, x86
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 03:37, Neeraj Upadhyay wrote:
>
>
>>
>> @@ -196,7 +203,42 @@ static void __init __snp_fixup_e820_tables(u64 pa)
>> void __init snp_fixup_e820_tables(void)
>> {
>> __snp_fixup_e820_tables(probed_rmp_base);
>> - __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
>> +
>> + if (RMP_IS_SEGMENTED(rmp_cfg)) {
>> + unsigned long size;
>> + unsigned int i;
>> + u64 pa, *rst;
>> +
>> + pa = probed_rmp_base;
>> + pa += RMPTABLE_CPU_BOOKKEEPING_SZ;
>> + pa += RMP_SEGMENT_TABLE_SIZE;
>> + __snp_fixup_e820_tables(pa);
>> +
>> + pa -= RMP_SEGMENT_TABLE_SIZE;
>> + rst = early_memremap(pa, RMP_SEGMENT_TABLE_SIZE);
>> + if (!rst)
>> + return;
>> +
>> + for (i = 0; i < rst_max_index; i++) {
>> + pa = RST_ENTRY_SEGMENT_BASE(rst[i]);
>> + size = RST_ENTRY_MAPPED_SIZE(rst[i]);
>> + if (!size)
>> + continue;
>> +
>> + __snp_fixup_e820_tables(pa);
>> +
>> + /* Mapped size in GB */
>> + size *= (1UL << 30);
>
> nit: size <<= 30 ?
Yeah, might be clearer.
>
>> + if (size > rmp_segment_coverage_size)
>> + size = rmp_segment_coverage_size;
>> +
>> + __snp_fixup_e820_tables(pa + size);
>
> I might have understood this wrong, but is this call meant to fix up the
> segmented RMP table end? If so, is the below required?
>
> size = PHYS_PFN(size);
> size <<= 4;
> __snp_fixup_e820_tables(pa + size);
Good catch. Yes, it is supposed to be checking the end of the RMP segment,
which should be based on the number of entries and not the mapped size.
>
>> + }
>> +
>> + early_memunmap(rst, RMP_SEGMENT_TABLE_SIZE);
>> + } else {
>> + __snp_fixup_e820_tables(probed_rmp_base + probed_rmp_size);
>> + }
>> }
>>
>
> ...
>
>> +static bool __init segmented_rmptable_setup(void)
>> +{
>> + u64 rst_pa, *rst, pa, ram_pa_end, ram_pa_max;
>> + unsigned int i, max_index;
>> +
>> + if (!probed_rmp_base)
>> + return false;
>> +
>> + if (!alloc_rmp_segment_table())
>> + return false;
>> +
>> + /* Map the RMP Segment Table */
>> + rst_pa = probed_rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ;
>> + rst = memremap(rst_pa, RMP_SEGMENT_TABLE_SIZE, MEMREMAP_WB);
>> + if (!rst) {
>> + pr_err("Failed to map RMP segment table addr %#llx\n", rst_pa);
>> + goto e_free;
>> + }
>> +
>> + /* Get the address for the end of system RAM */
>> + ram_pa_max = max_pfn << PAGE_SHIFT;
>> +
>> + /* Process each RMP segment */
>> + max_index = 0;
>> + ram_pa_end = 0;
>> + for (i = 0; i < rst_max_index; i++) {
>> + u64 rmp_segment, rmp_size, mapped_size;
>> +
>> + mapped_size = RST_ENTRY_MAPPED_SIZE(rst[i]);
>> + if (!mapped_size)
>> + continue;
>> +
>> + max_index = i;
>> +
>> + /* Mapped size in GB */
>> + mapped_size *= (1ULL << 30);
>
> nit: mapped_size <<= 30 ?
Ditto.
>
>> + if (mapped_size > rmp_segment_coverage_size)
>> + mapped_size = rmp_segment_coverage_size;
>> +
>> + rmp_segment = RST_ENTRY_SEGMENT_BASE(rst[i]);
>> +
>> + rmp_size = PHYS_PFN(mapped_size);
>> + rmp_size <<= 4;
>> +
>> + pa = (u64)i << rmp_segment_coverage_shift;
>> +
>> + /* Some segments may be for MMIO mapped above system RAM */
>> + if (pa < ram_pa_max)
>> + ram_pa_end = pa + mapped_size;
>> +
>> + if (!alloc_rmp_segment_desc(rmp_segment, rmp_size, pa))
>> + goto e_unmap;
>> +
>> + pr_info("RMP segment %u physical address [%#llx - %#llx] covering [%#llx - %#llx]\n",
>> + i, rmp_segment, rmp_segment + rmp_size - 1, pa, pa + mapped_size - 1);
>> + }
>> +
>> + if (ram_pa_max > ram_pa_end) {
>> + pr_err("Segmented RMP does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
>> + ram_pa_max, ram_pa_end);
>> + goto e_unmap;
>> + }
>> +
>> + /* Adjust the maximum index based on the found segments */
>> + rst_max_index = max_index + 1;
>> +
>> + memunmap(rst);
>> +
>> + return true;
>> +
>> +e_unmap:
>> + memunmap(rst);
>> +
>> +e_free:
>> + free_rmp_segment_table();
>> +
>> + return false;
>> +}
>> +
>
> ...
>
>>
>> +static bool probe_segmented_rmptable_info(void)
>> +{
>> + unsigned int eax, ebx, segment_shift, segment_shift_min, segment_shift_max;
>> + u64 rmp_base, rmp_end;
>> +
>> + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
>> + rdmsrl(MSR_AMD64_RMP_END, rmp_end);
>> +
>> + if (!(rmp_base & RMP_ADDR_MASK)) {
>> + pr_err("Memory for the RMP table has not been reserved by BIOS\n");
>> + return false;
>> + }
>> +
>> + WARN_ONCE(rmp_end & RMP_ADDR_MASK,
>> + "Segmented RMP enabled but RMP_END MSR is non-zero\n");
>> +
>> + /* Obtain the min and max supported RMP segment size */
>> + eax = cpuid_eax(0x80000025);
>> + segment_shift_min = eax & GENMASK(5, 0);
>> + segment_shift_max = (eax & GENMASK(11, 6)) >> 6;
>> +
>> + /* Verify the segment size is within the supported limits */
>> + segment_shift = MSR_AMD64_RMP_SEGMENT_SHIFT(rmp_cfg);
>> + if (segment_shift > segment_shift_max || segment_shift < segment_shift_min) {
>> + pr_err("RMP segment size (%u) is not within advertised bounds (min=%u, max=%u)\n",
>> + segment_shift, segment_shift_min, segment_shift_max);
>> + return false;
>> + }
>> +
>> + /* Override the max supported RST index if a hardware limit exists */
>> + ebx = cpuid_ebx(0x80000025);
>> + if (ebx & BIT(10))
>> + rst_max_index = ebx & GENMASK(9, 0);
>> +
>> + set_rmp_segment_info(segment_shift);
>> +
>> + probed_rmp_base = rmp_base;
>> + probed_rmp_size = 0;
>> +
>> + pr_info("RMP segment table physical address [0x%016llx - 0x%016llx]\n",
>> + rmp_base, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
>> +
>
> rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ, rmp_base + RMPTABLE_CPU_BOOKKEEPING_SZ + RMP_SEGMENT_TABLE_SIZE);
I really want the full range printed, which includes the bookkeeping area.
So maybe the text could be clearer; let me think about that.
Thanks,
Tom
>
>
> - Neeraj
>
>> + return true;
>> +}
>> +
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-10-18 12:41 ` Borislav Petkov
@ 2024-10-18 15:14 ` Tom Lendacky
2024-10-21 15:41 ` Borislav Petkov
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lendacky @ 2024-10-18 15:14 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/18/24 07:41, Borislav Petkov wrote:
> On Mon, Sep 30, 2024 at 10:22:10AM -0500, Tom Lendacky wrote:
>> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
>> index 103a2dd6e81d..73d4f422829a 100644
>> --- a/arch/x86/virt/svm/sev.c
>> +++ b/arch/x86/virt/svm/sev.c
>> @@ -301,6 +301,17 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
>> {
>> struct rmpentry_raw *e;
>>
>> + if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
>> + int ret;
>> +
>> + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
>> + : "=a" (ret)
>> + : "a" (pfn << PAGE_SHIFT), "c" (entry)
>> + : "memory", "cc");
>> +
>> + return ret;
>> + }
>
> I think this should be:
>
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 73d9295dd013..5500c5d64cba 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -303,12 +303,11 @@ static int get_rmpentry(u64 pfn, struct rmpentry *entry)
> struct rmpentry_raw *e;
>
> if (cpu_feature_enabled(X86_FEATURE_RMPREAD)) {
> - int ret;
> + int ret = pfn << PAGE_SHIFT;
>
> asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfd"
> - : "=a" (ret)
> - : "a" (pfn << PAGE_SHIFT), "c" (entry)
> - : "memory", "cc");
> + : "+a" (ret), "+c" (entry)
> + :: "memory", "cc");
>
> return ret;
> }
>
> because "The RCX register provides the effective address of a 16-byte data
> structure into which the RMP state is written."
>
> So your %rcx is both an input and an output operand and you need to do the "+"
> thing here too for that.
I don't think so. RCX does not change on output; the contents that RCX
points to change, but the register value does not, so the "+" is not
correct. The instruction doesn't take a memory location as part of its
operands (like a MOV instruction could), which is why the "memory" clobber
is specified.
>
> Same for %rax.
For RAX, yes, if I set "ret" to the input value then I can use "+"
specification. But the way it's coded now is also correct.
Thanks,
Tom
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-10-18 15:14 ` Tom Lendacky
@ 2024-10-21 15:41 ` Borislav Petkov
2024-10-21 17:10 ` Tom Lendacky
0 siblings, 1 reply; 43+ messages in thread
From: Borislav Petkov @ 2024-10-21 15:41 UTC (permalink / raw)
To: Tom Lendacky
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On Fri, Oct 18, 2024 at 10:14:04AM -0500, Tom Lendacky wrote:
> I don't think so. RCX does not change on output; the contents that RCX
> points to change, but the register value does not, so the "+" is not
> correct. The instruction doesn't take a memory location as part of its
> operands (like a MOV instruction could), which is why the "memory" clobber
> is specified.
Just confirmed it with my compiler guy: yes, you're right. The rule is this:
*if* RCX itself doesn't change but the memory it points to does change, then you
need the "memory" clobber. Otherwise the compiler can reorder accesses.
> For RAX, yes, if I set "ret" to the input value then I can use "+"
> specification. But the way it's coded now is also correct.
If you set ret, it means a smaller and simpler inline asm which is always
better.
:-)
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-10-21 15:41 ` Borislav Petkov
@ 2024-10-21 17:10 ` Tom Lendacky
2024-10-21 17:49 ` Borislav Petkov
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lendacky @ 2024-10-21 17:10 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On 10/21/24 10:41, Borislav Petkov wrote:
> On Fri, Oct 18, 2024 at 10:14:04AM -0500, Tom Lendacky wrote:
>> I don't think so. RCX does not change on output, the contents that RCX
>> points to changes, but the register value does not so the "+" is not
>> correct. The instruction doesn't take a memory location as part of
>> operands (like a MOV instruction could), which is why the "memory" clobber
>> is specified.
>
> Just confirmed it with my compiler guy: yes, you're right. The rule is this:
> *if* RCX itself doesn't change but the memory it points to does change, then you
> need the "memory" clobber. Otherwise the compiler can reorder accesses.
>
>> For RAX, yes, if I set "ret" to the input value then I can use "+"
>> specification. But the way it's coded now is also correct.
>
> If you set ret, it means a smaller and simpler inline asm which is always
> better.
The input value is a 64-bit value and on output the return code is in
EAX, a 32-bit value. So the use of the "=a" (ret) for output and "a"
(pfn << PAGE_SHIFT) for input is more accurate.
It's not a complicated statement and is much clearer to me.
I can change it if you really want the "+a" thing (including changing
the ret variable to a u64), but would prefer not to do that.
Thanks,
Tom
>
> :-)
>
> Thx.
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction
2024-10-21 17:10 ` Tom Lendacky
@ 2024-10-21 17:49 ` Borislav Petkov
0 siblings, 0 replies; 43+ messages in thread
From: Borislav Petkov @ 2024-10-21 17:49 UTC (permalink / raw)
To: Tom Lendacky
Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Michael Roth, Ashish Kalra
On Mon, Oct 21, 2024 at 12:10:55PM -0500, Tom Lendacky wrote:
> The input value is a 64-bit value and on output the return code is in
> EAX, a 32-bit value. So the use of the "=a" (ret) for output and "a"
> (pfn << PAGE_SHIFT) for input is more accurate.
Oh, they differ in width. Ok.
> It's not a complicated statement and is much clearer to me.
>
> I can change it if you really want the "+a" thing (including changing
> the ret variable to a u64), but would prefer not to do that.
Nah, it'll get uglier if you do. Let's keep it this way.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2024-10-21 17:49 UTC | newest]
Thread overview: 43+ messages
2024-09-30 15:22 [PATCH v3 0/8] Provide support for RMPREAD and a segmented RMP Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 1/8] x86/sev: Prepare for using the RMPREAD instruction to access the RMP Tom Lendacky
2024-10-16 8:52 ` Nikunj A. Dadhania
2024-10-16 14:43 ` Tom Lendacky
2024-10-17 5:24 ` Nikunj A. Dadhania
2024-10-16 15:01 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 2/8] x86/sev: Add support for the RMPREAD instruction Tom Lendacky
2024-10-16 10:46 ` Nikunj A. Dadhania
2024-10-17 15:26 ` Borislav Petkov
2024-10-17 16:24 ` Tom Lendacky
2024-10-18 4:21 ` Neeraj Upadhyay
2024-10-18 12:41 ` Borislav Petkov
2024-10-18 15:14 ` Tom Lendacky
2024-10-21 15:41 ` Borislav Petkov
2024-10-21 17:10 ` Tom Lendacky
2024-10-21 17:49 ` Borislav Petkov
2024-09-30 15:22 ` [PATCH v3 3/8] x86/sev: Require the RMPREAD instruction after Fam19h Tom Lendacky
2024-09-30 17:03 ` Dave Hansen
2024-09-30 18:59 ` Tom Lendacky
2024-10-18 13:06 ` Borislav Petkov
2024-10-18 4:26 ` Neeraj Upadhyay
2024-10-18 13:30 ` Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 4/8] x86/sev: Move the SNP probe routine out of the way Tom Lendacky
2024-10-16 11:05 ` Nikunj A. Dadhania
2024-10-18 4:28 ` Neeraj Upadhyay
2024-09-30 15:22 ` [PATCH v3 5/8] x86/sev: Map only the RMP table entries instead of the full RMP range Tom Lendacky
2024-10-16 11:25 ` [sos-linux-ext-patches] " Nikunj A. Dadhania
2024-10-18 4:38 ` Neeraj Upadhyay
2024-10-18 13:32 ` Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 6/8] x86/sev: Treat the contiguous RMP table as a single RMP segment Tom Lendacky
2024-10-17 11:05 ` Nikunj A. Dadhania
2024-10-18 5:59 ` Neeraj Upadhyay
2024-10-18 13:56 ` Tom Lendacky
2024-10-18 14:42 ` Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 7/8] x86/sev: Add full support for a segmented RMP table Tom Lendacky
2024-10-18 6:32 ` Nikunj A. Dadhania
2024-10-18 14:41 ` Tom Lendacky
2024-10-18 8:37 ` Neeraj Upadhyay
2024-10-18 15:06 ` Tom Lendacky
2024-09-30 15:22 ` [PATCH v3 8/8] x86/sev/docs: Document the SNP Reverse Map Table (RMP) Tom Lendacky
2024-10-18 6:56 ` Nikunj A. Dadhania
2024-10-18 14:48 ` Tom Lendacky
2024-10-18 13:31 ` Neeraj Upadhyay