* [PATCH v5 00/22] x86: strict separation of startup code
@ 2025-07-16 3:18 Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback Ard Biesheuvel
` (22 more replies)
0 siblings, 23 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
This series implements a strict separation between startup code and
ordinary code, where startup code is built in a way that tolerates being
invoked from the initial 1:1 mapping of memory.
The existing approach of emitting this code into .head.text and checking
for absolute relocations in that section is not 100% safe, and produces
diagnostics that are sometimes difficult to interpret. [0]
Instead, rely on symbol prefixes, similar to how this is implemented for
the EFI stub and for the startup code in the arm64 port. This ensures
that startup code can only call other startup code, unless a special
symbol alias is emitted that exposes a non-startup routine to the
startup code.
This is somewhat intrusive, as there are many data objects that are
referenced both by startup code and by ordinary code, and an alias needs
to be emitted for each of those. If startup code references anything
that has not been made available to it explicitly, a build time link
error will occur.
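For illustration, a minimal sketch of how such an alias could be emitted,
assuming a "__pi_" symbol prefix as in the arm64 port (the exact macro name
used by the series may differ):

	/*
	 * Illustrative only: expose a non-startup symbol to startup code
	 * under a reserved prefix. A startup-code reference to a symbol
	 * that has no such alias then fails at link time instead of going
	 * unnoticed.
	 */
	#define PIC_ALIAS(sym) \
		asm(".globl	__pi_" #sym "\n\t" \
		    ".set	__pi_" #sym ", " #sym)

	PIC_ALIAS(sev_status);	/* startup code may reference __pi_sev_status */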
This ultimately allows the .head.text section to be dropped entirely, as
it no longer has any special significance. Instead, code that only
executes at boot is emitted into .init.text, as it should be.
The majority of the changes are to the early SEV code. The main issue is
that it uses GHCB pages and SVSM calling areas in code that may run from
both the 1:1 mapping and the kernel virtual mapping, which is problematic
because it relies on __pa() to perform VA to PA translations, and those
are ambiguous in this context. Also, __pa() pulls in non-trivial
instrumented code when CONFIG_DEBUG_VIRTUAL=y, so it is better to avoid
VA to PA translations in the startup code altogether.
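To illustrate the ambiguity (startup_pa() is hypothetical; rip_rel_ptr() is
an existing helper):

	/*
	 * A statically allocated object has two virtual addresses during
	 * early boot - its 1:1 address and its kernel virtual address -
	 * and __pa() can only be correct for one of them. While executing
	 * from the 1:1 mapping, the RIP-relative address of the object is
	 * also its physical address, so no translation is needed at all:
	 */
	static inline unsigned long startup_pa(void *obj)
	{
		/* only valid while running from the 1:1 mapping: VA == PA */
		return (unsigned long)rip_rel_ptr(obj);
	}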
Changes since v4:
- Incorporate feedback from Tom, and add a couple of RBs
- Drop patch that moved the MSR save/restore out of the early page state
change helper - this is less efficient but likely negligible in
practice
- Drop patch that unified the SEV-SNP hypervisor feature check, which
was identified by Nikunj as the one breaking SEV-SNP boot.
Changes since RFT/v3:
- Rebase onto tip/master
- Incorporate Borislav's feedback on v3
- Switch to objtool to check for absolute references in startup code
- Remap inittext R-X when running on EFI implementations that require
strict R-X/RW- separation
- Include a kbuild fix to incorporate arch/x86/boot/startup/ in the
right manner
- For now, omit the LA57 changes that remove the problematic early
5-level paging checks. We can revisit this once there is agreement on
the approach.
Changes since RFT/v2:
- Rebase onto tip/x86/boot and drop the patches from the previous
revision that have been applied in the meantime.
- Omit the pgtable_l5_enabled() changes for now, and just expose PIC
aliases for the variables in question - this can be sorted later.
- Don't use the boot SVSM calling area in snp_kexec_finish(), but pass
down the correct per-CPU one to the early page state API.
- Rename arch/x86/coco/sev/sev-noinstr.o to arch/x86/coco/sev/noinstr.o
- Further reduce the amount of SEV code that needs to be constructed in
a special way.
Changes since RFC/v1:
- Include a major disentanglement/refactor of the SEV-SNP startup code,
so that only code that really needs to run from the 1:1 mapping is
included in the startup/ code
- Incorporate some early notes from Ingo
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nikunj A Dadhania <nikunj@amd.com>
[0] https://lore.kernel.org/all/CAHk-=wj7k9nvJn6cpa3-5Ciwn2RGyE605BMkjWE4MqnvC9E92A@mail.gmail.com/
Ard Biesheuvel (22):
x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback
x86/sev: Use MSR protocol for remapping SVSM calling area
x86/sev: Use MSR protocol only for early SVSM PVALIDATE call
x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL
x86/sev: Move GHCB page based HV communication out of startup code
x86/sev: Avoid global variable to store virtual address of SVSM area
x86/sev: Share implementation of MSR-based page state change
x86/sev: Pass SVSM calling area down to early page state change API
x86/sev: Use boot SVSM CA for all startup and init code
x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check
x86/boot: Provide PIC aliases for 5-level paging related constants
x86/sev: Provide PIC aliases for SEV related data objects
x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object
x86/sev: Export startup routines for later use
objtool: Add action to check for absence of absolute relocations
x86/boot: Check startup code for absence of absolute relocations
x86/boot: Revert "Reject absolute references in .head.text"
x86/kbuild: Incorporate boot/startup/ via Kbuild makefile
x86/boot: Create a confined code area for startup code
efistub/x86: Remap inittext read-execute when needed
x86/boot: Move startup code out of __head section
x86/boot: Get rid of the .head.text section
arch/x86/Kbuild | 2 +
arch/x86/Makefile | 1 -
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/misc.c | 2 +
arch/x86/boot/compressed/sev-handle-vc.c | 3 +
arch/x86/boot/compressed/sev.c | 108 +------
arch/x86/boot/startup/Makefile | 22 ++
arch/x86/boot/startup/exports.h | 14 +
arch/x86/boot/startup/gdt_idt.c | 4 +-
arch/x86/boot/startup/map_kernel.c | 4 +-
arch/x86/boot/startup/sev-shared.c | 317 ++++++--------------
arch/x86/boot/startup/sev-startup.c | 196 ++----------
arch/x86/boot/startup/sme.c | 27 +-
arch/x86/coco/sev/Makefile | 6 +-
arch/x86/coco/sev/core.c | 169 ++++++++---
arch/x86/coco/sev/{sev-nmi.c => noinstr.c} | 74 +++++
arch/x86/coco/sev/vc-handle.c | 2 +
arch/x86/coco/sev/vc-shared.c | 143 ++++++++-
arch/x86/include/asm/boot.h | 2 +
arch/x86/include/asm/init.h | 6 -
arch/x86/include/asm/setup.h | 1 +
arch/x86/include/asm/sev-internal.h | 27 +-
arch/x86/include/asm/sev.h | 17 +-
arch/x86/kernel/head64.c | 5 +-
arch/x86/kernel/head_32.S | 2 +-
arch/x86/kernel/head_64.S | 10 +-
arch/x86/kernel/vmlinux.lds.S | 9 +-
arch/x86/mm/mem_encrypt_amd.c | 6 -
arch/x86/mm/mem_encrypt_boot.S | 6 +-
arch/x86/platform/pvh/head.S | 2 +-
arch/x86/tools/relocs.c | 8 +-
drivers/firmware/efi/libstub/x86-stub.c | 4 +-
tools/objtool/builtin-check.c | 2 +
tools/objtool/check.c | 39 ++-
tools/objtool/include/objtool/builtin.h | 1 +
35 files changed, 620 insertions(+), 623 deletions(-)
create mode 100644 arch/x86/boot/startup/exports.h
rename arch/x86/coco/sev/{sev-nmi.c => noinstr.c} (61%)
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 16:52 ` Tom Lendacky
2025-07-16 3:18 ` [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area Ard Biesheuvel
` (21 subsequent siblings)
22 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
There are two distinct callers of snp_cpuid(): one where the MSR
protocol is always used, and one where the GHCB page based interface is
always used.
The snp_cpuid() logic does not care about the distinction, which only
matters at a lower level. But the fact that it supports both interfaces
means that the GHCB page based logic is pulled into the early startup
code, where VA to PA conversions are problematic given that it runs from
the 1:1 mapping of memory.
So keep snp_cpuid() itself in the startup code, but factor out the
hypervisor calls via a callback, so that the GHCB page handling can be
moved out.
Code refactoring only - no functional change intended.
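The net effect on the two call sites, excerpted from the diff below:

	/* startup code: MSR protocol only, no context needed */
	ret = snp_cpuid(snp_cpuid_hv_msr, NULL, &leaf);

	/* full #VC handler: GHCB based callback with its own context */
	struct cpuid_ctx ctx = { ghcb, ctxt };

	ret = snp_cpuid(snp_cpuid_hv_ghcb, &ctx, &leaf);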
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/sev-shared.c | 60 ++++----------------
arch/x86/coco/sev/vc-shared.c | 49 +++++++++++++++-
arch/x86/include/asm/sev.h | 3 +-
3 files changed, 61 insertions(+), 51 deletions(-)
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index 7a706db87b93..c401d0391537 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -342,44 +342,7 @@ static int __sev_cpuid_hv_msr(struct cpuid_leaf *leaf)
return ret;
}
-static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
-{
- u32 cr4 = native_read_cr4();
- int ret;
-
- ghcb_set_rax(ghcb, leaf->fn);
- ghcb_set_rcx(ghcb, leaf->subfn);
-
- if (cr4 & X86_CR4_OSXSAVE)
- /* Safe to read xcr0 */
- ghcb_set_xcr0(ghcb, xgetbv(XCR_XFEATURE_ENABLED_MASK));
- else
- /* xgetbv will cause #UD - use reset value for xcr0 */
- ghcb_set_xcr0(ghcb, 1);
-
- ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_CPUID, 0, 0);
- if (ret != ES_OK)
- return ret;
-
- if (!(ghcb_rax_is_valid(ghcb) &&
- ghcb_rbx_is_valid(ghcb) &&
- ghcb_rcx_is_valid(ghcb) &&
- ghcb_rdx_is_valid(ghcb)))
- return ES_VMM_ERROR;
- leaf->eax = ghcb->save.rax;
- leaf->ebx = ghcb->save.rbx;
- leaf->ecx = ghcb->save.rcx;
- leaf->edx = ghcb->save.rdx;
-
- return ES_OK;
-}
-
-static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
-{
- return ghcb ? __sev_cpuid_hv_ghcb(ghcb, ctxt, leaf)
- : __sev_cpuid_hv_msr(leaf);
-}
/*
* This may be called early while still running on the initial identity
@@ -484,21 +447,20 @@ snp_cpuid_get_validated_func(struct cpuid_leaf *leaf)
return false;
}
-static void snp_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+static void snp_cpuid_hv_msr(void *ctx, struct cpuid_leaf *leaf)
{
- if (sev_cpuid_hv(ghcb, ctxt, leaf))
+ if (__sev_cpuid_hv_msr(leaf))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
}
-static int __head
-snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
- struct cpuid_leaf *leaf)
+static int __head snp_cpuid_postprocess(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *),
+ void *ctx, struct cpuid_leaf *leaf)
{
struct cpuid_leaf leaf_hv = *leaf;
switch (leaf->fn) {
case 0x1:
- snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
+ cpuid_fn(ctx, &leaf_hv);
/* initial APIC ID */
leaf->ebx = (leaf_hv.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0));
@@ -517,7 +479,7 @@ snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
break;
case 0xB:
leaf_hv.subfn = 0;
- snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
+ cpuid_fn(ctx, &leaf_hv);
/* extended APIC ID */
leaf->edx = leaf_hv.edx;
@@ -565,7 +527,7 @@ snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
}
break;
case 0x8000001E:
- snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
+ cpuid_fn(ctx, &leaf_hv);
/* extended APIC ID */
leaf->eax = leaf_hv.eax;
@@ -586,8 +548,8 @@ snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
* Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value
* should be treated as fatal by caller.
*/
-int __head
-snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+int __head snp_cpuid(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *), void *ctx,
+ struct cpuid_leaf *leaf)
{
const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -621,7 +583,7 @@ snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
return 0;
}
- return snp_cpuid_postprocess(ghcb, ctxt, leaf);
+ return snp_cpuid_postprocess(cpuid_fn, ctx, leaf);
}
/*
@@ -648,7 +610,7 @@ void __head do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
leaf.fn = fn;
leaf.subfn = subfn;
- ret = snp_cpuid(NULL, NULL, &leaf);
+ ret = snp_cpuid(snp_cpuid_hv_msr, NULL, &leaf);
if (!ret)
goto cpuid_done;
diff --git a/arch/x86/coco/sev/vc-shared.c b/arch/x86/coco/sev/vc-shared.c
index 2c0ab0fdc060..b4688f69102e 100644
--- a/arch/x86/coco/sev/vc-shared.c
+++ b/arch/x86/coco/sev/vc-shared.c
@@ -409,15 +409,62 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}
+static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+{
+ u32 cr4 = native_read_cr4();
+ int ret;
+
+ ghcb_set_rax(ghcb, leaf->fn);
+ ghcb_set_rcx(ghcb, leaf->subfn);
+
+ if (cr4 & X86_CR4_OSXSAVE)
+ /* Safe to read xcr0 */
+ ghcb_set_xcr0(ghcb, xgetbv(XCR_XFEATURE_ENABLED_MASK));
+ else
+ /* xgetbv will cause #UD - use reset value for xcr0 */
+ ghcb_set_xcr0(ghcb, 1);
+
+ ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_CPUID, 0, 0);
+ if (ret != ES_OK)
+ return ret;
+
+ if (!(ghcb_rax_is_valid(ghcb) &&
+ ghcb_rbx_is_valid(ghcb) &&
+ ghcb_rcx_is_valid(ghcb) &&
+ ghcb_rdx_is_valid(ghcb)))
+ return ES_VMM_ERROR;
+
+ leaf->eax = ghcb->save.rax;
+ leaf->ebx = ghcb->save.rbx;
+ leaf->ecx = ghcb->save.rcx;
+ leaf->edx = ghcb->save.rdx;
+
+ return ES_OK;
+}
+
+struct cpuid_ctx {
+ struct ghcb *ghcb;
+ struct es_em_ctxt *ctxt;
+};
+
+static void snp_cpuid_hv_ghcb(void *p, struct cpuid_leaf *leaf)
+{
+ struct cpuid_ctx *ctx = p;
+
+ if (__sev_cpuid_hv_ghcb(ctx->ghcb, ctx->ctxt, leaf))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
+}
+
static int vc_handle_cpuid_snp(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
{
+ struct cpuid_ctx ctx = { ghcb, ctxt };
struct pt_regs *regs = ctxt->regs;
struct cpuid_leaf leaf;
int ret;
leaf.fn = regs->ax;
leaf.subfn = regs->cx;
- ret = snp_cpuid(ghcb, ctxt, &leaf);
+ ret = snp_cpuid(snp_cpuid_hv_ghcb, &ctx, &leaf);
if (!ret) {
regs->ax = leaf.eax;
regs->bx = leaf.ebx;
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 89075ff19afa..2cabf617de3c 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -552,7 +552,8 @@ struct cpuid_leaf {
u32 edx;
};
-int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf);
+int snp_cpuid(void (*cpuid_hv)(void *ctx, struct cpuid_leaf *),
+ void *ctx, struct cpuid_leaf *leaf);
void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 17:03 ` Tom Lendacky
2025-07-16 3:18 ` [PATCH v5 03/22] x86/sev: Use MSR protocol only for early SVSM PVALIDATE call Ard Biesheuvel
` (20 subsequent siblings)
22 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
As the preceding code comment already indicates, remapping the SVSM
calling area occurs long before the GHCB page is configured, and so
calling svsm_perform_call_protocol() is guaranteed to result in a call
to svsm_perform_msr_protocol().
So just call the latter directly. This allows most of the GHCB based API
infrastructure to be moved out of the startup code in a subsequent
patch.
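The remap call site after this change, excerpted from the diff below:

	call.caa = svsm_get_caa();
	call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
	call.rcx = pa;

	/* no GHCB plumbing involved: the MSR protocol is used directly */
	if (svsm_perform_msr_protocol(&call))
		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);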
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/boot/startup/sev-shared.c | 11 +++++++++++
arch/x86/boot/startup/sev-startup.c | 5 ++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index c401d0391537..60ab09b3149d 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -723,6 +723,17 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
}
}
+static int __head svsm_call_msr_protocol(struct svsm_call *call)
+{
+ int ret;
+
+ do {
+ ret = svsm_perform_msr_protocol(call);
+ } while (ret == -EAGAIN);
+
+ return ret;
+}
+
static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
{
struct svsm_pvalidate_call *pc;
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 0b7e3b950183..c30e0eed0131 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -295,7 +295,6 @@ static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
{
struct svsm_call call = {};
- int ret;
u64 pa;
/*
@@ -325,8 +324,8 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
call.caa = svsm_get_caa();
call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
call.rcx = pa;
- ret = svsm_perform_call_protocol(&call);
- if (ret)
+
+ if (svsm_perform_msr_protocol(&call))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);
boot_svsm_caa = (struct svsm_ca *)pa;
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 03/22] x86/sev: Use MSR protocol only for early SVSM PVALIDATE call
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 04/22] x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL Ard Biesheuvel
` (19 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
The early page state change API performs an SVSM call to PVALIDATE each
page when running under an SVSM, and this involves either a GHCB page
based call or a call based on the MSR protocol.
The GHCB page based variant involves VA to PA translation of the GHCB
address, and this is best avoided in the startup code, where virtual
addresses are ambiguous (1:1 or kernel virtual).
As this is the last remaining occurrence of svsm_perform_call_protocol()
in the startup code, switch to the MSR protocol exclusively in this
particular case, so that the GHCB based plumbing can be moved out of the
startup code entirely in a subsequent patch.
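The PVALIDATE call site after this change, excerpted from the diff below;
svsm_call_msr_protocol() simply retries svsm_perform_msr_protocol() while
it returns -EAGAIN:

	call.rax = SVSM_CORE_CALL(SVSM_CORE_PVALIDATE);
	call.rcx = pc_pa;

	if (svsm_call_msr_protocol(&call))
		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);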
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 20 --------------------
arch/x86/boot/startup/sev-shared.c | 9 ++++++---
2 files changed, 6 insertions(+), 23 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index fd1b67dfea22..b71c1ab6a282 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -50,31 +50,11 @@ u64 svsm_get_caa_pa(void)
return boot_svsm_caa_pa;
}
-int svsm_perform_call_protocol(struct svsm_call *call);
-
u8 snp_vmpl;
/* Include code for early handlers */
#include "../../boot/startup/sev-shared.c"
-int svsm_perform_call_protocol(struct svsm_call *call)
-{
- struct ghcb *ghcb;
- int ret;
-
- if (boot_ghcb)
- ghcb = boot_ghcb;
- else
- ghcb = NULL;
-
- do {
- ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call)
- : svsm_perform_msr_protocol(call);
- } while (ret == -EAGAIN);
-
- return ret;
-}
-
static bool sev_snp_enabled(void)
{
return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index 60ab09b3149d..d9c0c64d80fe 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -740,7 +740,6 @@ static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
struct svsm_call call = {};
unsigned long flags;
u64 pc_pa;
- int ret;
/*
* This can be called very early in the boot, use native functions in
@@ -764,8 +763,12 @@ static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
call.rax = SVSM_CORE_CALL(SVSM_CORE_PVALIDATE);
call.rcx = pc_pa;
- ret = svsm_perform_call_protocol(&call);
- if (ret)
+ /*
+ * Use the MSR protocol exclusively, so that this code is usable in
+ * startup code where VA/PA translations of the GHCB page's address may
+ * be problematic.
+ */
+ if (svsm_call_msr_protocol(&call))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
native_local_irq_restore(flags);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 04/22] x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (2 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 03/22] x86/sev: Use MSR protocol only for early SVSM PVALIDATE call Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 05/22] x86/sev: Move GHCB page based HV communication out of startup code Ard Biesheuvel
` (18 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Determining the VMPL at which the kernel runs involves performing an
RMPADJUST operation on an arbitrary page of memory, and observing whether
it succeeds.
The use of boot_ghcb_page in the core kernel in this case is completely
arbitrary, but results in the need to provide a PIC alias for it. So use
boot_svsm_ca_page instead, which already needs this alias for other
reasons.
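The probe itself, as it appears in svsm_setup_ca() in the diff below; only
the page it operates on changes:

	/*
	 * RMPADJUST modifies RMP permissions of a lesser-privileged
	 * (numerically higher) VMPL, so it only succeeds when running at
	 * VMPL0. A failure therefore means an SVSM is present at VMPL0.
	 */
	if (!rmpadjust((unsigned long)page, RMP_PG_SIZE_4K, 1))
		return false;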
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/boot/compressed/sev.c | 2 +-
arch/x86/boot/startup/sev-shared.c | 5 +++--
arch/x86/boot/startup/sev-startup.c | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index b71c1ab6a282..3628e9bddc6a 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -327,7 +327,7 @@ static bool early_snp_init(struct boot_params *bp)
* running at VMPL0. The CA will be used to communicate with the
* SVSM and request its services.
*/
- svsm_setup_ca(cc_info);
+ svsm_setup_ca(cc_info, rip_rel_ptr(&boot_ghcb_page));
/*
* Pass run-time kernel a pointer to CC info via boot_params so EFI
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index d9c0c64d80fe..cbf26466e0da 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -792,7 +792,8 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
* Maintain the GPA of the SVSM Calling Area (CA) in order to utilize the SVSM
* services needed when not running in VMPL0.
*/
-static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info)
+static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info,
+ void *page)
{
struct snp_secrets_page *secrets_page;
struct snp_cpuid_table *cpuid_table;
@@ -815,7 +816,7 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info)
* routine is running identity mapped when called, both by the decompressor
* code and the early kernel code.
*/
- if (!rmpadjust((unsigned long)rip_rel_ptr(&boot_ghcb_page), RMP_PG_SIZE_4K, 1))
+ if (!rmpadjust((unsigned long)page, RMP_PG_SIZE_4K, 1))
return false;
/*
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index c30e0eed0131..4b9e8ccc0e91 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -302,7 +302,7 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
* running at VMPL0. The CA will be used to communicate with the
* SVSM to perform the SVSM services.
*/
- if (!svsm_setup_ca(cc_info))
+ if (!svsm_setup_ca(cc_info, rip_rel_ptr(&boot_svsm_ca_page)))
return;
/*
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 05/22] x86/sev: Move GHCB page based HV communication out of startup code
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (3 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 04/22] x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 06/22] x86/sev: Avoid global variable to store virtual address of SVSM area Ard Biesheuvel
` (17 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Both the decompressor and the core kernel implement an early #VC
handler, which only deals with CPUID instructions, and a full-featured
one, which can handle any #VC exception.
The former communicates with the hypervisor using the MSR based
protocol, whereas the latter uses a shared GHCB page, which is
configured a bit later during boot, when the kernel runs from its
ordinary virtual mapping, rather than the 1:1 mapping that the startup
code uses.
Accessing this shared GHCB page from the core kernel's startup code is
problematic, because it involves converting the GHCB address provided by
the caller to a physical address. In the startup code, virtual to
physical address translations are ambiguous, given that the virtual
address might be a 1:1 mapped address, and such translations should
therefore be avoided.
This means that exposing startup code dealing with the GHCB to callers
that execute from the ordinary kernel virtual mapping should be avoided
too. So move all GHCB page based communication out of the startup code,
now that all communication occurring before the kernel virtual mapping
is up relies on the MSR protocol only.
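For reference, the MSR-based exchange that the startup code keeps using,
sketched after snp_register_ghcb_early() in the diff below; the request is
encoded in the GHCB MSR itself, so no shared page (and hence no VA-to-PA
translation of a GHCB address) is involved:

	unsigned long pfn = paddr >> PAGE_SHIFT;
	u64 val;

	sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
	VMGEXIT();
	val = sev_es_rd_ghcb_msr();

	/* If the response GPA is not ours then abort the guest */
	if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
	    (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);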
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev-handle-vc.c | 3 +
arch/x86/boot/startup/sev-shared.c | 143 +-------------------
arch/x86/boot/startup/sev-startup.c | 42 ------
arch/x86/coco/sev/core.c | 76 +++++++++++
arch/x86/coco/sev/vc-handle.c | 2 +
arch/x86/coco/sev/vc-shared.c | 94 +++++++++++++
arch/x86/include/asm/sev-internal.h | 7 +-
arch/x86/include/asm/sev.h | 11 +-
8 files changed, 190 insertions(+), 188 deletions(-)
diff --git a/arch/x86/boot/compressed/sev-handle-vc.c b/arch/x86/boot/compressed/sev-handle-vc.c
index 89dd02de2a0f..7530ad8b768b 100644
--- a/arch/x86/boot/compressed/sev-handle-vc.c
+++ b/arch/x86/boot/compressed/sev-handle-vc.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include "misc.h"
+#include "error.h"
#include "sev.h"
#include <linux/kernel.h>
@@ -14,6 +15,8 @@
#include <asm/fpu/xcr.h>
#define __BOOT_COMPRESSED
+#undef __init
+#define __init
/* Basic instruction decoding support needed */
#include "../../lib/inat.c"
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index cbf26466e0da..f9de8b33de6c 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -13,12 +13,9 @@
#ifndef __BOOT_COMPRESSED
#define error(v) pr_err(v)
-#define has_cpuflag(f) boot_cpu_has(f)
#else
#undef WARN
#define WARN(condition, format...) (!!(condition))
-#undef vc_forward_exception
-#define vc_forward_exception(c) panic("SNP: Hypervisor requested exception\n")
#endif
/*
@@ -39,7 +36,7 @@ u64 boot_svsm_caa_pa __ro_after_init;
*
* GHCB protocol version negotiated with the hypervisor.
*/
-static u16 ghcb_version __ro_after_init;
+u16 ghcb_version __ro_after_init;
/* Copy of the SNP firmware's CPUID page. */
static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
@@ -54,16 +51,6 @@ static u32 cpuid_std_range_max __ro_after_init;
static u32 cpuid_hyp_range_max __ro_after_init;
static u32 cpuid_ext_range_max __ro_after_init;
-bool __init sev_es_check_cpu_features(void)
-{
- if (!has_cpuflag(X86_FEATURE_RDRAND)) {
- error("RDRAND instruction not supported - no trusted source of randomness available\n");
- return false;
- }
-
- return true;
-}
-
void __head __noreturn
sev_es_terminate(unsigned int set, unsigned int reason)
{
@@ -100,72 +87,7 @@ u64 get_hv_features(void)
return GHCB_MSR_HV_FT_RESP_VAL(val);
}
-void snp_register_ghcb_early(unsigned long paddr)
-{
- unsigned long pfn = paddr >> PAGE_SHIFT;
- u64 val;
-
- sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
- VMGEXIT();
-
- val = sev_es_rd_ghcb_msr();
-
- /* If the response GPA is not ours then abort the guest */
- if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
- (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
- sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
-}
-
-bool sev_es_negotiate_protocol(void)
-{
- u64 val;
-
- /* Do the GHCB protocol version negotiation */
- sev_es_wr_ghcb_msr(GHCB_MSR_SEV_INFO_REQ);
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
-
- if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
- return false;
-
- if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
- GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
- return false;
-
- ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
-
- return true;
-}
-
-static enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
-{
- u32 ret;
-
- ret = ghcb->save.sw_exit_info_1 & GENMASK_ULL(31, 0);
- if (!ret)
- return ES_OK;
-
- if (ret == 1) {
- u64 info = ghcb->save.sw_exit_info_2;
- unsigned long v = info & SVM_EVTINJ_VEC_MASK;
-
- /* Check if exception information from hypervisor is sane. */
- if ((info & SVM_EVTINJ_VALID) &&
- ((v == X86_TRAP_GP) || (v == X86_TRAP_UD)) &&
- ((info & SVM_EVTINJ_TYPE_MASK) == SVM_EVTINJ_TYPE_EXEPT)) {
- ctxt->fi.vector = v;
-
- if (info & SVM_EVTINJ_VALID_ERR)
- ctxt->fi.error_code = info >> 32;
-
- return ES_EXCEPTION;
- }
- }
-
- return ES_VMM_ERROR;
-}
-
-static inline int svsm_process_result_codes(struct svsm_call *call)
+int svsm_process_result_codes(struct svsm_call *call)
{
switch (call->rax_out) {
case SVSM_SUCCESS:
@@ -193,7 +115,7 @@ static inline int svsm_process_result_codes(struct svsm_call *call)
* - RAX specifies the SVSM protocol/callid as input and the return code
* as output.
*/
-static __always_inline void svsm_issue_call(struct svsm_call *call, u8 *pending)
+void svsm_issue_call(struct svsm_call *call, u8 *pending)
{
register unsigned long rax asm("rax") = call->rax;
register unsigned long rcx asm("rcx") = call->rcx;
@@ -216,7 +138,7 @@ static __always_inline void svsm_issue_call(struct svsm_call *call, u8 *pending)
call->r9_out = r9;
}
-static int svsm_perform_msr_protocol(struct svsm_call *call)
+int svsm_perform_msr_protocol(struct svsm_call *call)
{
u8 pending = 0;
u64 val, resp;
@@ -247,63 +169,6 @@ static int svsm_perform_msr_protocol(struct svsm_call *call)
return svsm_process_result_codes(call);
}
-static int svsm_perform_ghcb_protocol(struct ghcb *ghcb, struct svsm_call *call)
-{
- struct es_em_ctxt ctxt;
- u8 pending = 0;
-
- vc_ghcb_invalidate(ghcb);
-
- /*
- * Fill in protocol and format specifiers. This can be called very early
- * in the boot, so use rip-relative references as needed.
- */
- ghcb->protocol_version = ghcb_version;
- ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
-
- ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_SNP_RUN_VMPL);
- ghcb_set_sw_exit_info_1(ghcb, 0);
- ghcb_set_sw_exit_info_2(ghcb, 0);
-
- sev_es_wr_ghcb_msr(__pa(ghcb));
-
- svsm_issue_call(call, &pending);
-
- if (pending)
- return -EINVAL;
-
- switch (verify_exception_info(ghcb, &ctxt)) {
- case ES_OK:
- break;
- case ES_EXCEPTION:
- vc_forward_exception(&ctxt);
- fallthrough;
- default:
- return -EINVAL;
- }
-
- return svsm_process_result_codes(call);
-}
-
-enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
- struct es_em_ctxt *ctxt,
- u64 exit_code, u64 exit_info_1,
- u64 exit_info_2)
-{
- /* Fill in protocol and format specifiers */
- ghcb->protocol_version = ghcb_version;
- ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
-
- ghcb_set_sw_exit_code(ghcb, exit_code);
- ghcb_set_sw_exit_info_1(ghcb, exit_info_1);
- ghcb_set_sw_exit_info_2(ghcb, exit_info_2);
-
- sev_es_wr_ghcb_msr(__pa(ghcb));
- VMGEXIT();
-
- return verify_exception_info(ghcb, ctxt);
-}
-
static int __sev_cpuid_hv(u32 fn, int reg_idx, u32 *reg)
{
u64 val;
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 4b9e8ccc0e91..7b38085c7218 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -41,15 +41,6 @@
#include <asm/cpuid/api.h>
#include <asm/cmdline.h>
-/* For early boot hypervisor communication in SEV-ES enabled guests */
-struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
-
-/*
- * Needs to be in the .data section because we need it NULL before bss is
- * cleared
- */
-struct ghcb *boot_ghcb __section(".data");
-
/* Bitmap of SEV features supported by the hypervisor */
u64 sev_hv_features __ro_after_init;
@@ -139,39 +130,6 @@ noinstr void __sev_put_ghcb(struct ghcb_state *state)
}
}
-int svsm_perform_call_protocol(struct svsm_call *call)
-{
- struct ghcb_state state;
- unsigned long flags;
- struct ghcb *ghcb;
- int ret;
-
- /*
- * This can be called very early in the boot, use native functions in
- * order to avoid paravirt issues.
- */
- flags = native_local_irq_save();
-
- if (sev_cfg.ghcbs_initialized)
- ghcb = __sev_get_ghcb(&state);
- else if (boot_ghcb)
- ghcb = boot_ghcb;
- else
- ghcb = NULL;
-
- do {
- ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call)
- : svsm_perform_msr_protocol(call);
- } while (ret == -EAGAIN);
-
- if (sev_cfg.ghcbs_initialized)
- __sev_put_ghcb(&state);
-
- native_local_irq_restore(flags);
-
- return ret;
-}
-
void __head
early_set_pages_state(unsigned long vaddr, unsigned long paddr,
unsigned long npages, enum psc_op op)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index fc59ce78c477..15be9e52848d 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -101,6 +101,15 @@ DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
u8 snp_vmpl __ro_after_init;
EXPORT_SYMBOL_GPL(snp_vmpl);
+/* For early boot hypervisor communication in SEV-ES enabled guests */
+static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
+
+/*
+ * Needs to be in the .data section because we need it NULL before bss is
+ * cleared
+ */
+struct ghcb *boot_ghcb __section(".data");
+
static u64 __init get_snp_jump_table_addr(void)
{
struct snp_secrets_page *secrets;
@@ -154,6 +163,73 @@ static u64 __init get_jump_table_addr(void)
return ret;
}
+static int svsm_perform_ghcb_protocol(struct ghcb *ghcb, struct svsm_call *call)
+{
+ struct es_em_ctxt ctxt;
+ u8 pending = 0;
+
+ vc_ghcb_invalidate(ghcb);
+
+ /*
+ * Fill in protocol and format specifiers. This can be called very early
+ * in the boot, so use rip-relative references as needed.
+ */
+ ghcb->protocol_version = ghcb_version;
+ ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
+
+ ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_SNP_RUN_VMPL);
+ ghcb_set_sw_exit_info_1(ghcb, 0);
+ ghcb_set_sw_exit_info_2(ghcb, 0);
+
+ sev_es_wr_ghcb_msr(__pa(ghcb));
+
+ svsm_issue_call(call, &pending);
+
+ if (pending)
+ return -EINVAL;
+
+ switch (verify_exception_info(ghcb, &ctxt)) {
+ case ES_OK:
+ break;
+ case ES_EXCEPTION:
+ vc_forward_exception(&ctxt);
+ fallthrough;
+ default:
+ return -EINVAL;
+ }
+
+ return svsm_process_result_codes(call);
+}
+
+static int svsm_perform_call_protocol(struct svsm_call *call)
+{
+ struct ghcb_state state;
+ unsigned long flags;
+ struct ghcb *ghcb;
+ int ret;
+
+ flags = native_local_irq_save();
+
+ if (sev_cfg.ghcbs_initialized)
+ ghcb = __sev_get_ghcb(&state);
+ else if (boot_ghcb)
+ ghcb = boot_ghcb;
+ else
+ ghcb = NULL;
+
+ do {
+ ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call)
+ : svsm_perform_msr_protocol(call);
+ } while (ret == -EAGAIN);
+
+ if (sev_cfg.ghcbs_initialized)
+ __sev_put_ghcb(&state);
+
+ native_local_irq_restore(flags);
+
+ return ret;
+}
+
static inline void __pval_terminate(u64 pfn, bool action, unsigned int page_size,
int ret, u64 svsm_ret)
{
diff --git a/arch/x86/coco/sev/vc-handle.c b/arch/x86/coco/sev/vc-handle.c
index faf1fce89ed4..9a5e16f70e83 100644
--- a/arch/x86/coco/sev/vc-handle.c
+++ b/arch/x86/coco/sev/vc-handle.c
@@ -351,6 +351,8 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
}
#define sev_printk(fmt, ...) printk(fmt, ##__VA_ARGS__)
+#define error(v)
+#define has_cpuflag(f) boot_cpu_has(f)
#include "vc-shared.c"
diff --git a/arch/x86/coco/sev/vc-shared.c b/arch/x86/coco/sev/vc-shared.c
index b4688f69102e..9b01c9ad81be 100644
--- a/arch/x86/coco/sev/vc-shared.c
+++ b/arch/x86/coco/sev/vc-shared.c
@@ -409,6 +409,53 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}
+enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
+{
+ u32 ret;
+
+ ret = ghcb->save.sw_exit_info_1 & GENMASK_ULL(31, 0);
+ if (!ret)
+ return ES_OK;
+
+ if (ret == 1) {
+ u64 info = ghcb->save.sw_exit_info_2;
+ unsigned long v = info & SVM_EVTINJ_VEC_MASK;
+
+ /* Check if exception information from hypervisor is sane. */
+ if ((info & SVM_EVTINJ_VALID) &&
+ ((v == X86_TRAP_GP) || (v == X86_TRAP_UD)) &&
+ ((info & SVM_EVTINJ_TYPE_MASK) == SVM_EVTINJ_TYPE_EXEPT)) {
+ ctxt->fi.vector = v;
+
+ if (info & SVM_EVTINJ_VALID_ERR)
+ ctxt->fi.error_code = info >> 32;
+
+ return ES_EXCEPTION;
+ }
+ }
+
+ return ES_VMM_ERROR;
+}
+
+enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
+ struct es_em_ctxt *ctxt,
+ u64 exit_code, u64 exit_info_1,
+ u64 exit_info_2)
+{
+ /* Fill in protocol and format specifiers */
+ ghcb->protocol_version = ghcb_version;
+ ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
+
+ ghcb_set_sw_exit_code(ghcb, exit_code);
+ ghcb_set_sw_exit_info_1(ghcb, exit_info_1);
+ ghcb_set_sw_exit_info_2(ghcb, exit_info_2);
+
+ sev_es_wr_ghcb_msr(__pa(ghcb));
+ VMGEXIT();
+
+ return verify_exception_info(ghcb, ctxt);
+}
+
static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
{
u32 cr4 = native_read_cr4();
@@ -549,3 +596,50 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
return ES_OK;
}
+
+void snp_register_ghcb_early(unsigned long paddr)
+{
+ unsigned long pfn = paddr >> PAGE_SHIFT;
+ u64 val;
+
+ sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ /* If the response GPA is not ours then abort the guest */
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
+ (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
+}
+
+bool __init sev_es_check_cpu_features(void)
+{
+ if (!has_cpuflag(X86_FEATURE_RDRAND)) {
+ error("RDRAND instruction not supported - no trusted source of randomness available\n");
+ return false;
+ }
+
+ return true;
+}
+
+bool sev_es_negotiate_protocol(void)
+{
+ u64 val;
+
+ /* Do the GHCB protocol version negotiation */
+ sev_es_wr_ghcb_msr(GHCB_MSR_SEV_INFO_REQ);
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+
+ if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
+ return false;
+
+ if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+ GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
+ return false;
+
+ ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
+ return true;
+}
diff --git a/arch/x86/include/asm/sev-internal.h b/arch/x86/include/asm/sev-internal.h
index 3dfd306d1c9e..6199b35a82e4 100644
--- a/arch/x86/include/asm/sev-internal.h
+++ b/arch/x86/include/asm/sev-internal.h
@@ -2,7 +2,6 @@
#define DR7_RESET_VALUE 0x400
-extern struct ghcb boot_ghcb_page;
extern u64 sev_hv_features;
extern u64 sev_secrets_pa;
@@ -80,7 +79,8 @@ static __always_inline u64 svsm_get_caa_pa(void)
return boot_svsm_caa_pa;
}
-int svsm_perform_call_protocol(struct svsm_call *call);
+enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt *ctxt);
+void vc_forward_exception(struct es_em_ctxt *ctxt);
static inline u64 sev_es_rd_ghcb_msr(void)
{
@@ -97,9 +97,6 @@ static __always_inline void sev_es_wr_ghcb_msr(u64 val)
native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
}
-void snp_register_ghcb_early(unsigned long paddr);
-bool sev_es_negotiate_protocol(void);
-bool sev_es_check_cpu_features(void);
u64 get_hv_features(void);
const struct snp_cpuid_table *snp_cpuid_get_table(void);
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2cabf617de3c..135e91a17d04 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -503,6 +503,7 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
}
void setup_ghcb(void);
+void snp_register_ghcb_early(unsigned long paddr);
void early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
unsigned long npages);
void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
@@ -540,8 +541,6 @@ static __always_inline void vc_ghcb_invalidate(struct ghcb *ghcb)
__builtin_memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
}
-void vc_forward_exception(struct es_em_ctxt *ctxt);
-
/* I/O parameters for CPUID-related helpers */
struct cpuid_leaf {
u32 fn;
@@ -552,15 +551,23 @@ struct cpuid_leaf {
u32 edx;
};
+int svsm_perform_msr_protocol(struct svsm_call *call);
int snp_cpuid(void (*cpuid_hv)(void *ctx, struct cpuid_leaf *),
void *ctx, struct cpuid_leaf *leaf);
+void svsm_issue_call(struct svsm_call *call, u8 *pending);
+int svsm_process_result_codes(struct svsm_call *call);
+
void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
struct es_em_ctxt *ctxt,
u64 exit_code, u64 exit_info_1,
u64 exit_info_2);
+bool sev_es_negotiate_protocol(void);
+bool sev_es_check_cpu_features(void);
+
+extern u16 ghcb_version;
extern struct ghcb *boot_ghcb;
#else /* !CONFIG_AMD_MEM_ENCRYPT */
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 06/22] x86/sev: Avoid global variable to store virtual address of SVSM area
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (4 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 05/22] x86/sev: Move GHCB page based HV communication out of startup code Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 07/22] x86/sev: Share implementation of MSR-based page state change Ard Biesheuvel
` (16 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
The boot-time SVSM calling area is used both by the startup code running
from the 1:1 mapping, and potentially later on by code running from the
ordinary kernel mapping.
This SVSM calling area is statically allocated, and so its physical
address doesn't change. However, its virtual address depends on the
calling context (1:1 mapping or kernel virtual mapping), and even though
the variable that holds the virtual address of this calling area gets
updated from the 1:1 address to the kernel address during boot, it is
hard to reason about why this is guaranteed to be safe.
So instead, take the RIP-relative address of the boot-time SVSM calling
area whenever its virtual address is required, and only use a global
variable for the physical address.
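The resulting accessor, taken from the diff below; the virtual address is
derived RIP-relatively on every use, so it is correct both from the 1:1
mapping and from the kernel virtual mapping:

	static __always_inline struct svsm_ca *svsm_get_caa(void)
	{
		if (sev_cfg.use_cas)
			return this_cpu_read(svsm_caa);
		else
			return rip_rel_ptr(&boot_svsm_ca_page);
	}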
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/boot/compressed/sev.c | 5 ++---
arch/x86/boot/startup/sev-shared.c | 6 ------
arch/x86/boot/startup/sev-startup.c | 9 +++++----
arch/x86/coco/sev/core.c | 9 ---------
arch/x86/include/asm/sev-internal.h | 3 +--
arch/x86/include/asm/sev.h | 2 --
arch/x86/mm/mem_encrypt_amd.c | 6 ------
7 files changed, 8 insertions(+), 32 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 3628e9bddc6a..6c0f91d38595 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -37,12 +37,12 @@ struct ghcb *boot_ghcb;
#define __BOOT_COMPRESSED
-extern struct svsm_ca *boot_svsm_caa;
extern u64 boot_svsm_caa_pa;
struct svsm_ca *svsm_get_caa(void)
{
- return boot_svsm_caa;
+ /* The decompressor is mapped 1:1 so VA == PA */
+ return (struct svsm_ca *)boot_svsm_caa_pa;
}
u64 svsm_get_caa_pa(void)
@@ -530,7 +530,6 @@ bool early_is_sevsnp_guest(void)
/* Obtain the address of the calling area to use */
boot_rdmsr(MSR_SVSM_CAA, &m);
- boot_svsm_caa = (void *)m.q;
boot_svsm_caa_pa = m.q;
/*
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index f9de8b33de6c..51f2110bc509 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -26,7 +26,6 @@
* early boot, both with identity mapped virtual addresses and proper kernel
* virtual addresses.
*/
-struct svsm_ca *boot_svsm_caa __ro_after_init;
u64 boot_svsm_caa_pa __ro_after_init;
/*
@@ -709,11 +708,6 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info,
if (caa & (PAGE_SIZE - 1))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CAA);
- /*
- * The CA is identity mapped when this routine is called, both by the
- * decompressor code and the early kernel code.
- */
- boot_svsm_caa = (struct svsm_ca *)caa;
boot_svsm_caa_pa = caa;
/* Advertise the SVSM presence via CPUID. */
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 7b38085c7218..f3e247d205b7 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -252,6 +252,7 @@ static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
{
+ struct snp_secrets_page *secrets = (void *)cc_info->secrets_phys;
struct svsm_call call = {};
u64 pa;
@@ -272,21 +273,21 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
pa = (u64)rip_rel_ptr(&boot_svsm_ca_page);
/*
- * Switch over to the boot SVSM CA while the current CA is still
- * addressable. There is no GHCB at this point so use the MSR protocol.
+ * Switch over to the boot SVSM CA while the current CA is still 1:1
+ * mapped and thus addressable with VA == PA. There is no GHCB at this
+ * point so use the MSR protocol.
*
* SVSM_CORE_REMAP_CA call:
* RAX = 0 (Protocol=0, CallID=0)
* RCX = New CA GPA
*/
- call.caa = svsm_get_caa();
+ call.caa = (struct svsm_ca *)secrets->svsm_caa;
call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
call.rcx = pa;
if (svsm_perform_msr_protocol(&call))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);
- boot_svsm_caa = (struct svsm_ca *)pa;
boot_svsm_caa_pa = pa;
}
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 15be9e52848d..bea67d017bf0 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1643,15 +1643,6 @@ void sev_show_status(void)
pr_cont("\n");
}
-void __init snp_update_svsm_ca(void)
-{
- if (!snp_vmpl)
- return;
-
- /* Update the CAA to a proper kernel address */
- boot_svsm_caa = &boot_svsm_ca_page;
-}
-
#ifdef CONFIG_SYSFS
static ssize_t vmpl_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
diff --git a/arch/x86/include/asm/sev-internal.h b/arch/x86/include/asm/sev-internal.h
index 6199b35a82e4..ffe4755962fe 100644
--- a/arch/x86/include/asm/sev-internal.h
+++ b/arch/x86/include/asm/sev-internal.h
@@ -60,7 +60,6 @@ void early_set_pages_state(unsigned long vaddr, unsigned long paddr,
DECLARE_PER_CPU(struct svsm_ca *, svsm_caa);
DECLARE_PER_CPU(u64, svsm_caa_pa);
-extern struct svsm_ca *boot_svsm_caa;
extern u64 boot_svsm_caa_pa;
static __always_inline struct svsm_ca *svsm_get_caa(void)
@@ -68,7 +67,7 @@ static __always_inline struct svsm_ca *svsm_get_caa(void)
if (sev_cfg.use_cas)
return this_cpu_read(svsm_caa);
else
- return boot_svsm_caa;
+ return rip_rel_ptr(&boot_svsm_ca_page);
}
static __always_inline u64 svsm_get_caa_pa(void)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 135e91a17d04..f3acbfcdca9a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -519,7 +519,6 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end);
u64 snp_get_unsupported_features(u64 status);
u64 sev_get_status(void);
void sev_show_status(void);
-void snp_update_svsm_ca(void);
int prepare_pte_enc(struct pte_enc_desc *d);
void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot);
void snp_kexec_finish(void);
@@ -600,7 +599,6 @@ static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
static inline u64 sev_get_status(void) { return 0; }
static inline void sev_show_status(void) { }
-static inline void snp_update_svsm_ca(void) { }
static inline int prepare_pte_enc(struct pte_enc_desc *d) { return 0; }
static inline void set_pte_enc_mask(pte_t *kpte, unsigned long pfn, pgprot_t new_prot) { }
static inline void snp_kexec_finish(void) { }
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index faf3a13fb6ba..2f8c32173972 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -536,12 +536,6 @@ void __init sme_early_init(void)
x86_init.resources.dmi_setup = snp_dmi_setup;
}
- /*
- * Switch the SVSM CA mapping (if active) from identity mapped to
- * kernel mapped.
- */
- snp_update_svsm_ca();
-
if (sev_status & MSR_AMD64_SNP_SECURE_TSC)
setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
}
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 07/22] x86/sev: Share implementation of MSR-based page state change
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (5 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 06/22] x86/sev: Avoid global variable to store virtual address of SVSM area Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 08/22] x86/sev: Pass SVSM calling area down to early page state change API Ard Biesheuvel
` (15 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Both the decompressor and the SEV startup code implement the exact same
sequence for invoking the MSR based communication protocol to effect a
page state change.
Before tweaking the internal APIs used in both versions, merge the two
implementations and share them, so that those tweaks only need to be
made in a single place.
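After the merge, both callers use the shared helper; the decompressor runs
from the 1:1 mapping, so it simply passes the same value for both addresses
(from the diff below):

	__page_state_change(paddr, paddr, SNP_PAGE_STATE_PRIVATE);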
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 40 ++------------------
arch/x86/boot/startup/sev-shared.c | 35 +++++++++++++++++
arch/x86/boot/startup/sev-startup.c | 29 +-------------
3 files changed, 39 insertions(+), 65 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 6c0f91d38595..f714235d3222 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -60,46 +60,12 @@ static bool sev_snp_enabled(void)
return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
}
-static void __page_state_change(unsigned long paddr, enum psc_op op)
-{
- u64 val, msr;
-
- /*
- * If private -> shared then invalidate the page before requesting the
- * state change in the RMP table.
- */
- if (op == SNP_PAGE_STATE_SHARED)
- pvalidate_4k_page(paddr, paddr, false);
-
- /* Save the current GHCB MSR value */
- msr = sev_es_rd_ghcb_msr();
-
- /* Issue VMGEXIT to change the page state in RMP table. */
- sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
- VMGEXIT();
-
- /* Read the response of the VMGEXIT. */
- val = sev_es_rd_ghcb_msr();
- if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
- sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
-
- /* Restore the GHCB MSR value */
- sev_es_wr_ghcb_msr(msr);
-
- /*
- * Now that page state is changed in the RMP table, validate it so that it is
- * consistent with the RMP entry.
- */
- if (op == SNP_PAGE_STATE_PRIVATE)
- pvalidate_4k_page(paddr, paddr, true);
-}
-
void snp_set_page_private(unsigned long paddr)
{
if (!sev_snp_enabled())
return;
- __page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
+ __page_state_change(paddr, paddr, SNP_PAGE_STATE_PRIVATE);
}
void snp_set_page_shared(unsigned long paddr)
@@ -107,7 +73,7 @@ void snp_set_page_shared(unsigned long paddr)
if (!sev_snp_enabled())
return;
- __page_state_change(paddr, SNP_PAGE_STATE_SHARED);
+ __page_state_change(paddr, paddr, SNP_PAGE_STATE_SHARED);
}
bool early_setup_ghcb(void)
@@ -133,7 +99,7 @@ bool early_setup_ghcb(void)
void snp_accept_memory(phys_addr_t start, phys_addr_t end)
{
for (phys_addr_t pa = start; pa < end; pa += PAGE_SIZE)
- __page_state_change(pa, SNP_PAGE_STATE_PRIVATE);
+ __page_state_change(pa, pa, SNP_PAGE_STATE_PRIVATE);
}
void sev_es_shutdown_ghcb(void)
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index 51f2110bc509..eb241ff1156d 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -652,6 +652,41 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
}
}
+static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
+ enum psc_op op)
+{
+ u64 val, msr;
+
+ /*
+ * If private -> shared then invalidate the page before requesting the
+ * state change in the RMP table.
+ */
+ if (op == SNP_PAGE_STATE_SHARED)
+ pvalidate_4k_page(vaddr, paddr, false);
+
+ /* Save the current GHCB MSR value */
+ msr = sev_es_rd_ghcb_msr();
+
+ /* Issue VMGEXIT to change the page state in RMP table. */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ /* Read the response of the VMGEXIT. */
+ val = sev_es_rd_ghcb_msr();
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+
+ /* Restore the GHCB MSR value */
+ sev_es_wr_ghcb_msr(msr);
+
+ /*
+ * Now that page state is changed in the RMP table, validate it so that it is
+ * consistent with the RMP entry.
+ */
+ if (op == SNP_PAGE_STATE_PRIVATE)
+ pvalidate_4k_page(vaddr, paddr, true);
+}
+
/*
* Maintain the GPA of the SVSM Calling Area (CA) in order to utilize the SVSM
* services needed when not running in VMPL0.
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index f3e247d205b7..b4e2cb7bc44a 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -135,7 +135,6 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr,
unsigned long npages, enum psc_op op)
{
unsigned long paddr_end;
- u64 val;
vaddr = vaddr & PAGE_MASK;
@@ -143,37 +142,11 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr,
paddr_end = paddr + (npages << PAGE_SHIFT);
while (paddr < paddr_end) {
- /* Page validation must be rescinded before changing to shared */
- if (op == SNP_PAGE_STATE_SHARED)
- pvalidate_4k_page(vaddr, paddr, false);
-
- /*
- * Use the MSR protocol because this function can be called before
- * the GHCB is established.
- */
- sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
- VMGEXIT();
-
- val = sev_es_rd_ghcb_msr();
-
- if (GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP)
- goto e_term;
-
- if (GHCB_MSR_PSC_RESP_VAL(val))
- goto e_term;
-
- /* Page validation must be performed after changing to private */
- if (op == SNP_PAGE_STATE_PRIVATE)
- pvalidate_4k_page(vaddr, paddr, true);
+ __page_state_change(vaddr, paddr, op);
vaddr += PAGE_SIZE;
paddr += PAGE_SIZE;
}
-
- return;
-
-e_term:
- sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
}
void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 08/22] x86/sev: Pass SVSM calling area down to early page state change API
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (6 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 07/22] x86/sev: Share implementation of MSR-based page state change Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 09/22] x86/sev: Use boot SVSM CA for all startup and init code Ard Biesheuvel
` (14 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
The early page state change API is mostly used very early, when only
the boot-time SVSM calling area is in use. However, this API is also
called by the kexec finishing code, which runs very late, and
potentially from a different CPU (which uses a different calling area).
To avoid pulling the per-CPU SVSM calling area pointers and related SEV
state into the startup code, refactor the page state change API so the
SVSM calling area virtual and physical addresses can be provided by the
caller.
No functional change intended.
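For illustration (both forms below are taken from the hunks that
follow, not new API), the two extremes of the refactored interface
look like this:

	/* Decompressor (1:1 mapping): VA == PA, use the boot SVSM CA */
	__page_state_change(paddr, paddr, SNP_PAGE_STATE_PRIVATE,
			    (struct svsm_ca *)boot_svsm_caa_pa,
			    boot_svsm_caa_pa);

	/* Runtime (e.g., the kexec path): pass the per-CPU CA explicitly */
	early_set_pages_state(vaddr, paddr, npages, op,
			      svsm_get_caa(), svsm_get_caa_pa());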
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 12 +++++++++---
arch/x86/boot/startup/sev-shared.c | 17 +++++++++--------
arch/x86/boot/startup/sev-startup.c | 11 +++++++----
arch/x86/coco/sev/core.c | 3 ++-
arch/x86/include/asm/sev-internal.h | 3 ++-
5 files changed, 29 insertions(+), 17 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index f714235d3222..18b0ccf517eb 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -65,7 +65,9 @@ void snp_set_page_private(unsigned long paddr)
if (!sev_snp_enabled())
return;
- __page_state_change(paddr, paddr, SNP_PAGE_STATE_PRIVATE);
+ __page_state_change(paddr, paddr, SNP_PAGE_STATE_PRIVATE,
+ (struct svsm_ca *)boot_svsm_caa_pa,
+ boot_svsm_caa_pa);
}
void snp_set_page_shared(unsigned long paddr)
@@ -73,7 +75,9 @@ void snp_set_page_shared(unsigned long paddr)
if (!sev_snp_enabled())
return;
- __page_state_change(paddr, paddr, SNP_PAGE_STATE_SHARED);
+ __page_state_change(paddr, paddr, SNP_PAGE_STATE_SHARED,
+ (struct svsm_ca *)boot_svsm_caa_pa,
+ boot_svsm_caa_pa);
}
bool early_setup_ghcb(void)
@@ -99,7 +103,9 @@ bool early_setup_ghcb(void)
void snp_accept_memory(phys_addr_t start, phys_addr_t end)
{
for (phys_addr_t pa = start; pa < end; pa += PAGE_SIZE)
- __page_state_change(pa, pa, SNP_PAGE_STATE_PRIVATE);
+ __page_state_change(pa, pa, SNP_PAGE_STATE_PRIVATE,
+ (struct svsm_ca *)boot_svsm_caa_pa,
+ boot_svsm_caa_pa);
}
void sev_es_shutdown_ghcb(void)
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index eb241ff1156d..83ca97df0808 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -598,7 +598,8 @@ static int __head svsm_call_msr_protocol(struct svsm_call *call)
return ret;
}
-static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
+static void __head svsm_pval_4k_page(unsigned long paddr, bool validate,
+ struct svsm_ca *caa, u64 caa_pa)
{
struct svsm_pvalidate_call *pc;
struct svsm_call call = {};
@@ -611,10 +612,10 @@ static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
*/
flags = native_local_irq_save();
- call.caa = svsm_get_caa();
+ call.caa = caa;
pc = (struct svsm_pvalidate_call *)call.caa->svsm_buffer;
- pc_pa = svsm_get_caa_pa() + offsetof(struct svsm_ca, svsm_buffer);
+ pc_pa = caa_pa + offsetof(struct svsm_ca, svsm_buffer);
pc->num_entries = 1;
pc->cur_index = 0;
@@ -639,12 +640,12 @@ static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
}
static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
- bool validate)
+ bool validate, struct svsm_ca *caa, u64 caa_pa)
{
int ret;
if (snp_vmpl) {
- svsm_pval_4k_page(paddr, validate);
+ svsm_pval_4k_page(paddr, validate, caa, caa_pa);
} else {
ret = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
if (ret)
@@ -653,7 +654,7 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
}
static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
- enum psc_op op)
+ enum psc_op op, struct svsm_ca *caa, u64 caa_pa)
{
u64 val, msr;
@@ -662,7 +663,7 @@ static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
* state change in the RMP table.
*/
if (op == SNP_PAGE_STATE_SHARED)
- pvalidate_4k_page(vaddr, paddr, false);
+ pvalidate_4k_page(vaddr, paddr, false, caa, caa_pa);
/* Save the current GHCB MSR value */
msr = sev_es_rd_ghcb_msr();
@@ -684,7 +685,7 @@ static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
* consistent with the RMP entry.
*/
if (op == SNP_PAGE_STATE_PRIVATE)
- pvalidate_4k_page(vaddr, paddr, true);
+ pvalidate_4k_page(vaddr, paddr, true, caa, caa_pa);
}
/*
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index b4e2cb7bc44a..7aabda0b378e 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -132,7 +132,8 @@ noinstr void __sev_put_ghcb(struct ghcb_state *state)
void __head
early_set_pages_state(unsigned long vaddr, unsigned long paddr,
- unsigned long npages, enum psc_op op)
+ unsigned long npages, enum psc_op op,
+ struct svsm_ca *caa, u64 caa_pa)
{
unsigned long paddr_end;
@@ -142,7 +143,7 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr,
paddr_end = paddr + (npages << PAGE_SHIFT);
while (paddr < paddr_end) {
- __page_state_change(vaddr, paddr, op);
+ __page_state_change(vaddr, paddr, op, caa, caa_pa);
vaddr += PAGE_SIZE;
paddr += PAGE_SIZE;
@@ -165,7 +166,8 @@ void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
* Ask the hypervisor to mark the memory pages as private in the RMP
* table.
*/
- early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE);
+ early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE,
+ svsm_get_caa(), svsm_get_caa_pa());
}
void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
@@ -181,7 +183,8 @@ void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
return;
/* Ask hypervisor to mark the memory pages shared in the RMP table. */
- early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
+ early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED,
+ svsm_get_caa(), svsm_get_caa_pa());
}
/*
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index bea67d017bf0..7a86a2fe494d 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -585,7 +585,8 @@ static void set_pages_state(unsigned long vaddr, unsigned long npages, int op)
/* Use the MSR protocol when a GHCB is not available. */
if (!boot_ghcb)
- return early_set_pages_state(vaddr, __pa(vaddr), npages, op);
+ return early_set_pages_state(vaddr, __pa(vaddr), npages, op,
+ svsm_get_caa(), svsm_get_caa_pa());
vaddr = vaddr & PAGE_MASK;
vaddr_end = vaddr + (npages << PAGE_SHIFT);
diff --git a/arch/x86/include/asm/sev-internal.h b/arch/x86/include/asm/sev-internal.h
index ffe4755962fe..3b72d8217827 100644
--- a/arch/x86/include/asm/sev-internal.h
+++ b/arch/x86/include/asm/sev-internal.h
@@ -55,7 +55,8 @@ DECLARE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
DECLARE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
void early_set_pages_state(unsigned long vaddr, unsigned long paddr,
- unsigned long npages, enum psc_op op);
+ unsigned long npages, enum psc_op op,
+ struct svsm_ca *ca, u64 caa_pa);
DECLARE_PER_CPU(struct svsm_ca *, svsm_caa);
DECLARE_PER_CPU(u64, svsm_caa_pa);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 09/22] x86/sev: Use boot SVSM CA for all startup and init code
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (7 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 08/22] x86/sev: Pass SVSM calling area down to early page state change API Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 10/22] x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check Ard Biesheuvel
` (13 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
To avoid having to reason about whether or not to use the per-CPU SVSM
calling area when running startup and init code on the boot CPU, reuse
the boot SVSM calling area as the per-CPU area for CPU #0.
This removes the need to make the per-CPU variables and associated state
in sev_cfg accessible to the startup code once confined.
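Concretely, as the core.c hunk below implements, CPU #0 simply reuses
the statically allocated boot CA page when the runtime per-CPU areas
are set up:

	/* CPU 0 reuses the boot CA; other CPUs get freshly allocated ones */
	caa = cpu ? memblock_alloc_or_panic(sizeof(*caa), PAGE_SIZE)
		  : &boot_svsm_ca_page;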
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 13 ------
arch/x86/boot/startup/sev-startup.c | 7 +--
arch/x86/coco/sev/core.c | 47 +++++++++-----------
arch/x86/include/asm/sev-internal.h | 16 -------
4 files changed, 24 insertions(+), 59 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 18b0ccf517eb..4bdf5595ed96 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -37,19 +37,6 @@ struct ghcb *boot_ghcb;
#define __BOOT_COMPRESSED
-extern u64 boot_svsm_caa_pa;
-
-struct svsm_ca *svsm_get_caa(void)
-{
- /* The decompressor is mapped 1:1 so VA == PA */
- return (struct svsm_ca *)boot_svsm_caa_pa;
-}
-
-u64 svsm_get_caa_pa(void)
-{
- return boot_svsm_caa_pa;
-}
-
u8 snp_vmpl;
/* Include code for early handlers */
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 7aabda0b378e..8e804369cc60 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -50,9 +50,6 @@ u64 sev_secrets_pa __ro_after_init;
/* For early boot SVSM communication */
struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE);
-DEFINE_PER_CPU(struct svsm_ca *, svsm_caa);
-DEFINE_PER_CPU(u64, svsm_caa_pa);
-
/*
* Nothing shall interrupt this code path while holding the per-CPU
* GHCB. The backup GHCB is only for NMIs interrupting this path.
@@ -167,7 +164,7 @@ void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
* table.
*/
early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE,
- svsm_get_caa(), svsm_get_caa_pa());
+ rip_rel_ptr(&boot_svsm_ca_page), boot_svsm_caa_pa);
}
void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
@@ -184,7 +181,7 @@ void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
/* Ask hypervisor to mark the memory pages shared in the RMP table. */
early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED,
- svsm_get_caa(), svsm_get_caa_pa());
+ rip_rel_ptr(&boot_svsm_ca_page), boot_svsm_caa_pa);
}
/*
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 7a86a2fe494d..4fe0928bc0ad 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -46,6 +46,25 @@
#include <asm/cmdline.h>
#include <asm/msr.h>
+DEFINE_PER_CPU(struct svsm_ca *, svsm_caa);
+DEFINE_PER_CPU(u64, svsm_caa_pa);
+
+static inline struct svsm_ca *svsm_get_caa(void)
+{
+ if (sev_cfg.use_cas)
+ return this_cpu_read(svsm_caa);
+ else
+ return rip_rel_ptr(&boot_svsm_ca_page);
+}
+
+static inline u64 svsm_get_caa_pa(void)
+{
+ if (sev_cfg.use_cas)
+ return this_cpu_read(svsm_caa_pa);
+ else
+ return boot_svsm_caa_pa;
+}
+
/* AP INIT values as documented in the APM2 section "Processor Initialization State" */
#define AP_INIT_CS_LIMIT 0xffff
#define AP_INIT_DS_LIMIT 0xffff
@@ -1287,7 +1306,8 @@ static void __init alloc_runtime_data(int cpu)
struct svsm_ca *caa;
/* Allocate the SVSM CA page if an SVSM is present */
- caa = memblock_alloc_or_panic(sizeof(*caa), PAGE_SIZE);
+ caa = cpu ? memblock_alloc_or_panic(sizeof(*caa), PAGE_SIZE)
+ : &boot_svsm_ca_page;
per_cpu(svsm_caa, cpu) = caa;
per_cpu(svsm_caa_pa, cpu) = __pa(caa);
@@ -1341,32 +1361,9 @@ void __init sev_es_init_vc_handling(void)
init_ghcb(cpu);
}
- /* If running under an SVSM, switch to the per-cpu CA */
- if (snp_vmpl) {
- struct svsm_call call = {};
- unsigned long flags;
- int ret;
-
- local_irq_save(flags);
-
- /*
- * SVSM_CORE_REMAP_CA call:
- * RAX = 0 (Protocol=0, CallID=0)
- * RCX = New CA GPA
- */
- call.caa = svsm_get_caa();
- call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
- call.rcx = this_cpu_read(svsm_caa_pa);
- ret = svsm_perform_call_protocol(&call);
- if (ret)
- panic("Can't remap the SVSM CA, ret=%d, rax_out=0x%llx\n",
- ret, call.rax_out);
-
+ if (snp_vmpl)
sev_cfg.use_cas = true;
- local_irq_restore(flags);
- }
-
sev_es_setup_play_dead();
/* Secondary CPUs use the runtime #VC handler */
diff --git a/arch/x86/include/asm/sev-internal.h b/arch/x86/include/asm/sev-internal.h
index 3b72d8217827..bdfe008120f3 100644
--- a/arch/x86/include/asm/sev-internal.h
+++ b/arch/x86/include/asm/sev-internal.h
@@ -63,22 +63,6 @@ DECLARE_PER_CPU(u64, svsm_caa_pa);
extern u64 boot_svsm_caa_pa;
-static __always_inline struct svsm_ca *svsm_get_caa(void)
-{
- if (sev_cfg.use_cas)
- return this_cpu_read(svsm_caa);
- else
- return rip_rel_ptr(&boot_svsm_ca_page);
-}
-
-static __always_inline u64 svsm_get_caa_pa(void)
-{
- if (sev_cfg.use_cas)
- return this_cpu_read(svsm_caa_pa);
- else
- return boot_svsm_caa_pa;
-}
-
enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt *ctxt);
void vc_forward_exception(struct es_em_ctxt *ctxt);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 10/22] x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (8 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 09/22] x86/sev: Use boot SVSM CA for all startup and init code Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 11/22] x86/boot: Provide PIC aliases for 5-level paging related constants Ard Biesheuvel
` (12 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
snp_vmpl is assigned a non-zero value when executing at a VMPL other
than 0, which is inferred from a call to RMPADJUST that only succeeds
when running at VMPL0.
This means that testing snp_vmpl is sufficient, and there is no need to
perform the same RMPADJUST based check again.
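The resulting logic, shown in the hunk below, boils down to:

	/*
	 * snp_vmpl == 0                 -> at VMPL0, nothing to enforce
	 * snp_vmpl  > 0 with MULTI_VMPL -> SVSM present and supported
	 * snp_vmpl  > 0 otherwise       -> unsupported, terminate
	 */
	if (snp_vmpl > 0 && !(hv_features & GHCB_HV_FT_SNP_MULTI_VMPL))
		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);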
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 20 +++-----------------
1 file changed, 3 insertions(+), 17 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 4bdf5595ed96..d62722dd2de1 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -392,30 +392,16 @@ void sev_enable(struct boot_params *bp)
*/
if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
u64 hv_features;
- int ret;
hv_features = get_hv_features();
if (!(hv_features & GHCB_HV_FT_SNP))
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
/*
- * Enforce running at VMPL0 or with an SVSM.
- *
- * Use RMPADJUST (see the rmpadjust() function for a description of
- * what the instruction does) to update the VMPL1 permissions of a
- * page. If the guest is running at VMPL0, this will succeed. If the
- * guest is running at any other VMPL, this will fail. Linux SNP guests
- * only ever run at a single VMPL level so permission mask changes of a
- * lesser-privileged VMPL are a don't-care.
+ * Running at VMPL0 is required unless an SVSM is present and
+ * the hypervisor supports the required SVSM GHCB events.
*/
- ret = rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, 1);
-
- /*
- * Running at VMPL0 is not required if an SVSM is present and the hypervisor
- * supports the required SVSM GHCB events.
- */
- if (ret &&
- !(snp_vmpl && (hv_features & GHCB_HV_FT_SNP_MULTI_VMPL)))
+ if (snp_vmpl > 0 && !(hv_features & GHCB_HV_FT_SNP_MULTI_VMPL))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
}
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 11/22] x86/boot: Provide PIC aliases for 5-level paging related constants
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (9 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 10/22] x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 12/22] x86/sev: Provide PIC aliases for SEV related data objects Ard Biesheuvel
` (11 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Provide PIC aliases for the global variables related to 5-level paging,
so that the startup code can access them in order to populate the kernel
page tables.
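Conceptually (an illustrative sketch only; the actual definition of
SYM_PIC_ALIAS() lives elsewhere), each alias exposes the variable
under a __pi_ prefixed name so that the prefixed startup objects can
link against it:

	/* Rough C equivalent of SYM_PIC_ALIAS(pgdir_shift) */
	extern unsigned int __pi_pgdir_shift
		__attribute__((alias("pgdir_shift")));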
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/kernel/head64.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 533fcf5636fc..1bc40d0785ee 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -52,10 +52,13 @@ SYM_PIC_ALIAS(next_early_pgt);
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
unsigned int __pgtable_l5_enabled __ro_after_init;
+SYM_PIC_ALIAS(__pgtable_l5_enabled);
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
+SYM_PIC_ALIAS(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
EXPORT_SYMBOL(ptrs_per_p4d);
+SYM_PIC_ALIAS(ptrs_per_p4d);
unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
EXPORT_SYMBOL(page_offset_base);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 12/22] x86/sev: Provide PIC aliases for SEV related data objects
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (10 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 11/22] x86/boot: Provide PIC aliases for 5-level paging related constants Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 13/22] x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object Ard Biesheuvel
` (10 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Provide PIC aliases for data objects that are shared between the SEV
startup code and the SEV code that executes later. This is needed so
that the confined startup code is permitted to access them.
This requires some of these variables to be moved into a source file
that is not part of the startup code: for symbols defined in the
startup code the PIC alias is already implied, and exporting variables
in the opposite direction is not supported.
Move ghcb_version as well, and provide a PIC alias for it too, as the
startup code references it (see the core.c hunk below).
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 3 ++
arch/x86/boot/startup/sev-shared.c | 19 -----------
arch/x86/boot/startup/sev-startup.c | 9 ------
arch/x86/coco/sev/core.c | 34 ++++++++++++++++++++
4 files changed, 37 insertions(+), 28 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index d62722dd2de1..faa6cc2f9990 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -38,6 +38,9 @@ struct ghcb *boot_ghcb;
#define __BOOT_COMPRESSED
u8 snp_vmpl;
+u16 ghcb_version;
+
+u64 boot_svsm_caa_pa;
/* Include code for early handlers */
#include "../../boot/startup/sev-shared.c"
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index 83ca97df0808..cc14daf816e8 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -18,25 +18,6 @@
#define WARN(condition, format...) (!!(condition))
#endif
-/*
- * SVSM related information:
- * During boot, the page tables are set up as identity mapped and later
- * changed to use kernel virtual addresses. Maintain separate virtual and
- * physical addresses for the CAA to allow SVSM functions to be used during
- * early boot, both with identity mapped virtual addresses and proper kernel
- * virtual addresses.
- */
-u64 boot_svsm_caa_pa __ro_after_init;
-
-/*
- * Since feature negotiation related variables are set early in the boot
- * process they must reside in the .data section so as not to be zeroed
- * out when the .bss section is later cleared.
- *
- * GHCB protocol version negotiated with the hypervisor.
- */
-u16 ghcb_version __ro_after_init;
-
/* Copy of the SNP firmware's CPUID page. */
static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 8e804369cc60..733491482cbb 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -41,15 +41,6 @@
#include <asm/cpuid/api.h>
#include <asm/cmdline.h>
-/* Bitmap of SEV features supported by the hypervisor */
-u64 sev_hv_features __ro_after_init;
-
-/* Secrets page physical address from the CC blob */
-u64 sev_secrets_pa __ro_after_init;
-
-/* For early boot SVSM communication */
-struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE);
-
/*
* Nothing shall interrupt this code path while holding the per-CPU
* GHCB. The backup GHCB is only for NMIs interrupting this path.
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 4fe0928bc0ad..be89f0a4a28f 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -46,6 +46,29 @@
#include <asm/cmdline.h>
#include <asm/msr.h>
+/* Bitmap of SEV features supported by the hypervisor */
+u64 sev_hv_features __ro_after_init;
+SYM_PIC_ALIAS(sev_hv_features);
+
+/* Secrets page physical address from the CC blob */
+u64 sev_secrets_pa __ro_after_init;
+SYM_PIC_ALIAS(sev_secrets_pa);
+
+/* For early boot SVSM communication */
+struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE);
+SYM_PIC_ALIAS(boot_svsm_ca_page);
+
+/*
+ * SVSM related information:
+ * During boot, the page tables are set up as identity mapped and later
+ * changed to use kernel virtual addresses. Maintain separate virtual and
+ * physical addresses for the CAA to allow SVSM functions to be used during
+ * early boot, both with identity mapped virtual addresses and proper kernel
+ * virtual addresses.
+ */
+u64 boot_svsm_caa_pa __ro_after_init;
+SYM_PIC_ALIAS(boot_svsm_caa_pa);
+
DEFINE_PER_CPU(struct svsm_ca *, svsm_caa);
DEFINE_PER_CPU(u64, svsm_caa_pa);
@@ -119,6 +142,17 @@ DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
*/
u8 snp_vmpl __ro_after_init;
EXPORT_SYMBOL_GPL(snp_vmpl);
+SYM_PIC_ALIAS(snp_vmpl);
+
+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+u16 ghcb_version __ro_after_init;
+SYM_PIC_ALIAS(ghcb_version);
/* For early boot hypervisor communication in SEV-ES enabled guests */
static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 13/22] x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (11 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 12/22] x86/sev: Provide PIC aliases for SEV related data objects Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 14/22] x86/sev: Export startup routines for later use Ard Biesheuvel
` (9 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Rename sev-nmi.c to noinstr.c, and move the GHCB get/put routines into
it as well. These routines are also annotated 'noinstr' and suffer from
the same problem as the NMI code, i.e., that GCC may ignore the
__no_sanitize_address__ function attribute implied by 'noinstr' and
insert KASAN instrumentation anyway.
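In other words (an illustrative note mirroring the Makefile comments
below):

	/*
	 * 'noinstr' implies __no_sanitize_address__, but GCC may fail to
	 * respect the attribute when inlining, so instrumentation has to
	 * be disabled for the whole object: KASAN_SANITIZE_noinstr.o := n
	 */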
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/sev-startup.c | 74 --------------------
arch/x86/coco/sev/Makefile | 6 +-
arch/x86/coco/sev/{sev-nmi.c => noinstr.c} | 74 ++++++++++++++++++++
3 files changed, 77 insertions(+), 77 deletions(-)
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 733491482cbb..e9238149f2cf 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -41,83 +41,9 @@
#include <asm/cpuid/api.h>
#include <asm/cmdline.h>
-/*
- * Nothing shall interrupt this code path while holding the per-CPU
- * GHCB. The backup GHCB is only for NMIs interrupting this path.
- *
- * Callers must disable local interrupts around it.
- */
-noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
-{
- struct sev_es_runtime_data *data;
- struct ghcb *ghcb;
-
- WARN_ON(!irqs_disabled());
-
- data = this_cpu_read(runtime_data);
- ghcb = &data->ghcb_page;
-
- if (unlikely(data->ghcb_active)) {
- /* GHCB is already in use - save its contents */
-
- if (unlikely(data->backup_ghcb_active)) {
- /*
- * Backup-GHCB is also already in use. There is no way
- * to continue here so just kill the machine. To make
- * panic() work, mark GHCBs inactive so that messages
- * can be printed out.
- */
- data->ghcb_active = false;
- data->backup_ghcb_active = false;
-
- instrumentation_begin();
- panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
- instrumentation_end();
- }
-
- /* Mark backup_ghcb active before writing to it */
- data->backup_ghcb_active = true;
-
- state->ghcb = &data->backup_ghcb;
-
- /* Backup GHCB content */
- *state->ghcb = *ghcb;
- } else {
- state->ghcb = NULL;
- data->ghcb_active = true;
- }
-
- return ghcb;
-}
-
/* Include code shared with pre-decompression boot stage */
#include "sev-shared.c"
-noinstr void __sev_put_ghcb(struct ghcb_state *state)
-{
- struct sev_es_runtime_data *data;
- struct ghcb *ghcb;
-
- WARN_ON(!irqs_disabled());
-
- data = this_cpu_read(runtime_data);
- ghcb = &data->ghcb_page;
-
- if (state->ghcb) {
- /* Restore GHCB from Backup */
- *ghcb = *state->ghcb;
- data->backup_ghcb_active = false;
- state->ghcb = NULL;
- } else {
- /*
- * Invalidate the GHCB so a VMGEXIT instruction issued
- * from userspace won't appear to be valid.
- */
- vc_ghcb_invalidate(ghcb);
- data->ghcb_active = false;
- }
-}
-
void __head
early_set_pages_state(unsigned long vaddr, unsigned long paddr,
unsigned long npages, enum psc_op op,
diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile
index db3255b979bd..53e964a22759 100644
--- a/arch/x86/coco/sev/Makefile
+++ b/arch/x86/coco/sev/Makefile
@@ -1,9 +1,9 @@
# SPDX-License-Identifier: GPL-2.0
-obj-y += core.o sev-nmi.o vc-handle.o
+obj-y += core.o noinstr.o vc-handle.o
# Clang 14 and older may fail to respect __no_sanitize_undefined when inlining
-UBSAN_SANITIZE_sev-nmi.o := n
+UBSAN_SANITIZE_noinstr.o := n
# GCC may fail to respect __no_sanitize_address when inlining
-KASAN_SANITIZE_sev-nmi.o := n
+KASAN_SANITIZE_noinstr.o := n
diff --git a/arch/x86/coco/sev/sev-nmi.c b/arch/x86/coco/sev/noinstr.c
similarity index 61%
rename from arch/x86/coco/sev/sev-nmi.c
rename to arch/x86/coco/sev/noinstr.c
index d8dfaddfb367..b527eafb6312 100644
--- a/arch/x86/coco/sev/sev-nmi.c
+++ b/arch/x86/coco/sev/noinstr.c
@@ -106,3 +106,77 @@ void noinstr __sev_es_nmi_complete(void)
__sev_put_ghcb(&state);
}
+
+/*
+ * Nothing shall interrupt this code path while holding the per-CPU
+ * GHCB. The backup GHCB is only for NMIs interrupting this path.
+ *
+ * Callers must disable local interrupts around it.
+ */
+noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
+{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ WARN_ON(!irqs_disabled());
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ if (unlikely(data->ghcb_active)) {
+ /* GHCB is already in use - save its contents */
+
+ if (unlikely(data->backup_ghcb_active)) {
+ /*
+ * Backup-GHCB is also already in use. There is no way
+ * to continue here so just kill the machine. To make
+ * panic() work, mark GHCBs inactive so that messages
+ * can be printed out.
+ */
+ data->ghcb_active = false;
+ data->backup_ghcb_active = false;
+
+ instrumentation_begin();
+ panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
+ instrumentation_end();
+ }
+
+ /* Mark backup_ghcb active before writing to it */
+ data->backup_ghcb_active = true;
+
+ state->ghcb = &data->backup_ghcb;
+
+ /* Backup GHCB content */
+ *state->ghcb = *ghcb;
+ } else {
+ state->ghcb = NULL;
+ data->ghcb_active = true;
+ }
+
+ return ghcb;
+}
+
+noinstr void __sev_put_ghcb(struct ghcb_state *state)
+{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ WARN_ON(!irqs_disabled());
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ if (state->ghcb) {
+ /* Restore GHCB from Backup */
+ *ghcb = *state->ghcb;
+ data->backup_ghcb_active = false;
+ state->ghcb = NULL;
+ } else {
+ /*
+ * Invalidate the GHCB so a VMGEXIT instruction issued
+ * from userspace won't appear to be valid.
+ */
+ vc_ghcb_invalidate(ghcb);
+ data->ghcb_active = false;
+ }
+}
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 14/22] x86/sev: Export startup routines for later use
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (12 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 13/22] x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations Ard Biesheuvel
` (8 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Create aliases that expose routines that are part of the startup code to
other code in the core kernel, so that they can be called later as well.
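The link-time effect (an illustrative view, using one of the symbols
from the list below) is that an unprefixed call site in the core
kernel resolves to the prefixed startup implementation:

	/*
	 * core kernel object: calls sev_es_terminate()  (undefined here)
	 * startup objects:    define __pi_sev_es_terminate()
	 * vmlinux.lds.S:      PROVIDE(sev_es_terminate =
	 *                             __pi_sev_es_terminate)
	 */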
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/exports.h | 14 ++++++++++++++
arch/x86/kernel/vmlinux.lds.S | 2 ++
2 files changed, 16 insertions(+)
diff --git a/arch/x86/boot/startup/exports.h b/arch/x86/boot/startup/exports.h
new file mode 100644
index 000000000000..01d2363dc445
--- /dev/null
+++ b/arch/x86/boot/startup/exports.h
@@ -0,0 +1,14 @@
+
+/*
+ * The symbols below are functions that are implemented by the startup code,
+ * but called at runtime by the SEV code residing in the core kernel.
+ */
+PROVIDE(early_set_pages_state = __pi_early_set_pages_state);
+PROVIDE(early_snp_set_memory_private = __pi_early_snp_set_memory_private);
+PROVIDE(early_snp_set_memory_shared = __pi_early_snp_set_memory_shared);
+PROVIDE(get_hv_features = __pi_get_hv_features);
+PROVIDE(sev_es_terminate = __pi_sev_es_terminate);
+PROVIDE(snp_cpuid = __pi_snp_cpuid);
+PROVIDE(snp_cpuid_get_table = __pi_snp_cpuid_get_table);
+PROVIDE(svsm_issue_call = __pi_svsm_issue_call);
+PROVIDE(svsm_process_result_codes = __pi_svsm_process_result_codes);
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 4fa0be732af1..5d5e3a95e1f9 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -535,3 +535,5 @@ xen_elfnote_entry_value =
xen_elfnote_phys32_entry_value =
ABSOLUTE(xen_elfnote_phys32_entry) + ABSOLUTE(pvh_start_xen - LOAD_OFFSET);
#endif
+
+#include "../boot/startup/exports.h"
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (13 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 14/22] x86/sev: Export startup routines for later use Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 9:54 ` Peter Zijlstra
2025-07-16 3:18 ` [PATCH v5 16/22] x86/boot: Check startup code " Ard Biesheuvel
` (7 subsequent siblings)
22 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
The x86 startup code must not use absolute references to code or data,
as it executes before the kernel virtual mapping is up.
Add an action to objtool to check all allocatable sections (with the
exception of __patchable_function_entries, which uses absolute
references for nebulous reasons) and raise an error if any absolute
references are found.
Note that debug sections typically contain lots of absolute references
too, but those are not allocatable so they will be ignored.
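For example (illustrative C, not taken from this patch), a statically
initialized pointer produces an absolute relocation in .data and would
now be rejected, whereas the RIP-relative references generated for
-fPIC code remain acceptable:

	static int foo;

	static int *foo_ptr = &foo;	/* R_ABS64 in .data: flagged */

	int *get_foo(void)
	{
		return &foo;		/* RIP-relative under -fPIC: fine */
	}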
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
tools/objtool/builtin-check.c | 2 ++
tools/objtool/check.c | 36 ++++++++++++++++++++
tools/objtool/include/objtool/builtin.h | 1 +
3 files changed, 39 insertions(+)
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 80239843e9f0..0f6b197cfcb0 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -87,6 +87,7 @@ static const struct option check_options[] = {
OPT_BOOLEAN('t', "static-call", &opts.static_call, "annotate static calls"),
OPT_BOOLEAN('u', "uaccess", &opts.uaccess, "validate uaccess rules for SMAP"),
OPT_BOOLEAN(0 , "cfi", &opts.cfi, "annotate kernel control flow integrity (kCFI) function preambles"),
+ OPT_BOOLEAN(0 , "noabs", &opts.noabs, "reject absolute references in allocatable sections"),
OPT_CALLBACK_OPTARG(0, "dump", NULL, NULL, "orc", "dump metadata", parse_dump),
OPT_GROUP("Options:"),
@@ -162,6 +163,7 @@ static bool opts_valid(void)
opts.hack_noinstr ||
opts.ibt ||
opts.mcount ||
+ opts.noabs ||
opts.noinstr ||
opts.orc ||
opts.retpoline ||
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d967ac001498..5d1d38404892 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -4643,6 +4643,39 @@ static void disas_warned_funcs(struct objtool_file *file)
disas_funcs(funcs);
}
+static int check_abs_references(struct objtool_file *file)
+{
+ struct section *sec;
+ struct reloc *reloc;
+ int ret = 0;
+
+ for_each_sec(file, sec) {
+ /* absolute references in non-loadable sections are fine */
+ if (!(sec->sh.sh_flags & SHF_ALLOC))
+ continue;
+
+ /* section must have an associated .rela section */
+ if (!sec->rsec)
+ continue;
+
+ /*
+ * Special case for compiler generated metadata that is not
+ * consumed until after boot.
+ */
+ if (!strcmp(sec->name, "__patchable_function_entries"))
+ continue;
+
+ for_each_reloc(sec->rsec, reloc) {
+ if (reloc_type(reloc) == R_ABS64) {
+ WARN("section %s has absolute relocation at offset 0x%lx",
+ sec->name, reloc_offset(reloc));
+ ret++;
+ }
+ }
+ }
+ return ret;
+}
+
struct insn_chunk {
void *addr;
struct insn_chunk *next;
@@ -4776,6 +4809,9 @@ int check(struct objtool_file *file)
goto out;
}
+ if (opts.noabs)
+ warnings += check_abs_references(file);
+
if (opts.orc && nr_insns) {
ret = orc_create(file);
if (ret)
diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index 6b08666fa69d..ab22673862e1 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -26,6 +26,7 @@ struct opts {
bool uaccess;
int prefix;
bool cfi;
+ bool noabs;
/* options: */
bool backtrace;
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 16/22] x86/boot: Check startup code for absence of absolute relocations
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (14 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 17/22] x86/boot: Revert "Reject absolute references in .head.text" Ard Biesheuvel
` (6 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Invoke objtool on each startup code object individually to check for the
absence of absolute relocations. This is needed because this code will
be invoked from the 1:1 mapping of memory before those absolute virtual
addresses (which are derived from the kernel virtual base address
provided to the linker and possibly shifted at boot) are mapped.
Only objects built under arch/x86/boot/startup/ have this restriction,
and once they have been incorporated into vmlinux.o, this distinction is
difficult to make. So force the invocation of objtool for each object
file individually, even if objtool is deferred to vmlinux.o for the rest
of the build. In the latter case, only pass --noabs and nothing else;
otherwise, append it to the existing objtool command line.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/Makefile | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index b514f7e81332..32737f4ab5a8 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -19,6 +19,7 @@ KCOV_INSTRUMENT := n
obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o sev-startup.o
+pi-objs := $(patsubst %.o,$(obj)/%.o,$(obj-y))
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
@@ -28,3 +29,10 @@ lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
# to be linked into the decompressor or the EFI stub but not vmlinux
#
$(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
+
+#
+# Invoke objtool for each object individually to check for absolute
+# relocations, even if other objtool actions are being deferred.
+#
+$(pi-objs): objtool-enabled = 1
+$(pi-objs): objtool-args = $(if $(delay-objtool),,$(objtool-args-y)) --noabs
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 17/22] x86/boot: Revert "Reject absolute references in .head.text"
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (15 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 16/22] x86/boot: Check startup code " Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 18/22] x86/kbuild: Incorporate boot/startup/ via Kbuild makefile Ard Biesheuvel
` (5 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
This reverts commit
faf0ed487415 ("x86/boot: Reject absolute references in .head.text")
The startup code is checked directly for the absence of absolute symbol
references, so checking the .head.text section in the relocs tool is no
longer needed.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/tools/relocs.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 5778bc498415..e5a2b9a912d1 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -740,10 +740,10 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
const char *symname)
{
- int headtext = !strcmp(sec_name(sec->shdr.sh_info), ".head.text");
unsigned r_type = ELF64_R_TYPE(rel->r_info);
ElfW(Addr) offset = rel->r_offset;
int shn_abs = (sym->st_shndx == SHN_ABS) && !is_reloc(S_REL, symname);
+
if (sym->st_shndx == SHN_UNDEF)
return 0;
@@ -783,12 +783,6 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
break;
}
- if (headtext) {
- die("Absolute reference to symbol '%s' not permitted in .head.text\n",
- symname);
- break;
- }
-
/*
* Relocation offsets for 64 bit kernels are output
* as 32 bits and sign extended back to 64 bits when
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 18/22] x86/kbuild: Incorporate boot/startup/ via Kbuild makefile
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (16 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 17/22] x86/boot: Revert "Reject absolute references in .head.text" Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 19/22] x86/boot: Create a confined code area for startup code Ard Biesheuvel
` (4 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Using core-y is not the correct way to get kbuild to descend into
arch/x86/boot/startup. For instance, building an individual object does
not work as expected when the pattern rule is local to the Makefile:
$ make arch/x86/boot/startup/map_kernel.pi.o
GEN Makefile
CALL /home/ardb/linux/scripts/checksyscalls.sh
DESCEND objtool
INSTALL libsubcmd_headers
make[3]: *** No rule to make target 'arch/x86/boot/startup/map_kernel.pi.o'. Stop.
make[2]: *** [/home/ardb/linux/scripts/Makefile.build:461: arch/x86] Error 2
make[1]: *** [/home/ardb/linux/Makefile:2011: .] Error 2
make: *** [/home/ardb/linux/Makefile:248: __sub-make] Error 2
So use obj-y from arch/x86/Kbuild instead, which makes things work as
expected.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/Kbuild | 2 ++
arch/x86/Makefile | 1 -
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild
index f7fb3d88c57b..36b985d0e7bf 100644
--- a/arch/x86/Kbuild
+++ b/arch/x86/Kbuild
@@ -3,6 +3,8 @@
# Branch profiling isn't noinstr-safe. Disable it for arch/x86/*
subdir-ccflags-$(CONFIG_TRACE_BRANCH_PROFILING) += -DDISABLE_BRANCH_PROFILING
+obj-y += boot/startup/
+
obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += coco/
obj-y += entry/
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1913d342969b..9b76e77ff7f7 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -286,7 +286,6 @@ archprepare: $(cpufeaturemasks.hdr)
###
# Kernel objects
-core-y += arch/x86/boot/startup/
libs-y += arch/x86/lib/
# drivers-y are linked after core-y
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 19/22] x86/boot: Create a confined code area for startup code
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (17 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 18/22] x86/kbuild: Incorporate boot/startup/ via Kbuild makefile Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 20/22] efistub/x86: Remap inittext read-execute when needed Ard Biesheuvel
` (3 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
In order to be able to have tight control over which code may execute
from the early 1:1 mapping of memory, but still link vmlinux as a single
executable, prefix all symbols in the startup code with __pi_, and
invoke the startup code from the outside using the __pi_ prefix.
Use objtool to check that no absolute symbol references are present in
the startup code, as these cannot be used from code running from the 1:1
mapping.
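The assumed effect on the symbol table of a startup object, before and
after the objcopy step introduced below:

	/*
	 * map_kernel.o (before)    map_kernel.pi.o (after prefixing)
	 *   T __startup_64           T __pi___startup_64
	 *   U early_dynamic_pgts     U __pi_early_dynamic_pgts
	 */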
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/Makefile | 14 ++++++++++++++
arch/x86/boot/startup/sev-shared.c | 4 +---
arch/x86/boot/startup/sme.c | 1 -
arch/x86/coco/sev/core.c | 2 +-
arch/x86/include/asm/setup.h | 1 +
arch/x86/include/asm/sev.h | 1 +
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/head_64.S | 8 ++++----
arch/x86/mm/mem_encrypt_boot.S | 6 +++---
tools/objtool/check.c | 3 ++-
10 files changed, 28 insertions(+), 14 deletions(-)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 32737f4ab5a8..e8fdf020b422 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -4,6 +4,7 @@ KBUILD_AFLAGS += -D__DISABLE_EXPORTS
KBUILD_CFLAGS += -D__DISABLE_EXPORTS -mcmodel=small -fPIC \
-Os -DDISABLE_BRANCH_PROFILING \
$(DISABLE_STACKLEAK_PLUGIN) \
+ $(DISABLE_LATENT_ENTROPY_PLUGIN) \
-fno-stack-protector -D__NO_FORTIFY \
-fno-jump-tables \
-include $(srctree)/include/linux/hidden.h
@@ -36,3 +37,16 @@ $(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
#
$(pi-objs): objtool-enabled = 1
$(pi-objs): objtool-args = $(if $(delay-objtool),,$(objtool-args-y)) --noabs
+
+#
+# Confine the startup code by prefixing all symbols with __pi_ (for position
+# independent). This ensures that startup code can only call other startup
+# code, or code that has explicitly been made accessible to it via a symbol
+# alias.
+#
+$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_
+$(obj)/%.pi.o: $(obj)/%.o FORCE
+ $(call if_changed,objcopy)
+
+targets += $(obj-y)
+obj-y := $(patsubst %.o,%.pi.o,$(obj-y))
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index cc14daf816e8..b60d546e74a7 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -11,9 +11,7 @@
#include <asm/setup_data.h>
-#ifndef __BOOT_COMPRESSED
-#define error(v) pr_err(v)
-#else
+#ifdef __BOOT_COMPRESSED
#undef WARN
#define WARN(condition, format...) (!!(condition))
#endif
diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c
index 70ea1748c0a7..eb6a758ba660 100644
--- a/arch/x86/boot/startup/sme.c
+++ b/arch/x86/boot/startup/sme.c
@@ -567,7 +567,6 @@ void __head sme_enable(struct boot_params *bp)
#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
/* Local version for startup code, which never operates on user page tables */
-__weak
pgd_t __pti_set_user_pgtbl(pgd_t *pgdp, pgd_t pgd)
{
return pgd;
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index be89f0a4a28f..fbed1651f6d8 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -272,7 +272,7 @@ static int svsm_perform_call_protocol(struct svsm_call *call)
do {
ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call)
- : svsm_perform_msr_protocol(call);
+ : __pi_svsm_perform_msr_protocol(call);
} while (ret == -EAGAIN);
if (sev_cfg.ghcbs_initialized)
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 692af46603a1..914eb32581c7 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -53,6 +53,7 @@ extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long p2v_offset, struct boot_params *bp);
extern void startup_64_setup_gdt_idt(void);
extern void startup_64_load_idt(void *vc_handler);
+extern void __pi_startup_64_load_idt(void *vc_handler);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f3acbfcdca9a..2d61b13e1810 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -551,6 +551,7 @@ struct cpuid_leaf {
};
int svsm_perform_msr_protocol(struct svsm_call *call);
+int __pi_svsm_perform_msr_protocol(struct svsm_call *call);
int snp_cpuid(void (*cpuid_hv)(void *ctx, struct cpuid_leaf *),
void *ctx, struct cpuid_leaf *leaf);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 1bc40d0785ee..fd28b53dbac5 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -319,5 +319,5 @@ void early_setup_idt(void)
handler = vc_boot_ghcb;
}
- startup_64_load_idt(handler);
+ __pi_startup_64_load_idt(handler);
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 3e9b3a3bd039..d219963ecb60 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -71,7 +71,7 @@ SYM_CODE_START_NOALIGN(startup_64)
xorl %edx, %edx
wrmsr
- call startup_64_setup_gdt_idt
+ call __pi_startup_64_setup_gdt_idt
/* Now switch to __KERNEL_CS so IRET works reliably */
pushq $__KERNEL_CS
@@ -91,7 +91,7 @@ SYM_CODE_START_NOALIGN(startup_64)
* subsequent code. Pass the boot_params pointer as the first argument.
*/
movq %r15, %rdi
- call sme_enable
+ call __pi_sme_enable
#endif
/* Sanitize CPU configuration */
@@ -111,7 +111,7 @@ SYM_CODE_START_NOALIGN(startup_64)
* programmed into CR3.
*/
movq %r15, %rsi
- call __startup_64
+ call __pi___startup_64
/* Form the CR3 value being sure to include the CR3 modifier */
leaq early_top_pgt(%rip), %rcx
@@ -562,7 +562,7 @@ SYM_CODE_START_NOALIGN(vc_no_ghcb)
/* Call C handler */
movq %rsp, %rdi
movq ORIG_RAX(%rsp), %rsi
- call do_vc_no_ghcb
+ call __pi_do_vc_no_ghcb
/* Unwind pt_regs */
POP_REGS
diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
index f8a33b25ae86..edbf9c998848 100644
--- a/arch/x86/mm/mem_encrypt_boot.S
+++ b/arch/x86/mm/mem_encrypt_boot.S
@@ -16,7 +16,7 @@
.text
.code64
-SYM_FUNC_START(sme_encrypt_execute)
+SYM_FUNC_START(__pi_sme_encrypt_execute)
/*
* Entry parameters:
@@ -69,9 +69,9 @@ SYM_FUNC_START(sme_encrypt_execute)
ANNOTATE_UNRET_SAFE
ret
int3
-SYM_FUNC_END(sme_encrypt_execute)
+SYM_FUNC_END(__pi_sme_encrypt_execute)
-SYM_FUNC_START(__enc_copy)
+SYM_FUNC_START_LOCAL(__enc_copy)
ANNOTATE_NOENDBR
/*
* Routine used to encrypt memory in place.
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5d1d38404892..f43bd598d928 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3563,7 +3563,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
/* Ignore KCFI type preambles, which always fall through */
if (!strncmp(func->name, "__cfi_", 6) ||
- !strncmp(func->name, "__pfx_", 6))
+ !strncmp(func->name, "__pfx_", 6) ||
+ !strncmp(func->name, "__pi___pfx_", 11))
return 0;
if (file->ignore_unreachables)
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 20/22] efistub/x86: Remap inittext read-execute when needed
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (18 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 19/22] x86/boot: Create a confined code area for startup code Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 21/22] x86/boot: Move startup code out of __head section Ard Biesheuvel
` (2 subsequent siblings)
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Recent EFI x86 systems are more strict when it comes to mapping boot
images, and require that mappings are either read-write or read-execute.
Now that the boot code is being cleaned up and refactored, most of it is
being moved into .init.text [where it arguably belongs], but that
implies that, when booting on such strict EFI firmware, we need to take
care to map .init.text (and the .altinstr_aux section that follows it)
read-execute as well.
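Illustratively, the two ranges that end up mapped read-execute are:

	/*
	 * [ _text      ... __start_rodata )  kernel_text_size bytes
	 * [ _sinittext ... __inittext_end )  kernel_inittext_size bytes,
	 *                                    at kernel_inittext_offset
	 *                                    from the load address
	 */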
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/misc.c | 2 ++
arch/x86/include/asm/boot.h | 2 ++
arch/x86/kernel/vmlinux.lds.S | 2 ++
drivers/firmware/efi/libstub/x86-stub.c | 4 +++-
5 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 3a38fdcdb9bd..74657589264d 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -73,7 +73,7 @@ LDFLAGS_vmlinux += -T
hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include
-sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABbCDGRSTtVW] \(_text\|__start_rodata\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
+sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABbCDGRSTtVW] \(_text\|__start_rodata\|_sinittext\|__inittext_end\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
quiet_cmd_voffset = VOFFSET $@
cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 94b5991da001..0f41ca0e52c0 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -332,6 +332,8 @@ static size_t parse_elf(void *output)
}
const unsigned long kernel_text_size = VO___start_rodata - VO__text;
+const unsigned long kernel_inittext_offset = VO__sinittext - VO__text;
+const unsigned long kernel_inittext_size = VO___inittext_end - VO__sinittext;
const unsigned long kernel_total_size = VO__end - VO__text;
static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4);
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
index 02b23aa78955..f7b67cb73915 100644
--- a/arch/x86/include/asm/boot.h
+++ b/arch/x86/include/asm/boot.h
@@ -82,6 +82,8 @@
#ifndef __ASSEMBLER__
extern unsigned int output_len;
extern const unsigned long kernel_text_size;
+extern const unsigned long kernel_inittext_offset;
+extern const unsigned long kernel_inittext_size;
extern const unsigned long kernel_total_size;
unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr,
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 5d5e3a95e1f9..4277efb26358 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -227,6 +227,8 @@ SECTIONS
*/
.altinstr_aux : AT(ADDR(.altinstr_aux) - LOAD_OFFSET) {
*(.altinstr_aux)
+ . = ALIGN(PAGE_SIZE);
+ __inittext_end = .;
}
INIT_DATA_SECTION(16)
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index cafc90d4caaf..0d05eac7c72b 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -788,7 +788,9 @@ static efi_status_t efi_decompress_kernel(unsigned long *kernel_entry,
*kernel_entry = addr + entry;
- return efi_adjust_memory_range_protection(addr, kernel_text_size);
+ return efi_adjust_memory_range_protection(addr, kernel_text_size) ?:
+ efi_adjust_memory_range_protection(addr + kernel_inittext_offset,
+ kernel_inittext_size);
}
static void __noreturn enter_kernel(unsigned long kernel_addr,
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 21/22] x86/boot: Move startup code out of __head section
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (19 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 20/22] efistub/x86: Remap inittext read-execute when needed Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 22/22] x86/boot: Get rid of the .head.text section Ard Biesheuvel
2025-07-16 14:27 ` [PATCH v5 00/22] x86: strict separation of startup code Tom Lendacky
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
Move startup code out of the __head section, now that this section no
longer has any special significance. Move everything into .text or
.init.text as appropriate, so that startup code is not kept around
unnecessarily.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/compressed/sev.c | 3 --
arch/x86/boot/startup/gdt_idt.c | 4 +--
arch/x86/boot/startup/map_kernel.c | 4 +--
arch/x86/boot/startup/sev-shared.c | 38 ++++++++++----------
arch/x86/boot/startup/sev-startup.c | 14 ++++----
arch/x86/boot/startup/sme.c | 26 +++++++-------
arch/x86/include/asm/init.h | 6 ----
arch/x86/kernel/head_32.S | 2 +-
arch/x86/kernel/head_64.S | 2 +-
arch/x86/platform/pvh/head.S | 2 +-
10 files changed, 46 insertions(+), 55 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index faa6cc2f9990..a7af906145e8 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -32,9 +32,6 @@ struct ghcb *boot_ghcb;
#undef __init
#define __init
-#undef __head
-#define __head
-
#define __BOOT_COMPRESSED
u8 snp_vmpl;
diff --git a/arch/x86/boot/startup/gdt_idt.c b/arch/x86/boot/startup/gdt_idt.c
index a3112a69b06a..d16102abdaec 100644
--- a/arch/x86/boot/startup/gdt_idt.c
+++ b/arch/x86/boot/startup/gdt_idt.c
@@ -24,7 +24,7 @@
static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
/* This may run while still in the direct mapping */
-void __head startup_64_load_idt(void *vc_handler)
+void startup_64_load_idt(void *vc_handler)
{
struct desc_ptr desc = {
.address = (unsigned long)rip_rel_ptr(bringup_idt_table),
@@ -46,7 +46,7 @@ void __head startup_64_load_idt(void *vc_handler)
/*
* Setup boot CPU state needed before kernel switches to virtual addresses.
*/
-void __head startup_64_setup_gdt_idt(void)
+void __init startup_64_setup_gdt_idt(void)
{
struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page);
void *handler = NULL;
diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c
index 332dbe6688c4..83ba98d61572 100644
--- a/arch/x86/boot/startup/map_kernel.c
+++ b/arch/x86/boot/startup/map_kernel.c
@@ -30,7 +30,7 @@ static inline bool check_la57_support(void)
return true;
}
-static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+static unsigned long __init sme_postprocess_startup(struct boot_params *bp,
pmdval_t *pmd,
unsigned long p2v_offset)
{
@@ -84,7 +84,7 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
* the 1:1 mapping of memory. Kernel virtual addresses can be determined by
* subtracting p2v_offset from the RIP-relative address.
*/
-unsigned long __head __startup_64(unsigned long p2v_offset,
+unsigned long __init __startup_64(unsigned long p2v_offset,
struct boot_params *bp)
{
pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts);
diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index b60d546e74a7..4e36a81d8c18 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -29,7 +29,7 @@ static u32 cpuid_std_range_max __ro_after_init;
static u32 cpuid_hyp_range_max __ro_after_init;
static u32 cpuid_ext_range_max __ro_after_init;
-void __head __noreturn
+void __noreturn
sev_es_terminate(unsigned int set, unsigned int reason)
{
u64 val = GHCB_MSR_TERM_REQ;
@@ -48,7 +48,7 @@ sev_es_terminate(unsigned int set, unsigned int reason)
/*
* The hypervisor features are available from GHCB version 2 onward.
*/
-u64 get_hv_features(void)
+u64 __init get_hv_features(void)
{
u64 val;
@@ -218,7 +218,7 @@ const struct snp_cpuid_table *snp_cpuid_get_table(void)
*
* Return: XSAVE area size on success, 0 otherwise.
*/
-static u32 __head snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
+static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
{
const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
u64 xfeatures_found = 0;
@@ -254,7 +254,7 @@ static u32 __head snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
return xsave_size;
}
-static bool __head
+static bool
snp_cpuid_get_validated_func(struct cpuid_leaf *leaf)
{
const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -296,8 +296,8 @@ static void snp_cpuid_hv_msr(void *ctx, struct cpuid_leaf *leaf)
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
}
-static int __head snp_cpuid_postprocess(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *),
- void *ctx, struct cpuid_leaf *leaf)
+static int snp_cpuid_postprocess(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *),
+ void *ctx, struct cpuid_leaf *leaf)
{
struct cpuid_leaf leaf_hv = *leaf;
@@ -391,8 +391,8 @@ static int __head snp_cpuid_postprocess(void (*cpuid_fn)(void *ctx, struct cpuid
* Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value
* should be treated as fatal by caller.
*/
-int __head snp_cpuid(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *), void *ctx,
- struct cpuid_leaf *leaf)
+int snp_cpuid(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *), void *ctx,
+ struct cpuid_leaf *leaf)
{
const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -434,7 +434,7 @@ int __head snp_cpuid(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *), void *ctx
* page yet, so it only supports the MSR based communication with the
* hypervisor and only the CPUID exit-code.
*/
-void __head do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
+void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
{
unsigned int subfn = lower_bits(regs->cx, 32);
unsigned int fn = lower_bits(regs->ax, 32);
@@ -510,7 +510,7 @@ struct cc_setup_data {
* Search for a Confidential Computing blob passed in as a setup_data entry
* via the Linux Boot Protocol.
*/
-static __head
+static __init
struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
{
struct cc_setup_data *sd = NULL;
@@ -538,7 +538,7 @@ struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
* mapping needs to be updated in sync with all the changes to virtual memory
* layout and related mapping facilities throughout the boot process.
*/
-static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
+static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
{
const struct snp_cpuid_table *cpuid_table_fw, *cpuid_table;
int i;
@@ -566,7 +566,7 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
}
}
-static int __head svsm_call_msr_protocol(struct svsm_call *call)
+static int svsm_call_msr_protocol(struct svsm_call *call)
{
int ret;
@@ -577,8 +577,8 @@ static int __head svsm_call_msr_protocol(struct svsm_call *call)
return ret;
}
-static void __head svsm_pval_4k_page(unsigned long paddr, bool validate,
- struct svsm_ca *caa, u64 caa_pa)
+static void svsm_pval_4k_page(unsigned long paddr, bool validate,
+ struct svsm_ca *caa, u64 caa_pa)
{
struct svsm_pvalidate_call *pc;
struct svsm_call call = {};
@@ -618,8 +618,8 @@ static void __head svsm_pval_4k_page(unsigned long paddr, bool validate,
native_local_irq_restore(flags);
}
-static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
- bool validate, struct svsm_ca *caa, u64 caa_pa)
+static void pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
+ bool validate, struct svsm_ca *caa, u64 caa_pa)
{
int ret;
@@ -632,8 +632,8 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr,
}
}
-static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
- enum psc_op op, struct svsm_ca *caa, u64 caa_pa)
+static void __page_state_change(unsigned long vaddr, unsigned long paddr,
+ enum psc_op op, struct svsm_ca *caa, u64 caa_pa)
{
u64 val, msr;
@@ -671,7 +671,7 @@ static void __head __page_state_change(unsigned long vaddr, unsigned long paddr,
* Maintain the GPA of the SVSM Calling Area (CA) in order to utilize the SVSM
* services needed when not running in VMPL0.
*/
-static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info,
+static bool __init svsm_setup_ca(const struct cc_blob_sev_info *cc_info,
void *page)
{
struct snp_secrets_page *secrets_page;
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index e9238149f2cf..1fdf196f9fad 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -44,7 +44,7 @@
/* Include code shared with pre-decompression boot stage */
#include "sev-shared.c"
-void __head
+void __init
early_set_pages_state(unsigned long vaddr, unsigned long paddr,
unsigned long npages, enum psc_op op,
struct svsm_ca *caa, u64 caa_pa)
@@ -64,7 +64,7 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr,
}
}
-void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
unsigned long npages)
{
/*
@@ -84,7 +84,7 @@ void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
rip_rel_ptr(&boot_svsm_ca_page), boot_svsm_caa_pa);
}
-void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
unsigned long npages)
{
/*
@@ -114,7 +114,7 @@ void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
*
* Scan for the blob in that order.
*/
-static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+static struct cc_blob_sev_info *__init find_cc_blob(struct boot_params *bp)
{
struct cc_blob_sev_info *cc_info;
@@ -140,7 +140,7 @@ static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
return cc_info;
}
-static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
+static void __init svsm_setup(struct cc_blob_sev_info *cc_info)
{
struct snp_secrets_page *secrets = (void *)cc_info->secrets_phys;
struct svsm_call call = {};
@@ -181,7 +181,7 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
boot_svsm_caa_pa = pa;
}
-bool __head snp_init(struct boot_params *bp)
+bool __init snp_init(struct boot_params *bp)
{
struct cc_blob_sev_info *cc_info;
@@ -210,7 +210,7 @@ bool __head snp_init(struct boot_params *bp)
return true;
}
-void __head __noreturn snp_abort(void)
+void __init __noreturn snp_abort(void)
{
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
}
diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c
index eb6a758ba660..39e7e9d18974 100644
--- a/arch/x86/boot/startup/sme.c
+++ b/arch/x86/boot/startup/sme.c
@@ -91,7 +91,7 @@ struct sme_populate_pgd_data {
*/
static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
-static void __head sme_clear_pgd(struct sme_populate_pgd_data *ppd)
+static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
{
unsigned long pgd_start, pgd_end, pgd_size;
pgd_t *pgd_p;
@@ -106,7 +106,7 @@ static void __head sme_clear_pgd(struct sme_populate_pgd_data *ppd)
memset(pgd_p, 0, pgd_size);
}
-static pud_t __head *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
+static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
{
pgd_t *pgd;
p4d_t *p4d;
@@ -143,7 +143,7 @@ static pud_t __head *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
return pud;
}
-static void __head sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
+static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
{
pud_t *pud;
pmd_t *pmd;
@@ -159,7 +159,7 @@ static void __head sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
}
-static void __head sme_populate_pgd(struct sme_populate_pgd_data *ppd)
+static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
{
pud_t *pud;
pmd_t *pmd;
@@ -185,7 +185,7 @@ static void __head sme_populate_pgd(struct sme_populate_pgd_data *ppd)
set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
}
-static void __head __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
+static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
{
while (ppd->vaddr < ppd->vaddr_end) {
sme_populate_pgd_large(ppd);
@@ -195,7 +195,7 @@ static void __head __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
}
}
-static void __head __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
+static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
{
while (ppd->vaddr < ppd->vaddr_end) {
sme_populate_pgd(ppd);
@@ -205,7 +205,7 @@ static void __head __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
}
}
-static void __head __sme_map_range(struct sme_populate_pgd_data *ppd,
+static void __init __sme_map_range(struct sme_populate_pgd_data *ppd,
pmdval_t pmd_flags, pteval_t pte_flags)
{
unsigned long vaddr_end;
@@ -229,22 +229,22 @@ static void __head __sme_map_range(struct sme_populate_pgd_data *ppd,
__sme_map_range_pte(ppd);
}
-static void __head sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
}
-static void __head sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
}
-static void __head sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP);
}
-static unsigned long __head sme_pgtable_calc(unsigned long len)
+static unsigned long __init sme_pgtable_calc(unsigned long len)
{
unsigned long entries = 0, tables = 0;
@@ -281,7 +281,7 @@ static unsigned long __head sme_pgtable_calc(unsigned long len)
return entries + tables;
}
-void __head sme_encrypt_kernel(struct boot_params *bp)
+void __init sme_encrypt_kernel(struct boot_params *bp)
{
unsigned long workarea_start, workarea_end, workarea_len;
unsigned long execute_start, execute_end, execute_len;
@@ -485,7 +485,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp)
native_write_cr3(__native_read_cr3());
}
-void __head sme_enable(struct boot_params *bp)
+void __init sme_enable(struct boot_params *bp)
{
unsigned int eax, ebx, ecx, edx;
unsigned long feature_mask;
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 8b1b1abcef15..01ccdd168df0 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -2,12 +2,6 @@
#ifndef _ASM_X86_INIT_H
#define _ASM_X86_INIT_H
-#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
-#define __head __section(".head.text") __no_sanitize_undefined __no_stack_protector
-#else
-#define __head __section(".head.text") __no_sanitize_undefined
-#endif
-
struct x86_mapping_info {
void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
void (*free_pgt_page)(void *, void *); /* free buf for page table */
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 76743dfad6ab..437effb1ef03 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -61,7 +61,7 @@ RESERVE_BRK(pagetables, INIT_MAP_SIZE)
* any particular GDT layout, because we load our own as soon as we
* can.
*/
-__HEAD
+ __INIT
SYM_CODE_START(startup_32)
movl pa(initial_stack),%ecx
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d219963ecb60..21816b48537c 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -33,7 +33,7 @@
* because we need identity-mapped pages.
*/
- __HEAD
+ __INIT
.code64
SYM_CODE_START_NOALIGN(startup_64)
UNWIND_HINT_END_OF_STACK
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 1d78e5631bb8..344030c1a81d 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -24,7 +24,7 @@
#include <asm/nospec-branch.h>
#include <xen/interface/elfnote.h>
- __HEAD
+ __INIT
/*
* Entry point for PVH guests.
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v5 22/22] x86/boot: Get rid of the .head.text section
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (20 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 21/22] x86/boot: Move startup code out of __head section Ard Biesheuvel
@ 2025-07-16 3:18 ` Ard Biesheuvel
2025-07-16 14:27 ` [PATCH v5 00/22] x86: strict separation of startup code Tom Lendacky
22 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 3:18 UTC (permalink / raw)
To: linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Tom Lendacky, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
From: Ard Biesheuvel <ardb@kernel.org>
The .head.text section is now empty, so it can be dropped from the
linker script.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/kernel/vmlinux.lds.S | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 4277efb26358..d7af4a64c211 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -160,11 +160,6 @@ SECTIONS
} :text = 0xcccccccc
- /* bootstrapping code */
- .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
- HEAD_TEXT
- } :text = 0xcccccccc
-
/* End of text section, which should occupy whole number of pages */
_etext = .;
. = ALIGN(PAGE_SIZE);
--
2.50.0.727.gbf7dc18ff4-goog
* Re: [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations
2025-07-16 3:18 ` [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations Ard Biesheuvel
@ 2025-07-16 9:54 ` Peter Zijlstra
2025-07-16 10:26 ` Ard Biesheuvel
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-07-16 9:54 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-kernel, linux-efi, x86, Ard Biesheuvel, Borislav Petkov,
Ingo Molnar, Kevin Loughlin, Tom Lendacky, Josh Poimboeuf,
Nikunj A Dadhania
On Wed, Jul 16, 2025 at 05:18:30AM +0200, Ard Biesheuvel wrote:
> index d967ac001498..5d1d38404892 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -4643,6 +4643,39 @@ static void disas_warned_funcs(struct objtool_file *file)
> disas_funcs(funcs);
> }
>
> +static int check_abs_references(struct objtool_file *file)
> +{
> + struct section *sec;
> + struct reloc *reloc;
> + int ret = 0;
> +
> + for_each_sec(file, sec) {
> + /* absolute references in non-loadable sections are fine */
> + if (!(sec->sh.sh_flags & SHF_ALLOC))
> + continue;
> +
> + /* section must have an associated .rela section */
> + if (!sec->rsec)
> + continue;
> +
> + /*
> + * Special case for compiler generated metadata that is not
> + * consumed until after boot.
> + */
> + if (!strcmp(sec->name, "__patchable_function_entries"))
> + continue;
> +
> + for_each_reloc(sec->rsec, reloc) {
> + if (reloc_type(reloc) == R_ABS64) {
This should probably also check R_ABS32. Yes, the only current user is
x86_64, so R_ABS64 covers things, but we're getting more and more archs
using objtool. No reason this check shouldn't also work on PPC32, for
example.
> + WARN("section %s has absolute relocation at offset 0x%lx",
> + sec->name, reloc_offset(reloc));
> + ret++;
> + }
> + }
> + }
> + return ret;
> +}
* Re: [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations
2025-07-16 9:54 ` Peter Zijlstra
@ 2025-07-16 10:26 ` Ard Biesheuvel
2025-07-16 11:32 ` Peter Zijlstra
0 siblings, 1 reply; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 10:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ard Biesheuvel, linux-kernel, linux-efi, x86, Borislav Petkov,
Ingo Molnar, Kevin Loughlin, Tom Lendacky, Josh Poimboeuf,
Nikunj A Dadhania
On Wed, 16 Jul 2025 at 19:54, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jul 16, 2025 at 05:18:30AM +0200, Ard Biesheuvel wrote:
> > index d967ac001498..5d1d38404892 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -4643,6 +4643,39 @@ static void disas_warned_funcs(struct objtool_file *file)
> > disas_funcs(funcs);
> > }
> >
> > +static int check_abs_references(struct objtool_file *file)
> > +{
> > + struct section *sec;
> > + struct reloc *reloc;
> > + int ret = 0;
> > +
> > + for_each_sec(file, sec) {
> > + /* absolute references in non-loadable sections are fine */
> > + if (!(sec->sh.sh_flags & SHF_ALLOC))
> > + continue;
> > +
> > + /* section must have an associated .rela section */
> > + if (!sec->rsec)
> > + continue;
> > +
> > + /*
> > + * Special case for compiler generated metadata that is not
> > + * consumed until after boot.
> > + */
> > + if (!strcmp(sec->name, "__patchable_function_entries"))
> > + continue;
> > +
> > + for_each_reloc(sec->rsec, reloc) {
> > + if (reloc_type(reloc) == R_ABS64) {
>
> This should probably also check R_ABS32. Yes, your current only user is
> x86_64 so R_ABS64 covers things, but we're getting more and more archs
> using objtool. No reason this check shouldn't also work on PPC32 for
> example.
>
Yeah, I was unsure about this.
This check is sufficient to ensure that PIC code is not emitted with
absolute addresses in, e.g., global variables. So the R_ABS64 check here
is really a check for whether any relocations of the native pointer size
are present (no R_ABS_NATIVE abstraction exists at this point).
For robustness, we should actually check for all absolute relocations
here, including R_X86_64_32S, which is not abstracted into a R_ABSxx
type for objtool.
So perhaps this needs an arch hook where x86_64 can implement it as
bool arch_is_abs_reloc(struct reloc *reloc)
{
	switch (reloc_type(reloc)) {
	case R_X86_64_32:
	case R_X86_64_32S:
	case R_X86_64_64:
		return true;
	}
	return false;
}
and the default just compares against R_ABS32 / R_ABS64 depending on
the word size?
* Re: [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations
2025-07-16 10:26 ` Ard Biesheuvel
@ 2025-07-16 11:32 ` Peter Zijlstra
2025-07-16 20:48 ` Josh Poimboeuf
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-07-16 11:32 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Ard Biesheuvel, linux-kernel, linux-efi, x86, Borislav Petkov,
Ingo Molnar, Kevin Loughlin, Tom Lendacky, Josh Poimboeuf,
Nikunj A Dadhania
On Wed, Jul 16, 2025 at 08:26:55PM +1000, Ard Biesheuvel wrote:
> For robustness, we should actually check for all absolute relocations
> here, including R_X86_64_32S, which is not abstracted into a R_ABSxx
> type for objtool.
>
> So perhaps this needs an arch hook where x86_64 can implement it as
>
> bool arch_is_abs_reloc(reloc)
> {
> switch (reloc_type(reloc)) {
> case R_X86_64_32:
> case R_X86_64_32S:
> case R_X86_64_64:
> return true;
> }
> return false;
> }
>
> and the default just compares against R_ABS32 / R_ABS64 depending on
> the word size?
Yes, an arch hook like that makes sense. Perhaps make the signature:
bool arch_is_abs_reloc(struct elf *, struct reloc *);
Because the word size comes from elf_addr_size().
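Combining the two, a default implementation could look roughly like this
(a sketch only, assuming the per-arch R_ABS32/R_ABS64 definitions; not
something posted as part of the series):

	bool __weak arch_is_abs_reloc(struct elf *elf, struct reloc *reloc)
	{
		unsigned int size = elf_addr_size(elf);

		return reloc_type(reloc) == (size == 8 ? R_ABS64 : R_ABS32);
	}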
* Re: [PATCH v5 00/22] x86: strict separation of startup code
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
` (21 preceding siblings ...)
2025-07-16 3:18 ` [PATCH v5 22/22] x86/boot: Get rid of the .head.text section Ard Biesheuvel
@ 2025-07-16 14:27 ` Tom Lendacky
2025-07-16 22:02 ` Ard Biesheuvel
22 siblings, 1 reply; 32+ messages in thread
From: Tom Lendacky @ 2025-07-16 14:27 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Josh Poimboeuf, Peter Zijlstra, Nikunj A Dadhania
On 7/15/25 22:18, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
Hi Ard,
I tried to apply this to tip/master but ran into conflicts. What commit
is the series based on?
Thanks,
Tom
>
> This series implements a strict separation between startup code and
> ordinary code, where startup code is built in a way that tolerates being
> invoked from the initial 1:1 mapping of memory.
>
> The existing approach of emitting this code into .head.text and checking
> for absolute relocations in that section is not 100% safe, and produces
> diagnostics that are sometimes difficult to interpret. [0]
>
> Instead, rely on symbol prefixes, similar to how this is implemented for
> the EFI stub and for the startup code in the arm64 port. This ensures
> that startup code can only call other startup code, unless a special
> symbol alias is emitted that exposes a non-startup routine to the
> startup code.
>
> This is somewhat intrusive, as there are many data objects that are
> referenced both by startup code and by ordinary code, and an alias needs
> to be emitted for each of those. If startup code references anything
> that has not been made available to it explicitly, a build time link
> error will occur.
>
> This ultimately allows the .head.text section to be dropped entirely, as
> it no longer has a special significance. Instead, code that only
> executes at boot is emitted into .init.text as it should.
>
> The majority of changes is around early SEV code. The main issue is that
> its use of GHCB pages and SVSM calling areas in code that may run from
> both the 1:1 mapping and the kernel virtual mapping is problematic as it
> relies on __pa() to perform VA to PA translations, which are ambiguous
> in this context. Also, __pa() pulls in non-trivial instrumented code
> when CONFIG_DEBUG_VIRTUAL=y and so it is better to avoid VA to PA
> translations altogether in the startup code.
>
> Changes since v4:
> - Incorporate feedback from Tom, and add a couple of RBs
> - Drop patch that moved the MSR save/restore out of the early page state
> change helper - this is less efficient but likely negligible in
> practice
> - Drop patch that unified the SEV-SNP hypervisor feature check, which
> was identified by Nikunj as the one breaking SEV-SNP boot.
>
> Changes since RFT/v3:
> - Rebase onto tip/master
> - Incorporate Borislav's feedback on v3
> - Switch to objtool to check for absolute references in startup code
> - Remap inittext R-X when running on EFI implementations that require
> strict R-X/RW- separation
> - Include a kbuild fix to incorporate arch/x86/boot/startup/ in the
> right manner
> - For now, omit the LA57 changes that remove the problematic early
> 5-level paging checks. We can revisit this once there is agreement on
> the approach.
>
> Changes since RFT/v2:
> - Rebase onto tip/x86/boot and drop the patches from the previous
> revision that have been applied in the meantime.
> - Omit the pgtable_l5_enabled() changes for now, and just expose PIC
> aliases for the variables in question - this can be sorted later.
> - Don't use the boot SVSM calling area in snp_kexec_finish(), but pass
> down the correct per-CPU one to the early page state API.
> - Rename arch/x86/coco/sev/sev-noinstr.o to arch/x86/coco/sev/noinstr.o
> - Further reduce the amount of SEV code that needs to be constructed in
> a special way.
>
> Change since RFC/v1:
> - Include a major disentanglement/refactor of the SEV-SNP startup code,
> so that only code that really needs to run from the 1:1 mapping is
> included in the startup/ code
>
> - Incorporate some early notes from Ingo
>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Kevin Loughlin <kevinloughlin@google.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Nikunj A Dadhania <nikunj@amd.com>
>
> [0] https://lore.kernel.org/all/CAHk-=wj7k9nvJn6cpa3-5Ciwn2RGyE605BMkjWE4MqnvC9E92A@mail.gmail.com/
>
> Ard Biesheuvel (22):
> x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback
> x86/sev: Use MSR protocol for remapping SVSM calling area
> x86/sev: Use MSR protocol only for early SVSM PVALIDATE call
> x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL
> x86/sev: Move GHCB page based HV communication out of startup code
> x86/sev: Avoid global variable to store virtual address of SVSM area
> x86/sev: Share implementation of MSR-based page state change
> x86/sev: Pass SVSM calling area down to early page state change API
> x86/sev: Use boot SVSM CA for all startup and init code
> x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check
> x86/boot: Provide PIC aliases for 5-level paging related constants
> x86/sev: Provide PIC aliases for SEV related data objects
> x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object
> x86/sev: Export startup routines for later use
> objtool: Add action to check for absence of absolute relocations
> x86/boot: Check startup code for absence of absolute relocations
> x86/boot: Revert "Reject absolute references in .head.text"
> x86/kbuild: Incorporate boot/startup/ via Kbuild makefile
> x86/boot: Create a confined code area for startup code
> efistub/x86: Remap inittext read-execute when needed
> x86/boot: Move startup code out of __head section
> x86/boot: Get rid of the .head.text section
>
> arch/x86/Kbuild | 2 +
> arch/x86/Makefile | 1 -
> arch/x86/boot/compressed/Makefile | 2 +-
> arch/x86/boot/compressed/misc.c | 2 +
> arch/x86/boot/compressed/sev-handle-vc.c | 3 +
> arch/x86/boot/compressed/sev.c | 108 +------
> arch/x86/boot/startup/Makefile | 22 ++
> arch/x86/boot/startup/exports.h | 14 +
> arch/x86/boot/startup/gdt_idt.c | 4 +-
> arch/x86/boot/startup/map_kernel.c | 4 +-
> arch/x86/boot/startup/sev-shared.c | 317 ++++++--------------
> arch/x86/boot/startup/sev-startup.c | 196 ++----------
> arch/x86/boot/startup/sme.c | 27 +-
> arch/x86/coco/sev/Makefile | 6 +-
> arch/x86/coco/sev/core.c | 169 ++++++++---
> arch/x86/coco/sev/{sev-nmi.c => noinstr.c} | 74 +++++
> arch/x86/coco/sev/vc-handle.c | 2 +
> arch/x86/coco/sev/vc-shared.c | 143 ++++++++-
> arch/x86/include/asm/boot.h | 2 +
> arch/x86/include/asm/init.h | 6 -
> arch/x86/include/asm/setup.h | 1 +
> arch/x86/include/asm/sev-internal.h | 27 +-
> arch/x86/include/asm/sev.h | 17 +-
> arch/x86/kernel/head64.c | 5 +-
> arch/x86/kernel/head_32.S | 2 +-
> arch/x86/kernel/head_64.S | 10 +-
> arch/x86/kernel/vmlinux.lds.S | 9 +-
> arch/x86/mm/mem_encrypt_amd.c | 6 -
> arch/x86/mm/mem_encrypt_boot.S | 6 +-
> arch/x86/platform/pvh/head.S | 2 +-
> arch/x86/tools/relocs.c | 8 +-
> drivers/firmware/efi/libstub/x86-stub.c | 4 +-
> tools/objtool/builtin-check.c | 2 +
> tools/objtool/check.c | 39 ++-
> tools/objtool/include/objtool/builtin.h | 1 +
> 35 files changed, 620 insertions(+), 623 deletions(-)
> create mode 100644 arch/x86/boot/startup/exports.h
> rename arch/x86/coco/sev/{sev-nmi.c => noinstr.c} (61%)
>
* Re: [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback
2025-07-16 3:18 ` [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback Ard Biesheuvel
@ 2025-07-16 16:52 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-07-16 16:52 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Josh Poimboeuf, Peter Zijlstra, Nikunj A Dadhania
On 7/15/25 22:18, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> There are two distinct callers of snp_cpuid(): one where the MSR
> protocol is always used, and one where the GHCB page based interface is
> always used.
>
> The snp_cpuid() logic does not care about the distinction, which only
> matters at a lower level. But the fact that it supports both interfaces
> means that the GHCB page based logic is pulled into the early startup
> code where PA to VA conversions are problematic, given that it runs from
> the 1:1 mapping of memory.
>
> So keep snp_cpuid() itself in the startup code, but factor out the
> hypervisor calls via a callback, so that the GHCB page handling can be
> moved out.
>
> Code refactoring only - no functional change intended.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Just a minor comment below...
> ---
> arch/x86/boot/startup/sev-shared.c | 60 ++++----------------
> arch/x86/coco/sev/vc-shared.c | 49 +++++++++++++++-
> arch/x86/include/asm/sev.h | 3 +-
> 3 files changed, 61 insertions(+), 51 deletions(-)
>
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 89075ff19afa..2cabf617de3c 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -552,7 +552,8 @@ struct cpuid_leaf {
> u32 edx;
> };
>
> -int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf);
> +int snp_cpuid(void (*cpuid_hv)(void *ctx, struct cpuid_leaf *),
You use cpuid_fn elsewhere in the patch. I'm ok with either one, but
they should match name-wise.
Also, you should provide an argument name for the leaf pointer or no
argument name for the void pointer (throughout).
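I.e., something along these lines (illustrative only):

	int snp_cpuid(void (*cpuid_fn)(void *ctx, struct cpuid_leaf *leaf),
		      void *ctx, struct cpuid_leaf *leaf);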
Thanks,
Tom
> + void *ctx, struct cpuid_leaf *leaf);
>
> void __noreturn sev_es_terminate(unsigned int set, unsigned int reason);
> enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
* Re: [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area
2025-07-16 3:18 ` [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area Ard Biesheuvel
@ 2025-07-16 17:03 ` Tom Lendacky
2025-07-18 9:45 ` Ard Biesheuvel
0 siblings, 1 reply; 32+ messages in thread
From: Tom Lendacky @ 2025-07-16 17:03 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: linux-efi, x86, Ard Biesheuvel, Borislav Petkov, Ingo Molnar,
Kevin Loughlin, Josh Poimboeuf, Peter Zijlstra, Nikunj A Dadhania
On 7/15/25 22:18, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> As the preceding code comment already indicates, remapping the SVSM
> calling area occurs long before the GHCB page is configured, and so
> calling svsm_perform_call_protocol() is guaranteed to result in a call
> to svsm_perform_msr_protocol().
>
> So just call the latter directly. This allows most of the GHCB based API
> infrastructure to be moved out of the startup code in a subsequent
> patch.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/boot/startup/sev-shared.c | 11 +++++++++++
> arch/x86/boot/startup/sev-startup.c | 5 ++---
> 2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
> index c401d0391537..60ab09b3149d 100644
> --- a/arch/x86/boot/startup/sev-shared.c
> +++ b/arch/x86/boot/startup/sev-shared.c
> @@ -723,6 +723,17 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
> }
> }
>
> +static int __head svsm_call_msr_protocol(struct svsm_call *call)
> +{
> + int ret;
> +
> + do {
> + ret = svsm_perform_msr_protocol(call);
> + } while (ret == -EAGAIN);
> +
> + return ret;
> +}
> +
> static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
> {
> struct svsm_pvalidate_call *pc;
> diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
> index 0b7e3b950183..c30e0eed0131 100644
> --- a/arch/x86/boot/startup/sev-startup.c
> +++ b/arch/x86/boot/startup/sev-startup.c
> @@ -295,7 +295,6 @@ static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
> static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
> {
> struct svsm_call call = {};
> - int ret;
> u64 pa;
>
> /*
> @@ -325,8 +324,8 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
> call.caa = svsm_get_caa();
> call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
> call.rcx = pa;
> - ret = svsm_perform_call_protocol(&call);
> - if (ret)
> +
> + if (svsm_perform_msr_protocol(&call))
This should be svsm_call_msr_protocol().
Thanks,
Tom
> sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);
>
> boot_svsm_caa = (struct svsm_ca *)pa;
* Re: [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations
2025-07-16 11:32 ` Peter Zijlstra
@ 2025-07-16 20:48 ` Josh Poimboeuf
0 siblings, 0 replies; 32+ messages in thread
From: Josh Poimboeuf @ 2025-07-16 20:48 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ard Biesheuvel, Ard Biesheuvel, linux-kernel, linux-efi, x86,
Borislav Petkov, Ingo Molnar, Kevin Loughlin, Tom Lendacky,
Nikunj A Dadhania
On Wed, Jul 16, 2025 at 01:32:43PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 16, 2025 at 08:26:55PM +1000, Ard Biesheuvel wrote:
>
> > For robustness, we should actually check for all absolute relocations
> > here, including R_X86_64_32S, which is not abstracted into a R_ABSxx
> > type for objtool.
> >
> > So perhaps this needs an arch hook where x86_64 can implement it as
> >
> > bool arch_is_abs_reloc(reloc)
> > {
> > switch (reloc_type(reloc)) {
> > case R_X86_64_32:
> > case R_X86_64_32S:
> > case R_X86_64_64:
> > return true;
> > }
> > return false;
> > }
> >
> > and the default just compares against R_ABS32 / R_ABS64 depending on
> > the word size?
>
> Yes, an arch hook like that makes sense. Perhaps make the signature:
>
> bool arch_is_abs_reloc(struct elf *, struct reloc *);
>
> Because the word size comes from elf_addr_size().
We already have an arch_pc_relative_reloc(); please try to keep the
naming consistent.
--
Josh
* Re: [PATCH v5 00/22] x86: strict separation of startup code
2025-07-16 14:27 ` [PATCH v5 00/22] x86: strict separation of startup code Tom Lendacky
@ 2025-07-16 22:02 ` Ard Biesheuvel
0 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-16 22:02 UTC (permalink / raw)
To: Tom Lendacky
Cc: Ard Biesheuvel, linux-kernel, linux-efi, x86, Borislav Petkov,
Ingo Molnar, Kevin Loughlin, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
On Thu, 17 Jul 2025 at 00:27, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 7/15/25 22:18, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
>
> Hi Ard,
>
> I tried to apply this to tip/master but ran into conflicts. What commit
> is the series based on?
>
Apologies, it is based on the same commit as v4:
f339770f60d9c3312133cfe6a349476848d9b128
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=x86-startup-confine-v5
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area
2025-07-16 17:03 ` Tom Lendacky
@ 2025-07-18 9:45 ` Ard Biesheuvel
0 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2025-07-18 9:45 UTC (permalink / raw)
To: Tom Lendacky
Cc: Ard Biesheuvel, linux-kernel, linux-efi, x86, Borislav Petkov,
Ingo Molnar, Kevin Loughlin, Josh Poimboeuf, Peter Zijlstra,
Nikunj A Dadhania
On Thu, 17 Jul 2025 at 03:03, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 7/15/25 22:18, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > As the preceding code comment already indicates, remapping the SVSM
> > calling area occurs long before the GHCB page is configured, and so
> > calling svsm_perform_call_protocol() is guaranteed to result in a call
> > to svsm_perform_msr_protocol().
> >
> > So just call the latter directly. This allows most of the GHCB based API
> > infrastructure to be moved out of the startup code in a subsequent
> > patch.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> > Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> > ---
> > arch/x86/boot/startup/sev-shared.c | 11 +++++++++++
> > arch/x86/boot/startup/sev-startup.c | 5 ++---
> > 2 files changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
> > index c401d0391537..60ab09b3149d 100644
> > --- a/arch/x86/boot/startup/sev-shared.c
> > +++ b/arch/x86/boot/startup/sev-shared.c
> > @@ -723,6 +723,17 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
> > }
> > }
> >
> > +static int __head svsm_call_msr_protocol(struct svsm_call *call)
> > +{
> > + int ret;
> > +
> > + do {
> > + ret = svsm_perform_msr_protocol(call);
> > + } while (ret == -EAGAIN);
> > +
> > + return ret;
> > +}
> > +
> > static void __head svsm_pval_4k_page(unsigned long paddr, bool validate)
> > {
> > struct svsm_pvalidate_call *pc;
> > diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
> > index 0b7e3b950183..c30e0eed0131 100644
> > --- a/arch/x86/boot/startup/sev-startup.c
> > +++ b/arch/x86/boot/startup/sev-startup.c
> > @@ -295,7 +295,6 @@ static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
> > static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
> > {
> > struct svsm_call call = {};
> > - int ret;
> > u64 pa;
> >
> > /*
> > @@ -325,8 +324,8 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
> > call.caa = svsm_get_caa();
> > call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
> > call.rcx = pa;
> > - ret = svsm_perform_call_protocol(&call);
> > - if (ret)
> > +
> > + if (svsm_perform_msr_protocol(&call))
>
> This should be svsm_call_msr_protocol().
>
OK will fix
Thread overview: 32+ messages
2025-07-16 3:18 [PATCH v5 00/22] x86: strict separation of startup code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 01/22] x86/sev: Separate MSR and GHCB based snp_cpuid() via a callback Ard Biesheuvel
2025-07-16 16:52 ` Tom Lendacky
2025-07-16 3:18 ` [PATCH v5 02/22] x86/sev: Use MSR protocol for remapping SVSM calling area Ard Biesheuvel
2025-07-16 17:03 ` Tom Lendacky
2025-07-18 9:45 ` Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 03/22] x86/sev: Use MSR protocol only for early SVSM PVALIDATE call Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 04/22] x86/sev: Run RMPADJUST on SVSM calling area page to test VMPL Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 05/22] x86/sev: Move GHCB page based HV communication out of startup code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 06/22] x86/sev: Avoid global variable to store virtual address of SVSM area Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 07/22] x86/sev: Share implementation of MSR-based page state change Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 08/22] x86/sev: Pass SVSM calling area down to early page state change API Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 09/22] x86/sev: Use boot SVSM CA for all startup and init code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 10/22] x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 11/22] x86/boot: Provide PIC aliases for 5-level paging related constants Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 12/22] x86/sev: Provide PIC aliases for SEV related data objects Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 13/22] x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr object Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 14/22] x86/sev: Export startup routines for later use Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 15/22] objtool: Add action to check for absence of absolute relocations Ard Biesheuvel
2025-07-16 9:54 ` Peter Zijlstra
2025-07-16 10:26 ` Ard Biesheuvel
2025-07-16 11:32 ` Peter Zijlstra
2025-07-16 20:48 ` Josh Poimboeuf
2025-07-16 3:18 ` [PATCH v5 16/22] x86/boot: Check startup code " Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 17/22] x86/boot: Revert "Reject absolute references in .head.text" Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 18/22] x86/kbuild: Incorporate boot/startup/ via Kbuild makefile Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 19/22] x86/boot: Create a confined code area for startup code Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 20/22] efistub/x86: Remap inittext read-execute when needed Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 21/22] x86/boot: Move startup code out of __head section Ard Biesheuvel
2025-07-16 3:18 ` [PATCH v5 22/22] x86/boot: Get rid of the .head.text section Ard Biesheuvel
2025-07-16 14:27 ` [PATCH v5 00/22] x86: strict separation of startup code Tom Lendacky
2025-07-16 22:02 ` Ard Biesheuvel