* [RFC PATCH PoC 00/11] x86: strict separation of startup code
@ 2025-04-23 11:09 Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases Ard Biesheuvel
` (11 more replies)
0 siblings, 12 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
This is a proof-of-concept series that implements a strict separation
between startup code and ordinary code, where startup code is built in a
way that tolerates being invoked from the initial 1:1 mapping of memory.
The current approach of emitting this code into .head.text and checking
for absolute relocations in that section is not 100% safe, and produces
diagnostics that are sometimes difficult to interpret.
Instead, rely on symbol prefixes, similar to how this is implemented for
the EFI stub and for the startup code in the arm64 port. This ensures
that startup code can only call other startup code, unless a special
symbol alias is emitted that exposes a non-startup routine to the
startup code.
This is somewhat intrusive, as there are many data objects that are
referenced both by startup code and by ordinary code, and an alias needs
to be emitted for each of those.
This ultimately allows the .head.text section to be dropped entirely, as
it no longer has a special significance. Instead, code that only
executes at boot is emitted into .init.text, as it should be.
This series is presented for discussion only - defconfig should build
and run correctly, but allmodconfig will likely need the last patch
omitted.
Ard Biesheuvel (11):
x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases
x86/boot: Move early_setup_gdt() back into head64.c
x86/boot: Disregard __supported_pte_mask in __startup_64()
x86/boot: Add a bunch of PI aliases
HACK: provide __pti_set_user_pgtbl() to startup code
x86/boot: Created a confined code area for startup code
HACK: work around sev-startup.c being omitted for now
x86/boot: Move startup code out of __head section
x86/boot: Disallow absolute symbol references in startup code
x86/boot: Revert "Reject absolute references in .head.text"
x86/boot: Get rid of the .head.text section
arch/x86/boot/startup/Makefile | 26 ++++++++++++++--
arch/x86/boot/startup/gdt_idt.c | 17 ++---------
arch/x86/boot/startup/map_kernel.c | 6 ++--
arch/x86/boot/startup/sev-startup.c | 3 ++
arch/x86/boot/startup/sme.c | 31 ++++++++++++--------
arch/x86/coco/core.c | 2 ++
arch/x86/include/asm/linkage.h | 6 ++++
arch/x86/include/asm/setup.h | 2 ++
arch/x86/include/asm/sev.h | 2 +-
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/head64.c | 19 ++++++++++++
arch/x86/kernel/head_32.S | 2 +-
arch/x86/kernel/head_64.S | 16 +++++++---
arch/x86/kernel/setup.c | 1 +
arch/x86/kernel/vmlinux.lds.S | 9 +++---
arch/x86/lib/retpoline.S | 1 +
arch/x86/mm/mem_encrypt_amd.c | 2 ++
arch/x86/mm/mem_encrypt_boot.S | 6 ++--
arch/x86/mm/pgtable.c | 1 +
arch/x86/platform/pvh/head.S | 2 +-
arch/x86/tools/relocs.c | 8 +----
21 files changed, 107 insertions(+), 56 deletions(-)
base-commit: 121c335b36e02d6aefb72501186e060474fdf33c
--
2.49.0.805.g082f7c87e0-goog
^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-24 18:05 ` Ingo Molnar
2025-04-23 11:09 ` [RFC PATCH PoC 02/11] x86/boot: Move early_setup_gdt() back into head64.c Ard Biesheuvel
` (10 subsequent siblings)
11 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Startup code that may execute from the early 1:1 mapping of memory will
be confined into its own address space, and only be permitted to access
ordinary kernel symbols if this is known to be safe.
Introduce a macro helper SYM_PI_ALIAS() that emits a __pi_ prefixed alias
for a symbol, which allows startup code to access it.
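For readers unfamiliar with symbol aliases, here is a minimal userspace
sketch of what the C branch of such a helper does. It is an illustration
only and not part of the patch: DEMO_PASTE(), DEMO_PI_ALIAS() and
demo_phys_base are made-up names standing in for the kernel's __PASTE(),
the macro added in the diff, and a real kernel variable. It assumes GCC
or Clang on an ELF target.

```c
/* Userspace sketch, not kernel code. Assumes GCC or Clang on an ELF target. */
#include <assert.h>

/* Hypothetical stand-ins for the kernel's __PASTE() and SYM_PI_ALIAS(). */
#define DEMO_PASTE(a, b)	a##b
#define DEMO_PI_ALIAS(sym) \
	extern __typeof__(sym) DEMO_PASTE(__pi_, sym) __attribute__((alias(#sym)))

/* An ordinary object, defined once under its normal name. */
unsigned long demo_phys_base = 0x1000000UL;

/* Emit a second name, __pi_demo_phys_base, for the same object. */
DEMO_PI_ALIAS(demo_phys_base);

/* Returns 1 if both names refer to the same storage. */
int demo_alias_shares_storage(void)
{
	__pi_demo_phys_base = 0x2000000UL;	/* write via the alias */
	return &__pi_demo_phys_base == &demo_phys_base &&
	       demo_phys_base == 0x2000000UL;	/* read via the original */
}

/* Call this from main() to exercise the sketch. */
void demo_alias_selfcheck(void)
{
	assert(demo_alias_shares_storage());
}
```

The alias creates no second copy of the data: both names land on the
same address in the symbol table, so code that may only reference
__pi_ prefixed names still reaches the original object.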
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/include/asm/linkage.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
index b51d8a4673f5..ad59ff384f72 100644
--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -141,5 +141,11 @@
#define SYM_FUNC_START_WEAK_NOALIGN(name) \
SYM_START(name, SYM_L_WEAK, SYM_A_NONE)
+#ifdef __ASSEMBLER__
+#define SYM_PI_ALIAS(sym) SYM_ALIAS(__pi_ ## sym, sym, SYM_L_GLOBAL)
+#else
+#define SYM_PI_ALIAS(sym) extern typeof(sym) __PASTE(__pi_, sym) __alias(sym)
+#endif
+
#endif /* _ASM_X86_LINKAGE_H */
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 02/11] x86/boot: Move early_setup_gdt() back into head64.c
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 03/11] x86/boot: Disregard __supported_pte_mask in __startup_64() Ard Biesheuvel
` (9 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Move early_setup_idt() out of the startup code that is callable from the
1:1 mapping. It does not need to live there; instead, it is better to
expose startup_64_load_idt(), the helper that does reside in __head,
directly. This reduces the amount of code that needs special checks for
1:1 execution suitability.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/gdt_idt.c | 15 +--------------
arch/x86/include/asm/setup.h | 1 +
arch/x86/kernel/head64.c | 12 ++++++++++++
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/boot/startup/gdt_idt.c b/arch/x86/boot/startup/gdt_idt.c
index 7e34d0b426b1..a3112a69b06a 100644
--- a/arch/x86/boot/startup/gdt_idt.c
+++ b/arch/x86/boot/startup/gdt_idt.c
@@ -24,7 +24,7 @@
static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
/* This may run while still in the direct mapping */
-static void __head startup_64_load_idt(void *vc_handler)
+void __head startup_64_load_idt(void *vc_handler)
{
struct desc_ptr desc = {
.address = (unsigned long)rip_rel_ptr(bringup_idt_table),
@@ -43,19 +43,6 @@ static void __head startup_64_load_idt(void *vc_handler)
native_load_idt(&desc);
}
-/* This is used when running on kernel addresses */
-void early_setup_idt(void)
-{
- void *handler = NULL;
-
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
- setup_ghcb();
- handler = vc_boot_ghcb;
- }
-
- startup_64_load_idt(handler);
-}
-
/*
* Setup boot CPU state needed before kernel switches to virtual addresses.
*/
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index ad9212df0ec0..6324f4c6c545 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -52,6 +52,7 @@ extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long p2v_offset, struct boot_params *bp);
extern void startup_64_setup_gdt_idt(void);
+extern void startup_64_load_idt(void *vc_handler);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 6b68a206fa7f..29226f3ac064 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -303,3 +303,15 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data)
start_kernel();
}
+
+void early_setup_idt(void)
+{
+ void *handler = NULL;
+
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ setup_ghcb();
+ handler = vc_boot_ghcb;
+ }
+
+ startup_64_load_idt(handler);
+}
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 03/11] x86/boot: Disregard __supported_pte_mask in __startup_64()
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 02/11] x86/boot: Move early_setup_gdt() back into head64.c Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 04/11] x86/boot: Add a bunch of PI aliases Ard Biesheuvel
` (8 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
__supported_pte_mask is statically initialized to U64_MAX and is not
modified until long after the startup code that creates the initial page
tables has run. So applying the mask there is a no-op, and can be
dropped.
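The reasoning rests on a simple identity: AND-ing with an all-ones mask
changes nothing. A small userspace sketch, with a hypothetical helper
demo_apply_mask() and an arbitrary example entry value (neither is taken
from the kernel), makes this concrete:

```c
/* Userspace sketch, not kernel code. */
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring "pmd_entry &= mask" from the removed lines. */
uint64_t demo_apply_mask(uint64_t entry, uint64_t mask)
{
	return entry & mask;
}

/* Call this from main() to exercise the sketch. */
void demo_mask_selfcheck(void)
{
	/* Arbitrary example value with the top bit and some low flag bits set. */
	const uint64_t entry = 0x80000000000001e3ULL;

	/* An all-ones mask leaves every bit untouched. */
	assert(demo_apply_mask(entry, UINT64_MAX) == entry);

	/* Only a mask with a cleared bit changes the value. */
	assert(demo_apply_mask(entry, ~(1ULL << 63)) == 0x1e3ULL);
}
```

The second assertion only shows what a narrowed mask would do; the
claim above is that no such narrowing has happened yet by the time
__startup_64() runs.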
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/map_kernel.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c
index 0eac3f17dbd3..099ae2559336 100644
--- a/arch/x86/boot/startup/map_kernel.c
+++ b/arch/x86/boot/startup/map_kernel.c
@@ -179,8 +179,6 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
pud[(i + 1) % PTRS_PER_PUD] = (pudval_t)pmd + pgtable_flags;
pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
- /* Filter out unsupported __PAGE_KERNEL_* bits: */
- pmd_entry &= __supported_pte_mask;
pmd_entry += sme_get_me_mask();
pmd_entry += physaddr;
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 04/11] x86/boot: Add a bunch of PI aliases
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (2 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 03/11] x86/boot: Disregard __supported_pte_mask in __startup_64() Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 05/11] HACK: provide __pti_set_user_pgtbl() to startup code Ard Biesheuvel
` (7 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Add aliases for all the data objects that the startup code references -
this is needed so that this code can be moved into its own confined area
where it can only access symbols that have a __pi_ prefix.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/coco/core.c | 2 ++
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/head64.c | 7 +++++++
arch/x86/kernel/head_64.S | 8 ++++++++
arch/x86/kernel/setup.c | 1 +
arch/x86/kernel/vmlinux.lds.S | 4 ++++
arch/x86/lib/retpoline.S | 1 +
arch/x86/mm/mem_encrypt_amd.c | 2 ++
arch/x86/mm/pgtable.c | 1 +
9 files changed, 27 insertions(+)
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 9a0ddda3aa69..303360508a71 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -18,7 +18,9 @@
#include <asm/processor.h>
enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
+SYM_PI_ALIAS(cc_vendor);
u64 cc_mask __ro_after_init;
+SYM_PI_ALIAS(cc_mask);
static struct cc_attr_flags {
__u64 host_sev_snp : 1,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 12126adbc3a9..8fe2e9859c4b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -242,6 +242,7 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
#endif
} };
EXPORT_PER_CPU_SYMBOL_GPL(gdt_page);
+SYM_PI_ALIAS(gdt_page);
#ifdef CONFIG_X86_64
static int __init x86_nopcid_setup(char *s)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 29226f3ac064..b251186a819e 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -48,23 +48,30 @@
*/
extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
unsigned int __initdata next_early_pgt;
+SYM_PI_ALIAS(next_early_pgt);
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled __ro_after_init;
+SYM_PI_ALIAS(__pgtable_l5_enabled);
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
+SYM_PI_ALIAS(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
EXPORT_SYMBOL(ptrs_per_p4d);
+SYM_PI_ALIAS(ptrs_per_p4d);
#endif
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
EXPORT_SYMBOL(page_offset_base);
+SYM_PI_ALIAS(page_offset_base);
unsigned long vmalloc_base __ro_after_init = __VMALLOC_BASE_L4;
EXPORT_SYMBOL(vmalloc_base);
+SYM_PI_ALIAS(vmalloc_base);
unsigned long vmemmap_base __ro_after_init = __VMEMMAP_BASE_L4;
EXPORT_SYMBOL(vmemmap_base);
+SYM_PI_ALIAS(vmemmap_base);
#endif
/* Wipe all early page tables except for the kernel symbol map */
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index fefe2a25cf02..0c0d38ebf70b 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -573,6 +573,7 @@ SYM_CODE_START_NOALIGN(vc_no_ghcb)
/* Pure iret required here - don't use INTERRUPT_RETURN */
iretq
SYM_CODE_END(vc_no_ghcb)
+SYM_PI_ALIAS(vc_no_ghcb);
#endif
#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
@@ -604,10 +605,12 @@ SYM_DATA_START_PTI_ALIGNED(early_top_pgt)
.quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
.fill PTI_USER_PGD_FILL,8,0
SYM_DATA_END(early_top_pgt)
+SYM_PI_ALIAS(early_top_pgt)
SYM_DATA_START_PAGE_ALIGNED(early_dynamic_pgts)
.fill 512*EARLY_DYNAMIC_PAGE_TABLES,8,0
SYM_DATA_END(early_dynamic_pgts)
+SYM_PI_ALIAS(early_dynamic_pgts);
SYM_DATA(early_recursion_flag, .long 0)
@@ -651,6 +654,7 @@ SYM_DATA_START_PAGE_ALIGNED(level4_kernel_pgt)
.fill 511,8,0
.quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
SYM_DATA_END(level4_kernel_pgt)
+SYM_PI_ALIAS(level4_kernel_pgt)
#endif
SYM_DATA_START_PAGE_ALIGNED(level3_kernel_pgt)
@@ -659,6 +663,7 @@ SYM_DATA_START_PAGE_ALIGNED(level3_kernel_pgt)
.quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
.quad level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
SYM_DATA_END(level3_kernel_pgt)
+SYM_PI_ALIAS(level3_kernel_pgt)
SYM_DATA_START_PAGE_ALIGNED(level2_kernel_pgt)
/*
@@ -676,6 +681,7 @@ SYM_DATA_START_PAGE_ALIGNED(level2_kernel_pgt)
*/
PMDS(0, __PAGE_KERNEL_LARGE_EXEC, KERNEL_IMAGE_SIZE/PMD_SIZE)
SYM_DATA_END(level2_kernel_pgt)
+SYM_PI_ALIAS(level2_kernel_pgt)
SYM_DATA_START_PAGE_ALIGNED(level2_fixmap_pgt)
.fill (512 - 4 - FIXMAP_PMD_NUM),8,0
@@ -688,6 +694,7 @@ SYM_DATA_START_PAGE_ALIGNED(level2_fixmap_pgt)
/* 6 MB reserved space + a 2MB hole */
.fill 4,8,0
SYM_DATA_END(level2_fixmap_pgt)
+SYM_PI_ALIAS(level2_fixmap_pgt)
SYM_DATA_START_PAGE_ALIGNED(level1_fixmap_pgt)
.rept (FIXMAP_PMD_NUM)
@@ -703,6 +710,7 @@ SYM_DATA(smpboot_control, .long 0)
.align 16
/* This must match the first entry in level2_kernel_pgt */
SYM_DATA(phys_base, .quad 0x0)
+SYM_PI_ALIAS(phys_base);
EXPORT_SYMBOL(phys_base)
#include "../xen/xen-head.S"
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9d2a13b37833..ae1fdb0fc6ba 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -134,6 +134,7 @@ struct ist_info ist_info;
struct cpuinfo_x86 boot_cpu_data __read_mostly;
EXPORT_SYMBOL(boot_cpu_data);
+SYM_PI_ALIAS(boot_cpu_data);
#if !defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
__visible unsigned long mmu_cr4_features __ro_after_init;
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index ccdc45e5b759..9340c74b680d 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -79,11 +79,13 @@ const_cpu_current_top_of_stack = cpu_current_top_of_stack;
#define BSS_DECRYPTED \
. = ALIGN(PMD_SIZE); \
__start_bss_decrypted = .; \
+ __pi___start_bss_decrypted = .; \
*(.bss..decrypted); \
. = ALIGN(PAGE_SIZE); \
__start_bss_decrypted_unused = .; \
. = ALIGN(PMD_SIZE); \
__end_bss_decrypted = .; \
+ __pi___end_bss_decrypted = .; \
#else
@@ -128,6 +130,7 @@ SECTIONS
/* Text and read-only data */
.text : AT(ADDR(.text) - LOAD_OFFSET) {
_text = .;
+ __pi__text = .;
_stext = .;
ALIGN_ENTRY_TEXT_BEGIN
*(.text..__x86.rethunk_untrain)
@@ -391,6 +394,7 @@ SECTIONS
. = ALIGN(PAGE_SIZE); /* keep VO_INIT_SIZE page aligned */
_end = .;
+ __pi__end = .;
#ifdef CONFIG_AMD_MEM_ENCRYPT
/*
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index a26c43abd47d..cabec2788e70 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -394,6 +394,7 @@ SYM_CODE_START(__x86_return_thunk)
#endif
int3
SYM_CODE_END(__x86_return_thunk)
+SYM_PI_ALIAS(__x86_return_thunk)
EXPORT_SYMBOL(__x86_return_thunk)
#endif /* CONFIG_MITIGATION_RETHUNK */
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 7490ff6d83b1..9aaeda6eb83d 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -40,7 +40,9 @@
* section is later cleared.
*/
u64 sme_me_mask __section(".data") = 0;
+SYM_PI_ALIAS(sme_me_mask);
u64 sev_status __section(".data") = 0;
+SYM_PI_ALIAS(sev_status);
u64 sev_check_data __section(".data") = 0;
EXPORT_SYMBOL(sme_me_mask);
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a05fcddfc811..9e26215da18d 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -10,6 +10,7 @@
#ifdef CONFIG_DYNAMIC_PHYSICAL_MASK
phys_addr_t physical_mask __ro_after_init = (1ULL << __PHYSICAL_MASK_SHIFT) - 1;
EXPORT_SYMBOL(physical_mask);
+SYM_PI_ALIAS(physical_mask);
#endif
pgtable_t pte_alloc_one(struct mm_struct *mm)
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 05/11] HACK: provide __pti_set_user_pgtbl() to startup code
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (3 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 04/11] x86/boot: Add a bunch of PI aliases Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 06/11] x86/boot: Created a confined code area for " Ard Biesheuvel
` (6 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
The SME startup code may call out to pti_set_user_pgtbl(), which is not
part of the code corpus that is explicitly built to tolerate execution
from the 1:1 mapping of memory.
Hack around this for now by providing an alternative that just returns
the pgd.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/sme.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c
index 5738b31c8e60..d55b24cd4d08 100644
--- a/arch/x86/boot/startup/sme.c
+++ b/arch/x86/boot/startup/sme.c
@@ -564,3 +564,8 @@ void __head sme_enable(struct boot_params *bp)
cc_vendor = CC_VENDOR_AMD;
cc_set_mask(me_mask);
}
+
+pgd_t __pti_set_user_pgtbl(pgd_t *pgdp, pgd_t pgd)
+{
+ return pgd;
+}
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 06/11] x86/boot: Created a confined code area for startup code
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (4 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 05/11] HACK: provide __pti_set_user_pgtbl() to startup code Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 07/11] HACK: work around sev-startup.c being omitted for now Ard Biesheuvel
` (5 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
In order to be able to have tight control over which code may execute
from the early 1:1 mapping of memory, but still link vmlinux as a single
executable, prefix all symbol references in startup code with __pi_, and
invoke it from outside using the __pi_ prefix.
HACK: omit sev-startup.c for the time being - disentangling that is
rather challenging, and not necessary for a proof of concept
implementation.
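To illustrate why prefixing confines the startup code, here is a toy
userspace model of the link step. It is a sketch under simplifying
assumptions, not how the linker is implemented: demo_prefixed_name() and
demo_startup_ref_resolves() are hypothetical helpers, and the symbol
table is just an array of strings.

```c
/* Userspace sketch, not kernel code: a toy model of symbol resolution. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_PREFIX "__pi_"

/* Build the name a prefixed startup object would reference. 0 on success. */
int demo_prefixed_name(const char *sym, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "%s%s", DEMO_PREFIX, sym);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * A reference made by startup code resolves only if the prefixed name
 * exists among the defined symbols. Returns 1 if it resolves, else 0.
 */
int demo_startup_ref_resolves(const char *sym,
			      const char *const defined[], size_t ndefined)
{
	char name[128];
	size_t i;

	if (demo_prefixed_name(sym, name, sizeof(name)) != 0)
		return 0;

	for (i = 0; i < ndefined; i++)
		if (strcmp(defined[i], name) == 0)
			return 1;
	return 0;
}

/* Call this from main() to exercise the sketch. */
void demo_confine_selfcheck(void)
{
	/* phys_base has an alias; printk (picked arbitrarily) does not. */
	static const char *const defined[] = {
		"__pi_sme_enable", "__pi_phys_base", "phys_base", "printk",
	};

	assert(demo_startup_ref_resolves("phys_base", defined, 4) == 1);
	assert(demo_startup_ref_resolves("sme_enable", defined, 4) == 1);
	assert(demo_startup_ref_resolves("printk", defined, 4) == 0);
}
```

In this model, a startup routine that tries to reach an ordinary symbol
without an alias fails to resolve, which corresponds to a link error
rather than a runtime surprise.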
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/Makefile | 18 ++++++++++++++++--
arch/x86/include/asm/setup.h | 1 +
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/head_64.S | 6 +++---
arch/x86/mm/mem_encrypt_boot.S | 6 +++---
5 files changed, 24 insertions(+), 9 deletions(-)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index b514f7e81332..4062582144f6 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -17,8 +17,9 @@ KMSAN_SANITIZE := n
UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
-obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o
-obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o sev-startup.o
+pi-obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o
+pi-obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o #sev-startup.o
+obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev-startup.o
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
@@ -28,3 +29,16 @@ lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
# to be linked into the decompressor or the EFI stub but not vmlinux
#
$(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
+
+#
+# Confine the startup code by prefixing all symbols with __pi_ (for position
+# independent). This ensures that startup code can only call other startup
+# code, or code that has explicitly been made accessible to it via a symbol
+# alias.
+#
+$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_
+$(obj)/%.pi.o: $(obj)/%.o FORCE
+ $(call if_changed,objcopy)
+
+extra-y := $(pi-obj-y)
+obj-y += $(patsubst %.o,%.pi.o,$(pi-obj-y))
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 6324f4c6c545..895d09faaf83 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -53,6 +53,7 @@ extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long p2v_offset, struct boot_params *bp);
extern void startup_64_setup_gdt_idt(void);
extern void startup_64_load_idt(void *vc_handler);
+extern void __pi_startup_64_load_idt(void *vc_handler);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b251186a819e..8107cd68bc41 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -320,5 +320,5 @@ void early_setup_idt(void)
handler = vc_boot_ghcb;
}
- startup_64_load_idt(handler);
+ __pi_startup_64_load_idt(handler);
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 0c0d38ebf70b..e448279a0f87 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -71,7 +71,7 @@ SYM_CODE_START_NOALIGN(startup_64)
xorl %edx, %edx
wrmsr
- call startup_64_setup_gdt_idt
+ call __pi_startup_64_setup_gdt_idt
/* Now switch to __KERNEL_CS so IRET works reliably */
pushq $__KERNEL_CS
@@ -91,7 +91,7 @@ SYM_CODE_START_NOALIGN(startup_64)
* subsequent code. Pass the boot_params pointer as the first argument.
*/
movq %r15, %rdi
- call sme_enable
+ call __pi_sme_enable
#endif
/* Sanitize CPU configuration */
@@ -111,7 +111,7 @@ SYM_CODE_START_NOALIGN(startup_64)
* programmed into CR3.
*/
movq %r15, %rsi
- call __startup_64
+ call __pi___startup_64
/* Form the CR3 value being sure to include the CR3 modifier */
leaq early_top_pgt(%rip), %rcx
diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
index f8a33b25ae86..edbf9c998848 100644
--- a/arch/x86/mm/mem_encrypt_boot.S
+++ b/arch/x86/mm/mem_encrypt_boot.S
@@ -16,7 +16,7 @@
.text
.code64
-SYM_FUNC_START(sme_encrypt_execute)
+SYM_FUNC_START(__pi_sme_encrypt_execute)
/*
* Entry parameters:
@@ -69,9 +69,9 @@ SYM_FUNC_START(sme_encrypt_execute)
ANNOTATE_UNRET_SAFE
ret
int3
-SYM_FUNC_END(sme_encrypt_execute)
+SYM_FUNC_END(__pi_sme_encrypt_execute)
-SYM_FUNC_START(__enc_copy)
+SYM_FUNC_START_LOCAL(__enc_copy)
ANNOTATE_NOENDBR
/*
* Routine used to encrypt memory in place.
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 07/11] HACK: work around sev-startup.c being omitted for now
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (5 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 06/11] x86/boot: Created a confined code area for " Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 08/11] x86/boot: Move startup code out of __head section Ard Biesheuvel
` (4 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Add some PI aliases that shouldn't be needed once sev-startup.c is also
built with __pi_ aliases.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/sev-startup.c | 3 +++
arch/x86/include/asm/sev.h | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c
index 36a75c5096b0..7b9de4479c0c 100644
--- a/arch/x86/boot/startup/sev-startup.c
+++ b/arch/x86/boot/startup/sev-startup.c
@@ -562,6 +562,7 @@ void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
/* Ask hypervisor to mark the memory pages shared in the RMP table. */
early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
}
+SYM_PI_ALIAS(early_snp_set_memory_shared);
/* Writes to the SVSM CAA MSR are ignored */
static enum es_result __vc_handle_msr_caa(struct pt_regs *regs, bool write)
@@ -1383,8 +1384,10 @@ bool __head snp_init(struct boot_params *bp)
return true;
}
+SYM_PI_ALIAS(snp_init);
void __head __noreturn snp_abort(void)
{
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
}
+SYM_PI_ALIAS(snp_abort);
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index a8661dfc9a9a..9ba1f30eb03e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -494,6 +494,7 @@ void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
void snp_set_wakeup_secondary_cpu(void);
bool snp_init(struct boot_params *bp);
void __noreturn snp_abort(void);
+void __noreturn __pi_snp_abort(void);
void snp_dmi_setup(void);
int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input);
void snp_accept_memory(phys_addr_t start, phys_addr_t end);
@@ -541,7 +542,6 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npag
static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
static inline void snp_set_wakeup_secondary_cpu(void) { }
static inline bool snp_init(struct boot_params *bp) { return false; }
-static inline void snp_abort(void) { }
static inline void snp_dmi_setup(void) { }
static inline int snp_issue_svsm_attest_req(u64 call_id, struct svsm_call *call, struct svsm_attest_call *input)
{
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 08/11] x86/boot: Move startup code out of __head section
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (6 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 07/11] HACK: work around sev-startup.c being omitted for now Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 09/11] x86/boot: Disallow absolute symbol references in startup code Ard Biesheuvel
` (3 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Move startup code out of the __head section, now that this no longer has
a special significance. Move everything into .text or .init.text as
appropriate, so that startup code is not kept around unnecessarily.
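As a rough illustration of section placement, the userspace sketch
below puts a function into a named section and checks that it landed
there. It is not kernel code: DEMO_INIT, the section name
demo_init_text and both functions are hypothetical, and it assumes GCC
or Clang with a GNU-compatible linker on ELF, which provides
__start_/__stop_ symbols for sections whose names are valid C
identifiers. The kernel's __init does more than this, but placing the
function in .init.text is the part that matters here.

```c
/* Userspace sketch, not kernel code. Assumes an ELF GNU-style toolchain. */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __init: place the function in a named section. */
#define DEMO_INIT __attribute__((section("demo_init_text"), noinline, used))

/* Section bounds, provided by the linker for C-identifier section names. */
extern const char __start_demo_init_text[];
extern const char __stop_demo_init_text[];

/* A function that would only be needed during boot. */
DEMO_INIT int demo_boot_only(int x)
{
	return x + 1;
}

/* Returns 1 if addr lies inside the demo_init_text section. */
int demo_in_init_section(uintptr_t addr)
{
	return addr >= (uintptr_t)__start_demo_init_text &&
	       addr <  (uintptr_t)__stop_demo_init_text;
}

/* Call this from main() to exercise the sketch. */
void demo_section_selfcheck(void)
{
	assert(demo_boot_only(41) == 42);
	assert(demo_in_init_section((uintptr_t)demo_boot_only));
	/* An ordinary function lives elsewhere, in .text. */
	assert(!demo_in_init_section((uintptr_t)demo_in_init_section));
}
```

Grouping boot-only code into its own section is what lets the kernel
release that memory once initialization is complete, which is the
motivation given above for preferring .init.text where possible.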
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/gdt_idt.c | 4 +--
arch/x86/boot/startup/map_kernel.c | 4 +--
arch/x86/boot/startup/sme.c | 26 ++++++++++----------
arch/x86/kernel/head_32.S | 2 +-
arch/x86/kernel/head_64.S | 2 +-
arch/x86/platform/pvh/head.S | 2 +-
6 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/arch/x86/boot/startup/gdt_idt.c b/arch/x86/boot/startup/gdt_idt.c
index a3112a69b06a..d16102abdaec 100644
--- a/arch/x86/boot/startup/gdt_idt.c
+++ b/arch/x86/boot/startup/gdt_idt.c
@@ -24,7 +24,7 @@
static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
/* This may run while still in the direct mapping */
-void __head startup_64_load_idt(void *vc_handler)
+void startup_64_load_idt(void *vc_handler)
{
struct desc_ptr desc = {
.address = (unsigned long)rip_rel_ptr(bringup_idt_table),
@@ -46,7 +46,7 @@ void __head startup_64_load_idt(void *vc_handler)
/*
* Setup boot CPU state needed before kernel switches to virtual addresses.
*/
-void __head startup_64_setup_gdt_idt(void)
+void __init startup_64_setup_gdt_idt(void)
{
struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page);
void *handler = NULL;
diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c
index 099ae2559336..75b3dd62da50 100644
--- a/arch/x86/boot/startup/map_kernel.c
+++ b/arch/x86/boot/startup/map_kernel.c
@@ -36,7 +36,7 @@ static inline bool check_la57_support(void)
return true;
}
-static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+static unsigned long __init sme_postprocess_startup(struct boot_params *bp,
pmdval_t *pmd,
unsigned long p2v_offset)
{
@@ -90,7 +90,7 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
* the 1:1 mapping of memory. Kernel virtual addresses can be determined by
* subtracting p2v_offset from the RIP-relative address.
*/
-unsigned long __head __startup_64(unsigned long p2v_offset,
+unsigned long __init __startup_64(unsigned long p2v_offset,
struct boot_params *bp)
{
pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts);
diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c
index d55b24cd4d08..914016184755 100644
--- a/arch/x86/boot/startup/sme.c
+++ b/arch/x86/boot/startup/sme.c
@@ -91,7 +91,7 @@ struct sme_populate_pgd_data {
*/
static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
-static void __head sme_clear_pgd(struct sme_populate_pgd_data *ppd)
+static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
{
unsigned long pgd_start, pgd_end, pgd_size;
pgd_t *pgd_p;
@@ -106,7 +106,7 @@ static void __head sme_clear_pgd(struct sme_populate_pgd_data *ppd)
memset(pgd_p, 0, pgd_size);
}
-static pud_t __head *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
+static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
{
pgd_t *pgd;
p4d_t *p4d;
@@ -143,7 +143,7 @@ static pud_t __head *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
return pud;
}
-static void __head sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
+static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
{
pud_t *pud;
pmd_t *pmd;
@@ -159,7 +159,7 @@ static void __head sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
}
-static void __head sme_populate_pgd(struct sme_populate_pgd_data *ppd)
+static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
{
pud_t *pud;
pmd_t *pmd;
@@ -185,7 +185,7 @@ static void __head sme_populate_pgd(struct sme_populate_pgd_data *ppd)
set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
}
-static void __head __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
+static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
{
while (ppd->vaddr < ppd->vaddr_end) {
sme_populate_pgd_large(ppd);
@@ -195,7 +195,7 @@ static void __head __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
}
}
-static void __head __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
+static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
{
while (ppd->vaddr < ppd->vaddr_end) {
sme_populate_pgd(ppd);
@@ -205,7 +205,7 @@ static void __head __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
}
}
-static void __head __sme_map_range(struct sme_populate_pgd_data *ppd,
+static void __init __sme_map_range(struct sme_populate_pgd_data *ppd,
pmdval_t pmd_flags, pteval_t pte_flags)
{
unsigned long vaddr_end;
@@ -229,22 +229,22 @@ static void __head __sme_map_range(struct sme_populate_pgd_data *ppd,
__sme_map_range_pte(ppd);
}
-static void __head sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
}
-static void __head sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
}
-static void __head sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
+static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
{
__sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP);
}
-static unsigned long __head sme_pgtable_calc(unsigned long len)
+static unsigned long __init sme_pgtable_calc(unsigned long len)
{
unsigned long entries = 0, tables = 0;
@@ -281,7 +281,7 @@ static unsigned long __head sme_pgtable_calc(unsigned long len)
return entries + tables;
}
-void __head sme_encrypt_kernel(struct boot_params *bp)
+void __init sme_encrypt_kernel(struct boot_params *bp)
{
unsigned long workarea_start, workarea_end, workarea_len;
unsigned long execute_start, execute_end, execute_len;
@@ -485,7 +485,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp)
native_write_cr3(__native_read_cr3());
}
-void __head sme_enable(struct boot_params *bp)
+void __init sme_enable(struct boot_params *bp)
{
unsigned int eax, ebx, ecx, edx;
unsigned long feature_mask;
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 2e42056d2306..5962ff2a189a 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -61,7 +61,7 @@ RESERVE_BRK(pagetables, INIT_MAP_SIZE)
* any particular GDT layout, because we load our own as soon as we
* can.
*/
-__HEAD
+ __INIT
SYM_CODE_START(startup_32)
movl pa(initial_stack),%ecx
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index e448279a0f87..0cbc992c39e4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -33,7 +33,7 @@
* because we need identity-mapped pages.
*/
- __HEAD
+ __INIT
.code64
SYM_CODE_START_NOALIGN(startup_64)
UNWIND_HINT_END_OF_STACK
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index cfa18ec7d55f..16aa1f018b80 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -24,7 +24,7 @@
#include <asm/nospec-branch.h>
#include <xen/interface/elfnote.h>
- __HEAD
+ __INIT
/*
* Entry point for PVH guests.
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 09/11] x86/boot: Disallow absolute symbol references in startup code
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (7 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 08/11] x86/boot: Move startup code out of __head section Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:09 ` [RFC PATCH PoC 10/11] x86/boot: Revert "Reject absolute references in .head.text" Ard Biesheuvel
` (2 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
Check that the objects built under arch/x86/boot/startup do not contain
any absolute symbol reference. Given that the code is built with -fPIC,
such references can only be emitted using R_X86_64_64 relocations, so
checking that those are absent is sufficient.
Note that debug sections and the __patchable_function_entries section may
contain such relocations nonetheless, but these are unnecessary in the
startup code, so they can be dropped first.
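To illustrate the rule being enforced, here is a minimal sketch in C of
what "no R_X86_64_64 relocations" means for a RELA table. It is not part
of the patch, which performs the check with readelf and grep from the
Makefile; the helper name count_abs64_relocs() is hypothetical, and the
sketch assumes the glibc <elf.h> definitions are available.

```c
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: count the entries in a
 * RELA table whose relocation type is R_X86_64_64, i.e. a 64-bit
 * absolute reference. PC-relative types such as R_X86_64_PC32 and
 * R_X86_64_PLT32 are not counted.
 */
static size_t count_abs64_relocs(const Elf64_Rela *rela, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		/* The low 32 bits of r_info hold the relocation type. */
		if (ELF64_R_TYPE(rela[i].r_info) == R_X86_64_64)
			count++;
	}

	return count;
}
```

Under this sketch, an object would pass the startup code check only if
the count is zero for every relocation section that remains after the
debug and patchable function entry sections have been stripped.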
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/Makefile | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 4062582144f6..43560ab9e21a 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -36,9 +36,17 @@ $(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
# code, or code that has explicitly been made accessible to it via a symbol
# alias.
#
-$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_
+$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ --strip-debug \
+ --remove-section=.rela__patchable_function_entries
$(obj)/%.pi.o: $(obj)/%.o FORCE
- $(call if_changed,objcopy)
+ $(call if_changed,piobjcopy)
+
+quiet_cmd_piobjcopy = $(quiet_cmd_objcopy)
+ cmd_piobjcopy = $(cmd_objcopy); \
+ if $(READELF) -r $(@) | grep R_X86_64_64; then \
+ echo "$@: R_X86_64_64 references not allowed in startup code" >&2; \
+ /bin/false; \
+ fi
extra-y := $(pi-obj-y)
obj-y += $(patsubst %.o,%.pi.o,$(pi-obj-y))
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 10/11] x86/boot: Revert "Reject absolute references in .head.text"
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (8 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 09/11] x86/boot: Disallow absolute symbol references in startup code Ard Biesheuvel
@ 2025-04-23 11:09 ` Ard Biesheuvel
2025-04-23 11:10 ` [RFC PATCH PoC 11/11] x86/boot: Get rid of the .head.text section Ard Biesheuvel
2025-04-24 18:09 ` [RFC PATCH PoC 00/11] x86: strict separation of startup code Ingo Molnar
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:09 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
This reverts commit faf0ed487415f76fe4acf7980ce360901f5e1698.
The startup code is checked directly for the absence of absolute symbol
references, so checking the .head.text section in the relocs tool is no
longer needed.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/tools/relocs.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 5778bc498415..e5a2b9a912d1 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -740,10 +740,10 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
const char *symname)
{
- int headtext = !strcmp(sec_name(sec->shdr.sh_info), ".head.text");
unsigned r_type = ELF64_R_TYPE(rel->r_info);
ElfW(Addr) offset = rel->r_offset;
int shn_abs = (sym->st_shndx == SHN_ABS) && !is_reloc(S_REL, symname);
+
if (sym->st_shndx == SHN_UNDEF)
return 0;
@@ -783,12 +783,6 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
break;
}
- if (headtext) {
- die("Absolute reference to symbol '%s' not permitted in .head.text\n",
- symname);
- break;
- }
-
/*
* Relocation offsets for 64 bit kernels are output
* as 32 bits and sign extended back to 64 bits when
--
2.49.0.805.g082f7c87e0-goog
* [RFC PATCH PoC 11/11] x86/boot: Get rid of the .head.text section
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (9 preceding siblings ...)
2025-04-23 11:09 ` [RFC PATCH PoC 10/11] x86/boot: Revert "Reject absolute references in .head.text" Ard Biesheuvel
@ 2025-04-23 11:10 ` Ard Biesheuvel
2025-04-24 18:09 ` [RFC PATCH PoC 00/11] x86: strict separation of startup code Ingo Molnar
11 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-23 11:10 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, mingo, Ard Biesheuvel
From: Ard Biesheuvel <ardb@kernel.org>
The .head.text section is now empty, so it can be dropped from the
linker script.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/kernel/vmlinux.lds.S | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9340c74b680d..9c50546b11a1 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -160,11 +160,6 @@ SECTIONS
} :text = 0xcccccccc
- /* bootstrapping code */
- .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
- HEAD_TEXT
- } :text = 0xcccccccc
-
/* End of text section, which should occupy whole number of pages */
_etext = .;
. = ALIGN(PAGE_SIZE);
--
2.49.0.805.g082f7c87e0-goog
* Re: [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases
2025-04-23 11:09 ` [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases Ard Biesheuvel
@ 2025-04-24 18:05 ` Ingo Molnar
2025-04-24 18:17 ` Ard Biesheuvel
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2025-04-24 18:05 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: linux-kernel, x86, Ard Biesheuvel
* Ard Biesheuvel <ardb+git@google.com> wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Startup code that may execute from the early 1:1 mapping of memory will
> be confined into its own address space, and only be permitted to access
> ordinary kernel symbols if this is known to be safe.
>
> Introduce a macro helper PI_ALIAS() that emits a __pi_ prefixed alias
> for a symbol, which allows startup code to access it.
s/PI_ALIAS
/SYM_PI_ALIAS
What does 'PI' stand for? 'Physical memory Identity' map?
Thanks,
Ingo
* Re: [RFC PATCH PoC 00/11] x86: strict separation of startup code
2025-04-23 11:09 [RFC PATCH PoC 00/11] x86: strict separation of startup code Ard Biesheuvel
` (10 preceding siblings ...)
2025-04-23 11:10 ` [RFC PATCH PoC 11/11] x86/boot: Get rid of the .head.text section Ard Biesheuvel
@ 2025-04-24 18:09 ` Ingo Molnar
2025-04-24 18:16 ` Ard Biesheuvel
11 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2025-04-24 18:09 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: linux-kernel, x86, Ard Biesheuvel
* Ard Biesheuvel <ardb+git@google.com> wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> This is a proof-of-concept series that implements a strict separation
> between startup code and ordinary code, where startup code is built in a
> way that tolerates being invoked from the initial 1:1 mapping of memory.
>
> The current approach of emitting this code into .head.text and checking
> for absolute relocations in that section is not 100% safe, and produces
> diagnostics that are sometimes difficult to interpret.
>
> Instead, rely on symbol prefixes, similar to how this is implemented for
> the EFI stub and for the startup code in the arm64 port. This ensures
> that startup code can only call other startup code, unless a special
> symbol alias is emitted that exposes a non-startup routine to the
> startup code.
So when startup code accidentally references non-startup symbols
outside the __pi namespace, we get a build/link error, right?
> This is somewhat intrusive, as there are many data objects that are
> referenced both by startup code and by ordinary code, and an alias
> needs to be emitted for each of those.
Yeah, but this should make it ultimately safe(r): every object is
either local to the startup code, or has been 'exported' intentionally
to the startup code.
> This ultimately allows the .head.text section to be dropped entirely,
> as it no longer has a special significance. Instead, code that only
> executes at boot is emitted into .init.text as it should.
>
> This series is presented for discussion only - defconfig should build
> and run correctly, but allmodconfig will likely need the last patch
> omitted.
No fundamental objections from me.
Thanks,
Ingo
* Re: [RFC PATCH PoC 00/11] x86: strict separation of startup code
2025-04-24 18:09 ` [RFC PATCH PoC 00/11] x86: strict separation of startup code Ingo Molnar
@ 2025-04-24 18:16 ` Ard Biesheuvel
0 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-24 18:16 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Ard Biesheuvel, linux-kernel, x86
On Thu, 24 Apr 2025 at 20:09, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Ard Biesheuvel <ardb+git@google.com> wrote:
>
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > This is a proof-of-concept series that implements a strict separation
> > between startup code and ordinary code, where startup code is built in a
> > way that tolerates being invoked from the initial 1:1 mapping of memory.
> >
> > The current approach of emitting this code into .head.text and checking
> > for absolute relocations in that section is not 100% safe, and produces
> > diagnostics that are sometimes difficult to interpret.
> >
> > Instead, rely on symbol prefixes, similar to how this is implemented for
> > the EFI stub and for the startup code in the arm64 port. This ensures
> > that startup code can only call other startup code, unless a special
> > symbol alias is emitted that exposes a non-startup routine to the
> > startup code.
>
> So when startup code accidentally references non-startup symbols
> outside the __pi namespace, we get a build/link error, right?
>
Yes.
> > This is somewhat intrusive, as there are many data objects that are
> > referenced both by startup code and by ordinary code, and an alias
> > needs to be emitted for each of those.
>
> Yeah, but this should make it ultimately safe(r): every object is
> either local to the startup code, or has been 'exported' intentionally
> to the startup code.
>
Indeed.
> > This ultimately allows the .head.text section to be dropped entirely,
> > as it no longer has a special significance. Instead, code that only
> > executes at boot is emitted into .init.text as it should.
> >
> > This series is presented for discussion only - defconfig should build
> > and run correctly, but allmodconfig will likely need the last patch
> > omitted.
>
> No fundamental objections from me.
>
Good to know - thanks.
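For readers following along, the aliasing idea discussed above can be
sketched in plain C with the GCC/Clang alias attribute. This is not the
SYM_PI_ALIAS() implementation from the series; the names example_mask
and startup_read_mask() are made up for the example, and it assumes an
ELF target.

```c
/* An ordinary object, defined in non-startup code (made-up name). */
unsigned long example_mask = 0xff;

/*
 * Explicitly expose the object under a __pi_ prefixed name. Both names
 * refer to the same storage. Without such an alias, a reference to
 * __pi_example_mask would be left undefined at link time.
 */
extern unsigned long __pi_example_mask __attribute__((alias("example_mask")));

/*
 * Stand-in for startup code: once every symbol in a startup object has
 * been given the __pi_ prefix, its references can only resolve against
 * prefixed names such as this alias.
 */
static unsigned long startup_read_mask(void)
{
	return __pi_example_mask;
}
```

The point of the design is that every such alias is an explicit,
reviewable statement that the object is safe to access from the 1:1
mapping.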
* Re: [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases
2025-04-24 18:05 ` Ingo Molnar
@ 2025-04-24 18:17 ` Ard Biesheuvel
2025-04-24 18:23 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-24 18:17 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Ard Biesheuvel, linux-kernel, x86
On Thu, 24 Apr 2025 at 20:05, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Ard Biesheuvel <ardb+git@google.com> wrote:
>
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Startup code that may execute from the early 1:1 mapping of memory will
> > be confined into its own address space, and only be permitted to access
> > ordinary kernel symbols if this is known to be safe.
> >
> > Introduce a macro helper PI_ALIAS() that emits a __pi_ prefixed alias
> > for a symbol, which allows startup code to access it.
>
> s/PI_ALIAS
> /SYM_PI_ALIAS
>
> What does 'PI' stand for? 'Physical memory Identity' map?
>
'position independent' - it's what we ended up with on arm64, but I'm
not attached to it so happy to switch to something better.
* Re: [RFC PATCH PoC 01/11] x86/linkage: Add SYM_PI_ALIAS() macro helper to emit symbol aliases
2025-04-24 18:17 ` Ard Biesheuvel
@ 2025-04-24 18:23 ` Ingo Molnar
0 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2025-04-24 18:23 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: Ard Biesheuvel, linux-kernel, x86
* Ard Biesheuvel <ardb@kernel.org> wrote:
> On Thu, 24 Apr 2025 at 20:05, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Startup code that may execute from the early 1:1 mapping of memory will
> > > be confined into its own address space, and only be permitted to access
> > > ordinary kernel symbols if this is known to be safe.
> > >
> > > Introduce a macro helper PI_ALIAS() that emits a __pi_ prefixed alias
> > > for a symbol, which allows startup code to access it.
> >
> > s/PI_ALIAS
> > /SYM_PI_ALIAS
> >
> > What does 'PI' stand for? 'Physical memory Identity' map?
> >
>
> 'position independent'
/facepalm
Clearly it's getting late here :)
> - it's what we ended up with on arm64, but I'm
> not attached to it so happy to switch to something better.
Could we make it something like SYM_PIC_ALIAS() at least? Because 'PIC'
is something most people will recognize in this context. PI goes for
3.1415. ;-)
Thanks,
Ingo