* [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
@ 2025-04-08 8:52 ` Ard Biesheuvel
2025-04-09 8:15 ` Borah, Chaitanya Kumar
2025-04-09 10:21 ` [tip: x86/boot] " tip-bot2 for Ard Biesheuvel
2025-04-08 8:52 ` [PATCH v3 2/7] x86/asm: Make rip_rel_ptr() usable from fPIC code Ard Biesheuvel
` (6 subsequent siblings)
7 siblings, 2 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:52 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
The library code built under arch/x86/boot/startup is not intended to be
linked into vmlinux but only into the decompressor and/or the EFI stub.
This means objtool validation is not needed here, and may result in
false positive errors for things like missing retpolines.
So disable it for all objects added to lib-y.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
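For illustration: OBJECT_FILES_NON_STANDARD is the per-object switch that
Kbuild's objtool rules consult, and the patsubst rule added below simply
sets it for every object that ends up in lib-y. On a hypothetical build
with CONFIG_X86_64=y and CONFIG_EFI_MIXED=y it expands to target-specific
variable assignments along these lines:

    # hypothetical expansion, with lib-y = la57toggle.o efi-mixed.o
    $(obj)/la57toggle.o: OBJECT_FILES_NON_STANDARD := y
    $(obj)/efi-mixed.o: OBJECT_FILES_NON_STANDARD := y

so objtool is skipped for exactly these objects while the rest of the
tree is still validated.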
arch/x86/boot/startup/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 73946a3f6b3b..8919a1cbcb5a 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -4,3 +4,9 @@ KBUILD_AFLAGS += -D__DISABLE_EXPORTS
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
+
+#
+# Disable objtool validation for all library code, which is intended
+# to be linked into the decompressor or the EFI stub but not vmlinux
+#
+$(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
--
2.49.0.504.g3bcea36a83-goog
* Re: [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code
2025-04-08 8:52 ` [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code Ard Biesheuvel
@ 2025-04-09 8:15 ` Borah, Chaitanya Kumar
2025-04-09 9:53 ` Ingo Molnar
2025-04-09 10:21 ` [tip: x86/boot] " tip-bot2 for Ard Biesheuvel
1 sibling, 1 reply; 17+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-09 8:15 UTC (permalink / raw)
To: Ard Biesheuvel, linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin, chaitanya.kumar.borah
On 4/8/2025 2:22 PM, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The library code built under arch/x86/boot/startup is not intended to be
> linked into vmlinux but only into the decompressor and/or the EFI stub.
>
> This means objtool validation is not needed here, and may result in
> false positive errors for things like missing retpolines.
>
> So disable it for all objects added to lib-y.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Solves
https://lore.kernel.org/intel-gfx/CAMj1kXEfBMczOmA2+dMMubuD-qE59GTAiV2E_9m8KNG4-rgP6Q@mail.gmail.com/T/#mbf2913e778475b70617390d4a5d0244295b9cb8c
Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
> ---
> arch/x86/boot/startup/Makefile | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
> index 73946a3f6b3b..8919a1cbcb5a 100644
> --- a/arch/x86/boot/startup/Makefile
> +++ b/arch/x86/boot/startup/Makefile
> @@ -4,3 +4,9 @@ KBUILD_AFLAGS += -D__DISABLE_EXPORTS
>
> lib-$(CONFIG_X86_64) += la57toggle.o
> lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
> +
> +#
> +# Disable objtool validation for all library code, which is intended
> +# to be linked into the decompressor or the EFI stub but not vmlinux
> +#
> +$(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
* Re: [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code
2025-04-09 8:15 ` Borah, Chaitanya Kumar
@ 2025-04-09 9:53 ` Ingo Molnar
0 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2025-04-09 9:53 UTC (permalink / raw)
To: Borah, Chaitanya Kumar
Cc: Ard Biesheuvel, linux-efi, x86, linux-kernel, Ard Biesheuvel,
Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin
* Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> wrote:
>
> On 4/8/2025 2:22 PM, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The library code built under arch/x86/boot/startup is not intended to be
> > linked into vmlinux but only into the decompressor and/or the EFI stub.
> >
> > This means objtool validation is not needed here, and may result in
> > false positive errors for things like missing retpolines.
> >
> > So disable it for all objects added to lib-y.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>
> Solves https://lore.kernel.org/intel-gfx/CAMj1kXEfBMczOmA2+dMMubuD-qE59GTAiV2E_9m8KNG4-rgP6Q@mail.gmail.com/T/#mbf2913e778475b70617390d4a5d0244295b9cb8c
>
> Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Thank you for the testing!
Ingo
* [tip: x86/boot] x86/boot/startup: Disable objtool validation for library code
2025-04-08 8:52 ` [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code Ard Biesheuvel
2025-04-09 8:15 ` Borah, Chaitanya Kumar
@ 2025-04-09 10:21 ` tip-bot2 for Ard Biesheuvel
1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Ard Biesheuvel @ 2025-04-09 10:21 UTC (permalink / raw)
To: linux-tip-commits
Cc: Chaitanya Kumar Borah, Ard Biesheuvel, Ingo Molnar,
H. Peter Anvin, Kees Cook, Linus Torvalds, David Woodhouse,
Rafael J. Wysocki, Len Brown, Josh Poimboeuf, x86, linux-kernel
The following commit has been merged into the x86/boot branch of tip:
Commit-ID: d9fa398fe82728ee703ad2bd9cf5247df9626470
Gitweb: https://git.kernel.org/tip/d9fa398fe82728ee703ad2bd9cf5247df9626470
Author: Ard Biesheuvel <ardb@kernel.org>
AuthorDate: Tue, 08 Apr 2025 10:52:56 +02:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 09 Apr 2025 11:59:03 +02:00
x86/boot/startup: Disable objtool validation for library code
The library code built under arch/x86/boot/startup is not intended to be
linked into vmlinux but only into the decompressor and/or the EFI stub.
This means objtool validation is not needed here, and may result in
false positive errors for things like missing retpolines.
So disable it for all objects added to lib-y.
Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250408085254.836788-10-ardb+git@google.com
---
arch/x86/boot/startup/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 73946a3..8919a1c 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -4,3 +4,9 @@ KBUILD_AFLAGS += -D__DISABLE_EXPORTS
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
+
+#
+# Disable objtool validation for all library code, which is intended
+# to be linked into the decompressor or the EFI stub but not vmlinux
+#
+$(patsubst %.o,$(obj)/%.o,$(lib-y)): OBJECT_FILES_NON_STANDARD := y
* [PATCH v3 2/7] x86/asm: Make rip_rel_ptr() usable from fPIC code
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
2025-04-08 8:52 ` [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code Ard Biesheuvel
@ 2025-04-08 8:52 ` Ard Biesheuvel
2025-04-08 8:52 ` [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/ Ard Biesheuvel
` (5 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:52 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
RIP_REL_REF() is used in non-PIC C code that is called very early,
before the kernel virtual mapping is up, which is the mapping that the
linker expects. It is currently used in two different ways:
- to refer to the value of a global variable, including as an lvalue in
assignments;
- to take the address of a global variable via the mapping that the code
currently executes at.
The former case is only needed in non-PIC code, as PIC code will never
use absolute symbol references when the address of the symbol is not
being used. But taking the address of a variable in PIC code may still
require extra care: a stack-allocated struct assignment may be
emitted as a memcpy() from a statically allocated copy in .rodata.
For instance, this
void startup_64_setup_gdt_idt(void)
{
struct desc_ptr startup_gdt_descr = {
.address = (__force unsigned long)gdt_page.gdt,
.size = GDT_SIZE - 1,
};
may result in an absolute symbol reference in PIC code, even though the
struct is allocated on the stack and populated at runtime.
To address this case, make rip_rel_ptr() accessible in PIC code, and
update any existing uses where the address of a global variable is
taken using RIP_REL_REF.
Once all code of this nature has been moved into arch/x86/boot/startup
and built with -fPIC, RIP_REL_REF() can be retired, and only
rip_rel_ptr() will remain.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
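To illustrate the distinction (a minimal sketch; some_global is a made-up
symbol, and rip_rel_ptr() is the helper from <asm/asm.h>):

    extern char some_global[];		/* hypothetical global */

    unsigned long link_time_address(void)
    {
	    /*
	     * Non-PIC code may emit this as an absolute relocation,
	     * i.e. the address the linker assigned, which is wrong
	     * while executing from the 1:1 mapping.
	     */
	    return (unsigned long)&some_global;
    }

    unsigned long run_time_address(void)
    {
	    /*
	     * Always a RIP-relative LEA: the address in whatever mapping
	     * the code currently executes from, PIC or not.
	     */
	    return (unsigned long)rip_rel_ptr(some_global);
    }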
arch/x86/coco/sev/core.c | 2 +-
arch/x86/coco/sev/shared.c | 4 ++--
arch/x86/include/asm/asm.h | 2 +-
arch/x86/kernel/head64.c | 23 ++++++++++----------
arch/x86/mm/mem_encrypt_identity.c | 6 ++---
5 files changed, 18 insertions(+), 19 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index b0c1a7a57497..832f7a7b10b2 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -2400,7 +2400,7 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
* kernel was loaded (physbase), so the get the CA address using
* RIP-relative addressing.
*/
- pa = (u64)&RIP_REL_REF(boot_svsm_ca_page);
+ pa = (u64)rip_rel_ptr(&boot_svsm_ca_page);
/*
* Switch over to the boot SVSM CA while the current CA is still
diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c
index 2e4122f8aa6b..04982d356803 100644
--- a/arch/x86/coco/sev/shared.c
+++ b/arch/x86/coco/sev/shared.c
@@ -475,7 +475,7 @@ static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid
*/
static const struct snp_cpuid_table *snp_cpuid_get_table(void)
{
- return &RIP_REL_REF(cpuid_table_copy);
+ return rip_rel_ptr(&cpuid_table_copy);
}
/*
@@ -1681,7 +1681,7 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info)
* routine is running identity mapped when called, both by the decompressor
* code and the early kernel code.
*/
- if (!rmpadjust((unsigned long)&RIP_REL_REF(boot_ghcb_page), RMP_PG_SIZE_4K, 1))
+ if (!rmpadjust((unsigned long)rip_rel_ptr(&boot_ghcb_page), RMP_PG_SIZE_4K, 1))
return false;
/*
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index cc2881576c2c..a9f07799e337 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -114,13 +114,13 @@
#endif
#ifndef __ASSEMBLER__
-#ifndef __pic__
static __always_inline __pure void *rip_rel_ptr(void *p)
{
asm("leaq %c1(%%rip), %0" : "=r"(p) : "i"(p));
return p;
}
+#ifndef __pic__
#define RIP_REL_REF(var) (*(typeof(&(var)))rip_rel_ptr(&(var)))
#else
#define RIP_REL_REF(var) (var)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index fa9b6339975f..3fb23d805cef 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -106,8 +106,8 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
* attribute.
*/
if (sme_get_me_mask()) {
- paddr = (unsigned long)&RIP_REL_REF(__start_bss_decrypted);
- paddr_end = (unsigned long)&RIP_REL_REF(__end_bss_decrypted);
+ paddr = (unsigned long)rip_rel_ptr(__start_bss_decrypted);
+ paddr_end = (unsigned long)rip_rel_ptr(__end_bss_decrypted);
for (; paddr < paddr_end; paddr += PMD_SIZE) {
/*
@@ -144,8 +144,8 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
unsigned long __head __startup_64(unsigned long p2v_offset,
struct boot_params *bp)
{
- pmd_t (*early_pgts)[PTRS_PER_PMD] = RIP_REL_REF(early_dynamic_pgts);
- unsigned long physaddr = (unsigned long)&RIP_REL_REF(_text);
+ pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts);
+ unsigned long physaddr = (unsigned long)rip_rel_ptr(_text);
unsigned long va_text, va_end;
unsigned long pgtable_flags;
unsigned long load_delta;
@@ -174,18 +174,18 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
for (;;);
va_text = physaddr - p2v_offset;
- va_end = (unsigned long)&RIP_REL_REF(_end) - p2v_offset;
+ va_end = (unsigned long)rip_rel_ptr(_end) - p2v_offset;
/* Include the SME encryption mask in the fixup value */
load_delta += sme_get_me_mask();
/* Fixup the physical addresses in the page table */
- pgd = &RIP_REL_REF(early_top_pgt)->pgd;
+ pgd = rip_rel_ptr(early_top_pgt);
pgd[pgd_index(__START_KERNEL_map)] += load_delta;
if (IS_ENABLED(CONFIG_X86_5LEVEL) && la57) {
- p4d = (p4dval_t *)&RIP_REL_REF(level4_kernel_pgt);
+ p4d = (p4dval_t *)rip_rel_ptr(level4_kernel_pgt);
p4d[MAX_PTRS_PER_P4D - 1] += load_delta;
pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE;
@@ -258,7 +258,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
* error, causing the BIOS to halt the system.
*/
- pmd = &RIP_REL_REF(level2_kernel_pgt)->pmd;
+ pmd = rip_rel_ptr(level2_kernel_pgt);
/* invalidate pages before the kernel image */
for (i = 0; i < pmd_index(va_text); i++)
@@ -531,7 +531,7 @@ static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
static void __head startup_64_load_idt(void *vc_handler)
{
struct desc_ptr desc = {
- .address = (unsigned long)&RIP_REL_REF(bringup_idt_table),
+ .address = (unsigned long)rip_rel_ptr(bringup_idt_table),
.size = sizeof(bringup_idt_table) - 1,
};
struct idt_data data;
@@ -565,11 +565,10 @@ void early_setup_idt(void)
*/
void __head startup_64_setup_gdt_idt(void)
{
- struct desc_struct *gdt = (void *)(__force unsigned long)gdt_page.gdt;
void *handler = NULL;
struct desc_ptr startup_gdt_descr = {
- .address = (unsigned long)&RIP_REL_REF(*gdt),
+ .address = (unsigned long)rip_rel_ptr((__force void *)&gdt_page),
.size = GDT_SIZE - 1,
};
@@ -582,7 +581,7 @@ void __head startup_64_setup_gdt_idt(void)
"movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");
if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
- handler = &RIP_REL_REF(vc_no_ghcb);
+ handler = rip_rel_ptr(vc_no_ghcb);
startup_64_load_idt(handler);
}
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 5eecdd92da10..e7fb3779b35f 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -318,8 +318,8 @@ void __head sme_encrypt_kernel(struct boot_params *bp)
* memory from being cached.
*/
- kernel_start = (unsigned long)RIP_REL_REF(_text);
- kernel_end = ALIGN((unsigned long)RIP_REL_REF(_end), PMD_SIZE);
+ kernel_start = (unsigned long)rip_rel_ptr(_text);
+ kernel_end = ALIGN((unsigned long)rip_rel_ptr(_end), PMD_SIZE);
kernel_len = kernel_end - kernel_start;
initrd_start = 0;
@@ -345,7 +345,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp)
* pagetable structures for the encryption of the kernel
* pagetable structures for workarea (in case not currently mapped)
*/
- execute_start = workarea_start = (unsigned long)RIP_REL_REF(sme_workarea);
+ execute_start = workarea_start = (unsigned long)rip_rel_ptr(sme_workarea);
execute_end = execute_start + (PAGE_SIZE * 2) + PMD_SIZE;
execute_len = execute_end - execute_start;
--
2.49.0.504.g3bcea36a83-goog
* [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
2025-04-08 8:52 ` [PATCH v3 1/7] x86/boot/startup: Disable objtool validation for library code Ard Biesheuvel
2025-04-08 8:52 ` [PATCH v3 2/7] x86/asm: Make rip_rel_ptr() usable from fPIC code Ard Biesheuvel
@ 2025-04-08 8:52 ` Ard Biesheuvel
2025-04-09 10:05 ` Ingo Molnar
2025-04-08 8:52 ` [PATCH v3 4/7] x86/boot: Move early kernel mapping " Ard Biesheuvel
` (4 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:52 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
Move the early GDT/IDT setup code that runs long before the kernel
virtual mapping is up into arch/x86/boot/startup/, and build it in a way
that ensures that the code tolerates being called from the 1:1 mapping
of memory. The code itself is left unchanged by this patch.
Also tweak the sed symbol matching pattern in the decompressor to match
on lower case 't' or 'b', as these will be emitted by Clang for symbols
with hidden linkage.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
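To illustrate the sed change below with hypothetical nm output: a line
such as

    ffffffff81000000 T _text

is turned into

    #define VO__text _AC(0xffffffff81000000,UL)

in the generated voffset header. With hidden visibility in effect, Clang
may list the same symbol with a lower case type letter instead, e.g. 't'
for a local text symbol or 'b' for a local BSS symbol like __bss_start;
the old class [ABCDGRSTVW] silently skips such lines and leaves the VO_
define missing, whereas the widened class [ABbCDGRSTtVW] still matches
them.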
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/startup/Makefile | 15 ++++
arch/x86/boot/startup/gdt_idt.c | 83 ++++++++++++++++++++
arch/x86/kernel/head64.c | 73 -----------------
4 files changed, 99 insertions(+), 74 deletions(-)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 37b85ce9b2a3..0fcad7b7e007 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -73,7 +73,7 @@ LDFLAGS_vmlinux += -T
hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include
-sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__start_rodata\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
+sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABbCDGRSTtVW] \(_text\|__start_rodata\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
quiet_cmd_voffset = VOFFSET $@
cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 8919a1cbcb5a..1beb5de30735 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -1,6 +1,21 @@
# SPDX-License-Identifier: GPL-2.0
KBUILD_AFLAGS += -D__DISABLE_EXPORTS
+KBUILD_CFLAGS += -D__DISABLE_EXPORTS -mcmodel=small -fPIC \
+ -Os -DDISABLE_BRANCH_PROFILING \
+ $(DISABLE_STACKLEAK_PLUGIN) \
+ -fno-stack-protector -D__NO_FORTIFY \
+ -include $(srctree)/include/linux/hidden.h
+
+# disable ftrace hooks
+KBUILD_CFLAGS := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS))
+KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
+KMSAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+
+obj-$(CONFIG_X86_64) += gdt_idt.o
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
diff --git a/arch/x86/boot/startup/gdt_idt.c b/arch/x86/boot/startup/gdt_idt.c
new file mode 100644
index 000000000000..1ba6bd5786fe
--- /dev/null
+++ b/arch/x86/boot/startup/gdt_idt.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/linkage.h>
+#include <linux/types.h>
+
+#include <asm/desc.h>
+#include <asm/init.h>
+#include <asm/setup.h>
+#include <asm/sev.h>
+#include <asm/trapnr.h>
+
+/*
+ * Data structures and code used for IDT setup in head_64.S. The bringup-IDT is
+ * used until the idt_table takes over. On the boot CPU this happens in
+ * x86_64_start_kernel(), on secondary CPUs in start_secondary(). In both cases
+ * this happens in the functions called from head_64.S.
+ *
+ * The idt_table can't be used that early because all the code modifying it is
+ * in idt.c and can be instrumented by tracing or KASAN, which both don't work
+ * during early CPU bringup. Also the idt_table has the runtime vectors
+ * configured which require certain CPU state to be setup already (like TSS),
+ * which also hasn't happened yet in early CPU bringup.
+ */
+static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
+
+/* This may run while still in the direct mapping */
+static void __head startup_64_load_idt(void *vc_handler)
+{
+ struct desc_ptr desc = {
+ .address = (unsigned long)rip_rel_ptr(bringup_idt_table),
+ .size = sizeof(bringup_idt_table) - 1,
+ };
+ struct idt_data data;
+ gate_desc idt_desc;
+
+ /* @vc_handler is set only for a VMM Communication Exception */
+ if (vc_handler) {
+ init_idt_data(&data, X86_TRAP_VC, vc_handler);
+ idt_init_desc(&idt_desc, &data);
+ native_write_idt_entry((gate_desc *)desc.address, X86_TRAP_VC, &idt_desc);
+ }
+
+ native_load_idt(&desc);
+}
+
+/* This is used when running on kernel addresses */
+void early_setup_idt(void)
+{
+ void *handler = NULL;
+
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ setup_ghcb();
+ handler = vc_boot_ghcb;
+ }
+
+ startup_64_load_idt(handler);
+}
+
+/*
+ * Setup boot CPU state needed before kernel switches to virtual addresses.
+ */
+void __head startup_64_setup_gdt_idt(void)
+{
+ void *handler = NULL;
+
+ struct desc_ptr startup_gdt_descr = {
+ .address = (unsigned long)rip_rel_ptr((__force void *)&gdt_page),
+ .size = GDT_SIZE - 1,
+ };
+
+ /* Load GDT */
+ native_load_gdt(&startup_gdt_descr);
+
+ /* New GDT is live - reload data segment registers */
+ asm volatile("movl %%eax, %%ds\n"
+ "movl %%eax, %%ss\n"
+ "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");
+
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ handler = rip_rel_ptr(vc_no_ghcb);
+
+ startup_64_load_idt(handler);
+}
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 3fb23d805cef..9b2ffec4bbad 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -512,76 +512,3 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data)
start_kernel();
}
-
-/*
- * Data structures and code used for IDT setup in head_64.S. The bringup-IDT is
- * used until the idt_table takes over. On the boot CPU this happens in
- * x86_64_start_kernel(), on secondary CPUs in start_secondary(). In both cases
- * this happens in the functions called from head_64.S.
- *
- * The idt_table can't be used that early because all the code modifying it is
- * in idt.c and can be instrumented by tracing or KASAN, which both don't work
- * during early CPU bringup. Also the idt_table has the runtime vectors
- * configured which require certain CPU state to be setup already (like TSS),
- * which also hasn't happened yet in early CPU bringup.
- */
-static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
-
-/* This may run while still in the direct mapping */
-static void __head startup_64_load_idt(void *vc_handler)
-{
- struct desc_ptr desc = {
- .address = (unsigned long)rip_rel_ptr(bringup_idt_table),
- .size = sizeof(bringup_idt_table) - 1,
- };
- struct idt_data data;
- gate_desc idt_desc;
-
- /* @vc_handler is set only for a VMM Communication Exception */
- if (vc_handler) {
- init_idt_data(&data, X86_TRAP_VC, vc_handler);
- idt_init_desc(&idt_desc, &data);
- native_write_idt_entry((gate_desc *)desc.address, X86_TRAP_VC, &idt_desc);
- }
-
- native_load_idt(&desc);
-}
-
-/* This is used when running on kernel addresses */
-void early_setup_idt(void)
-{
- void *handler = NULL;
-
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
- setup_ghcb();
- handler = vc_boot_ghcb;
- }
-
- startup_64_load_idt(handler);
-}
-
-/*
- * Setup boot CPU state needed before kernel switches to virtual addresses.
- */
-void __head startup_64_setup_gdt_idt(void)
-{
- void *handler = NULL;
-
- struct desc_ptr startup_gdt_descr = {
- .address = (unsigned long)rip_rel_ptr((__force void *)&gdt_page),
- .size = GDT_SIZE - 1,
- };
-
- /* Load GDT */
- native_load_gdt(&startup_gdt_descr);
-
- /* New GDT is live - reload data segment registers */
- asm volatile("movl %%eax, %%ds\n"
- "movl %%eax, %%ss\n"
- "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");
-
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
- handler = rip_rel_ptr(vc_no_ghcb);
-
- startup_64_load_idt(handler);
-}
--
2.49.0.504.g3bcea36a83-goog
* Re: [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/
2025-04-08 8:52 ` [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/ Ard Biesheuvel
@ 2025-04-09 10:05 ` Ingo Molnar
2025-04-09 10:07 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2025-04-09 10:05 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-efi, x86, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
* Ard Biesheuvel <ardb+git@google.com> wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Move the early GDT/IDT setup code that runs long before the kernel
> virtual mapping is up into arch/x86/boot/startup/, and build it in a way
> that ensures that the code tolerates being called from the 1:1 mapping
> of memory. The code itself is left unchanged by this patch.
>
> Also tweak the sed symbol matching pattern in the decompressor to match
> on lower case 't' or 'b', as these will be emitted by Clang for symbols
> with hidden linkage.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/x86/boot/compressed/Makefile | 2 +-
> arch/x86/boot/startup/Makefile | 15 ++++
> arch/x86/boot/startup/gdt_idt.c | 83 ++++++++++++++++++++
> arch/x86/kernel/head64.c | 73 -----------------
> 4 files changed, 99 insertions(+), 74 deletions(-)
This causes the following build failure on x86-64-defconfig:
arch/x86/boot/startup/gdt_idt.c:67:55: error: cast to generic address space pointer from disjoint ‘__seg_gs’ address space pointer [-Werror]
Thanks,
Ingo
* Re: [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/
2025-04-09 10:05 ` Ingo Molnar
@ 2025-04-09 10:07 ` Ingo Molnar
2025-04-09 11:42 ` Ard Biesheuvel
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2025-04-09 10:07 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-efi, x86, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Ard Biesheuvel <ardb+git@google.com> wrote:
>
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Move the early GDT/IDT setup code that runs long before the kernel
> > virtual mapping is up into arch/x86/boot/startup/, and build it in a way
> > that ensures that the code tolerates being called from the 1:1 mapping
> > of memory. The code itself is left unchanged by this patch.
> >
> > Also tweak the sed symbol matching pattern in the decompressor to match
> > on lower case 't' or 'b', as these will be emitted by Clang for symbols
> > with hidden linkage.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > arch/x86/boot/compressed/Makefile | 2 +-
> > arch/x86/boot/startup/Makefile | 15 ++++
> > arch/x86/boot/startup/gdt_idt.c | 83 ++++++++++++++++++++
> > arch/x86/kernel/head64.c | 73 -----------------
> > 4 files changed, 99 insertions(+), 74 deletions(-)
>
> This causes the following build failure on x86-64-defconfig:
>
> arch/x86/boot/startup/gdt_idt.c:67:55: error: cast to generic address space pointer from disjoint ‘__seg_gs’ address space pointer [-Werror]
Caused by the previous patch:
x86/asm: Make rip_rel_ptr() usable from fPIC code
Thanks,
Ingo
* Re: [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/
2025-04-09 10:07 ` Ingo Molnar
@ 2025-04-09 11:42 ` Ard Biesheuvel
2025-04-09 12:01 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-09 11:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ard Biesheuvel, linux-efi, x86, linux-kernel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
On Wed, 9 Apr 2025 at 12:07, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Ingo Molnar <mingo@kernel.org> wrote:
>
> >
> > * Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Move the early GDT/IDT setup code that runs long before the kernel
> > > virtual mapping is up into arch/x86/boot/startup/, and build it in a way
> > > that ensures that the code tolerates being called from the 1:1 mapping
> > > of memory. The code itself is left unchanged by this patch.
> > >
> > > Also tweak the sed symbol matching pattern in the decompressor to match
> > > on lower case 't' or 'b', as these will be emitted by Clang for symbols
> > > with hidden linkage.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > > arch/x86/boot/compressed/Makefile | 2 +-
> > > arch/x86/boot/startup/Makefile | 15 ++++
> > > arch/x86/boot/startup/gdt_idt.c | 83 ++++++++++++++++++++
> > > arch/x86/kernel/head64.c | 73 -----------------
> > > 4 files changed, 99 insertions(+), 74 deletions(-)
> >
> > This causes the following build failure on x86-64-defconfig:
> >
> > arch/x86/boot/startup/gdt_idt.c:67:55: error: cast to generic address space pointer from disjoint ‘__seg_gs’ address space pointer [-Werror]
>
> Caused by the previous patch:
>
> x86/asm: Make rip_rel_ptr() usable from fPIC code
>
Oops, sorry about that. I saw that error and thought I had fixed it
with the (__force void*) cast.
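For context, a stand-alone sketch of the clash; the struct and the bare
__seg_gs qualifier below are illustrative stand-ins for the real per-CPU
declaration of gdt_page, which is what brings in the named address space:

    /* build with: gcc -c -Werror sketch.c (x86-64) */
    struct gdt_page { unsigned long gdt[16]; };	/* stand-in type */
    extern __seg_gs struct gdt_page gdt_page;	/* stand-in for the per-CPU variable */

    void *bad(void)
    {
	    /* warns, and fails under -Werror: cast to generic address
	       space pointer from disjoint '__seg_gs' address space pointer */
	    return (void *)&gdt_page;
    }

    void *good(void)
    {
	    /* the usual workaround: launder the pointer through an
	       integer cast, as (__force unsigned long) does in kernel code */
	    return (void *)(unsigned long)&gdt_page;
    }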
* Re: [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/
2025-04-09 11:42 ` Ard Biesheuvel
@ 2025-04-09 12:01 ` Ingo Molnar
0 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2025-04-09 12:01 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Ard Biesheuvel, linux-efi, x86, linux-kernel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
* Ard Biesheuvel <ardb@kernel.org> wrote:
> On Wed, 9 Apr 2025 at 12:07, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Ingo Molnar <mingo@kernel.org> wrote:
> >
> > >
> > > * Ard Biesheuvel <ardb+git@google.com> wrote:
> > >
> > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > >
> > > > Move the early GDT/IDT setup code that runs long before the kernel
> > > > virtual mapping is up into arch/x86/boot/startup/, and build it in a way
> > > > that ensures that the code tolerates being called from the 1:1 mapping
> > > > of memory. The code itself is left unchanged by this patch.
> > > >
> > > > Also tweak the sed symbol matching pattern in the decompressor to match
> > > > on lower case 't' or 'b', as these will be emitted by Clang for symbols
> > > > with hidden linkage.
> > > >
> > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > > ---
> > > > arch/x86/boot/compressed/Makefile | 2 +-
> > > > arch/x86/boot/startup/Makefile | 15 ++++
> > > > arch/x86/boot/startup/gdt_idt.c | 83 ++++++++++++++++++++
> > > > arch/x86/kernel/head64.c | 73 -----------------
> > > > 4 files changed, 99 insertions(+), 74 deletions(-)
> > >
> > > This causes the following build failure on x86-64-defconfig:
> > >
> > > arch/x86/boot/startup/gdt_idt.c:67:55: error: cast to generic address space pointer from disjoint ‘__seg_gs’ address space pointer [-Werror]
> >
> > Caused by the previous patch:
> >
> > x86/asm: Make rip_rel_ptr() usable from fPIC code
> >
>
> Oops, sorry about that. I saw that error and thought I had fixed it
> with the (__force void*) cast.
NP, caught it early enough.
Thanks,
Ingo
* [PATCH v3 4/7] x86/boot: Move early kernel mapping code into startup/
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
` (2 preceding siblings ...)
2025-04-08 8:52 ` [PATCH v3 3/7] x86/boot: Move the early GDT/IDT setup code into startup/ Ard Biesheuvel
@ 2025-04-08 8:52 ` Ard Biesheuvel
2025-04-08 8:53 ` [PATCH v3 5/7] x86/boot: Drop RIP_REL_REF() uses from early mapping code Ard Biesheuvel
` (3 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:52 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
The startup code that constructs the kernel virtual mapping runs from
the 1:1 mapping of memory itself, and therefore cannot use absolute
symbol references. Before making changes in subsequent patches, move
this code into a separate source file under arch/x86/boot/startup/ where
all such code will be kept from now on.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
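A worked example of the p2v_offset arithmetic performed by the moved code
may help (illustrative numbers only, assuming the default x86-64 layout
where __START_KERNEL_map is 0xffffffff80000000 and the image links at
physical address 0x1000000):

    #include <stdio.h>

    int main(void)
    {
	    unsigned long START_KERNEL_map = 0xffffffff80000000UL;
	    unsigned long va = 0xffffffff81000000UL;  /* link-time VA of _text */
	    unsigned long pa = 0x2000000UL;           /* where it actually landed */

	    unsigned long p2v_offset = pa - va;          /* wraps modulo 2^64 */
	    unsigned long va_text    = pa - p2v_offset;  /* recovers the link-time VA */
	    unsigned long load_delta = START_KERNEL_map + p2v_offset;

	    printf("va_text    = 0x%lx\n", va_text);     /* 0xffffffff81000000 */
	    printf("load_delta = 0x%lx\n", load_delta);  /* 0x1000000: 16 MiB above
						            the default load address */
	    return 0;
    }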
arch/x86/boot/startup/Makefile | 2 +-
arch/x86/boot/startup/map_kernel.c | 224 ++++++++++++++++++++
arch/x86/kernel/head64.c | 211 +-----------------
3 files changed, 226 insertions(+), 211 deletions(-)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 1beb5de30735..10319aee666b 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -15,7 +15,7 @@ KMSAN_SANITIZE := n
UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
-obj-$(CONFIG_X86_64) += gdt_idt.o
+obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c
new file mode 100644
index 000000000000..5f1b7e0ba26e
--- /dev/null
+++ b/arch/x86/boot/startup/map_kernel.c
@@ -0,0 +1,224 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/pgtable.h>
+
+#include <asm/init.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/sev.h>
+
+extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
+extern unsigned int next_early_pgt;
+
+static inline bool check_la57_support(void)
+{
+ if (!IS_ENABLED(CONFIG_X86_5LEVEL))
+ return false;
+
+ /*
+ * 5-level paging is detected and enabled at kernel decompression
+ * stage. Only check if it has been enabled there.
+ */
+ if (!(native_read_cr4() & X86_CR4_LA57))
+ return false;
+
+ RIP_REL_REF(__pgtable_l5_enabled) = 1;
+ RIP_REL_REF(pgdir_shift) = 48;
+ RIP_REL_REF(ptrs_per_p4d) = 512;
+ RIP_REL_REF(page_offset_base) = __PAGE_OFFSET_BASE_L5;
+ RIP_REL_REF(vmalloc_base) = __VMALLOC_BASE_L5;
+ RIP_REL_REF(vmemmap_base) = __VMEMMAP_BASE_L5;
+
+ return true;
+}
+
+static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+ pmdval_t *pmd,
+ unsigned long p2v_offset)
+{
+ unsigned long paddr, paddr_end;
+ int i;
+
+ /* Encrypt the kernel and related (if SME is active) */
+ sme_encrypt_kernel(bp);
+
+ /*
+ * Clear the memory encryption mask from the .bss..decrypted section.
+ * The bss section will be memset to zero later in the initialization so
+ * there is no need to zero it after changing the memory encryption
+ * attribute.
+ */
+ if (sme_get_me_mask()) {
+ paddr = (unsigned long)rip_rel_ptr(__start_bss_decrypted);
+ paddr_end = (unsigned long)rip_rel_ptr(__end_bss_decrypted);
+
+ for (; paddr < paddr_end; paddr += PMD_SIZE) {
+ /*
+ * On SNP, transition the page to shared in the RMP table so that
+ * it is consistent with the page table attribute change.
+ *
+ * __start_bss_decrypted has a virtual address in the high range
+ * mapping (kernel .text). PVALIDATE, by way of
+ * early_snp_set_memory_shared(), requires a valid virtual
+ * address but the kernel is currently running off of the identity
+ * mapping so use the PA to get a *currently* valid virtual address.
+ */
+ early_snp_set_memory_shared(paddr, paddr, PTRS_PER_PMD);
+
+ i = pmd_index(paddr - p2v_offset);
+ pmd[i] -= sme_get_me_mask();
+ }
+ }
+
+ /*
+ * Return the SME encryption mask (if SME is active) to be used as a
+ * modifier for the initial pgdir entry programmed into CR3.
+ */
+ return sme_get_me_mask();
+}
+
+/* Code in __startup_64() can be relocated during execution, but the compiler
+ * doesn't have to generate PC-relative relocations when accessing globals from
+ * that function. Clang actually does not generate them, which leads to
+ * boot-time crashes. To work around this problem, every global pointer must
+ * be accessed using RIP_REL_REF(). Kernel virtual addresses can be determined
+ * by subtracting p2v_offset from the RIP-relative address.
+ */
+unsigned long __head __startup_64(unsigned long p2v_offset,
+ struct boot_params *bp)
+{
+ pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts);
+ unsigned long physaddr = (unsigned long)rip_rel_ptr(_text);
+ unsigned long va_text, va_end;
+ unsigned long pgtable_flags;
+ unsigned long load_delta;
+ pgdval_t *pgd;
+ p4dval_t *p4d;
+ pudval_t *pud;
+ pmdval_t *pmd, pmd_entry;
+ bool la57;
+ int i;
+
+ la57 = check_la57_support();
+
+ /* Is the address too large? */
+ if (physaddr >> MAX_PHYSMEM_BITS)
+ for (;;);
+
+ /*
+ * Compute the delta between the address I am compiled to run at
+ * and the address I am actually running at.
+ */
+ load_delta = __START_KERNEL_map + p2v_offset;
+ RIP_REL_REF(phys_base) = load_delta;
+
+ /* Is the address not 2M aligned? */
+ if (load_delta & ~PMD_MASK)
+ for (;;);
+
+ va_text = physaddr - p2v_offset;
+ va_end = (unsigned long)rip_rel_ptr(_end) - p2v_offset;
+
+ /* Include the SME encryption mask in the fixup value */
+ load_delta += sme_get_me_mask();
+
+ /* Fixup the physical addresses in the page table */
+
+ pgd = rip_rel_ptr(early_top_pgt);
+ pgd[pgd_index(__START_KERNEL_map)] += load_delta;
+
+ if (IS_ENABLED(CONFIG_X86_5LEVEL) && la57) {
+ p4d = (p4dval_t *)rip_rel_ptr(level4_kernel_pgt);
+ p4d[MAX_PTRS_PER_P4D - 1] += load_delta;
+
+ pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE;
+ }
+
+ RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 2].pud += load_delta;
+ RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 1].pud += load_delta;
+
+ for (i = FIXMAP_PMD_TOP; i > FIXMAP_PMD_TOP - FIXMAP_PMD_NUM; i--)
+ RIP_REL_REF(level2_fixmap_pgt)[i].pmd += load_delta;
+
+ /*
+ * Set up the identity mapping for the switchover. These
+ * entries should *NOT* have the global bit set! This also
+ * creates a bunch of nonsense entries but that is fine --
+ * it avoids problems around wraparound.
+ */
+
+ pud = &early_pgts[0]->pmd;
+ pmd = &early_pgts[1]->pmd;
+ RIP_REL_REF(next_early_pgt) = 2;
+
+ pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask();
+
+ if (la57) {
+ p4d = &early_pgts[RIP_REL_REF(next_early_pgt)++]->pmd;
+
+ i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
+ pgd[i + 0] = (pgdval_t)p4d + pgtable_flags;
+ pgd[i + 1] = (pgdval_t)p4d + pgtable_flags;
+
+ i = physaddr >> P4D_SHIFT;
+ p4d[(i + 0) % PTRS_PER_P4D] = (pgdval_t)pud + pgtable_flags;
+ p4d[(i + 1) % PTRS_PER_P4D] = (pgdval_t)pud + pgtable_flags;
+ } else {
+ i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
+ pgd[i + 0] = (pgdval_t)pud + pgtable_flags;
+ pgd[i + 1] = (pgdval_t)pud + pgtable_flags;
+ }
+
+ i = physaddr >> PUD_SHIFT;
+ pud[(i + 0) % PTRS_PER_PUD] = (pudval_t)pmd + pgtable_flags;
+ pud[(i + 1) % PTRS_PER_PUD] = (pudval_t)pmd + pgtable_flags;
+
+ pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
+ /* Filter out unsupported __PAGE_KERNEL_* bits: */
+ pmd_entry &= RIP_REL_REF(__supported_pte_mask);
+ pmd_entry += sme_get_me_mask();
+ pmd_entry += physaddr;
+
+ for (i = 0; i < DIV_ROUND_UP(va_end - va_text, PMD_SIZE); i++) {
+ int idx = i + (physaddr >> PMD_SHIFT);
+
+ pmd[idx % PTRS_PER_PMD] = pmd_entry + i * PMD_SIZE;
+ }
+
+ /*
+ * Fixup the kernel text+data virtual addresses. Note that
+ * we might write invalid pmds, when the kernel is relocated
+ * cleanup_highmap() fixes this up along with the mappings
+ * beyond _end.
+ *
+ * Only the region occupied by the kernel image has so far
+ * been checked against the table of usable memory regions
+ * provided by the firmware, so invalidate pages outside that
+ * region. A page table entry that maps to a reserved area of
+ * memory would allow processor speculation into that area,
+ * and on some hardware (particularly the UV platform) even
+ * speculative access to some reserved areas is caught as an
+ * error, causing the BIOS to halt the system.
+ */
+
+ pmd = rip_rel_ptr(level2_kernel_pgt);
+
+ /* invalidate pages before the kernel image */
+ for (i = 0; i < pmd_index(va_text); i++)
+ pmd[i] &= ~_PAGE_PRESENT;
+
+ /* fixup pages that are part of the kernel image */
+ for (; i <= pmd_index(va_end); i++)
+ if (pmd[i] & _PAGE_PRESENT)
+ pmd[i] += load_delta;
+
+ /* invalidate pages after the kernel image */
+ for (; i < PTRS_PER_PMD; i++)
+ pmd[i] &= ~_PAGE_PRESENT;
+
+ return sme_postprocess_startup(bp, pmd, p2v_offset);
+}
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 9b2ffec4bbad..6b68a206fa7f 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -47,7 +47,7 @@
* Manage page tables very early on.
*/
extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
-static unsigned int __initdata next_early_pgt;
+unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
@@ -67,215 +67,6 @@ unsigned long vmemmap_base __ro_after_init = __VMEMMAP_BASE_L4;
EXPORT_SYMBOL(vmemmap_base);
#endif
-static inline bool check_la57_support(void)
-{
- if (!IS_ENABLED(CONFIG_X86_5LEVEL))
- return false;
-
- /*
- * 5-level paging is detected and enabled at kernel decompression
- * stage. Only check if it has been enabled there.
- */
- if (!(native_read_cr4() & X86_CR4_LA57))
- return false;
-
- RIP_REL_REF(__pgtable_l5_enabled) = 1;
- RIP_REL_REF(pgdir_shift) = 48;
- RIP_REL_REF(ptrs_per_p4d) = 512;
- RIP_REL_REF(page_offset_base) = __PAGE_OFFSET_BASE_L5;
- RIP_REL_REF(vmalloc_base) = __VMALLOC_BASE_L5;
- RIP_REL_REF(vmemmap_base) = __VMEMMAP_BASE_L5;
-
- return true;
-}
-
-static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
- pmdval_t *pmd,
- unsigned long p2v_offset)
-{
- unsigned long paddr, paddr_end;
- int i;
-
- /* Encrypt the kernel and related (if SME is active) */
- sme_encrypt_kernel(bp);
-
- /*
- * Clear the memory encryption mask from the .bss..decrypted section.
- * The bss section will be memset to zero later in the initialization so
- * there is no need to zero it after changing the memory encryption
- * attribute.
- */
- if (sme_get_me_mask()) {
- paddr = (unsigned long)rip_rel_ptr(__start_bss_decrypted);
- paddr_end = (unsigned long)rip_rel_ptr(__end_bss_decrypted);
-
- for (; paddr < paddr_end; paddr += PMD_SIZE) {
- /*
- * On SNP, transition the page to shared in the RMP table so that
- * it is consistent with the page table attribute change.
- *
- * __start_bss_decrypted has a virtual address in the high range
- * mapping (kernel .text). PVALIDATE, by way of
- * early_snp_set_memory_shared(), requires a valid virtual
- * address but the kernel is currently running off of the identity
- * mapping so use the PA to get a *currently* valid virtual address.
- */
- early_snp_set_memory_shared(paddr, paddr, PTRS_PER_PMD);
-
- i = pmd_index(paddr - p2v_offset);
- pmd[i] -= sme_get_me_mask();
- }
- }
-
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
-}
-
-/* Code in __startup_64() can be relocated during execution, but the compiler
- * doesn't have to generate PC-relative relocations when accessing globals from
- * that function. Clang actually does not generate them, which leads to
- * boot-time crashes. To work around this problem, every global pointer must
- * be accessed using RIP_REL_REF(). Kernel virtual addresses can be determined
- * by subtracting p2v_offset from the RIP-relative address.
- */
-unsigned long __head __startup_64(unsigned long p2v_offset,
- struct boot_params *bp)
-{
- pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts);
- unsigned long physaddr = (unsigned long)rip_rel_ptr(_text);
- unsigned long va_text, va_end;
- unsigned long pgtable_flags;
- unsigned long load_delta;
- pgdval_t *pgd;
- p4dval_t *p4d;
- pudval_t *pud;
- pmdval_t *pmd, pmd_entry;
- bool la57;
- int i;
-
- la57 = check_la57_support();
-
- /* Is the address too large? */
- if (physaddr >> MAX_PHYSMEM_BITS)
- for (;;);
-
- /*
- * Compute the delta between the address I am compiled to run at
- * and the address I am actually running at.
- */
- load_delta = __START_KERNEL_map + p2v_offset;
- RIP_REL_REF(phys_base) = load_delta;
-
- /* Is the address not 2M aligned? */
- if (load_delta & ~PMD_MASK)
- for (;;);
-
- va_text = physaddr - p2v_offset;
- va_end = (unsigned long)rip_rel_ptr(_end) - p2v_offset;
-
- /* Include the SME encryption mask in the fixup value */
- load_delta += sme_get_me_mask();
-
- /* Fixup the physical addresses in the page table */
-
- pgd = rip_rel_ptr(early_top_pgt);
- pgd[pgd_index(__START_KERNEL_map)] += load_delta;
-
- if (IS_ENABLED(CONFIG_X86_5LEVEL) && la57) {
- p4d = (p4dval_t *)rip_rel_ptr(level4_kernel_pgt);
- p4d[MAX_PTRS_PER_P4D - 1] += load_delta;
-
- pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE;
- }
-
- RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 2].pud += load_delta;
- RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 1].pud += load_delta;
-
- for (i = FIXMAP_PMD_TOP; i > FIXMAP_PMD_TOP - FIXMAP_PMD_NUM; i--)
- RIP_REL_REF(level2_fixmap_pgt)[i].pmd += load_delta;
-
- /*
- * Set up the identity mapping for the switchover. These
- * entries should *NOT* have the global bit set! This also
- * creates a bunch of nonsense entries but that is fine --
- * it avoids problems around wraparound.
- */
-
- pud = &early_pgts[0]->pmd;
- pmd = &early_pgts[1]->pmd;
- RIP_REL_REF(next_early_pgt) = 2;
-
- pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask();
-
- if (la57) {
- p4d = &early_pgts[RIP_REL_REF(next_early_pgt)++]->pmd;
-
- i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
- pgd[i + 0] = (pgdval_t)p4d + pgtable_flags;
- pgd[i + 1] = (pgdval_t)p4d + pgtable_flags;
-
- i = physaddr >> P4D_SHIFT;
- p4d[(i + 0) % PTRS_PER_P4D] = (pgdval_t)pud + pgtable_flags;
- p4d[(i + 1) % PTRS_PER_P4D] = (pgdval_t)pud + pgtable_flags;
- } else {
- i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
- pgd[i + 0] = (pgdval_t)pud + pgtable_flags;
- pgd[i + 1] = (pgdval_t)pud + pgtable_flags;
- }
-
- i = physaddr >> PUD_SHIFT;
- pud[(i + 0) % PTRS_PER_PUD] = (pudval_t)pmd + pgtable_flags;
- pud[(i + 1) % PTRS_PER_PUD] = (pudval_t)pmd + pgtable_flags;
-
- pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
- /* Filter out unsupported __PAGE_KERNEL_* bits: */
- pmd_entry &= RIP_REL_REF(__supported_pte_mask);
- pmd_entry += sme_get_me_mask();
- pmd_entry += physaddr;
-
- for (i = 0; i < DIV_ROUND_UP(va_end - va_text, PMD_SIZE); i++) {
- int idx = i + (physaddr >> PMD_SHIFT);
-
- pmd[idx % PTRS_PER_PMD] = pmd_entry + i * PMD_SIZE;
- }
-
- /*
- * Fixup the kernel text+data virtual addresses. Note that
- * we might write invalid pmds, when the kernel is relocated
- * cleanup_highmap() fixes this up along with the mappings
- * beyond _end.
- *
- * Only the region occupied by the kernel image has so far
- * been checked against the table of usable memory regions
- * provided by the firmware, so invalidate pages outside that
- * region. A page table entry that maps to a reserved area of
- * memory would allow processor speculation into that area,
- * and on some hardware (particularly the UV platform) even
- * speculative access to some reserved areas is caught as an
- * error, causing the BIOS to halt the system.
- */
-
- pmd = rip_rel_ptr(level2_kernel_pgt);
-
- /* invalidate pages before the kernel image */
- for (i = 0; i < pmd_index(va_text); i++)
- pmd[i] &= ~_PAGE_PRESENT;
-
- /* fixup pages that are part of the kernel image */
- for (; i <= pmd_index(va_end); i++)
- if (pmd[i] & _PAGE_PRESENT)
- pmd[i] += load_delta;
-
- /* invalidate pages after the kernel image */
- for (; i < PTRS_PER_PMD; i++)
- pmd[i] &= ~_PAGE_PRESENT;
-
- return sme_postprocess_startup(bp, pmd, p2v_offset);
-}
-
/* Wipe all early page tables except for the kernel symbol map */
static void __init reset_early_page_tables(void)
{
--
2.49.0.504.g3bcea36a83-goog
* [PATCH v3 5/7] x86/boot: Drop RIP_REL_REF() uses from early mapping code
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
` (3 preceding siblings ...)
2025-04-08 8:52 ` [PATCH v3 4/7] x86/boot: Move early kernel mapping " Ard Biesheuvel
@ 2025-04-08 8:53 ` Ard Biesheuvel
2025-04-08 8:53 ` [PATCH v3 6/7] x86/boot: Move early SME init code into startup/ Ard Biesheuvel
` (2 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:53 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
Now that __startup_64() is built using -fPIC, RIP_REL_REF() has become a
NOP and can be removed. Only some occurrences of rip_rel_ptr() will
remain, to explicitly take the address of certain global structures in
the 1:1 mapping of memory.
While at it, update the code comment to describe why this is needed.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
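For reference, after the rip_rel_ptr() rework earlier in this series,
RIP_REL_REF() already degenerates to a plain access in PIC translation
units (abridged from arch/x86/include/asm/asm.h):

    #ifndef __pic__
    #define RIP_REL_REF(var)	(*(typeof(&(var)))rip_rel_ptr(&(var)))
    #else
    #define RIP_REL_REF(var)	(var)	/* PIC codegen is already RIP-relative */
    #endif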
arch/x86/boot/startup/map_kernel.c | 41 ++++++++++----------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c
index 5f1b7e0ba26e..0eac3f17dbd3 100644
--- a/arch/x86/boot/startup/map_kernel.c
+++ b/arch/x86/boot/startup/map_kernel.c
@@ -26,12 +26,12 @@ static inline bool check_la57_support(void)
if (!(native_read_cr4() & X86_CR4_LA57))
return false;
- RIP_REL_REF(__pgtable_l5_enabled) = 1;
- RIP_REL_REF(pgdir_shift) = 48;
- RIP_REL_REF(ptrs_per_p4d) = 512;
- RIP_REL_REF(page_offset_base) = __PAGE_OFFSET_BASE_L5;
- RIP_REL_REF(vmalloc_base) = __VMALLOC_BASE_L5;
- RIP_REL_REF(vmemmap_base) = __VMEMMAP_BASE_L5;
+ __pgtable_l5_enabled = 1;
+ pgdir_shift = 48;
+ ptrs_per_p4d = 512;
+ page_offset_base = __PAGE_OFFSET_BASE_L5;
+ vmalloc_base = __VMALLOC_BASE_L5;
+ vmemmap_base = __VMEMMAP_BASE_L5;
return true;
}
@@ -81,12 +81,14 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
return sme_get_me_mask();
}
-/* Code in __startup_64() can be relocated during execution, but the compiler
- * doesn't have to generate PC-relative relocations when accessing globals from
- * that function. Clang actually does not generate them, which leads to
- * boot-time crashes. To work around this problem, every global pointer must
- * be accessed using RIP_REL_REF(). Kernel virtual addresses can be determined
- * by subtracting p2v_offset from the RIP-relative address.
+/*
+ * This code is compiled using PIC codegen because it will execute from the
+ * early 1:1 mapping of memory, which deviates from the mapping expected by the
+ * linker. Due to this deviation, taking the address of a global variable will
+ * produce an ambiguous result when using the plain & operator. Instead,
+ * rip_rel_ptr() must be used, which will return the RIP-relative address in
+ * the 1:1 mapping of memory. Kernel virtual addresses can be determined by
+ * subtracting p2v_offset from the RIP-relative address.
*/
unsigned long __head __startup_64(unsigned long p2v_offset,
struct boot_params *bp)
@@ -113,8 +115,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
* Compute the delta between the address I am compiled to run at
* and the address I am actually running at.
*/
- load_delta = __START_KERNEL_map + p2v_offset;
- RIP_REL_REF(phys_base) = load_delta;
+ phys_base = load_delta = __START_KERNEL_map + p2v_offset;
/* Is the address not 2M aligned? */
if (load_delta & ~PMD_MASK)
@@ -138,11 +139,11 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE;
}
- RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 2].pud += load_delta;
- RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 1].pud += load_delta;
+ level3_kernel_pgt[PTRS_PER_PUD - 2].pud += load_delta;
+ level3_kernel_pgt[PTRS_PER_PUD - 1].pud += load_delta;
for (i = FIXMAP_PMD_TOP; i > FIXMAP_PMD_TOP - FIXMAP_PMD_NUM; i--)
- RIP_REL_REF(level2_fixmap_pgt)[i].pmd += load_delta;
+ level2_fixmap_pgt[i].pmd += load_delta;
/*
* Set up the identity mapping for the switchover. These
@@ -153,12 +154,12 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
pud = &early_pgts[0]->pmd;
pmd = &early_pgts[1]->pmd;
- RIP_REL_REF(next_early_pgt) = 2;
+ next_early_pgt = 2;
pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask();
if (la57) {
- p4d = &early_pgts[RIP_REL_REF(next_early_pgt)++]->pmd;
+ p4d = &early_pgts[next_early_pgt++]->pmd;
i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
pgd[i + 0] = (pgdval_t)p4d + pgtable_flags;
@@ -179,7 +180,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset,
pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
/* Filter out unsupported __PAGE_KERNEL_* bits: */
- pmd_entry &= RIP_REL_REF(__supported_pte_mask);
+ pmd_entry &= __supported_pte_mask;
pmd_entry += sme_get_me_mask();
pmd_entry += physaddr;
--
2.49.0.504.g3bcea36a83-goog
* [PATCH v3 6/7] x86/boot: Move early SME init code into startup/
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
` (4 preceding siblings ...)
2025-04-08 8:53 ` [PATCH v3 5/7] x86/boot: Drop RIP_REL_REF() uses from early mapping code Ard Biesheuvel
@ 2025-04-08 8:53 ` Ard Biesheuvel
2025-04-08 8:53 ` [PATCH v3 7/7] x86/boot: Drop RIP_REL_REF() uses from SME startup code Ard Biesheuvel
2025-04-08 18:16 ` [PATCH v3 0/7] x86: Refactor and consolidate " Brian Gerst
7 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:53 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
Move the SME initialization code, which runs from the 1:1 mapping of
memory while it operates on the kernel virtual mapping, into the new
sub-directory arch/x86/boot/startup/, where all startup code that needs
to tolerate executing from the 1:1 mapping will reside.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/Makefile | 1 +
arch/x86/{mm/mem_encrypt_identity.c => boot/startup/sme.c} | 2 --
arch/x86/mm/Makefile | 6 ------
3 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile
index 10319aee666b..ccdfc42a4d59 100644
--- a/arch/x86/boot/startup/Makefile
+++ b/arch/x86/boot/startup/Makefile
@@ -16,6 +16,7 @@ UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o
+obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o
lib-$(CONFIG_X86_64) += la57toggle.o
lib-$(CONFIG_EFI_MIXED) += efi-mixed.o
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/boot/startup/sme.c
similarity index 99%
rename from arch/x86/mm/mem_encrypt_identity.c
rename to arch/x86/boot/startup/sme.c
index e7fb3779b35f..23d10cda5b58 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/boot/startup/sme.c
@@ -45,8 +45,6 @@
#include <asm/coco.h>
#include <asm/sev.h>
-#include "mm_internal.h"
-
#define PGD_FLAGS _KERNPG_TABLE_NOENC
#define P4D_FLAGS _KERNPG_TABLE_NOENC
#define PUD_FLAGS _KERNPG_TABLE_NOENC
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 32035d5be5a0..3faa60f13a61 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -3,12 +3,10 @@
KCOV_INSTRUMENT_tlb.o := n
KCOV_INSTRUMENT_mem_encrypt.o := n
KCOV_INSTRUMENT_mem_encrypt_amd.o := n
-KCOV_INSTRUMENT_mem_encrypt_identity.o := n
KCOV_INSTRUMENT_pgprot.o := n
KASAN_SANITIZE_mem_encrypt.o := n
KASAN_SANITIZE_mem_encrypt_amd.o := n
-KASAN_SANITIZE_mem_encrypt_identity.o := n
KASAN_SANITIZE_pgprot.o := n
# Disable KCSAN entirely, because otherwise we get warnings that some functions
@@ -16,12 +14,10 @@ KASAN_SANITIZE_pgprot.o := n
KCSAN_SANITIZE := n
# Avoid recursion by not calling KMSAN hooks for CEA code.
KMSAN_SANITIZE_cpu_entry_area.o := n
-KMSAN_SANITIZE_mem_encrypt_identity.o := n
ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_mem_encrypt.o = -pg
CFLAGS_REMOVE_mem_encrypt_amd.o = -pg
-CFLAGS_REMOVE_mem_encrypt_identity.o = -pg
CFLAGS_REMOVE_pgprot.o = -pg
endif
@@ -32,7 +28,6 @@ obj-y += pat/
# Make sure __phys_addr has no stackprotector
CFLAGS_physaddr.o := -fno-stack-protector
-CFLAGS_mem_encrypt_identity.o := -fno-stack-protector
CFLAGS_fault.o := -I $(src)/../include/asm/trace
@@ -63,5 +58,4 @@ obj-$(CONFIG_MITIGATION_PAGE_TABLE_ISOLATION) += pti.o
obj-$(CONFIG_X86_MEM_ENCRYPT) += mem_encrypt.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o
-obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 7/7] x86/boot: Drop RIP_REL_REF() uses from SME startup code
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
` (5 preceding siblings ...)
2025-04-08 8:53 ` [PATCH v3 6/7] x86/boot: Move early SME init code into startup/ Ard Biesheuvel
@ 2025-04-08 8:53 ` Ard Biesheuvel
2025-04-08 18:16 ` [PATCH v3 0/7] x86: Refactor and consolidate " Brian Gerst
7 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-08 8:53 UTC (permalink / raw)
To: linux-efi
Cc: x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
From: Ard Biesheuvel <ardb@kernel.org>
RIP_REL_REF() has no effect on code residing in arch/x86/boot/startup,
as it is built with -fPIC. So remove any occurrences from the SME
startup code.
Note that the SME startup code is the only caller of cc_set_mask()
that requires this, so drop RIP_REL_REF() from there as well.
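(For reference, RIP_REL_REF() forces the address computation through
an inline asm that only admits a %rip-relative lea, roughly as defined
in arch/x86/include/asm/asm.h:

	static __always_inline __pure void *rip_rel_ptr(void *p)
	{
		asm("leaq %c1(%%rip), %0" : "=r"(p) : "i"(p));

		return p;
	}

	#define RIP_REL_REF(var)	(*(typeof(&(var)))rip_rel_ptr(&(var)))

Since -fPIC already makes the compiler emit RIP-relative accesses to
globals, the wrapper is redundant in this code.)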
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/boot/startup/sme.c | 11 +++++------
arch/x86/include/asm/coco.h | 2 +-
arch/x86/include/asm/mem_encrypt.h | 2 +-
3 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c
index 23d10cda5b58..5738b31c8e60 100644
--- a/arch/x86/boot/startup/sme.c
+++ b/arch/x86/boot/startup/sme.c
@@ -297,8 +297,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp)
* instrumentation or checking boot_cpu_data in the cc_platform_has()
* function.
*/
- if (!sme_get_me_mask() ||
- RIP_REL_REF(sev_status) & MSR_AMD64_SEV_ENABLED)
+ if (!sme_get_me_mask() || sev_status & MSR_AMD64_SEV_ENABLED)
return;
/*
@@ -524,7 +523,7 @@ void __head sme_enable(struct boot_params *bp)
me_mask = 1UL << (ebx & 0x3f);
/* Check the SEV MSR whether SEV or SME is enabled */
- RIP_REL_REF(sev_status) = msr = __rdmsr(MSR_AMD64_SEV);
+ sev_status = msr = __rdmsr(MSR_AMD64_SEV);
feature_mask = (msr & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
/*
@@ -560,8 +559,8 @@ void __head sme_enable(struct boot_params *bp)
return;
}
- RIP_REL_REF(sme_me_mask) = me_mask;
- RIP_REL_REF(physical_mask) &= ~me_mask;
- RIP_REL_REF(cc_vendor) = CC_VENDOR_AMD;
+ sme_me_mask = me_mask;
+ physical_mask &= ~me_mask;
+ cc_vendor = CC_VENDOR_AMD;
cc_set_mask(me_mask);
}
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index e7225452963f..e1dbf8df1b69 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -22,7 +22,7 @@ static inline u64 cc_get_mask(void)
static inline void cc_set_mask(u64 mask)
{
- RIP_REL_REF(cc_mask) = mask;
+ cc_mask = mask;
}
u64 cc_mkenc(u64 val);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 1530ee301dfe..ea6494628cb0 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -61,7 +61,7 @@ void __init sev_es_init_vc_handling(void);
static inline u64 sme_get_me_mask(void)
{
- return RIP_REL_REF(sme_me_mask);
+ return sme_me_mask;
}
#define __bss_decrypted __section(".bss..decrypted")
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/7] x86: Refactor and consolidate startup code
2025-04-08 8:52 [PATCH v3 0/7] x86: Refactor and consolidate startup code Ard Biesheuvel
` (6 preceding siblings ...)
2025-04-08 8:53 ` [PATCH v3 7/7] x86/boot: Drop RIP_REL_REF() uses from SME startup code Ard Biesheuvel
@ 2025-04-08 18:16 ` Brian Gerst
2025-04-09 6:47 ` Ard Biesheuvel
7 siblings, 1 reply; 17+ messages in thread
From: Brian Gerst @ 2025-04-08 18:16 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-efi, x86, mingo, linux-kernel, Ard Biesheuvel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
On Tue, Apr 8, 2025 at 5:01 AM Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Reorganize C code that is used during early boot, either in the
> decompressor/EFI stub or the kernel proper, but before the kernel
> virtual mapping is up.
>
> v3:
> - keep rip_rel_ptr() around in PIC code - sadly, it is still needed in
> some cases
> - remove RIP_REL_REF() uses in separate patches
> - keep __head annotations for now, they will all be removed later
> - disable objtool validation for library objects (i.e., pieces that are
> not linked into vmlinux)
>
> I will follow up with a series that gets rid of .head.text altogether,
> as it will no longer be needed at all once the startup code is checked
> for absolute relocations.
>
> The SEV startup code needs to be moved first, though, and this is a bit
> more complicated, so I will decouple that effort from this series, also
> because there is a known issue that needs to be fixed first related to
> memory acceptance from the EFI stub.
Is there anything to verify that the compiler doesn't do something
unexpected with PIC code generation, like creating GOT references?
Brian Gerst
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/7] x86: Refactor and consolidate startup code
2025-04-08 18:16 ` [PATCH v3 0/7] x86: Refactor and consolidate " Brian Gerst
@ 2025-04-09 6:47 ` Ard Biesheuvel
0 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2025-04-09 6:47 UTC (permalink / raw)
To: Brian Gerst
Cc: linux-efi, x86, mingo, linux-kernel, Tom Lendacky,
Dionna Amalie Glaze, Kevin Loughlin
On Tue, 8 Apr 2025 at 20:16, Brian Gerst <brgerst@gmail.com> wrote:
>
> On Tue, Apr 8, 2025 at 5:01 AM Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Reorganize C code that is used during early boot, either in the
> > decompressor/EFI stub or the kernel proper, but before the kernel
> > virtual mapping is up.
> >
> > v3:
> > - keep rip_rel_ptr() around in PIC code - sadly, it is still needed in
> > some cases
> > - remove RIP_REL_REF() uses in separate patches
> > - keep __head annotations for now, they will all be removed later
> > - disable objtool validation for library objects (i.e., pieces that are
> > not linked into vmlinux)
> >
> > I will follow up with a series that gets rid of .head.text altogether,
> > as it will no longer be needed at all once the startup code is checked
> > for absolute relocations.
> >
> > The SEV startup code needs to be moved first, though, and this is a bit
> > more complicated, so I will decouple that effort from this series, also
> > because there is a known issue that needs to be fixed first related to
> > memory acceptance from the EFI stub.
>
> Is there anything to verify that the compiler doesn't do something
> unexpected with PIC code generation like create GOT references?
>
I will propose something along the lines of what is already being done
for the EFI stub:
------%<------
STUBCOPY_RELOC-$(CONFIG_X86_64) := R_X86_64_64
quiet_cmd_stubcopy = STUBCPY $@
cmd_stubcopy =							\
	$(STRIP) --strip-debug -o $@ $<;			\
	if $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y); then	\
		echo "$@: absolute symbol references not allowed in the EFI stub" >&2; \
		/bin/false;					\
	fi;							\
	$(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@
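(Illustrative usage, assuming an equivalent rule gets wired up for the
startup objects: a stray absolute relocation would then show up in the
relocation dump, e.g.

	$ objdump -r arch/x86/boot/startup/sme.o | grep R_X86_64_64

and fail the build. GOT-based accesses of the kind mentioned above
would appear in the same dump as R_X86_64_GOTPCREL(X) entries, so they
could be caught by the same kind of check.)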
^ permalink raw reply [flat|nested] 17+ messages in thread