* [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping
2024-09-26 10:41 [PATCH 0/5] x86/xen: Drop absolute references from startup code Ard Biesheuvel
@ 2024-09-26 10:41 ` Ard Biesheuvel
2024-09-26 10:55 ` Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-26 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
From: Ard Biesheuvel <ardb@kernel.org>
Calling C code via a different mapping than it was linked at is
problematic, because the compiler assumes that RIP-relative and absolute
symbol references are interchangeable. GCC in particular may use
RIP-relative per-CPU variable references even when not using -fpic.
So call xen_prepare_pvh() via its kernel virtual mapping on x86_64, so
that those RIP-relative references produce the correct values. This
matches the pre-existing behavior for i386, which also invokes
xen_prepare_pvh() via the kernel virtual mapping before invoking
startup_32 with paging disabled again.
Fixes: 7243b93345f7 ("xen/pvh: Bootstrap PVH guest")
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/platform/pvh/head.S | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 64fca49cd88f..98ddd552885a 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -172,7 +172,13 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
movq %rbp, %rbx
subq $_pa(pvh_start_xen), %rbx
movq %rbx, phys_base(%rip)
- call xen_prepare_pvh
+
+ /* Call xen_prepare_pvh() via the kernel virtual mapping */
+ leaq xen_prepare_pvh(%rip), %rax
+ addq $__START_KERNEL_map, %rax
+ ANNOTATE_RETPOLINE_SAFE
+ call *%rax
+
/*
* Clear phys_base. __startup_64 will *add* to its value,
* so reset to 0.
--
2.46.0.792.g87dc391469-goog
* Re: [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping
2024-09-26 10:41 ` [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping Ard Biesheuvel
@ 2024-09-26 10:55 ` Ard Biesheuvel
2024-09-26 20:29 ` Jason Andryuk
From: Ard Biesheuvel @ 2024-09-26 10:55 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-kernel, Jason Andryuk, Juergen Gross, Boris Ostrovsky, x86,
xen-devel
On Thu, 26 Sept 2024 at 12:41, Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Calling C code via a different mapping than it was linked at is
> problematic, because the compiler assumes that RIP-relative and absolute
> symbol references are interchangeable. GCC in particular may use
> RIP-relative per-CPU variable references even when not using -fpic.
>
> So call xen_prepare_pvh() via its kernel virtual mapping on x86_64, so
> that those RIP-relative references produce the correct values. This
> matches the pre-existing behavior for i386, which also invokes
> xen_prepare_pvh() via the kernel virtual mapping before invoking
> startup_32 with paging disabled again.
>
> Fixes: 7243b93345f7 ("xen/pvh: Bootstrap PVH guest")
> Tested-by: Jason Andryuk <jason.andryuk@amd.com>
> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/x86/platform/pvh/head.S | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> index 64fca49cd88f..98ddd552885a 100644
> --- a/arch/x86/platform/pvh/head.S
> +++ b/arch/x86/platform/pvh/head.S
> @@ -172,7 +172,13 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
> movq %rbp, %rbx
> subq $_pa(pvh_start_xen), %rbx
> movq %rbx, phys_base(%rip)
> - call xen_prepare_pvh
> +
> + /* Call xen_prepare_pvh() via the kernel virtual mapping */
> + leaq xen_prepare_pvh(%rip), %rax
Just realized that we probably need
+ subq phys_base(%rip), %rax
here (or grab phys_base from %rbx, but the above is more obvious and
less likely to get broken in the future)
> + addq $__START_KERNEL_map, %rax
> + ANNOTATE_RETPOLINE_SAFE
> + call *%rax
> +
> /*
> * Clear phys_base. __startup_64 will *add* to its value,
> * so reset to 0.
> --
> 2.46.0.792.g87dc391469-goog
>
* Re: [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping
2024-09-26 10:55 ` Ard Biesheuvel
@ 2024-09-26 20:29 ` Jason Andryuk
From: Jason Andryuk @ 2024-09-26 20:29 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel
Cc: linux-kernel, Juergen Gross, Boris Ostrovsky, x86, xen-devel
On 2024-09-26 06:55, Ard Biesheuvel wrote:
> On Thu, 26 Sept 2024 at 12:41, Ard Biesheuvel <ardb+git@google.com> wrote:
>>
>> From: Ard Biesheuvel <ardb@kernel.org>
>>
>> Calling C code via a different mapping than it was linked at is
>> problematic, because the compiler assumes that RIP-relative and absolute
>> symbol references are interchangeable. GCC in particular may use
>> RIP-relative per-CPU variable references even when not using -fpic.
>>
>> So call xen_prepare_pvh() via its kernel virtual mapping on x86_64, so
>> that those RIP-relative references produce the correct values. This
>> matches the pre-existing behavior for i386, which also invokes
>> xen_prepare_pvh() via the kernel virtual mapping before invoking
>> startup_32 with paging disabled again.
>>
>> Fixes: 7243b93345f7 ("xen/pvh: Bootstrap PVH guest")
>> Tested-by: Jason Andryuk <jason.andryuk@amd.com>
>> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>> ---
>> arch/x86/platform/pvh/head.S | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
>> index 64fca49cd88f..98ddd552885a 100644
>> --- a/arch/x86/platform/pvh/head.S
>> +++ b/arch/x86/platform/pvh/head.S
>> @@ -172,7 +172,13 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
>> movq %rbp, %rbx
>> subq $_pa(pvh_start_xen), %rbx
>> movq %rbx, phys_base(%rip)
>> - call xen_prepare_pvh
>> +
>> + /* Call xen_prepare_pvh() via the kernel virtual mapping */
>> + leaq xen_prepare_pvh(%rip), %rax
>
> Just realized that we probably need
>
> + subq phys_base(%rip), %rax
Yes, this is necessary when phys_base is non-0. I intended to test a
non-0 case yesterday, but it turns out I didn't. Re-testing, I have
confirmed this subq is necessary.
Thanks,
Jason
* [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor
2024-09-26 10:41 [PATCH 0/5] x86/xen: Drop absolute references from startup code Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping Ard Biesheuvel
@ 2024-09-26 10:41 ` Ard Biesheuvel
2024-09-26 14:29 ` Andrew Cooper
2024-09-26 20:32 ` Jason Andryuk
2024-09-26 10:41 ` [PATCH 3/5] x86/pvh: Omit needless clearing of phys_base Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-26 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
From: Ard Biesheuvel <ardb@kernel.org>
The size field in a GDT descriptor is offset by 1, so subtract 1 from
the calculated range.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/platform/pvh/head.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 98ddd552885a..f09e0fb832e4 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -223,7 +223,7 @@ SYM_CODE_END(pvh_start_xen)
.section ".init.data","aw"
.balign 8
SYM_DATA_START_LOCAL(gdt)
- .word gdt_end - gdt_start
+ .word gdt_end - gdt_start - 1
.long _pa(gdt_start) /* x86-64 will overwrite if relocated. */
.word 0
SYM_DATA_END(gdt)
--
2.46.0.792.g87dc391469-goog
* Re: [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor
2024-09-26 10:41 ` [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor Ard Biesheuvel
@ 2024-09-26 14:29 ` Andrew Cooper
2024-09-26 20:32 ` Jason Andryuk
From: Andrew Cooper @ 2024-09-26 14:29 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
On 26/09/2024 11:41 am, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The size field in a GDT descriptor is offset by 1, so subtract 1 from
> the calculated range.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
I realise this is probably nitpicking, but the GDT descriptor has a
limit field, which is (intentionally) not a size field.
This is why there's a difference of 1 between them.
Working in terms of limits rather than sizes avoids needing a 16th bit
on the segment limit circuitry, which mattered for the 286.
~Andrew
* Re: [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor
2024-09-26 10:41 ` [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor Ard Biesheuvel
2024-09-26 14:29 ` Andrew Cooper
@ 2024-09-26 20:32 ` Jason Andryuk
From: Jason Andryuk @ 2024-09-26 20:32 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: Ard Biesheuvel, Juergen Gross, Boris Ostrovsky, x86, xen-devel
On 2024-09-26 06:41, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The size field in a GDT descriptor is offset by 1, so subtract 1 from
> the calculated range.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
I guess with s/size/limit/ per Andrew.
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Thanks,
Jason
* [PATCH 3/5] x86/pvh: Omit needless clearing of phys_base
2024-09-26 10:41 [PATCH 0/5] x86/xen: Drop absolute references from startup code Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 1/5] x86/pvh: Call C code via the kernel virtual mapping Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 2/5] x86/pvh: Use correct size value in GDT descriptor Ard Biesheuvel
@ 2024-09-26 10:41 ` Ard Biesheuvel
2024-09-26 20:35 ` Jason Andryuk
2024-09-26 10:41 ` [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 5/5] x86/pvh: Avoid absolute symbol references in .head.text Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-26 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
From: Ard Biesheuvel <ardb@kernel.org>
Since commit
d9ec1158056b ("x86/boot/64: Use RIP_REL_REF() to assign 'phys_base'")
phys_base is assigned directly rather than added to, so it is no longer
necessary to clear it after use.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/platform/pvh/head.S | 7 -------
1 file changed, 7 deletions(-)
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index f09e0fb832e4..592747f2d731 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -179,13 +179,6 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
ANNOTATE_RETPOLINE_SAFE
call *%rax
- /*
- * Clear phys_base. __startup_64 will *add* to its value,
- * so reset to 0.
- */
- xor %rbx, %rbx
- movq %rbx, phys_base(%rip)
-
/* startup_64 expects boot_params in %rsi. */
lea pvh_bootparams(%rip), %rsi
jmp startup_64
--
2.46.0.792.g87dc391469-goog
* Re: [PATCH 3/5] x86/pvh: Omit needless clearing of phys_base
2024-09-26 10:41 ` [PATCH 3/5] x86/pvh: Omit needless clearing of phys_base Ard Biesheuvel
@ 2024-09-26 20:35 ` Jason Andryuk
From: Jason Andryuk @ 2024-09-26 20:35 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: Ard Biesheuvel, Juergen Gross, Boris Ostrovsky, x86, xen-devel
On 2024-09-26 06:41, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Since commit
>
> d9ec1158056b ("x86/boot/64: Use RIP_REL_REF() to assign 'phys_base'")
>
> phys_base is assigned directly rather than added to, so it is no longer
> necessary to clear it after use.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
My patch pre-dated the above, but I never liked it. Happy to see this gone.
Thanks,
Jason
* [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes
2024-09-26 10:41 [PATCH 0/5] x86/xen: Drop absolute references from startup code Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 3/5] x86/pvh: Omit needless clearing of phys_base Ard Biesheuvel
@ 2024-09-26 10:41 ` Ard Biesheuvel
2024-09-27 1:46 ` Jason Andryuk
2024-09-26 10:41 ` [PATCH 5/5] x86/pvh: Avoid absolute symbol references in .head.text Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-26 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
From: Ard Biesheuvel <ardb@kernel.org>
Xen puts virtual and physical addresses into ELF notes that are treated
by the linker as relocatable by default. Doing so is not only pointless,
given that the ELF notes are only intended for consumption by Xen before
the kernel boots; it is also a KASLR leak, because the kernel's ELF
notes are exposed via the world-readable /sys/kernel/notes.
So emit these constants in a way that prevents the linker from marking
them as relocatable. This involves place-relative relocations (which
subtract their own virtual address from the symbol value) and linker
provided absolute symbols that add the address of the place to the
desired value.
While at it, switch to a 32-bit field for XEN_ELFNOTE_PHYS32_ENTRY,
which better matches the intent as well as the Xen documentation and
source code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/kernel/vmlinux.lds.S | 12 ++++++++++++
arch/x86/platform/pvh/head.S | 6 +++---
arch/x86/tools/relocs.c | 1 +
arch/x86/xen/xen-head.S | 6 ++++--
4 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 6e73403e874f..dce17afcc186 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -528,3 +528,15 @@ INIT_PER_CPU(irq_stack_backing_store);
#endif
#endif /* CONFIG_X86_64 */
+
+#ifdef CONFIG_XEN_PV
+xen_elfnote_entry_offset =
+ ABSOLUTE(xen_elfnote_entry) + ABSOLUTE(startup_xen);
+xen_elfnote_hypercall_page_offset =
+ ABSOLUTE(xen_elfnote_hypercall_page) + ABSOLUTE(hypercall_page);
+#endif
+
+#ifdef CONFIG_PVH
+xen_elfnote_phys32_entry_offset =
+ ABSOLUTE(xen_elfnote_phys32_entry) + ABSOLUTE(pvh_start_xen - LOAD_OFFSET);
+#endif
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 592747f2d731..e2ab4c74f596 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -52,7 +52,7 @@
#define PVH_CS_SEL (PVH_GDT_ENTRY_CS * 8)
#define PVH_DS_SEL (PVH_GDT_ENTRY_DS * 8)
-SYM_CODE_START_LOCAL(pvh_start_xen)
+SYM_CODE_START(pvh_start_xen)
UNWIND_HINT_END_OF_STACK
cld
@@ -299,5 +299,5 @@ SYM_DATA_END(pvh_level2_kernel_pgt)
.long KERNEL_IMAGE_SIZE - 1)
#endif
- ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY,
- _ASM_PTR (pvh_start_xen - __START_KERNEL_map))
+ ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY, .global xen_elfnote_phys32_entry;
+ xen_elfnote_phys32_entry: .long xen_elfnote_phys32_entry_offset - .)
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index c101bed61940..3ede19ca8432 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -56,6 +56,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
[S_ABS] =
"^(xen_irq_disable_direct_reloc$|"
"xen_save_fl_direct_reloc$|"
+ "xen_elfnote_.+_offset$|"
"VDSO|"
"__kcfi_typeid_|"
"__crc_)",
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 758bcd47b72d..3deaae3601f7 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -94,7 +94,8 @@ SYM_CODE_END(xen_cpu_bringup_again)
ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE, _ASM_PTR __START_KERNEL_map)
/* Map the p2m table to a 512GB-aligned user address. */
ELFNOTE(Xen, XEN_ELFNOTE_INIT_P2M, .quad (PUD_SIZE * PTRS_PER_PUD))
- ELFNOTE(Xen, XEN_ELFNOTE_ENTRY, _ASM_PTR startup_xen)
+ ELFNOTE(Xen, XEN_ELFNOTE_ENTRY, .globl xen_elfnote_entry;
+ xen_elfnote_entry: _ASM_PTR xen_elfnote_entry_offset - .)
ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .ascii "!writable_page_tables")
ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE, .asciz "yes")
ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID,
@@ -115,7 +116,8 @@ SYM_CODE_END(xen_cpu_bringup_again)
#else
# define FEATURES_DOM0 0
#endif
- ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, _ASM_PTR hypercall_page)
+ ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .globl xen_elfnote_hypercall_page;
+ xen_elfnote_hypercall_page: _ASM_PTR xen_elfnote_hypercall_page_offset - .)
ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES,
.long FEATURES_PV | FEATURES_PVH | FEATURES_DOM0)
ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz "generic")
--
2.46.0.792.g87dc391469-goog
* Re: [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes
2024-09-26 10:41 ` [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes Ard Biesheuvel
@ 2024-09-27 1:46 ` Jason Andryuk
2024-09-27 5:49 ` Ard Biesheuvel
From: Jason Andryuk @ 2024-09-27 1:46 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: Ard Biesheuvel, Juergen Gross, Boris Ostrovsky, x86, xen-devel
On 2024-09-26 06:41, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Xen puts virtual and physical addresses into ELF notes that are treated
> by the linker as relocatable by default. Doing so is not only pointless,
> given that the ELF notes are only intended for consumption by Xen before
> the kernel boots. It is also a KASLR leak, given that the kernel's ELF
> notes are exposed via the world readable /sys/kernel/notes.
>
> So emit these constants in a way that prevents the linker from marking
> them as relocatable. This involves place-relative relocations (which
> subtract their own virtual address from the symbol value) and linker
> provided absolute symbols that add the address of the place to the
> desired value.
>
> While at it, switch to a 32-bit field for XEN_ELFNOTE_PHYS32_ENTRY,
> which better matches the intent as well as the Xen documentation and
> source code.
QEMU parses this according to the ELF bitness. It looks like this reads
8 bytes on 64bit, and 4 on 32. So I think this change would break its
parsing.
(I don't use QEMU PVH and I'm not that familiar with its implementation.)
Regards,
Jason
* Re: [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes
2024-09-27 1:46 ` Jason Andryuk
@ 2024-09-27 5:49 ` Ard Biesheuvel
2024-09-27 7:21 ` Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-27 5:49 UTC (permalink / raw)
To: Jason Andryuk
Cc: Ard Biesheuvel, linux-kernel, Juergen Gross, Boris Ostrovsky, x86,
xen-devel
On Fri, 27 Sept 2024 at 03:47, Jason Andryuk <jason.andryuk@amd.com> wrote:
>
> On 2024-09-26 06:41, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Xen puts virtual and physical addresses into ELF notes that are treated
> > by the linker as relocatable by default. Doing so is not only pointless,
> > given that the ELF notes are only intended for consumption by Xen before
> > the kernel boots. It is also a KASLR leak, given that the kernel's ELF
> > notes are exposed via the world readable /sys/kernel/notes.
> >
> > So emit these constants in a way that prevents the linker from marking
> > them as relocatable. This involves place-relative relocations (which
> > subtract their own virtual address from the symbol value) and linker
> > provided absolute symbols that add the address of the place to the
> > desired value.
> >
> > While at it, switch to a 32-bit field for XEN_ELFNOTE_PHYS32_ENTRY,
> > which better matches the intent as well as the Xen documentation and
> > source code.
>
> QEMU parses this according to the ELF bitness. It looks like this reads
> 8 bytes on 64bit, and 4 on 32. So I think this change would break its
> parsing.
>
Indeed, well spotted.
> (I don't use QEMU PVH and I'm not that familiar with its implementation.)
>
This is what I used for testing, and it worked fine.
But looking at the code, it does dereference a size_t*, which seems
bizarre but will clearly produce garbage in the upper bits if the note
is 32-bits only and followed by unrelated non-zero data.
I'll just back out this part of the change - it isn't really necessary anyway.
* Re: [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes
2024-09-27 5:49 ` Ard Biesheuvel
@ 2024-09-27 7:21 ` Ard Biesheuvel
From: Ard Biesheuvel @ 2024-09-27 7:21 UTC (permalink / raw)
To: Jason Andryuk
Cc: Ard Biesheuvel, linux-kernel, Juergen Gross, Boris Ostrovsky, x86,
xen-devel
On Fri, 27 Sept 2024 at 07:49, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Fri, 27 Sept 2024 at 03:47, Jason Andryuk <jason.andryuk@amd.com> wrote:
> >
> > On 2024-09-26 06:41, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Xen puts virtual and physical addresses into ELF notes that are treated
> > > by the linker as relocatable by default. Doing so is not only pointless,
> > > given that the ELF notes are only intended for consumption by Xen before
> > > the kernel boots. It is also a KASLR leak, given that the kernel's ELF
> > > notes are exposed via the world readable /sys/kernel/notes.
> > >
> > > So emit these constants in a way that prevents the linker from marking
> > > them as relocatable. This involves place-relative relocations (which
> > > subtract their own virtual address from the symbol value) and linker
> > > provided absolute symbols that add the address of the place to the
> > > desired value.
> > >
> > > While at it, switch to a 32-bit field for XEN_ELFNOTE_PHYS32_ENTRY,
> > > which better matches the intent as well as the Xen documentation and
> > > source code.
> >
> > QEMU parses this according to the ELF bitness. It looks like this reads
> > 8 bytes on 64bit, and 4 on 32. So I think this change would break its
> > parsing.
> >
>
> Indeed, well spotted.
>
> > (I don't use QEMU PVH and I'm not that familiar with its implementation.)
> >
>
> This is what I used for testing, and it worked fine.
>
> But looking at the code, it does dereference a size_t*, which seems
> bizarre but will clearly produce garbage in the upper bits if the note
> is 32-bits only and followed by unrelated non-zero data.
>
> I'll just back out this part of the change - it isn't really necessary anyway.
... and fix QEMU as well:
https://lore.kernel.org/qemu-devel/20240927071950.1458596-1-ardb+git@google.com/T/#u
* [PATCH 5/5] x86/pvh: Avoid absolute symbol references in .head.text
2024-09-26 10:41 [PATCH 0/5] x86/xen: Drop absolute references from startup code Ard Biesheuvel
2024-09-26 10:41 ` [PATCH 4/5] x86/xen: Avoid relocatable quantities in Xen ELF notes Ard Biesheuvel
@ 2024-09-26 10:41 ` Ard Biesheuvel
2024-09-27 22:12 ` Jason Andryuk
From: Ard Biesheuvel @ 2024-09-26 10:41 UTC (permalink / raw)
To: linux-kernel
Cc: Ard Biesheuvel, Jason Andryuk, Juergen Gross, Boris Ostrovsky,
x86, xen-devel
From: Ard Biesheuvel <ardb@kernel.org>
The .head.text section contains code that may execute from a different
address than it was linked at. This is fragile, given that the x86 ABI
can refer to global symbols via absolute or relative references, and the
toolchain assumes that these are interchangeable, which they are not in
this particular case.
For this reason, all absolute symbol references are being removed from
code that is emitted into .head.text. Subsequently, build time
validation may be added that ensures that no absolute ELF relocations
exist at all in that ELF section.
In the case of the PVH code, the absolute references are in 32-bit code,
which get emitted with R_X86_64_32 relocations, and these are even more
problematic going forward, as it prevents running the linker in PIE
mode.
So update the 64-bit code to avoid _pa(), and to rely only on relative
symbol references: these are always 32 bits wide, even in 64-bit code,
and are resolved by the linker at build time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/platform/pvh/head.S | 30 ++++++++++++--------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index e2ab4c74f596..b2742259ed60 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -6,7 +6,9 @@
.code32
.text
+#ifdef CONFIG_X86_32
#define _pa(x) ((x) - __START_KERNEL_map)
+#endif
#define rva(x) ((x) - pvh_start_xen)
#include <linux/elfnote.h>
@@ -72,8 +74,7 @@ SYM_CODE_START(pvh_start_xen)
movl $0, %esp
leal rva(gdt)(%ebp), %eax
- leal rva(gdt_start)(%ebp), %ecx
- movl %ecx, 2(%eax)
+ addl %eax, 2(%eax)
lgdt (%eax)
mov $PVH_DS_SEL,%eax
@@ -103,10 +104,23 @@ SYM_CODE_START(pvh_start_xen)
btsl $_EFER_LME, %eax
wrmsr
+ /*
+ * Reuse the non-relocatable symbol emitted for the ELF note to
+ * subtract the build time physical address of pvh_start_xen() from
+ * its actual runtime address, without relying on absolute 32-bit ELF
+ * relocations, as these are not supported by the linker when running
+ * in -pie mode, and should be avoided in .head.text in general.
+ */
mov %ebp, %ebx
- subl $_pa(pvh_start_xen), %ebx /* offset */
+ subl rva(xen_elfnote_phys32_entry)(%ebp), %ebx
jz .Lpagetable_done
+ /*
+ * Store the resulting load offset in phys_base. __pa() needs
+ * phys_base set to calculate the hypercall page in xen_pvh_init().
+ */
+ movl %ebx, rva(phys_base)(%ebp)
+
/* Fixup page-tables for relocation. */
leal rva(pvh_init_top_pgt)(%ebp), %edi
movl $PTRS_PER_PGD, %ecx
@@ -165,14 +179,6 @@ SYM_CODE_START(pvh_start_xen)
xor %edx, %edx
wrmsr
- /*
- * Calculate load offset and store in phys_base. __pa() needs
- * phys_base set to calculate the hypercall page in xen_pvh_init().
- */
- movq %rbp, %rbx
- subq $_pa(pvh_start_xen), %rbx
- movq %rbx, phys_base(%rip)
-
/* Call xen_prepare_pvh() via the kernel virtual mapping */
leaq xen_prepare_pvh(%rip), %rax
addq $__START_KERNEL_map, %rax
@@ -217,7 +223,7 @@ SYM_CODE_END(pvh_start_xen)
.balign 8
SYM_DATA_START_LOCAL(gdt)
.word gdt_end - gdt_start - 1
- .long _pa(gdt_start) /* x86-64 will overwrite if relocated. */
+ .long gdt_start - gdt
.word 0
SYM_DATA_END(gdt)
SYM_DATA_START_LOCAL(gdt_start)
--
2.46.0.792.g87dc391469-goog
* Re: [PATCH 5/5] x86/pvh: Avoid absolute symbol references in .head.text
2024-09-26 10:41 ` [PATCH 5/5] x86/pvh: Avoid absolute symbol references in .head.text Ard Biesheuvel
@ 2024-09-27 22:12 ` Jason Andryuk
From: Jason Andryuk @ 2024-09-27 22:12 UTC (permalink / raw)
To: Ard Biesheuvel, linux-kernel
Cc: Ard Biesheuvel, Juergen Gross, Boris Ostrovsky, x86, xen-devel
On 2024-09-26 06:41, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The .head.text section contains code that may execute from a different
> address than it was linked at. This is fragile, given that the x86 ABI
> can refer to global symbols via absolute or relative references, and the
> toolchain assumes that these are interchangeable, which they are not in
> this particular case.
>
> For this reason, all absolute symbol references are being removed from
> code that is emitted into .head.text. Subsequently, build time
> validation may be added that ensures that no absolute ELF relocations
> exist at all in that ELF section.
>
> In the case of the PVH code, the absolute references are in 32-bit code,
> which get emitted with R_X86_64_32 relocations, and these are even more
> problematic going forward, as it prevents running the linker in PIE
> mode.
>
> So update the 64-bit code to avoid _pa(), and to only rely on relative
> symbol references: these are always 32-bits wide, even in 64-bit code,
> and are resolved by the linker at build time.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Thanks,
Jason