* [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

Make it easier to pass information into relocate_kernel by allowing it to
have actual variables which are set from the real kernel. To do this, move
it into the kernel's .data section, keeping its data and code together
with linker script rules. Execute it from the *copy* instead of its
original in the kernel data section, and clean it up a bit.

Then do what I originally started with, which is add a GDT+IDT and some
exception handling so we can actually catch problems instead of just
suffering a triple fault and wondering why the world hates us.

The serial output of the debug mode can be cleaned up a little, and it's
now even possible to pass in information about which serial port to write
to.

I'll also work on resyncing with the i386 code and applying as many of
these cleanups there as possible. And probably also make the 64-bit one
use a separate image->arch.pgd instead of lumping it into a single 8KiB
"control page" as we do on x86_64 at the moment.

But the basic cleanups are probably ready for another round of bikeshedding.

Testing the preserve_context mode with the following test case:

 #include <unistd.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <linux/kexec.h>
 #include <linux/reboot.h>
 #include <sys/reboot.h>
 #include <sys/syscall.h>

int main (void)
{
        struct kexec_segment segment = {};
	unsigned char purgatory[] = {
		0x66, 0xba, 0xf8, 0x03,	// mov $0x3f8, %dx
		0xb0, 0x42,		// mov $0x42, %al
		0xee,			// outb %al, (%dx)
		0xc3,			// ret
	};
	int ret;

	segment.buf = &purgatory;
	segment.bufsz = sizeof(purgatory);
	segment.mem = (void *)0x400000;
	segment.memsz = 0x1000;
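	ret = syscall(__NR_kexec_load, 0x400000, 1, &segment, KEXEC_PRESERVE_CONTEXT);
	if (ret) {
		perror("kexec_load");
		exit(1);
	}

	/* Jump into the loaded image; with preserve_context it returns here. */
	ret = syscall(__NR_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
		      LINUX_REBOOT_CMD_KEXEC);
	if (ret) {
		perror("kexec reboot");
		exit(1);
	}
	printf("Success\n");
	return 0;
}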


David Woodhouse (16):
      x86/kexec: Clean up and document register use in relocate_kernel_64.S
      x86/kexec: Use named labels in swap_pages in relocate_kernel_64.S
      x86/kexec: Restore GDT on return from preserve_context kexec
      x86/kexec: Only swap pages for preserve_context mode
      x86/kexec: Invoke copy of relocate_kernel() instead of the original
      x86/kexec: Move relocate_kernel to kernel .data section
      x86/kexec: Add data section to relocate_kernel
      x86/kexec: Copy control page into place in machine_kexec_prepare()
      x86/kexec: Drop page_list argument from relocate_kernel()
      x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page
      x86/kexec: Clean up register usage in relocate_kernel()
      x86/kexec: Mark relocate_kernel page as ROX instead of RWX
      x86/kexec: Debugging support: load a GDT
      x86/kexec: Debugging support: Load an IDT and basic exception entry points
      x86/kexec: Debugging support: Dump registers on exception
      [DO NOT MERGE] x86/kexec: enable DEBUG

 arch/x86/include/asm/kexec.h         |  13 +-
 arch/x86/include/asm/sections.h      |   1 +
 arch/x86/kernel/machine_kexec_64.c   |  55 +++--
 arch/x86/kernel/relocate_kernel_64.S | 384 +++++++++++++++++++++++++++--------
 arch/x86/kernel/vmlinux.lds.S        |  12 +-
 5 files changed, 358 insertions(+), 107 deletions(-)




* [RFC PATCH v2 01/16] x86/kexec: Clean up and document register use in relocate_kernel_64.S
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Add more comments explaining what each register contains, and save the
preserve_context flag to a non-clobbered register sooner, to keep things
simpler.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kernel/relocate_kernel_64.S | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index e9e88c342f75..7ee32bcb6e01 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -100,6 +100,9 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	movq	%r10, CP_PA_SWAP_PAGE(%r11)
 	movq	%rdi, CP_PA_BACKUP_PAGES_MAP(%r11)
 
+	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
+	movq	%rcx, %r11
+
 	/* Switch to the identity mapped page tables */
 	movq	%r9, %cr3
 
@@ -116,6 +119,14 @@ SYM_CODE_END(relocate_kernel)
 
 SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	UNWIND_HINT_END_OF_STACK
+	/*
+	 * %rdi	indirection page
+	 * %rdx start address
+	 * %r11 preserve_context
+	 * %r12 host_mem_enc_active
+	 * %r13 original CR4 when relocate_kernel() was invoked
+	 */
+
 	/* set return address to 0 if not preserving context */
 	pushq	$0
 	/* store the start address on the stack */
@@ -170,8 +181,6 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	wbinvd
 .Lsme_off:
 
-	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
-	movq	%rcx, %r11
 	call	swap_pages
 
 	/*
@@ -183,13 +192,14 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	movq	%cr3, %rax
 	movq	%rax, %cr3
 
+	testq	%r11, %r11	/* preserve_context */
+	jnz .Lrelocate
+
 	/*
 	 * set all of the registers to known values
 	 * leave %rsp alone
 	 */
 
-	testq	%r11, %r11
-	jnz .Lrelocate
 	xorl	%eax, %eax
 	xorl	%ebx, %ebx
 	xorl    %ecx, %ecx
-- 
2.47.0




* [RFC PATCH v2 02/16] x86/kexec: Use named labels in swap_pages in relocate_kernel_64.S
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Make the code a little more readable.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kernel/relocate_kernel_64.S | 30 ++++++++++++++--------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 7ee32bcb6e01..ca01e3e2f097 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -272,31 +272,31 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	movq	%rdi, %rcx	/* Put the indirection_page in %rcx */
 	xorl	%edi, %edi
 	xorl	%esi, %esi
-	jmp	1f
+	jmp	.Lstart		/* Should start with an indirection record */
 
-0:	/* top, read another word for the indirection page */
+.Lloop:	/* top, read another word for the indirection page */
 
 	movq	(%rbx), %rcx
 	addq	$8,	%rbx
-1:
+.Lstart:
 	testb	$0x1,	%cl   /* is it a destination page? */
-	jz	2f
+	jz	.Lnotdest
 	movq	%rcx,	%rdi
 	andq	$0xfffffffffffff000, %rdi
-	jmp	0b
-2:
+	jmp	.Lloop
+.Lnotdest:
 	testb	$0x2,	%cl   /* is it an indirection page? */
-	jz	2f
+	jz	.Lnotind
 	movq	%rcx,   %rbx
 	andq	$0xfffffffffffff000, %rbx
-	jmp	0b
-2:
+	jmp	.Lloop
+.Lnotind:
 	testb	$0x4,	%cl   /* is it the done indicator? */
-	jz	2f
-	jmp	3f
-2:
+	jz	.Lnotdone
+	jmp	.Ldone
+.Lnotdone:
 	testb	$0x8,	%cl   /* is it the source indicator? */
-	jz	0b	      /* Ignore it otherwise */
+	jz	.Lloop	      /* Ignore it otherwise */
 	movq	%rcx,   %rsi  /* For ever source page do a copy */
 	andq	$0xfffffffffffff000, %rsi
 
@@ -321,8 +321,8 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	rep ; movsq
 
 	lea	PAGE_SIZE(%rax), %rsi
-	jmp	0b
-3:
+	jmp	.Lloop
+.Ldone:
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-- 
2.47.0




* [RFC PATCH v2 03/16] x86/kexec: Restore GDT on return from preserve_context kexec
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

The restore_processor_state() function explicitly states that "the asm code
that gets us here will have restored a usable GDT". That wasn't true in the
case of returning from a preserve_context kexec. Make it so.

Without this, the kernel was depending on the called function to reload an
appropriate GDT.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index ca01e3e2f097..ed2ae50535dd 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -252,6 +252,11 @@ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
 	movq	CR0(%r8), %r8
 	movq	%rax, %cr3
 	movq	%r8, %cr0
+
+	/* Saved in save_processor_state. */
+	movq    $saved_context, %rax
+	lgdt    saved_context_gdt_desc(%rax)
+
 	movq	%rbp, %rax
 
 	popf
-- 
2.47.0




* [RFC PATCH v2 04/16] x86/kexec: Only swap pages for preserve_context mode
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

There's no need to swap pages (which involves three memcopies for each
page) in the plain kexec case. Just do a single copy from source to
destination page.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 4 ++++
 1 file changed, 4 insertions(+)
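
For anyone reviewing without the assembly in their head, the effect of the
new branch can be modelled in userspace C roughly as below. This is only a
sketch under assumptions, not the kernel code: model_swap_pages(),
run_model() and the static buffers are made-up names, and the IND_* values
are simply the flag bits that swap_pages tests (0x1, 0x2, 0x4, 0x8).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE  4096u
#define IND_DESTINATION  0x1u	/* entry names the next destination page */
#define IND_INDIRECTION  0x2u	/* entry names the next page of entries */
#define IND_DONE         0x4u	/* end of list */
#define IND_SOURCE       0x8u	/* entry names a source page to copy */

/* Hypothetical userspace model of swap_pages; not the kernel code. */
static void model_swap_pages(uintptr_t head, unsigned char *swap_page,
			     int preserve_context)
{
	const uintptr_t mask = ~(uintptr_t)(MODEL_PAGE_SIZE - 1);
	const uintptr_t *next = NULL;	/* %rbx in the assembly */
	unsigned char *dest = NULL;	/* %rdi in the assembly */
	uintptr_t entry = head;		/* %rcx in the assembly */

	for (;;) {
		unsigned char *page = (unsigned char *)(entry & mask);

		if (entry & IND_DESTINATION) {
			dest = page;
		} else if (entry & IND_INDIRECTION) {
			next = (const uintptr_t *)page;
		} else if (entry & IND_DONE) {
			return;
		} else if (entry & IND_SOURCE) {
			assert(dest != NULL);
			if (preserve_context) {
				/* Three copies: old contents survive in the source page. */
				memcpy(swap_page, page, MODEL_PAGE_SIZE);
				memcpy(page, dest, MODEL_PAGE_SIZE);
				memcpy(dest, swap_page, MODEL_PAGE_SIZE);
			} else {
				/* Plain kexec: one copy, old contents are discarded. */
				memcpy(dest, page, MODEL_PAGE_SIZE);
			}
			dest += MODEL_PAGE_SIZE;
		}
		assert(next != NULL);	/* list must start with an indirection entry */
		entry = *next++;
	}
}

static _Alignas(4096) unsigned char src_page[4096], dst_page[4096], swp_page[4096];
static _Alignas(4096) uintptr_t ind_page[512];

/* Moves one source page to one destination; returns the first byte left in src_page. */
static unsigned char run_model(int preserve_context)
{
	memset(src_page, 0xAA, sizeof(src_page));
	memset(dst_page, 0x55, sizeof(dst_page));
	ind_page[0] = (uintptr_t)dst_page | IND_DESTINATION;
	ind_page[1] = (uintptr_t)src_page | IND_SOURCE;
	ind_page[2] = IND_DONE;
	model_swap_pages((uintptr_t)ind_page | IND_INDIRECTION, swp_page,
			 preserve_context);
	assert(dst_page[0] == 0xAA);	/* destination always gets the new page */
	return src_page[0];
}
```

In the model, run_model(0) leaves the source page untouched, while
run_model(1) leaves the old destination contents in the source page, which
is what a later jump back to the original kernel relies on.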

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index ed2ae50535dd..92d5dbed3097 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -308,6 +308,9 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	movq	%rdi, %rdx    /* Save destination page to %rdx */
 	movq	%rsi, %rax    /* Save source page to %rax */
 
+	testq	%r11, %r11    /* Only actually swap for preserve_context */
+	jz	.Lnoswap
+
 	/* copy source page to swap page */
 	movq	%r10, %rdi
 	movl	$512, %ecx
@@ -322,6 +325,7 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	/* copy swap page to destination page */
 	movq	%rdx, %rdi
 	movq	%r10, %rsi
+.Lnoswap:
 	movl	$512, %ecx
 	rep ; movsq
 
-- 
2.47.0




* [RFC PATCH v2 05/16] x86/kexec: Invoke copy of relocate_kernel() instead of the original
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

This currently calls set_memory_x() from machine_kexec_prepare() just
like the 32-bit version does. That's actually a bit earlier than I'd
like, as it leaves the page RWX all the time the image is even *loaded*.

Subsequent commits will eliminate all the writes to the page between the
point it's marked executable in machine_kexec_prepare() and the time that
relocate_kernel() is running and has switched to the identmap %cr3, so
that it can be ROX. But that can't happen until it's moved to the .data
section of the kernel, and *that* can't happen until we start executing
the copy instead of executing it in place in the kernel .text. So break
the circular dependency in those commits by letting it be RWX for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c   | 28 +++++++++++++++++++++-------
 arch/x86/kernel/relocate_kernel_64.S |  5 ++++-
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 9c9ac606893e..3aeb225a0b36 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -156,8 +156,8 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
 	pmd_t *pmd;
 	pte_t *pte;
 
-	vaddr = (unsigned long)relocate_kernel;
-	paddr = __pa(page_address(image->control_code_page)+PAGE_SIZE);
+	vaddr = (unsigned long)page_address(image->control_code_page) + PAGE_SIZE;
+	paddr = __pa(vaddr);
 	pgd += pgd_index(vaddr);
 	if (!pgd_present(*pgd)) {
 		p4d = (p4d_t *)get_zeroed_page(GFP_KERNEL);
@@ -296,6 +296,7 @@ static void load_segments(void)
 
 int machine_kexec_prepare(struct kimage *image)
 {
+	void *control_page = page_address(image->control_code_page) + PAGE_SIZE;
 	unsigned long start_pgtable;
 	int result;
 
@@ -307,11 +308,17 @@ int machine_kexec_prepare(struct kimage *image)
 	if (result)
 		return result;
 
+	set_memory_x((unsigned long)control_page, 1);
+
 	return 0;
 }
 
 void machine_kexec_cleanup(struct kimage *image)
 {
+	void *control_page = page_address(image->control_code_page) + PAGE_SIZE;
+
+	set_memory_nx((unsigned long)control_page, 1);
+
 	free_transition_pgtable(image);
 }
 
@@ -321,6 +328,11 @@ void machine_kexec_cleanup(struct kimage *image)
  */
 void machine_kexec(struct kimage *image)
 {
+	unsigned long (*relocate_kernel_ptr)(unsigned long indirection_page,
+					     unsigned long page_list,
+					     unsigned long start_address,
+					     unsigned int preserve_context,
+					     unsigned int host_mem_enc_active);
 	unsigned long page_list[PAGES_NR];
 	unsigned int host_mem_enc_active;
 	int save_ftrace_enabled;
@@ -369,6 +381,8 @@ void machine_kexec(struct kimage *image)
 		page_list[PA_SWAP_PAGE] = (page_to_pfn(image->swap_page)
 						<< PAGE_SHIFT);
 
+	relocate_kernel_ptr = control_page;
+
 	/*
 	 * The segment registers are funny things, they have both a
 	 * visible and an invisible part.  Whenever the visible part is
@@ -388,11 +402,11 @@ void machine_kexec(struct kimage *image)
 	native_gdt_invalidate();
 
 	/* now call it */
-	image->start = relocate_kernel((unsigned long)image->head,
-				       (unsigned long)page_list,
-				       image->start,
-				       image->preserve_context,
-				       host_mem_enc_active);
+	image->start = relocate_kernel_ptr((unsigned long)image->head,
+					   (unsigned long)page_list,
+					   image->start,
+					   image->preserve_context,
+					   host_mem_enc_active);
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (image->preserve_context)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 92d5dbed3097..70539b1b9545 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -39,6 +39,7 @@
 #define CP_PA_TABLE_PAGE	DATA(0x20)
 #define CP_PA_SWAP_PAGE		DATA(0x28)
 #define CP_PA_BACKUP_PAGES_MAP	DATA(0x30)
+#define CP_VA_CONTROL_PAGE	DATA(0x38)
 
 	.text
 	.align PAGE_SIZE
@@ -99,6 +100,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	movq	%r9, CP_PA_TABLE_PAGE(%r11)
 	movq	%r10, CP_PA_SWAP_PAGE(%r11)
 	movq	%rdi, CP_PA_BACKUP_PAGES_MAP(%r11)
+	movq	%r11, CP_VA_CONTROL_PAGE(%r11)
 
 	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
 	movq	%rcx, %r11
@@ -235,7 +237,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	movq	%rax, %cr3
 	lea	PAGE_SIZE(%r8), %rsp
 	call	swap_pages
-	movq	$virtual_mapped, %rax
+	movq	CP_VA_CONTROL_PAGE(%r8), %rax
+	addq	$(virtual_mapped - relocate_kernel), %rax
 	pushq	%rax
 	ANNOTATE_UNRET_SAFE
 	ret
-- 
2.47.0




* [RFC PATCH v2 06/16] x86/kexec: Move relocate_kernel to kernel .data section
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Now that the copy is executed instead of the original, the relocate_kernel
page can live in the kernel's .data section. This will allow subsequent
commits to actually add real data to it and clean up the code somewhat as
well as making the control page ROX.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/sections.h      |  1 +
 arch/x86/kernel/machine_kexec_64.c   |  4 +++-
 arch/x86/kernel/relocate_kernel_64.S |  6 +-----
 arch/x86/kernel/vmlinux.lds.S        | 11 ++++++++++-
 4 files changed, 15 insertions(+), 7 deletions(-)
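
The __relocate_kernel_start/__relocate_kernel_end pair works like any other
pair of linker-provided section bounds. A rough userspace analogue follows,
assuming an ELF target linked with GNU ld or lld, which generate
__start_<name>/__stop_<name> symbols for sections whose name is a valid C
identifier. The names reloc_demo, demo_a, demo_b and copy_blob() are
invented for the illustration and appear nowhere in the series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two 8-byte objects placed in a custom section named reloc_demo. */
__attribute__((section("reloc_demo"), used))
static uint64_t demo_a = 0x1122334455667788ull;
__attribute__((section("reloc_demo"), used))
static uint64_t demo_b = 0x99aabbccddeeff00ull;

/* Assumption: provided automatically by GNU ld/lld for this section name. */
extern char __start_reloc_demo[], __stop_reloc_demo[];

/*
 * Copy the whole section into dst, sized from the bounds rather than from
 * a fixed maximum. Returns the number of bytes copied, or 0 if dst is too
 * small.
 */
static size_t copy_blob(void *dst, size_t dst_len)
{
	uintptr_t start = (uintptr_t)__start_reloc_demo;
	uintptr_t end = (uintptr_t)__stop_reloc_demo;
	size_t len = (size_t)(end - start);

	if (len > dst_len)
		return 0;
	memcpy(dst, __start_reloc_demo, len);
	return len;
}
```

Sizing the copy as end minus start is what lets the patch drop the
KEXEC_CONTROL_CODE_MAX_SIZE padding from the assembly file.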

diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index 3fa87e5e11ab..30e8ee7006f9 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -5,6 +5,7 @@
 #include <asm-generic/sections.h>
 #include <asm/extable.h>
 
+extern char __relocate_kernel_start[], __relocate_kernel_end[];
 extern char __brk_base[], __brk_limit[];
 extern char __end_rodata_aligned[];
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 3aeb225a0b36..048868d868ce 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -333,6 +333,8 @@ void machine_kexec(struct kimage *image)
 					     unsigned long start_address,
 					     unsigned int preserve_context,
 					     unsigned int host_mem_enc_active);
+	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
+	unsigned long reloc_end = (unsigned long)__relocate_kernel_end;
 	unsigned long page_list[PAGES_NR];
 	unsigned int host_mem_enc_active;
 	int save_ftrace_enabled;
@@ -370,7 +372,7 @@ void machine_kexec(struct kimage *image)
 	}
 
 	control_page = page_address(image->control_code_page) + PAGE_SIZE;
-	__memcpy(control_page, relocate_kernel, KEXEC_CONTROL_CODE_MAX_SIZE);
+	__memcpy(control_page, __relocate_kernel_start, reloc_end - reloc_start);
 
 	page_list[PA_CONTROL_PAGE] = virt_to_phys(control_page);
 	page_list[VA_CONTROL_PAGE] = (unsigned long)control_page;
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 70539b1b9545..085dddf79476 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -41,10 +41,8 @@
 #define CP_PA_BACKUP_PAGES_MAP	DATA(0x30)
 #define CP_VA_CONTROL_PAGE	DATA(0x38)
 
-	.text
-	.align PAGE_SIZE
+	.section .text.relocate_kernel,"ax";
 	.code64
-SYM_CODE_START_NOALIGN(relocate_range)
 SYM_CODE_START_NOALIGN(relocate_kernel)
 	UNWIND_HINT_END_OF_STACK
 	ANNOTATE_NOENDBR
@@ -340,5 +338,3 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	int3
 SYM_CODE_END(swap_pages)
 
-	.skip KEXEC_CONTROL_CODE_MAX_SIZE - (. - relocate_kernel), 0xcc
-SYM_CODE_END(relocate_range);
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index b8c5741d2fb4..925a821134b5 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -95,7 +95,15 @@ const_pcpu_hot = pcpu_hot;
 #define BSS_DECRYPTED
 
 #endif
-
+#if defined(CONFIG_X86_64) && defined(CONFIG_KEXEC_CORE)
+#define KEXEC_RELOCATE_KERNEL					\
+	. = ALIGN(0x100);					\
+	__relocate_kernel_start = .;				\
+	*(.text.relocate_kernel);				\
+	__relocate_kernel_end = .;
+#else
+#define KEXEC_RELOCATE_KERNEL
+#endif
 PHDRS {
 	text PT_LOAD FLAGS(5);          /* R_E */
 	data PT_LOAD FLAGS(6);          /* RW_ */
@@ -181,6 +189,7 @@ SECTIONS
 
 		DATA_DATA
 		CONSTRUCTORS
+		KEXEC_RELOCATE_KERNEL
 
 		/* rarely changed data like cpu maps */
 		READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
-- 
2.47.0




* [RFC PATCH v2 07/16] x86/kexec: Add data section to relocate_kernel
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Now that the relocate_kernel page is handled sanely by a linker script
we can have actual data, and just use %rip-relative addressing to access
it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c   |  7 +++-
 arch/x86/kernel/relocate_kernel_64.S | 62 ++++++++++++++--------------
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 3 files changed, 37 insertions(+), 33 deletions(-)
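
The pointer fix-up in machine_kexec() amounts to "same offset, different
base". A tiny sketch of that arithmetic, with hypothetical names
(entry_in_copy(), entry_demo(), blob, copy) and plain byte arrays standing
in for the relocate_kernel section and the control page:

```c
#include <stdint.h>
#include <string.h>

/*
 * Translate an address inside the original blob to the matching address
 * inside a copy of that blob: keep the offset, swap the base.
 */
static void *entry_in_copy(void *copy_base, const void *blob_start,
			   const void *entry)
{
	uintptr_t offset = (uintptr_t)entry - (uintptr_t)blob_start;

	return (unsigned char *)copy_base + offset;
}

/* Demo: the "entry point" sits 0x40 bytes into a 256-byte blob. */
static int entry_demo(void)
{
	static unsigned char blob[256], copy[256];
	unsigned char *moved;

	blob[0x40] = 0xc3;		/* marker byte at the entry offset */
	memcpy(copy, blob, sizeof(blob));
	moved = entry_in_copy(copy, blob, &blob[0x40]);

	/* 1 if the translated pointer lands on the marker inside the copy. */
	return moved == &copy[0x40] && *moved == 0xc3;
}
```

The %rip-relative accesses in the assembly have the same property without
any fix-up: they encode a distance from the instruction itself, so they
resolve inside whichever copy is executing.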

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 048868d868ce..123e9544506b 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -383,7 +383,12 @@ void machine_kexec(struct kimage *image)
 		page_list[PA_SWAP_PAGE] = (page_to_pfn(image->swap_page)
 						<< PAGE_SHIFT);
 
-	relocate_kernel_ptr = control_page;
+	/*
+	 * Allow for the possibility that relocate_kernel might not be at
+	 * the very start of the page.
+	 */
+	relocate_kernel_ptr = control_page + (unsigned long)relocate_kernel -
+		reloc_start;
 
 	/*
 	 * The segment registers are funny things, they have both a
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 085dddf79476..445ca56dabbe 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -23,23 +23,21 @@
 #define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
 
 /*
- * control_page + KEXEC_CONTROL_CODE_MAX_SIZE
- * ~ control_page + PAGE_SIZE are used as data storage and stack for
- * jumping back
+ * The .text.relocate_kernel and .data.relocate_kernel sections are copied
+ * into the control page, and the remainder of the page is used as the stack.
  */
-#define DATA(offset)		(KEXEC_CONTROL_CODE_MAX_SIZE+(offset))
 
+	.section .data.relocate_kernel,"a";
 /* Minimal CPU state */
-#define RSP			DATA(0x0)
-#define CR0			DATA(0x8)
-#define CR3			DATA(0x10)
-#define CR4			DATA(0x18)
-
-/* other data */
-#define CP_PA_TABLE_PAGE	DATA(0x20)
-#define CP_PA_SWAP_PAGE		DATA(0x28)
-#define CP_PA_BACKUP_PAGES_MAP	DATA(0x30)
-#define CP_VA_CONTROL_PAGE	DATA(0x38)
+SYM_DATA_LOCAL(saved_rsp, .quad 0)
+SYM_DATA_LOCAL(saved_cr0, .quad 0)
+SYM_DATA_LOCAL(saved_cr3, .quad 0)
+SYM_DATA_LOCAL(saved_cr4, .quad 0)
+	/* other data */
+SYM_DATA_LOCAL(va_control_page, .quad 0)
+SYM_DATA_LOCAL(pa_table_page, .quad 0)
+SYM_DATA_LOCAL(pa_swap_page, .quad 0)
+SYM_DATA_LOCAL(pa_backup_pages_map, .quad 0)
 
 	.section .text.relocate_kernel,"ax";
 	.code64
@@ -63,14 +61,13 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	pushq %r15
 	pushf
 
-	movq	PTR(VA_CONTROL_PAGE)(%rsi), %r11
-	movq	%rsp, RSP(%r11)
+	movq	%rsp, saved_rsp(%rip)
 	movq	%cr0, %rax
-	movq	%rax, CR0(%r11)
+	movq	%rax, saved_cr0(%rip)
 	movq	%cr3, %rax
-	movq	%rax, CR3(%r11)
+	movq	%rax, saved_cr3(%rip)
 	movq	%cr4, %rax
-	movq	%rax, CR4(%r11)
+	movq	%rax, saved_cr4(%rip)
 
 	/* Save CR4. Required to enable the right paging mode later. */
 	movq	%rax, %r13
@@ -83,10 +80,11 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	movq	%r8, %r12
 
 	/*
-	 * get physical address of control page now
+	 * get physical and virtual address of control page now
 	 * this is impossible after page table switch
 	 */
 	movq	PTR(PA_CONTROL_PAGE)(%rsi), %r8
+	movq	PTR(VA_CONTROL_PAGE)(%rsi), %r11
 
 	/* get physical address of page table now too */
 	movq	PTR(PA_TABLE_PAGE)(%rsi), %r9
@@ -95,10 +93,10 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	movq	PTR(PA_SWAP_PAGE)(%rsi), %r10
 
 	/* save some information for jumping back */
-	movq	%r9, CP_PA_TABLE_PAGE(%r11)
-	movq	%r10, CP_PA_SWAP_PAGE(%r11)
-	movq	%rdi, CP_PA_BACKUP_PAGES_MAP(%r11)
-	movq	%r11, CP_VA_CONTROL_PAGE(%r11)
+	movq	%r9, pa_table_page(%rip)
+	movq	%r10, pa_swap_page(%rip)
+	movq	%rdi, pa_backup_pages_map(%rip)
+	movq	%r11, va_control_page(%rip)
 
 	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
 	movq	%rcx, %r11
@@ -229,13 +227,13 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	/* get the re-entry point of the peer system */
 	movq	0(%rsp), %rbp
 	leaq	relocate_kernel(%rip), %r8
-	movq	CP_PA_SWAP_PAGE(%r8), %r10
-	movq	CP_PA_BACKUP_PAGES_MAP(%r8), %rdi
-	movq	CP_PA_TABLE_PAGE(%r8), %rax
+	movq	pa_swap_page(%rip), %r10
+	movq	pa_backup_pages_map(%rip), %rdi
+	movq	pa_table_page(%rip), %rax
 	movq	%rax, %cr3
 	lea	PAGE_SIZE(%r8), %rsp
 	call	swap_pages
-	movq	CP_VA_CONTROL_PAGE(%r8), %rax
+	movq	va_control_page(%rip), %rax
 	addq	$(virtual_mapped - relocate_kernel), %rax
 	pushq	%rax
 	ANNOTATE_UNRET_SAFE
@@ -246,11 +244,11 @@ SYM_CODE_END(identity_mapped)
 SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
 	UNWIND_HINT_END_OF_STACK
 	ANNOTATE_NOENDBR // RET target, above
-	movq	RSP(%r8), %rsp
-	movq	CR4(%r8), %rax
+	movq	saved_rsp(%rip), %rsp
+	movq	saved_cr4(%rip), %rax
 	movq	%rax, %cr4
-	movq	CR3(%r8), %rax
-	movq	CR0(%r8), %r8
+	movq	saved_cr3(%rip), %rax
+	movq	saved_cr0(%rip), %r8
 	movq	%rax, %cr3
 	movq	%r8, %cr0
 
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 925a821134b5..324c1c42faae 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -100,6 +100,7 @@ const_pcpu_hot = pcpu_hot;
 	. = ALIGN(0x100);					\
 	__relocate_kernel_start = .;				\
 	*(.text.relocate_kernel);				\
+	*(.data.relocate_kernel);				\
 	__relocate_kernel_end = .;
 #else
 #define KEXEC_RELOCATE_KERNEL
-- 
2.47.0




* [RFC PATCH v2 08/16] x86/kexec: Copy control page into place in machine_kexec_prepare()
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

There's no need for this to wait until the actual machine_kexec() invocation;
a subsequent change will mark the control page ROX so all writes should be
completed earlier.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
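
The ordering this sets up (finish all writes, then flip permissions) is the
usual W^X pattern. A userspace analogue using mmap()/mprotect() is sketched
below, purely as an illustration: load_then_seal() is a made-up helper,
mprotect() merely stands in for what set_memory_x() does to the control
page, and nothing in the mapping is ever executed.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy a blob into a fresh page while it is writable, then drop write
 * permission and add execute permission. Returns the mapping, or NULL on
 * failure (including systems whose policy refuses PROT_EXEC here).
 */
static void *load_then_seal(const void *blob, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0 || len > (size_t)page)
		return NULL;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* All writes happen here, before the page becomes executable. */
	memcpy(p, blob, len);

	if (mprotect(p, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
		munmap(p, (size_t)page);
		return NULL;
	}
	return p;
}
```

Once the page is sealed, any later store to it faults, which is the
property the ROX change later in the series depends on.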

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 123e9544506b..60632a5a2a13 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -297,6 +297,8 @@ static void load_segments(void)
 int machine_kexec_prepare(struct kimage *image)
 {
 	void *control_page = page_address(image->control_code_page) + PAGE_SIZE;
+	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
+	unsigned long reloc_end = (unsigned long)__relocate_kernel_end;
 	unsigned long start_pgtable;
 	int result;
 
@@ -308,6 +310,8 @@ int machine_kexec_prepare(struct kimage *image)
 	if (result)
 		return result;
 
+	__memcpy(control_page, __relocate_kernel_start, reloc_end - reloc_start);
+
 	set_memory_x((unsigned long)control_page, 1);
 
 	return 0;
@@ -334,7 +338,6 @@ void machine_kexec(struct kimage *image)
 					     unsigned int preserve_context,
 					     unsigned int host_mem_enc_active);
 	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
-	unsigned long reloc_end = (unsigned long)__relocate_kernel_end;
 	unsigned long page_list[PAGES_NR];
 	unsigned int host_mem_enc_active;
 	int save_ftrace_enabled;
@@ -372,7 +375,6 @@ void machine_kexec(struct kimage *image)
 	}
 
 	control_page = page_address(image->control_code_page) + PAGE_SIZE;
-	__memcpy(control_page, __relocate_kernel_start, reloc_end - reloc_start);
 
 	page_list[PA_CONTROL_PAGE] = virt_to_phys(control_page);
 	page_list[VA_CONTROL_PAGE] = (unsigned long)control_page;
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 09/16] x86/kexec: Drop page_list argument from relocate_kernel()
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (7 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 08/16] x86/kexec: Copy control page into place in machine_kexec_prepare() David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 10/16] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page David Woodhouse
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

The kernel's virtual mapping of the relocate_kernel page currently needs
to be RWX because it is written to before the %cr3 switch.

Now that the relocate_kernel page has its own .data section and local
variables, it can also have *global* variables. So eliminate the separate
page_list argument, and write the same information directly to variables
in the relocate_kernel page instead. This way, the relocate_kernel code
itself doesn't need to copy it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/kexec.h         | 13 +++++-----
 arch/x86/kernel/machine_kexec_64.c   | 21 +++++++---------
 arch/x86/kernel/relocate_kernel_64.S | 36 ++++++++++------------------
 3 files changed, 27 insertions(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index ae5482a2f0ca..9af54743de90 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -8,12 +8,6 @@
 # define PA_PGD			2
 # define PA_SWAP_PAGE		3
 # define PAGES_NR		4
-#else
-# define PA_CONTROL_PAGE	0
-# define VA_CONTROL_PAGE	1
-# define PA_TABLE_PAGE		2
-# define PA_SWAP_PAGE		3
-# define PAGES_NR		4
 #endif
 
 # define KEXEC_CONTROL_CODE_MAX_SIZE	2048
@@ -63,6 +57,11 @@ struct kimage;
 
 /* The native architecture */
 # define KEXEC_ARCH KEXEC_ARCH_X86_64
+
+extern unsigned long kexec_pa_control_page;
+extern unsigned long kexec_va_control_page;
+extern unsigned long kexec_pa_table_page;
+extern unsigned long kexec_pa_swap_page;
 #endif
 
 /*
@@ -125,7 +124,7 @@ relocate_kernel(unsigned long indirection_page,
 #else
 unsigned long
 relocate_kernel(unsigned long indirection_page,
-		unsigned long page_list,
+		unsigned long pa_control_page,
 		unsigned long start_address,
 		unsigned int preserve_context,
 		unsigned int host_mem_enc_active);
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 60632a5a2a13..c653c2c22d63 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -309,6 +309,13 @@ int machine_kexec_prepare(struct kimage *image)
 	result = init_pgtable(image, start_pgtable);
 	if (result)
 		return result;
+	kexec_va_control_page = (unsigned long)control_page;
+	kexec_pa_table_page =
+	  (unsigned long)__pa(page_address(image->control_code_page));
+
+	if (image->type == KEXEC_TYPE_DEFAULT)
+		kexec_pa_swap_page = (page_to_pfn(image->swap_page)
+						<< PAGE_SHIFT);
 
 	__memcpy(control_page, __relocate_kernel_start, reloc_end - reloc_start);
 
@@ -333,12 +340,11 @@ void machine_kexec_cleanup(struct kimage *image)
 void machine_kexec(struct kimage *image)
 {
 	unsigned long (*relocate_kernel_ptr)(unsigned long indirection_page,
-					     unsigned long page_list,
+					     unsigned long pa_control_page,
 					     unsigned long start_address,
 					     unsigned int preserve_context,
 					     unsigned int host_mem_enc_active);
 	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
-	unsigned long page_list[PAGES_NR];
 	unsigned int host_mem_enc_active;
 	int save_ftrace_enabled;
 	void *control_page;
@@ -376,15 +382,6 @@ void machine_kexec(struct kimage *image)
 
 	control_page = page_address(image->control_code_page) + PAGE_SIZE;
 
-	page_list[PA_CONTROL_PAGE] = virt_to_phys(control_page);
-	page_list[VA_CONTROL_PAGE] = (unsigned long)control_page;
-	page_list[PA_TABLE_PAGE] =
-	  (unsigned long)__pa(page_address(image->control_code_page));
-
-	if (image->type == KEXEC_TYPE_DEFAULT)
-		page_list[PA_SWAP_PAGE] = (page_to_pfn(image->swap_page)
-						<< PAGE_SHIFT);
-
 	/*
 	 * Allow for the possibility that relocate_kernel might not be at
 	 * the very start of the page.
@@ -412,7 +409,7 @@ void machine_kexec(struct kimage *image)
 
 	/* now call it */
 	image->start = relocate_kernel_ptr((unsigned long)image->head,
-					   (unsigned long)page_list,
+					   virt_to_phys(control_page),
 					   image->start,
 					   image->preserve_context,
 					   host_mem_enc_active);
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 445ca56dabbe..b9ad3ef0b982 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -34,9 +34,9 @@ SYM_DATA_LOCAL(saved_cr0, .quad 0)
 SYM_DATA_LOCAL(saved_cr3, .quad 0)
 SYM_DATA_LOCAL(saved_cr4, .quad 0)
 	/* other data */
-SYM_DATA_LOCAL(va_control_page, .quad 0)
-SYM_DATA_LOCAL(pa_table_page, .quad 0)
-SYM_DATA_LOCAL(pa_swap_page, .quad 0)
+SYM_DATA(kexec_va_control_page, .quad 0)
+SYM_DATA(kexec_pa_table_page, .quad 0)
+SYM_DATA(kexec_pa_swap_page, .quad 0)
 SYM_DATA_LOCAL(pa_backup_pages_map, .quad 0)
 
 	.section .text.relocate_kernel,"ax";
@@ -46,7 +46,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	ANNOTATE_NOENDBR
 	/*
 	 * %rdi indirection_page
-	 * %rsi page_list
+	 * %rsi pa_control_page
 	 * %rdx start address
 	 * %rcx preserve_context
 	 * %r8  host_mem_enc_active
@@ -79,31 +79,19 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	/* Save SME active flag */
 	movq	%r8, %r12
 
-	/*
-	 * get physical and virtual address of control page now
-	 * this is impossible after page table switch
-	 */
-	movq	PTR(PA_CONTROL_PAGE)(%rsi), %r8
-	movq	PTR(VA_CONTROL_PAGE)(%rsi), %r11
-
-	/* get physical address of page table now too */
-	movq	PTR(PA_TABLE_PAGE)(%rsi), %r9
-
-	/* get physical address of swap page now */
-	movq	PTR(PA_SWAP_PAGE)(%rsi), %r10
-
-	/* save some information for jumping back */
-	movq	%r9, pa_table_page(%rip)
-	movq	%r10, pa_swap_page(%rip)
+	/* save indirection list for jumping back */
 	movq	%rdi, pa_backup_pages_map(%rip)
-	movq	%r11, va_control_page(%rip)
 
 	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
 	movq	%rcx, %r11
 
 	/* Switch to the identity mapped page tables */
+	movq	kexec_pa_table_page(%rip), %r9
 	movq	%r9, %cr3
 
+	/* Physical address of control page */
+	movq    %rsi, %r8
+
 	/* setup a new stack at the end of the physical control page */
 	lea	PAGE_SIZE(%r8), %rsp
 
@@ -227,13 +215,13 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	/* get the re-entry point of the peer system */
 	movq	0(%rsp), %rbp
 	leaq	relocate_kernel(%rip), %r8
-	movq	pa_swap_page(%rip), %r10
+	movq	kexec_pa_swap_page(%rip), %r10
 	movq	pa_backup_pages_map(%rip), %rdi
-	movq	pa_table_page(%rip), %rax
+	movq	kexec_pa_table_page(%rip), %rax
 	movq	%rax, %cr3
 	lea	PAGE_SIZE(%r8), %rsp
 	call	swap_pages
-	movq	va_control_page(%rip), %rax
+	movq	kexec_va_control_page(%rip), %rax
 	addq	$(virtual_mapped - relocate_kernel), %rax
 	pushq	%rax
 	ANNOTATE_UNRET_SAFE
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 10/16] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (8 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 09/16] x86/kexec: Drop page_list argument from relocate_kernel() David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 11/16] x86/kexec: Clean up register usage in relocate_kernel() David Woodhouse
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

All writes to the relocate_kernel control page are now done *after* the
%cr3 switch via simple %rip-relative addressing, which means the DATA()
macro with its pointer arithmetic can also now be removed.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 29 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index b9ad3ef0b982..5c6456467f08 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -61,21 +61,24 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	pushq %r15
 	pushf
 
-	movq	%rsp, saved_rsp(%rip)
-	movq	%cr0, %rax
-	movq	%rax, saved_cr0(%rip)
-	movq	%cr3, %rax
-	movq	%rax, saved_cr3(%rip)
-	movq	%cr4, %rax
-	movq	%rax, saved_cr4(%rip)
-
-	/* Save CR4. Required to enable the right paging mode later. */
-	movq	%rax, %r13
-
 	/* zero out flags, and disable interrupts */
 	pushq $0
 	popfq
 
+	/* Switch to the identity mapped page tables */
+	movq	%cr3, %rax
+	movq	kexec_pa_table_page(%rip), %r9
+	movq	%r9, %cr3
+
+	/* Save %rsp and CRs. */
+	movq    %rsp, saved_rsp(%rip)
+	movq	%rax, saved_cr3(%rip)
+	movq	%cr0, %rax
+	movq	%rax, saved_cr0(%rip)
+	/* Leave CR4 in %r13 to enable the right paging mode later. */
+	movq	%cr4, %r13
+	movq	%r13, saved_cr4(%rip)
+
 	/* Save SME active flag */
 	movq	%r8, %r12
 
@@ -85,10 +88,6 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
 	movq	%rcx, %r11
 
-	/* Switch to the identity mapped page tables */
-	movq	kexec_pa_table_page(%rip), %r9
-	movq	%r9, %cr3
-
 	/* Physical address of control page */
 	movq    %rsi, %r8
 
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 11/16] x86/kexec: Clean up register usage in relocate_kernel()
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (9 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 10/16] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 12/16] x86/kexec: Mark relocate_kernel page as ROX instead of RWX David Woodhouse
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

The memory encryption flag is passed in %r8 because that's where the
calling convention puts it. Instead of moving it to %r12 and then using
%r8 for other things, just leave it in %r8 and use other registers
instead.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 5c6456467f08..51dc55ac4395 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -79,24 +79,18 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	movq	%cr4, %r13
 	movq	%r13, saved_cr4(%rip)
 
-	/* Save SME active flag */
-	movq	%r8, %r12
-
 	/* save indirection list for jumping back */
 	movq	%rdi, pa_backup_pages_map(%rip)
 
 	/* Save the preserve_context to %r11 as swap_pages clobbers %rcx. */
 	movq	%rcx, %r11
 
-	/* Physical address of control page */
-	movq    %rsi, %r8
-
 	/* setup a new stack at the end of the physical control page */
-	lea	PAGE_SIZE(%r8), %rsp
+	lea	PAGE_SIZE(%rsi), %rsp
 
 	/* jump to identity mapped page */
-	addq	$(identity_mapped - relocate_kernel), %r8
-	pushq	%r8
+	addq	$(identity_mapped - relocate_kernel), %rsi
+	pushq	%rsi
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
@@ -107,8 +101,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	/*
 	 * %rdi	indirection page
 	 * %rdx start address
+	 * %r8 host_mem_enc_active
+	 * %r9 page table page
 	 * %r11 preserve_context
-	 * %r12 host_mem_enc_active
 	 * %r13 original CR4 when relocate_kernel() was invoked
 	 */
 
@@ -161,7 +156,7 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	 * entries that will conflict with the now unencrypted memory
 	 * used by kexec. Flush the caches before copying the kernel.
 	 */
-	testq	%r12, %r12
+	testq	%r8, %r8
 	jz .Lsme_off
 	wbinvd
 .Lsme_off:
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 12/16] x86/kexec: Mark relocate_kernel page as ROX instead of RWX
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (10 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 11/16] x86/kexec: Clean up register usage in relocate_kernel() David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 13/16] x86/kexec: Debugging support: load a GDT David Woodhouse
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

All writes to the page now happen before it gets marked as executable
(or after it's already switched to the identmap page tables where it's
OK to be RWX).

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index c653c2c22d63..2a294daeeb1a 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -319,7 +319,7 @@ int machine_kexec_prepare(struct kimage *image)
 
 	__memcpy(control_page, __relocate_kernel_start, reloc_end - reloc_start);
 
-	set_memory_x((unsigned long)control_page, 1);
+	set_memory_rox((unsigned long)control_page, 1);
 
 	return 0;
 }
@@ -329,6 +329,7 @@ void machine_kexec_cleanup(struct kimage *image)
 	void *control_page = page_address(image->control_code_page) + PAGE_SIZE;
 
 	set_memory_nx((unsigned long)control_page, 1);
+	set_memory_rw((unsigned long)control_page, 1);
 
 	free_transition_pgtable(image);
 }
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 13/16] x86/kexec: Debugging support: load a GDT
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (11 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 12/16] x86/kexec: Mark relocate_kernel page as ROX instead of RWX David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 14/16] x86/kexec: Debugging support: Load an IDT and basic exception entry points David Woodhouse
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

There are some failure modes which lead to triple-faults in the
relocate_kernel function, which are pretty much undebuggable for normal
mortals.

Adding a GDT in the relocate_kernel environment is step 1 towards being
able to catch faults and do something more useful.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 51dc55ac4395..5c174829f794 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -39,6 +39,18 @@ SYM_DATA(kexec_pa_table_page, .quad 0)
 SYM_DATA(kexec_pa_swap_page, .quad 0)
 SYM_DATA_LOCAL(pa_backup_pages_map, .quad 0)
 
+#ifdef DEBUG
+SYM_DATA_START_LOCAL(reloc_kernel_gdt)
+	.balign 16
+	.word   reloc_kernel_gdt_end - reloc_kernel_gdt - 1
+	.long   0
+	.word   0
+	.quad   0x00cf9a000000ffff      /* __KERNEL32_CS */
+	.quad   0x00af9a000000ffff      /* __KERNEL_CS */
+	.quad   0x00cf92000000ffff      /* __KERNEL_DS */
+SYM_DATA_END_LABEL(reloc_kernel_gdt, SYM_L_LOCAL, reloc_kernel_gdt_end)
+#endif /* DEBUG */
+
 	.section .text.relocate_kernel,"ax";
 	.code64
 SYM_CODE_START_NOALIGN(relocate_kernel)
@@ -112,6 +124,21 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	/* store the start address on the stack */
 	pushq   %rdx
 
+#ifdef DEBUG
+	/* Create a GDTR (16 bits limit, 64 bits addr) on stack */
+	leaq	reloc_kernel_gdt(%rip), %rax
+	pushq	%rax
+	pushw	(%rax)
+
+	/* Load the GDT, put the stack back */
+	lgdt	(%rsp)
+	addq	$10, %rsp
+
+	/* Test that we can load segments */
+	movq	%ds, %rax
+	movq	%rax, %ds
+#endif /* DEBUG */
+
 	/*
 	 * Clear X86_CR4_CET (if it was set) such that we can clear CR0_WP
 	 * below.
-- 
2.47.0
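
For readers decoding the three descriptors in reloc_kernel_gdt by hand,
here is a minimal C sketch. It is not part of the patch, and the struct and
function names are hypothetical. It splits an 8-byte segment descriptor into
its fields and checks the L and D bits, which is what distinguishes the
__KERNEL_CS entry (64-bit code) from __KERNEL32_CS and __KERNEL_DS. It also
shows the limit arithmetic, assuming the table is counted as four 8-byte
slots (the leading limit/padding slot plus three descriptors).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a legacy 8-byte segment descriptor (illustrative names). */
struct seg_desc {
	uint32_t base;    /* 32-bit segment base */
	uint32_t limit;   /* 20-bit limit, in bytes or 4KiB units */
	uint8_t  access;  /* P, DPL, S and type bits */
	uint8_t  flags;   /* G, D/B, L, AVL (bit 3 down to bit 0) */
};

static struct seg_desc decode_desc(uint64_t d)
{
	struct seg_desc s;

	s.limit  = (uint32_t)(d & 0xffff) | (uint32_t)((d >> 32) & 0xf0000);
	s.base   = (uint32_t)((d >> 16) & 0xffffff) |
		   (uint32_t)((d >> 32) & 0xff000000);
	s.access = (uint8_t)(d >> 40);
	s.flags  = (uint8_t)((d >> 52) & 0xf);
	return s;
}

/* A 64-bit code segment is an executable code segment with L=1 and D=0. */
static bool is_long_mode_code(uint64_t d)
{
	struct seg_desc s = decode_desc(d);

	return (s.access & 0x18) == 0x18 && (s.flags & 0x6) == 0x2;
}

/* The limit field of a descriptor table register is its size minus one. */
static uint16_t gdtr_limit(unsigned int nr_slots)
{
	return (uint16_t)(nr_slots * 8 - 1);
}
```

Passing 0x00af9a000000ffff to is_long_mode_code() returns true, while the
0x00cf9a... and 0x00cf92... entries return false; gdtr_limit(4) gives the
value that "reloc_kernel_gdt_end - reloc_kernel_gdt - 1" would evaluate to
under the four-slot assumption.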



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 14/16] x86/kexec: Debugging support: Load an IDT and basic exception entry points
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (12 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 13/16] x86/kexec: Debugging support: load a GDT David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 15/16] x86/kexec: Debugging support: Dump registers on exception David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG David Woodhouse
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 114 +++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 5c174829f794..4ace2577afc6 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -40,6 +40,9 @@ SYM_DATA(kexec_pa_swap_page, .quad 0)
 SYM_DATA_LOCAL(pa_backup_pages_map, .quad 0)
 
 #ifdef DEBUG
+	/* Size of each exception handler referenced by the IDT */
+#define EXC_HANDLER_SIZE	6 /* pushi, pushi, 2-byte jmp */
+
 SYM_DATA_START_LOCAL(reloc_kernel_gdt)
 	.balign 16
 	.word   reloc_kernel_gdt_end - reloc_kernel_gdt - 1
@@ -108,6 +111,11 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 	int3
 SYM_CODE_END(relocate_kernel)
 
+#ifdef DEBUG
+	UNWIND_HINT_UNDEFINED
+	.balign 0x100	/* relocate_kernel will be overwritten with an IDT */
+#endif
+
 SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	UNWIND_HINT_END_OF_STACK
 	/*
@@ -137,6 +145,52 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	/* Test that we can load segments */
 	movq	%ds, %rax
 	movq	%rax, %ds
+
+	/* Load address of relocate_kernel, at start of this page, into %rsi */
+	lea	relocate_kernel(%rip), %rsi
+
+	/*
+	 * Build an IDT descriptor in %rax/%rbx. The address is in the low 16
+	 * and high 16 bits of %rax, and low 32 of %rbx. The middle 32 bits
+	 * of %rax hold the selector/ist/flags which are hard-coded below.
+	 */
+	movq	%rsi, %rax         // 1234567890abcdef
+
+	andq	$~0xFFFF, %rax    // 1234567890ab....
+	shlq	$16, %rax         // 567890ab........
+
+	movq	$0x8F000010, %rcx // Present, DPL0, type 0xF (trap gate), __KERNEL_CS.
+	orq	%rcx, %rax        // 567890ab8F000010
+	shlq	$16, %rax         // 90ab8F000010....
+
+	movq	%rsi, %rcx
+	andq	$0xffff, %rcx     // ............cdef
+	orq	%rcx, %rax        // 90ab8F000010cdef
+
+	movq	%rsi, %rbx
+	shrq	$32, %rbx
+
+	/*
+	 * The descriptor was built using the address of relocate_kernel. Add
+	 * the required offset to point to the actual entry points.
+	 */
+	addq	$(exc_vectors - relocate_kernel), %rax
+
+	/* Loop 16 times to handle exception 0-15 */
+	movq	$16, %rcx
+1:
+	movq	%rax, (%rsi)
+	movq	%rbx, 8(%rsi)
+	addq	$16, %rsi
+	addq	$EXC_HANDLER_SIZE, %rax
+	loop	1b
+
+	/* Now put an IDTR on the stack (temporarily) to load it */
+	subq	$0x100, %rsi
+	pushq	%rsi
+	pushw	$0xff
+	lidt	(%rsp)
+	addq	$10, %rsp
 #endif /* DEBUG */
 
 	/*
@@ -345,3 +399,63 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 	int3
 SYM_CODE_END(swap_pages)
 
+#ifdef DEBUG
+SYM_CODE_START_LOCAL_NOALIGN(exc_vectors)
+	/* Each of these is 6 bytes. */
+.macro vec_err exc
+	UNWIND_HINT_ENTRY
+	. = exc_vectors + (\exc * EXC_HANDLER_SIZE)
+	nop
+	nop
+	pushq	$\exc
+	jmp	exc_handler
+.endm
+
+.macro vec_noerr exc
+	UNWIND_HINT_ENTRY
+	. = exc_vectors + (\exc * EXC_HANDLER_SIZE)
+	pushq	$0
+	pushq	$\exc
+	jmp	exc_handler
+.endm
+
+	vec_noerr 0 // #DE
+	vec_noerr 1 // #DB
+	vec_noerr 2 // #NMI
+	vec_noerr 3 // #BP
+	vec_noerr 4 // #OF
+	vec_noerr 5 // #BR
+	vec_noerr 6 // #UD
+	vec_noerr 7 // #NM
+	vec_err 8   // #DF
+	vec_noerr 9
+	vec_err 10 // #TS
+	vec_err 11 // #NP
+	vec_err 12 // #SS
+	vec_err 13 // #GP
+	vec_err 14 // #PF
+	vec_noerr 15
+SYM_CODE_END(exc_vectors)
+
+SYM_CODE_START_LOCAL_NOALIGN(exc_handler)
+	pushq	%rax
+	pushq	%rdx
+	movw	$0x3f8, %dx
+	movb	$'A', %al
+	outb	%al, %dx
+	popq	%rdx
+	popq	%rax
+
+	/* Only return from int3 */
+	cmpq	$3, (%rsp)
+	jne	.Ldie
+
+	addq	$16, %rsp
+	iretq
+
+.Ldie:
+	hlt
+	jmp	.Ldie
+
+SYM_CODE_END(exc_handler)
+#endif /* DEBUG */
-- 
2.47.0
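
The shift-and-mask sequence in identity_mapped that builds each IDT entry
can be hard to follow in assembly. Below is a minimal C sketch of the same
16-byte gate layout, for illustration only: it is not part of the patch, and
struct idt_gate, make_gate(), stub_address() and fill_idt() are hypothetical
names. The selector (0x10) and type/attribute byte (0x8f) are the values
hard-coded in the patch, and the IST field is assumed to be zero.

```c
#include <stdint.h>

#define EXC_HANDLER_SIZE 6	/* as in the patch: two push imm8 plus a short jmp */
#define NR_VECTORS       16	/* exceptions 0-15 */

/* One 16-byte long-mode IDT gate, viewed as two quadwords. */
struct idt_gate {
	uint64_t lo;	/* offset[15:0], selector, IST, attributes, offset[31:16] */
	uint64_t hi;	/* offset[63:32] in the low half, upper half reserved */
};

static struct idt_gate make_gate(uint64_t handler, uint16_t selector,
				 uint8_t type_attr)
{
	struct idt_gate g;

	g.lo = (handler & 0xffff) |
	       ((uint64_t)selector << 16) |
	       ((uint64_t)type_attr << 40) |
	       (((handler >> 16) & 0xffff) << 48);
	g.hi = handler >> 32;
	return g;
}

/* Entry stub for vector n, given the address of the first stub. */
static uint64_t stub_address(uint64_t stubs_base, unsigned int n)
{
	return stubs_base + (uint64_t)n * EXC_HANDLER_SIZE;
}

/* Fill all 16 gates, one per fixed-size stub. */
static void fill_idt(struct idt_gate idt[NR_VECTORS], uint64_t stubs_base)
{
	for (unsigned int n = 0; n < NR_VECTORS; n++)
		idt[n] = make_gate(stub_address(stubs_base, n), 0x10, 0x8f);
}
```

With the example address used in the assembly comments, make_gate() with
handler 0x1234567890abcdef yields a low quadword of 0x90ab8f000010cdef and
a high quadword of 0x12345678. The assembly steps %rax by EXC_HANDLER_SIZE
directly instead of rebuilding each gate, which gives the same result as
long as the low 16 bits of the stub address do not carry.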



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 15/16] x86/kexec: Debugging support: Dump registers on exception
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (13 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 14/16] x86/kexec: Debugging support: Load an IDT and basic exception entry points David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-22 22:38 ` [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG David Woodhouse
  15 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 83 +++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 4ace2577afc6..67f6853c7abe 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -400,6 +400,55 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
 SYM_CODE_END(swap_pages)
 
 #ifdef DEBUG
+/*
+ * This allows other types of serial ports to be used.
+ *  - %al: Character to be printed (no clobber %rax)
+ *  - %rdx: MMIO address or port.
+ */
+.macro pr_char
+	outb	%al, %dx
+.endm
+
+/* Print the nybble in %bl, clobber %rax */
+SYM_CODE_START_LOCAL_NOALIGN(pr_nybble)
+	UNWIND_HINT_FUNC
+	movb	%bl, %al
+	nop
+	andb	$0x0f, %al
+	addb	$0x30, %al
+	cmpb	$0x3a, %al
+	jb	1f
+	addb	$('a' - '0' - 10), %al
+1:	pr_char
+	ANNOTATE_UNRET_SAFE
+	ret
+SYM_CODE_END(pr_nybble)
+
+SYM_CODE_START_LOCAL_NOALIGN(pr_qword)
+	UNWIND_HINT_FUNC
+	movq	$16, %rcx
+1:	rolq	$4, %rbx
+	call	pr_nybble
+	loop	1b
+	movb	$'\n', %al
+	pr_char
+	ANNOTATE_UNRET_SAFE
+	ret
+SYM_CODE_END(pr_qword)
+
+.macro print_reg a, b, c, d, r
+	movb	$\a, %al
+	pr_char
+	movb	$\b, %al
+	pr_char
+	movb	$\c, %al
+	pr_char
+	movb	$\d, %al
+	pr_char
+	movq	\r, %rbx
+	call	pr_qword
+.endm
+
 SYM_CODE_START_LOCAL_NOALIGN(exc_vectors)
 	/* Each of these is 6 bytes. */
 .macro vec_err exc
@@ -439,11 +488,39 @@ SYM_CODE_END(exc_vectors)
 
 SYM_CODE_START_LOCAL_NOALIGN(exc_handler)
 	pushq	%rax
+	pushq	%rbx
+	pushq	%rcx
 	pushq	%rdx
+
 	movw	$0x3f8, %dx
-	movb	$'A', %al
-	outb	%al, %dx
+
+	/* rip and exception info */
+	print_reg 'E', 'x', 'c', ':', 32(%rsp)
+	print_reg 'E', 'r', 'r', ':', 40(%rsp)
+	print_reg 'r', 'i', 'p', ':', 48(%rsp)
+
+	/* We spilled these to the stack */
+	print_reg 'r', 'a', 'x', ':', 24(%rsp)
+	print_reg 'r', 'b', 'x', ':', 16(%rsp)
+	print_reg 'r', 'c', 'x', ':', 8(%rsp)
+	print_reg 'r', 'd', 'x', ':', (%rsp)
+
+	/* Other registers */
+	print_reg 'r', 's', 'i', ':', %rsi
+	print_reg 'r', 'd', 'i', ':', %rdi
+	print_reg 'r', '8', ' ', ':', %r8
+	print_reg 'r', '9', ' ', ':', %r9
+	print_reg 'r', '1', '0', ':', %r10
+	print_reg 'r', '1', '1', ':', %r11
+	print_reg 'r', '1', '2', ':', %r12
+	print_reg 'r', '1', '3', ':', %r13
+	print_reg 'r', '1', '4', ':', %r14
+	print_reg 'r', '1', '5', ':', %r15
+	print_reg 'c', 'r', '2', ':', %cr2
+
 	popq	%rdx
+	popq	%rcx
+	popq	%rbx
 	popq	%rax
 
 	/* Only return from int3 */
@@ -456,6 +533,6 @@ SYM_CODE_START_LOCAL_NOALIGN(exc_handler)
 .Ldie:
 	hlt
 	jmp	.Ldie
-
+	int3
 SYM_CODE_END(exc_handler)
 #endif /* DEBUG */
-- 
2.47.0
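
As a cross-check of pr_nybble and pr_qword, here is a minimal C sketch of
the same hex-formatting logic. It is an illustration, not part of the
patch: the function names are hypothetical, and it writes into a buffer
where the assembly writes each character to the serial port. It mirrors
the assembly step by step: rotate the value left by four bits, then convert
the low nybble, sixteen times.

```c
#include <stdint.h>

/* Convert the low four bits of v to a lowercase hex digit. */
static char nybble_to_hex(uint8_t v)
{
	char c = (char)((v & 0x0f) + '0');

	/* Values 10-15 land just past '9'; shift them up to 'a'-'f'. */
	if (c >= 0x3a)
		c += 'a' - '0' - 10;
	return c;
}

/* out must have room for 17 bytes: 16 hex digits and a terminating NUL. */
static void format_qword(uint64_t v, char out[17])
{
	for (int i = 0; i < 16; i++) {
		v = (v << 4) | (v >> 60);	/* equivalent of rolq $4 */
		out[i] = nybble_to_hex((uint8_t)v);
	}
	out[16] = '\0';
}
```

Rotating before converting means the most significant nybble is emitted
first, so the digits come out in reading order with no separate shift count.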



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
  2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
                   ` (14 preceding siblings ...)
  2024-11-22 22:38 ` [RFC PATCH v2 15/16] x86/kexec: Debugging support: Dump registers on exception David Woodhouse
@ 2024-11-22 22:38 ` David Woodhouse
  2024-11-25  9:21   ` Ingo Molnar
  15 siblings, 1 reply; 21+ messages in thread
From: David Woodhouse @ 2024-11-22 22:38 UTC (permalink / raw)
  To: kexec
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, David Woodhouse, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

From: David Woodhouse <dwmw@amazon.co.uk>

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/relocate_kernel_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 67f6853c7abe..ebbd76c9a3e9 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -14,6 +14,8 @@
 #include <asm/nospec-branch.h>
 #include <asm/unwind_hints.h>
 
+#define DEBUG
+
 /*
  * Must be relocatable PIC code callable as a C function, in particular
  * there must be a plain RET and not jump to return thunk.
@@ -191,6 +193,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	pushw	$0xff
 	lidt	(%rsp)
 	addq	$10, %rsp
+
+	int3
 #endif /* DEBUG */
 
 	/*
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
  2024-11-22 22:38 ` [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG David Woodhouse
@ 2024-11-25  9:21   ` Ingo Molnar
  2024-11-25  9:32     ` David Woodhouse
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2024-11-25  9:21 UTC (permalink / raw)
  To: David Woodhouse
  Cc: kexec, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, David Woodhouse, Kirill A. Shutemov,
	Kai Huang, Nikolay Borisov, linux-kernel, Simon Horman,
	Dave Young, Peter Zijlstra, jpoimboe


* David Woodhouse <dwmw2@infradead.org> wrote:

> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kernel/relocate_kernel_64.S | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 67f6853c7abe..ebbd76c9a3e9 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -14,6 +14,8 @@
>  #include <asm/nospec-branch.h>
>  #include <asm/unwind_hints.h>
>  
> +#define DEBUG
> +
>  /*
>   * Must be relocatable PIC code callable as a C function, in particular
>   * there must be a plain RET and not jump to return thunk.
> @@ -191,6 +193,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>  	pushw	$0xff
>  	lidt	(%rsp)
>  	addq	$10, %rsp
> +
> +	int3
>  #endif /* DEBUG */

That's a really nice piece of debugging code written in assembly, 
combined with the exception handling feature that generates debug 
output to begin with. Epic effort. :-)

Just curious: did you write this code to debug the series, or was there 
some original hair-tearing regression that motivated you? Is there an 
upstream fix to marvel at and be horrified about in equal measure?

I'd argue that this debugging code probably needs a default-off Kconfig 
option, even with the obvious hard-coded environmental limitations & 
assumptions it has. Could be useful for very early debugging & would 
preserve your effort without it bitrotting too obviously.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
  2024-11-25  9:21   ` Ingo Molnar
@ 2024-11-25  9:32     ` David Woodhouse
  2024-11-25 20:34       ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: David Woodhouse @ 2024-11-25  9:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kexec, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

On Mon, 2024-11-25 at 10:21 +0100, Ingo Molnar wrote:
> 
> * David Woodhouse <dwmw2@infradead.org> wrote:
> 
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> >  arch/x86/kernel/relocate_kernel_64.S | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > index 67f6853c7abe..ebbd76c9a3e9 100644
> > --- a/arch/x86/kernel/relocate_kernel_64.S
> > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > @@ -14,6 +14,8 @@
> >  #include <asm/nospec-branch.h>
> >  #include <asm/unwind_hints.h>
> >  
> > +#define DEBUG
> > +
> >  /*
> >   * Must be relocatable PIC code callable as a C function, in particular
> >   * there must be a plain RET and not jump to return thunk.
> > @@ -191,6 +193,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> >  	pushw	$0xff
> >  	lidt	(%rsp)
> >  	addq	$10, %rsp
> > +
> > +	int3
> >  #endif /* DEBUG */
> 
> That's a really nice piece of debugging code written in assembly, 
> combined with the exception handling feature that generates debug 
> output to begin with. Epic effort. :-)

Thanks :)

> Just curious: did you write this code to debug the series, or was there
> some original hair-tearing regression that motivated you? Is there an
> upstream fix to marvel at and be horrified about in equal measure?

https://lore.kernel.org/all/2ab14f6f-2690-056b-cf9e-38a12dafd728@amd.com/t/#u
is the upstream fix. It's all the more horrifying because it was
already *fixed* upstream before I lost weeks of my life to chasing it.
And the trigger which actually made it *happen*, and made our
production systems allocate memory within that dangerous 1MiB region
adjacent to the RMP table, was a tweak to the NMI watchdog period...
leading to an assumption that we were getting stray perf NMIs during
the kexec, and a *long* wild goose chase based on that false
assumption...

Once I'd written the debug code, I just wanted to clean it up a bit and
push it out for the benefit of others; that *was* the main point of
this series. All the rest of the cleanups are just yak shaving.

The realisation that we never even explicitly mapped the control code
page and always just got lucky because it happened to be in the same
2MiB or 1GiB superpage as something else that we did map... was just a
bonus :)

(That one is fixed in v3 which I'll post shortly, and is already in 
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kexec-debug
)
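
To illustrate the superpage remark above: an address that was never mapped
explicitly can still be reachable if it lies inside the same naturally
aligned 2MiB or 1GiB region as an address that was mapped with a large page.
The following is a minimal sketch of that arithmetic only, not code from the
series; the helper name same_superpage() and the SZ_2M/SZ_1G macros are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Large page sizes on x86_64 (illustrative local definitions). */
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper: returns true if physical addresses 'a' and 'b'
 * fall inside the same naturally aligned region of 'size' bytes.
 * 'size' must be a power of two.
 */
static bool same_superpage(uint64_t a, uint64_t b, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (a & ~(size - 1)) == (b & ~(size - 1));
}
```

With made-up addresses, same_superpage(0x1000000, 0x1200000, SZ_2M) is false
while the same pair with SZ_1G is true, which is the kind of difference that
decides whether an unmapped page "just works".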

> I'd argue that this debugging code probably needs a default-off Kconfig
> option, even with the obvious hard-coded environmental limitations &
> assumptions it has. Could be useful for very early debugging & would
> preserve your effort without it bitrotting too obviously.

Yeah. In v3 I've made it a config option, and made it use the
early_printk serial console (as long as that's an I/O based 8250; we
can add others too later).


* Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
  2024-11-25  9:32     ` David Woodhouse
@ 2024-11-25 20:34       ` Ingo Molnar
  2024-11-25 20:46         ` David Woodhouse
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2024-11-25 20:34 UTC (permalink / raw)
  To: David Woodhouse
  Cc: kexec, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe


* David Woodhouse <dwmw2@infradead.org> wrote:

> > Just curious: did you write this code to debug the series, or was
> > there some original hair-tearing regression that motivated you? Is
> > there an upstream fix to marvel at and be horrified about in
> > equal measure?
> 
> https://lore.kernel.org/all/2ab14f6f-2690-056b-cf9e-38a12dafd728@amd.com/t/#u
> is the upstream fix.

Which ended up being the following upstream commit:

  88a921aa3c6b ("x86/sev: Ensure that RMP table fixups are reserved")

Might make sense to add this commit reference to one of the central 
patches of the GDT/IDT code, to document how this feature is able to 
pin down very hard to debug regressions. (Even if the upstream fix was 
done independently in probably luckier circumstances.)

> [...] It's all the more horrifying because it was already *fixed* 
> upstream before I lost weeks of my life to chasing it. And the 
> trigger which actually made it *happen*, and made our production 
> systems allocate memory within that dangerous 1MiB region adjacent to 
> the RMP table, was a tweak to the NMI watchdog period... leading to 
> an assumption that we were getting stray perf NMIs during the kexec, 
> and a *long* wild goose chase based on that false assumption...

:-/

> Once I'd written the debug code, I just wanted to clean it up a bit 
> and push it out for the benefit of others; that *was* the main point 
> of this series. All the rest of the cleanups are just yak shaving.
> 
> The realisation that we never even explicitly mapped the control code 
> page and always just got lucky because it happened to be in the same 
> 2MiB or 1GiB superpage as something else that we did map... was just 
> a bonus :)

I'm amazed and horrified in equal measure ;-)

> (That one is fixed in v3 which I'll post shortly, and is already in 
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kexec-debug
> )
> 
> > I'd argue that this debugging code probably needs a default-off Kconfig
> > option, even with the obvious hard-coded environmental limitations &
> > assumptions it has. Could be useful for very early debugging & would
> > preserve your effort without it bitrotting too obviously.
> 
> Yeah. In v3 I've made it a config option, and made it use the 
> early_printk serial console (as long as that's an I/O based 8250; we 
> can add others too later).

That's lovely!

Thanks,

	Ingo



* Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
  2024-11-25 20:34       ` Ingo Molnar
@ 2024-11-25 20:46         ` David Woodhouse
  0 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2024-11-25 20:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kexec, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, Kirill A. Shutemov, Kai Huang,
	Nikolay Borisov, linux-kernel, Simon Horman, Dave Young,
	Peter Zijlstra, jpoimboe

On Mon, 2024-11-25 at 21:34 +0100, Ingo Molnar wrote:
>  
> > The realisation that we never even explicitly mapped the control code 
> > page and always just got lucky because it happened to be in the same 
> > 2MiB or 1GiB superpage as something else that we did map... was just 
> > a bonus :)
> 
> I'm amazed and horrified in equal measure ;-)

:)

The rest of today was dedicated to finding out that that isn't entirely
true. Mapping the control page explicitly was only helping because it
forced 2MiB mappings instead of a 1GiB mapping, and masked the fact
that PTI was causing the identmap code to scribble off the end of the
root PGD page...

It all just worked by pure fluke on x86_64 before, because x86_64 would
allocate an 8KiB control region and use the first half of it for the
PGD, and *then* copy the trampoline code into the second half, after
the identmap code had finished scribbling on it. So when I cleaned that
up to allocate the PGD separately and explicitly like i386 does, that's
why it exploded; not just due to allocation patterns.
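
The ordering effect described above can be modelled in user space. This is
only a sketch under stated assumptions, not the kernel code: struct
control_region, build_pgd_with_overrun() and copy_trampoline() are
hypothetical names, and the 0xAA fill merely stands in for whatever the
page-table setup wrote past the end of the PGD page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Model of the old 8KiB control region: PGD first, trampoline code second. */
struct control_region {
	uint8_t pgd[PAGE_SIZE];
	uint8_t code[PAGE_SIZE];
};

/* Simulates page-table setup that writes 'overrun' bytes past the PGD page. */
static void build_pgd_with_overrun(struct control_region *r, size_t overrun)
{
	assert(overrun <= PAGE_SIZE);
	memset(r->pgd, 0xAA, PAGE_SIZE);
	memset(r->code, 0xAA, overrun);	/* stray writes land in the code half */
}

/* Copies the trampoline into the second half of the region. */
static void copy_trampoline(struct control_region *r,
			    const uint8_t *src, size_t len)
{
	assert(len <= PAGE_SIZE);
	memcpy(r->code, src, len);
}
```

Calling build_pgd_with_overrun() first and copy_trampoline() second leaves
the code half intact, because the copy overwrites the stray bytes. In the
opposite order the start of the trampoline would be clobbered. Once the PGD
lives in a separate page, nothing overwrites the stray bytes any more, and
they land on whatever follows that page.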

Still, I think I have a handle on fairly much everything that's broken,
except the occasional warning on the way back from
KEXEC_PRESERVE_CONTEXT thus:

[    1.423464] ------------[ cut here ]------------
[    1.423950] Interrupts enabled after irqrouter_resume+0x0/0x50
[    1.424605] WARNING: CPU: 0 PID: 215 at drivers/base/syscore.c:103 syscore_resume+0x152/0x180
[    1.425467] Modules linked in:
[    1.425791] CPU: 0 UID: 0 PID: 215 Comm: kexec Not tainted 6.12.0-rc5+ #2015
[    1.426498] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[    1.427628] RIP: 0010:syscore_resume+0x152/0x180
[    1.428101] Code: 00 e9 e1 fe ff ff 80 3d b1 b8 c4 01 00 0f 85 21 ff ff ff 48 8b 73 18 48 c7 c7 32 b8 b6 ac c6 05 99 b8 c4 01 01 e8 9e 3f 55 ff <0f> 0b e9 03 ff ff ff 80 3d 87 b8 c4 01 00 0f 85 b8 fe ff ff 48 c7
[    1.429913] RSP: 0018:ffffae9bc03bfd00 EFLAGS: 00010282
[    1.430445] RAX: 0000000000000000 RBX: ffffffffad6fbb20 RCX: ffffffffad5636a8
[    1.431153] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
[    1.431869] RBP: 0000000028121969 R08: 0000000000000000 R09: 0000000000000000
[    1.432594] R10: ffffae9bc03bfaa8 R11: 7075727265746e49 R12: ffffae9bc03bfd28
[    1.433313] R13: ffffffffad471f60 R14: 00000000fee1dead R15: 0000000000000000
[    1.434021] FS:  00007f77d4a45740(0000) GS:ffff91d0fd600000(0000) knlGS:0000000000000000
[    1.434815] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.435385] CR2: 00007f7e011e7070 CR3: 00000000012fe001 CR4: 0000000000170ef0
[    1.436073] Call Trace:
[    1.436334]  <TASK>
[    1.436558]  ? syscore_resume+0x152/0x180
[    1.436956]  ? __warn.cold+0x93/0xfa
[    1.437319]  ? syscore_resume+0x152/0x180
[    1.437717]  ? report_bug+0xff/0x140
[    1.438075]  ? handle_bug+0x58/0x90
[    1.438438]  ? exc_invalid_op+0x17/0x70
[    1.438826]  ? asm_exc_invalid_op+0x1a/0x20
[    1.439246]  ? syscore_resume+0x152/0x180
[    1.439644]  kernel_kexec+0x10a/0x160
[    1.440010]  __do_sys_reboot+0x1fd/0x240
[    1.440485]  do_syscall_64+0x82/0x160
[    1.440863]  ? syscall_exit_to_user_mode+0x10/0x210
[    1.441351]  ? do_syscall_64+0x8e/0x160
[    1.441735]  ? exc_page_fault+0x7e/0x180
[    1.442123]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[    1.442623] RIP: 0033:0x7f77d4b5adb7
[    1.442992] Code: c7 c0 ff ff ff ff eb be 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 49 50 0c 00 f7 d8 64 89 02 b8
[    1.444757] RSP: 002b:00007ffc56bc30f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
[    1.445493] RAX: ffffffffffffffda RBX: 00007ffc56bc3260 RCX: 00007f77d4b5adb7
[    1.446173] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[    1.446848] RBP: 00007ffc56bc32c0 R08: 000055e1cef3e010 R09: 0000000000000007
[    1.447527] R10: 000055e1cef41020 R11: 0000000000000246 R12: 0000000000000001
[    1.448219] R13: 000055e19046b896 R14: 000055e1cef3e4a0 R15: 0000000000000000
[    1.448893]  </TASK>
[    1.449119] ---[ end trace 0000000000000000 ]---
[    1.452539] Enabling non-boot CPUs ...
[    1.452935] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[    1.453678] smpboot: Booting Node 0 Processor 1 APIC 0x1
[    1.455531] CPU1 is up
[    1.460031] virtio_blk virtio1: 2/0/0 default/read/poll queues
[    1.465246] OOM killer enabled.
[    1.465580] Restarting tasks ... done.
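
For readers decoding the user-space half of the trace: ORIG_RAX, RDI, RSI and
RDX in the second register block are the reboot(2) syscall number and its
first three arguments. The sketch below checks those values; the constants
repeat the values from <linux/reboot.h> under local names so the block stands
alone, and is_kexec_reboot_call() is a hypothetical helper, not a kernel
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in <linux/reboot.h>, repeated here under local names. */
#define REBOOT_MAGIC1     UINT32_C(0xfee1dead)
#define REBOOT_MAGIC2     UINT32_C(0x28121969)
#define REBOOT_CMD_KEXEC  UINT32_C(0x45584543)
/* reboot(2) syscall number on x86_64 (0xa9). */
#define X86_64_NR_REBOOT  169

static_assert(REBOOT_MAGIC2 == UINT32_C(672274793),
	      "second magic is 672274793 in decimal");

/*
 * Hypothetical helper: true if the saved registers describe a
 * reboot(MAGIC1, MAGIC2, CMD_KEXEC, ...) call on x86_64.
 */
static bool is_kexec_reboot_call(uint64_t orig_rax, uint64_t rdi,
				 uint64_t rsi, uint64_t rdx)
{
	return orig_rax == X86_64_NR_REBOOT &&
	       (uint32_t)rdi == REBOOT_MAGIC1 &&
	       (uint32_t)rsi == REBOOT_MAGIC2 &&
	       (uint32_t)rdx == REBOOT_CMD_KEXEC;
}
```

Feeding in the values from the trace (ORIG_RAX 0xa9, RDI 0xfee1dead,
RSI 0x28121969, RDX 0x45584543) matches, which is consistent with
kernel_kexec() appearing under __do_sys_reboot() in the call trace.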

Thread overview: 21+ messages
2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 01/16] x86/kexec: Clean up and document register use in relocate_kernel_64.S David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 02/16] x86/kexec: Use named labels in swap_pages " David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 03/16] x86/kexec: Restore GDT on return from preserve_context kexec David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 04/16] x86/kexec: Only swap pages for preserve_context mode David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 05/16] x86/kexec: Invoke copy of relocate_kernel() instead of the original David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 06/16] x86/kexec: Move relocate_kernel to kernel .data section David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 07/16] x86/kexec: Add data section to relocate_kernel David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 08/16] x86/kexec: Copy control page into place in machine_kexec_prepare() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 09/16] x86/kexec: Drop page_list argument from relocate_kernel() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 10/16] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 11/16] x86/kexec: Clean up register usage in relocate_kernel() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 12/16] x86/kexec: Mark relocate_kernel page as ROX instead of RWX David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 13/16] x86/kexec: Debugging support: load a GDT David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 14/16] x86/kexec: Debugging support: Load an IDT and basic exception entry points David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 15/16] x86/kexec: Debugging support: Dump registers on exception David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG David Woodhouse
2024-11-25  9:21   ` Ingo Molnar
2024-11-25  9:32     ` David Woodhouse
2024-11-25 20:34       ` Ingo Molnar
2024-11-25 20:46         ` David Woodhouse