From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 08/16] arm64/kexec: Add core kexec support
Date: Fri, 30 Oct 2015 16:29:01 +0000
Message-ID: <56339ACD.7010506@arm.com>
In-Reply-To: <a1c2a702f127d9dade1f9af8ab13decb2ef1c0da.1445297709.git.geoff@infradead.org>
Hi Geoff,
On 20/10/15 00:38, Geoff Levand wrote:
> Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
> arm64 architecture that add support for the kexec re-boot mechanism
> (CONFIG_KEXEC) on arm64 platforms.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
> arch/arm64/Kconfig | 10 +++
> arch/arm64/include/asm/kexec.h | 48 +++++++++++
> arch/arm64/kernel/Makefile | 2 +
> arch/arm64/kernel/cpu-reset.S | 2 +-
> arch/arm64/kernel/machine_kexec.c | 141 +++++++++++++++++++++++++++++++
> arch/arm64/kernel/relocate_kernel.S | 163 ++++++++++++++++++++++++++++++++++++
> include/uapi/linux/kexec.h | 1 +
> 7 files changed, 366 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/kexec.h
> create mode 100644 arch/arm64/kernel/machine_kexec.c
> create mode 100644 arch/arm64/kernel/relocate_kernel.S
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 07d1811..73e8e31 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -491,6 +491,16 @@ config SECCOMP
> and the task is only allowed to execute a few safe syscalls
> defined by each seccomp mode.
>
> +config KEXEC
> + depends on (!SMP || PM_SLEEP_SMP)
Commit 4b3dc9679cf7 got rid of '!SMP'.
> + select KEXEC_CORE
> + bool "kexec system call"
> + ---help---
> + kexec is a system call that implements the ability to shutdown your
> + current kernel, and to start another kernel. It is like a reboot
> + but it is independent of the system firmware. And like a reboot
> + you can start any kernel with it, not just Linux.
> +
> config XEN_DOM0
> def_bool y
> depends on XEN
> diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> index ffc9e385e..7cc7f56 100644
> --- a/arch/arm64/kernel/cpu-reset.S
> +++ b/arch/arm64/kernel/cpu-reset.S
> @@ -3,7 +3,7 @@
> *
> * Copyright (C) 2001 Deep Blue Solutions Ltd.
> * Copyright (C) 2012 ARM Ltd.
> - * Copyright (C) 2015 Huawei Futurewei Technologies.
> + * Copyright (C) Huawei Futurewei Technologies.
Move this hunk into the patch that adds the file?
> *
> * This program is free software; you can redistribute it and/or modify
> * it under the terms of the GNU General Public License version 2 as
> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> new file mode 100644
> index 0000000..7b07a16
> --- /dev/null
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -0,0 +1,163 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kexec.h>
> +
> +#include <asm/assembler.h>
> +#include <asm/kexec.h>
> +#include <asm/memory.h>
> +#include <asm/page.h>
> +
> +
> +/*
> + * arm64_relocate_new_kernel - Put a 2nd stage kernel image in place and boot it.
> + *
> + * The memory that the old kernel occupies may be overwritten when coping the
> + * new image to its final location. To assure that the
> + * arm64_relocate_new_kernel routine which does that copy is not overwritten,
> + * all code and data needed by arm64_relocate_new_kernel must be between the
> + * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end. The
> + * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> + * control_code_page, a special page which has been set up to be preserved
> + * during the copy operation.
> + */
> +.globl arm64_relocate_new_kernel
> +arm64_relocate_new_kernel:
> +
> + /* Setup the list loop variables. */
> + ldr x18, .Lkimage_head /* x18 = list entry */
> + dcache_line_size x17, x0 /* x17 = dcache line size */
> + mov x16, xzr /* x16 = segment start */
> + mov x15, xzr /* x15 = entry ptr */
> + mov x14, xzr /* x14 = copy dest */
> +
> + /* Check if the new image needs relocation. */
> + cbz x18, .Ldone
> + tbnz x18, IND_DONE_BIT, .Ldone
> +
> +.Lloop:
> + and x13, x18, PAGE_MASK /* x13 = addr */
> +
> + /* Test the entry flags. */
> +.Ltest_source:
> + tbz x18, IND_SOURCE_BIT, .Ltest_indirection
> +
> + mov x20, x14 /* x20 = copy dest */
> + mov x21, x13 /* x21 = copy src */
> +
> + /* Invalidate dest page to PoC. */
> + mov x0, x20
> + add x19, x0, #PAGE_SIZE
> + sub x1, x17, #1
> + bic x0, x0, x1
> +1: dc ivac, x0
> + add x0, x0, x17
> + cmp x0, x19
> + b.lo 1b
> + dsb sy
If I've followed all this through properly:
With KVM - the mmu+caches are configured, but then disabled by 'kvm: allows
kvm cpu hotplug'. This 'arm64_relocate_new_kernel' function then runs at EL2
with M=0, C=0, I=0.
Without KVM - when there is no user of EL2, the mmu+caches are left in
whatever state the bootloader (or EFI stub) left them in. From
Documentation/arm64/booting.txt:
> Instruction cache may be on or off.
and
> System caches which respect the architected cache maintenance by VA
> operations must be configured and may be enabled.
So the 'arm64_relocate_new_kernel' function could run at EL2 with M=0, C=?,
I=?.
I think this means you can't guarantee that anything you copy below
actually makes it through the caches to memory, so secondary processors may
see stale values when they boot.
The EFI stub disables the M and C bits when booted at EL2 with UEFI, but
it leaves the instruction cache enabled. You only clean the
reboot_code_buffer from the data cache, so there may be stale values in the
instruction cache.
I think you need to disable the i-cache at EL1. If you jump to EL2, I think
you need to disable the I/C bits there too - as you can't rely on the code
in 'kvm: allows kvm cpu hotplug' to do this in the non-kvm case.
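To make the bits concrete, here is a rough, untested sketch of the value
you would want to write back to SCTLR_EL1 (or SCTLR_EL2). The bit positions
are the architected ones (M = bit 0, C = bit 2, I = bit 12). The macro and
function names are only illustrative and are not taken from this series, and
the actual mrs/msr/isb sequence around it is left out:

```c
#include <stdint.h>

/* Architected SCTLR_ELx control bits (names are illustrative). */
#define SCTLR_ELx_M (UINT64_C(1) << 0)   /* stage 1 MMU enable */
#define SCTLR_ELx_C (UINT64_C(1) << 2)   /* data/unified cache enable */
#define SCTLR_ELx_I (UINT64_C(1) << 12)  /* instruction cache enable */

/*
 * Hypothetical helper: given the current SCTLR_ELx value, return the
 * value to write back so that the MMU, D-cache and I-cache are all off.
 * All other bits are preserved.
 */
static inline uint64_t sctlr_mmu_caches_off(uint64_t sctlr)
{
	return sctlr & ~(SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I);
}
```

In the real code this would sit between an mrs and an msr of the relevant
SCTLR, followed by an isb.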
> +
> + /* Copy page. */
> +1: ldp x22, x23, [x21]
> + ldp x24, x25, [x21, #16]
> + ldp x26, x27, [x21, #32]
> + ldp x28, x29, [x21, #48]
> + add x21, x21, #64
> + stnp x22, x23, [x20]
> + stnp x24, x25, [x20, #16]
> + stnp x26, x27, [x20, #32]
> + stnp x28, x29, [x20, #48]
> + add x20, x20, #64
> + tst x21, #(PAGE_SIZE - 1)
> + b.ne 1b
> +
> + /* dest += PAGE_SIZE */
> + add x14, x14, PAGE_SIZE
> + b .Lnext
> +
> +.Ltest_indirection:
> + tbz x18, IND_INDIRECTION_BIT, .Ltest_destination
> +
> + /* ptr = addr */
> + mov x15, x13
> + b .Lnext
> +
> +.Ltest_destination:
> + tbz x18, IND_DESTINATION_BIT, .Lnext
> +
> + mov x16, x13
> +
> + /* dest = addr */
> + mov x14, x13
> +
> +.Lnext:
> + /* entry = *ptr++ */
> + ldr x18, [x15], #8
> +
> + /* while (!(entry & DONE)) */
> + tbz x18, IND_DONE_BIT, .Lloop
> +
> +.Ldone:
> + dsb sy
> + isb
> + ic ialluis
> + dsb sy
Why the second dsb?
> + isb
> +
> + /* Start new image. */
> + ldr x4, .Lkimage_start
> + mov x0, xzr
> + mov x1, xzr
> + mov x2, xzr
> + mov x3, xzr
Once the kexec'd kernel is booting, I get:
> WARNING: x1-x3 nonzero in violation of boot protocol:
> x1: 0000000080008000
> x2: 0000000000000020
> x3: 0000000000000020
> This indicates a broken bootloader or old kernel
Presumably this 'kimage_start' isn't pointing to the new kernel, but to the
purgatory code (which comes from user-space?). If so, what are these xzr-s
for?
> + br x4
> +
> +.align 3 /* To keep the 64-bit values below naturally aligned. */
> +
> +/* The machine_kexec routine sets these variables via offsets from
> + * arm64_relocate_new_kernel.
> + */
> +
> +/*
> + * .Lkimage_start - Copy of image->start, the entry point of the new
> + * image.
> + */
> +.Lkimage_start:
> + .quad 0x0
> +
> +/*
> + * .Lkimage_head - Copy of image->head, the list of kimage entries.
> + */
> +.Lkimage_head:
> + .quad 0x0
> +
I assume these .quad-s are used because you can't pass the values in via
registers - due to the complicated soft_restart(). Given you are the only
user, couldn't you simplify it to do all the disabling in
arm64_relocate_new_kernel?
> +.Lcopy_end:
> +.org KEXEC_CONTROL_PAGE_SIZE
> +
> +/*
> + * arm64_relocate_new_kernel_size - Number of bytes to copy to the control_code_page.
> + */
> +.globl arm64_relocate_new_kernel_size
> +arm64_relocate_new_kernel_size:
> + .quad .Lcopy_end - arm64_relocate_new_kernel
> +
> +/*
> + * arm64_kexec_kimage_start_offset - Offset for writing .Lkimage_start.
> + */
> +.globl arm64_kexec_kimage_start_offset
> +arm64_kexec_kimage_start_offset:
> + .quad .Lkimage_start - arm64_relocate_new_kernel
> +
> +/*
> + * arm64_kexec_kimage_head_offset - Offset for writing .Lkimage_head.
> + */
> +.globl arm64_kexec_kimage_head_offset
> +arm64_kexec_kimage_head_offset:
> + .quad .Lkimage_head - arm64_relocate_new_kernel
From 'kexec -e' to the first messages from the new kernel takes ~1 minute
on Juno. Did you see a similar delay? Or should I go looking for what I've
configured wrong!?
(Copying code with the mmu+caches on, then cleaning to PoC was noticeably
faster for hibernate)
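For reference, the by-VA maintenance in either approach is the same
address walk as the 'dc ivac' loop in the patch: round the start down to a
cache line boundary, then step one line at a time until the end of the
range. Below is a small C model of just that arithmetic. It is a sketch,
not kernel code: the function name and the callback (standing in for the
'dc' instruction) are made up for illustration, and line_size is assumed
to be a power of two:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one cache maintenance operation on a single line. */
typedef void (*line_op_t)(uintptr_t line_addr, void *ctx);

/*
 * Walk [start, start + len) one cache line at a time, calling op (if
 * non-NULL) with each line-aligned address. Returns the number of lines
 * touched. line_size must be a power of two.
 */
static size_t for_each_cache_line(uintptr_t start, size_t len,
				  size_t line_size, line_op_t op, void *ctx)
{
	uintptr_t end = start + len;
	/* Round down to a line boundary, like 'bic x0, x0, x1' in the patch. */
	uintptr_t addr = start & ~((uintptr_t)line_size - 1);
	size_t lines = 0;

	for (; addr < end; addr += line_size) {
		if (op)
			op(addr, ctx);
		lines++;
	}
	return lines;
}
```

With a 64-byte line that is 64 operations per 4K page and 1024 per 64K page,
whichever order the copy and the maintenance are done in.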
I've used this series for kexec-ing between 4K and 64K page_size kernels on
Juno.
Tested-By: James Morse <james.morse@arm.com>
Thanks!
James