* [patch v3 0/8] kdump: Patch series for s390 support (version 3)
@ 2011-08-12 13:48 Michael Holzheu
2011-08-12 13:48 ` [patch v3 1/8] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT Michael Holzheu
` (7 more replies)
0 siblings, 8 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
Hello Vivek,
I updated the patch series according to our last discussions. As you requested,
I removed the "#if !defined(CONFIG_S390)" in the panic function.
The semantics are now as follows:
If kdump is loaded, kdump is always triggered for panic and PSW restart
(s390 NMI). If kdump is not loaded, the s390 shutdown actions defined
under /sys/firmware are executed, as is already done today.
I also re-added the patches for stand-alone dump integration:
For s390 we add a parameter to the purgatory entry point. When "0" is passed,
purgatory only returns the result of the checksum test. When "1" is passed,
purgatory triggers kdump. So we call purgatory twice: first for checking and
then for execution. One could argue that it would be better to call purgatory
only once and have it return only if the checksums are invalid.
Unfortunately this would be very hard for us to implement, because we
switch to the boot CPU before kdump is finally triggered, and after that
it is currently not possible to return from the called function.
panic --------+
              +--- crash_kexec()
              |    call_s390_shutdown_actions() -> stand-alone dump
PSW restart --+

crash_kexec +--> kdump loaded? --> machine_kexec()
            |
            +--> kdump not loaded --> return

machine_kexec +-> purgatory(0)==0 -> switch to IPL cpu -> purgatory(1) -> kdump
              |
              +-> purgatory(0)!=0 -> return
See patches [6] and [8] for details.
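
A minimal user-space sketch of the two-phase purgatory call described above
may help to picture the control flow. All names here (crash_kexec_sketch,
purgatory_fn, the stub functions and the enum) are hypothetical and only
model the flow; they are not the code from patches [6] and [8].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the purgatory entry point:
 *   mode 0: only run the checksum test, return 0 if the checksums are valid
 *   mode 1: start kdump (the real purgatory would not return)
 */
typedef int (*purgatory_fn)(int mode);

enum crash_result {
	CRASH_NOT_LOADED,	/* no kdump loaded: caller runs shutdown actions */
	CRASH_CHECKSUM_BAD,	/* checksum test failed: caller falls back */
	CRASH_KDUMP_STARTED	/* kdump was triggered */
};

/* Models crash_kexec() -> machine_kexec() from the diagram */
static enum crash_result crash_kexec_sketch(purgatory_fn purgatory)
{
	if (purgatory == NULL)
		return CRASH_NOT_LOADED;
	/* First call: check only, we can still return to the caller */
	if (purgatory(0) != 0)
		return CRASH_CHECKSUM_BAD;
	/* The real code switches to the IPL CPU at this point */
	purgatory(1);
	return CRASH_KDUMP_STARTED;
}

/* Stub purgatory with valid checksums (assumption for illustration) */
static int purgatory_ok(int mode)
{
	assert(mode == 0 || mode == 1);
	return 0;
}

/* Stub purgatory whose checksum test fails (assumption for illustration) */
static int purgatory_corrupt(int mode)
{
	assert(mode == 0 || mode == 1);
	return mode == 0 ? 1 : 0;
}
```

The point of the split is that the first call is made while returning to the
shutdown actions is still possible.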
Does that sound OK to you?
Michael
Patch overview:
---------------
[1-3] common code changes (could you please ACK patches [1] and [3]?)
[4] s390 kdump preparation patch
[5] s390 kdump backend
[6] s390 stand-alone dump/shutdown actions integration
[7] kexec-tools: s390 kdump backend for kexec-tools
[8] kexec-tools: s390 stand-alone dump/shutdown actions integration
History: v1->v2:
----------------
Main changes compared to version 1:
1. We use purgatory code
2. We use pre-allocated ELF core header
3. Registers are saved in old kernel
4. Removed meminfo
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [patch v3 1/8] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak Michael Holzheu
` (6 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-common-control-limit.patch --]
[-- Type: text/plain, Size: 1404 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
On s390 there is a different KEXEC_CONTROL_MEMORY_LIMIT for the normal and
the kdump kexec case. Therefore this patch introduces a new macro
KEXEC_CRASH_CONTROL_MEMORY_LIMIT. This is set to
KEXEC_CONTROL_MEMORY_LIMIT for all architectures that do not define
KEXEC_CRASH_CONTROL_MEMORY_LIMIT.
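
The fallback can be sketched in isolation as follows. This is a stand-alone
illustration, not the kernel header: the limit value is simply the s390
non-crash limit taken as an example, and crash_hole_allowed() is a
hypothetical helper that mirrors the comparison in
kimage_alloc_crash_control_pages().

```c
#include <assert.h>

/* Example value only: the limit for the normal kexec case */
#define KEXEC_CONTROL_MEMORY_LIMIT (1UL << 31)

/*
 * An architecture that needs a different limit for the crash case would
 * define KEXEC_CRASH_CONTROL_MEMORY_LIMIT before this point. All others
 * fall back to the normal limit.
 */
#ifndef KEXEC_CRASH_CONTROL_MEMORY_LIMIT
#define KEXEC_CRASH_CONTROL_MEMORY_LIMIT KEXEC_CONTROL_MEMORY_LIMIT
#endif

/* Hypothetical helper: may a control page hole ending here be used? */
static int crash_hole_allowed(unsigned long hole_start, unsigned long hole_end)
{
	assert(hole_start <= hole_end);
	return hole_end <= KEXEC_CRASH_CONTROL_MEMORY_LIMIT;
}
```

With no architecture override, the crash limit equals the normal limit, so
the behavior of existing architectures does not change.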
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
include/linux/kexec.h | 4 ++++
kernel/kexec.c | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -33,6 +33,10 @@
#error KEXEC_ARCH not defined
#endif
+#ifndef KEXEC_CRASH_CONTROL_MEMORY_LIMIT
+#define KEXEC_CRASH_CONTROL_MEMORY_LIMIT KEXEC_CONTROL_MEMORY_LIMIT
+#endif
+
#define KEXEC_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4)
#define KEXEC_CORE_NOTE_NAME "CORE"
#define KEXEC_CORE_NOTE_NAME_BYTES ALIGN(sizeof(KEXEC_CORE_NOTE_NAME), 4)
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -498,7 +498,7 @@ static struct page *kimage_alloc_crash_c
while (hole_end <= crashk_res.end) {
unsigned long i;
- if (hole_end > KEXEC_CONTROL_MEMORY_LIMIT)
+ if (hole_end > KEXEC_CRASH_CONTROL_MEMORY_LIMIT)
break;
if (hole_end > crashk_res.end)
break;
* [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
2011-08-12 13:48 ` [patch v3 1/8] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-18 17:15 ` Vivek Goyal
2011-08-12 13:48 ` [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter Michael Holzheu
` (5 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-common-load_crash_segment.patch --]
[-- Type: text/plain, Size: 1760 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
On s390 we do not create page tables at all for the crashkernel memory.
This requires an s390-specific version of kimage_load_crash_segment().
Therefore this patch declares this function as "__weak". The s390 version is
very simple. It just copies the kexec segment to real memory without using
page tables:
int kimage_load_crash_segment(struct kimage *image,
struct kexec_segment *segment)
{
return copy_from_user_real((void *) segment->mem, segment->buf,
segment->bufsz);
}
There are two main advantages of not creating page tables for the
crashkernel memory:
a) It saves memory. We have scenarios in mind where crashkernel
memory can be very large and saving page table space is important.
b) We protect the crashkernel memory from being overwritten.
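
For readers not familiar with weak symbols, here is a small user-space sketch
of the mechanism, assuming a GCC or Clang toolchain. The names
(segment_sketch, load_crash_segment_sketch) are hypothetical. Only the weak
default is shown, because the overriding non-weak definition has to live in
a different object file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct kexec_segment */
struct segment_sketch {
	const void *buf;	/* source buffer */
	size_t bufsz;		/* number of bytes to copy */
	void *mem;		/* destination */
};

/*
 * Generic default. If another object file provides a non-weak function
 * with the same name, the linker uses that one instead, which is how an
 * architecture could replace the generic loader.
 */
__attribute__((weak)) int load_crash_segment_sketch(struct segment_sketch *seg)
{
	assert(seg != NULL);
	memcpy(seg->mem, seg->buf, seg->bufsz);
	return 0;
}
```

Callers do not need to know which version is linked in; they always call the
same symbol.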
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
kernel/kexec.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -842,8 +842,13 @@ out:
return result;
}
-static int kimage_load_crash_segment(struct kimage *image,
- struct kexec_segment *segment)
+/*
+ * Load crash segment into memory. Architecture code can override this
+ * function. E.g. this is necessary for architectures that do not
+ * create page tables for crashkernel memory.
+ */
+int __weak kimage_load_crash_segment(struct kimage *image,
+ struct kexec_segment *segment)
{
/* For crash dumps kernels we simply copy the data from
* user space to it's destination.
* [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
2011-08-12 13:48 ` [patch v3 1/8] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT Michael Holzheu
2011-08-12 13:48 ` [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-17 21:05 ` Vivek Goyal
2011-08-12 13:48 ` [patch v3 4/8] s390: Add real memory access functions Michael Holzheu
` (4 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-common-elfcorehdr-parm.patch --]
[-- Type: text/plain, Size: 3011 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Currently only the address of the pre-allocated ELF header is passed with
the elfcorehdr= kernel parameter. In order to reserve memory for the header
in the 2nd kernel, the size is also required. Current kdump architecture
backends use different methods to do that, e.g. x86 uses the memmap= kernel
parameter. On s390 there is no easy way to transfer this information.
Therefore the elfcorehdr kernel parameter is extended to also pass the size.
This can now also be used as a standard mechanism by all future kdump
architecture backends.
The syntax of the kernel parameter is extended as follows:
elfcorehdr=[size[KMG]@]offset[KMG]
This change is backward compatible because elfcorehdr=offset is still allowed.
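
The parsing rule can be tried out in user space with the sketch below. It is
an approximation under stated assumptions: memparse_sketch() only imitates
the K/M/G handling of the kernel's memparse(), and parse_elfcorehdr() and
struct elfcorehdr_sketch are hypothetical names that mirror the logic of
setup_elfcorehdr() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified imitation of memparse(): number plus optional K/M/G suffix */
static unsigned long long memparse_sketch(const char *ptr, char **retptr)
{
	char *end;
	unsigned long long ret = strtoull(ptr, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		ret <<= 10;	/* fall through */
	case 'M': case 'm':
		ret <<= 10;	/* fall through */
	case 'K': case 'k':
		ret <<= 10;
		end++;
		break;
	default:
		break;
	}
	if (retptr)
		*retptr = end;
	return ret;
}

struct elfcorehdr_sketch {
	unsigned long long addr;
	unsigned long long size;	/* stays 0 if no size was given */
};

/* Syntax: [size[KMG]@]offset[KMG]; returns 0 on success, -1 on error */
static int parse_elfcorehdr(const char *arg, struct elfcorehdr_sketch *out)
{
	char *end;

	assert(out != NULL);
	if (!arg)
		return -1;
	out->size = 0;
	out->addr = memparse_sketch(arg, &end);
	if (*end == '@') {
		/* The first number was the size, the offset follows */
		out->size = out->addr;
		out->addr = memparse_sketch(end + 1, &end);
	}
	return end > arg ? 0 : -1;
}
```

A plain offset without "@" leaves the size at zero, which corresponds to the
old syntax.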
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
Documentation/kernel-parameters.txt | 6 +++---
include/linux/crash_dump.h | 1 +
kernel/crash_dump.c | 11 +++++++++++
3 files changed, 15 insertions(+), 3 deletions(-)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -725,10 +725,10 @@ bytes respectively. Such letter suffixes
See Documentation/block/as-iosched.txt and
Documentation/block/deadline-iosched.txt for details.
- elfcorehdr= [IA64,PPC,SH,X86]
+ elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
Specifies physical address of start of kernel core
- image elf header. Generally kexec loader will
- pass this option to capture kernel.
+ image elf header and optionally the size. Generally
+ kexec loader will pass this option to capture kernel.
See Documentation/kdump/kdump.txt for details.
enable_mtrr_cleanup [X86]
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -10,6 +10,7 @@
#define ELFCORE_ADDR_ERR (-2ULL)
extern unsigned long long elfcorehdr_addr;
+extern unsigned long long elfcorehdr_size;
extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
unsigned long, int);
--- a/kernel/crash_dump.c
+++ b/kernel/crash_dump.c
@@ -20,8 +20,15 @@ unsigned long saved_max_pfn;
unsigned long long elfcorehdr_addr = ELFCORE_ADDR_MAX;
/*
+ * stores the size of elf header of crash image
+ */
+unsigned long long elfcorehdr_size;
+
+/*
* elfcorehdr= specifies the location of elf core header stored by the crashed
* kernel. This option will be passed by kexec loader to the capture kernel.
+ *
+ * Syntax: elfcorehdr=[size[KMG]@]offset[KMG]
*/
static int __init setup_elfcorehdr(char *arg)
{
@@ -29,6 +36,10 @@ static int __init setup_elfcorehdr(char
if (!arg)
return -EINVAL;
elfcorehdr_addr = memparse(arg, &end);
+ if (*end == '@') {
+ elfcorehdr_size = elfcorehdr_addr;
+ elfcorehdr_addr = memparse(end + 1, &end);
+ }
return end > arg ? 0 : -EINVAL;
}
early_param("elfcorehdr", setup_elfcorehdr);
* [patch v3 4/8] s390: Add real memory access functions
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
` (2 preceding siblings ...)
2011-08-12 13:48 ` [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 5/8] s390: kdump backend code Michael Holzheu
` (3 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-arch-maccess.patch --]
[-- Type: text/plain, Size: 3744 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Add access functions for real memory needed by the s390 kdump backend.
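
Both new functions copy through a page-sized bounce buffer, because a single
copy primitive cannot handle real and user addresses at the same time. The
user-space sketch below models only that chunking pattern. The names
(copy_via_bounce, copy_fn, SKETCH_PAGE_SIZE) are hypothetical, and plain
memcpy() stands in for memcpy_real(), copy_to_user() and copy_from_user().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Copy primitive: returns 0 on success, non-zero on fault */
typedef int (*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Copy count bytes from src to dest in page-sized chunks: read_src fills
 * the bounce buffer, write_dst drains it. Returns 0 on success, -1 if a
 * copy primitive fails, -2 if the bounce buffer cannot be allocated.
 */
static int copy_via_bounce(void *dest, const void *src, size_t count,
			   copy_fn read_src, copy_fn write_dst)
{
	size_t offs = 0;
	int rc = -1;
	char *buf;

	assert(read_src != NULL && write_dst != NULL);
	buf = malloc(SKETCH_PAGE_SIZE);
	if (!buf)
		return -2;
	while (offs < count) {
		size_t size = count - offs;

		if (size > SKETCH_PAGE_SIZE)
			size = SKETCH_PAGE_SIZE;
		if (read_src(buf, (const char *) src + offs, size))
			goto out;
		if (write_dst((char *) dest + offs, buf, size))
			goto out;
		offs += size;
	}
	rc = 0;
out:
	free(buf);	/* single exit path, so the buffer is always released */
	return rc;
}

/* Stand-in for a copy primitive that always succeeds */
static int plain_copy(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Stand-in for a copy primitive that always faults */
static int failing_copy(void *dst, const void *src, size_t n)
{
	(void) dst;
	(void) src;
	(void) n;
	return -1;
}
```

Swapping the two primitives gives the opposite copy direction, which is why
copy_to_user_real() and copy_from_user_real() look almost identical.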
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
arch/s390/include/asm/system.h | 2 +
arch/s390/mm/maccess.c | 57 +++++++++++++++++++++++++++++++++++++++++
drivers/s390/char/zcore.c | 20 +-------------
3 files changed, 61 insertions(+), 18 deletions(-)
--- a/arch/s390/include/asm/system.h
+++ b/arch/s390/include/asm/system.h
@@ -114,6 +114,8 @@ extern void pfault_fini(void);
extern void cmma_init(void);
extern int memcpy_real(void *, void *, size_t);
extern void copy_to_absolute_zero(void *dest, void *src, size_t count);
+extern int copy_to_user_real(void __user *dest, void *src, size_t count);
+extern int copy_from_user_real(void *dest, void __user *src, size_t count);
#define finish_arch_switch(prev) do { \
set_fs(current->thread.mm_segment); \
--- a/arch/s390/mm/maccess.c
+++ b/arch/s390/mm/maccess.c
@@ -11,6 +11,7 @@
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/errno.h>
+#include <linux/gfp.h>
#include <asm/system.h>
/*
@@ -60,6 +61,9 @@ long probe_kernel_write(void *dst, const
return copied < 0 ? -EFAULT : 0;
}
+/*
+ * Copy memory in real mode (kernel to kernel)
+ */
int memcpy_real(void *dest, void *src, size_t count)
{
register unsigned long _dest asm("2") = (unsigned long) dest;
@@ -101,3 +105,56 @@ void copy_to_absolute_zero(void *dest, v
__ctl_load(cr0, 0, 0);
preempt_enable();
}
+
+/*
+ * Copy memory from kernel (real) to user (virtual)
+ */
+int copy_to_user_real(void __user *dest, void *src, size_t count)
+{
+ int offs = 0, size, rc;
+ char *buf;
+
+ buf = (char *) __get_free_page(GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+ rc = -EFAULT;
+ while (offs < count) {
+ size = min(PAGE_SIZE, count - offs);
+ if (memcpy_real(buf, src + offs, size))
+ goto out;
+ if (copy_to_user(dest + offs, buf, size))
+ goto out;
+ offs += size;
+ }
+ rc = 0;
+out:
+ free_page((unsigned long) buf);
+ return rc;
+}
+
+/*
+ * Copy memory from user (virtual) to kernel (real)
+ */
+int copy_from_user_real(void *dest, void __user *src, size_t count)
+{
+ int offs = 0, size, rc;
+ char *buf;
+
+ buf = (char *) __get_free_page(GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+ rc = -EFAULT;
+ while (offs < count) {
+ size = min(PAGE_SIZE, count - offs);
+ if (copy_from_user(buf, src + offs, size))
+ goto out;
+ if (memcpy_real(dest + offs, buf, size))
+ goto out;
+ offs += size;
+ }
+ rc = 0;
+out:
+ free_page((unsigned long) buf);
+ return rc;
+}
+
--- a/drivers/s390/char/zcore.c
+++ b/drivers/s390/char/zcore.c
@@ -142,22 +142,6 @@ static int memcpy_hsa_kernel(void *dest,
return memcpy_hsa(dest, src, count, TO_KERNEL);
}
-static int memcpy_real_user(void __user *dest, unsigned long src, size_t count)
-{
- static char buf[4096];
- int offs = 0, size;
-
- while (offs < count) {
- size = min(sizeof(buf), count - offs);
- if (memcpy_real(buf, (void *) src + offs, size))
- return -EFAULT;
- if (copy_to_user(dest + offs, buf, size))
- return -EFAULT;
- offs += size;
- }
- return 0;
-}
-
static int __init init_cpu_info(enum arch_id arch)
{
struct save_area *sa;
@@ -346,8 +330,8 @@ static ssize_t zcore_read(struct file *f
/* Copy from real mem */
size = count - mem_offs - hdr_count;
- rc = memcpy_real_user(buf + hdr_count + mem_offs, mem_start + mem_offs,
- size);
+ rc = copy_to_user_real(buf + hdr_count + mem_offs,
+ (void *) mem_start + mem_offs, size);
if (rc)
goto fail;
* [patch v3 5/8] s390: kdump backend code
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
` (3 preceding siblings ...)
2011-08-12 13:48 ` [patch v3 4/8] s390: Add real memory access functions Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 6/8] s390: Do first kdump checksum test before really starting kdump Michael Holzheu
` (2 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-arch-backend.patch --]
[-- Type: text/plain, Size: 26000 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
This patch provides the architecture specific part of the s390 kdump
support.
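
One part of the backend that is easy to get wrong is the address swap in
copy_oldmem_page(): the reserved region and the lowest memory have exchanged
places, so a read of "oldmem" has to be redirected. The sketch below shows
the mapping as a pure function. swap_oldmem_addr() is a hypothetical name,
and the sketch treats the reserved range as half-open [base, base + size),
which is an assumption based on the function comment rather than a copy of
the comparison used in the patch.

```c
#include <assert.h>

/*
 * Map an address of the crashed kernel to the place where its content is
 * now stored:
 *   [0, size)            -> [base, base + size)
 *   [base, base + size)  -> [0, size)
 *   everything else      -> unchanged
 */
static unsigned long swap_oldmem_addr(unsigned long src,
				      unsigned long oldmem_base,
				      unsigned long oldmem_size)
{
	/* The two ranges must not overlap for the swap to be well defined */
	assert(oldmem_base >= oldmem_size);

	if (src < oldmem_size)
		return src + oldmem_base;
	if (src >= oldmem_base && src < oldmem_base + oldmem_size)
		return src - oldmem_base;
	return src;
}
```

Applying the function twice returns the original address, which is a quick
sanity check for the mapping.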
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
arch/s390/Kconfig | 10 +
arch/s390/include/asm/ipl.h | 1
arch/s390/include/asm/kexec.h | 3
arch/s390/include/asm/setup.h | 9 +
arch/s390/kernel/Makefile | 1
arch/s390/kernel/crash_dump.c | 39 +++++
arch/s390/kernel/head.S | 16 ++
arch/s390/kernel/head_kdump.S | 116 +++++++++++++++++
arch/s390/kernel/ipl.c | 4
arch/s390/kernel/machine_kexec.c | 255 +++++++++++++++++++++++++++++++++++++++
arch/s390/kernel/mem_detect.c | 69 ++++++++++
arch/s390/kernel/setup.c | 190 +++++++++++++++++++++++++++++
arch/s390/mm/vmem.c | 3
13 files changed, 716 insertions(+)
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -568,6 +568,16 @@ config KEXEC
current kernel, and to start another kernel. It is like a reboot
but is independent of hardware/microcode support.
+config CRASH_DUMP
+ bool "kernel crash dumps"
+ depends on 64BIT
+ help
+ Generate crash dump after being started by kexec.
+ Crash dump kernels are loaded in the main kernel with kexec-tools
+ into a specially reserved region and then later executed after
+ a crash by kdump/kexec.
+ For more details see Documentation/kdump/kdump.txt
+
config ZFCPDUMP
def_bool n
prompt "zfcpdump support"
--- a/arch/s390/include/asm/ipl.h
+++ b/arch/s390/include/asm/ipl.h
@@ -168,5 +168,6 @@ enum diag308_rc {
extern int diag308(unsigned long subcode, void *addr);
extern void diag308_reset(void);
+extern void store_status(void);
#endif /* _ASM_S390_IPL_H */
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -30,6 +30,9 @@
/* Not more than 2GB */
#define KEXEC_CONTROL_MEMORY_LIMIT (1UL<<31)
+/* Maximum address we can use for the crash control pages */
+#define KEXEC_CRASH_CONTROL_MEMORY_LIMIT (-1UL)
+
/* Allocate one page for the pdp and the second for the code */
#define KEXEC_CONTROL_PAGE_SIZE 4096
--- a/arch/s390/include/asm/setup.h
+++ b/arch/s390/include/asm/setup.h
@@ -30,11 +30,15 @@
#define IPL_DEVICE (*(unsigned long *) (0x10400))
#define INITRD_START (*(unsigned long *) (0x10408))
#define INITRD_SIZE (*(unsigned long *) (0x10410))
+#define OLDMEM_BASE (*(unsigned long *) (0x10418))
+#define OLDMEM_SIZE (*(unsigned long *) (0x10420))
#endif /* __s390x__ */
#define COMMAND_LINE ((char *) (0x10480))
#define CHUNK_READ_WRITE 0
#define CHUNK_READ_ONLY 1
+#define CHUNK_OLDMEM 4
+#define CHUNK_CRASHK 5
struct mem_chunk {
unsigned long addr;
@@ -48,6 +52,8 @@ extern int memory_end_set;
extern unsigned long memory_end;
void detect_memory_layout(struct mem_chunk chunk[]);
+void create_mem_hole(struct mem_chunk memory_chunk[], unsigned long addr,
+ unsigned long size, int type);
#define PRIMARY_SPACE_MODE 0
#define ACCESS_REGISTER_MODE 1
@@ -106,6 +112,7 @@ extern unsigned int user_mode;
#endif /* __s390x__ */
#define ZFCPDUMP_HSA_SIZE (32UL<<20)
+#define ZFCPDUMP_HSA_SIZE_MAX (64UL<<20)
/*
* Console mode. Override with conmode=
@@ -138,6 +145,8 @@ extern char kernel_nss_name[];
#define IPL_DEVICE 0x10400
#define INITRD_START 0x10408
#define INITRD_SIZE 0x10410
+#define OLDMEM_BASE 0x10418
+#define OLDMEM_SIZE 0x10420
#endif /* __s390x__ */
#define COMMAND_LINE 0x10480
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += $(if $(
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
+obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
# Kexec part
S390_KEXEC_OBJS := machine_kexec.o crash.o
--- /dev/null
+++ b/arch/s390/kernel/crash_dump.c
@@ -0,0 +1,39 @@
+/*
+ * S390 kdump implementation
+ *
+ * Copyright IBM Corp. 2011
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+#include <linux/crash_dump.h>
+#include <asm/lowcore.h>
+
+/*
+ * Copy one page from "oldmem"
+ *
+ * For the kdump reserved memory this function performs a swap operation:
+ * - [OLDMEM_BASE - OLDMEM_BASE + OLDMEM_SIZE] is mapped to [0 - OLDMEM_SIZE].
+ * - [0 - OLDMEM_SIZE] is mapped to [OLDMEM_BASE - OLDMEM_BASE + OLDMEM_SIZE]
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ size_t csize, unsigned long offset, int userbuf)
+{
+ unsigned long src;
+ int rc;
+
+ if (!csize)
+ return 0;
+
+ src = (pfn << PAGE_SHIFT) + offset;
+ if (src < OLDMEM_SIZE)
+ src += OLDMEM_BASE;
+ else if (src > OLDMEM_BASE &&
+ src < OLDMEM_BASE + OLDMEM_SIZE)
+ src -= OLDMEM_BASE;
+ if (userbuf)
+ rc = copy_to_user_real((void __user *) buf, (void *) src,
+ csize);
+ else
+ rc = memcpy_real(buf, (void *) src, csize);
+ return rc < 0 ? rc : csize;
+}
--- a/arch/s390/kernel/head.S
+++ b/arch/s390/kernel/head.S
@@ -449,10 +449,22 @@ ENTRY(start)
#
.org 0x10000
ENTRY(startup)
+ j .Lep_startup_normal
+ .org 0x10008
+ .ascii "S390EP"
+ .byte 0x00,0x01
+#
+# kdump startup-code at 0x10010, running in 64 bit absolute addressing mode
+#
+ .org 0x10010
+ENTRY(startup_kdump)
+ j .Lep_startup_kdump
+.Lep_startup_normal:
basr %r13,0 # get base
.LPG0:
xc 0x200(256),0x200 # partially clear lowcore
xc 0x300(256),0x300
+ xc 0xe00(256),0xe00
stck __LC_LAST_UPDATE_CLOCK
spt 5f-.LPG0(%r13)
mvc __LC_LAST_UPDATE_TIMER(8),5f-.LPG0(%r13)
@@ -534,6 +546,8 @@ ENTRY(startup)
.align 8
5: .long 0x7fffffff,0xffffffff
+#include "head_kdump.S"
+
#
# params at 10400 (setup.h)
#
@@ -541,6 +555,8 @@ ENTRY(startup)
.long 0,0 # IPL_DEVICE
.long 0,0 # INITRD_START
.long 0,0 # INITRD_SIZE
+ .long 0,0 # OLDMEM_BASE
+ .long 0,0 # OLDMEM_SIZE
.org COMMAND_LINE
.byte "root=/dev/ram0 ro"
--- /dev/null
+++ b/arch/s390/kernel/head_kdump.S
@@ -0,0 +1,116 @@
+/*
+ * S390 kdump lowlevel functions (new kernel)
+ *
+ * Copyright IBM Corp. 2011
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+#define DATAMOVER_ADDR 0x4000
+#define COPY_PAGE_ADDR 0x6000
+
+#ifdef CONFIG_CRASH_DUMP
+
+#
+# kdump entry (new kernel - not yet relocated)
+#
+# Note: This code has to be position independent
+#
+
+.align 2
+.Lep_startup_kdump:
+ basr %r13,0
+.Lbase:
+ larl %r2,.Lbase_addr # Check, if we have been
+ lg %r2,0(%r2) # already relocated:
+ clgr %r2,%r13 #
+ jne .Lrelocate # No : Start data mover
+ lghi %r2,0 # Yes: Start kdump kernel
+ brasl %r14,startup_kdump_relocated
+
+.Lrelocate:
+ larl %r4,startup
+ lg %r2,0x418(%r4) # Get kdump base
+ lg %r3,0x420(%r4) # Get kdump size
+
+ larl %r10,.Lcopy_start # Source of data mover
+ lghi %r8,DATAMOVER_ADDR # Target of data mover
+ mvc 0(256,%r8),0(%r10) # Copy data mover code
+
+ agr %r8,%r2 # Copy data mover to
+ mvc 0(256,%r8),0(%r10) # reserved mem
+
+ lghi %r14,DATAMOVER_ADDR # Jump to copied data mover
+ basr %r14,%r14
+.Lbase_addr:
+ .quad .Lbase
+
+#
+# kdump data mover code (runs at address DATAMOVER_ADDR)
+#
+# r2: kdump base address
+# r3: kdump size
+#
+.Lcopy_start:
+ basr %r13,0 # Base
+0:
+ lgr %r11,%r2 # Save kdump base address
+ lgr %r12,%r2
+ agr %r12,%r3 # Compute kdump end address
+
+ lghi %r5,0
+ lghi %r10,COPY_PAGE_ADDR # Load copy page address
+1:
+ mvc 0(256,%r10),0(%r5) # Copy old kernel to tmp
+ mvc 0(256,%r5),0(%r11) # Copy new kernel to old
+ mvc 0(256,%r11),0(%r10) # Copy tmp to new
+ aghi %r11,256
+ aghi %r5,256
+ clgr %r11,%r12
+ jl 1b
+
+ lg %r14,.Lstartup_kdump-0b(%r13)
+ basr %r14,%r14 # Start relocated kernel
+.Lstartup_kdump:
+ .long 0x00000000,0x00000000 + startup_kdump_relocated
+.Lcopy_end:
+
+#
+# Startup of kdump (relocated new kernel)
+#
+.align 2
+startup_kdump_relocated:
+ basr %r13,0
+0:
+ mvc 0(8,%r0),.Lrestart_psw-0b(%r13) # Setup restart PSW
+ mvc 464(16,%r0),.Lpgm_psw-0b(%r13) # Setup pgm check PSW
+ lhi %r1,1 # Start new kernel
+ diag %r1,%r1,0x308 # with diag 308
+
+.Lno_diag308: # No diag 308
+ sam31 # Switch to 31 bit addr mode
+ sr %r1,%r1 # Erase register r1
+ sr %r2,%r2 # Erase register r2
+ sigp %r1,%r2,0x12 # Switch to 31 bit arch mode
+ lpsw 0 # Start new kernel...
+.align 8
+.Lrestart_psw:
+ .long 0x00080000,0x80000000 + startup
+.Lpgm_psw:
+ .quad 0x0000000180000000,0x0000000000000000 + .Lno_diag308
+#else
+.align 2
+.Lep_startup_kdump:
+#ifdef CONFIG_64BIT
+ larl %r13,startup_kdump_crash
+ lpswe 0(%r13)
+.align 8
+startup_kdump_crash:
+ .quad 0x0002000080000000,0x0000000000000000 + startup_kdump_crash
+#else
+ basr %r13,0
+0: lpsw startup_kdump_crash-0b(%r13)
+.align 8
+startup_kdump_crash:
+ .long 0x000a0000,0x00000000 + startup_kdump_crash
+#endif /* CONFIG_64BIT */
+#endif /* CONFIG_CRASH_DUMP */
--- a/arch/s390/kernel/ipl.c
+++ b/arch/s390/kernel/ipl.c
@@ -16,6 +16,7 @@
#include <linux/ctype.h>
#include <linux/fs.h>
#include <linux/gfp.h>
+#include <linux/crash_dump.h>
#include <asm/ipl.h>
#include <asm/smp.h>
#include <asm/setup.h>
@@ -1738,6 +1739,9 @@ static struct kobj_attribute on_restart_
void do_restart(void)
{
smp_send_stop();
+#ifdef CONFIG_CRASH_DUMP
+ crash_kexec(NULL);
+#endif
on_restart_trigger.action->fn(&on_restart_trigger);
stop_run(&on_restart_trigger);
}
--- a/arch/s390/kernel/machine_kexec.c
+++ b/arch/s390/kernel/machine_kexec.c
@@ -21,12 +21,260 @@
#include <asm/smp.h>
#include <asm/reset.h>
#include <asm/ipl.h>
+#include <asm/diag.h>
typedef void (*relocate_kernel_t)(kimage_entry_t *, unsigned long);
extern const unsigned char relocate_kernel[];
extern const unsigned long long relocate_kernel_len;
+#ifdef CONFIG_CRASH_DUMP
+
+#define ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
+#define PTR_ADD(x, y) (((char *) (x)) + ((unsigned long) (y)))
+
+#ifndef NT_FPREGSET
+#define NT_FPREGSET 2
+#endif
+
+/*
+ * fpregset ELF Note
+ */
+struct nt_fpregset_64 {
+ u32 fpc;
+ u32 pad;
+ u64 fprs[16];
+} __packed;
+
+/*
+ * Initialize ELF note
+ */
+static void *nt_init(void *buf, Elf64_Word type, void *desc, int d_len,
+ const char *name)
+{
+ Elf64_Nhdr *note;
+ u64 len;
+
+ note = (Elf64_Nhdr *)buf;
+ note->n_namesz = strlen(name) + 1;
+ note->n_descsz = d_len;
+ note->n_type = type;
+ len = sizeof(Elf64_Nhdr);
+
+ memcpy(buf + len, name, note->n_namesz);
+ len = ROUNDUP(len + note->n_namesz, 4);
+
+ memcpy(buf + len, desc, note->n_descsz);
+ len = ROUNDUP(len + note->n_descsz, 4);
+
+ return PTR_ADD(buf, len);
+}
+
+/*
+ * Initialize prstatus note
+ */
+static void *nt_prstatus(void *ptr, struct save_area *sa)
+{
+ struct elf_prstatus nt_prstatus;
+ static int cpu_nr = 1;
+
+ memset(&nt_prstatus, 0, sizeof(nt_prstatus));
+ memcpy(&nt_prstatus.pr_reg.gprs, sa->gp_regs, sizeof(sa->gp_regs));
+ memcpy(&nt_prstatus.pr_reg.psw, sa->psw, sizeof(sa->psw));
+ memcpy(&nt_prstatus.pr_reg.acrs, sa->acc_regs, sizeof(sa->acc_regs));
+ nt_prstatus.pr_pid = cpu_nr;
+ cpu_nr++;
+
+ return nt_init(ptr, NT_PRSTATUS, &nt_prstatus, sizeof(nt_prstatus),
+ "CORE");
+}
+
+/*
+ * Initialize fpregset (floating point) note
+ */
+static void *nt_fpregset(void *ptr, struct save_area *sa)
+{
+ struct nt_fpregset_64 nt_fpregset;
+
+ memset(&nt_fpregset, 0, sizeof(nt_fpregset));
+ memcpy(&nt_fpregset.fpc, &sa->fp_ctrl_reg, sizeof(sa->fp_ctrl_reg));
+ memcpy(&nt_fpregset.fprs, &sa->fp_regs, sizeof(sa->fp_regs));
+
+ return nt_init(ptr, NT_FPREGSET, &nt_fpregset, sizeof(nt_fpregset),
+ "CORE");
+}
+
+/*
+ * Initialize timer note
+ */
+static void *nt_s390_timer(void *ptr, struct save_area *sa)
+{
+ return nt_init(ptr, NT_S390_TIMER, &sa->timer, sizeof(sa->timer),
+ KEXEC_CORE_NOTE_NAME);
+}
+
+/*
+ * Initialize TOD clock comparator note
+ */
+static void *nt_s390_tod_cmp(void *ptr, struct save_area *sa)
+{
+ return nt_init(ptr, NT_S390_TODCMP, &sa->clk_cmp,
+ sizeof(sa->clk_cmp), KEXEC_CORE_NOTE_NAME);
+}
+
+/*
+ * Initialize TOD programmable register note
+ */
+static void *nt_s390_tod_preg(void *ptr, struct save_area *sa)
+{
+ return nt_init(ptr, NT_S390_TODPREG, &sa->tod_reg,
+ sizeof(sa->tod_reg), KEXEC_CORE_NOTE_NAME);
+}
+
+/*
+ * Initialize control register note
+ */
+static void *nt_s390_ctrs(void *ptr, struct save_area *sa)
+{
+ return nt_init(ptr, NT_S390_CTRS, &sa->ctrl_regs,
+ sizeof(sa->ctrl_regs), KEXEC_CORE_NOTE_NAME);
+}
+
+/*
+ * Initialize prefix register note
+ */
+static void *nt_s390_prefix(void *ptr, struct save_area *sa)
+{
+ return nt_init(ptr, NT_S390_PREFIX, &sa->pref_reg,
+ sizeof(sa->pref_reg), KEXEC_CORE_NOTE_NAME);
+}
+
+/*
+ * Final empty note
+ */
+static void nt_final(void *ptr)
+{
+ memset(ptr, 0, sizeof(struct elf_note));
+}
+
+/*
+ * Create ELF notes for CPU
+ */
+static void add_elf_notes(int cpu)
+{
+ struct save_area *sa = (void *) 4608 + store_prefix();
+ void *ptr;
+
+ memcpy((void *) (4608UL + sa->pref_reg), sa, sizeof(*sa));
+ ptr = (u64 *) per_cpu_ptr(crash_notes, cpu);
+ ptr = nt_prstatus(ptr, sa);
+ ptr = nt_fpregset(ptr, sa);
+ ptr = nt_s390_timer(ptr, sa);
+ ptr = nt_s390_tod_cmp(ptr, sa);
+ ptr = nt_s390_tod_preg(ptr, sa);
+ ptr = nt_s390_ctrs(ptr, sa);
+ ptr = nt_s390_prefix(ptr, sa);
+ nt_final(ptr);
+}
+
+/*
+ * Store status of next available physical CPU
+ */
+static int store_status_next(int start_cpu, int this_cpu)
+{
+ struct save_area *sa = (void *) 4608 + store_prefix();
+ int cpu, rc;
+
+ for (cpu = start_cpu; cpu < 65536; cpu++) {
+ if (cpu == this_cpu)
+ continue;
+ do {
+ rc = raw_sigp(cpu, sigp_stop_and_store_status);
+ } while (rc == sigp_busy);
+ if (rc != sigp_order_code_accepted)
+ continue;
+ if (sa->pref_reg)
+ return cpu;
+ }
+ return -1;
+}
+
+/*
+ * Initialize CPU ELF notes
+ */
+void setup_regs(void)
+{
+ int cpu, this_cpu, phys_cpu = 0, first = 1;
+
+ this_cpu = stap();
+
+ store_status();
+ if (!S390_lowcore.prefixreg_save_area)
+ first = 0;
+ for_each_online_cpu(cpu) {
+ if (first) {
+ add_elf_notes(cpu);
+ first = 0;
+ continue;
+ }
+ phys_cpu = store_status_next(phys_cpu, this_cpu);
+ if (phys_cpu == -1)
+ return;
+ add_elf_notes(cpu);
+ phys_cpu++;
+ }
+}
+
+/*
+ * S390 version: Currently we do not support freeing crashkernel memory
+ */
+void crash_free_reserved_phys_range(unsigned long begin, unsigned long end)
+{
+ return;
+}
+
+/*
+ * S390 version: Just do real copy of segment
+ */
+int kimage_load_crash_segment(struct kimage *image,
+ struct kexec_segment *segment)
+{
+ return copy_from_user_real((void *) segment->mem, segment->buf,
+ segment->bufsz);
+}
+
+/*
+ * Start kdump
+ */
+static void __machine_kdump(void *image)
+{
+ int (*start_kdump)(int) = (void *)((struct kimage *) image)->start;
+
+ pfault_fini();
+ s390_reset_system();
+ __load_psw_mask(PSW_BASE_BITS | PSW_DEFAULT_KEY);
+ setup_regs();
+ start_kdump(1);
+ disabled_wait((unsigned long) __builtin_return_address(0));
+}
+
+#endif
+
+/*
+ * Give back memory to hypervisor before new kdump is loaded
+ */
+static int machine_kexec_prepare_kdump(void)
+{
+#ifdef CONFIG_CRASH_DUMP
+ if (MACHINE_IS_VM)
+ diag10_range(PFN_DOWN(crashk_res.start),
+ PFN_DOWN(crashk_res.end - crashk_res.start + 1));
+ return 0;
+#else
+ return -EINVAL;
+#endif
+}
+
int machine_kexec_prepare(struct kimage *image)
{
void *reboot_code_buffer;
@@ -35,6 +283,9 @@ int machine_kexec_prepare(struct kimage
if (ipl_flags & IPL_NSS_VALID)
return -ENOSYS;
+ if (image->type == KEXEC_TYPE_CRASH)
+ return machine_kexec_prepare_kdump();
+
/* We don't support anything but the default image type for now. */
if (image->type != KEXEC_TYPE_DEFAULT)
return -EINVAL;
@@ -73,6 +324,10 @@ static void __machine_kexec(void *data)
void machine_kexec(struct kimage *image)
{
tracer_disable();
+#ifdef CONFIG_CRASH_DUMP
+ if (image->type == KEXEC_TYPE_CRASH)
+ smp_switch_to_ipl_cpu(__machine_kdump, image);
+#endif
smp_send_stop();
smp_switch_to_ipl_cpu(__machine_kexec, image);
}
--- a/arch/s390/kernel/mem_detect.c
+++ b/arch/s390/kernel/mem_detect.c
@@ -62,3 +62,72 @@ void detect_memory_layout(struct mem_chu
arch_local_irq_restore(flags);
}
EXPORT_SYMBOL(detect_memory_layout);
+
+/*
+ * Create memory hole with given address, size, and type
+ */
+void create_mem_hole(struct mem_chunk chunks[], unsigned long addr,
+ unsigned long size, int type)
+{
+ unsigned long start, end, new_size;
+ int i;
+
+ for (i = 0; i < MEMORY_CHUNKS; i++) {
+ if (chunks[i].size == 0)
+ continue;
+ if (addr + size < chunks[i].addr)
+ continue;
+ if (addr >= chunks[i].addr + chunks[i].size)
+ continue;
+ start = max(addr, chunks[i].addr);
+ end = min(addr + size, chunks[i].addr + chunks[i].size);
+ new_size = end - start;
+ if (new_size == 0)
+ continue;
+ if (start == chunks[i].addr &&
+ end == chunks[i].addr + chunks[i].size) {
+ /* Remove chunk */
+ chunks[i].type = type;
+ } else if (start == chunks[i].addr) {
+ /* Make chunk smaller at start */
+ if (i >= MEMORY_CHUNKS - 1)
+ panic("Unable to create memory hole");
+ memmove(&chunks[i + 1], &chunks[i],
+ sizeof(struct mem_chunk) *
+ (MEMORY_CHUNKS - (i + 1)));
+ chunks[i + 1].addr = chunks[i].addr + new_size;
+ chunks[i + 1].size = chunks[i].size - new_size;
+ chunks[i].size = new_size;
+ chunks[i].type = type;
+ i += 1;
+ } else if (end == chunks[i].addr + chunks[i].size) {
+ /* Make chunk smaller at end */
+ if (i >= MEMORY_CHUNKS - 1)
+ panic("Unable to create memory hole");
+ memmove(&chunks[i + 1], &chunks[i],
+ sizeof(struct mem_chunk) *
+ (MEMORY_CHUNKS - (i + 1)));
+ chunks[i + 1].addr = start;
+ chunks[i + 1].size = new_size;
+ chunks[i + 1].type = type;
+ chunks[i].size -= new_size;
+ i += 1;
+ } else {
+ /* Create memory hole */
+ if (i >= MEMORY_CHUNKS - 2)
+ panic("Unable to create memory hole");
+ memmove(&chunks[i + 2], &chunks[i],
+ sizeof(struct mem_chunk) *
+ (MEMORY_CHUNKS - (i + 2)));
+ chunks[i + 1].addr = addr;
+ chunks[i + 1].size = size;
+ chunks[i + 1].type = type;
+ chunks[i + 2].addr = addr + size;
+ chunks[i + 2].size =
+ chunks[i].addr + chunks[i].size - (addr + size);
+ chunks[i + 2].type = chunks[i].type;
+ chunks[i].size = addr - chunks[i].addr;
+ i += 2;
+ }
+ }
+}
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -42,6 +42,9 @@
#include <linux/reboot.h>
#include <linux/topology.h>
#include <linux/ftrace.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>
+#include <linux/memory.h>
#include <asm/ipl.h>
#include <asm/uaccess.h>
@@ -57,6 +60,7 @@
#include <asm/ebcdic.h>
#include <asm/compat.h>
#include <asm/kvm_virtio.h>
+#include <asm/diag.h>
long psw_kernel_bits = (PSW_BASE_BITS | PSW_MASK_DAT | PSW_ASC_PRIMARY |
PSW_MASK_MCHECK | PSW_DEFAULT_KEY);
@@ -435,6 +439,9 @@ static void __init setup_resources(void)
for (i = 0; i < MEMORY_CHUNKS; i++) {
if (!memory_chunk[i].size)
continue;
+ if (memory_chunk[i].type == CHUNK_OLDMEM ||
+ memory_chunk[i].type == CHUNK_CRASHK)
+ continue;
res = alloc_bootmem_low(sizeof(*res));
res->flags = IORESOURCE_BUSY | IORESOURCE_MEM;
switch (memory_chunk[i].type) {
@@ -479,6 +486,7 @@ static void __init setup_memory_end(void
unsigned long max_mem;
int i;
+
#ifdef CONFIG_ZFCPDUMP
if (ipl_info.type == IPL_TYPE_FCP_DUMP) {
memory_end = ZFCPDUMP_HSA_SIZE;
@@ -550,6 +558,173 @@ static void __init setup_restart_psw(voi
copy_to_absolute_zero(&S390_lowcore.restart_psw, &psw, sizeof(psw));
}
+#ifdef CONFIG_CRASH_DUMP
+
+/*
+ * Find suitable location for crashkernel memory
+ */
+static unsigned long __init find_crash_base(unsigned long crash_size)
+{
+ unsigned long crash_base;
+ struct mem_chunk *chunk;
+ int i;
+
+ if (is_kdump_kernel() && (crash_size == OLDMEM_SIZE))
+ return OLDMEM_BASE;
+
+ for (i = MEMORY_CHUNKS - 1; i >= 0; i--) {
+ chunk = &memory_chunk[i];
+ if (chunk->size == 0)
+ continue;
+ if (chunk->type != CHUNK_READ_WRITE)
+ continue;
+ if (chunk->size < crash_size)
+ continue;
+ crash_base = max(chunk->addr, crash_size);
+ crash_base = max(crash_base, ZFCPDUMP_HSA_SIZE_MAX);
+ crash_base = max(crash_base, (unsigned long) INITRD_START +
+ INITRD_SIZE);
+ crash_base = PAGE_ALIGN(crash_base);
+ if (crash_base >= chunk->addr + chunk->size)
+ continue;
+ if (chunk->addr + chunk->size - crash_base < crash_size)
+ continue;
+ crash_base = chunk->addr + chunk->size - crash_size;
+ return crash_base;
+ }
+ return 0;
+}
+
+/*
+ * Check if crash_base and crash_size are valid
+ */
+static int __init verify_crash_base(unsigned long crash_base,
+ unsigned long crash_size)
+{
+ struct mem_chunk *chunk;
+ int i;
+
+ /*
+ * Because we do the swap to zero, we must have at least 'crash_size'
+ * bytes free space before crash_base
+ */
+ if (crash_size > crash_base)
+ return -EINVAL;
+
+ /* First memory chunk must be at least crash_size */
+ if (memory_chunk[0].size < crash_size)
+ return -EINVAL;
+
+ /* Check if we fit into the respective memory chunk */
+ for (i = 0; i < MEMORY_CHUNKS; i++) {
+ chunk = &memory_chunk[i];
+ if (chunk->size == 0)
+ continue;
+ if (crash_base < chunk->addr)
+ continue;
+ if (crash_base >= chunk->addr + chunk->size)
+ continue;
+ /* we have found the memory chunk */
+ if (crash_base + crash_size > chunk->addr + chunk->size)
+ return -EINVAL;
+ return 0;
+ }
+ return -EINVAL;
+}
+
+/*
+ * Reserve kdump memory by creating a memory hole in the mem_chunk array
+ */
+static void __init reserve_kdump_bootmem(unsigned long addr, unsigned long size,
+ int type)
+{
+ create_mem_hole(memory_chunk, addr, size, type);
+}
+
+/*
+ * When kdump is enabled, we have to ensure that no memory from
+ * the area [0 - crashkernel memory size] is set offline
+ */
+static int kdump_mem_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct memory_notify *arg = data;
+
+ if (arg->start_pfn >= PFN_DOWN(crashk_res.end - crashk_res.start + 1))
+ return NOTIFY_OK;
+ return NOTIFY_BAD;
+}
+
+static struct notifier_block kdump_mem_nb = {
+ .notifier_call = kdump_mem_notifier,
+};
+
+#endif
+
+/*
+ * Make sure that oldmem, where the dump is stored, is protected
+ */
+static void reserve_oldmem(void)
+{
+#ifdef CONFIG_CRASH_DUMP
+ if (!is_kdump_kernel())
+ return;
+
+ reserve_kdump_bootmem(OLDMEM_BASE, OLDMEM_SIZE, CHUNK_OLDMEM);
+ reserve_kdump_bootmem(OLDMEM_SIZE, memory_end - OLDMEM_SIZE,
+ CHUNK_OLDMEM);
+ if (OLDMEM_BASE + OLDMEM_SIZE == real_memory_size)
+ saved_max_pfn = PFN_DOWN(OLDMEM_BASE) - 1;
+ else
+ saved_max_pfn = PFN_DOWN(real_memory_size) - 1;
+#endif
+}
+
+/*
+ * Reserve memory for kdump kernel to be loaded with kexec
+ */
+static void __init reserve_crashkernel(void)
+{
+#ifdef CONFIG_CRASH_DUMP
+ unsigned long long crash_base, crash_size;
+ int rc;
+
+ rc = parse_crashkernel(boot_command_line, memory_end, &crash_size,
+ &crash_base);
+ if (rc || crash_size == 0)
+ return;
+ crash_base = PAGE_ALIGN(crash_base);
+ crash_size = PAGE_ALIGN(crash_size);
+ if (register_memory_notifier(&kdump_mem_nb))
+ return;
+ if (!crash_base)
+ crash_base = find_crash_base(crash_size);
+ if (!crash_base) {
+ pr_info("crashkernel reservation failed: %s\n",
+ "No suitable area found");
+ unregister_memory_notifier(&kdump_mem_nb);
+ return;
+ }
+ if (verify_crash_base(crash_base, crash_size)) {
+ pr_info("crashkernel reservation failed: %s\n",
+ "Invalid memory range specified");
+ unregister_memory_notifier(&kdump_mem_nb);
+ return;
+ }
+ if (!is_kdump_kernel() && MACHINE_IS_VM)
+ diag10_range(PFN_DOWN(crash_base), PFN_DOWN(crash_size));
+ crashk_res.start = crash_base;
+ crashk_res.end = crash_base + crash_size - 1;
+ insert_resource(&iomem_resource, &crashk_res);
+ reserve_kdump_bootmem(crashk_res.start,
+ crashk_res.end - crashk_res.start + 1,
+ CHUNK_CRASHK);
+ pr_info("Reserving %lluMB of memory at %lluMB "
+ "for crashkernel (System RAM: %luMB)\n",
+ crash_size >> 20, crash_base >> 20, memory_end >> 20);
+#endif
+}
+
static void __init
setup_memory(void)
{
@@ -580,6 +755,14 @@ setup_memory(void)
if (PFN_PHYS(start_pfn) + bmap_size > INITRD_START) {
start = PFN_PHYS(start_pfn) + bmap_size + PAGE_SIZE;
+#ifdef CONFIG_CRASH_DUMP
+ if (is_kdump_kernel()) {
+ /* Move initrd behind kdump oldmem */
+ if (start + INITRD_SIZE > OLDMEM_BASE &&
+ start < OLDMEM_BASE + OLDMEM_SIZE)
+ start = OLDMEM_BASE + OLDMEM_SIZE;
+ }
+#endif
if (start + INITRD_SIZE > memory_end) {
pr_err("initrd extends beyond end of "
"memory (0x%08lx > 0x%08lx) "
@@ -644,6 +827,11 @@ setup_memory(void)
reserve_bootmem(start_pfn << PAGE_SHIFT, bootmap_size,
BOOTMEM_DEFAULT);
+#ifdef CONFIG_CRASH_DUMP
+ if (is_kdump_kernel())
+ reserve_bootmem(elfcorehdr_addr - OLDMEM_BASE,
+ PAGE_ALIGN(elfcorehdr_size), BOOTMEM_DEFAULT);
+#endif
#ifdef CONFIG_BLK_DEV_INITRD
if (INITRD_START && INITRD_SIZE) {
if (INITRD_START + INITRD_SIZE <= memory_end) {
@@ -812,6 +1000,8 @@ setup_arch(char **cmdline_p)
setup_ipl();
setup_memory_end();
setup_addressing_mode();
+ reserve_oldmem();
+ reserve_crashkernel();
setup_memory();
setup_resources();
setup_restart_psw();
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -335,6 +335,9 @@ void __init vmem_map_init(void)
ro_start = ((unsigned long)&_stext) & PAGE_MASK;
ro_end = PFN_ALIGN((unsigned long)&_eshared);
for (i = 0; i < MEMORY_CHUNKS && memory_chunk[i].size > 0; i++) {
+ if (memory_chunk[i].type == CHUNK_CRASHK ||
+ memory_chunk[i].type == CHUNK_OLDMEM)
+ continue;
start = memory_chunk[i].addr;
end = memory_chunk[i].addr + memory_chunk[i].size;
if (start >= ro_end || end <= ro_start)
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [patch v3 6/8] s390: Do first kdump checksum test before really starting kdump
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
` (4 preceding siblings ...)
2011-08-12 13:48 ` [patch v3 5/8] s390: kdump backend code Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 7/8] kexec-tools: Add s390 kdump support Michael Holzheu
2011-08-12 13:48 ` [patch v3 8/8] kexec-tools: Allow to call verify_sha256_digest() from kernel Michael Holzheu
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: s390-kdump-arch-backend-entry.patch --]
[-- Type: text/plain, Size: 1494 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
With this patch, the kdump checksums in purgatory are verified first (with
start_kdump(0)) before kdump is started. This allows us to run the shutdown
actions defined under /sys/firmware as a recovery action in case the kdump
image has been overwritten. The main use case is to define stand-alone dump
as the recovery action.
We have to split the purgatory function into "checksum test" and "real
execution" because we have to switch to the IPL CPU when we execute kdump.
After the switch it is not possible to return from the called function.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
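For reviewers, the check-then-execute sequence described above can be
modelled in plain C. This is only a rough sketch under stated assumptions:
try_kdump(), the kdump_result enum and the two demo_* stubs are hypothetical
names that do not exist in the kernel, and the switch to the IPL CPU is
reduced to a second call of the entry point.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the purgatory entry point:
 * mode 0 = checksum test only, mode 1 = start kdump.
 */
typedef int (*start_kdump_fn)(int mode);

enum kdump_result {
	KDUMP_STARTED,	/* checksums valid, kdump was triggered */
	KDUMP_FALLBACK	/* checksums invalid, caller runs shutdown actions */
};

/* Model of the two-phase sequence in machine_kexec(). */
enum kdump_result try_kdump(start_kdump_fn start_kdump)
{
	assert(start_kdump != NULL);

	/* Phase 1: checksum test only; returning is still possible here. */
	if (start_kdump(0) != 0)
		return KDUMP_FALLBACK;

	/*
	 * Phase 2: the real code switches to the IPL CPU first, after which
	 * the call cannot return. The model simply calls the entry again.
	 */
	(void)start_kdump(1);
	return KDUMP_STARTED;
}

/* Illustrative stubs: an intact and an overwritten kdump image. */
int demo_purgatory_ok(int mode)
{
	(void)mode;
	return 0;
}

int demo_purgatory_corrupt(int mode)
{
	return mode == 0 ? 1 : 0;
}
```

With the intact stub the model reports KDUMP_STARTED. With the corrupt stub
it reports KDUMP_FALLBACK, which corresponds to machine_kexec() returning so
that the shutdown actions can run.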
arch/s390/kernel/machine_kexec.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/arch/s390/kernel/machine_kexec.c
+++ b/arch/s390/kernel/machine_kexec.c
@@ -325,8 +325,16 @@ void machine_kexec(struct kimage *image)
{
tracer_disable();
#ifdef CONFIG_CRASH_DUMP
- if (image->type == KEXEC_TYPE_CRASH)
+ if (image->type == KEXEC_TYPE_CRASH) {
+ int (*start_kdump)(int) = (void *)image->start;
+ int rc;
+ __arch_local_irq_stnsm(0xfb); /* disable DAT */
+ rc = start_kdump(0);
+ __arch_local_irq_stosm(0x04); /* enable DAT */
+ if (rc)
+ return;
smp_switch_to_ipl_cpu(__machine_kdump, image);
+ }
#endif
smp_send_stop();
smp_switch_to_ipl_cpu(__machine_kexec, image);
* [patch v3 7/8] kexec-tools: Add s390 kdump support
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
` (5 preceding siblings ...)
2011-08-12 13:48 ` [patch v3 6/8] s390: Do first kdump checksum test before really starting kdump Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 8/8] kexec-tools: Allow to call verify_sha256_digest() from kernel Michael Holzheu
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: kexec-tools-s390-kdump.patch --]
[-- Type: text/plain, Size: 18240 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
This patch adds kdump support for s390 to the kexec tool and enables the
"--load-panic" option. When loading the kdump kernel and ramdisk we add the
address of the crashkernel memory to the normal load address.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
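To illustrate the address calculation mentioned above, here is a minimal
sketch of how a normal load offset could be translated into the crashkernel
area. crash_load_addr() is a hypothetical helper and not part of
kexec-tools. It treats crash_end as inclusive, so its bounds check is
slightly more permissive than the one in add_segment_check() below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (not part of kexec-tools): translate a normal load
 * offset into an address inside the crashkernel area [crash_base, crash_end].
 * Returns 0 and stores the result in *load_addr on success, or -1 if the
 * segment does not fit into the area.
 */
int crash_load_addr(uint64_t crash_base, uint64_t crash_end,
		    uint64_t offset, uint64_t memsz, uint64_t *load_addr)
{
	uint64_t crash_size;

	assert(crash_end >= crash_base);
	crash_size = crash_end - crash_base + 1;	/* end is inclusive */

	/* Written this way to avoid overflow in offset + memsz. */
	if (offset > crash_size || memsz > crash_size - offset)
		return -1;
	*load_addr = crash_base + offset;
	return 0;
}
```

For example, with a 128MB crashkernel area starting at 0x10000000, a segment
with load offset 0x10000 ends up at 0x10010000, while a 128MB segment at
offset 0x800000 is rejected.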
include/elf.h | 8 ++-
kexec/arch/s390/Makefile | 1
kexec/arch/s390/crashdump-s390.c | 62 +++++++++++++++++++++++
kexec/arch/s390/kexec-elf-rel-s390.c | 66 +++++++++++++++++++++---
kexec/arch/s390/kexec-image.c | 69 ++++++++++++++++++-------
kexec/arch/s390/kexec-s390.c | 66 +++++++++++++++++-------
kexec/arch/s390/kexec-s390.h | 9 +++
purgatory/arch/s390/Makefile | 4 +
purgatory/arch/s390/console-s390.c | 14 +++++
purgatory/arch/s390/purgatory-s390.c | 93 +++++++++++++++++++++++++++++++++++
purgatory/arch/s390/setup-s390.S | 33 ++++++++++++
11 files changed, 376 insertions(+), 49 deletions(-)
--- a/include/elf.h
+++ b/include/elf.h
@@ -2304,9 +2304,13 @@ typedef Elf32_Addr Elf32_Conflict;
#define R_390_TLS_DTPOFF 55 /* Offset in TLS block. */
#define R_390_TLS_TPOFF 56 /* Negated offset in static TLS
block. */
-
+#define R_390_20 57 /* Direct 20 bit. */
+#define R_390_GOT20 58 /* 20 bit GOT offset. */
+#define R_390_GOTPLT20 59 /* 20 bit offset to jump slot. */
+#define R_390_TLS_GOTIE20 60 /* 20 bit GOT offset for static TLS
+ block offset. */
/* Keep this the last entry. */
-#define R_390_NUM 57
+#define R_390_NUM 61
/* CRIS relocations. */
#define R_CRIS_NONE 0
--- a/kexec/arch/s390/Makefile
+++ b/kexec/arch/s390/Makefile
@@ -4,6 +4,7 @@
s390_KEXEC_SRCS = kexec/arch/s390/kexec-s390.c
s390_KEXEC_SRCS += kexec/arch/s390/kexec-image.c
s390_KEXEC_SRCS += kexec/arch/s390/kexec-elf-rel-s390.c
+s390_KEXEC_SRCS += kexec/arch/s390/crashdump-s390.c
dist += kexec/arch/s390/Makefile $(s390_KEXEC_SRCS) \
kexec/arch/s390/kexec-s390.h \
--- /dev/null
+++ b/kexec/arch/s390/crashdump-s390.c
@@ -0,0 +1,62 @@
+/*
+ * kexec/arch/s390/crashdump-s390.c
+ *
+ * Copyright IBM Corp. 2011
+ *
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <elf.h>
+#include <limits.h>
+#include "../../kexec.h"
+#include "../../kexec-syscall.h"
+#include "../../kexec/crashdump.h"
+#include "kexec-s390.h"
+
+/*
+ * Load additional segments for kdump kernel
+ */
+int load_crashdump_segments(struct kexec_info *info, unsigned long crash_base,
+ unsigned long crash_end)
+{
+ static struct memory_range crash_memory_range[MAX_MEMORY_RANGES];
+ unsigned long bufsz, elfcorehdr, elfcorehdr_size, crash_size;
+ struct crash_elf_info elf_info;
+ char str[COMMAND_LINESIZE];
+ int ranges;
+ void *tmp;
+
+ crash_size = crash_end - crash_base + 1;
+ memset(&elf_info, 0, sizeof(elf_info));
+
+ elf_info.data = ELFDATA2MSB;
+ elf_info.machine = EM_S390;
+ elf_info.class = ELFCLASS64;
+ elf_info.get_note_info = get_crash_notes_per_cpu;
+
+ if (get_memory_ranges_s390(crash_memory_range, &ranges, 0))
+ return -1;
+
+ if (crash_create_elf64_headers(info, &elf_info, crash_memory_range,
+ ranges, &tmp, &bufsz,
+ ELF_CORE_HEADER_ALIGN))
+ return -1;
+
+ elfcorehdr = add_buffer(info, tmp, bufsz, bufsz, 1024,
+ crash_base, crash_end, -1);
+ elfcorehdr_size = bufsz;
+ elf_rel_build_load(info, &info->rhdr, (const char *) purgatory,
+ purgatory_size, crash_base + 0x2000,
+ crash_base + 0x10000, -1, 0);
+ elf_rel_set_symbol(&info->rhdr, "crash_base", &crash_base,
+ sizeof(crash_base));
+ elf_rel_set_symbol(&info->rhdr, "crash_size", &crash_size,
+ sizeof(crash_size));
+ info->entry = (void *) elf_rel_get_addr(&info->rhdr, "purgatory_start");
+ snprintf(str, sizeof(str), " elfcorehdr=%lu@%luK\n",
+ elfcorehdr_size, elfcorehdr / 1024);
+ command_line_add(str);
+ return 0;
+}
--- a/kexec/arch/s390/kexec-elf-rel-s390.c
+++ b/kexec/arch/s390/kexec-elf-rel-s390.c
@@ -1,7 +1,7 @@
/*
* kexec/arch/s390/kexec-elf-rel-s390.c
*
- * (C) Copyright IBM Corp. 2005
+ * Copyright IBM Corp. 2005,2011
*
* Author(s): Heiko Carstens <heiko.carstens@de.ibm.com>
*
@@ -12,15 +12,65 @@
#include "../../kexec.h"
#include "../../kexec-elf.h"
-int machine_verify_elf_rel(struct mem_ehdr *UNUSED(ehdr))
+int machine_verify_elf_rel(struct mem_ehdr *ehdr)
{
- return 0;
+ if (ehdr->ei_data != ELFDATA2MSB)
+ return 0;
+ if (ehdr->ei_class != ELFCLASS64)
+ return 0;
+ if (ehdr->e_machine != EM_S390)
+ return 0;
+ return 1;
}
-void machine_apply_elf_rel(struct mem_ehdr *UNUSED(ehdr),
- unsigned long UNUSED(r_type),
- void *UNUSED(location),
- unsigned long UNUSED(address),
- unsigned long UNUSED(value))
+void machine_apply_elf_rel(struct mem_ehdr *ehdr,
+ unsigned long r_type,
+ void *loc,
+ unsigned long address,
+ unsigned long val)
{
+ switch (r_type) {
+ case R_390_8: /* Direct 8 bit. */
+ case R_390_12: /* Direct 12 bit. */
+ case R_390_16: /* Direct 16 bit. */
+ case R_390_20: /* Direct 20 bit. */
+ case R_390_32: /* Direct 32 bit. */
+ case R_390_64: /* Direct 64 bit. */
+ if (r_type == R_390_8)
+ *(unsigned char *) loc = val;
+ else if (r_type == R_390_12)
+ *(unsigned short *) loc = (val & 0xfff) |
+ (*(unsigned short *) loc & 0xf000);
+ else if (r_type == R_390_16)
+ *(unsigned short *) loc = val;
+ else if (r_type == R_390_20)
+ *(unsigned int *) loc =
+ (*(unsigned int *) loc & 0xf00000ff) |
+ (val & 0xfff) << 16 | (val & 0xff000) >> 4;
+ else if (r_type == R_390_32)
+ *(unsigned int *) loc = val;
+ else if (r_type == R_390_64)
+ *(unsigned long *) loc = val;
+ break;
+ case R_390_PC16: /* PC relative 16 bit. */
+ case R_390_PC16DBL: /* PC relative 16 bit shifted by 1. */
+ case R_390_PC32DBL: /* PC relative 32 bit shifted by 1. */
+ case R_390_PC32: /* PC relative 32 bit. */
+ case R_390_PC64: /* PC relative 64 bit. */
+ val -= address;
+ if (r_type == R_390_PC16)
+ *(unsigned short *) loc = val;
+ else if (r_type == R_390_PC16DBL)
+ *(unsigned short *) loc = val >> 1;
+ else if (r_type == R_390_PC32DBL)
+ *(unsigned int *) loc = val >> 1;
+ else if (r_type == R_390_PC32)
+ *(unsigned int *) loc = val;
+ else if (r_type == R_390_PC64)
+ *(unsigned long *) loc = val;
+ break;
+ default:
+ die("Unknown rela relocation: 0x%lx 0x%lx\n", r_type, address);
+ break;
+ }
}
--- a/kexec/arch/s390/kexec-image.c
+++ b/kexec/arch/s390/kexec-image.c
@@ -18,18 +18,41 @@
#include <unistd.h>
#include <getopt.h>
#include "../../kexec.h"
+#include "../../kexec-syscall.h"
+#include "../../kexec/crashdump.h"
#include "kexec-s390.h"
+#include "elf.h"
#include <arch/options.h>
+static uint64_t crash_base, crash_end;
+static char command_line[COMMAND_LINESIZE];
+
+static void add_segment_check(struct kexec_info *info, const void *buf,
+ size_t bufsz, unsigned long base, size_t memsz)
+{
+ if (info->kexec_flags & KEXEC_ON_CRASH)
+ if (base + memsz > crash_end - crash_base)
+ die("Not enough crashkernel memory to load segments\n");
+ add_segment(info, buf, bufsz, crash_base + base, memsz);
+}
+
+int command_line_add(const char *str)
+{
+ if (strlen(command_line) + strlen(str) + 1 > COMMAND_LINESIZE) {
+ fprintf(stderr, "Command line too long.\n");
+ return -1;
+ }
+ strcat(command_line, str);
+ return 0;
+}
+
int
image_s390_load(int argc, char **argv, const char *kernel_buf,
off_t kernel_size, struct kexec_info *info)
{
void *krnl_buffer;
char *rd_buffer;
- const char *command_line;
const char *ramdisk;
- int command_line_len;
off_t ramdisk_len;
unsigned int ramdisk_origin;
int opt;
@@ -44,7 +67,6 @@ image_s390_load(int argc, char **argv, c
static const char short_options[] = KEXEC_OPT_STR "";
ramdisk = NULL;
- command_line = NULL;
ramdisk_len = 0;
ramdisk_origin = 0;
@@ -55,7 +77,8 @@ image_s390_load(int argc, char **argv, c
return -1;
break;
case OPT_APPEND:
- command_line = optarg;
+ if (command_line_add(optarg))
+ return -1;
break;
case OPT_RAMDISK:
ramdisk = optarg;
@@ -63,17 +86,14 @@ image_s390_load(int argc, char **argv, c
}
}
- /* Process a given command_line: */
- if (command_line) {
- command_line_len = strlen(command_line) + 1; /* Remember the '\0' */
- if (command_line_len > COMMAND_LINESIZE) {
- fprintf(stderr, "Command line too long.\n");
+ if (info->kexec_flags & KEXEC_ON_CRASH) {
+ if (parse_iomem_single("Crash kernel\n", &crash_base,
+ &crash_end))
return -1;
- }
}
/* Add kernel segment */
- add_segment(info, kernel_buf + IMAGE_READ_OFFSET,
+ add_segment_check(info, kernel_buf + IMAGE_READ_OFFSET,
kernel_size - IMAGE_READ_OFFSET, IMAGE_READ_OFFSET,
kernel_size - IMAGE_READ_OFFSET);
@@ -88,10 +108,17 @@ image_s390_load(int argc, char **argv, c
return -1;
}
ramdisk_origin = RAMDISK_ORIGIN_ADDR;
- add_segment(info, rd_buffer, ramdisk_len, RAMDISK_ORIGIN_ADDR, ramdisk_len);
+ add_segment_check(info, rd_buffer, ramdisk_len,
+ RAMDISK_ORIGIN_ADDR, ramdisk_len);
}
-
- /* Register the ramdisk in the kernel. */
+ if (info->kexec_flags & KEXEC_ON_CRASH) {
+ if (load_crashdump_segments(info, crash_base, crash_end))
+ return -1;
+ } else {
+ info->entry = (void *) IMAGE_READ_OFFSET;
+ }
+
+ /* Register the ramdisk and crashkernel memory in the kernel. */
{
unsigned long long *tmp;
@@ -100,19 +127,23 @@ image_s390_load(int argc, char **argv, c
tmp = krnl_buffer + INITRD_SIZE_OFFS;
*tmp = (unsigned long long) ramdisk_len;
- }
+ if (info->kexec_flags & KEXEC_ON_CRASH) {
+ tmp = krnl_buffer + OLDMEM_BASE_OFFS;
+ *tmp = crash_base;
+
+ tmp = krnl_buffer + OLDMEM_SIZE_OFFS;
+ *tmp = crash_end - crash_base + 1;
+ }
+ }
/*
* We will write a probably given command line.
* First, erase the old area, then setup the new parameters:
*/
- if (command_line) {
+ if (strlen(command_line) != 0) {
memset(krnl_buffer + COMMAND_LINE_OFFS, 0, COMMAND_LINESIZE);
memcpy(krnl_buffer + COMMAND_LINE_OFFS, command_line, strlen(command_line));
}
-
- info->entry = (void *) IMAGE_READ_OFFSET;
-
return 0;
}
--- a/kexec/arch/s390/kexec-s390.c
+++ b/kexec/arch/s390/kexec-s390.c
@@ -1,9 +1,10 @@
/*
* kexec/arch/s390/kexec-s390.c
*
- * (C) Copyright IBM Corp. 2005
+ * Copyright IBM Corp. 2005,2011
*
* Author(s): Rolf Adelsberger <adelsberger@de.ibm.com>
+ * Michael Holzheu <holzheu@linux.vnet.ibm.com>
*
*/
@@ -19,26 +20,16 @@
#include "kexec-s390.h"
#include <arch/options.h>
-#define MAX_MEMORY_RANGES 64
static struct memory_range memory_range[MAX_MEMORY_RANGES];
/*
- * get_memory_ranges:
- * Return a list of memory ranges by parsing the file returned by
- * proc_iomem()
- *
- * INPUT:
- * - Pointer to an array of memory_range structures.
- * - Pointer to an integer with holds the number of memory ranges.
- *
- * RETURN:
- * - 0 on normal execution.
- * - (-1) if something went wrong.
+ * Get memory ranges of type "System RAM" from /proc/iomem. If with_crashk=1
+ * then also type "Crash kernel" is added.
*/
-
-int get_memory_ranges(struct memory_range **range, int *ranges,
- unsigned long UNUSED(flags))
+int get_memory_ranges_s390(struct memory_range memory_range[], int *ranges,
+ int with_crashk)
{
+ char crash_kernel[] = "Crash kernel\n";
char sys_ram[] = "System RAM\n";
const char *iomem = proc_iomem();
FILE *fp;
@@ -62,7 +53,9 @@ int get_memory_ranges(struct memory_rang
sscanf(line,"%Lx-%Lx : %n", &start, &end, &cons);
str = line+cons;
- if(memcmp(str,sys_ram,strlen(sys_ram)) == 0) {
+ if ((memcmp(str, sys_ram, strlen(sys_ram)) == 0) ||
+ ((memcmp(str, crash_kernel, strlen(crash_kernel)) == 0) &&
+ with_crashk)) {
memory_range[current_range].start = start;
memory_range[current_range].end = end;
memory_range[current_range].type = RANGE_RAM;
@@ -73,9 +66,41 @@ int get_memory_ranges(struct memory_rang
}
}
fclose(fp);
- *range = memory_range;
*ranges = current_range;
+ return 0;
+}
+/*
+ * get_memory_ranges:
+ * Return a list of memory ranges by parsing the file returned by
+ * proc_iomem()
+ *
+ * INPUT:
+ * - Pointer to an array of memory_range structures.
+ * - Pointer to an integer which holds the number of memory ranges.
+ *
+ * RETURN:
+ * - 0 on normal execution.
+ * - (-1) if something went wrong.
+ */
+
+int get_memory_ranges(struct memory_range **range, int *ranges,
+ unsigned long flags)
+{
+ uint64_t start, end;
+
+ if (get_memory_ranges_s390(memory_range, ranges,
+ flags & KEXEC_ON_CRASH))
+ return -1;
+ *range = memory_range;
+ if ((flags & KEXEC_ON_CRASH) && !(flags & KEXEC_PRESERVE_CONTEXT)) {
+ if (parse_iomem_single("Crash kernel\n", &start, &end))
+ return -1;
+ if (start > mem_min)
+ mem_min = start;
+ if (end < mem_max)
+ mem_max = end;
+ }
return 0;
}
@@ -112,5 +137,8 @@ void arch_update_purgatory(struct kexec_
int is_crashkernel_mem_reserved(void)
{
- return 0; /* kdump is not supported on this platform (yet) */
+ uint64_t start, end;
+
+ return parse_iomem_single("Crash kernel\n", &start, &end) == 0 ?
+ (start != end) : 0;
}
--- a/kexec/arch/s390/kexec-s390.h
+++ b/kexec/arch/s390/kexec-s390.h
@@ -15,11 +15,20 @@
#define RAMDISK_ORIGIN_ADDR 0x800000
#define INITRD_START_OFFS 0x408
#define INITRD_SIZE_OFFS 0x410
+#define OLDMEM_BASE_OFFS 0x418
+#define OLDMEM_SIZE_OFFS 0x420
#define COMMAND_LINE_OFFS 0x480
#define COMMAND_LINESIZE 896
+#define MAX_MEMORY_RANGES 64
extern int image_s390_load(int, char **, const char *, off_t, struct kexec_info *);
extern int image_s390_probe(const char *, off_t);
extern void image_s390_usage(void);
+extern int load_crashdump_segments(struct kexec_info *info,
+ unsigned long crash_base,
+ unsigned long crash_end);
+extern int get_memory_ranges_s390(struct memory_range range[], int *ranges,
+ int with_crashk);
+extern int command_line_add(const char *str);
#endif /* KEXEC_IA64_H */
--- a/purgatory/arch/s390/Makefile
+++ b/purgatory/arch/s390/Makefile
@@ -2,7 +2,9 @@
# Purgatory s390
#
-s390_PURGATORY_SRCS =
+s390_PURGATORY_SRCS += purgatory/arch/s390/console-s390.c
+s390_PURGATORY_SRCS += purgatory/arch/s390/setup-s390.S
+s390_PURGATORY_SRCS += purgatory/arch/s390/purgatory-s390.c
dist += purgatory/arch/s390/Makefile $(s390_PURGATORY_SRCS)
--- /dev/null
+++ b/purgatory/arch/s390/console-s390.c
@@ -0,0 +1,14 @@
+/*
+ * S390 console code (currently not implemented)
+ *
+ * Copyright IBM Corp. 2011
+ *
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+#include <purgatory.h>
+#include "unused.h"
+
+void putchar(int UNUSED(ch))
+{
+}
--- /dev/null
+++ b/purgatory/arch/s390/purgatory-s390.c
@@ -0,0 +1,93 @@
+/*
+ * S390 purgatory
+ *
+ * Copyright IBM Corp. 2011
+ *
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+#include <string.h>
+#include "../../../kexec/kexec-sha256.h"
+
+#define MIN(x, y) ((x) < (y) ? (x) : (y))
+#define MAX(x, y) ((x) > (y) ? (x) : (y))
+
+extern struct sha256_region sha256_regions[SHA256_REGIONS];
+
+unsigned long crash_base = (unsigned long) -1;
+unsigned long crash_size = (unsigned long) -1;
+
+/*
+ * Implement memcpy using the mvcle instruction
+ */
+static void memcpy_fast(void *target, void *src, unsigned long size)
+{
+ register unsigned long __target asm("2") = (unsigned long) target;
+ register unsigned long __size1 asm("3") = size;
+ register unsigned long __src asm("4") = (unsigned long) src;
+ register unsigned long __size2 asm("5") = size;
+
+ asm volatile (
+ "0: mvcle %0,%2,0\n"
+ " jo 0b\n"
+ : "+d" (__target), "+d" (__size1), "+d" (__src), "+d" (__size2)
+ :
+ : "cc", "memory"
+ );
+}
+
+/*
+ * Swap memory areas
+ */
+static void memswap(void *addr1, void *addr2, unsigned long size)
+{
+ unsigned long off, copy_len;
+ static char buf[1024];
+
+ for (off = 0; off < size; off += sizeof(buf)) {
+ copy_len = MIN(size - off, sizeof(buf));
+ memcpy_fast(buf, (void *) addr2 + off, copy_len);
+ memcpy_fast(addr2 + off, addr1 + off, copy_len);
+ memcpy_fast(addr1 + off, buf, copy_len);
+ }
+}
+
+/*
+ * Nothing to do
+ */
+void setup_arch(void)
+{
+}
+
+/*
+ * Do swap of [crash base - crash base + size] with [0 - crash size]
+ *
+ * We swap all kexec segments except purgatory. The rest is copied
+ * from [0 - crash size] to [crash base - crash base + size].
+ * We use [0x2000 - 0x10000] for purgatory. This area is never used
+ * by s390 Linux kernels.
+ *
+ * This function assumes that sha256_regions[] is sorted.
+ */
+void post_verification_setup_arch(void)
+{
+ unsigned long start, len, last = crash_base + 0x10000;
+ struct sha256_region *ptr, *end;
+
+ end = &sha256_regions[sizeof(sha256_regions)/sizeof(sha256_regions[0])];
+ for (ptr = sha256_regions; ptr < end; ptr++) {
+ if (!ptr->start)
+ continue;
+ start = MAX(ptr->start, crash_base + 0x10000);
+ len = ptr->len - (start - ptr->start);
+ memcpy_fast((void *) last, (void *) last - crash_base,
+ start - last);
+ memswap((void *) start - crash_base, (void *) start, len);
+ last = start + len;
+ }
+ memcpy_fast((void *) last, (void *) last - crash_base,
+ crash_base + crash_size - last);
+ memcpy_fast((void *) crash_base, (void *) 0, 0x2000);
+}
--- /dev/null
+++ b/purgatory/arch/s390/setup-s390.S
@@ -0,0 +1,33 @@
+/*
+ * Purgatory setup code
+ *
+ * Copyright IBM Corp. 2011
+ *
+ * Author(s): Michael Holzheu <holzheu@linux.vnet.ibm.com>
+ */
+
+ .text
+ .globl purgatory_start
+ .balign 16
+purgatory_start:
+#ifdef __s390x__
+ larl %r15,lstack_end
+ aghi %r15,-160
+ brasl %r14,purgatory
+ larl %r14,kdump_psw
+ lpswe 0(%r14)
+
+ .section ".data"
+ .balign 16
+kdump_psw:
+ .quad 0x0000000180000000
+ .quad 0x0000000000010010
+
+ .bss
+ .balign 4096
+lstack:
+ .skip 4096
+lstack_end:
+#else
+0: j 0
+#endif
* [patch v3 8/8] kexec-tools: Allow to call verify_sha256_digest() from kernel
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
` (6 preceding siblings ...)
2011-08-12 13:48 ` [patch v3 7/8] kexec-tools: Add s390 kdump support Michael Holzheu
@ 2011-08-12 13:48 ` Michael Holzheu
7 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-12 13:48 UTC (permalink / raw)
To: vgoyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
[-- Attachment #1: kexec-tools-s390-kdump-entry.patch --]
[-- Type: text/plain, Size: 2159 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
On s390 we want to check that the kdump checksums are valid before we start
the kdump kernel. With this patch, the s390 purgatory entry point is
called with a parameter. If the parameter is "0", only the checksum test
is done and the result (0 = ok, 1 = invalid) is passed back as return code
to the caller (the kernel). If the parameter is "1", the complete purgatory
code is executed and kdump is started.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
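Seen from the purgatory side, the parameter handling described above can be
modelled roughly as follows. This is a sketch only: struct purgatory_ops,
purgatory_entry() and the demo_* stubs are hypothetical names, and the real
entry point is the assembler code in setup-s390.S, where mode "1" ends with
lpswe and never returns.

```c
#include <assert.h>

/*
 * Hypothetical hooks standing in for verify_sha256_digest() and the rest
 * of purgatory; names and signatures are illustrative only.
 */
struct purgatory_ops {
	int (*verify)(void);	/* 0 = checksums ok, 1 = invalid */
	void (*run)(void);	/* full purgatory, then start kdump */
};

/* Model of the entry point: mode 0 verifies and returns, mode 1 executes. */
int purgatory_entry(const struct purgatory_ops *ops, int mode)
{
	assert(ops && ops->verify && ops->run);

	if (mode == 0)
		return ops->verify();
	ops->run();	/* does not return on real hardware */
	return 0;	/* reached only in this model */
}

/* Illustrative stubs used to exercise the model. */
static int demo_runs;

int demo_verify_ok(void)
{
	return 0;
}

int demo_verify_bad(void)
{
	return 1;
}

void demo_run(void)
{
	demo_runs++;
}
```

Calling the model with mode 0 only reports the checksum result and leaves
demo_runs untouched. Only mode 1 invokes the run hook.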
purgatory/arch/s390/setup-s390.S | 14 ++++++++++++++
purgatory/purgatory.c | 13 ++++++++-----
2 files changed, 22 insertions(+), 5 deletions(-)
--- a/purgatory/arch/s390/setup-s390.S
+++ b/purgatory/arch/s390/setup-s390.S
@@ -11,12 +11,23 @@
.balign 16
purgatory_start:
#ifdef __s390x__
+ larl %r5,gprs_save_area
+ stmg %r6,%r15,0(%r5)
larl %r15,lstack_end
aghi %r15,-160
+
+ clgfi %r2,0
+ je verify_checksums
+
brasl %r14,purgatory
larl %r14,kdump_psw
lpswe 0(%r14)
+verify_checksums:
+ brasl %r14,verify_sha256_digest
+ larl %r5,gprs_save_area
+ lmg %r6,%r15,0(%r5)
+ br %r14
.section ".data"
.balign 16
kdump_psw:
@@ -24,6 +35,9 @@ kdump_psw:
.quad 0x0000000000010010
.bss
+gprs_save_area:
+ .fill 80
+
.balign 4096
lstack:
.skip 4096
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,7 +9,7 @@
struct sha256_region sha256_regions[SHA256_REGIONS] = {};
sha256_digest_t sha256_digest = { };
-void verify_sha256_digest(void)
+int verify_sha256_digest(void)
{
struct sha256_region *ptr, *end;
sha256_digest_t digest;
@@ -34,16 +34,19 @@ void verify_sha256_digest(void)
printf("%hhx ", sha256_digest[i]);
}
printf("\n");
- for(;;) {
- /* loop forever */
- }
+ return 1;
}
+ return 0;
}
void purgatory(void)
{
printf("I'm in purgatory\n");
setup_arch();
- verify_sha256_digest();
+ if (verify_sha256_digest()) {
+ for(;;) {
+ /* loop forever */
+ }
+ }
post_verification_setup_arch();
}
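
The two-pass entry described in the patch description can be modelled in plain C. This is only an illustrative userspace sketch, not the s390 assembly: every name ending in _sketch, the digest_matches flag and the kdump_started counter are hypothetical, and the switch to the IPL CPU is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real SHA-256 comparison in purgatory.c. */
static bool digest_matches = true;

/* Set when the (modelled) kdump kernel would have been started. */
static int kdump_started;

enum { PURGATORY_CHECK = 0, PURGATORY_EXEC = 1 };

/* Same contract as the patched function: 0 = checksums ok, 1 = invalid. */
static int verify_sha256_digest_sketch(void)
{
	return digest_matches ? 0 : 1;
}

/* Models purgatory_start: the parameter selects check-only or execution. */
static int purgatory_entry_sketch(unsigned long mode)
{
	assert(mode == PURGATORY_CHECK || mode == PURGATORY_EXEC);

	if (mode == PURGATORY_CHECK)
		return verify_sha256_digest_sketch();

	/* The real purgatory loads the kdump PSW here and never returns. */
	kdump_started = 1;
	return 0;
}

/* Models the caller: returns 0 if kdump was started, -1 on fallback. */
static int machine_kexec_sketch(void)
{
	if (purgatory_entry_sketch(PURGATORY_CHECK) != 0)
		return -1;	/* caller continues with its own shutdown actions */

	/* The switch to the IPL CPU would happen at this point. */
	purgatory_entry_sketch(PURGATORY_EXEC);
	return 0;
}
```

With valid checksums machine_kexec_sketch() reaches the second call; with invalid ones it returns to the caller before anything irreversible happens.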
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter
2011-08-12 13:48 ` [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter Michael Holzheu
@ 2011-08-17 21:05 ` Vivek Goyal
2011-08-18 8:47 ` Michael Holzheu
0 siblings, 1 reply; 20+ messages in thread
From: Vivek Goyal @ 2011-08-17 21:05 UTC (permalink / raw)
To: Michael Holzheu
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
On Fri, Aug 12, 2011 at 03:48:52PM +0200, Michael Holzheu wrote:
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> Currently only the address of the pre-allocated ELF header is passed with
> the elfcorehdr= kernel parameter. In order to reserve memory for the header
> in the 2nd kernel also the size is required. Current kdump architecture
> backends use different methods to do that, e.g. x86 uses the memmap= kernel
> parameter. On s390 there is no easy way to transfer this information.
> Therefore the elfcorehdr kernel parameter is extended to also pass the size.
> This now can also be used as standard mechanism by all future kdump
> architecture backends.
Michael,
This version looks much better. A quick question: who parses the
elfcorehdr= parameter on s390, and how do we make sure these headers are not
overwritten?
Thanks
Vivek
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter
2011-08-17 21:05 ` Vivek Goyal
@ 2011-08-18 8:47 ` Michael Holzheu
2011-08-18 17:28 ` Vivek Goyal
0 siblings, 1 reply; 20+ messages in thread
From: Michael Holzheu @ 2011-08-18 8:47 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
Hello Vivek,
On Wed, 2011-08-17 at 17:05 -0400, Vivek Goyal wrote:
> On Fri, Aug 12, 2011 at 03:48:52PM +0200, Michael Holzheu wrote:
> > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> >
> > Currently only the address of the pre-allocated ELF header is passed with
> > the elfcorehdr= kernel parameter. In order to reserve memory for the header
> > in the 2nd kernel also the size is required. Current kdump architecture
> > backends use different methods to do that, e.g. x86 uses the memmap= kernel
> > parameter. On s390 there is no easy way to transfer this information.
> > Therefore the elfcorehdr kernel parameter is extended to also pass the size.
> > This now can also be used as standard mechanism by all future kdump
> > architecture backends.
>
> Michael,
>
> This version looks much better. A quick question. Who parses this
> elfcorehdr= parameter in s390 and how do we make sure these headers are not
> overwritten.
The parameter is parsed in common code (kernel/crash_dump.c) via
early_param("elfcorehdr", setup_elfcorehdr), as is already the case
today.
We use address and size of the ELF core header to reserve the header
memory in setup.c (see patch #8):
+#ifdef CONFIG_CRASH_DUMP
+ if (is_kdump_kernel())
+ reserve_bootmem(elfcorehdr_addr - OLDMEM_BASE,
+ PAGE_ALIGN(elfcorehdr_size), BOOTMEM_DEFAULT);
+#endif
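
To illustrate the parsing side, here is a minimal userspace sketch. It assumes the extended parameter has the form [size@]address, which is not spelled out in this mail; parse_size() is a simplified stand-in for the kernel's memparse(), and struct elfcorehdr_arg and parse_elfcorehdr() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the two values carried by elfcorehdr=. */
struct elfcorehdr_arg {
	unsigned long long addr;
	unsigned long long size;
};

/* Simplified stand-in for memparse(): a number plus optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Accepts "address" or "size@address"; returns 0 on success, -1 on error. */
static int parse_elfcorehdr(const char *arg, struct elfcorehdr_arg *out)
{
	char *end;

	assert(out != NULL);
	if (!arg)
		return -1;

	out->size = 0;
	out->addr = parse_size(arg, &end);
	if (*end == '@') {
		/* The first number was the size, the address follows the '@'. */
		out->size = out->addr;
		out->addr = parse_size(end + 1, &end);
	}
	return end > arg ? 0 : -1;
}
```

The old address-only form still parses and leaves size at 0, so a backend that does not need the size is not affected.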
Does that answer your question?
Michael
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-12 13:48 ` [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak Michael Holzheu
@ 2011-08-18 17:15 ` Vivek Goyal
2011-08-19 13:27 ` Michael Holzheu
0 siblings, 1 reply; 20+ messages in thread
From: Vivek Goyal @ 2011-08-18 17:15 UTC (permalink / raw)
To: Michael Holzheu
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> On s390 we do not create page tables at all for the crashkernel memory.
> This requires a s390 specific version for kimage_load_crash_segment().
> Therefore this patch declares this function as "__weak". The s390 version is
> very simple. It just copies the kexec segment to real memory without using
> page tables:
>
> int kimage_load_crash_segment(struct kimage *image,
> struct kexec_segment *segment)
> {
> return copy_from_user_real((void *) segment->mem, segment->buf,
> segment->bufsz);
> }
>
> There are two main advantages of not creating page tables for the
> crashkernel memory:
>
> a) It saves memory. We have scenarios in mind, where crashkernel
> memory can be very large and saving page table space is important.
> b) We protect the crashkernel memory from being overwritten.
Michael,
Thinking more about it: can't we provide an arch-specific version of
kmap() and kunmap() so that we create temporary mappings to copy
the pages, and then tear them down again? That way you don't waste space
and you don't risk overwriting the loaded kernel.
Exporting the knowledge of generic kexec segments to arch code sounds
like a slightly odd choice.
Thanks
Vivek
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter
2011-08-18 8:47 ` Michael Holzheu
@ 2011-08-18 17:28 ` Vivek Goyal
2011-08-18 17:56 ` Michael Holzheu
0 siblings, 1 reply; 20+ messages in thread
From: Vivek Goyal @ 2011-08-18 17:28 UTC (permalink / raw)
To: Michael Holzheu
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
On Thu, Aug 18, 2011 at 10:47:59AM +0200, Michael Holzheu wrote:
> Hello Vivek,
>
> On Wed, 2011-08-17 at 17:05 -0400, Vivek Goyal wrote:
> > On Fri, Aug 12, 2011 at 03:48:52PM +0200, Michael Holzheu wrote:
> > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > >
> > > Currently only the address of the pre-allocated ELF header is passed with
> > > the elfcorehdr= kernel parameter. In order to reserve memory for the header
> > > in the 2nd kernel also the size is required. Current kdump architecture
> > > backends use different methods to do that, e.g. x86 uses the memmap= kernel
> > > parameter. On s390 there is no easy way to transfer this information.
> > > Therefore the elfcorehdr kernel parameter is extended to also pass the size.
> > > This now can also be used as standard mechanism by all future kdump
> > > architecture backends.
> >
> > Michael,
> >
> > This version looks much better. A quick question. Who parses this
> > elfcorehdr= parameter in s390 and how do we make sure these headers are not
> > overwritten.
>
> The parameter is parsed in common code (kernel/crash_dump.c) in
> early_param("elfcorehdr", setup_elfcorehdr), as it is already currently
> the case.
>
> We use address and size of the ELF core header to reserve the header
> memory in setup.c (see patch #8):
>
> +#ifdef CONFIG_CRASH_DUMP
> + if (is_kdump_kernel())
> + reserve_bootmem(elfcorehdr_addr - OLDMEM_BASE,
> + PAGE_ALIGN(elfcorehdr_size), BOOTMEM_DEFAULT);
> +#endif
>
> Does that answer your question?
Yes it does. Thanks.
It brings up a few more questions about the rest of the memory management.
So the kdump kernel is loaded in the reserved area but does not run from there.
It reloads itself into lower memory areas and swaps the contents of
lower memory with reserved memory? If yes, who does it, the kernel or
purgatory?
How does the kernel come to know how much memory is to be swapped,
and how do you bound the memory usage of the second kernel so that it
does not try to use other memory which has not been swapped into
the reserved area?
Thanks
Vivek
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter
2011-08-18 17:28 ` Vivek Goyal
@ 2011-08-18 17:56 ` Michael Holzheu
0 siblings, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-18 17:56 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
Hello Vivek,
On Thu, 2011-08-18 at 13:28 -0400, Vivek Goyal wrote:
> > The parameter is parsed in common code (kernel/crash_dump.c) in
> > early_param("elfcorehdr", setup_elfcorehdr), as it is already currently
> > the case.
> >
> > We use address and size of the ELF core header to reserve the header
> > memory in setup.c (see patch #8):
> >
> > +#ifdef CONFIG_CRASH_DUMP
> > + if (is_kdump_kernel())
> > + reserve_bootmem(elfcorehdr_addr - OLDMEM_BASE,
> > + PAGE_ALIGN(elfcorehdr_size), BOOTMEM_DEFAULT);
> > +#endif
> >
> > Does that answer your question?
>
> Yes it does. Thanks.
>
> It brings up few more questions about rest of the memory management.
>
> So kdump kernel is loaded in reserved area but does not run from there.
> It reloads itself into lower memory areas and swaps the contents of
> lower memory with reserved memory? If yes, how does it, kernel or
> purgatory?
With the v3 patch series purgatory does that (see
post_verification_setup_arch() in purgatory-s390.c).
> How does kernel come to know about how much memory is to be swapped
The 2nd kernel knows crash_base and the size of the crash area because
kexec-tools passes them in. We do it the same way as registering the ramdisk.
See kexec-image.c:
+ if (info->kexec_flags & KEXEC_ON_CRASH) {
+ tmp = krnl_buffer + OLDMEM_BASE_OFFS;
+ *tmp = crash_base;
+
+ tmp = krnl_buffer + OLDMEM_SIZE_OFFS;
+ *tmp = crash_end - crash_base + 1;
+ }
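
As a worked example of the hunk above, the following sketch stores the base and the size of an inclusive [crash_base, crash_end] range into a buffer. The two offsets are made-up small values chosen for illustration, not the real OLDMEM_BASE_OFFS/OLDMEM_SIZE_OFFS from kexec-tools, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offsets; the real ones are defined in kexec-tools. */
#define OLDMEM_BASE_OFFS_SKETCH 0x10
#define OLDMEM_SIZE_OFFS_SKETCH 0x18

/* Store base and size of the inclusive range [crash_base, crash_end]. */
static void store_oldmem_info(unsigned char *krnl_buffer,
			      uint64_t crash_base, uint64_t crash_end)
{
	uint64_t size;

	assert(crash_end >= crash_base);
	/* crash_end is the last valid byte, hence the "+ 1". */
	size = crash_end - crash_base + 1;

	memcpy(krnl_buffer + OLDMEM_BASE_OFFS_SKETCH, &crash_base,
	       sizeof(crash_base));
	memcpy(krnl_buffer + OLDMEM_SIZE_OFFS_SKETCH, &size, sizeof(size));
}

/* Read a 64-bit value back without relying on pointer alignment. */
static uint64_t load_u64(const unsigned char *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

For a crash area from 0x10000000 to 0x17ffffff this stores a size of 0x8000000 (128 MB), not 0x7ffffff.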
> and how do you bound the memory usage of second kernel so that it
> does not try to use other memory which has not been swapped into
> reserved area.
See kernel patches setup.c:
+/*
+ * Make sure that oldmem, where the dump is stored, is protected
+ */
+static void reserve_oldmem(void)
+{
+#ifdef CONFIG_CRASH_DUMP
+ if (!is_kdump_kernel())
+ return;
+
+ reserve_kdump_bootmem(OLDMEM_BASE, OLDMEM_SIZE, CHUNK_OLDMEM);
+ reserve_kdump_bootmem(OLDMEM_SIZE, memory_end - OLDMEM_SIZE,
+ CHUNK_OLDMEM);
+ if (OLDMEM_BASE + OLDMEM_SIZE == real_memory_size)
+ saved_max_pfn = PFN_DOWN(OLDMEM_BASE) - 1;
+ else
+ saved_max_pfn = PFN_DOWN(real_memory_size) - 1;
+#endif
+}
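
The saved_max_pfn selection at the end of reserve_oldmem() can be checked with concrete numbers. The sketch below is a userspace model that assumes 4 KB pages (PAGE_SHIFT 12); saved_max_pfn_sketch() is a hypothetical helper that only mirrors the if/else above.

```c
#include <assert.h>

#define PAGE_SHIFT_SKETCH 12
#define PFN_DOWN_SKETCH(x) ((x) >> PAGE_SHIFT_SKETCH)

/*
 * If the crash area sits at the very end of real memory, the old kernel's
 * memory ends right below oldmem_base; otherwise it extends up to
 * real_memory_size.
 */
static unsigned long long saved_max_pfn_sketch(unsigned long long oldmem_base,
					       unsigned long long oldmem_size,
					       unsigned long long real_memory_size)
{
	assert(oldmem_base + oldmem_size <= real_memory_size);

	if (oldmem_base + oldmem_size == real_memory_size)
		return PFN_DOWN_SKETCH(oldmem_base) - 1;
	return PFN_DOWN_SKETCH(real_memory_size) - 1;
}
```

With a 128 MB crash area at 256 MB, a machine with 384 MB yields 0xffff (the last page below the crash area), while a machine with 512 MB yields 0x1ffff.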
Michael
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-18 17:15 ` Vivek Goyal
@ 2011-08-19 13:27 ` Michael Holzheu
2011-08-19 13:48 ` Vivek Goyal
0 siblings, 1 reply; 20+ messages in thread
From: Michael Holzheu @ 2011-08-19 13:27 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel, hbabu,
horms, ebiederm, schwidefsky, kexec
Hello Vivek,
On Thu, 2011-08-18 at 13:15 -0400, Vivek Goyal wrote:
> On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> >
> > On s390 we do not create page tables at all for the crashkernel memory.
> > This requires a s390 specific version for kimage_load_crash_segment().
> > Therefore this patch declares this function as "__weak". The s390 version is
> > very simple. It just copies the kexec segment to real memory without using
> > page tables:
> >
> > int kimage_load_crash_segment(struct kimage *image,
> > struct kexec_segment *segment)
> > {
> > return copy_from_user_real((void *) segment->mem, segment->buf,
> > segment->bufsz);
> > }
> >
> > There are two main advantages of not creating page tables for the
> > crashkernel memory:
> >
> > a) It saves memory. We have scenarios in mind, where crashkernel
> > memory can be very large and saving page table space is important.
> > b) We protect the crashkernel memory from being overwritten.
>
> Michael,
>
> Thinking more about it. Can't we provide a arch specific version of
> kmap() and kunmap() so that we create temporary mappings to copy
> the pages and then these are torn off.
Isn't kmap/kunmap() used for highmem? These functions are called from
many different places in the Linux kernel, not only for kdump. I
would assume that creating and removing mappings with these functions is
not what a caller would expect and would probably break the Linux kernel
in many other places, no?
Perhaps we can finish this discussion after my vacation. I will change
my patch series so that we do not even need this patch...
So only two common code patches remain. I will send the common
code patches again and will ask Andrew Morton to integrate them in the
next merge window. The s390 patches will be integrated by Martin.
Michael
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-19 13:27 ` Michael Holzheu
@ 2011-08-19 13:48 ` Vivek Goyal
2011-08-19 14:02 ` Michael Holzheu
2011-08-19 14:28 ` Martin Schwidefsky
0 siblings, 2 replies; 20+ messages in thread
From: Vivek Goyal @ 2011-08-19 13:48 UTC (permalink / raw)
To: Michael Holzheu
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel,
linux-mm, hbabu, horms, ebiederm, schwidefsky, kexec
On Fri, Aug 19, 2011 at 03:27:52PM +0200, Michael Holzheu wrote:
> Hello Vivek,
>
> On Thu, 2011-08-18 at 13:15 -0400, Vivek Goyal wrote:
> > On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > >
> > > On s390 we do not create page tables at all for the crashkernel memory.
> > > This requires a s390 specific version for kimage_load_crash_segment().
> > > Therefore this patch declares this function as "__weak". The s390 version is
> > > very simple. It just copies the kexec segment to real memory without using
> > > page tables:
> > >
> > > int kimage_load_crash_segment(struct kimage *image,
> > > struct kexec_segment *segment)
> > > {
> > > return copy_from_user_real((void *) segment->mem, segment->buf,
> > > segment->bufsz);
> > > }
> > >
> > > There are two main advantages of not creating page tables for the
> > > crashkernel memory:
> > >
> > > a) It saves memory. We have scenarios in mind, where crashkernel
> > > memory can be very large and saving page table space is important.
> > > b) We protect the crashkernel memory from being overwritten.
> >
> > Michael,
> >
> > Thinking more about it. Can't we provide a arch specific version of
> > kmap() and kunmap() so that we create temporary mappings to copy
> > the pages and then these are torn off.
>
> Isn't kmap/kunmap() used for highmem? These functions are called from
> many different functions in the Linux kernel, not only for kdump. I
> would assume that creating and removing mappings with these functions is
> not what a caller would expect and probably would break the Linux kernel
> at many other places, no?
[CCing linux-mm]
Yes, it is used for highmem pages. If an arch has not defined kmap(),
the generic definition just returns page_address(page), expecting
that the page is already mapped.
I was wondering what would break if an arch decided to extend this
to create temporary mappings for pages which are not HIGHMEM but do
not have any mapping (like this special case on s390).
I guess the memory management guys can give a better answer here. As a layman,
kmap() seems to be the way to get a kernel mapping for any page frame,
and if one is not already there, then the arch might create one on the fly,
like we do for HIGHMEM pages. So the question is: can we extend this
to also cover pages which are not highmem but do not have any mappings
on s390?
>
> Perhaps we can finish this discussion after my vacation. I will change
> my patch series that we even do not need this patch...
So how are you planning to get rid of this patch without modifying kmap(),
kunmap() implementation for s390?
>
> So only two common code patches are remaining. I will send the common
> code patches again and will ask Andrew Morton to integrate them in the
> next merge window. The s390 patches will be integrated by Martin.
I am fine with merging the other 2 common patches. Once you repost the
series, I will ack those.
Thanks
Vivek
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-19 13:48 ` Vivek Goyal
@ 2011-08-19 14:02 ` Michael Holzheu
2011-08-19 14:28 ` Martin Schwidefsky
1 sibling, 0 replies; 20+ messages in thread
From: Michael Holzheu @ 2011-08-19 14:02 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel,
linux-mm, hbabu, horms, ebiederm, schwidefsky, kexec
Hello Vivek,
On Fri, 2011-08-19 at 09:48 -0400, Vivek Goyal wrote:
[snip]
> > > Michael,
> > >
> > > Thinking more about it. Can't we provide a arch specific version of
> > > kmap() and kunmap() so that we create temporary mappings to copy
> > > the pages and then these are torn off.
> >
> > Isn't kmap/kunmap() used for highmem? These functions are called from
> > many different functions in the Linux kernel, not only for kdump. I
> > would assume that creating and removing mappings with these functions is
> > not what a caller would expect and probably would break the Linux kernel
> > at many other places, no?
>
> [CCing linux-mm]
>
> Yes it is being used for highmem pages. If arch has not defined kmap()
> then generic definition is just returning page_address(page), expecting
> that page will be mapped.
>
> I was wondering that what will be broken if arch decides to extend this
> to create temporary mappings for pages which are not HIGHMEM but do
> not have any mapping. (Like this special case of s390).
At the very least we would have significant additional overhead in all the
other places where kmap/kunmap is called.
> I guess memory management guys can give a better answer here. As a layman,
> kmap() seems to be the way to get a kernel mapping for any page frame
> and if one is not already there, then arch might create one on the fly,
> like we do for HIGHMEM pages. So the question is can we extend this
> to also cover pages which are not highmem but do not have any mappings
> on s390.
>
> >
> > Perhaps we can finish this discussion after my vacation. I will change
> > my patch series that we even do not need this patch...
>
> So how are you planning to get rid of this patch without modifying kmap(),
> kunmap() implementation for s390?
I will update my patch series so that we do not remove page tables for
crashkernel memory. So everything will be as on other architectures.
I hope that we can find a good solution after my vacation. Perhaps then
I will have enough energy again :-)
> > So only two common code patches are remaining. I will send the common
> > code patches again and will ask Andrew Morton to integrate them in the
> > next merge window. The s390 patches will be integrated by Martin.
>
> I am fine with merge of other 2 common patches. Once you repost the
> series, I will ack those.
Great! I will resend the patches and contact Andrew Morton.
Thanks!
Michael
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-19 13:48 ` Vivek Goyal
2011-08-19 14:02 ` Michael Holzheu
@ 2011-08-19 14:28 ` Martin Schwidefsky
2011-08-19 14:37 ` Vivek Goyal
1 sibling, 1 reply; 20+ messages in thread
From: Martin Schwidefsky @ 2011-08-19 14:28 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel,
linux-mm, hbabu, horms, ebiederm, Michael Holzheu, kexec
On Fri, 19 Aug 2011 09:48:36 -0400
Vivek Goyal <vgoyal@redhat.com> wrote:
> On Fri, Aug 19, 2011 at 03:27:52PM +0200, Michael Holzheu wrote:
> > Hello Vivek,
> >
> > On Thu, 2011-08-18 at 13:15 -0400, Vivek Goyal wrote:
> > > On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> > > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > > >
> > > > On s390 we do not create page tables at all for the crashkernel memory.
> > > > This requires a s390 specific version for kimage_load_crash_segment().
> > > > Therefore this patch declares this function as "__weak". The s390 version is
> > > > very simple. It just copies the kexec segment to real memory without using
> > > > page tables:
> > > >
> > > > int kimage_load_crash_segment(struct kimage *image,
> > > > struct kexec_segment *segment)
> > > > {
> > > > return copy_from_user_real((void *) segment->mem, segment->buf,
> > > > segment->bufsz);
> > > > }
> > > >
> > > > There are two main advantages of not creating page tables for the
> > > > crashkernel memory:
> > > >
> > > > a) It saves memory. We have scenarios in mind, where crashkernel
> > > > memory can be very large and saving page table space is important.
> > > > b) We protect the crashkernel memory from being overwritten.
> > >
> > > Michael,
> > >
> > > Thinking more about it. Can't we provide a arch specific version of
> > > kmap() and kunmap() so that we create temporary mappings to copy
> > > the pages and then these are torn off.
> >
> > Isn't kmap/kunmap() used for highmem? These functions are called from
> > many different functions in the Linux kernel, not only for kdump. I
> > would assume that creating and removing mappings with these functions is
> > not what a caller would expect and probably would break the Linux kernel
> > at many other places, no?
>
> [CCing linux-mm]
>
> Yes it is being used for highmem pages. If arch has not defined kmap()
> then generic definition is just returning page_address(page), expecting
> that page will be mapped.
>
> I was wondering that what will be broken if arch decides to extend this
> to create temporary mappings for pages which are not HIGHMEM but do
> not have any mapping. (Like this special case of s390).
>
> I guess memory management guys can give a better answer here. As a layman,
> kmap() seems to be the way to get a kernel mapping for any page frame
> and if one is not already there, then arch might create one on the fly,
> like we do for HIGHMEM pages. So the question is can we extend this
> to also cover pages which are not highmem but do not have any mappings
> on s390.
Imho it would be wrong to misuse kmap/kunmap to get around a minor problem
with the memory for the crash kernel. These functions are used to provide
accessibility to highmem pages in the kernel address space. The highmem
area is "normal" memory with corresponding struct page elements (the
functions do take a struct page * as argument after all). They are not
usable to map arbitrary page frames.
And we definitely don't want to make the memory management any slower by
defining non-trivial kmap/kunmap functions.
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-19 14:28 ` Martin Schwidefsky
@ 2011-08-19 14:37 ` Vivek Goyal
2011-08-19 14:44 ` Martin Schwidefsky
0 siblings, 1 reply; 20+ messages in thread
From: Vivek Goyal @ 2011-08-19 14:37 UTC (permalink / raw)
To: Martin Schwidefsky
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel,
linux-mm, hbabu, horms, ebiederm, Michael Holzheu, kexec
On Fri, Aug 19, 2011 at 04:28:54PM +0200, Martin Schwidefsky wrote:
> On Fri, 19 Aug 2011 09:48:36 -0400
> Vivek Goyal <vgoyal@redhat.com> wrote:
>
> > On Fri, Aug 19, 2011 at 03:27:52PM +0200, Michael Holzheu wrote:
> > > Hello Vivek,
> > >
> > > On Thu, 2011-08-18 at 13:15 -0400, Vivek Goyal wrote:
> > > > On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> > > > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > > > >
> > > > > On s390 we do not create page tables at all for the crashkernel memory.
> > > > > This requires a s390 specific version for kimage_load_crash_segment().
> > > > > Therefore this patch declares this function as "__weak". The s390 version is
> > > > > very simple. It just copies the kexec segment to real memory without using
> > > > > page tables:
> > > > >
> > > > > int kimage_load_crash_segment(struct kimage *image,
> > > > > struct kexec_segment *segment)
> > > > > {
> > > > > return copy_from_user_real((void *) segment->mem, segment->buf,
> > > > > segment->bufsz);
> > > > > }
> > > > >
> > > > > There are two main advantages of not creating page tables for the
> > > > > crashkernel memory:
> > > > >
> > > > > a) It saves memory. We have scenarios in mind, where crashkernel
> > > > > memory can be very large and saving page table space is important.
> > > > > b) We protect the crashkernel memory from being overwritten.
> > > >
> > > > Michael,
> > > >
> > > > Thinking more about it. Can't we provide a arch specific version of
> > > > kmap() and kunmap() so that we create temporary mappings to copy
> > > > the pages and then these are torn off.
> > >
> > > Isn't kmap/kunmap() used for highmem? These functions are called from
> > > many different functions in the Linux kernel, not only for kdump. I
> > > would assume that creating and removing mappings with these functions is
> > > not what a caller would expect and probably would break the Linux kernel
> > > at many other places, no?
> >
> > [CCing linux-mm]
> >
> > Yes it is being used for highmem pages. If arch has not defined kmap()
> > then generic definition is just returning page_address(page), expecting
> > that page will be mapped.
> >
> > I was wondering that what will be broken if arch decides to extend this
> > to create temporary mappings for pages which are not HIGHMEM but do
> > not have any mapping. (Like this special case of s390).
> >
> > I guess memory management guys can give a better answer here. As a layman,
> > kmap() seems to be the way to get a kernel mapping for any page frame
> > and if one is not already there, then arch might create one on the fly,
> > like we do for HIGHMEM pages. So the question is can we extend this
> > to also cover pages which are not highmem but do not have any mappings
> > on s390.
>
> Imho it would be wrong to misuse kmap/kunmap to get around a minor problem
> with the memory for the crash kernel. These functions are used to provide
> accessibility to highmem pages in the kernel address space. The highmem
> area is "normal" memory with corresponding struct page elements (the
> functions do take a struct page * as argument after all). They are not
> usable to map arbitrary page frames.
The same is the case with crashkernel memory on s390, where these are
normal pages, just not mapped in the linearly mapped region.
The only exception for highmem pages is that the linearly mapped region is
not big enough on certain arches, so we left them unmapped.
>
> And we definitely don't want to make the memory management any slower by
> defining non-trivial kmap/kunmap functions.
If we continue to return page_address() and only go into the else branch if
the page is not mapped, then there should not be any slowdown for existing
cases where memory is mapped?
Anyway, this was just a thought. I am not too particular about it, and
Michael has agreed to get rid of the code which was removing mappings for
the crashkernel area. So for the time being we don't need the above.
Thanks
Vivek
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak
2011-08-19 14:37 ` Vivek Goyal
@ 2011-08-19 14:44 ` Martin Schwidefsky
0 siblings, 0 replies; 20+ messages in thread
From: Martin Schwidefsky @ 2011-08-19 14:44 UTC (permalink / raw)
To: Vivek Goyal
Cc: oomichi, linux-s390, mahesh, heiko.carstens, linux-kernel,
linux-mm, hbabu, horms, ebiederm, Michael Holzheu, kexec
On Fri, 19 Aug 2011 10:37:48 -0400
Vivek Goyal <vgoyal@redhat.com> wrote:
> On Fri, Aug 19, 2011 at 04:28:54PM +0200, Martin Schwidefsky wrote:
> > On Fri, 19 Aug 2011 09:48:36 -0400
> > Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > > On Fri, Aug 19, 2011 at 03:27:52PM +0200, Michael Holzheu wrote:
> > > > Hello Vivek,
> > > >
> > > > On Thu, 2011-08-18 at 13:15 -0400, Vivek Goyal wrote:
> > > > > On Fri, Aug 12, 2011 at 03:48:51PM +0200, Michael Holzheu wrote:
> > > > > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > > > > >
> > > > > > On s390 we do not create page tables at all for the crashkernel memory.
> > > > > > This requires a s390 specific version for kimage_load_crash_segment().
> > > > > > Therefore this patch declares this function as "__weak". The s390 version is
> > > > > > very simple. It just copies the kexec segment to real memory without using
> > > > > > page tables:
> > > > > >
> > > > > > int kimage_load_crash_segment(struct kimage *image,
> > > > > > struct kexec_segment *segment)
> > > > > > {
> > > > > > return copy_from_user_real((void *) segment->mem, segment->buf,
> > > > > > segment->bufsz);
> > > > > > }
> > > > > >
> > > > > > There are two main advantages of not creating page tables for the
> > > > > > crashkernel memory:
> > > > > >
> > > > > > a) It saves memory. We have scenarios in mind, where crashkernel
> > > > > > memory can be very large and saving page table space is important.
> > > > > > b) We protect the crashkernel memory from being overwritten.
> > > > >
> > > > > Michael,
> > > > >
> > > > > Thinking more about it. Can't we provide a arch specific version of
> > > > > kmap() and kunmap() so that we create temporary mappings to copy
> > > > > the pages and then these are torn off.
> > > >
> > > > Isn't kmap/kunmap() used for highmem? These functions are called from
> > > > many different functions in the Linux kernel, not only for kdump. I
> > > > would assume that creating and removing mappings with these functions is
> > > > not what a caller would expect and probably would break the Linux kernel
> > > > at many other places, no?
> > >
> > > [CCing linux-mm]
> > >
> > > Yes it is being used for highmem pages. If arch has not defined kmap()
> > > then generic definition is just returning page_address(page), expecting
> > > that page will be mapped.
> > >
> > > I was wondering that what will be broken if arch decides to extend this
> > > to create temporary mappings for pages which are not HIGHMEM but do
> > > not have any mapping. (Like this special case of s390).
> > >
> > > I guess memory management guys can give a better answer here. As a layman,
> > > kmap() seems to be the way to get a kernel mapping for any page frame
> > > and if one is not already there, then arch might create one on the fly,
> > > like we do for HIGHMEM pages. So the question is can we extend this
> > > to also cover pages which are not highmem but do not have any mappings
> > > on s390.
> >
> > Imho it would be wrong to misuse kmap/kunmap to get around a minor problem
> > with the memory for the crash kernel. These functions are used to provide
> > accessibility to highmem pages in the kernel address space. The highmem
> > area is "normal" memory with corresponding struct page elements (the
> > functions do take a struct page * as argument after all). They are not
> > usable to map arbitrary page frames.
>
> Same is the case with crashkernel memory in s390 where these are
> normal pages, just that these are not mapped in linearly mapped region.
>
> The only exception to highmem pages is that linearly mapped region is
> not big enough in certain arches, so we left them unmapped.
Well, the crashkernel memory in s390 is addressable without a page table
but is not included in the mem_map array (no struct page). For highmem
pages it is the other way round: they are not addressable without a page
table entry, but they are included in mem_map. That makes all the
difference, no?
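
To make the distinction concrete, here is a minimal user-space model in C.
It is only a sketch: MEM_MAP_FRAMES, struct page_model and the *_model
functions are hypothetical names invented for illustration, not kernel
interfaces. It shows that an interface which takes a page descriptor, as
kmap() does, has nothing to operate on for a frame that mem_map does not
cover.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: only frames 0..7 are covered by the modelled mem_map. */
#define MEM_MAP_FRAMES 8UL

/* Stand-in for struct page; it only records the highmem property. */
struct page_model {
	int highmem;	/* 1: reachable only through a page table entry */
};

static struct page_model mem_map_model[MEM_MAP_FRAMES];

/* Return the descriptor for a frame, or NULL if mem_map does not cover it. */
struct page_model *pfn_to_page_model(unsigned long pfn)
{
	return pfn < MEM_MAP_FRAMES ? &mem_map_model[pfn] : NULL;
}

/* Highmem frames always have a descriptor, so marking one must succeed. */
void set_highmem_model(unsigned long pfn)
{
	struct page_model *page = pfn_to_page_model(pfn);

	assert(page != NULL);
	page->highmem = 1;
}

/* A kmap()-style call needs a descriptor; report whether one exists. */
int kmap_usable_model(unsigned long pfn)
{
	return pfn_to_page_model(pfn) != NULL;
}
```

In this model a highmem frame such as 5 can be handed to a kmap()-style
function, while a frame outside the array (standing in for crashkernel
memory) cannot, even though the real memory would be addressable.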
> >
> > And we definitely don't want to make the memory management any slower by
> > defining non-trivial kmap/kunmap functions.
>
> If we continue to return page_address() and only go into the else branch if
> the page is not mapped, then there should not be any slowdown for existing
> cases where memory is mapped?
>
> Anyway, this was just a thought. I am not too particular about it and
> Michael has agreed to get rid of the code which was removing mappings for
> the crashkernel area. So for the time being we don't need the above.
Then we can use the standard kimage_load_crash_segment function. But we
should think about adding code that protects the crash kernel memory,
e.g. by making the memory area read-only in the kernel page table. One
reason why we wanted to exclude the crash kernel memory from the memory
that is seen by the Linux kernel is protection from wild stores.
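
As a rough user-space analogy of the read-only idea (not kernel code), the
sketch below copies an image into an anonymous mapping, drops write
permission with mprotect(), and checks that a stray store faults. The names
load_readonly() and store_faults() are hypothetical; in the kernel the
equivalent step would be changing the page table protection of the
crashkernel area.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* Jump back to the probing code when the store is refused. */
static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(fault_env, 1);
}

/* Copy len bytes into a fresh anonymous mapping, then make it read-only. */
void *load_readonly(const void *buf, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t size = (len + page - 1) / page * page;
	void *mem;

	assert(buf != NULL && len > 0);
	mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return NULL;
	memcpy(mem, buf, len);
	if (mprotect(mem, size, PROT_READ) != 0) {
		munmap(mem, size);
		return NULL;
	}
	return mem;
}

/* Return 1 if a one-byte store to addr faults, 0 if it goes through. */
int store_faults(volatile char *addr)
{
	struct sigaction sa, old_segv, old_bus;
	int faulted;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	if (sigsetjmp(fault_env, 1) == 0) {
		*addr = 0x5a;	/* the "wild store" */
		faulted = 0;
	} else {
		faulted = 1;	/* the handler brought us back here */
	}

	/* Restore the previous handlers before returning. */
	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	return faulted;
}
```

Calling store_faults() on the pointer returned by load_readonly() should
report a fault and leave the copied image unchanged, which is the property
we would want for the loaded crash kernel.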
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.
Thread overview: 20+ messages
2011-08-12 13:48 [patch v3 0/8] kdump: Patch series for s390 support (version 3) Michael Holzheu
2011-08-12 13:48 ` [patch v3 1/8] kdump: Add KEXEC_CRASH_CONTROL_MEMORY_LIMIT Michael Holzheu
2011-08-12 13:48 ` [patch v3 2/8] kdump: Make kimage_load_crash_segment() weak Michael Holzheu
2011-08-18 17:15 ` Vivek Goyal
2011-08-19 13:27 ` Michael Holzheu
2011-08-19 13:48 ` Vivek Goyal
2011-08-19 14:02 ` Michael Holzheu
2011-08-19 14:28 ` Martin Schwidefsky
2011-08-19 14:37 ` Vivek Goyal
2011-08-19 14:44 ` Martin Schwidefsky
2011-08-12 13:48 ` [patch v3 3/8] kdump: Add size to elfcorehdr kernel parameter Michael Holzheu
2011-08-17 21:05 ` Vivek Goyal
2011-08-18 8:47 ` Michael Holzheu
2011-08-18 17:28 ` Vivek Goyal
2011-08-18 17:56 ` Michael Holzheu
2011-08-12 13:48 ` [patch v3 4/8] s390: Add real memory access functions Michael Holzheu
2011-08-12 13:48 ` [patch v3 5/8] s390: kdump backend code Michael Holzheu
2011-08-12 13:48 ` [patch v3 6/8] s390: Do first kdump checksum test before really starting kdump Michael Holzheu
2011-08-12 13:48 ` [patch v3 7/8] kexec-tools: Add s390 kdump support Michael Holzheu
2011-08-12 13:48 ` [patch v3 8/8] kexec-tools: Allow to call verify_sha256_digest() from kernel Michael Holzheu