public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [Patch 0/8] V4 Implement crashkernel=auto
@ 2009-08-21  6:54 Amerigo Wang
  2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
                   ` (8 more replies)
  0 siblings, 9 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman,
	kamezawa.hiroyu, Andi Kleen, Amerigo Wang, akpm, bernhard.walle,
	Fenghua Yu, Ingo Molnar, Anton Vorontsov

V3 -> V4:
 - Reorder the patches.
 - Really free the reserved memory, instead of remapping it.
   (Thanks to KAMEZAWA Hiroyuki!)
 - Release the reserved memory resource when the size is 0.
 - Use strict_strtoul() instead of simple_strtoul().

V2 -> V3:
 - Use more clever way to calculate reserved memory size, especially for IA64.
 - Add that patch that implements shrinking reserved memory

V1 -> V2:
 - Use include/asm-generic/kexec.h, suggested by Neil.
 - Rename a local variable, suggested by Fenghua.
 - Fix some style problems found by checkpatch.pl.
 - Unify the Kconfig docs.

This series of patch implements automatically reserved memory for crashkernel,
by introducing a new boot option "crashkernel=auto". This idea is from Neil.

In case of breaking user-space applications, it modifies this boot option after
it decides how much memory should be reserved.

On different arch, the threshold and reserved memory size is different. Please
refer patch 8/8 which contains an update for the documentation.

Patch 1/8 implements shrinking reserved memory at run-time, which is useful
when more than enough memory is reserved automatically.

Note: This patchset was only tested on x86_64 with differernt memory sizes.

Cc: Neil Horman <nhorman@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Bernhard Walle <bernhard.walle@gmx.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: WANG Cong <amwang@redhat.com>

---
 Documentation/kdump/kdump.txt    |   28 ++++++
 arch/ia64/Kconfig                |   14 +++
 arch/ia64/include/asm/kexec.h    |   23 ++++
 arch/powerpc/Kconfig             |   11 ++
 arch/powerpc/include/asm/kexec.h |    8 +
 arch/x86/Kconfig                 |   13 ++
 arch/x86/include/asm/kexec.h     |    1 
 include/asm-generic/kexec.h      |   42 +++++++++
 include/linux/kexec.h            |    5 +
 kernel/kexec.c                   |  180 +++++++++++++++++++++++++++++++++++++++
 kernel/ksysfs.c                  |   46 +++++++++
 11 files changed, 371 insertions(+)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 1/8] kexec: allow to shrink reserved memory
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
@ 2009-08-21  6:54 ` Amerigo Wang
  2009-08-22  0:17   ` Andrew Morton
  2009-08-22  1:39   ` Eric W. Biederman
  2009-08-21  6:54 ` [Patch 2/8] x86: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman, Andi Kleen,
	Ingo Molnar, Amerigo Wang, akpm, bernhard.walle, Fenghua Yu,
	kamezawa.hiroyu, Anton Vorontsov

This patch implements shrinking the reserved memory for crash kernel,
if it is more than enough.

For example, if you have already reserved 128M, now you just want 100M,
you can do:

# echo $((100*1024*1024)) > /sys/kernel/kexec_crash_size

Note, you can only do this before loading the crash kernel.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

---

Index: linux-2.6/include/linux/kexec.h
===================================================================
--- linux-2.6.orig/include/linux/kexec.h
+++ linux-2.6/include/linux/kexec.h
@@ -206,6 +206,8 @@ extern size_t vmcoreinfo_max_size;
 
 int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
+int shrink_crash_memory(unsigned long new_size);
+size_t get_crash_memory_size(void);
 
 #else /* !CONFIG_KEXEC */
 struct pt_regs;
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -31,6 +31,7 @@
 #include <linux/cpu.h>
 #include <linux/console.h>
 #include <linux/vmalloc.h>
+#include <linux/swap.h>
 
 #include <asm/page.h>
 #include <asm/uaccess.h>
@@ -1083,6 +1084,58 @@ void crash_kexec(struct pt_regs *regs)
 	}
 }
 
+size_t get_crash_memory_size(void)
+{
+	size_t size;
+	mutex_lock(&kexec_mutex);
+	size = crashk_res.end - crashk_res.start + 1;
+	mutex_unlock(&kexec_mutex);
+	return size;
+}
+
+int shrink_crash_memory(unsigned long new_size)
+{
+	int ret = 0;
+	unsigned long addr;
+	unsigned long start, end;
+
+	mutex_lock(&kexec_mutex);
+
+	if (kexec_crash_image) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+	start = crashk_res.start;
+	end = crashk_res.end;
+
+	if (new_size >= end - start + 1) {
+		ret = -EINVAL;
+		if (new_size == end - start + 1)
+			ret = 0;
+		goto unlock;
+	}
+
+	start = roundup(start, PAGE_SIZE);
+	end = roundup(start + new_size, PAGE_SIZE);
+
+	for (addr = end; addr < crashk_res.end; addr += PAGE_SIZE) {
+		ClearPageReserved(pfn_to_page(addr >> PAGE_SHIFT));
+		init_page_count(pfn_to_page(addr >> PAGE_SHIFT));
+		free_page((unsigned long)__va(addr));
+		totalram_pages++;
+	}
+
+	if (start == end) {
+		crashk_res.end = end;
+		release_resource(&crashk_res);
+	} else
+		crashk_res.end = end - 1;
+
+unlock:
+	mutex_unlock(&kexec_mutex);
+	return ret;
+}
+
 static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data,
 			    size_t data_len)
 {
Index: linux-2.6/kernel/ksysfs.c
===================================================================
--- linux-2.6.orig/kernel/ksysfs.c
+++ linux-2.6/kernel/ksysfs.c
@@ -100,6 +100,26 @@ static ssize_t kexec_crash_loaded_show(s
 }
 KERNEL_ATTR_RO(kexec_crash_loaded);
 
+static ssize_t kexec_crash_size_show(struct kobject *kobj,
+				       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%lu\n", get_crash_memory_size());
+}
+static ssize_t kexec_crash_size_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	unsigned long cnt;
+	int ret;
+
+	if (strict_strtoul(buf, 0, &cnt))
+		return -EINVAL;
+
+	ret = shrink_crash_memory(cnt);
+	return ret < 0 ? ret : count;
+}
+KERNEL_ATTR_RW(kexec_crash_size);
+
 static ssize_t vmcoreinfo_show(struct kobject *kobj,
 			       struct kobj_attribute *attr, char *buf)
 {
@@ -147,6 +167,7 @@ static struct attribute * kernel_attrs[]
 #ifdef CONFIG_KEXEC
 	&kexec_loaded_attr.attr,
 	&kexec_crash_loaded_attr.attr,
+	&kexec_crash_size_attr.attr,
 	&vmcoreinfo_attr.attr,
 #endif
 	NULL

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 2/8] x86: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
  2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
@ 2009-08-21  6:54 ` Amerigo Wang
  2009-08-21  6:54 ` [Patch 3/8] x86: implement crashkernel=auto Amerigo Wang
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman,
	kamezawa.hiroyu, Andi Kleen, Amerigo Wang, akpm, bernhard.walle,
	Fenghua Yu, Ingo Molnar, Anton Vorontsov


Introduce a new config option KEXEC_AUTO_RESERVE for x86.

Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

---

Index: linux-2.6/arch/x86/Kconfig
===================================================================
--- linux-2.6.orig/arch/x86/Kconfig
+++ linux-2.6/arch/x86/Kconfig
@@ -1482,6 +1482,19 @@ config KEXEC
 	  support.  As of this writing the exact hardware interface is
 	  strongly in flux, so no good recommendation can be made.
 
+config KEXEC_AUTO_RESERVE
+	bool "automatically reserve memory for kexec kernel"
+	depends on KEXEC
+	default y
+	---help---
+	  Automatically reserve memory for a kexec kernel, so that you don't
+	  need to specify numbers for the "crashkernel=X@Y" boot option,
+	  instead you can use "crashkernel=auto". To make this work, you need
+	  to have more than 4G memory.
+
+	  On x86_32, 128M is reserved, on x86_64 1/32 of your memory is
+	  reserved, but it will not exceed 1T/32.
+
 config CRASH_DUMP
 	bool "kernel crash dumps"
 	depends on X86_64 || (X86_32 && HIGHMEM)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 3/8] x86: implement crashkernel=auto
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
  2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
  2009-08-21  6:54 ` [Patch 2/8] x86: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
@ 2009-08-21  6:54 ` Amerigo Wang
  2009-08-21  6:54 ` [Patch 4/8] ia64: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman, Andi Kleen,
	Ingo Molnar, Amerigo Wang, akpm, bernhard.walle, Fenghua Yu,
	kamezawa.hiroyu, Anton Vorontsov


Implement "crashkernel=auto" for x86 first, other arch will be added in the
following patches.

The kernel will modify this command line with the actually reserved size,
in case of breaking any user-space programs.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>

---

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -37,6 +37,7 @@
 #include <asm/io.h>
 #include <asm/system.h>
 #include <asm/sections.h>
+#include <asm/setup.h>
 
 /* Per cpu memory for storing cpu states in case of system crash. */
 note_buf_t* crash_notes;
@@ -1297,6 +1298,39 @@ int __init parse_crashkernel(char 		 *cm
 
 	ck_cmdline += 12; /* strlen("crashkernel=") */
 
+#ifdef CONFIG_KEXEC_AUTO_RESERVE
+	if (strncmp(ck_cmdline, "auto", 4) == 0) {
+		unsigned long long size;
+		int len;
+		char tmp[32];
+
+		size = arch_default_crash_size(system_ram);
+		if (size != 0) {
+			*crash_size = size;
+			*crash_base = arch_default_crash_base();
+			len = scnprintf(tmp, sizeof(tmp), "%luM@%luM",
+					(unsigned long)(*crash_size)>>20,
+					(unsigned long)(*crash_base)>>20);
+			/* 'len' can't be <= 4. */
+			if (likely((len - 4 + strlen(cmdline))
+					< COMMAND_LINE_SIZE - 1)) {
+				memmove(ck_cmdline + len, ck_cmdline + 4,
+					strlen(cmdline) - (ck_cmdline + 4 - cmdline) + 1);
+				memcpy(ck_cmdline, tmp, len);
+			}
+			return 0;
+		} else {
+			/*
+			 * We can't reserve memory auotmatcally,
+			 * remove "crashkernel=auto" from cmdline.
+			 */
+			ck_cmdline += 4; /* strlen("auto") */
+			memmove(ck_cmdline - 16, ck_cmdline,
+				strlen(cmdline) - (ck_cmdline - cmdline) + 1);
+			return -ENOMEM;
+		}
+	}
+#endif
 	/*
 	 * if the commandline contains a ':', then that's the extended
 	 * syntax -- if not, it must be the classic syntax
Index: linux-2.6/arch/x86/include/asm/kexec.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/kexec.h
+++ linux-2.6/arch/x86/include/asm/kexec.h
@@ -23,6 +23,7 @@
 
 #include <asm/page.h>
 #include <asm/ptrace.h>
+#include <asm-generic/kexec.h>
 
 /*
  * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
Index: linux-2.6/include/asm-generic/kexec.h
===================================================================
--- /dev/null
+++ linux-2.6/include/asm-generic/kexec.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_GENERIC_KEXEC_H
+#define _ASM_GENERIC_KEXEC_H
+
+#ifdef CONFIG_KEXEC_AUTO_RESERVE
+
+#ifndef KEXEC_AUTO_RESERVED_SIZE
+#define KEXEC_AUTO_RESERVED_SIZE (1ULL<<27) /* 128M */
+#endif
+#ifndef KEXEC_AUTO_THRESHOLD
+#define KEXEC_AUTO_THRESHOLD (1ULL<<32) /* 4G */
+#endif
+
+#ifndef ARCH_HAS_DEFAULT_CRASH_SIZE
+static inline
+unsigned long long arch_default_crash_size(unsigned long long total_size)
+{
+	if (total_size < KEXEC_AUTO_THRESHOLD)
+		return 0;
+	else {
+#ifdef CONFIG_64BIT
+		if (total_size > 1ULL<<40) /* 1TB */
+			return KEXEC_AUTO_RESERVED_SIZE
+				* (1ULL<<40 / KEXEC_AUTO_THRESHOLD);
+		return 1ULL<<ilog2(roundup(total_size/32, 1ULL<<21));
+#else
+		return KEXEC_AUTO_RESERVED_SIZE;
+#endif
+	}
+}
+#endif
+#ifndef ARCH_HAS_DEFAULT_CRASH_BASE
+static inline
+unsigned long long arch_default_crash_base(void)
+{
+	/* 0 means find the base address automatically. */
+	return 0;
+}
+#endif
+
+#endif /* CONFIG_KEXEC_AUTO_RESERVE */
+
+#endif

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 4/8] ia64: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (2 preceding siblings ...)
  2009-08-21  6:54 ` [Patch 3/8] x86: implement crashkernel=auto Amerigo Wang
@ 2009-08-21  6:54 ` Amerigo Wang
  2009-08-21  6:55 ` [Patch 5/8] ia64: implement crashkernel=auto Amerigo Wang
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman,
	kamezawa.hiroyu, Andi Kleen, Amerigo Wang, akpm, bernhard.walle,
	Fenghua Yu, Ingo Molnar, Anton Vorontsov


Introduce a new config option KEXEC_AUTO_RESERVE for ia64.

Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

---

Index: linux-2.6/arch/ia64/Kconfig
===================================================================
--- linux-2.6.orig/arch/ia64/Kconfig
+++ linux-2.6/arch/ia64/Kconfig
@@ -582,6 +582,20 @@ config KEXEC
 	  support.  As of this writing the exact hardware interface is
 	  strongly in flux, so no good recommendation can be made.
 
+config KEXEC_AUTO_RESERVE
+	bool "automatically reserve memory for kexec kernel"
+	depends on KEXEC
+	default y
+	---help---
+	  Automatically reserve memory for a kexec kernel, so that you don't
+	  need to specify numbers for the "crashkernel=X@Y" boot option,
+	  instead you can use "crashkernel=auto". To make this work, you need
+	  to have more than 4G memory.
+
+	  The reserved memory size is different depends on how much memory
+	  you actually have. Please check Documentation/kdump/kdump.txt.
+	  If you doubt, say N.
+
 config CRASH_DUMP
 	  bool "kernel crash dumps"
 	  depends on IA64_MCA_RECOVERY && !IA64_HP_SIM && (!SMP || HOTPLUG_CPU)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (3 preceding siblings ...)
  2009-08-21  6:54 ` [Patch 4/8] ia64: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
@ 2009-08-21  6:55 ` Amerigo Wang
  2009-08-22  0:24   ` Andrew Morton
  2009-08-21  6:55 ` [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman, Andi Kleen,
	Ingo Molnar, Amerigo Wang, akpm, bernhard.walle, Fenghua Yu,
	kamezawa.hiroyu, Anton Vorontsov


Since in patch 3/8 we already implement the generic part, this will
add the rest part for ia64.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>

---

Index: linux-2.6/arch/ia64/include/asm/kexec.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/kexec.h
+++ linux-2.6/arch/ia64/include/asm/kexec.h
@@ -19,6 +19,29 @@
                 flush_icache_range(page_addr, page_addr + PAGE_SIZE); \
         } while(0)
 
+#ifdef CONFIG_KEXEC_AUTO_RESERVE
+#define ARCH_HAS_DEFAULT_CRASH_SIZE
+static inline
+unsigned long long arch_default_crash_size(unsigned long long total_size)
+{
+	if (total_size >= 4ULL<<30 && total_size < 12ULL<<30)
+		return 1ULL<<28;
+	else if (total_size >= 12ULL<<30 && total_size < 128ULL<<30)
+		return 1ULL<<29;
+	else if (total_size >= 128ULL<<30 && total_size < 256ULL<<30)
+		return 3ULL<<28;
+	else if (total_size >= 256ULL<<30 && total_size < 378ULL<<30)
+		return 1ULL<<30;
+	else if (total_size >= 318ULL<<30 && total_size < 512ULL<<30)
+		return 3ULL<<29;
+	else if (total_size >= 512ULL<<30 && total_size < 768ULL<<30)
+		return 2ULL<<30;
+	else if (total_size >= 768ULL<<30)
+		return 3ULL<<30;
+}
+#include <asm-generic/kexec.h>
+#endif
+
 extern struct kimage *ia64_kimage;
 extern const unsigned int relocate_new_kernel_size;
 extern void relocate_new_kernel(unsigned long, unsigned long,

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (4 preceding siblings ...)
  2009-08-21  6:55 ` [Patch 5/8] ia64: implement crashkernel=auto Amerigo Wang
@ 2009-08-21  6:55 ` Amerigo Wang
  2009-08-24 13:44   ` Michael Ellerman
  2009-08-21  6:55 ` [Patch 7/8] powerpc: implement crashkernel=auto Amerigo Wang
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman,
	kamezawa.hiroyu, Andi Kleen, Amerigo Wang, akpm, bernhard.walle,
	Fenghua Yu, Ingo Molnar, Anton Vorontsov


Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.

Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

---

Index: linux-2.6/arch/powerpc/Kconfig
===================================================================
--- linux-2.6.orig/arch/powerpc/Kconfig
+++ linux-2.6/arch/powerpc/Kconfig
@@ -346,6 +346,17 @@ config KEXEC
 	  support.  As of this writing the exact hardware interface is
 	  strongly in flux, so no good recommendation can be made.
 
+config KEXEC_AUTO_RESERVE
+	bool "automatically reserve memory for kexec kernel"
+	depends on KEXEC
+	default y
+	---help---
+	  Automatically reserve memory for a kexec kernel, so that you don't
+	  need to specify numbers for the "crashkernel=X@Y" boot option,
+	  instead you can use "crashkernel=auto". To make this work, you need
+	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
+	  on PPC64, but it will not exceed 1T/32.
+
 config CRASH_DUMP
 	bool "Build a kdump crash kernel"
 	depends on PPC64 || 6xx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 7/8] powerpc: implement crashkernel=auto
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (5 preceding siblings ...)
  2009-08-21  6:55 ` [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
@ 2009-08-21  6:55 ` Amerigo Wang
  2009-08-21  6:55 ` [Patch 8/8] doc: update the kdump document Amerigo Wang
  2009-08-22  0:06 ` [Patch 0/8] V4 Implement crashkernel=auto Andrew Morton
  8 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman, Andi Kleen,
	Ingo Molnar, Amerigo Wang, akpm, bernhard.walle, Fenghua Yu,
	kamezawa.hiroyu, Anton Vorontsov


Since in patch 3/8 we already implement the generic part, this will
add the rest part for powerpc.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>

---

Index: linux-2.6/arch/powerpc/include/asm/kexec.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/kexec.h
+++ linux-2.6/arch/powerpc/include/asm/kexec.h
@@ -39,6 +39,14 @@ typedef void (*crash_shutdown_t)(void);
 
 #ifdef CONFIG_KEXEC
 
+#ifdef CONFIG_KEXEC_AUTO_RESERVE
+#ifdef KEXEC_AUTO_RESERVED_SIZE
+#undef KEXEC_AUTO_RESERVED_SIZE
+#endif
+#define KEXEC_AUTO_RESERVED_SIZE (1ULL<<28) /* 256M */
+#include <asm-generic/kexec.h>
+#endif
+
 /*
  * This function is responsible for capturing register states if coming
  * via panic or invoking dump using sysrq-trigger.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch 8/8] doc: update the kdump document
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (6 preceding siblings ...)
  2009-08-21  6:55 ` [Patch 7/8] powerpc: implement crashkernel=auto Amerigo Wang
@ 2009-08-21  6:55 ` Amerigo Wang
  2009-08-22  0:06 ` [Patch 0/8] V4 Implement crashkernel=auto Andrew Morton
  8 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-21  6:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, linux-ia64, Neil Horman, Eric W. Biederman,
	kamezawa.hiroyu, Andi Kleen, Amerigo Wang, akpm, bernhard.walle,
	Fenghua Yu, Ingo Molnar, Anton Vorontsov


Update the document for kdump.

Signed-off-by: WANG Cong <amwang@redhat.com>

---

Index: linux-2.6/Documentation/kdump/kdump.txt
===================================================================
--- linux-2.6.orig/Documentation/kdump/kdump.txt
+++ linux-2.6/Documentation/kdump/kdump.txt
@@ -147,6 +147,15 @@ System kernel config options
    analysis tools require a vmlinux with debug symbols in order to read
    and analyze a dump file.
 
+4) Enable "automatically reserve memory for kexec kernel" in
+   "Processor type and features."
+
+   CONFIG_KEXEC_AUTO_RESERVE=y
+
+   This will let you to use "crashkernel=auto", instead of specifying
+   numbers for "crashkernel=". Note, you need to have enough memory.
+   The threshold and reserved memory size are arch-dependent.
+
 Dump-capture kernel config options (Arch Independent)
 -----------------------------------------------------
 
@@ -266,6 +275,25 @@ This would mean:
     2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
     3) if the RAM size is larger than 2G, then reserve 128M
 
+Or you can use:
+
+    crashkernel=auto
+
+if you have enough memory. The threshold is 4G, below which this won't work.
+
+The automatically reserved memory size would be 128M on x86_32, 256M on
+ppc, 1/32 of your physical memory size on x86_64 and ppc64 (but it will not
+exceed 1TB/32 if you have more). IA64 has its own policy, shown below:
+
+	Memory size	Reserved memory
+	===========	===============
+	[4G, 12G)	256M
+	[12G, 128G)	512M
+	[128G, 256G)	768M
+	[256G, 378G)	1024M
+	[378G, 512G)	1536M
+	[512G, 768G)	2048M
+	[768G, )	3072M
 
 
 Boot into System Kernel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 0/8] V4 Implement crashkernel=auto
  2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
                   ` (7 preceding siblings ...)
  2009-08-21  6:55 ` [Patch 8/8] doc: update the kdump document Amerigo Wang
@ 2009-08-22  0:06 ` Andrew Morton
  2009-08-24  1:34   ` Amerigo Wang
  8 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2009-08-22  0:06 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm,
	kamezawa.hiroyu, andi, amwang, bernhard.walle, fenghua.yu, mingo,
	avorontsov, linuxppc-dev

(cc linuxppc-dev@ozlabs.org)

On Fri, 21 Aug 2009 02:54:12 -0400
Amerigo Wang <amwang@redhat.com> wrote:

> This series of patch implements automatically reserved memory for crashkernel,
> by introducing a new boot option "crashkernel=auto". This idea is from Neil.
> 
> In case of breaking user-space applications, it modifies this boot option after
> it decides how much memory should be reserved.
> 
> On different arch, the threshold and reserved memory size is different. Please
> refer patch 8/8 which contains an update for the documentation.
> 
> Patch 1/8 implements shrinking reserved memory at run-time, which is useful
> when more than enough memory is reserved automatically.
> 
> Note: This patchset was only tested on x86_64 with differernt memory sizes.


I'd prefer that this change had been runtime tested on ia64 and powerpc
and has had some quality review from relevant developers of those
architectures.

Looking at the cc's, I'm not sure that the powerpc guys even know about
this work?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 1/8] kexec: allow to shrink reserved memory
  2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
@ 2009-08-22  0:17   ` Andrew Morton
  2009-08-24  1:36     ` Amerigo Wang
  2009-08-22  1:39   ` Eric W. Biederman
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2009-08-22  0:17 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm, andi,
	mingo, amwang, bernhard.walle, fenghua.yu, kamezawa.hiroyu,
	avorontsov

On Fri, 21 Aug 2009 02:54:25 -0400
Amerigo Wang <amwang@redhat.com> wrote:

> +size_t get_crash_memory_size(void)
> +int shrink_crash_memory(unsigned long new_size)

These aren't particualrly well-chosen global identifiers.  It would be
better if they were called crash_*() or kexec_*(), to make it clear
which subsystem they belong to.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-21  6:55 ` [Patch 5/8] ia64: implement crashkernel=auto Amerigo Wang
@ 2009-08-22  0:24   ` Andrew Morton
  2009-08-22 11:18     ` Ingo Molnar
  2009-08-24  1:59     ` Amerigo Wang
  0 siblings, 2 replies; 29+ messages in thread
From: Andrew Morton @ 2009-08-22  0:24 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm, andi,
	mingo, amwang, bernhard.walle, fenghua.yu, kamezawa.hiroyu,
	avorontsov

On Fri, 21 Aug 2009 02:55:04 -0400
Amerigo Wang <amwang@redhat.com> wrote:

> +#ifdef CONFIG_KEXEC_AUTO_RESERVE
> +#define ARCH_HAS_DEFAULT_CRASH_SIZE
> +static inline
> +unsigned long long arch_default_crash_size(unsigned long long total_size)
> +{
> +	if (total_size >= 4ULL<<30 && total_size < 12ULL<<30)
> +		return 1ULL<<28;
> +	else if (total_size >= 12ULL<<30 && total_size < 128ULL<<30)
> +		return 1ULL<<29;
> +	else if (total_size >= 128ULL<<30 && total_size < 256ULL<<30)
> +		return 3ULL<<28;
> +	else if (total_size >= 256ULL<<30 && total_size < 378ULL<<30)
> +		return 1ULL<<30;
> +	else if (total_size >= 318ULL<<30 && total_size < 512ULL<<30)
> +		return 3ULL<<29;
> +	else if (total_size >= 512ULL<<30 && total_size < 768ULL<<30)
> +		return 2ULL<<30;
> +	else if (total_size >= 768ULL<<30)
> +		return 3ULL<<30;
> +}
> +#include <asm-generic/kexec.h>
> +#endif

a) Why on earth is this inlined?

b) Please consider making arch_default_crash_size() a __weak
   function.  You'll probably find the result to be pleasing.

c) If we can't use __weak then please don't add
   ARCH_HAS_DEFAULT_CRASH_SIZE.  Instead do this trick:


	#ifndef arch_default_crash_size
	static inline unsigned long long arch_default_crash_size(unsigned long long total_size)
	{
		...
	}
	#define arch_default_crash_size arch_default_crash_size
	#endif

	because i) it's good to standardise on something and ii) one less
	symbol gets added to the kernel.

d) why is asm-generic/kexec.h only included in asm/kexec.h if
   CONFIG_KEXEC_AUTO_RESERVE happened to be defined?  That makes no
   sense - there may be a multitude of reasons why asm/kexec.h wants
   to include asm-generic/kexec.h.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 1/8] kexec: allow to shrink reserved memory
  2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
  2009-08-22  0:17   ` Andrew Morton
@ 2009-08-22  1:39   ` Eric W. Biederman
  2009-08-24  2:02     ` Amerigo Wang
  1 sibling, 1 reply; 29+ messages in thread
From: Eric W. Biederman @ 2009-08-22  1:39 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, tony.luck, linux-ia64, Neil Horman, Andi Kleen,
	Ingo Molnar, akpm, bernhard.walle, Fenghua Yu, kamezawa.hiroyu,
	Anton Vorontsov

Amerigo Wang <amwang@redhat.com> writes:

> This patch implements shrinking the reserved memory for crash kernel,
> if it is more than enough.
>
> For example, if you have already reserved 128M, now you just want 100M,
> you can do:
>
> # echo $((100*1024*1024)) > /sys/kernel/kexec_crash_size
>
> Note, you can only do this before loading the crash kernel.
>
> Signed-off-by: WANG Cong <amwang@redhat.com>
> Cc: Neil Horman <nhorman@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> ---
>
> Index: linux-2.6/include/linux/kexec.h
> ===================================================================
> --- linux-2.6.orig/include/linux/kexec.h
> +++ linux-2.6/include/linux/kexec.h
> @@ -206,6 +206,8 @@ extern size_t vmcoreinfo_max_size;
>  
>  int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
>  		unsigned long long *crash_size, unsigned long long *crash_base);
> +int shrink_crash_memory(unsigned long new_size);
> +size_t get_crash_memory_size(void);
>  
>  #else /* !CONFIG_KEXEC */
>  struct pt_regs;
> Index: linux-2.6/kernel/kexec.c
> ===================================================================
> --- linux-2.6.orig/kernel/kexec.c
> +++ linux-2.6/kernel/kexec.c
> @@ -31,6 +31,7 @@
>  #include <linux/cpu.h>
>  #include <linux/console.h>
>  #include <linux/vmalloc.h>
> +#include <linux/swap.h>
>  
>  #include <asm/page.h>
>  #include <asm/uaccess.h>
> @@ -1083,6 +1084,58 @@ void crash_kexec(struct pt_regs *regs)
>  	}
>  }
>  
> +size_t get_crash_memory_size(void)
> +{
> +	size_t size;
> +	mutex_lock(&kexec_mutex);
> +	size = crashk_res.end - crashk_res.start + 1;
> +	mutex_unlock(&kexec_mutex);
> +	return size;
> +}
> +
> +int shrink_crash_memory(unsigned long new_size)
> +{
> +	int ret = 0;
> +	unsigned long addr;
> +	unsigned long start, end;
> +
> +	mutex_lock(&kexec_mutex);
> +
> +	if (kexec_crash_image) {
> +		ret = -ENOENT;
> +		goto unlock;
> +	}
> +	start = crashk_res.start;
> +	end = crashk_res.end;
> +
> +	if (new_size >= end - start + 1) {
> +		ret = -EINVAL;
> +		if (new_size == end - start + 1)
> +			ret = 0;
> +		goto unlock;
> +	}
> +
> +	start = roundup(start, PAGE_SIZE);
> +	end = roundup(start + new_size, PAGE_SIZE);
> +
> +	for (addr = end; addr < crashk_res.end; addr += PAGE_SIZE) {
> +		ClearPageReserved(pfn_to_page(addr >> PAGE_SHIFT));
> +		init_page_count(pfn_to_page(addr >> PAGE_SHIFT));
> +		free_page((unsigned long)__va(addr));
> +		totalram_pages++;

Any chance we can move this inline snippet into a helper function in
-mm.  To expose what is happening here to the mm developers.

Eric

> +	}
> +
> +	if (start == end) {
> +		crashk_res.end = end;
> +		release_resource(&crashk_res);
> +	} else
> +		crashk_res.end = end - 1;
> +
> +unlock:
> +	mutex_unlock(&kexec_mutex);
> +	return ret;
> +}
> +
>  static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data,
>  			    size_t data_len)
>  {
> Index: linux-2.6/kernel/ksysfs.c
> ===================================================================
> --- linux-2.6.orig/kernel/ksysfs.c
> +++ linux-2.6/kernel/ksysfs.c
> @@ -100,6 +100,26 @@ static ssize_t kexec_crash_loaded_show(s
>  }
>  KERNEL_ATTR_RO(kexec_crash_loaded);
>  
> +static ssize_t kexec_crash_size_show(struct kobject *kobj,
> +				       struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%lu\n", get_crash_memory_size());
> +}
> +static ssize_t kexec_crash_size_store(struct kobject *kobj,
> +				   struct kobj_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	unsigned long cnt;
> +	int ret;
> +
> +	if (strict_strtoul(buf, 0, &cnt))
> +		return -EINVAL;
> +
> +	ret = shrink_crash_memory(cnt);
> +	return ret < 0 ? ret : count;
> +}
> +KERNEL_ATTR_RW(kexec_crash_size);
> +
>  static ssize_t vmcoreinfo_show(struct kobject *kobj,
>  			       struct kobj_attribute *attr, char *buf)
>  {
> @@ -147,6 +167,7 @@ static struct attribute * kernel_attrs[]
>  #ifdef CONFIG_KEXEC
>  	&kexec_loaded_attr.attr,
>  	&kexec_crash_loaded_attr.attr,
> +	&kexec_crash_size_attr.attr,
>  	&vmcoreinfo_attr.attr,
>  #endif
>  	NULL

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-22  0:24   ` Andrew Morton
@ 2009-08-22 11:18     ` Ingo Molnar
  2009-08-24  2:05       ` Amerigo Wang
  2009-08-24  1:59     ` Amerigo Wang
  1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2009-08-22 11:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Amerigo Wang, linux-kernel, tony.luck, linux-ia64, nhorman,
	ebiederm, andi, bernhard.walle, fenghua.yu, kamezawa.hiroyu,
	avorontsov


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 21 Aug 2009 02:55:04 -0400
> Amerigo Wang <amwang@redhat.com> wrote:
> 
> > +#ifdef CONFIG_KEXEC_AUTO_RESERVE
> > +#define ARCH_HAS_DEFAULT_CRASH_SIZE
> > +static inline
> > +unsigned long long arch_default_crash_size(unsigned long long total_size)
> > +{
> > +	if (total_size >= 4ULL<<30 && total_size < 12ULL<<30)
> > +		return 1ULL<<28;
> > +	else if (total_size >= 12ULL<<30 && total_size < 128ULL<<30)
> > +		return 1ULL<<29;
> > +	else if (total_size >= 128ULL<<30 && total_size < 256ULL<<30)
> > +		return 3ULL<<28;
> > +	else if (total_size >= 256ULL<<30 && total_size < 378ULL<<30)
> > +		return 1ULL<<30;
> > +	else if (total_size >= 318ULL<<30 && total_size < 512ULL<<30)
> > +		return 3ULL<<29;
> > +	else if (total_size >= 512ULL<<30 && total_size < 768ULL<<30)
> > +		return 2ULL<<30;
> > +	else if (total_size >= 768ULL<<30)
> > +		return 3ULL<<30;
> > +}
> > +#include <asm-generic/kexec.h>
> > +#endif
> 
> a) Why on earth is this inlined?
> 
> b) Please consider making arch_default_crash_size() a __weak
>    function.  You'll probably find the result to be pleasing.
> 
> c) If we can't use __weak then please don't add
>    ARCH_HAS_DEFAULT_CRASH_SIZE.  Instead do this trick:
> 
> 
> 	#ifndef arch_default_crash_size
> 	static inline unsigned long long arch_default_crash_size(unsigned long long total_size)
> 	{
> 		...
> 	}
> 	#define arch_default_crash_size arch_default_crash_size
> 	#endif
> 
> 	because i) it's good to standardise on something and ii) one less
> 	symbol gets added to the kernel.
> 
> d) why is asm-generic/kexec.h only included in asm/kexec.h if
>    CONFIG_KEXEC_AUTO_RESERVE happened to be defined?  That makes no
>    sense - there may be a multitude of reasons why asm/kexec.h wants
>    to include asm-generic/kexec.h.

e) All the 'else' statements are superflous and make it all harder 
   to read.

f) 2ULL<<30 should be written as 1ULL<31, to keep things consistent.

g) A nice comment explaining the purpose and logic wouldnt hurt.

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 0/8] V4 Implement crashkernel=auto
  2009-08-22  0:06 ` [Patch 0/8] V4 Implement crashkernel=auto Andrew Morton
@ 2009-08-24  1:34   ` Amerigo Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24  1:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm,
	kamezawa.hiroyu, andi, bernhard.walle, fenghua.yu, mingo,
	avorontsov, linuxppc-dev

Andrew Morton wrote:
> (cc linuxppc-dev@ozlabs.org)
>
> On Fri, 21 Aug 2009 02:54:12 -0400
> Amerigo Wang <amwang@redhat.com> wrote:
>
>   
>> This series of patch implements automatically reserved memory for crashkernel,
>> by introducing a new boot option "crashkernel=auto". This idea is from Neil.
>>
>> In case of breaking user-space applications, it modifies this boot option after
>> it decides how much memory should be reserved.
>>
>> On different arch, the threshold and reserved memory size is different. Please
>> refer patch 8/8 which contains an update for the documentation.
>>
>> Patch 1/8 implements shrinking reserved memory at run-time, which is useful
>> when more than enough memory is reserved automatically.
>>
>> Note: This patchset was only tested on x86_64 with differernt memory sizes.
>>     
>
>
> I'd prefer that this change had been runtime tested on ia64 and powerpc
> and has had some quality review from relevant developers of those
> architectures.
>
> Looking at the cc's, I'm not sure that the powerpc guys even know about
> this work?
>
>   
Ok, let me try to find some ppc and ia64 machines in the company.. ;)


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 1/8] kexec: allow to shrink reserved memory
  2009-08-22  0:17   ` Andrew Morton
@ 2009-08-24  1:36     ` Amerigo Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24  1:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm, andi,
	mingo, bernhard.walle, fenghua.yu, kamezawa.hiroyu, avorontsov

Andrew Morton wrote:
> On Fri, 21 Aug 2009 02:54:25 -0400
> Amerigo Wang <amwang@redhat.com> wrote:
>
>   
>> +size_t get_crash_memory_size(void)
>> +int shrink_crash_memory(unsigned long new_size)
>>     
>
> These aren't particualrly well-chosen global identifiers.  It would be
> better if they were called crash_*() or kexec_*(), to make it clear
> which subsystem they belong to.
>
>   

Ah, good poi nt!

How about crash_get_memory_size() and crash_shrink_memory()?

Thanks!




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-22  0:24   ` Andrew Morton
  2009-08-22 11:18     ` Ingo Molnar
@ 2009-08-24  1:59     ` Amerigo Wang
  1 sibling, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24  1:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, tony.luck, linux-ia64, nhorman, ebiederm, andi,
	mingo, bernhard.walle, fenghua.yu, kamezawa.hiroyu, avorontsov

Andrew Morton wrote:
> On Fri, 21 Aug 2009 02:55:04 -0400
> Amerigo Wang <amwang@redhat.com> wrote:
>
>   
>> +#ifdef CONFIG_KEXEC_AUTO_RESERVE
>> +#define ARCH_HAS_DEFAULT_CRASH_SIZE
>> +static inline
>> +unsigned long long arch_default_crash_size(unsigned long long total_size)
>> +{
>> +	if (total_size >= 4ULL<<30 && total_size < 12ULL<<30)
>> +		return 1ULL<<28;
>> +	else if (total_size >= 12ULL<<30 && total_size < 128ULL<<30)
>> +		return 1ULL<<29;
>> +	else if (total_size >= 128ULL<<30 && total_size < 256ULL<<30)
>> +		return 3ULL<<28;
>> +	else if (total_size >= 256ULL<<30 && total_size < 378ULL<<30)
>> +		return 1ULL<<30;
>> +	else if (total_size >= 318ULL<<30 && total_size < 512ULL<<30)
>> +		return 3ULL<<29;
>> +	else if (total_size >= 512ULL<<30 && total_size < 768ULL<<30)
>> +		return 2ULL<<30;
>> +	else if (total_size >= 768ULL<<30)
>> +		return 3ULL<<30;
>> +}
>> +#include <asm-generic/kexec.h>
>> +#endif
>>     
>
> a) Why on earth is this inlined?
>
> b) Please consider making arch_default_crash_size() a __weak
>    function.  You'll probably find the result to be pleasing.
>   

Good idea!

> c) If we can't use __weak then please don't add
>    ARCH_HAS_DEFAULT_CRASH_SIZE.  Instead do this trick:
>
>
> 	#ifndef arch_default_crash_size
> 	static inline unsigned long long arch_default_crash_size(unsigned long long total_size)
> 	{
> 		...
> 	}
> 	#define arch_default_crash_size arch_default_crash_size
> 	#endif
>
> 	because i) it's good to standardise on something and ii) one less
> 	symbol gets added to the kernel.
>   

Yes, agree. I will try b) which seems to be better than c).

> d) why is asm-generic/kexec.h only included in asm/kexec.h if
>    CONFIG_KEXEC_AUTO_RESERVE happened to be defined?  That makes no
>    sense - there may be a multitude of reasons why asm/kexec.h wants
>    to include asm-generic/kexec.h.
>
>   
Hmm, yes? kernel/kexec.c includes linux/kexec.h which already includes 
asm/kexec.h....




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 1/8] kexec: allow to shrink reserved memory
  2009-08-22  1:39   ` Eric W. Biederman
@ 2009-08-24  2:02     ` Amerigo Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24  2:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, tony.luck, linux-ia64, Neil Horman, Andi Kleen,
	Ingo Molnar, akpm, bernhard.walle, Fenghua Yu, kamezawa.hiroyu,
	Anton Vorontsov

Eric W. Biederman wrote:
> Amerigo Wang <amwang@redhat.com> writes:
>
>   
>> +
>> +	start = roundup(start, PAGE_SIZE);
>> +	end = roundup(start + new_size, PAGE_SIZE);
>> +
>> +	for (addr = end; addr < crashk_res.end; addr += PAGE_SIZE) {
>> +		ClearPageReserved(pfn_to_page(addr >> PAGE_SHIFT));
>> +		init_page_count(pfn_to_page(addr >> PAGE_SHIFT));
>> +		free_page((unsigned long)__va(addr));
>> +		totalram_pages++;
>>     
>
> Any chance we can move this inline snippet into a helper function in
> -mm.  To expose what is happening here to the mm developers.
>
>   

Yes, I believe so. In fact I also wanted to do this, but I forgot before 
I sent these patches... Will do it in the next version.

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-22 11:18     ` Ingo Molnar
@ 2009-08-24  2:05       ` Amerigo Wang
  2009-08-24  7:43         ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24  2:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, linux-kernel, tony.luck, linux-ia64, nhorman,
	ebiederm, andi, bernhard.walle, fenghua.yu, kamezawa.hiroyu,
	avorontsov

Ingo Molnar wrote:
>
> e) All the 'else' statements are superflous and make it all harder 
>    to read.
>
> f) 2ULL<<30 should be written as 1ULL<31, to keep things consistent.
>   

Hi,

The reason that I kept 2ULL<<30 instead of 1ULL<<31 is that '1<<30' is 
exactly 1G, so 2ULL<<30 can be easily read as 2G. ;)

> g) A nice comment explaining the purpose and logic wouldnt hurt.
>   

Yup, in fact patch 8/8 has the doc for this, but I will copy that here 
as a comment too.

Thanks!


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-24  2:05       ` Amerigo Wang
@ 2009-08-24  7:43         ` Ingo Molnar
  2009-08-24  8:21           ` Bernhard Walle
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2009-08-24  7:43 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: Andrew Morton, linux-kernel, tony.luck, linux-ia64, nhorman,
	ebiederm, andi, bernhard.walle, fenghua.yu, kamezawa.hiroyu,
	avorontsov


* Amerigo Wang <amwang@redhat.com> wrote:

> Ingo Molnar wrote:
>>
>> e) All the 'else' statements are superflous and make it all harder    
>> to read.
>>
>> f) 2ULL<<30 should be written as 1ULL<31, to keep things consistent.
>>   
>
> Hi,
>
> The reason that I kept 2ULL<<30 instead of 1ULL<<31 is that '1<<30' is  
> exactly 1G, so 2ULL<<30 can be easily read as 2G. ;)

i have no trouble reading 1ULL<<31 as 2G ;-) OTOH, the logic and 
pattern of the comparisons (especially without the comment) looked 
odd at first sight, until i noticed this.

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-24  7:43         ` Ingo Molnar
@ 2009-08-24  8:21           ` Bernhard Walle
  2009-08-24 10:23             ` Amerigo Wang
  0 siblings, 1 reply; 29+ messages in thread
From: Bernhard Walle @ 2009-08-24  8:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Amerigo Wang, Andrew Morton, linux-kernel, tony.luck, linux-ia64,
	nhorman, ebiederm, andi, fenghua.yu, kamezawa.hiroyu, avorontsov

* Ingo Molnar <mingo@elte.hu> [2009-08-24 09:43]:
> * Amerigo Wang <amwang@redhat.com> wrote:
> >
> > The reason that I kept 2ULL<<30 instead of 1ULL<<31 is that '1<<30' is  
> > exactly 1G, so 2ULL<<30 can be easily read as 2G. ;)
> 
> i have no trouble reading 1ULL<<31 as 2G ;-) OTOH, the logic and 
> pattern of the comparisons (especially without the comment) looked 
> odd at first sight, until i noticed this.

Why not just something like

    #define KBYTE(x)  ((x)*1024ULL)
    #define MBYTE(x)  ((x)*1024ULL*1024)
    #define GBYTE(x)  ((x)*1024ULL*1024*1024) 
    #define TBYTE(x)  ((x)*1024ULL*1024*1024*1024)

I find GBYTE(2) much easier to read than 1ULL<<31. Honestly, I would
add a comment '/* 2G */' if I would write 1ULL<<31 in own code.

But I'm of course not one of that super kernel hackers. ;-)


Regards,
Bernhard

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 5/8] ia64: implement crashkernel=auto
  2009-08-24  8:21           ` Bernhard Walle
@ 2009-08-24 10:23             ` Amerigo Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-24 10:23 UTC (permalink / raw)
  To: Bernhard Walle
  Cc: Ingo Molnar, Andrew Morton, linux-kernel, tony.luck, linux-ia64,
	nhorman, ebiederm, andi, fenghua.yu, kamezawa.hiroyu, avorontsov

Bernhard Walle wrote:
> * Ingo Molnar <mingo@elte.hu> [2009-08-24 09:43]:
>   
>> * Amerigo Wang <amwang@redhat.com> wrote:
>>     
>>> The reason that I kept 2ULL<<30 instead of 1ULL<<31 is that '1<<30' is  
>>> exactly 1G, so 2ULL<<30 can be easily read as 2G. ;)
>>>       
>> i have no trouble reading 1ULL<<31 as 2G ;-) OTOH, the logic and 
>> pattern of the comparisons (especially without the comment) looked 
>> odd at first sight, until i noticed this.
>>     
>
> Why not just something like
>
>     #define KBYTE(x)  ((x)*1024ULL)
>     #define MBYTE(x)  ((x)*1024ULL*1024)
>     #define GBYTE(x)  ((x)*1024ULL*1024*1024) 
>     #define TBYTE(x)  ((x)*1024ULL*1024*1024*1024)
>
> I find GBYTE(2) much easier to read than 1ULL<<31. Honestly, I would
> add a comment '/* 2G */' if I would write 1ULL<<31 in own code.
>
> But I'm of course not one of that super kernel hackers. ;-)
>
>   

Yeah, great. Here we only need MBYTES() and GBYTES(). ;)

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-21  6:55 ` [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
@ 2009-08-24 13:44   ` Michael Ellerman
  2009-08-24 14:45     ` M. Mohan Kumar
  2009-08-25  6:23     ` Amerigo Wang
  0 siblings, 2 replies; 29+ messages in thread
From: Michael Ellerman @ 2009-08-24 13:44 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, tony.luck, linux-ia64, Neil Horman,
	Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

[-- Attachment #1: Type: text/plain, Size: 1939 bytes --]

On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
> Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
> 
> Index: linux-2.6/arch/powerpc/Kconfig
> ===================================================================
> --- linux-2.6.orig/arch/powerpc/Kconfig
> +++ linux-2.6/arch/powerpc/Kconfig
> @@ -346,6 +346,17 @@ config KEXEC
>  	  support.  As of this writing the exact hardware interface is
>  	  strongly in flux, so no good recommendation can be made.
>  
> +config KEXEC_AUTO_RESERVE
> +	bool "automatically reserve memory for kexec kernel"
> +	depends on KEXEC
> +	default y
> +	---help---
> +	  Automatically reserve memory for a kexec kernel, so that you don't
> +	  need to specify numbers for the "crashkernel=X@Y" boot option,
> +	  instead you can use "crashkernel=auto". To make this work, you need
> +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
> +	  on PPC64, but it will not exceed 1T/32.

To be honest I don't see why this logic goes in the kernel. It seems to
me that it's policy how much memory you devote to the crash kernel vs
the production kernel. It depends on what kind of crash kernel you're
loading, a minimal UP dump kernel, or a full-featured SMP behemoth, An
it depends on how much memory you're willing to leave idle in the
off-chance you crash.

That aside, I don't see how this will be useful in practice, if it only
works for memory sizes over 4G? Or are we saying that people with less
than 4G don't need crash kernels? If we're not saying that, those users,
or those users' distros, still need to do some logic to work out if they
have < 4GB of memory and if so pick a crash kernel size. So why can't
they pick the size in the > 4GB case also?

Also the numbers seem a bit arbitrary. 4GB ? 256M ? 1/32?  I don't think
we really want to be blowing 32GB on a crash kernel, even if we do have
1T of RAM :)

cheers



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-24 13:44   ` Michael Ellerman
@ 2009-08-24 14:45     ` M. Mohan Kumar
  2009-08-25  6:37       ` Amerigo Wang
  2009-08-25  6:23     ` Amerigo Wang
  1 sibling, 1 reply; 29+ messages in thread
From: M. Mohan Kumar @ 2009-08-24 14:45 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Amerigo Wang, linux-kernel, tony.luck, linux-ia64, Neil Horman,
	Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

On Mon, Aug 24, 2009 at 11:44:03PM +1000, Michael Ellerman wrote:
> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
> > Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
> > 
> > Index: linux-2.6/arch/powerpc/Kconfig
> > ===================================================================
> > --- linux-2.6.orig/arch/powerpc/Kconfig
> > +++ linux-2.6/arch/powerpc/Kconfig
> > @@ -346,6 +346,17 @@ config KEXEC
> >  	  support.  As of this writing the exact hardware interface is
> >  	  strongly in flux, so no good recommendation can be made.
> >  
> > +config KEXEC_AUTO_RESERVE
> > +	bool "automatically reserve memory for kexec kernel"
> > +	depends on KEXEC
> > +	default y
> > +	---help---
> > +	  Automatically reserve memory for a kexec kernel, so that you don't
> > +	  need to specify numbers for the "crashkernel=X@Y" boot option,
> > +	  instead you can use "crashkernel=auto". To make this work, you need
> > +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
> > +	  on PPC64, but it will not exceed 1T/32.
> 
> That aside, I don't see how this will be useful in practice, if it only
> works for memory sizes over 4G? Or are we saying that people with less
> than 4G don't need crash kernels? If we're not saying that, those users,
> or those users' distros, still need to do some logic to work out if they
> have < 4GB of memory and if so pick a crash kernel size. So why can't
> they pick the size in the > 4GB case also?

True, I wanted to test the patch and when tested on a ppc64 machine which
has RAM less than 4GB, I have to modify arch_default_crash_size routine to
return 256MB (I didn't have a PPC64 machine with more than 4GB RAM handy).
So its better to consider machines with less than 4GB RAM also.

PPC64 crashkernel base is always 32MB. So at least ppc64 code should have
its own arch_default_crash_base to return 32MB to avoid the kernel warning
message "Crash kernel location must be 0x2000000"

Regards,
M. Mohan Kumar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-24 13:44   ` Michael Ellerman
  2009-08-24 14:45     ` M. Mohan Kumar
@ 2009-08-25  6:23     ` Amerigo Wang
  2009-08-25 10:28       ` M. Mohan Kumar
  1 sibling, 1 reply; 29+ messages in thread
From: Amerigo Wang @ 2009-08-25  6:23 UTC (permalink / raw)
  To: michael
  Cc: linux-kernel, tony.luck, linux-ia64, Neil Horman,
	Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

Michael Ellerman wrote:
> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
>   
>> Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
>>
>> Index: linux-2.6/arch/powerpc/Kconfig
>> ===================================================================
>> --- linux-2.6.orig/arch/powerpc/Kconfig
>> +++ linux-2.6/arch/powerpc/Kconfig
>> @@ -346,6 +346,17 @@ config KEXEC
>>  	  support.  As of this writing the exact hardware interface is
>>  	  strongly in flux, so no good recommendation can be made.
>>  
>> +config KEXEC_AUTO_RESERVE
>> +	bool "automatically reserve memory for kexec kernel"
>> +	depends on KEXEC
>> +	default y
>> +	---help---
>> +	  Automatically reserve memory for a kexec kernel, so that you don't
>> +	  need to specify numbers for the "crashkernel=X@Y" boot option,
>> +	  instead you can use "crashkernel=auto". To make this work, you need
>> +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
>> +	  on PPC64, but it will not exceed 1T/32.
>>     
>
> To be honest I don't see why this logic goes in the kernel. It seems to
> me that it's policy how much memory you devote to the crash kernel vs
> the production kernel. It depends on what kind of crash kernel you're
> loading, a minimal UP dump kernel, or a full-featured SMP behemoth, An
> it depends on how much memory you're willing to leave idle in the
> off-chance you crash.
>   

True, but since in the crash kernel, we have very little memory, so 
probably loading a full-featured SMP kernel doesn't make much sense...

And in patch 1/8, I introduced a way to free the reserved memory at 
run-time.

> That aside, I don't see how this will be useful in practice, if it only
> works for memory sizes over 4G? Or are we saying that people with less
> than 4G don't need crash kernels? If we're not saying that, those users,
> or those users' distros, still need to do some logic to work out if they
> have < 4GB of memory and if so pick a crash kernel size. So why can't
> they pick the size in the > 4GB case also?
>   

No, we set 4G as a threshold because we only want this work when have 
have enough memory which is defined as 4G currently... This can be 
changed to arch-dependent, e.g. ppc. I am very open to this.


> Also the numbers seem a bit arbitrary. 4GB ? 256M ? 1/32?  I don't think
> we really want to be blowing 32GB on a crash kernel, even if we do have
> 1T of RAM :)
>   

Ah, maybe, to be honest, I am not familiar with ppc at all.

Please feel free to suggest other numbers for ppc (or other algorithms 
to reserve memory automatically for ppc).

Thanks!




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-24 14:45     ` M. Mohan Kumar
@ 2009-08-25  6:37       ` Amerigo Wang
  2009-08-25 10:24         ` M. Mohan Kumar
  0 siblings, 1 reply; 29+ messages in thread
From: Amerigo Wang @ 2009-08-25  6:37 UTC (permalink / raw)
  To: mohan
  Cc: Michael Ellerman, linux-kernel, tony.luck, linux-ia64,
	Neil Horman, Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

M. Mohan Kumar wrote:
> On Mon, Aug 24, 2009 at 11:44:03PM +1000, Michael Ellerman wrote:
>   
>> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
>>     
>>> Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
>>>
>>> Index: linux-2.6/arch/powerpc/Kconfig
>>> ===================================================================
>>> --- linux-2.6.orig/arch/powerpc/Kconfig
>>> +++ linux-2.6/arch/powerpc/Kconfig
>>> @@ -346,6 +346,17 @@ config KEXEC
>>>  	  support.  As of this writing the exact hardware interface is
>>>  	  strongly in flux, so no good recommendation can be made.
>>>  
>>> +config KEXEC_AUTO_RESERVE
>>> +	bool "automatically reserve memory for kexec kernel"
>>> +	depends on KEXEC
>>> +	default y
>>> +	---help---
>>> +	  Automatically reserve memory for a kexec kernel, so that you don't
>>> +	  need to specify numbers for the "crashkernel=X@Y" boot option,
>>> +	  instead you can use "crashkernel=auto". To make this work, you need
>>> +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
>>> +	  on PPC64, but it will not exceed 1T/32.
>>>       
>> That aside, I don't see how this will be useful in practice, if it only
>> works for memory sizes over 4G? Or are we saying that people with less
>> than 4G don't need crash kernels? If we're not saying that, those users,
>> or those users' distros, still need to do some logic to work out if they
>> have < 4GB of memory and if so pick a crash kernel size. So why can't
>> they pick the size in the > 4GB case also?
>>     
>
> True, I wanted to test the patch and when tested on a ppc64 machine which
> has RAM less than 4GB, I have to modify arch_default_crash_size routine to
> return 256MB (I didn't have a PPC64 machine with more than 4GB RAM handy).
> So its better to consider machines with less than 4GB RAM also.
>   

OK, how about 2G on ppc? Is it safe to reserve 256M when I have 2G?

> PPC64 crashkernel base is always 32MB. So at least ppc64 code should have
> its own arch_default_crash_base to return 32MB to avoid the kernel warning
> message "Crash kernel location must be 0x2000000"
>   
Hmm, good point, how about KDUMP_KERNELBASE? It looks fine for both ppc 
and ppc64.

Thanks!



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-25  6:37       ` Amerigo Wang
@ 2009-08-25 10:24         ` M. Mohan Kumar
  0 siblings, 0 replies; 29+ messages in thread
From: M. Mohan Kumar @ 2009-08-25 10:24 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: Michael Ellerman, linux-kernel, tony.luck, linux-ia64,
	Neil Horman, Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

On Tue, Aug 25, 2009 at 02:37:12PM +0800, Amerigo Wang wrote:
> M. Mohan Kumar wrote:
>> On Mon, Aug 24, 2009 at 11:44:03PM +1000, Michael Ellerman wrote:
>>   
>>> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
>>>     
     
>>> That aside, I don't see how this will be useful in practice, if it only
>>> works for memory sizes over 4G? Or are we saying that people with less
>>> than 4G don't need crash kernels? If we're not saying that, those users,
>>> or those users' distros, still need to do some logic to work out if they
>>> have < 4GB of memory and if so pick a crash kernel size. So why can't
>>> they pick the size in the > 4GB case also?
>>>     
>>
>> True, I wanted to test the patch and when tested on a ppc64 machine which
>> has RAM less than 4GB, I have to modify arch_default_crash_size routine to
>> return 256MB (I didn't have a PPC64 machine with more than 4GB RAM handy).
>> So its better to consider machines with less than 4GB RAM also.
>>   
>
> OK, how about 2G on ppc? Is it safe to reserve 256M when I have 2G?

I would prefer 2G-4G 128MB.

>
>> PPC64 crashkernel base is always 32MB. So at least ppc64 code should have
>> its own arch_default_crash_base to return 32MB to avoid the kernel warning
>> message "Crash kernel location must be 0x2000000"
>>   
> Hmm, good point, how about KDUMP_KERNELBASE? It looks fine for both ppc  
> and ppc64.

Yes, you can use KDUMP_KERNELBASE for arch_default_crash_base

Regards,
M. Mohan Kumar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-25  6:23     ` Amerigo Wang
@ 2009-08-25 10:28       ` M. Mohan Kumar
  2009-08-26  6:59         ` Amerigo Wang
  0 siblings, 1 reply; 29+ messages in thread
From: M. Mohan Kumar @ 2009-08-25 10:28 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: michael, linux-kernel, tony.luck, linux-ia64, Neil Horman,
	Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

On Tue, Aug 25, 2009 at 02:23:04PM +0800, Amerigo Wang wrote:
> Michael Ellerman wrote:
>> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
>>   
>>> Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
>>>
>>> Index: linux-2.6/arch/powerpc/Kconfig
>>> ===================================================================
>>> --- linux-2.6.orig/arch/powerpc/Kconfig
>>> +++ linux-2.6/arch/powerpc/Kconfig
>>> @@ -346,6 +346,17 @@ config KEXEC
>>>  	  support.  As of this writing the exact hardware interface is
>>>  	  strongly in flux, so no good recommendation can be made.
>>>  +config KEXEC_AUTO_RESERVE
>>> +	bool "automatically reserve memory for kexec kernel"
>>> +	depends on KEXEC
>>> +	default y
>>> +	---help---
>>> +	  Automatically reserve memory for a kexec kernel, so that you don't
>>> +	  need to specify numbers for the "crashkernel=X@Y" boot option,
>>> +	  instead you can use "crashkernel=auto". To make this work, you need
>>> +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
>>> +	  on PPC64, but it will not exceed 1T/32.
>>>     
>>
>> To be honest I don't see why this logic goes in the kernel. It seems to
>> me that it's policy how much memory you devote to the crash kernel vs
>> the production kernel. It depends on what kind of crash kernel you're
>> loading, a minimal UP dump kernel, or a full-featured SMP behemoth, An
>> it depends on how much memory you're willing to leave idle in the
>> off-chance you crash.
>>   
>
> True, but since in the crash kernel, we have very little memory, so  
> probably loading a full-featured SMP kernel doesn't make much sense...
>
> And in patch 1/8, I introduced a way to free the reserved memory at  
> run-time.
>
>> That aside, I don't see how this will be useful in practice, if it only
>> works for memory sizes over 4G? Or are we saying that people with less
>> than 4G don't need crash kernels? If we're not saying that, those users,
>> or those users' distros, still need to do some logic to work out if they
>> have < 4GB of memory and if so pick a crash kernel size. So why can't
>> they pick the size in the > 4GB case also?
>>   
>
> No, we set 4G as a threshold because we only want this work when have  
> have enough memory which is defined as 4G currently... This can be  
> changed to arch-dependent, e.g. ppc. I am very open to this.
>

So the distro/admin have to use crashkernel=auto for machines having more
than 4GB RAM and for machines with less than 4GB RAM they have to use the
crashkernel=x@y (or extended crashkernel syntax)? IMHO it will be nice if
crashkernel=auto could handle all of the situations.

Regards,
M. Mohan Kumar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE
  2009-08-25 10:28       ` M. Mohan Kumar
@ 2009-08-26  6:59         ` Amerigo Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Amerigo Wang @ 2009-08-26  6:59 UTC (permalink / raw)
  To: mohan
  Cc: michael, linux-kernel, tony.luck, linux-ia64, Neil Horman,
	Eric W. Biederman, kamezawa.hiroyu, Andi Kleen, akpm,
	bernhard.walle, Fenghua Yu, Ingo Molnar, Anton Vorontsov

M. Mohan Kumar wrote:
> On Tue, Aug 25, 2009 at 02:23:04PM +0800, Amerigo Wang wrote:
>   
>> Michael Ellerman wrote:
>>     
>>> On Fri, 2009-08-21 at 02:55 -0400, Amerigo Wang wrote:
>>>   
>>>       
>>>> Introduce a new config option KEXEC_AUTO_RESERVE for powerpc.
>>>>
>>>> Index: linux-2.6/arch/powerpc/Kconfig
>>>> ===================================================================
>>>> --- linux-2.6.orig/arch/powerpc/Kconfig
>>>> +++ linux-2.6/arch/powerpc/Kconfig
>>>> @@ -346,6 +346,17 @@ config KEXEC
>>>>  	  support.  As of this writing the exact hardware interface is
>>>>  	  strongly in flux, so no good recommendation can be made.
>>>>  +config KEXEC_AUTO_RESERVE
>>>> +	bool "automatically reserve memory for kexec kernel"
>>>> +	depends on KEXEC
>>>> +	default y
>>>> +	---help---
>>>> +	  Automatically reserve memory for a kexec kernel, so that you don't
>>>> +	  need to specify numbers for the "crashkernel=X@Y" boot option,
>>>> +	  instead you can use "crashkernel=auto". To make this work, you need
>>>> +	  to have more than 4G memory. On PPC, 256M is reserved, 1/32 memory
>>>> +	  on PPC64, but it will not exceed 1T/32.
>>>>     
>>>>         
>>> To be honest I don't see why this logic goes in the kernel. It seems to
>>> me that it's policy how much memory you devote to the crash kernel vs
>>> the production kernel. It depends on what kind of crash kernel you're
>>> loading, a minimal UP dump kernel, or a full-featured SMP behemoth, An
>>> it depends on how much memory you're willing to leave idle in the
>>> off-chance you crash.
>>>   
>>>       
>> True, but since in the crash kernel, we have very little memory, so  
>> probably loading a full-featured SMP kernel doesn't make much sense...
>>
>> And in patch 1/8, I introduced a way to free the reserved memory at  
>> run-time.
>>
>>     
>>> That aside, I don't see how this will be useful in practice, if it only
>>> works for memory sizes over 4G? Or are we saying that people with less
>>> than 4G don't need crash kernels? If we're not saying that, those users,
>>> or those users' distros, still need to do some logic to work out if they
>>> have < 4GB of memory and if so pick a crash kernel size. So why can't
>>> they pick the size in the > 4GB case also?
>>>   
>>>       
>> No, we set 4G as a threshold because we only want this work when have  
>> have enough memory which is defined as 4G currently... This can be  
>> changed to arch-dependent, e.g. ppc. I am very open to this.
>>
>>     
>
> So the distro/admin have to use crashkernel=auto for machines having more
> than 4GB RAM and for machines with less than 4GB RAM they have to use the
> crashkernel=x@y (or extended crashkernel syntax)? IMHO it will be nice if
> crashkernel=auto could handle all of the situations.
>   

Exactly yes.

As you suggested, I already change 4G to 2G on ppc.

I think I have already explained the reason in a previous email, I just 
don't know if '2G - reserved_memory' is safe for ppc or not. What is the 
minimum memory size for a normal kernel to run on ppc? And how is it if 
I configure PAGE_SIZE > 4K, e.g. 64K?

Thanks!


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2009-08-26  6:58 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-21  6:54 [Patch 0/8] V4 Implement crashkernel=auto Amerigo Wang
2009-08-21  6:54 ` [Patch 1/8] kexec: allow to shrink reserved memory Amerigo Wang
2009-08-22  0:17   ` Andrew Morton
2009-08-24  1:36     ` Amerigo Wang
2009-08-22  1:39   ` Eric W. Biederman
2009-08-24  2:02     ` Amerigo Wang
2009-08-21  6:54 ` [Patch 2/8] x86: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
2009-08-21  6:54 ` [Patch 3/8] x86: implement crashkernel=auto Amerigo Wang
2009-08-21  6:54 ` [Patch 4/8] ia64: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
2009-08-21  6:55 ` [Patch 5/8] ia64: implement crashkernel=auto Amerigo Wang
2009-08-22  0:24   ` Andrew Morton
2009-08-22 11:18     ` Ingo Molnar
2009-08-24  2:05       ` Amerigo Wang
2009-08-24  7:43         ` Ingo Molnar
2009-08-24  8:21           ` Bernhard Walle
2009-08-24 10:23             ` Amerigo Wang
2009-08-24  1:59     ` Amerigo Wang
2009-08-21  6:55 ` [Patch 6/8] powerpc: add CONFIG_KEXEC_AUTO_RESERVE Amerigo Wang
2009-08-24 13:44   ` Michael Ellerman
2009-08-24 14:45     ` M. Mohan Kumar
2009-08-25  6:37       ` Amerigo Wang
2009-08-25 10:24         ` M. Mohan Kumar
2009-08-25  6:23     ` Amerigo Wang
2009-08-25 10:28       ` M. Mohan Kumar
2009-08-26  6:59         ` Amerigo Wang
2009-08-21  6:55 ` [Patch 7/8] powerpc: implement crashkernel=auto Amerigo Wang
2009-08-21  6:55 ` [Patch 8/8] doc: update the kdump document Amerigo Wang
2009-08-22  0:06 ` [Patch 0/8] V4 Implement crashkernel=auto Andrew Morton
2009-08-24  1:34   ` Amerigo Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox