linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump
@ 2008-01-22 19:12 Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:12 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas, Michael Strosaker, Larry Kessler

The following series of patches implement a basic framework
for hypervisor-assisted dump. The very first patch provides 
documentation explaining what this is    :-)   . Yes, its supposed
to be an improvement over kdump.

A list of open issues / todo list is included in the documentation.
It also appears that the not-yet-released firmware versions this was tested 
on are still, ahem, incomplete; this work is also pending.

I have included most of the changes requested. Although, I did find
one or two, fixed in a later patch file rather than the first location
they appeared at.

-- Manish & Linas.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/8] pseries: phyp dump: Docmentation
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-01-22 19:26 ` Manish Ahuja
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:26 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: Larry Kessler, Michael Strosaker

Basic documentation for hypervisor-assisted dump.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>

----
 Documentation/powerpc/phyp-assisted-dump.txt |  129 +++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)

Index: 2.6.24-rc5/Documentation/powerpc/phyp-assisted-dump.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/Documentation/powerpc/phyp-assisted-dump.txt	2008-01-07 18:05:46.000000000 -0600
@@ -0,0 +1,129 @@
+
+                   Hypervisor-Assisted Dump
+                   ------------------------
+                       November 2007
+
+The goal of hypervisor-assisted dump is to enable the dump of
+a crashed system, and to do so from a fully-reset system, and
+to minimize the total elapsed time until the system is back
+in production use.
+
+As compared to kdump or other strategies, hypervisor-assisted
+dump offers several strong, practical advantages:
+
+-- Unlike kdump, the system has been reset, and loaded
+   with a fresh copy of the kernel.  In particular,
+   PCI and I/O devices have been reinitialized and are
+   in a clean, consistent state.
+-- As the dump is performed, the dumped memory becomes
+   immediately available to the system for normal use.
+-- After the dump is completed, no further reboots are
+   required; the system will be fully usable, and running
+   in it's normal, production mode on it normal kernel.
+
+The above can only be accomplished by coordination with,
+and assistance from the hypervisor. The procedure is
+as follows:
+
+-- When a system crashes, the hypervisor will save
+   the low 256MB of RAM to a previously registered
+   save region. It will also save system state, system
+   registers, and hardware PTE's.
+
+-- After the low 256MB area has been saved, the
+   hypervisor will reset PCI and other hardware state.
+   It will *not* clear RAM. It will then launch the
+   bootloader, as normal.
+
+-- The freshly booted kernel will notice that there
+   is a new node (ibm,dump-kernel) in the device tree,
+   indicating that there is crash data available from
+   a previous boot. It will boot into only 256MB of RAM,
+   reserving the rest of system memory.
+
+-- Userspace tools will parse /sys/kernel/release_region
+   and read /proc/vmcore to obtain the contents of memory,
+   which holds the previous crashed kernel. The userspace
+   tools may copy this info to disk, or network, nas, san,
+   iscsi, etc. as desired.
+
+   For Example: the values in /sys/kernel/release-region
+   would look something like this (address-range pairs).
+   CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
+   DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A
+
+-- As the userspace tools complete saving a portion of
+   dump, they echo an offset and size to
+   /sys/kernel/release_region to release the reserved
+   memory back to general use.
+
+   An example of this is:
+     "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+   which will release 256MB at the 1GB boundary.
+
+Please note that the hypervisor-assisted dump feature
+is only available on Power6-based systems with recent
+firmware versions.
+
+Implementation details:
+----------------------
+In order for this scheme to work, memory needs to be reserved
+quite early in the boot cycle. However, access to the device
+tree this early in the boot cycle is difficult, and device-tree
+access is needed to determine if there is a crash data waiting.
+To work around this problem, all but 256MB of RAM is reserved
+during early boot. A short while later in boot, a check is made
+to determine if there is dump data waiting. If there isn't,
+then the reserved memory is released to general kernel use.
+If there is dump data, then the /sys/kernel/release_region
+file is created, and the reserved memory is held.
+
+If there is no waiting dump data, then all but 256MB of the
+reserved ram will be released for general kernel use. The
+highest 256 MB of RAM will *not* be released: this region
+will be kept permanently reserved, so that it can act as
+a receptacle for a copy of the low 256MB in the case a crash
+does occur. See, however, "open issues" below, as to whether
+such a reserved region is really needed.
+
+Currently the dump will be copied from /proc/vmcore to a
+a new file upon user intervention. The starting address
+to be read and the range for each data point in provided
+in /sys/kernel/release_region.
+
+The tools to examine the dump will be same as the ones
+used for kdump.
+
+
+General notes:
+--------------
+Security: please note that there are potential security issues
+with any sort of dump mechanism. In particular, plaintext
+(unencrypted) data, and possibly passwords, may be present in
+the dump data. Userspace tools must take adequate precautions to
+preserve security.
+
+Open issues/ToDo:
+------------
+ o The various code paths that tell the hypervisor that a crash
+   occurred, vs. it simply being a normal reboot, should be
+   reviewed, and possibly clarified/fixed.
+
+ o Instead of using /sys/kernel, should there be a /sys/dump
+   instead? There is a dump_subsys being created by the s390 code,
+   perhaps the pseries code should use a similar layout as well.
+
+ o Is reserving a 256MB region really required? The goal of
+   reserving a 256MB scratch area is to make sure that no
+   important crash data is clobbered when the hypervisor
+   save low mem to the scratch area. But, if one could assure
+   that nothing important is located in some 256MB area, then
+   it would not need to be reserved. Something that can be
+   improved in subsequent versions.
+
+ o Still working the kdump team to integrate this with kdump,
+   some work remains but this would not affect the current
+   patches.
+
+ o Still need to write a shell script, to copy the dump away.
+   Currently I am parsing it manually.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
@ 2008-01-22 19:29 ` Manish Ahuja
  2008-01-22 21:00   ` Manish Ahuja
  2008-02-07  0:42   ` Paul Mackerras
  2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:29 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: Larry Kessler, Michael Strosaker


Initial patch for reserving memory in early boot, and freeing it later.
If the previous boot had ended with a crash, the reserved memory would contain
a copy of the crashed kernel data.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>

----
 arch/powerpc/kernel/prom.c                 |   46 ++++++++++++++++++
 arch/powerpc/kernel/rtas.c                 |   27 +++++++++++
 arch/powerpc/platforms/pseries/Makefile    |    1 
 arch/powerpc/platforms/pseries/phyp_dump.c |   71 +++++++++++++++++++++++++++++
 include/asm-powerpc/phyp_dump.h            |   37 +++++++++++++++
 include/asm/rtas.h                         |    3 +
 6 files changed, 185 insertions(+)

Index: 2.6.24-rc5/include/asm-powerpc/phyp_dump.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/include/asm-powerpc/phyp_dump.h	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,37 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyright (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PPC64_PHYP_DUMP_H
+#define _PPC64_PHYP_DUMP_H
+
+#ifdef CONFIG_PHYP_DUMP
+
+/* The RMR region will be saved for later dumping
+ * whenever the kernel crashes. Set this to 256MB. */
+#define PHYP_DUMP_RMR_START 0x0
+#define PHYP_DUMP_RMR_END   (1UL<<28)
+
+struct phyp_dump {
+	/* Memory that is reserved during very early boot. */
+	unsigned long init_reserve_start;
+	unsigned long init_reserve_size;
+	/* Check status during boot if dump active & present*/
+	unsigned long phyp_dump_is_active;
+	/* store cpu & hpte size */
+	unsigned long cpu_state_size;
+	unsigned long hpte_region_size;
+};
+
+extern struct phyp_dump *phyp_dump_info;
+
+#endif /* CONFIG_PHYP_DUMP */
+#endif /* _PPC64_PHYP_DUMP_H */
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,71 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyrhgit (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/pfn.h>
+#include <linux/swap.h>
+
+#include <asm/page.h>
+#include <asm/phyp_dump.h>
+
+/* Global, used to communicate data between early boot and late boot */
+static struct phyp_dump phyp_dump_global;
+struct phyp_dump *phyp_dump_info = &phyp_dump_global;
+
+/**
+ * release_memory_range -- release memory previously lmb_reserved
+ * @start_pfn: starting physical frame number
+ * @nr_pages: number of pages to free.
+ *
+ * This routine will release memory that had been previously
+ * lmb_reserved in early boot. The released memory becomes
+ * available for genreal use.
+ */
+static void
+release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct page *rpage;
+	unsigned long end_pfn;
+	long i;
+
+	end_pfn = start_pfn + nr_pages;
+
+	for (i=start_pfn; i <= end_pfn; i++) {
+		rpage = pfn_to_page(i);
+		if (PageReserved(rpage)) {
+			ClearPageReserved(rpage);
+			init_page_count(rpage);
+			__free_page(rpage);
+			totalram_pages++;
+		}
+	}
+}
+
+static int __init phyp_dump_setup(void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Release memory that was reserved in early boot */
+	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
+	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
+	release_memory_range(start_pfn, nr_pages);
+
+	return 0;
+}
+
+subsys_initcall(phyp_dump_setup);
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:33.000000000 -0600
@@ -18,3 +18,4 @@ obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu
 obj-$(CONFIG_HVC_CONSOLE)	+= hvconsole.o
 obj-$(CONFIG_HVCS)		+= hvcserver.o
 obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst.o
+obj-$(CONFIG_PHYP_DUMP)	+= phyp_dump.o
Index: 2.6.24-rc5/arch/powerpc/kernel/prom.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/prom.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/prom.c	2008-01-18 07:37:33.000000000 -0600
@@ -51,6 +51,7 @@
 #include <asm/machdep.h>
 #include <asm/pSeries_reconfig.h>
 #include <asm/pci-bridge.h>
+#include <asm/phyp_dump.h>
 #include <asm/kexec.h>
 
 #ifdef DEBUG
@@ -1011,6 +1012,48 @@ static void __init early_reserve_mem(voi
 #endif
 }
 
+#ifdef CONFIG_PHYP_DUMP
+
+/**
+ * reserve_crashed_mem() - reserve all not-yet-dumped mmemory
+ *
+ * This routine will reserve almost all of the memory in the
+ * system, except for a few hundred megabytes used to boot the
+ * new kernel. As the reserved memory is dumped to the dump
+ * device (by userland tools), it will be freed and made available.
+ */
+static void __init reserve_crashed_mem(void)
+{
+	unsigned long base, size;
+
+	if (phyp_dump_info->phyp_dump_is_active) {
+		/* Reserve *everything* above RMR. We'll free this real soon.*/
+		base = PHYP_DUMP_RMR_END;
+		size = lmb_end_of_DRAM() - base;
+
+		/* XXX crashed_ram_end is wrong, since it may be beyond
+	 	* the memory_limit, it will need to be adjusted. */
+		lmb_reserve(base, size);
+
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+	else {
+		size = phyp_dump_info->cpu_state_size +
+			phyp_dump_info->hpte_region_size +
+			PHYP_DUMP_RMR_END;
+		base = lmb_end_of_DRAM() - size;
+	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
+		lmb_reserve(base, size);
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+}
+#else
+static inline void __init reserve_crashed_mem(void) {}
+#endif /* CONFIG_PHYP_DUMP */
+
+
 void __init early_init_devtree(void *params)
 {
 	DBG(" -> early_init_devtree(%p)\n", params);
@@ -1022,6 +1065,8 @@ void __init early_init_devtree(void *par
 	/* Some machines might need RTAS info for debugging, grab it now. */
 	of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
 #endif
+	/* scan tree to see if dump occured during last boot */
+	of_scan_flat_dt(early_init_dt_scan_phyp_dump, NULL);
 
 	/* Retrieve various informations from the /chosen node of the
 	 * device-tree, including the platform type, initrd location and
@@ -1043,6 +1088,7 @@ void __init early_init_devtree(void *par
 	reserve_kdump_trampoline();
 	reserve_crashkernel();
 	early_reserve_mem();
+	reserve_crashed_mem();
 
 	lmb_enforce_memory_limit(memory_limit);
 	lmb_analyze();
Index: 2.6.24-rc5/arch/powerpc/kernel/rtas.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:33.000000000 -0600
@@ -39,6 +39,7 @@
 #include <asm/syscalls.h>
 #include <asm/smp.h>
 #include <asm/atomic.h>
+#include <asm/phyp_dump.h>
 
 struct rtas_t rtas = {
 	.lock = SPIN_LOCK_UNLOCKED
@@ -883,6 +884,32 @@ void __init rtas_initialize(void)
 #endif
 }
 
+int __init early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+#ifdef CONFIG_PHYP_DUMP
+	const unsigned int *sizes;
+
+	phyp_dump_info->phyp_dump_is_active = 0;
+	if (depth != 1 || strcmp(uname, "rtas") != 0)
+		return 0;
+
+	if (of_get_flat_dt_prop(node, "ibm,dump-kernel", NULL))
+		phyp_dump_info->phyp_dump_is_active++;
+
+	sizes = of_get_flat_dt_prop(node, "ibm,configure-kernel-dump-sizes", NULL);
+	if (!sizes)
+		return 0;
+
+	if (sizes[0] == 1)
+		phyp_dump_info->cpu_state_size = *((unsigned long *)&sizes[1]);
+
+	if (sizes[3] == 2)
+		phyp_dump_info->hpte_region_size = *((unsigned long *)&sizes[4]);
+#endif
+	return 1;
+}
+
 int __init early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data)
 {
Index: 2.6.24-rc5/include/asm/rtas.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/rtas.h	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/include/asm/rtas.h	2008-01-18 07:37:33.000000000 -0600
@@ -183,6 +183,9 @@ extern unsigned int rtas_busy_delay(int 
 
 extern int early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data);
+int early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data);
+
 
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 19:33 ` Manish Ahuja
  2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:33 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: lkessler, strosake


Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-sapce tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  102 +++++++++++++++++++++++++++--
 1 file changed, 96 insertions(+), 6 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 22:43:00.000000000 -0600
@@ -12,17 +12,24 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
+#include <asm/rtas.h>
 
 /* Global, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+static int ibm_configure_kernel_dump;
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -52,20 +59,103 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t
+store_release_region(struct kset *kset, const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
 
-	/* If no memory was reserved in early boot, there is nothing to do */
-	if (phyp_dump_info->init_reserve_size == 0)
-		return 0;
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range (start_pfn, nr_pages);
 
-	/* Release memory that was reserved in early boot */
+	return count;
+}
+
+static ssize_t
+show_release_region(struct kset * kset, char *buf)
+{
+	return sprintf(buf, "ola\n");
+}
+
+static struct subsys_attribute rr = __ATTR(release_region, 0600,
+					 show_release_region,
+					 store_release_region);
+
+/* ------------------------------------------------- */
+
+static void release_all (void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* Release all memory that was reserved in early boot */
 	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
 	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
 	release_memory_range(start_pfn, nr_pages);
+}
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header;
+	int header_len = 0;
+	int rc;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Return if phyp dump not supported */
+	ibm_configure_kernel_dump = rtas_token("ibm,configure-kernel-dump");
+	if (ibm_configure_kernel_dump == RTAS_UNKNOWN_SERVICE) {
+		release_all();
+		return -ENOSYS;
+	}
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	if (dump_header == NULL) {
+		release_all();
+		return 0;
+	}
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = subsys_create_file(&kernel_subsys, &rr);
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
+		release_all();
+		return 0;
+	}
 
 	return 0;
 }
-
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (2 preceding siblings ...)
  2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
@ 2008-01-22 20:09 ` Manish Ahuja
  2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 20:09 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas

Initial patch for reserving memory in early boot, and freeing it later.
If the previous boot had ended with a crash, the reserved memory would contain
a copy of the crashed kernel data.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

----
 arch/powerpc/kernel/prom.c                 |   46 ++++++++++++++++++
 arch/powerpc/kernel/rtas.c                 |   27 +++++++++++
 arch/powerpc/platforms/pseries/Makefile    |    1 
 arch/powerpc/platforms/pseries/phyp_dump.c |   71 +++++++++++++++++++++++++++++
 include/asm-powerpc/phyp_dump.h            |   37 +++++++++++++++
 include/asm/rtas.h                         |    3 +
 6 files changed, 185 insertions(+)

Index: 2.6.24-rc5/include/asm-powerpc/phyp_dump.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/include/asm-powerpc/phyp_dump.h	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,37 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyright (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PPC64_PHYP_DUMP_H
+#define _PPC64_PHYP_DUMP_H
+
+#ifdef CONFIG_PHYP_DUMP
+
+/* The RMR region will be saved for later dumping
+ * whenever the kernel crashes. Set this to 256MB. */
+#define PHYP_DUMP_RMR_START 0x0
+#define PHYP_DUMP_RMR_END   (1UL<<28)
+
+struct phyp_dump {
+	/* Memory that is reserved during very early boot. */
+	unsigned long init_reserve_start;
+	unsigned long init_reserve_size;
+	/* Check status during boot if dump active & present*/
+	unsigned long phyp_dump_is_active;
+	/* store cpu & hpte size */
+	unsigned long cpu_state_size;
+	unsigned long hpte_region_size;
+};
+
+extern struct phyp_dump *phyp_dump_info;
+
+#endif /* CONFIG_PHYP_DUMP */
+#endif /* _PPC64_PHYP_DUMP_H */
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,71 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyrhgit (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/pfn.h>
+#include <linux/swap.h>
+
+#include <asm/page.h>
+#include <asm/phyp_dump.h>
+
+/* Global, used to communicate data between early boot and late boot */
+static struct phyp_dump phyp_dump_global;
+struct phyp_dump *phyp_dump_info = &phyp_dump_global;
+
+/**
+ * release_memory_range -- release memory previously lmb_reserved
+ * @start_pfn: starting physical frame number
+ * @nr_pages: number of pages to free.
+ *
+ * This routine will release memory that had been previously
+ * lmb_reserved in early boot. The released memory becomes
+ * available for genreal use.
+ */
+static void
+release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct page *rpage;
+	unsigned long end_pfn;
+	long i;
+
+	end_pfn = start_pfn + nr_pages;
+
+	for (i=start_pfn; i <= end_pfn; i++) {
+		rpage = pfn_to_page(i);
+		if (PageReserved(rpage)) {
+			ClearPageReserved(rpage);
+			init_page_count(rpage);
+			__free_page(rpage);
+			totalram_pages++;
+		}
+	}
+}
+
+static int __init phyp_dump_setup(void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Release memory that was reserved in early boot */
+	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
+	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
+	release_memory_range(start_pfn, nr_pages);
+
+	return 0;
+}
+
+subsys_initcall(phyp_dump_setup);
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:33.000000000 -0600
@@ -18,3 +18,4 @@ obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu
 obj-$(CONFIG_HVC_CONSOLE)	+= hvconsole.o
 obj-$(CONFIG_HVCS)		+= hvcserver.o
 obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst.o
+obj-$(CONFIG_PHYP_DUMP)	+= phyp_dump.o
Index: 2.6.24-rc5/arch/powerpc/kernel/prom.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/prom.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/prom.c	2008-01-18 07:37:33.000000000 -0600
@@ -51,6 +51,7 @@
 #include <asm/machdep.h>
 #include <asm/pSeries_reconfig.h>
 #include <asm/pci-bridge.h>
+#include <asm/phyp_dump.h>
 #include <asm/kexec.h>
 
 #ifdef DEBUG
@@ -1011,6 +1012,48 @@ static void __init early_reserve_mem(voi
 #endif
 }
 
+#ifdef CONFIG_PHYP_DUMP
+
+/**
+ * reserve_crashed_mem() - reserve all not-yet-dumped mmemory
+ *
+ * This routine will reserve almost all of the memory in the
+ * system, except for a few hundred megabytes used to boot the
+ * new kernel. As the reserved memory is dumped to the dump
+ * device (by userland tools), it will be freed and made available.
+ */
+static void __init reserve_crashed_mem(void)
+{
+	unsigned long base, size;
+
+	if (phyp_dump_info->phyp_dump_is_active) {
+		/* Reserve *everything* above RMR. We'll free this real soon.*/
+		base = PHYP_DUMP_RMR_END;
+		size = lmb_end_of_DRAM() - base;
+
+		/* XXX crashed_ram_end is wrong, since it may be beyond
+	 	* the memory_limit, it will need to be adjusted. */
+		lmb_reserve(base, size);
+
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+	else {
+		size = phyp_dump_info->cpu_state_size +
+			phyp_dump_info->hpte_region_size +
+			PHYP_DUMP_RMR_END;
+		base = lmb_end_of_DRAM() - size;
+	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
+		lmb_reserve(base, size);
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+}
+#else
+static inline void __init reserve_crashed_mem(void) {}
+#endif /* CONFIG_PHYP_DUMP */
+
+
 void __init early_init_devtree(void *params)
 {
 	DBG(" -> early_init_devtree(%p)\n", params);
@@ -1022,6 +1065,8 @@ void __init early_init_devtree(void *par
 	/* Some machines might need RTAS info for debugging, grab it now. */
 	of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
 #endif
+	/* scan tree to see if dump occured during last boot */
+	of_scan_flat_dt(early_init_dt_scan_phyp_dump, NULL);
 
 	/* Retrieve various informations from the /chosen node of the
 	 * device-tree, including the platform type, initrd location and
@@ -1043,6 +1088,7 @@ void __init early_init_devtree(void *par
 	reserve_kdump_trampoline();
 	reserve_crashkernel();
 	early_reserve_mem();
+	reserve_crashed_mem();
 
 	lmb_enforce_memory_limit(memory_limit);
 	lmb_analyze();
Index: 2.6.24-rc5/arch/powerpc/kernel/rtas.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:33.000000000 -0600
@@ -39,6 +39,7 @@
 #include <asm/syscalls.h>
 #include <asm/smp.h>
 #include <asm/atomic.h>
+#include <asm/phyp_dump.h>
 
 struct rtas_t rtas = {
 	.lock = SPIN_LOCK_UNLOCKED
@@ -883,6 +884,32 @@ void __init rtas_initialize(void)
 #endif
 }
 
+int __init early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+#ifdef CONFIG_PHYP_DUMP
+	const unsigned int *sizes;
+
+	phyp_dump_info->phyp_dump_is_active = 0;
+	if (depth != 1 || strcmp(uname, "rtas") != 0)
+		return 0;
+
+	if (of_get_flat_dt_prop(node, "ibm,dump-kernel", NULL))
+		phyp_dump_info->phyp_dump_is_active++;
+
+	sizes = of_get_flat_dt_prop(node, "ibm,configure-kernel-dump-sizes", NULL);
+	if (!sizes)
+		return 0;
+
+	if (sizes[0] == 1)
+		phyp_dump_info->cpu_state_size = *((unsigned long *)&sizes[1]);
+
+	if (sizes[3] == 2)
+		phyp_dump_info->hpte_region_size = *((unsigned long *)&sizes[4]);
+#endif
+	return 1;
+}
+
 int __init early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data)
 {
Index: 2.6.24-rc5/include/asm/rtas.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/rtas.h	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/include/asm/rtas.h	2008-01-18 07:37:33.000000000 -0600
@@ -183,6 +183,9 @@ extern unsigned int rtas_busy_delay(int 
 
 extern int early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data);
+int early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data);
+
 
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (3 preceding siblings ...)
  2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 20:15 ` Manish Ahuja
  2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 20:15 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: mahuja, linasvepstas, lkessler, strosake


Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  136 +++++++++++++++++++++++++++--
 1 file changed, 129 insertions(+), 7 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 22:43:00.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:49:23.000000000 -0600
@@ -30,6 +30,117 @@ struct phyp_dump *phyp_dump_info = &phyp
 static int ibm_configure_kernel_dump;
 
 /* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS 3
+#define DUMP_HEADER_VERSION 0x1
+#define DUMP_REQUEST_FLAG 0x1
+#define DUMP_SOURCE_CPU 0x0001
+#define DUMP_SOURCE_HPTE 0x0002
+#define DUMP_SOURCE_RMO  0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+	{
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+	}
+}
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -125,7 +236,9 @@ static void release_all (void)
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header;
+	const struct phyp_dump_header *dump_header;
+	unsigned long dump_area_start;
+	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -140,22 +253,31 @@ static int __init phyp_dump_setup(void)
 		return -ENOSYS;
 	}
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	of_node_put(rtas);
+
+	dump_area_length = init_dump_header(&phdr);
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK; /* align down */
+
 	if (dump_header == NULL) {
-		release_all();
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
 	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = subsys_create_file(&kernel_subsys, &rr);
-	if (rc) {
+	if (rc)
 		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
-		release_all();
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 21:00   ` Manish Ahuja
  2008-02-07  0:42   ` Paul Mackerras
  1 sibling, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:00 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas

Reposted this one. I got the email id wrong in this one.

Sorry about that. 

Manish

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 5/8] pseries: phyp dump: debugging print routines.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (4 preceding siblings ...)
  2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
@ 2008-01-22 21:02 ` Manish Ahuja
  2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:02 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


Provide some basic debugging support.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
-----

 arch/powerpc/platforms/pseries/phyp_dump.c |   64 +++++++++++++++++++++++++++--
 1 file changed, 60 insertions(+), 4 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 02:51:54.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 02:58:41.000000000 -0600
@@ -2,7 +2,7 @@
  * Hypervisor-assisted dump
  *
  * Linas Vepstas, Manish Ahuja 2007
- * Copyrhgit (c) 2007 IBM Corp.
+ * Copyright (c) 2007 IBM Corp.
  *
  *      This program is free software; you can redistribute it and/or
  *      modify it under the terms of the GNU General Public License
@@ -122,6 +122,61 @@ static unsigned long init_dump_header(st
 	return addr_offset;
 }
 
+static void print_dump_header(const struct phyp_dump_header *ph)
+{
+#ifdef DEBUG
+	printk(KERN_INFO "dump header:\n");
+	/* setup some ph->sections required */
+	printk(KERN_INFO "version = %d\n", ph->version);
+	printk(KERN_INFO "Sections = %d\n", ph->num_of_sections);
+	printk(KERN_INFO "Status = 0x%x\n", ph->status);
+
+	/* No ph->disk, so all should be set to 0 */
+	printk(KERN_INFO "Offset to first section 0x%x\n",
+						ph->first_offset_section);
+	printk(KERN_INFO "dump disk sections should be zero\n");
+	printk(KERN_INFO "dump disk section = %d\n", ph->dump_disk_section);
+	printk(KERN_INFO "block num = %ld\n", ph->block_num_dd);
+	printk(KERN_INFO "number of blocks = %ld\n", ph->num_of_blocks_dd);
+	printk(KERN_INFO "dump disk offset = %d\n", ph->offset_dd);
+	printk(KERN_INFO "Max auto time= %d\n", ph->maxtime_to_auto);
+
+	/*set cpu state and hpte states as well scratch pad area */
+	printk(KERN_INFO " CPU AREA \n");
+	printk(KERN_INFO "cpu dump_flags =%d\n", ph->cpu_data.dump_flags);
+	printk(KERN_INFO "cpu source_type =%d\n", ph->cpu_data.source_type);
+	printk(KERN_INFO "cpu error_flags =%d\n", ph->cpu_data.error_flags);
+	printk(KERN_INFO "cpu source_address =%lx\n",
+						ph->cpu_data.source_address);
+	printk(KERN_INFO "cpu source_length =%lx\n",
+						ph->cpu_data.source_length);
+	printk(KERN_INFO "cpu length_copied =%lx\n",
+						ph->cpu_data.length_copied);
+
+	printk(KERN_INFO " HPTE AREA \n");
+	printk(KERN_INFO "HPTE dump_flags =%d\n", ph->hpte_data.dump_flags);
+	printk(KERN_INFO "HPTE source_type =%d\n", ph->hpte_data.source_type);
+	printk(KERN_INFO "HPTE error_flags =%d\n", ph->hpte_data.error_flags);
+	printk(KERN_INFO "HPTE source_address =%lx\n",
+						ph->hpte_data.source_address);
+	printk(KERN_INFO "HPTE source_length =%lx\n",
+						ph->hpte_data.source_length);
+	printk(KERN_INFO "HPTE length_copied =%lx\n",
+						ph->hpte_data.length_copied);
+
+	printk(KERN_INFO " SRSD AREA \n");
+	printk(KERN_INFO "SRSD dump_flags =%d\n", ph->kernel_data.dump_flags);
+	printk(KERN_INFO "SRSD source_type =%d\n", ph->kernel_data.source_type);
+	printk(KERN_INFO "SRSD error_flags =%d\n", ph->kernel_data.error_flags);
+	printk(KERN_INFO "SRSD source_address =%lx\n",
+						ph->kernel_data.source_address);
+	printk(KERN_INFO "SRSD source_length =%lx\n",
+						ph->kernel_data.source_length);
+	printk(KERN_INFO "SRSD length_copied =%lx\n",
+						ph->kernel_data.length_copied);
+#endif
+}
+
 static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
 {
 	int rc;
@@ -134,9 +189,9 @@ static void register_dump_area(struct ph
 		               1, ph, sizeof(struct phyp_dump_header));
 	} while (rtas_busy_delay(rc));
 
-	if (rc)
-	{
-		printk (KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+	if (rc) {
+		printk(KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+		print_dump_header(ph);
 	}
 }
 
@@ -249,6 +304,7 @@ static int __init phyp_dump_setup(void)
 		release_all();
 		return -ENOSYS;
 	}
+	print_dump_header(dump_header);
 
 	/* Is there dump data waiting for us? If there isn't,
 	 * then register a new dump area, and release all of

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (5 preceding siblings ...)
  2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
@ 2008-01-22 21:05 ` Manish Ahuja
  2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
  2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:05 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


Routines to invalidate and unregister dump routines. 
Unregister has not been used yet, I will release another
patch for that at a later stage with the kdump integration patches.

There is also a routine which calculates the regions to be
freed and exports that through sysfs.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
-----

---
 arch/powerpc/platforms/pseries/phyp_dump.c |  101 +++++++++++++++++++++++++----
 include/asm/phyp_dump.h                    |    3 
 2 files changed, 93 insertions(+), 11 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:06:20.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:49:10.000000000 -0600
@@ -69,6 +69,10 @@ static struct phyp_dump_header phdr;
 #define DUMP_SOURCE_CPU 0x0001
 #define DUMP_SOURCE_HPTE 0x0002
 #define DUMP_SOURCE_RMO  0x0011
+#define DUMP_ERROR_FLAG 0x2000
+#define DUMP_TRIGGERED 0x4000
+#define DUMP_PERFORMED 0x8000
+
 
 /**
  * init_dump_header() - initialize the header declaring a dump
@@ -180,9 +184,15 @@ static void print_dump_header(const stru
 static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
 {
 	int rc;
-	ph->cpu_data.destination_address += addr;
-	ph->hpte_data.destination_address += addr;
-	ph->kernel_data.destination_address += addr;
+
+	/* Add addr value if not initialized before */
+	if (ph->cpu_data.destination_address == 0) {
+		ph->cpu_data.destination_address += addr;
+		ph->hpte_data.destination_address += addr;
+		ph->kernel_data.destination_address += addr;
+	}
+
+	/* ToDo Invalidate kdump and free memory range. */
 
 	do {
 		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
@@ -195,6 +205,46 @@ static void register_dump_area(struct ph
 	}
 }
 
+static
+void invalidate_last_dump(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+
+	/* Add addr value if not initialized before */
+	if (ph->cpu_data.destination_address == 0) {
+		ph->cpu_data.destination_address += addr;
+		ph->hpte_data.destination_address += addr;
+		ph->kernel_data.destination_address += addr;
+	}
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               2, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) "
+						"on invalidate\n", rc);
+		print_dump_header(ph);
+	}
+}
+
+static void unregister_dump_area(struct phyp_dump_header *ph)
+{
+	int rc;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               3, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) "
+						"on unregister\n", rc);
+		print_dump_header(ph);
+	}
+}
+
 /* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
@@ -205,8 +255,8 @@ static void register_dump_area(struct ph
  * lmb_reserved in early boot. The released memory becomes
  * available for genreal use.
  */
-static void
-release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+static
+void release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
 {
 	struct page *rpage;
 	unsigned long end_pfn;
@@ -237,8 +287,8 @@ release_memory_range(unsigned long start
  *
  * will release 256MB starting at 1GB.
  */
-static ssize_t
-store_release_region(struct kset *kset, const char *buf, size_t count)
+static
+ssize_t store_release_region(struct kset *kset, const char *buf, size_t count)
 {
 	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
@@ -266,10 +316,23 @@ store_release_region(struct kset *kset, 
 	return count;
 }
 
-static ssize_t
-show_release_region(struct kset * kset, char *buf)
+static ssize_t show_release_region(struct kset * kset, char *buf)
 {
-	return sprintf(buf, "ola\n");
+	u64 second_addr_range;
+
+	/* total reserved size - start of scratch area */
+	second_addr_range = phyp_dump_info->init_reserve_size -
+				phyp_dump_info->reserved_scratch_size;
+	return sprintf(buf, "CPU:0x%lx-0x%lx: HPTE:0x%lx-0x%lx:"
+			    " DUMP:0x%lx-0x%lx, 0x%lx-0x%lx:\n",
+		phdr.cpu_data.destination_address,
+		phdr.cpu_data.length_copied,
+		phdr.hpte_data.destination_address,
+		phdr.hpte_data.length_copied,
+		phdr.kernel_data.destination_address,
+		phdr.kernel_data.length_copied,
+		phyp_dump_info->init_reserve_start,
+		second_addr_range);
 }
 
 static struct subsys_attribute rr = __ATTR(release_region, 0600,
@@ -307,7 +370,6 @@ static int __init phyp_dump_setup(void)
 		release_all();
 		return -ENOSYS;
 	}
-	print_dump_header(dump_header);
 
 	/* Is there dump data waiting for us? If there isn't,
 	 * then register a new dump area, and release all of
@@ -319,6 +381,7 @@ static int __init phyp_dump_setup(void)
 	rtas = of_find_node_by_path("/rtas");
 	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
 	of_node_put(rtas);
+	print_dump_header(dump_header);
 
 	dump_area_length = init_dump_header(&phdr);
 	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK; /* align down */
@@ -328,6 +391,22 @@ static int __init phyp_dump_setup(void)
 		return 0;
 	}
 
+	/* re-register the dump area, if old dump was invalid */
+	if ((dump_header) && (dump_header->status & DUMP_ERROR_FLAG)) {
+		invalidate_last_dump(&phdr, dump_area_start);
+		register_dump_area(&phdr, dump_area_start);
+		return 0;
+	}
+
+	if (dump_header) {
+		phyp_dump_info->reserved_scratch_addr =
+				dump_header->cpu_data.destination_address;
+		phyp_dump_info->reserved_scratch_size =
+				dump_header->cpu_data.source_length +
+				dump_header->hpte_data.source_length +
+				dump_header->kernel_data.source_length;
+	}
+
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = subsys_create_file(&kernel_subsys, &rr);
 	if (rc)
Index: 2.6.24-rc5/include/asm/phyp_dump.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/phyp_dump.h	2008-01-21 22:21:01.000000000 -0600
+++ 2.6.24-rc5/include/asm/phyp_dump.h	2008-01-21 23:29:05.000000000 -0600
@@ -29,6 +29,9 @@ struct phyp_dump {
 	/* store cpu & hpte size */
 	unsigned long cpu_state_size;
 	unsigned long hpte_region_size;
+	/* previous scratch area values */
+	unsigned long reserved_scratch_addr;
+	unsigned long reserved_scratch_size;
 };
 
 extern struct phyp_dump *phyp_dump_info;

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 7/8] pseries: phyp dump: Tracking memory range freed.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (6 preceding siblings ...)
  2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
@ 2008-01-22 21:07 ` Manish Ahuja
  2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:07 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


This patch tracks the size freed. For now it does a simple
rudimentary calculation of the ranges freed. The idea is
to keep it simple at the external shell script level and 
send in large chunks for now.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
-----

---
 arch/powerpc/platforms/pseries/phyp_dump.c |   35 +++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:30:18.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:42:04.000000000 -0600
@@ -275,6 +275,39 @@ void release_memory_range(unsigned long 
 	}
 }
 
+/**
+ * track_freed_range -- Counts the range being freed.
+ * Once the counter goes to zero, it re-registers dump for
+ * future use.
+ */
+static void
+track_freed_range(unsigned long addr, unsigned long length)
+{
+	static unsigned long scratch_area_size, reserved_area_size;
+
+	if (addr < phyp_dump_info->init_reserve_start)
+		return;
+
+	if ((addr >= phyp_dump_info->init_reserve_start) &&
+	    (addr <= phyp_dump_info->init_reserve_start +
+	     phyp_dump_info->init_reserve_size))
+		reserved_area_size += length;
+
+	if ((addr >= phyp_dump_info->reserved_scratch_addr) &&
+	    (addr <= phyp_dump_info->reserved_scratch_addr +
+	     phyp_dump_info->reserved_scratch_size))
+		scratch_area_size += length;
+
+	if ((reserved_area_size == phyp_dump_info->init_reserve_size) &&
+	    (scratch_area_size == phyp_dump_info->reserved_scratch_size)) {
+
+		invalidate_last_dump(&phdr,
+				phyp_dump_info->reserved_scratch_addr);
+		register_dump_area (&phdr,
+				phyp_dump_info->reserved_scratch_addr);
+	}
+}
+
 /* ------------------------------------------------- */
 /**
  * sysfs_release_region -- sysfs interface to release memory range.
@@ -298,6 +331,8 @@ ssize_t store_release_region(struct kset
 	if (ret != 2)
 		return -EINVAL;
 
+	track_freed_range(start_addr, length);
+
 	/* Range-check - don't free any reserved memory that
 	 * wasn't reserved for phyp-dump */
 	if (start_addr < phyp_dump_info->init_reserve_start)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 8/8] pseries: phyp dump: config file
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (7 preceding siblings ...)
  2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
@ 2008-01-22 21:09 ` Manish Ahuja
  8 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:09 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake

To: linuxppc-dev@ozlabs.org


Add hypervisor-assisted dump to kernel config

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

-----
 arch/powerpc/Kconfig |   11 +++++++++++
 1 file changed, 11 insertions(+)

Index: linux-2.6.24-rc2-git4/arch/powerpc/Kconfig
===================================================================
--- linux-2.6.24-rc2-git4.orig/arch/powerpc/Kconfig	2007-11-14 16:39:20.000000000 -0600
+++ linux-2.6.24-rc2-git4/arch/powerpc/Kconfig	2007-11-15 14:27:33.000000000 -0600
@@ -261,6 +261,17 @@ config CRASH_DUMP
 
 	  Don't change this unless you know what you are doing.
 
+config PHYP_DUMP
+	bool "Hypervisor-assisted dump (EXPERIMENTAL)"
+	depends on PPC_PSERIES && EXPERIMENTAL
+	default y
+	help
+	  Hypervisor-assisted dump is meant to be a kdump replacement
+	  offering robustness and speed not possible without system
+	  hypervisor assistence.
+
+	  If unsure, say "Y"
+
 config PPCBUG_NVRAM
 	bool "Enable reading PPCBUG NVRAM during boot" if PPLUS || LOPEC
 	default y if PPC_PREP

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
  2008-01-22 21:00   ` Manish Ahuja
@ 2008-02-07  0:42   ` Paul Mackerras
  2008-02-11 18:29     ` Manish Ahuja
  1 sibling, 1 reply; 26+ messages in thread
From: Paul Mackerras @ 2008-02-07  0:42 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: ppc-dev, linasvepstas, Larry Kessler, Michael Strosaker

Manish Ahuja writes:

> Initial patch for reserving memory in early boot, and freeing it later.
> If the previous boot had ended with a crash, the reserved memory would contain
> a copy of the crashed kernel data.

[snip]

> +static void __init reserve_crashed_mem(void)
> +{
> +	unsigned long base, size;
> +
> +	if (phyp_dump_info->phyp_dump_is_active) {
> +		/* Reserve *everything* above RMR. We'll free this real soon.*/
> +		base = PHYP_DUMP_RMR_END;
> +		size = lmb_end_of_DRAM() - base;
> +
> +		/* XXX crashed_ram_end is wrong, since it may be beyond
> +	 	* the memory_limit, it will need to be adjusted. */
> +		lmb_reserve(base, size);
> +
> +		phyp_dump_info->init_reserve_start = base;
> +		phyp_dump_info->init_reserve_size = size;
> +	}
> +	else {
> +		size = phyp_dump_info->cpu_state_size +
> +			phyp_dump_info->hpte_region_size +
> +			PHYP_DUMP_RMR_END;
> +		base = lmb_end_of_DRAM() - size;
> +	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
> +		lmb_reserve(base, size);

This is still reserving memory even on systems that aren't running on
pHyp at all.  Please rework this so that no memory is reserved if the
system doesn't support phyp-assisted dump.

Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-02-07  0:42   ` Paul Mackerras
@ 2008-02-11 18:29     ` Manish Ahuja
  0 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-02-11 18:29 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Manish Ahuja, ppc-dev, linasvepstas, Larry Kessler,
	Michael Strosaker

Sorry,

I think i sent the wrong patch file, it shouldn't have my printk statement in there. Let me re-send 
the correct file and let me test it once more to make sure it does the right thing.

-Manish



Paul Mackerras wrote:
> Manish Ahuja writes:
> 
>> Initial patch for reserving memory in early boot, and freeing it later.
>> If the previous boot had ended with a crash, the reserved memory would contain
>> a copy of the crashed kernel data.
> 
> [snip]
> 
>> +static void __init reserve_crashed_mem(void)
>> +{
>> +	unsigned long base, size;
>> +
>> +	if (phyp_dump_info->phyp_dump_is_active) {
>> +		/* Reserve *everything* above RMR. We'll free this real soon.*/
>> +		base = PHYP_DUMP_RMR_END;
>> +		size = lmb_end_of_DRAM() - base;
>> +
>> +		/* XXX crashed_ram_end is wrong, since it may be beyond
>> +	 	* the memory_limit, it will need to be adjusted. */
>> +		lmb_reserve(base, size);
>> +
>> +		phyp_dump_info->init_reserve_start = base;
>> +		phyp_dump_info->init_reserve_size = size;
>> +	}
>> +	else {
>> +		size = phyp_dump_info->cpu_state_size +
>> +			phyp_dump_info->hpte_region_size +
>> +			PHYP_DUMP_RMR_END;
>> +		base = lmb_end_of_DRAM() - size;
>> +	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
>> +		lmb_reserve(base, size);
> 
> This is still reserving memory even on systems that aren't running on
> pHyp at all.  Please rework this so that no memory is reserved if the
> system doesn't support phyp-assisted dump.
> 
> Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-12  6:31 ` Manish Ahuja
@ 2008-02-12  7:11   ` Manish Ahuja
  2008-02-12 10:08     ` Stephen Rothwell
  2008-02-15  1:05     ` Tony Breeds
  0 siblings, 2 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-02-12  7:11 UTC (permalink / raw)
  To: linuxppc-dev, paulus; +Cc: mahuja, linasvepstas


Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-space tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |   88 +++++++++++++++++++++++++++--
 1 file changed, 82 insertions(+), 6 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-12 06:12:37.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-12 06:12:55.000000000 -0600
@@ -12,17 +12,24 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
+#include <asm/rtas.h>
 
 /* Global, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+static int ibm_configure_kernel_dump;
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -52,20 +59,89 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t
+store_release_region(struct kset *kset, const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
+
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range (start_pfn, nr_pages);
+
+	return count;
+}
+
+static ssize_t
+show_release_region(struct kset * kset, char *buf)
+{
+	return sprintf(buf, "ola\n");
+}
+
+static struct subsys_attribute rr = __ATTR(release_region, 0600,
+					 show_release_region,
+					 store_release_region);
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header;
+	int header_len = 0;
+	int rc;
 
 	/* If no memory was reserved in early boot, there is nothing to do */
 	if (phyp_dump_info->init_reserve_size == 0)
 		return 0;
 
-	/* Release memory that was reserved in early boot */
-	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
-	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
-	release_memory_range(start_pfn, nr_pages);
+	/* Return if phyp dump not supported */
+	if (!phyp_dump_info->phyp_dump_configured) {
+		return -ENOSYS;
+	}
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	if (dump_header == NULL) {
+		release_all();
+		return 0;
+	}
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = subsys_create_file(&kernel_subsys, &rr);
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
+		release_all();
+		return 0;
+	}
 
 	return 0;
 }
-
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-12  7:11   ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
@ 2008-02-12 10:08     ` Stephen Rothwell
  2008-02-12 16:40       ` Manish Ahuja
  2008-02-15  1:05     ` Tony Breeds
  1 sibling, 1 reply; 26+ messages in thread
From: Stephen Rothwell @ 2008-02-12 10:08 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]

Hi Manish,

Just a small comment.

On Tue, 12 Feb 2008 01:11:58 -0600 Manish Ahuja <ahuja@austin.ibm.com> wrote:
>
> +	/* Is there dump data waiting for us? */
> +	rtas = of_find_node_by_path("/rtas");
> +	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);

You need an of_node_put(rtas) here.

> +	if (dump_header == NULL) {
> +		release_all();
> +		return 0;
> +	}

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-12 10:08     ` Stephen Rothwell
@ 2008-02-12 16:40       ` Manish Ahuja
  0 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-02-12 16:40 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

As noted, its fixed in patch 4. 

If its okay for this time, I will prefer to leave it there.

-Manish


Stephen Rothwell wrote:
> Hi Manish,
> 
> Just a small comment.
> 
> On Tue, 12 Feb 2008 01:11:58 -0600 Manish Ahuja <ahuja@austin.ibm.com> wrote:
>> +	/* Is there dump data waiting for us? */
>> +	rtas = of_find_node_by_path("/rtas");
>> +	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
> 
> You need an of_node_put(rtas) here.
> 
>> +	if (dump_header == NULL) {
>> +		release_all();
>> +		return 0;
>> +	}
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-12  7:11   ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
  2008-02-12 10:08     ` Stephen Rothwell
@ 2008-02-15  1:05     ` Tony Breeds
  2008-02-15  7:17       ` Manish Ahuja
  2008-02-15 17:30       ` Linas Vepstas
  1 sibling, 2 replies; 26+ messages in thread
From: Tony Breeds @ 2008-02-15  1:05 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

On Tue, Feb 12, 2008 at 01:11:58AM -0600, Manish Ahuja wrote:

<snip>

> +static ssize_t
> +show_release_region(struct kset * kset, char *buf)
> +{
> +	return sprintf(buf, "ola\n");
> +}
> +
> +static struct subsys_attribute rr = __ATTR(release_region, 0600,
> +					 show_release_region,
> +					 store_release_region);

Any reason this sysfs attribute can't be write only? The show method
doesn't seem needed.

> +static int __init phyp_dump_setup(void)
> +{

<snip>

> +	/* Is there dump data waiting for us? */
> +	rtas = of_find_node_by_path("/rtas");
> +	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);

Hmm this isn't good.  You need to check rtas != NULL.

> +	if (dump_header == NULL) {
> +		release_all();
> +		return 0;
> +	}
> +
> +	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
> +	rc = subsys_create_file(&kernel_subsys, &rr);
> +	if (rc) {
> +		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
> +		release_all();
> +		return 0;
> +	}
>  
>  	return 0;
>  }
> -
>  subsys_initcall(phyp_dump_setup);

Hmm I think this really should be a:
	machine_subsys_initcall(pseries, phyp_dump_setup)

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-15  1:05     ` Tony Breeds
@ 2008-02-15  7:17       ` Manish Ahuja
  2008-02-15 22:32         ` Tony Breeds
  2008-02-15 17:30       ` Linas Vepstas
  1 sibling, 1 reply; 26+ messages in thread
From: Manish Ahuja @ 2008-02-15  7:17 UTC (permalink / raw)
  To: Tony Breeds; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

Tony Breeds wrote:
> On Tue, Feb 12, 2008 at 01:11:58AM -0600, Manish Ahuja wrote:
> 
> <snip>
> 
>> +static ssize_t
>> +show_release_region(struct kset * kset, char *buf)
>> +{
>> +	return sprintf(buf, "ola\n");
>> +}
>> +
>> +static struct subsys_attribute rr = __ATTR(release_region, 0600,
>> +					 show_release_region,
>> +					 store_release_region);
> 
> Any reason this sysfs attribute can't be write only? The show method
> doesn't seem needed.

yes, its used later in the code.

> 
>> +static int __init phyp_dump_setup(void)
>> +{
> 
> <snip>
> 
>> +	/* Is there dump data waiting for us? */
>> +	rtas = of_find_node_by_path("/rtas");
>> +	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
> 
> Hmm this isn't good.  You need to check rtas != NULL.


yes, will fix this as well.



> 
>> +	if (dump_header == NULL) {
>> +		release_all();
>> +		return 0;
>> +	}
>> +
>> +	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
>> +	rc = subsys_create_file(&kernel_subsys, &rr);
>> +	if (rc) {
>> +		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
>> +		release_all();
>> +		return 0;
>> +	}
>>  
>>  	return 0;
>>  }
>> -
>>  subsys_initcall(phyp_dump_setup);
> 
> Hmm I think this really should be a:
> 	machine_subsys_initcall(pseries, phyp_dump_setup)
> 
> Yours Tony
> 
>   linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
>   Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-15  1:05     ` Tony Breeds
  2008-02-15  7:17       ` Manish Ahuja
@ 2008-02-15 17:30       ` Linas Vepstas
  1 sibling, 0 replies; 26+ messages in thread
From: Linas Vepstas @ 2008-02-15 17:30 UTC (permalink / raw)
  To: Tony Breeds; +Cc: mahuja, linuxppc-dev, paulus

On 14/02/2008, Tony Breeds <tony@bakeyournoodle.com> wrote:
> On Tue, Feb 12, 2008 at 01:11:58AM -0600, Manish Ahuja wrote:

>  > +static ssize_t
>  > +show_release_region(struct kset * kset, char *buf)
>  > +{
>  > +     return sprintf(buf, "ola\n");
>  > +}
>  > +
>  > +static struct subsys_attribute rr = __ATTR(release_region, 0600,
>  > +                                      show_release_region,
>  > +                                      store_release_region);
>
>
> Any reason this sysfs attribute can't be write only? The show method
>  doesn't seem needed.

This was supposed to be a place-holder; a later patch would add detailed
info.  The goal was to  have user-land tools that would operate these files
to progressively dump and release memory regions; however, until these
userland tools get written, the proper interface remains murky  (e.g.
real addresses? virtual addresses? just delta's or a whole memory map?
some sort of numa flags or whatever?)

--linas

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-15  7:17       ` Manish Ahuja
@ 2008-02-15 22:32         ` Tony Breeds
  0 siblings, 0 replies; 26+ messages in thread
From: Tony Breeds @ 2008-02-15 22:32 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

On Fri, Feb 15, 2008 at 01:17:16AM -0600, Manish Ahuja wrote:
> Tony Breeds wrote:
> > Any reason this sysfs attribute can't be write only? The show method
> > doesn't seem needed.
> 
> yes, its used later in the code.

I see that now, thanks.  From my point of view it would make reviewing 
these patches easier if each patch was a correct and simple as possible. 
In this case it would have made the review easier if the sysfs attribute 
was write only now and then modified to add the read side when it's 
actually implemented.  The same goes for fixing typosi, cosmetic changes
and reference counting. 

Looking forward to a respin of this patch series.

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-18  4:53 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-02-18  5:38 ` Manish Ahuja
  2008-02-22  0:53 ` [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Michael Ellerman
  1 sibling, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-02-18  5:38 UTC (permalink / raw)
  To: ppc-dev, paulus, Linas Vepstas; +Cc: mahuja



Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-sapce tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |   81 +++++++++++++++++++++++++++--
 1 file changed, 76 insertions(+), 5 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-18 03:23:47.000000000 -0600
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-18 04:32:13.000000000 -0600
@@ -12,18 +12,23 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
 #include <asm/machdep.h>
+#include <asm/rtas.h>
 
 /* Global, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -53,18 +58,84 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t store_release_region(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
+
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range(start_pfn, nr_pages);
+
+	return count;
+}
+
+static struct kobj_attribute rr = __ATTR(release_region, 0600,
+					 NULL, store_release_region);
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header = NULL;
+	int header_len = 0;
+	int rc;
 
 	/* If no memory was reserved in early boot, there is nothing to do */
 	if (phyp_dump_info->init_reserve_size == 0)
 		return 0;
 
-	/* Release memory that was reserved in early boot */
-	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
-	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
-	release_memory_range(start_pfn, nr_pages);
+	/* Return if phyp dump not supported */
+	if (!phyp_dump_info->phyp_dump_configured)
+		return -ENOSYS;
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	if (rtas) {
+		dump_header = of_get_property(rtas, "ibm,kernel-dump",
+								&header_len);
+		of_node_put(rtas);
+	}
+
+	if (dump_header == NULL)
+		return 0;
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = sysfs_create_file(kernel_kobj, &rr.attr);
+	if (rc) {
+		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
+									rc);
+		return 0;
+	}
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-28 23:57   ` Manish Ahuja
@ 2008-02-29  0:27     ` Manish Ahuja
  2008-03-11  6:16       ` Paul Mackerras
  0 siblings, 1 reply; 26+ messages in thread
From: Manish Ahuja @ 2008-02-29  0:27 UTC (permalink / raw)
  To: linuxppc-dev, paulus, michael; +Cc: mahuja, linasvepstas


Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-sapce tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |   82 +++++++++++++++++++++++++++--
 1 file changed, 77 insertions(+), 5 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-28 21:57:52.000000000 -0600
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-28 23:36:01.000000000 -0600
@@ -12,19 +12,25 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
 #include <asm/machdep.h>
 #include <asm/prom.h>
+#include <asm/rtas.h>
+
 
 /* Global, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -54,18 +60,84 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t store_release_region(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
+
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range(start_pfn, nr_pages);
+
+	return count;
+}
+
+static struct kobj_attribute rr = __ATTR(release_region, 0600,
+					 NULL, store_release_region);
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header = NULL;
+	int header_len = 0;
+	int rc;
 
 	/* If no memory was reserved in early boot, there is nothing to do */
 	if (phyp_dump_info->init_reserve_size == 0)
 		return 0;
 
-	/* Release memory that was reserved in early boot */
-	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
-	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
-	release_memory_range(start_pfn, nr_pages);
+	/* Return if phyp dump not supported */
+	if (!phyp_dump_info->phyp_dump_configured)
+		return -ENOSYS;
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	if (rtas) {
+		dump_header = of_get_property(rtas, "ibm,kernel-dump",
+								&header_len);
+		of_node_put(rtas);
+	}
+
+	if (dump_header == NULL)
+		return 0;
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = sysfs_create_file(kernel_kobj, &rr.attr);
+	if (rc) {
+		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
+									rc);
+		return 0;
+	}
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-02-29  0:27     ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
@ 2008-03-11  6:16       ` Paul Mackerras
  2008-03-11 16:44         ` Dale Farnsworth
  2008-03-12 17:38         ` Linas Vepstas
  0 siblings, 2 replies; 26+ messages in thread
From: Paul Mackerras @ 2008-03-11  6:16 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: linuxppc-dev, linasvepstas, mahuja

Manish Ahuja writes:

> Check to see if there actually is data from a previously
> crashed kernel waiting. If so, Allow user-sapce tools to
> grab the data (by reading /proc/kcore). When user-space 
> finishes dumping a section, it must release that memory
> by writing to sysfs. For example,
> 
>   echo "0x40000000 0x10000000" > /sys/kernel/release_region
> 
> will release 256MB starting at the 1GB.  The released memory
> becomes free for general use.
> 
> Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
> Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
> 
> ------

This line needs to be exactly 3 dashes, because otherwise the tools
include the diffstat into the commit message.  Putting 4 or more
dashes was an annoying habit Linas had, and it means I have to fix it
manually (usually after I have committed the patches, and then notice
that the commit message has the extra stuff in it, so I have to go
back and fix the separators, reset my tree and re-commit the patches.)

> +		dump_header = of_get_property(rtas, "ibm,kernel-dump",
> +								&header_len);

This is a somewhat weird-looking way of coping with too-long lines.
Please indent the second line either one more tab than the first line,
or else so that it starts just after the '(' in the first line (which
is what emacs will do by default).  The same comment applies in
several other places.

Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-03-11  6:16       ` Paul Mackerras
@ 2008-03-11 16:44         ` Dale Farnsworth
  2008-03-12 17:38         ` Linas Vepstas
  1 sibling, 0 replies; 26+ messages in thread
From: Dale Farnsworth @ 2008-03-11 16:44 UTC (permalink / raw)
  To: Linuxppc-dev

Paul wrote:
> Manish Ahuja writes:
> > +		dump_header = of_get_property(rtas, "ibm,kernel-dump",
> > +								&header_len);
> 
> This is a somewhat weird-looking way of coping with too-long lines.

Yes, but not too surprising, since it precisely follows the recommendation
(and the example) in Chapter 2 of Documentation/CodingStyle.  :)

-Dale

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-03-11  6:16       ` Paul Mackerras
  2008-03-11 16:44         ` Dale Farnsworth
@ 2008-03-12 17:38         ` Linas Vepstas
  1 sibling, 0 replies; 26+ messages in thread
From: Linas Vepstas @ 2008-03-12 17:38 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, mahuja

On 11/03/2008, Paul Mackerras <paulus@samba.org> wrote:
>
>  > ------
>
> This line needs to be exactly 3 dashes, because otherwise the tools
>  include the diffstat into the commit message.  Putting 4 or more
>  dashes was an annoying habit Linas had, and it means I have to fix it
>  manually (usually after I have committed the patches, and then notice
>  that the commit message has the extra stuff in it, so I have to go
>  back and fix the separators, reset my tree and re-commit the patches.)

Sorry, I had no idea!  If I didn't have enough dashes, then quilt would
sometimes wipe out the comment at the top, so paranoia made me
add lots of dashes.

--linas

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-03-21 22:42 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-03-21 23:39 ` Manish Ahuja
  0 siblings, 0 replies; 26+ messages in thread
From: Manish Ahuja @ 2008-03-21 23:39 UTC (permalink / raw)
  To: linuxppc-dev, paulus, michael; +Cc: mahuja, linasvepstas


Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-sapce tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>

---
 arch/powerpc/platforms/pseries/phyp_dump.c |   81 +++++++++++++++++++++++++++--
 1 file changed, 76 insertions(+), 5 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-03-21 00:10:15.000000000 -0500
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-03-21 22:39:21.000000000 -0500
@@ -12,19 +12,24 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
 #include <asm/machdep.h>
 #include <asm/prom.h>
+#include <asm/rtas.h>
 
 /* Variables, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_vars;
 struct phyp_dump *phyp_dump_info = &phyp_dump_vars;
 
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -54,18 +59,84 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t store_release_region(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
+
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range(start_pfn, nr_pages);
+
+	return count;
+}
+
+static struct kobj_attribute rr = __ATTR(release_region, 0600,
+					 NULL, store_release_region);
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header = NULL;
+	int header_len = 0;
+	int rc;
 
 	/* If no memory was reserved in early boot, there is nothing to do */
 	if (phyp_dump_info->init_reserve_size == 0)
 		return 0;
 
-	/* Release memory that was reserved in early boot */
-	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
-	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
-	release_memory_range(start_pfn, nr_pages);
+	/* Return if phyp dump not supported */
+	if (!phyp_dump_info->phyp_dump_configured)
+		return -ENOSYS;
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	if (rtas) {
+		dump_header = of_get_property(rtas, "ibm,kernel-dump",
+						&header_len);
+		of_node_put(rtas);
+	}
+
+	if (dump_header == NULL)
+		return 0;
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = sysfs_create_file(kernel_kobj, &rr.attr);
+	if (rc) {
+		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
+									rc);
+		return 0;
+	}
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2008-03-21 23:39 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
2008-01-22 21:00   ` Manish Ahuja
2008-02-07  0:42   ` Paul Mackerras
2008-02-11 18:29     ` Manish Ahuja
2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  -- strict thread matches above, loose matches on Subject: below --
2008-03-21 22:42 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-03-21 23:39 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-02-18  4:53 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-02-18  5:38 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-02-22  0:53 ` [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Michael Ellerman
2008-02-28 23:57   ` Manish Ahuja
2008-02-29  0:27     ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-03-11  6:16       ` Paul Mackerras
2008-03-11 16:44         ` Dale Farnsworth
2008-03-12 17:38         ` Linas Vepstas
2008-01-07 23:45 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-02-12  6:31 ` Manish Ahuja
2008-02-12  7:11   ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-02-12 10:08     ` Stephen Rothwell
2008-02-12 16:40       ` Manish Ahuja
2008-02-15  1:05     ` Tony Breeds
2008-02-15  7:17       ` Manish Ahuja
2008-02-15 22:32         ` Tony Breeds
2008-02-15 17:30       ` Linas Vepstas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).