linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump
@ 2008-01-22 19:12 Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:12 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas, Michael Strosaker, Larry Kessler

The following series of patches implement a basic framework
for hypervisor-assisted dump. The very first patch provides 
documentation explaining what this is    :-)   . Yes, its supposed
to be an improvement over kdump.

A list of open issues / todo list is included in the documentation.
It also appears that the not-yet-released firmware versions this was tested 
on are still, ahem, incomplete; this work is also pending.

I have included most of the changes requested. Although, I did find
one or two, fixed in a later patch file rather than the first location
they appeared at.

-- Manish & Linas.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/8] pseries: phyp dump: Docmentation
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-01-22 19:26 ` Manish Ahuja
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:26 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: Larry Kessler, Michael Strosaker

Basic documentation for hypervisor-assisted dump.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>

----
 Documentation/powerpc/phyp-assisted-dump.txt |  129 +++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)

Index: 2.6.24-rc5/Documentation/powerpc/phyp-assisted-dump.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/Documentation/powerpc/phyp-assisted-dump.txt	2008-01-07 18:05:46.000000000 -0600
@@ -0,0 +1,129 @@
+
+                   Hypervisor-Assisted Dump
+                   ------------------------
+                       November 2007
+
+The goal of hypervisor-assisted dump is to enable the dump of
+a crashed system, and to do so from a fully-reset system, and
+to minimize the total elapsed time until the system is back
+in production use.
+
+As compared to kdump or other strategies, hypervisor-assisted
+dump offers several strong, practical advantages:
+
+-- Unlike kdump, the system has been reset, and loaded
+   with a fresh copy of the kernel.  In particular,
+   PCI and I/O devices have been reinitialized and are
+   in a clean, consistent state.
+-- As the dump is performed, the dumped memory becomes
+   immediately available to the system for normal use.
+-- After the dump is completed, no further reboots are
+   required; the system will be fully usable, and running
+   in it's normal, production mode on it normal kernel.
+
+The above can only be accomplished by coordination with,
+and assistance from the hypervisor. The procedure is
+as follows:
+
+-- When a system crashes, the hypervisor will save
+   the low 256MB of RAM to a previously registered
+   save region. It will also save system state, system
+   registers, and hardware PTE's.
+
+-- After the low 256MB area has been saved, the
+   hypervisor will reset PCI and other hardware state.
+   It will *not* clear RAM. It will then launch the
+   bootloader, as normal.
+
+-- The freshly booted kernel will notice that there
+   is a new node (ibm,dump-kernel) in the device tree,
+   indicating that there is crash data available from
+   a previous boot. It will boot into only 256MB of RAM,
+   reserving the rest of system memory.
+
+-- Userspace tools will parse /sys/kernel/release_region
+   and read /proc/vmcore to obtain the contents of memory,
+   which holds the previous crashed kernel. The userspace
+   tools may copy this info to disk, or network, nas, san,
+   iscsi, etc. as desired.
+
+   For Example: the values in /sys/kernel/release-region
+   would look something like this (address-range pairs).
+   CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
+   DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A
+
+-- As the userspace tools complete saving a portion of
+   dump, they echo an offset and size to
+   /sys/kernel/release_region to release the reserved
+   memory back to general use.
+
+   An example of this is:
+     "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+   which will release 256MB at the 1GB boundary.
+
+Please note that the hypervisor-assisted dump feature
+is only available on Power6-based systems with recent
+firmware versions.
+
+Implementation details:
+----------------------
+In order for this scheme to work, memory needs to be reserved
+quite early in the boot cycle. However, access to the device
+tree this early in the boot cycle is difficult, and device-tree
+access is needed to determine if there is a crash data waiting.
+To work around this problem, all but 256MB of RAM is reserved
+during early boot. A short while later in boot, a check is made
+to determine if there is dump data waiting. If there isn't,
+then the reserved memory is released to general kernel use.
+If there is dump data, then the /sys/kernel/release_region
+file is created, and the reserved memory is held.
+
+If there is no waiting dump data, then all but 256MB of the
+reserved ram will be released for general kernel use. The
+highest 256 MB of RAM will *not* be released: this region
+will be kept permanently reserved, so that it can act as
+a receptacle for a copy of the low 256MB in the case a crash
+does occur. See, however, "open issues" below, as to whether
+such a reserved region is really needed.
+
+Currently the dump will be copied from /proc/vmcore to a
+a new file upon user intervention. The starting address
+to be read and the range for each data point in provided
+in /sys/kernel/release_region.
+
+The tools to examine the dump will be same as the ones
+used for kdump.
+
+
+General notes:
+--------------
+Security: please note that there are potential security issues
+with any sort of dump mechanism. In particular, plaintext
+(unencrypted) data, and possibly passwords, may be present in
+the dump data. Userspace tools must take adequate precautions to
+preserve security.
+
+Open issues/ToDo:
+------------
+ o The various code paths that tell the hypervisor that a crash
+   occurred, vs. it simply being a normal reboot, should be
+   reviewed, and possibly clarified/fixed.
+
+ o Instead of using /sys/kernel, should there be a /sys/dump
+   instead? There is a dump_subsys being created by the s390 code,
+   perhaps the pseries code should use a similar layout as well.
+
+ o Is reserving a 256MB region really required? The goal of
+   reserving a 256MB scratch area is to make sure that no
+   important crash data is clobbered when the hypervisor
+   save low mem to the scratch area. But, if one could assure
+   that nothing important is located in some 256MB area, then
+   it would not need to be reserved. Something that can be
+   improved in subsequent versions.
+
+ o Still working the kdump team to integrate this with kdump,
+   some work remains but this would not affect the current
+   patches.
+
+ o Still need to write a shell script, to copy the dump away.
+   Currently I am parsing it manually.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
@ 2008-01-22 19:29 ` Manish Ahuja
  2008-01-22 21:00   ` Manish Ahuja
  2008-02-07  0:42   ` Paul Mackerras
  2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:29 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: Larry Kessler, Michael Strosaker


Initial patch for reserving memory in early boot, and freeing it later.
If the previous boot had ended with a crash, the reserved memory would contain
a copy of the crashed kernel data.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>

----
 arch/powerpc/kernel/prom.c                 |   46 ++++++++++++++++++
 arch/powerpc/kernel/rtas.c                 |   27 +++++++++++
 arch/powerpc/platforms/pseries/Makefile    |    1 
 arch/powerpc/platforms/pseries/phyp_dump.c |   71 +++++++++++++++++++++++++++++
 include/asm-powerpc/phyp_dump.h            |   37 +++++++++++++++
 include/asm/rtas.h                         |    3 +
 6 files changed, 185 insertions(+)

Index: 2.6.24-rc5/include/asm-powerpc/phyp_dump.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/include/asm-powerpc/phyp_dump.h	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,37 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyright (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PPC64_PHYP_DUMP_H
+#define _PPC64_PHYP_DUMP_H
+
+#ifdef CONFIG_PHYP_DUMP
+
+/* The RMR region will be saved for later dumping
+ * whenever the kernel crashes. Set this to 256MB. */
+#define PHYP_DUMP_RMR_START 0x0
+#define PHYP_DUMP_RMR_END   (1UL<<28)
+
+struct phyp_dump {
+	/* Memory that is reserved during very early boot. */
+	unsigned long init_reserve_start;
+	unsigned long init_reserve_size;
+	/* Check status during boot if dump active & present*/
+	unsigned long phyp_dump_is_active;
+	/* store cpu & hpte size */
+	unsigned long cpu_state_size;
+	unsigned long hpte_region_size;
+};
+
+extern struct phyp_dump *phyp_dump_info;
+
+#endif /* CONFIG_PHYP_DUMP */
+#endif /* _PPC64_PHYP_DUMP_H */
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,71 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyrhgit (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/pfn.h>
+#include <linux/swap.h>
+
+#include <asm/page.h>
+#include <asm/phyp_dump.h>
+
+/* Global, used to communicate data between early boot and late boot */
+static struct phyp_dump phyp_dump_global;
+struct phyp_dump *phyp_dump_info = &phyp_dump_global;
+
+/**
+ * release_memory_range -- release memory previously lmb_reserved
+ * @start_pfn: starting physical frame number
+ * @nr_pages: number of pages to free.
+ *
+ * This routine will release memory that had been previously
+ * lmb_reserved in early boot. The released memory becomes
+ * available for genreal use.
+ */
+static void
+release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct page *rpage;
+	unsigned long end_pfn;
+	long i;
+
+	end_pfn = start_pfn + nr_pages;
+
+	for (i=start_pfn; i <= end_pfn; i++) {
+		rpage = pfn_to_page(i);
+		if (PageReserved(rpage)) {
+			ClearPageReserved(rpage);
+			init_page_count(rpage);
+			__free_page(rpage);
+			totalram_pages++;
+		}
+	}
+}
+
+static int __init phyp_dump_setup(void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Release memory that was reserved in early boot */
+	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
+	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
+	release_memory_range(start_pfn, nr_pages);
+
+	return 0;
+}
+
+subsys_initcall(phyp_dump_setup);
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:33.000000000 -0600
@@ -18,3 +18,4 @@ obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu
 obj-$(CONFIG_HVC_CONSOLE)	+= hvconsole.o
 obj-$(CONFIG_HVCS)		+= hvcserver.o
 obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst.o
+obj-$(CONFIG_PHYP_DUMP)	+= phyp_dump.o
Index: 2.6.24-rc5/arch/powerpc/kernel/prom.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/prom.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/prom.c	2008-01-18 07:37:33.000000000 -0600
@@ -51,6 +51,7 @@
 #include <asm/machdep.h>
 #include <asm/pSeries_reconfig.h>
 #include <asm/pci-bridge.h>
+#include <asm/phyp_dump.h>
 #include <asm/kexec.h>
 
 #ifdef DEBUG
@@ -1011,6 +1012,48 @@ static void __init early_reserve_mem(voi
 #endif
 }
 
+#ifdef CONFIG_PHYP_DUMP
+
+/**
+ * reserve_crashed_mem() - reserve all not-yet-dumped mmemory
+ *
+ * This routine will reserve almost all of the memory in the
+ * system, except for a few hundred megabytes used to boot the
+ * new kernel. As the reserved memory is dumped to the dump
+ * device (by userland tools), it will be freed and made available.
+ */
+static void __init reserve_crashed_mem(void)
+{
+	unsigned long base, size;
+
+	if (phyp_dump_info->phyp_dump_is_active) {
+		/* Reserve *everything* above RMR. We'll free this real soon.*/
+		base = PHYP_DUMP_RMR_END;
+		size = lmb_end_of_DRAM() - base;
+
+		/* XXX crashed_ram_end is wrong, since it may be beyond
+	 	* the memory_limit, it will need to be adjusted. */
+		lmb_reserve(base, size);
+
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+	else {
+		size = phyp_dump_info->cpu_state_size +
+			phyp_dump_info->hpte_region_size +
+			PHYP_DUMP_RMR_END;
+		base = lmb_end_of_DRAM() - size;
+	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
+		lmb_reserve(base, size);
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+}
+#else
+static inline void __init reserve_crashed_mem(void) {}
+#endif /* CONFIG_PHYP_DUMP */
+
+
 void __init early_init_devtree(void *params)
 {
 	DBG(" -> early_init_devtree(%p)\n", params);
@@ -1022,6 +1065,8 @@ void __init early_init_devtree(void *par
 	/* Some machines might need RTAS info for debugging, grab it now. */
 	of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
 #endif
+	/* scan tree to see if dump occured during last boot */
+	of_scan_flat_dt(early_init_dt_scan_phyp_dump, NULL);
 
 	/* Retrieve various informations from the /chosen node of the
 	 * device-tree, including the platform type, initrd location and
@@ -1043,6 +1088,7 @@ void __init early_init_devtree(void *par
 	reserve_kdump_trampoline();
 	reserve_crashkernel();
 	early_reserve_mem();
+	reserve_crashed_mem();
 
 	lmb_enforce_memory_limit(memory_limit);
 	lmb_analyze();
Index: 2.6.24-rc5/arch/powerpc/kernel/rtas.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:33.000000000 -0600
@@ -39,6 +39,7 @@
 #include <asm/syscalls.h>
 #include <asm/smp.h>
 #include <asm/atomic.h>
+#include <asm/phyp_dump.h>
 
 struct rtas_t rtas = {
 	.lock = SPIN_LOCK_UNLOCKED
@@ -883,6 +884,32 @@ void __init rtas_initialize(void)
 #endif
 }
 
+int __init early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+#ifdef CONFIG_PHYP_DUMP
+	const unsigned int *sizes;
+
+	phyp_dump_info->phyp_dump_is_active = 0;
+	if (depth != 1 || strcmp(uname, "rtas") != 0)
+		return 0;
+
+	if (of_get_flat_dt_prop(node, "ibm,dump-kernel", NULL))
+		phyp_dump_info->phyp_dump_is_active++;
+
+	sizes = of_get_flat_dt_prop(node, "ibm,configure-kernel-dump-sizes", NULL);
+	if (!sizes)
+		return 0;
+
+	if (sizes[0] == 1)
+		phyp_dump_info->cpu_state_size = *((unsigned long *)&sizes[1]);
+
+	if (sizes[3] == 2)
+		phyp_dump_info->hpte_region_size = *((unsigned long *)&sizes[4]);
+#endif
+	return 1;
+}
+
 int __init early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data)
 {
Index: 2.6.24-rc5/include/asm/rtas.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/rtas.h	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/include/asm/rtas.h	2008-01-18 07:37:33.000000000 -0600
@@ -183,6 +183,9 @@ extern unsigned int rtas_busy_delay(int 
 
 extern int early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data);
+int early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data);
+
 
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
  2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 19:33 ` Manish Ahuja
  2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 19:33 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: lkessler, strosake


Check to see if there actually is data from a previously
crashed kernel waiting. If so, Allow user-sapce tools to
grab the data (by reading /proc/kcore). When user-space 
finishes dumping a section, it must release that memory
by writing to sysfs. For example,

  echo "0x40000000 0x10000000" > /sys/kernel/release_region

will release 256MB starting at the 1GB.  The released memory
becomes free for general use.

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  102 +++++++++++++++++++++++++++--
 1 file changed, 96 insertions(+), 6 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 22:43:00.000000000 -0600
@@ -12,17 +12,24 @@
  */
 
 #include <linux/init.h>
+#include <linux/kobject.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/pfn.h>
 #include <linux/swap.h>
+#include <linux/sysfs.h>
 
 #include <asm/page.h>
 #include <asm/phyp_dump.h>
+#include <asm/rtas.h>
 
 /* Global, used to communicate data between early boot and late boot */
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+static int ibm_configure_kernel_dump;
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -52,20 +59,103 @@ release_memory_range(unsigned long start
 	}
 }
 
-static int __init phyp_dump_setup(void)
+/* ------------------------------------------------- */
+/**
+ * sysfs_release_region -- sysfs interface to release memory range.
+ *
+ * Usage:
+ *   "echo <start addr> <length> > /sys/kernel/release_region"
+ *
+ * Example:
+ *   "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
+ *
+ * will release 256MB starting at 1GB.
+ */
+static ssize_t
+store_release_region(struct kset *kset, const char *buf, size_t count)
 {
+	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
+	ssize_t ret;
 
-	/* If no memory was reserved in early boot, there is nothing to do */
-	if (phyp_dump_info->init_reserve_size == 0)
-		return 0;
+	ret = sscanf(buf, "%lx %lx", &start_addr, &length);
+	if (ret != 2)
+		return -EINVAL;
+
+	/* Range-check - don't free any reserved memory that
+	 * wasn't reserved for phyp-dump */
+	if (start_addr < phyp_dump_info->init_reserve_start)
+		start_addr = phyp_dump_info->init_reserve_start;
+
+	end_addr = phyp_dump_info->init_reserve_start +
+			phyp_dump_info->init_reserve_size;
+	if (start_addr+length > end_addr)
+		length = end_addr - start_addr;
+
+	/* Release the region of memory assed in by user */
+	start_pfn = PFN_DOWN(start_addr);
+	nr_pages = PFN_DOWN(length);
+	release_memory_range (start_pfn, nr_pages);
 
-	/* Release memory that was reserved in early boot */
+	return count;
+}
+
+static ssize_t
+show_release_region(struct kset * kset, char *buf)
+{
+	return sprintf(buf, "ola\n");
+}
+
+static struct subsys_attribute rr = __ATTR(release_region, 0600,
+					 show_release_region,
+					 store_release_region);
+
+/* ------------------------------------------------- */
+
+static void release_all (void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* Release all memory that was reserved in early boot */
 	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
 	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
 	release_memory_range(start_pfn, nr_pages);
+}
+
+static int __init phyp_dump_setup(void)
+{
+	struct device_node *rtas;
+	const int *dump_header;
+	int header_len = 0;
+	int rc;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Return if phyp dump not supported */
+	ibm_configure_kernel_dump = rtas_token("ibm,configure-kernel-dump");
+	if (ibm_configure_kernel_dump == RTAS_UNKNOWN_SERVICE) {
+		release_all();
+		return -ENOSYS;
+	}
+
+	/* Is there dump data waiting for us? */
+	rtas = of_find_node_by_path("/rtas");
+	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	if (dump_header == NULL) {
+		release_all();
+		return 0;
+	}
+
+	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
+	rc = subsys_create_file(&kernel_subsys, &rr);
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
+		release_all();
+		return 0;
+	}
 
 	return 0;
 }
-
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (2 preceding siblings ...)
  2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
@ 2008-01-22 20:09 ` Manish Ahuja
  2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 20:09 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas

Initial patch for reserving memory in early boot, and freeing it later.
If the previous boot had ended with a crash, the reserved memory would contain
a copy of the crashed kernel data.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

----
 arch/powerpc/kernel/prom.c                 |   46 ++++++++++++++++++
 arch/powerpc/kernel/rtas.c                 |   27 +++++++++++
 arch/powerpc/platforms/pseries/Makefile    |    1 
 arch/powerpc/platforms/pseries/phyp_dump.c |   71 +++++++++++++++++++++++++++++
 include/asm-powerpc/phyp_dump.h            |   37 +++++++++++++++
 include/asm/rtas.h                         |    3 +
 6 files changed, 185 insertions(+)

Index: 2.6.24-rc5/include/asm-powerpc/phyp_dump.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/include/asm-powerpc/phyp_dump.h	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,37 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyright (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PPC64_PHYP_DUMP_H
+#define _PPC64_PHYP_DUMP_H
+
+#ifdef CONFIG_PHYP_DUMP
+
+/* The RMR region will be saved for later dumping
+ * whenever the kernel crashes. Set this to 256MB. */
+#define PHYP_DUMP_RMR_START 0x0
+#define PHYP_DUMP_RMR_END   (1UL<<28)
+
+struct phyp_dump {
+	/* Memory that is reserved during very early boot. */
+	unsigned long init_reserve_start;
+	unsigned long init_reserve_size;
+	/* Check status during boot if dump active & present*/
+	unsigned long phyp_dump_is_active;
+	/* store cpu & hpte size */
+	unsigned long cpu_state_size;
+	unsigned long hpte_region_size;
+};
+
+extern struct phyp_dump *phyp_dump_info;
+
+#endif /* CONFIG_PHYP_DUMP */
+#endif /* _PPC64_PHYP_DUMP_H */
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 07:37:33.000000000 -0600
@@ -0,0 +1,71 @@
+/*
+ * Hypervisor-assisted dump
+ *
+ * Linas Vepstas, Manish Ahuja 2007
+ * Copyrhgit (c) 2007 IBM Corp.
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/pfn.h>
+#include <linux/swap.h>
+
+#include <asm/page.h>
+#include <asm/phyp_dump.h>
+
+/* Global, used to communicate data between early boot and late boot */
+static struct phyp_dump phyp_dump_global;
+struct phyp_dump *phyp_dump_info = &phyp_dump_global;
+
+/**
+ * release_memory_range -- release memory previously lmb_reserved
+ * @start_pfn: starting physical frame number
+ * @nr_pages: number of pages to free.
+ *
+ * This routine will release memory that had been previously
+ * lmb_reserved in early boot. The released memory becomes
+ * available for genreal use.
+ */
+static void
+release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct page *rpage;
+	unsigned long end_pfn;
+	long i;
+
+	end_pfn = start_pfn + nr_pages;
+
+	for (i=start_pfn; i <= end_pfn; i++) {
+		rpage = pfn_to_page(i);
+		if (PageReserved(rpage)) {
+			ClearPageReserved(rpage);
+			init_page_count(rpage);
+			__free_page(rpage);
+			totalram_pages++;
+		}
+	}
+}
+
+static int __init phyp_dump_setup(void)
+{
+	unsigned long start_pfn, nr_pages;
+
+	/* If no memory was reserved in early boot, there is nothing to do */
+	if (phyp_dump_info->init_reserve_size == 0)
+		return 0;
+
+	/* Release memory that was reserved in early boot */
+	start_pfn = PFN_DOWN(phyp_dump_info->init_reserve_start);
+	nr_pages = PFN_DOWN(phyp_dump_info->init_reserve_size);
+	release_memory_range(start_pfn, nr_pages);
+
+	return 0;
+}
+
+subsys_initcall(phyp_dump_setup);
Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/Makefile	2008-01-18 07:37:33.000000000 -0600
@@ -18,3 +18,4 @@ obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu
 obj-$(CONFIG_HVC_CONSOLE)	+= hvconsole.o
 obj-$(CONFIG_HVCS)		+= hvcserver.o
 obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst.o
+obj-$(CONFIG_PHYP_DUMP)	+= phyp_dump.o
Index: 2.6.24-rc5/arch/powerpc/kernel/prom.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/prom.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/prom.c	2008-01-18 07:37:33.000000000 -0600
@@ -51,6 +51,7 @@
 #include <asm/machdep.h>
 #include <asm/pSeries_reconfig.h>
 #include <asm/pci-bridge.h>
+#include <asm/phyp_dump.h>
 #include <asm/kexec.h>
 
 #ifdef DEBUG
@@ -1011,6 +1012,48 @@ static void __init early_reserve_mem(voi
 #endif
 }
 
+#ifdef CONFIG_PHYP_DUMP
+
+/**
+ * reserve_crashed_mem() - reserve all not-yet-dumped mmemory
+ *
+ * This routine will reserve almost all of the memory in the
+ * system, except for a few hundred megabytes used to boot the
+ * new kernel. As the reserved memory is dumped to the dump
+ * device (by userland tools), it will be freed and made available.
+ */
+static void __init reserve_crashed_mem(void)
+{
+	unsigned long base, size;
+
+	if (phyp_dump_info->phyp_dump_is_active) {
+		/* Reserve *everything* above RMR. We'll free this real soon.*/
+		base = PHYP_DUMP_RMR_END;
+		size = lmb_end_of_DRAM() - base;
+
+		/* XXX crashed_ram_end is wrong, since it may be beyond
+	 	* the memory_limit, it will need to be adjusted. */
+		lmb_reserve(base, size);
+
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+	else {
+		size = phyp_dump_info->cpu_state_size +
+			phyp_dump_info->hpte_region_size +
+			PHYP_DUMP_RMR_END;
+		base = lmb_end_of_DRAM() - size;
+	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
+		lmb_reserve(base, size);
+		phyp_dump_info->init_reserve_start = base;
+		phyp_dump_info->init_reserve_size = size;
+	}
+}
+#else
+static inline void __init reserve_crashed_mem(void) {}
+#endif /* CONFIG_PHYP_DUMP */
+
+
 void __init early_init_devtree(void *params)
 {
 	DBG(" -> early_init_devtree(%p)\n", params);
@@ -1022,6 +1065,8 @@ void __init early_init_devtree(void *par
 	/* Some machines might need RTAS info for debugging, grab it now. */
 	of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
 #endif
+	/* scan tree to see if dump occured during last boot */
+	of_scan_flat_dt(early_init_dt_scan_phyp_dump, NULL);
 
 	/* Retrieve various informations from the /chosen node of the
 	 * device-tree, including the platform type, initrd location and
@@ -1043,6 +1088,7 @@ void __init early_init_devtree(void *par
 	reserve_kdump_trampoline();
 	reserve_crashkernel();
 	early_reserve_mem();
+	reserve_crashed_mem();
 
 	lmb_enforce_memory_limit(memory_limit);
 	lmb_analyze();
Index: 2.6.24-rc5/arch/powerpc/kernel/rtas.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/kernel/rtas.c	2008-01-18 07:37:33.000000000 -0600
@@ -39,6 +39,7 @@
 #include <asm/syscalls.h>
 #include <asm/smp.h>
 #include <asm/atomic.h>
+#include <asm/phyp_dump.h>
 
 struct rtas_t rtas = {
 	.lock = SPIN_LOCK_UNLOCKED
@@ -883,6 +884,32 @@ void __init rtas_initialize(void)
 #endif
 }
 
+int __init early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+#ifdef CONFIG_PHYP_DUMP
+	const unsigned int *sizes;
+
+	phyp_dump_info->phyp_dump_is_active = 0;
+	if (depth != 1 || strcmp(uname, "rtas") != 0)
+		return 0;
+
+	if (of_get_flat_dt_prop(node, "ibm,dump-kernel", NULL))
+		phyp_dump_info->phyp_dump_is_active++;
+
+	sizes = of_get_flat_dt_prop(node, "ibm,configure-kernel-dump-sizes", NULL);
+	if (!sizes)
+		return 0;
+
+	if (sizes[0] == 1)
+		phyp_dump_info->cpu_state_size = *((unsigned long *)&sizes[1]);
+
+	if (sizes[3] == 2)
+		phyp_dump_info->hpte_region_size = *((unsigned long *)&sizes[4]);
+#endif
+	return 1;
+}
+
 int __init early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data)
 {
Index: 2.6.24-rc5/include/asm/rtas.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/rtas.h	2008-01-18 07:37:28.000000000 -0600
+++ 2.6.24-rc5/include/asm/rtas.h	2008-01-18 07:37:33.000000000 -0600
@@ -183,6 +183,9 @@ extern unsigned int rtas_busy_delay(int 
 
 extern int early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data);
+int early_init_dt_scan_phyp_dump(unsigned long node,
+		const char *uname, int depth, void *data);
+
 
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (3 preceding siblings ...)
  2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 20:15 ` Manish Ahuja
  2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 20:15 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas; +Cc: mahuja, linasvepstas, lkessler, strosake


Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  136 +++++++++++++++++++++++++++--
 1 file changed, 129 insertions(+), 7 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-18 22:43:00.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:49:23.000000000 -0600
@@ -30,6 +30,117 @@ struct phyp_dump *phyp_dump_info = &phyp
 static int ibm_configure_kernel_dump;
 
 /* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS 3
+#define DUMP_HEADER_VERSION 0x1
+#define DUMP_REQUEST_FLAG 0x1
+#define DUMP_SOURCE_CPU 0x0001
+#define DUMP_SOURCE_HPTE 0x0002
+#define DUMP_SOURCE_RMO  0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+	{
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+	}
+}
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -125,7 +236,9 @@ static void release_all (void)
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header;
+	const struct phyp_dump_header *dump_header;
+	unsigned long dump_area_start;
+	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -140,22 +253,31 @@ static int __init phyp_dump_setup(void)
 		return -ENOSYS;
 	}
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	of_node_put(rtas);
+
+	dump_area_length = init_dump_header(&phdr);
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK; /* align down */
+
 	if (dump_header == NULL) {
-		release_all();
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
 	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = subsys_create_file(&kernel_subsys, &rr);
-	if (rc) {
+	if (rc)
 		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
-		release_all();
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
@ 2008-01-22 21:00   ` Manish Ahuja
  2008-02-07  0:42   ` Paul Mackerras
  1 sibling, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:00 UTC (permalink / raw)
  To: ppc-dev, paulus, linasvepstas

Reposted this one. I got the email id wrong in this one.

Sorry about that. 

Manish

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/8] pseries: phyp dump: debugging print routines.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (4 preceding siblings ...)
  2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
@ 2008-01-22 21:02 ` Manish Ahuja
  2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:02 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


Provide some basic debugging support.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
-----

 arch/powerpc/platforms/pseries/phyp_dump.c |   64 +++++++++++++++++++++++++++--
 1 file changed, 60 insertions(+), 4 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 02:51:54.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 02:58:41.000000000 -0600
@@ -2,7 +2,7 @@
  * Hypervisor-assisted dump
  *
  * Linas Vepstas, Manish Ahuja 2007
- * Copyrhgit (c) 2007 IBM Corp.
+ * Copyright (c) 2007 IBM Corp.
  *
  *      This program is free software; you can redistribute it and/or
  *      modify it under the terms of the GNU General Public License
@@ -122,6 +122,61 @@ static unsigned long init_dump_header(st
 	return addr_offset;
 }
 
+static void print_dump_header(const struct phyp_dump_header *ph)
+{
+#ifdef DEBUG
+	printk(KERN_INFO "dump header:\n");
+	/* setup some ph->sections required */
+	printk(KERN_INFO "version = %d\n", ph->version);
+	printk(KERN_INFO "Sections = %d\n", ph->num_of_sections);
+	printk(KERN_INFO "Status = 0x%x\n", ph->status);
+
+	/* No ph->disk, so all should be set to 0 */
+	printk(KERN_INFO "Offset to first section 0x%x\n",
+						ph->first_offset_section);
+	printk(KERN_INFO "dump disk sections should be zero\n");
+	printk(KERN_INFO "dump disk section = %d\n", ph->dump_disk_section);
+	printk(KERN_INFO "block num = %ld\n", ph->block_num_dd);
+	printk(KERN_INFO "number of blocks = %ld\n", ph->num_of_blocks_dd);
+	printk(KERN_INFO "dump disk offset = %d\n", ph->offset_dd);
+	printk(KERN_INFO "Max auto time= %d\n", ph->maxtime_to_auto);
+
+	/*set cpu state and hpte states as well scratch pad area */
+	printk(KERN_INFO " CPU AREA \n");
+	printk(KERN_INFO "cpu dump_flags =%d\n", ph->cpu_data.dump_flags);
+	printk(KERN_INFO "cpu source_type =%d\n", ph->cpu_data.source_type);
+	printk(KERN_INFO "cpu error_flags =%d\n", ph->cpu_data.error_flags);
+	printk(KERN_INFO "cpu source_address =%lx\n",
+						ph->cpu_data.source_address);
+	printk(KERN_INFO "cpu source_length =%lx\n",
+						ph->cpu_data.source_length);
+	printk(KERN_INFO "cpu length_copied =%lx\n",
+						ph->cpu_data.length_copied);
+
+	printk(KERN_INFO " HPTE AREA \n");
+	printk(KERN_INFO "HPTE dump_flags =%d\n", ph->hpte_data.dump_flags);
+	printk(KERN_INFO "HPTE source_type =%d\n", ph->hpte_data.source_type);
+	printk(KERN_INFO "HPTE error_flags =%d\n", ph->hpte_data.error_flags);
+	printk(KERN_INFO "HPTE source_address =%lx\n",
+						ph->hpte_data.source_address);
+	printk(KERN_INFO "HPTE source_length =%lx\n",
+						ph->hpte_data.source_length);
+	printk(KERN_INFO "HPTE length_copied =%lx\n",
+						ph->hpte_data.length_copied);
+
+	printk(KERN_INFO " SRSD AREA \n");
+	printk(KERN_INFO "SRSD dump_flags =%d\n", ph->kernel_data.dump_flags);
+	printk(KERN_INFO "SRSD source_type =%d\n", ph->kernel_data.source_type);
+	printk(KERN_INFO "SRSD error_flags =%d\n", ph->kernel_data.error_flags);
+	printk(KERN_INFO "SRSD source_address =%lx\n",
+						ph->kernel_data.source_address);
+	printk(KERN_INFO "SRSD source_length =%lx\n",
+						ph->kernel_data.source_length);
+	printk(KERN_INFO "SRSD length_copied =%lx\n",
+						ph->kernel_data.length_copied);
+#endif
+}
+
 static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
 {
 	int rc;
@@ -134,9 +189,9 @@ static void register_dump_area(struct ph
 		               1, ph, sizeof(struct phyp_dump_header));
 	} while (rtas_busy_delay(rc));
 
-	if (rc)
-	{
-		printk (KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+	if (rc) {
+		printk(KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+		print_dump_header(ph);
 	}
 }
 
@@ -249,6 +304,7 @@ static int __init phyp_dump_setup(void)
 		release_all();
 		return -ENOSYS;
 	}
+	print_dump_header(dump_header);
 
 	/* Is there dump data waiting for us? If there isn't,
 	 * then register a new dump area, and release all of

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (5 preceding siblings ...)
  2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
@ 2008-01-22 21:05 ` Manish Ahuja
  2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
  2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:05 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


Routines to invalidate and unregister dump routines. 
Unregister has not been used yet, I will release another
patch for that at a later stage with the kdump integration patches.

There is also a routine which calculates the regions to be
freed and exports that through sysfs.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
-----

---
 arch/powerpc/platforms/pseries/phyp_dump.c |  101 +++++++++++++++++++++++++----
 include/asm/phyp_dump.h                    |    3 
 2 files changed, 93 insertions(+), 11 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:06:20.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:49:10.000000000 -0600
@@ -69,6 +69,10 @@ static struct phyp_dump_header phdr;
 #define DUMP_SOURCE_CPU 0x0001
 #define DUMP_SOURCE_HPTE 0x0002
 #define DUMP_SOURCE_RMO  0x0011
+#define DUMP_ERROR_FLAG 0x2000
+#define DUMP_TRIGGERED 0x4000
+#define DUMP_PERFORMED 0x8000
+
 
 /**
  * init_dump_header() - initialize the header declaring a dump
@@ -180,9 +184,15 @@ static void print_dump_header(const stru
 static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
 {
 	int rc;
-	ph->cpu_data.destination_address += addr;
-	ph->hpte_data.destination_address += addr;
-	ph->kernel_data.destination_address += addr;
+
+	/* Add addr value if not initialized before */
+	if (ph->cpu_data.destination_address == 0) {
+		ph->cpu_data.destination_address += addr;
+		ph->hpte_data.destination_address += addr;
+		ph->kernel_data.destination_address += addr;
+	}
+
+	/* ToDo Invalidate kdump and free memory range. */
 
 	do {
 		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
@@ -195,6 +205,46 @@ static void register_dump_area(struct ph
 	}
 }
 
+static
+void invalidate_last_dump(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+
+	/* Add addr value if not initialized before */
+	if (ph->cpu_data.destination_address == 0) {
+		ph->cpu_data.destination_address += addr;
+		ph->hpte_data.destination_address += addr;
+		ph->kernel_data.destination_address += addr;
+	}
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               2, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) "
+						"on invalidate\n", rc);
+		print_dump_header(ph);
+	}
+}
+
+static void unregister_dump_area(struct phyp_dump_header *ph)
+{
+	int rc;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               3, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc) {
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) "
+						"on unregister\n", rc);
+		print_dump_header(ph);
+	}
+}
+
 /* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
@@ -205,8 +255,8 @@ static void register_dump_area(struct ph
  * lmb_reserved in early boot. The released memory becomes
  * available for genreal use.
  */
-static void
-release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
+static
+void release_memory_range(unsigned long start_pfn, unsigned long nr_pages)
 {
 	struct page *rpage;
 	unsigned long end_pfn;
@@ -237,8 +287,8 @@ release_memory_range(unsigned long start
  *
  * will release 256MB starting at 1GB.
  */
-static ssize_t
-store_release_region(struct kset *kset, const char *buf, size_t count)
+static
+ssize_t store_release_region(struct kset *kset, const char *buf, size_t count)
 {
 	unsigned long start_addr, length, end_addr;
 	unsigned long start_pfn, nr_pages;
@@ -266,10 +316,23 @@ store_release_region(struct kset *kset, 
 	return count;
 }
 
-static ssize_t
-show_release_region(struct kset * kset, char *buf)
+static ssize_t show_release_region(struct kset * kset, char *buf)
 {
-	return sprintf(buf, "ola\n");
+	u64 second_addr_range;
+
+	/* total reserved size - start of scratch area */
+	second_addr_range = phyp_dump_info->init_reserve_size -
+				phyp_dump_info->reserved_scratch_size;
+	return sprintf(buf, "CPU:0x%lx-0x%lx: HPTE:0x%lx-0x%lx:"
+			    " DUMP:0x%lx-0x%lx, 0x%lx-0x%lx:\n",
+		phdr.cpu_data.destination_address,
+		phdr.cpu_data.length_copied,
+		phdr.hpte_data.destination_address,
+		phdr.hpte_data.length_copied,
+		phdr.kernel_data.destination_address,
+		phdr.kernel_data.length_copied,
+		phyp_dump_info->init_reserve_start,
+		second_addr_range);
 }
 
 static struct subsys_attribute rr = __ATTR(release_region, 0600,
@@ -307,7 +370,6 @@ static int __init phyp_dump_setup(void)
 		release_all();
 		return -ENOSYS;
 	}
-	print_dump_header(dump_header);
 
 	/* Is there dump data waiting for us? If there isn't,
 	 * then register a new dump area, and release all of
@@ -319,6 +381,7 @@ static int __init phyp_dump_setup(void)
 	rtas = of_find_node_by_path("/rtas");
 	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
 	of_node_put(rtas);
+	print_dump_header(dump_header);
 
 	dump_area_length = init_dump_header(&phdr);
 	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK; /* align down */
@@ -328,6 +391,22 @@ static int __init phyp_dump_setup(void)
 		return 0;
 	}
 
+	/* re-register the dump area, if old dump was invalid */
+	if ((dump_header) && (dump_header->status & DUMP_ERROR_FLAG)) {
+		invalidate_last_dump(&phdr, dump_area_start);
+		register_dump_area(&phdr, dump_area_start);
+		return 0;
+	}
+
+	if (dump_header) {
+		phyp_dump_info->reserved_scratch_addr =
+				dump_header->cpu_data.destination_address;
+		phyp_dump_info->reserved_scratch_size =
+				dump_header->cpu_data.source_length +
+				dump_header->hpte_data.source_length +
+				dump_header->kernel_data.source_length;
+	}
+
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = subsys_create_file(&kernel_subsys, &rr);
 	if (rc)
Index: 2.6.24-rc5/include/asm/phyp_dump.h
===================================================================
--- 2.6.24-rc5.orig/include/asm/phyp_dump.h	2008-01-21 22:21:01.000000000 -0600
+++ 2.6.24-rc5/include/asm/phyp_dump.h	2008-01-21 23:29:05.000000000 -0600
@@ -29,6 +29,9 @@ struct phyp_dump {
 	/* store cpu & hpte size */
 	unsigned long cpu_state_size;
 	unsigned long hpte_region_size;
+	/* previous scratch area values */
+	unsigned long reserved_scratch_addr;
+	unsigned long reserved_scratch_size;
 };
 
 extern struct phyp_dump *phyp_dump_info;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 7/8] pseries: phyp dump: Tracking memory range freed.
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (6 preceding siblings ...)
  2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
@ 2008-01-22 21:07 ` Manish Ahuja
  2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:07 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake


This patch tracks the size freed. For now it does a simple
rudimentary calculation of the ranges freed. The idea is
to keep it simple at the external shell script level and 
send in large chunks for now.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
-----

---
 arch/powerpc/platforms/pseries/phyp_dump.c |   35 +++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:30:18.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-01-21 23:42:04.000000000 -0600
@@ -275,6 +275,39 @@ void release_memory_range(unsigned long 
 	}
 }
 
+/**
+ * track_freed_range -- Counts the range being freed.
+ * Once the counter goes to zero, it re-registers dump for
+ * future use.
+ */
+static void
+track_freed_range(unsigned long addr, unsigned long length)
+{
+	static unsigned long scratch_area_size, reserved_area_size;
+
+	if (addr < phyp_dump_info->init_reserve_start)
+		return;
+
+	if ((addr >= phyp_dump_info->init_reserve_start) &&
+	    (addr <= phyp_dump_info->init_reserve_start +
+	     phyp_dump_info->init_reserve_size))
+		reserved_area_size += length;
+
+	if ((addr >= phyp_dump_info->reserved_scratch_addr) &&
+	    (addr <= phyp_dump_info->reserved_scratch_addr +
+	     phyp_dump_info->reserved_scratch_size))
+		scratch_area_size += length;
+
+	if ((reserved_area_size == phyp_dump_info->init_reserve_size) &&
+	    (scratch_area_size == phyp_dump_info->reserved_scratch_size)) {
+
+		invalidate_last_dump(&phdr,
+				phyp_dump_info->reserved_scratch_addr);
+		register_dump_area (&phdr,
+				phyp_dump_info->reserved_scratch_addr);
+	}
+}
+
 /* ------------------------------------------------- */
 /**
  * sysfs_release_region -- sysfs interface to release memory range.
@@ -298,6 +331,8 @@ ssize_t store_release_region(struct kset
 	if (ret != 2)
 		return -EINVAL;
 
+	track_freed_range(start_addr, length);
+
 	/* Range-check - don't free any reserved memory that
 	 * wasn't reserved for phyp-dump */
 	if (start_addr < phyp_dump_info->init_reserve_start)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 8/8] pseries: phyp dump: config file
  2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
                   ` (7 preceding siblings ...)
  2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
@ 2008-01-22 21:09 ` Manish Ahuja
  8 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-01-22 21:09 UTC (permalink / raw)
  To: ppc-dev, paulus; +Cc: mahuja, linasvepstas, lkessler, strosake

To: linuxppc-dev@ozlabs.org


Add hypervisor-assisted dump to kernel config

Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

-----
 arch/powerpc/Kconfig |   11 +++++++++++
 1 file changed, 11 insertions(+)

Index: linux-2.6.24-rc2-git4/arch/powerpc/Kconfig
===================================================================
--- linux-2.6.24-rc2-git4.orig/arch/powerpc/Kconfig	2007-11-14 16:39:20.000000000 -0600
+++ linux-2.6.24-rc2-git4/arch/powerpc/Kconfig	2007-11-15 14:27:33.000000000 -0600
@@ -261,6 +261,17 @@ config CRASH_DUMP
 
 	  Don't change this unless you know what you are doing.
 
+config PHYP_DUMP
+	bool "Hypervisor-assisted dump (EXPERIMENTAL)"
+	depends on PPC_PSERIES && EXPERIMENTAL
+	default y
+	help
+	  Hypervisor-assisted dump is meant to be a kdump replacement
+	  offering robustness and speed not possible without system
+	  hypervisor assistence.
+
+	  If unsure, say "Y"
+
 config PPCBUG_NVRAM
 	bool "Enable reading PPCBUG NVRAM during boot" if PPLUS || LOPEC
 	default y if PPC_PREP

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
  2008-01-22 21:00   ` Manish Ahuja
@ 2008-02-07  0:42   ` Paul Mackerras
  2008-02-11 18:29     ` Manish Ahuja
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Mackerras @ 2008-02-07  0:42 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: ppc-dev, linasvepstas, Larry Kessler, Michael Strosaker

Manish Ahuja writes:

> Initial patch for reserving memory in early boot, and freeing it later.
> If the previous boot had ended with a crash, the reserved memory would contain
> a copy of the crashed kernel data.

[snip]

> +static void __init reserve_crashed_mem(void)
> +{
> +	unsigned long base, size;
> +
> +	if (phyp_dump_info->phyp_dump_is_active) {
> +		/* Reserve *everything* above RMR. We'll free this real soon.*/
> +		base = PHYP_DUMP_RMR_END;
> +		size = lmb_end_of_DRAM() - base;
> +
> +		/* XXX crashed_ram_end is wrong, since it may be beyond
> +	 	* the memory_limit, it will need to be adjusted. */
> +		lmb_reserve(base, size);
> +
> +		phyp_dump_info->init_reserve_start = base;
> +		phyp_dump_info->init_reserve_size = size;
> +	}
> +	else {
> +		size = phyp_dump_info->cpu_state_size +
> +			phyp_dump_info->hpte_region_size +
> +			PHYP_DUMP_RMR_END;
> +		base = lmb_end_of_DRAM() - size;
> +	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
> +		lmb_reserve(base, size);

This is still reserving memory even on systems that aren't running on
pHyp at all.  Please rework this so that no memory is reserved if the
system doesn't support phyp-assisted dump.

Paul.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept
  2008-02-07  0:42   ` Paul Mackerras
@ 2008-02-11 18:29     ` Manish Ahuja
  0 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-02-11 18:29 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Manish Ahuja, ppc-dev, linasvepstas, Larry Kessler,
	Michael Strosaker

Sorry,

I think i sent the wrong patch file, it shouldn't have my printk statement in there. Let me re-send 
the correct file and let me test it once more to make sure it does the right thing.

-Manish



Paul Mackerras wrote:
> Manish Ahuja writes:
> 
>> Initial patch for reserving memory in early boot, and freeing it later.
>> If the previous boot had ended with a crash, the reserved memory would contain
>> a copy of the crashed kernel data.
> 
> [snip]
> 
>> +static void __init reserve_crashed_mem(void)
>> +{
>> +	unsigned long base, size;
>> +
>> +	if (phyp_dump_info->phyp_dump_is_active) {
>> +		/* Reserve *everything* above RMR. We'll free this real soon.*/
>> +		base = PHYP_DUMP_RMR_END;
>> +		size = lmb_end_of_DRAM() - base;
>> +
>> +		/* XXX crashed_ram_end is wrong, since it may be beyond
>> +	 	* the memory_limit, it will need to be adjusted. */
>> +		lmb_reserve(base, size);
>> +
>> +		phyp_dump_info->init_reserve_start = base;
>> +		phyp_dump_info->init_reserve_size = size;
>> +	}
>> +	else {
>> +		size = phyp_dump_info->cpu_state_size +
>> +			phyp_dump_info->hpte_region_size +
>> +			PHYP_DUMP_RMR_END;
>> +		base = lmb_end_of_DRAM() - size;
>> +	printk(KERN_ERR "Manish reserve regular kernel space is %ld %ld\n", base, size);
>> +		lmb_reserve(base, size);
> 
> This is still reserving memory even on systems that aren't running on
> pHyp at all.  Please rework this so that no memory is reserved if the
> system doesn't support phyp-assisted dump.
> 
> Paul.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-12  6:31 ` Manish Ahuja
@ 2008-02-12  7:14   ` Manish Ahuja
  2008-02-12 10:11     ` Stephen Rothwell
  0 siblings, 1 reply; 20+ messages in thread
From: Manish Ahuja @ 2008-02-12  7:14 UTC (permalink / raw)
  To: linuxppc-dev, paulus; +Cc: mahuja, linasvepstas


Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  136 +++++++++++++++++++++++++++--
 1 file changed, 129 insertions(+), 7 deletions(-)

Index: 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.24-rc5.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-12 06:12:55.000000000 -0600
+++ 2.6.24-rc5/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-12 06:13:01.000000000 -0600
@@ -30,6 +30,117 @@ struct phyp_dump *phyp_dump_info = &phyp
 static int ibm_configure_kernel_dump;
 
 /* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS 3
+#define DUMP_HEADER_VERSION 0x1
+#define DUMP_REQUEST_FLAG 0x1
+#define DUMP_SOURCE_CPU 0x0001
+#define DUMP_SOURCE_HPTE 0x0002
+#define DUMP_SOURCE_RMO  0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+		               1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+	{
+		printk (KERN_ERR "phyp-dump: unexpected error (%d) on register\n", rc);
+	}
+}
+
+/* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
  * @start_pfn: starting physical frame number
@@ -113,7 +224,9 @@ static struct subsys_attribute rr = __AT
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header;
+	const struct phyp_dump_header *dump_header;
+	unsigned long dump_area_start;
+	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -126,22 +239,31 @@ static int __init phyp_dump_setup(void)
 		return -ENOSYS;
 	}
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
+	of_node_put(rtas);
+
+	dump_area_length = init_dump_header(&phdr);
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK; /* align down */
+
 	if (dump_header == NULL) {
-		release_all();
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
 	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = subsys_create_file(&kernel_subsys, &rr);
-	if (rc) {
+	if (rc)
 		printk (KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n", rc);
-		release_all();
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 subsys_initcall(phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-12  7:14   ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
@ 2008-02-12 10:11     ` Stephen Rothwell
  2008-02-12 16:31       ` Manish Ahuja
  0 siblings, 1 reply; 20+ messages in thread
From: Stephen Rothwell @ 2008-02-12 10:11 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

Hi Manish,

> -	/* Is there dump data waiting for us? */
> +	/* Is there dump data waiting for us? If there isn't,
> +	 * then register a new dump area, and release all of
> +	 * the rest of the reserved ram.
> +	 *
> +	 * The /rtas/ibm,kernel-dump rtas node is present only
> +	 * if there is dump data waiting for us.
> +	 */
>  	rtas = of_find_node_by_path("/rtas");
>  	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
> +	of_node_put(rtas);

Oh, here is the of_node_put() - you should move that to patch 3.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-12 10:11     ` Stephen Rothwell
@ 2008-02-12 16:31       ` Manish Ahuja
  0 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-02-12 16:31 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: mahuja, linuxppc-dev, linasvepstas, paulus

For now, if we can leave this patch as is, that will be great. That move requires me 
to work all remaining patches as they apply uncleanly after that.

I will bunch those two together functionally next time onwards.

Thanks,
Manish


Stephen Rothwell wrote:
> Hi Manish,
> 
>> -	/* Is there dump data waiting for us? */
>> +	/* Is there dump data waiting for us? If there isn't,
>> +	 * then register a new dump area, and release all of
>> +	 * the rest of the reserved ram.
>> +	 *
>> +	 * The /rtas/ibm,kernel-dump rtas node is present only
>> +	 * if there is dump data waiting for us.
>> +	 */
>>  	rtas = of_find_node_by_path("/rtas");
>>  	dump_header = of_get_property(rtas, "ibm,kernel-dump", &header_len);
>> +	of_node_put(rtas);
> 
> Oh, here is the of_node_put() - you should move that to patch 3.
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-18  4:53 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-02-18  5:40 ` Manish Ahuja
  2008-02-22  0:53 ` [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Michael Ellerman
  1 sibling, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-02-18  5:40 UTC (permalink / raw)
  To: ppc-dev, paulus, Linas Vepstas; +Cc: mahuja


Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  137 +++++++++++++++++++++++++++--
 1 file changed, 131 insertions(+), 6 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-18 03:26:56.000000000 -0600
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-18 04:30:28.000000000 -0600
@@ -28,6 +28,117 @@
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+static int ibm_configure_kernel_dump;
+/* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS 3
+#define DUMP_HEADER_VERSION 0x1
+#define DUMP_REQUEST_FLAG 0x1
+#define DUMP_SOURCE_CPU 0x0001
+#define DUMP_SOURCE_HPTE 0x0002
+#define DUMP_SOURCE_RMO  0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+				1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+		printk(KERN_ERR "phyp-dump: unexpected error (%d) on "
+						"register\n", rc);
+}
+
 /* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
@@ -106,7 +217,9 @@ static struct kobj_attribute rr = __ATTR
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header = NULL;
+ 	const struct phyp_dump_header *dump_header = NULL;
+ 	unsigned long dump_area_start;
+ 	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -118,7 +231,13 @@ static int __init phyp_dump_setup(void)
 	if (!phyp_dump_info->phyp_dump_configured)
 		return -ENOSYS;
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	if (rtas) {
 		dump_header = of_get_property(rtas, "ibm,kernel-dump",
@@ -126,17 +245,23 @@ static int __init phyp_dump_setup(void)
 		of_node_put(rtas);
 	}
 
-	if (dump_header == NULL)
+	dump_area_length = init_dump_header(&phdr);
+
+	/* align down */
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK;
+
+	if (dump_header == NULL) {
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
+	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = sysfs_create_file(kernel_kobj, &rr.attr);
-	if (rc) {
+	if (rc)
 		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
 									rc);
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 machine_subsys_initcall(pseries, phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-28 23:57   ` Manish Ahuja
@ 2008-02-29  0:29     ` Manish Ahuja
  2008-03-11  6:17       ` Paul Mackerras
  0 siblings, 1 reply; 20+ messages in thread
From: Manish Ahuja @ 2008-02-29  0:29 UTC (permalink / raw)
  To: linuxppc-dev, paulus, michael; +Cc: mahuja, linasvepstas


Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

------
 arch/powerpc/platforms/pseries/phyp_dump.c |  137 +++++++++++++++++++++++++++--
 1 file changed, 131 insertions(+), 6 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-28 23:36:01.000000000 -0600
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-02-28 23:36:42.000000000 -0600
@@ -30,6 +30,117 @@
 static struct phyp_dump phyp_dump_global;
 struct phyp_dump *phyp_dump_info = &phyp_dump_global;
 
+static int ibm_configure_kernel_dump;
+/* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS 3
+#define DUMP_HEADER_VERSION 0x1
+#define DUMP_REQUEST_FLAG 0x1
+#define DUMP_SOURCE_CPU 0x0001
+#define DUMP_SOURCE_HPTE 0x0002
+#define DUMP_SOURCE_RMO  0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+				1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+		printk(KERN_ERR "phyp-dump: unexpected error (%d) on "
+						"register\n", rc);
+}
+
 /* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
@@ -108,7 +219,9 @@ static struct kobj_attribute rr = __ATTR
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header = NULL;
+ 	const struct phyp_dump_header *dump_header = NULL;
+ 	unsigned long dump_area_start;
+ 	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -120,7 +233,13 @@ static int __init phyp_dump_setup(void)
 	if (!phyp_dump_info->phyp_dump_configured)
 		return -ENOSYS;
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	if (rtas) {
 		dump_header = of_get_property(rtas, "ibm,kernel-dump",
@@ -128,17 +247,23 @@ static int __init phyp_dump_setup(void)
 		of_node_put(rtas);
 	}
 
-	if (dump_header == NULL)
+	dump_area_length = init_dump_header(&phdr);
+
+	/* align down */
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK;
+
+	if (dump_header == NULL) {
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
+	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = sysfs_create_file(kernel_kobj, &rr.attr);
-	if (rc) {
+	if (rc)
 		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
 									rc);
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 machine_subsys_initcall(pseries, phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-02-29  0:29     ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
@ 2008-03-11  6:17       ` Paul Mackerras
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Mackerras @ 2008-03-11  6:17 UTC (permalink / raw)
  To: Manish Ahuja; +Cc: linuxppc-dev, linasvepstas, mahuja

Manish Ahuja writes:

> +#define NUM_DUMP_SECTIONS 3
> +#define DUMP_HEADER_VERSION 0x1
> +#define DUMP_REQUEST_FLAG 0x1
> +#define DUMP_SOURCE_CPU 0x0001
> +#define DUMP_SOURCE_HPTE 0x0002
> +#define DUMP_SOURCE_RMO  0x0011

I think it would be clearer if you use a tab to line up the values,
like this:

#define NUM_DUMP_SECTIONS	3
#define DUMP_HEADER_VERSION	0x1
#define DUMP_REQUEST_FLAG	0x1
#define DUMP_SOURCE_CPU		0x0001
#define DUMP_SOURCE_HPTE	0x0002
#define DUMP_SOURCE_RMO		0x0011

Paul.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/8] pseries: phyp dump: register dump area.
  2008-03-21 22:42 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
@ 2008-03-21 23:43 ` Manish Ahuja
  0 siblings, 0 replies; 20+ messages in thread
From: Manish Ahuja @ 2008-03-21 23:43 UTC (permalink / raw)
  To: linuxppc-dev, paulus, michael; +Cc: mahuja, linasvepstas



Set up the actual dump header, register it with the hypervisor.

Signed-off-by: Manish Ahuja <mahuja@us.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>

---
 arch/powerpc/platforms/pseries/phyp_dump.c |  137 +++++++++++++++++++++++++++--
 1 file changed, 131 insertions(+), 6 deletions(-)

Index: 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c
===================================================================
--- 2.6.25-rc1.orig/arch/powerpc/platforms/pseries/phyp_dump.c	2008-03-21 22:39:21.000000000 -0500
+++ 2.6.25-rc1/arch/powerpc/platforms/pseries/phyp_dump.c	2008-03-21 22:52:53.000000000 -0500
@@ -29,6 +29,117 @@
 static struct phyp_dump phyp_dump_vars;
 struct phyp_dump *phyp_dump_info = &phyp_dump_vars;
 
+static int ibm_configure_kernel_dump;
+/* ------------------------------------------------- */
+/* RTAS interfaces to declare the dump regions */
+
+struct dump_section {
+	u32 dump_flags;
+	u16 source_type;
+	u16 error_flags;
+	u64 source_address;
+	u64 source_length;
+	u64 length_copied;
+	u64 destination_address;
+};
+
+struct phyp_dump_header {
+	u32 version;
+	u16 num_of_sections;
+	u16 status;
+
+	u32 first_offset_section;
+	u32 dump_disk_section;
+	u64 block_num_dd;
+	u64 num_of_blocks_dd;
+	u32 offset_dd;
+	u32 maxtime_to_auto;
+	/* No dump disk path string used */
+
+	struct dump_section cpu_data;
+	struct dump_section hpte_data;
+	struct dump_section kernel_data;
+};
+
+/* The dump header *must be* in low memory, so .bss it */
+static struct phyp_dump_header phdr;
+
+#define NUM_DUMP_SECTIONS	3
+#define DUMP_HEADER_VERSION	0x1
+#define DUMP_REQUEST_FLAG	0x1
+#define DUMP_SOURCE_CPU		0x0001
+#define DUMP_SOURCE_HPTE	0x0002
+#define DUMP_SOURCE_RMO		0x0011
+
+/**
+ * init_dump_header() - initialize the header declaring a dump
+ * Returns: length of dump save area.
+ *
+ * When the hypervisor saves crashed state, it needs to put
+ * it somewhere. The dump header tells the hypervisor where
+ * the data can be saved.
+ */
+static unsigned long init_dump_header(struct phyp_dump_header *ph)
+{
+	unsigned long addr_offset = 0;
+
+	/* Set up the dump header */
+	ph->version = DUMP_HEADER_VERSION;
+	ph->num_of_sections = NUM_DUMP_SECTIONS;
+	ph->status = 0;
+
+	ph->first_offset_section =
+		(u32)offsetof(struct phyp_dump_header, cpu_data);
+	ph->dump_disk_section = 0;
+	ph->block_num_dd = 0;
+	ph->num_of_blocks_dd = 0;
+	ph->offset_dd = 0;
+
+	ph->maxtime_to_auto = 0; /* disabled */
+
+	/* The first two sections are mandatory */
+	ph->cpu_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->cpu_data.source_type = DUMP_SOURCE_CPU;
+	ph->cpu_data.source_address = 0;
+	ph->cpu_data.source_length = phyp_dump_info->cpu_state_size;
+	ph->cpu_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->cpu_state_size;
+
+	ph->hpte_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->hpte_data.source_type = DUMP_SOURCE_HPTE;
+	ph->hpte_data.source_address = 0;
+	ph->hpte_data.source_length = phyp_dump_info->hpte_region_size;
+	ph->hpte_data.destination_address = addr_offset;
+	addr_offset += phyp_dump_info->hpte_region_size;
+
+	/* This section describes the low kernel region */
+	ph->kernel_data.dump_flags = DUMP_REQUEST_FLAG;
+	ph->kernel_data.source_type = DUMP_SOURCE_RMO;
+	ph->kernel_data.source_address = PHYP_DUMP_RMR_START;
+	ph->kernel_data.source_length = PHYP_DUMP_RMR_END;
+	ph->kernel_data.destination_address = addr_offset;
+	addr_offset += ph->kernel_data.source_length;
+
+	return addr_offset;
+}
+
+static void register_dump_area(struct phyp_dump_header *ph, unsigned long addr)
+{
+	int rc;
+	ph->cpu_data.destination_address += addr;
+	ph->hpte_data.destination_address += addr;
+	ph->kernel_data.destination_address += addr;
+
+	do {
+		rc = rtas_call(ibm_configure_kernel_dump, 3, 1, NULL,
+				1, ph, sizeof(struct phyp_dump_header));
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
+		printk(KERN_ERR "phyp-dump: unexpected error (%d) on "
+						"register\n", rc);
+}
+
 /* ------------------------------------------------- */
 /**
  * release_memory_range -- release memory previously lmb_reserved
@@ -107,7 +218,9 @@ static struct kobj_attribute rr = __ATTR
 static int __init phyp_dump_setup(void)
 {
 	struct device_node *rtas;
-	const int *dump_header = NULL;
+ 	const struct phyp_dump_header *dump_header = NULL;
+ 	unsigned long dump_area_start;
+ 	unsigned long dump_area_length;
 	int header_len = 0;
 	int rc;
 
@@ -119,7 +232,13 @@ static int __init phyp_dump_setup(void)
 	if (!phyp_dump_info->phyp_dump_configured)
 		return -ENOSYS;
 
-	/* Is there dump data waiting for us? */
+	/* Is there dump data waiting for us? If there isn't,
+	 * then register a new dump area, and release all of
+	 * the rest of the reserved ram.
+	 *
+	 * The /rtas/ibm,kernel-dump rtas node is present only
+	 * if there is dump data waiting for us.
+	 */
 	rtas = of_find_node_by_path("/rtas");
 	if (rtas) {
 		dump_header = of_get_property(rtas, "ibm,kernel-dump",
@@ -127,17 +246,23 @@ static int __init phyp_dump_setup(void)
 		of_node_put(rtas);
 	}
 
-	if (dump_header == NULL)
+	dump_area_length = init_dump_header(&phdr);
+
+	/* align down */
+	dump_area_start = phyp_dump_info->init_reserve_start & PAGE_MASK;
+
+	if (dump_header == NULL) {
+		register_dump_area(&phdr, dump_area_start);
 		return 0;
+	}
 
 	/* Should we create a dump_subsys, analogous to s390/ipl.c ? */
 	rc = sysfs_create_file(kernel_kobj, &rr.attr);
-	if (rc) {
+	if (rc)
 		printk(KERN_ERR "phyp-dump: unable to create sysfs file (%d)\n",
 									rc);
-		return 0;
-	}
 
+	/* ToDo: re-register the dump area, for next time. */
 	return 0;
 }
 machine_subsys_initcall(pseries, phyp_dump_setup);

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2008-03-21 23:43 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-22 19:12 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-01-22 19:26 ` [PATCH 1/8] pseries: phyp dump: Docmentation Manish Ahuja
2008-01-22 19:29 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
2008-01-22 21:00   ` Manish Ahuja
2008-02-07  0:42   ` Paul Mackerras
2008-02-11 18:29     ` Manish Ahuja
2008-01-22 19:33 ` [PATCH 3/8] pseries: phyp dump: use sysfs to release reserved mem Manish Ahuja
2008-01-22 20:09 ` [PATCH 2/8] pseries: phyp dump: reserve-release proof-of-concept Manish Ahuja
2008-01-22 20:15 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-01-22 21:02 ` [PATCH 5/8] pseries: phyp dump: debugging print routines Manish Ahuja
2008-01-22 21:05 ` [PATCH 6/8] pseries: phyp dump: Unregister and print dump areas Manish Ahuja
2008-01-22 21:07 ` [PATCH 7/8] pseries: phyp dump: Tracking memory range freed Manish Ahuja
2008-01-22 21:09 ` [PATCH 8/8] pseries: phyp dump: config file Manish Ahuja
  -- strict thread matches above, loose matches on Subject: below --
2008-03-21 22:42 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-03-21 23:43 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-02-18  4:53 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-02-18  5:40 ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-02-22  0:53 ` [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Michael Ellerman
2008-02-28 23:57   ` Manish Ahuja
2008-02-29  0:29     ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-03-11  6:17       ` Paul Mackerras
2008-01-07 23:45 [PATCH 0/8] pseries: phyp dump: hypervisor-assisted dump Manish Ahuja
2008-02-12  6:31 ` Manish Ahuja
2008-02-12  7:14   ` [PATCH 4/8] pseries: phyp dump: register dump area Manish Ahuja
2008-02-12 10:11     ` Stephen Rothwell
2008-02-12 16:31       ` Manish Ahuja

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).