From: Michael Ellerman <michael@ellerman.id.au>
To: Nathan Lynch <ntl@pobox.com>
Cc: lkessler@us.ibm.com, linuxppc-dev@ozlabs.org, mahuja@us.ibm.com,
linasvepstas@gmail.com, strosake@us.ibm.com
Subject: Re: [PATCH 1/8] pseries: phyp dump: Docmentation
Date: Wed, 09 Jan 2008 15:58:35 +1100
Message-ID: <1199854715.14578.1.camel@concordia>
In-Reply-To: <20080109042911.GT14201@localdomain>
On Tue, 2008-01-08 at 22:29 -0600, Nathan Lynch wrote:
> Manish Ahuja wrote:
> > +
> > + Hypervisor-Assisted Dump
> > + ------------------------
> > + November 2007
>
> Date is unneeded (and, uhm, dated :)
>
>
> > +The goal of hypervisor-assisted dump is to enable the dump of
> > +a crashed system, and to do so from a fully-reset system, and
> > +to minimize the total elapsed time until the system is back
> > +in production use.
>
> Is it actually faster than kdump?
>
>
> > +As compared to kdump or other strategies, hypervisor-assisted
> > +dump offers several strong, practical advantages:
> > +
> > +-- Unlike kdump, the system has been reset, and loaded
> > + with a fresh copy of the kernel. In particular,
> > + PCI and I/O devices have been reinitialized and are
> > + in a clean, consistent state.
> > +-- As the dump is performed, the dumped memory becomes
> > + immediately available to the system for normal use.
> > +-- After the dump is completed, no further reboots are
> > + required; the system will be fully usable, and running
> > + in its normal, production mode on its normal kernel.
> > +
> > +The above can only be accomplished by coordination with,
> > +and assistance from the hypervisor. The procedure is
> > +as follows:
> > +
> > +-- When a system crashes, the hypervisor will save
> > + the low 256MB of RAM to a previously registered
> > + save region. It will also save system state, system
> > + registers, and hardware PTE's.
> > +
> > +-- After the low 256MB area has been saved, the
> > + hypervisor will reset PCI and other hardware state.
> > + It will *not* clear RAM. It will then launch the
> > + bootloader, as normal.
> > +
> > +-- The freshly booted kernel will notice that there
> > + is a new node (ibm,dump-kernel) in the device tree,
> > + indicating that there is crash data available from
> > + a previous boot. It will boot into only 256MB of RAM,
> > + reserving the rest of system memory.
> > +
> > +-- Userspace tools will parse /sys/kernel/release_region
> > + and read /proc/vmcore to obtain the contents of memory,
> > + which holds the previous crashed kernel. The userspace
> > + tools may copy this info to disk, or network, nas, san,
> > + iscsi, etc. as desired.
> > +
> > + For Example: the values in /sys/kernel/release-region
> > + would look something like this (address-range pairs).
> > + CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
> > + DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A
> > +
> > +-- As the userspace tools complete saving a portion of
> > + dump, they echo an offset and size to
> > + /sys/kernel/release_region to release the reserved
> > + memory back to general use.
> > +
> > + An example of this is:
> > + "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
> > + which will release 256MB at the 1GB boundary.
>
> This violates the "one file, one value" rule of sysfs, but nobody
> really takes that seriously, I guess. In any case, consider
> documenting this in Documentation/ABI.
>
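For anyone following the sysfs discussion, here is a rough userspace
sketch of the flow the quoted text describes: parse the address/length
pairs, then release a region chunk by chunk. The file format is an
assumption inferred from the single example line above, and the names
parse_regions and release_commands are made up for illustration; they
are not part of the patch.

```python
import re

# Example contents copied from the quoted documentation (the " /" line
# continuation in the original text is dropped here).
EXAMPLE = ("CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: "
           "DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A")

# An optional "LABEL:" prefix followed by a hex "address-length" pair.
PAIR = re.compile(r"(?:([A-Z]+):)?\s*(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def parse_regions(text):
    """Return {label: [(address, length), ...]}.

    Assumption: a pair without its own label belongs to the most
    recently seen label (as with the second DUMP pair in the example).
    """
    regions = {}
    label = None
    for match in PAIR.finditer(text):
        if match.group(1):
            label = match.group(1)
        pair = (int(match.group(2), 16), int(match.group(3), 16))
        regions.setdefault(label, []).append(pair)
    return regions


def release_commands(start, length, chunk=0x10000000):
    """Yield "offset size" strings covering [start, start + length).

    Each string releases at most `chunk` bytes (256MB by default) and
    has the same shape as the echo example in the quoted text.
    """
    end = start + length
    while start < end:
        size = min(chunk, end - start)
        yield "0x%x 0x%x" % (start, size)
        start += size
```

A tool would write each yielded string to /sys/kernel/release_region
only after the matching portion of the dump has been saved.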
>
> > +
> > +Please note that the hypervisor-assisted dump feature
> > +is only available on Power6-based systems with recent
> > +firmware versions.
>
> This statement will of course become dated/incorrect so I recommend
> removing it.
>
>
> > +
> > +Implementation details:
> > +----------------------
> > +In order for this scheme to work, memory needs to be reserved
> > +quite early in the boot cycle. However, access to the device
> > +tree this early in the boot cycle is difficult, and device-tree
> > +access is needed to determine if there is crash data waiting.
>
> I don't think this bit about early device tree access is correct. By
> the time your code is reserving memory (from early_init_devtree(), I
> think), RTAS has been instantiated and you are able to test for the
> existence of /rtas/ibm,dump-kernel.

Yep it's early_init_devtree(), and yes it's fairly easy to access the
(flattened) device tree at that point.

> > +To work around this problem, all but 256MB of RAM is reserved
> > +during early boot. A short while later in boot, a check is made
> > +to determine if there is dump data waiting. If there isn't,
> > +then the reserved memory is released to general kernel use.
>
> So I think these gymnastics are unneeded -- unless I'm
> misunderstanding something, you should be able to determine very early
> whether to reserve that memory.

I agree.

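To make the simpler flow concrete, here is a toy model of the decision,
written in Python rather than kernel C purely for illustration. The
nested dict standing in for the flattened device tree and the names
dump_pending and memory_to_reserve are hypothetical; only the
/rtas/ibm,dump-kernel path and the 256MB figure come from the
discussion above.

```python
BOOT_MEM = 256 * 1024 * 1024  # the kernel boots into only the low 256MB


def dump_pending(devtree):
    """True if the modelled device tree contains /rtas/ibm,dump-kernel."""
    return "ibm,dump-kernel" in devtree.get("rtas", {})


def memory_to_reserve(devtree, total_ram):
    """Return (start, size) of the region to reserve, or None.

    The decision is made once, early in boot: everything above 256MB is
    reserved only when dump data from a previous boot is waiting.
    Otherwise nothing is reserved, so nothing has to be released again
    later in boot.
    """
    if not dump_pending(devtree) or total_ram <= BOOT_MEM:
        return None
    return (BOOT_MEM, total_ram - BOOT_MEM)
```

With this shape there is no reserve-then-release step on a normal boot,
which is the point of the comment quoted above.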
cheers
--
Michael Ellerman
OzLabs, IBM Australia Development Lab
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person