From: Michael Ellerman <michael@ellerman.id.au>
To: Milton Miller <miltonm@bga.com>
Cc: linux-ppc <linuxppc-dev@ozlabs.org>, kexec@lists.infradead.org
Subject: Re: [PATCH] powerpc: check crash_base for relocatable kernel
Date: Thu, 08 Jan 2009 14:35:42 +1100
Message-ID: <1231385742.8294.31.camel@localhost>
In-Reply-To: <f378ca9151a64ea3cead1b7354e8153c@bga.com>
On Wed, 2009-01-07 at 08:57 -0600, Milton Miller wrote:
> [removed Paul from cc and fixed Mohan's email]
>
> On Jan 6, 2009, at 5:44 PM, Michael Ellerman wrote:
>
> > On Fri, 2009-01-02 at 14:46 -0600, Milton Miller wrote:
> >> @@ -94,10 +95,35 @@ void __init reserve_crashkernel(void)
> >>  				KDUMP_KERNELBASE);
> >>
> >>  	crashk_res.start = KDUMP_KERNELBASE;
> >> +#else
> >> +	if (!crashk_res.start) {
> >> +		/*
> >> +		 * unspecified address, choose a region of specified size
> >> +		 * can overlap with initrd (ignoring corruption when retained)
> >> +		 * ppc64 requires kernel and some stacks to be in first segment
> >> +		 */
> >> +		crashk_res.start = KDUMP_KERNELBASE;
> >> +	}
> >> +
> >> +	crash_base = PAGE_ALIGN(crashk_res.start);
> >> +	if (crash_base != crashk_res.start) {
> >> +		printk("Crash kernel base must be aligned to 0x%lx\n",
> >> +				PAGE_SIZE);
> >> +		crashk_res.start = crash_base;
> >> +	}
> >> +
> >>  #endif
> >>  	crash_size = PAGE_ALIGN(crash_size);
> >>  	crashk_res.end = crashk_res.start + crash_size - 1;
> >>
> >> +	/* The crash region must not overlap the current kernel */
> >> +	if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
> >> +		printk(KERN_WARNING
> >> +			"Crash kernel can not overlap current kernel\n");
> >> +		crashk_res.start = crashk_res.end = 0;
> >> +		return;
> >> +	}
> >
> > I think we can be smarter here. Why don't we adjust the crash kernel
> > region so that it doesn't overlap the first kernel? ie. move it up a
> > bit.
>
> How much? In addition to the size of the kernel, we have to allocate
> (1) the emergency stacks as we use them to bring up secondary cpus (2)
> the irq stacks in the first segment. While the second could be met
> easier on systems with 1TB slbs we don't take advantage of that yet.
Hmm, we could try and work it out though. I guess we don't know how many
CPUs we have at that point, which makes it a little trickier.
So we have the emergency stack and the hard & soft irq stacks per cpu,
which is 48KB AFAICT. So for a 256-way system that would be 12MB.
I don't think I've seen an RMO smaller than 128MB, though I notice our
RPA note specifies 64M as the minimum we'll accept. That would probably
be a bit tight.
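
As a rough check of the numbers above, here is a minimal sketch in C. The 48KB per-CPU figure is the estimate quoted above; the helper name and the constant are made up for illustration and do not come from the kernel source.

```c
/* Estimate from the text above: emergency + hard irq + soft irq stacks. */
#define PER_CPU_STACKS_KB 48UL

/*
 * Hypothetical helper: total first-segment stack footprint in KB
 * for a machine with ncpus possible CPUs.
 */
static unsigned long stack_footprint_kb(unsigned long ncpus)
{
	return ncpus * PER_CPU_STACKS_KB;
}
```

For 256 CPUs this gives 12288KB, i.e. the 12MB mentioned above, which is just under a fifth of a 64MB RMO before the kernel image itself is counted.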
How about something like:
    min_space = _end + 16MB    (16 to be safe?)
    if min_space < rmo_size / 2:
        min_space = rmo_size / 2
    if crash_base < min_space:
        crash_base = min_space
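
The same idea as a small C sketch, only to make the clamping explicit. The function name, its parameters and the CRASH_SAFETY_MARGIN constant are hypothetical; in the kernel the inputs would come from __pa(_end) and whatever the platform reports as the RMO size, and the 16MB margin is just the guess from above.

```c
#include <stdint.h>

#define MB                  (1024ULL * 1024ULL)
#define CRASH_SAFETY_MARGIN (16 * MB)	/* guess from the text, not a tuned value */

/*
 * Hypothetical helper: return crash_base moved up, if needed, so that it
 * sits above the first kernel plus a safety margin, and no lower than
 * half of the RMO.
 */
static uint64_t clamp_crash_base(uint64_t crash_base, uint64_t kernel_end,
				 uint64_t rmo_size)
{
	uint64_t min_space = kernel_end + CRASH_SAFETY_MARGIN;

	if (min_space < rmo_size / 2)
		min_space = rmo_size / 2;
	if (crash_base < min_space)
		crash_base = min_space;

	return crash_base;
}
```

With a kernel ending at 20MB and a 128MB RMO, a requested base of 32MB would be moved up to 64MB; with a 64MB RMO it would be moved to 36MB. A base already above the minimum is left alone.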
> > There's also the issue of the RMO, I'm not sure what we should do there,
> > but I think the kernel needs some smarts otherwise users are going to
> > shoot themselves in the foot.
>
> I was looking at the code in kexec-tools for the rmo, and it seems
> extremely broken (ie it sets rmo_top on every memory block instead of
> the lowest; the clamp to 768M is the savior for systems with multiple
> blocks).
Oh surprise.
> Do we care about loading a kernel below a relocated kernel (between the
> interrupt vectors and the new kernel)? I ignored that for now,
> arguing that we always run the first kernel at 0.
No, I don't think so.
> > We could ignore the @x setting and split the RMO between both kernels
> > somewhat intelligently.
> >
> > What might work is multiple crash regions, that way we could have some
> > space in the RMO for the second kernel (say 32MB?), but the rest outside
> > - leaving some RMO for the first kernel. But I think that would require
> > some serious surgery.
> >
>
> Other archs have this, I guess because they read the memory out of
> /proc/iomem. The trick is knowing what has to be put in real space
> and what can go above the rmo. Also, we have that horrible hard-coding
> of the rmo to 768M max because some platform (one of the cell ones?)
> didn't make the device tree show it. Maybe we can track it down and add
> linux,usable-mem-ranges to fix it up?
Dunno about the cell, but some of the early blades did have crufty
firmware.
> Does the generic code support loading into the split regions, or is it
> just for giving the kernel room to run?
I don't think so. I don't see any logic that deals with gaps in the
crashk region.
> So while all of these are nice, what do you think about merging this as
> an interim measure, especially for backporting to 2.6.28 stable (and any
> distro that wants to pick up relocatable kdump)?
I guess. I'd rather do something smarter, like I suggested above.
cheers
--
Michael Ellerman
OzLabs, IBM Australia Development Lab
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person