linuxppc-dev.lists.ozlabs.org archive mirror
From: Michael Ellerman <michael@ellerman.id.au>
To: Milton Miller <miltonm@bga.com>
Cc: linux-ppc <linuxppc-dev@ozlabs.org>, kexec@lists.infradead.org
Subject: Re: [PATCH] powerpc: check crash_base for relocatable kernel
Date: Thu, 08 Jan 2009 14:35:42 +1100
Message-ID: <1231385742.8294.31.camel@localhost>
In-Reply-To: <f378ca9151a64ea3cead1b7354e8153c@bga.com>


On Wed, 2009-01-07 at 08:57 -0600, Milton Miller wrote:
> [removed Paul from cc and fixed Mohan's email]
> 
> On Jan 6, 2009, at 5:44 PM, Michael Ellerman wrote:
> 
> > On Fri, 2009-01-02 at 14:46 -0600, Milton Miller wrote:
> >> @@ -94,10 +95,35 @@ void __init reserve_crashkernel(void)
> >>  				KDUMP_KERNELBASE);
> >>
> >>  	crashk_res.start = KDUMP_KERNELBASE;
> >> +#else
> >> +	if (!crashk_res.start) {
> >> +		/*
> >> +		 * unspecified address, choose a region of specified size
> >> +		 * can overlap with initrd (ignoring corruption when retained)
> >> +		 * ppc64 requires kernel and some stacks to be in first segment
> >> +		 */
> >> +		crashk_res.start = KDUMP_KERNELBASE;
> >> +	}
> >> +
> >> +	crash_base = PAGE_ALIGN(crashk_res.start);
> >> +	if (crash_base != crashk_res.start) {
> >> +		printk("Crash kernel base must be aligned to 0x%lx\n",
> >> +				PAGE_SIZE);
> >> +		crashk_res.start = crash_base;
> >> +	}
> >> +
> >>  #endif
> >>  	crash_size = PAGE_ALIGN(crash_size);
> >>  	crashk_res.end = crashk_res.start + crash_size - 1;
> >>
> >> +	/* The crash region must not overlap the current kernel */
> >> +	if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
> >> +		printk(KERN_WARNING
> >> +			"Crash kernel can not overlap current kernel\n");
> >> +		crashk_res.start = crashk_res.end = 0;
> >> +		return;
> >> +	}
> >
> > I think we can be smarter here. Why don't we adjust the crash kernel
> > region so that it doesn't overlap the first kernel? ie. move it up a
> > bit.
> 
> How much? In addition to the size of the kernel, we have to allocate
> (1) the emergency stacks, as we use them to bring up secondary cpus, and
> (2) the irq stacks in the first segment. While the second could be met
> more easily on systems with 1TB slbs, we don't take advantage of that yet.

Hmm, we could try and work it out though. I guess we don't know how many
CPUs we have at that point, which makes it a little trickier.

So we have the emergency stack and the hard & soft irq stacks per cpu,
which is 48KB AFAICT. So for a 256-way system that would be 12MB.
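
As a rough check of that arithmetic, here is a minimal sketch in C. It
assumes the 48KB is three stacks of 16KB each (emergency, hard irq and
soft irq); the 16KB figure and the names STACK_SIZE, STACKS_PER_CPU and
stack_bytes() are assumptions for illustration, not taken from the kernel
source.

```c
#include <stdint.h>

#define KB 1024ULL

/* Assumption: each per-cpu stack is 16KB, and there are three of them
 * (emergency, hard irq, soft irq), giving the 48KB quoted above. */
#define STACK_SIZE	(16 * KB)
#define STACKS_PER_CPU	3

/* Total bytes of per-cpu stacks needed for ncpus cpus. */
uint64_t stack_bytes(unsigned int ncpus)
{
	return (uint64_t)ncpus * STACKS_PER_CPU * STACK_SIZE;
}
```

With these assumptions stack_bytes(256) comes to 12MB, matching the
estimate above.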

I don't think I've seen an RMO smaller than 128MB, though I notice our
RPA note specifies 64M as the minimum we'll accept. That would probably
be a bit tight.

How about something like:

min_space = _end + 16MB		(16 to be safe?)

if min_space < rmo_size / 2:
	min_space = rmo_size / 2

if crash_base < min_space:
	crash_base = min_space
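
The same idea written as plain C, as a sketch only and not a patch.
kernel_end stands in for the physical address of _end and rmo_size for
the size of the RMO; both parameter names and the helper names
min_crash_base() and adjust_crash_base() are hypothetical.

```c
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/*
 * Lowest address at which the crash region may start.
 * kernel_end: physical end of the first kernel (assumed input).
 * rmo_size:   size of the real mode region (assumed input).
 */
uint64_t min_crash_base(uint64_t kernel_end, uint64_t rmo_size)
{
	/* Leave 16MB above the first kernel for the per-cpu stacks. */
	uint64_t min_space = kernel_end + 16 * MB;

	/* Keep at least the lower half of the RMO for the first kernel. */
	if (min_space < rmo_size / 2)
		min_space = rmo_size / 2;

	return min_space;
}

/* Move crash_base up if it falls below the minimum, else leave it alone. */
uint64_t adjust_crash_base(uint64_t crash_base, uint64_t kernel_end,
			   uint64_t rmo_size)
{
	uint64_t min_space = min_crash_base(kernel_end, rmo_size);

	return crash_base < min_space ? min_space : crash_base;
}
```

For example, with a kernel ending at 24MB and a 128MB RMO, a requested
base of 32MB would be moved up to 64MB, while a requested base of 96MB
would be left untouched.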

> > There's also the issue of the RMO, I'm not sure what we should do 
> > there,
> > but I think the kernel needs some smarts otherwise users are going to
> > shoot themselves in the foot.
> 
> I was looking at the code in kexec-tools for the rmo, and it seems 
> extremely broken (ie it sets rmo_top on every memory block instead of 
> the lowest; the clamp to 768M is the savior for systems with multiple 
> blocks).

Oh surprise.

> Do we care about loading a kernel below a relocated kernel (between the 
> interrupt vectors and the new kernel)?   I ignored that for now, 
> arguing that we always run the first kernel at 0.

No I don't think so.

> > We could ignore the @x setting and split the RMO between both kernels
> > somewhat intelligently.
> >
> > What might work is multiple crash regions, that way we could have some
> > space in the RMO for the second kernel (say 32MB?), but the rest 
> > outside
> > - leaving some RMO for the first kernel. But I think that would require
> > some serious surgery.
> >
> 
> Other archs have this, I guess because they read the memory out of
> /proc/iomem. The trick is knowing what has to be put in real space
> and what can go above the rmo. Also, we have that horrible hard-coded
> clamp of the rmo to 768M max because some platform (one of the cell
> ones?) didn't show it in the device tree. Maybe we can track it down
> and add linux,usable-mem-ranges to fix it up?

Dunno about the cell, but some of the early blades did have crufty
firmware.

> Does the generic code support loading into the split regions, or is it 
> just for giving the kernel room to run?

I don't think so. I don't see any logic that deals with gaps in the
crashk region.

> So while all of these are nice, what do you think about merging this as
> an interim measure, especially for backporting to 2.6.28 stable (and any
> distro that wants to pick up relocatable kdump)?

I guess. I'd rather do something smarter, like I suggested above.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

