From: "H. Peter Anvin" <h.peter.anvin@intel.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>,
	linux-kernel@vger.kernel.org, vgoyal@redhat.com,
	hbabu@us.ibm.com, kexec@lists.infradead.org,
	ying.huang@intel.com, mingo@elte.hu, tglx@linutronix.de,
	sam@ravnborg.org
Subject: Re: [PATCH 00/14] RFC: x86: relocatable kernel changes
Date: Fri, 08 May 2009 11:04:30 -0700
Message-ID: <4A04742E.8050506@intel.com>
In-Reply-To: <m1tz3wgk3n.fsf@fess.ebiederm.org>

Eric W. Biederman wrote:
>>
>>> The direction of this patch seems reasonable.  The details are broken.
>>> The common case for relocatable kernels today is kdump.  A situation
>>> with very minimal memory.  In that situation the kernel needs to run
>>> where we put it, modifying the kernel to not run where it gets put
>>> is a problem.
>> I thought in the kdump case you typically loaded it pretty high?  Either
>> which way, kdump is always loaded by kexec, so it should just be a
>> matter of updating kexec to zero the runtime_start field, no?
> 
> Yes.  In practice it doesn't matter. I just don't want to get into a
> contest with the kernel about who knows better how to put the kernel
> in memory the bootloader or the kernel decompressor.
> 
>> Basically
>> this is the bootloader saying "do what I say, dammit."  Since the
>> existing protocol doesn't have a way to unambiguously communicate one
>> direction versus another (see below), it seems like a relatively small
>> issue involving only one tool.  Suboptimal, yes.
> 
> The existing protocol doesn't have the option of anything else.
> 
> Physical start has always been <= the alignment for x86 and x86_64,
> in any real world configuration.

That assumption seems to be the fundamental flaw of the relocation
protocol as written, and is exactly what provoked this whole thing.
We really want to run above 16 MB, not just because of the 15 MB hole
but also for ZONE_DMA reasons.

> Something goofy may have happened during unification, I thought I had
> removed physical start as totally unnecessary from x86_64.
> 
> In the non-kdump case this is interesting.  I know of instances where
> kexec is burned in firmware.  So I am strongly reluctant to make anything
> that feels like a true backwards incompatible change.
> 
> Those systems also don't have the stupid 15MB hole either.

OK, kexec in firmware is probably a showstopper... assuming *those*
kexec instances care about the exact final location of the code.
Otherwise, if all they are doing is loading the kernel and want it to
take over the machine, the proposed behavior (realign the kernel to a
more optimal point) is pretty much The Right Thing.  Could you expand on
this use case?  This seems like a key piece of the puzzle.
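
To make the behavior under discussion concrete, here is a minimal sketch
in C. It is not the code from the patch series: choose_run_address() and
its parameters are hypothetical names, and treating a zero preferred
address as "the bootloader's placement wins" is my reading of the
zeroed runtime_start idea mentioned earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the address the kernel finally runs at.
 *   load_addr  - where the bootloader placed the image
 *   align      - required alignment (assumed to be a power of two)
 *   preferred  - preferred minimum start, e.g. 16 MB; 0 is assumed to
 *                mean "run exactly where the bootloader put us"
 */
static uint64_t choose_run_address(uint64_t load_addr, uint64_t align,
                                   uint64_t preferred)
{
    /* Round the load address up to the required alignment. */
    uint64_t run = (load_addr + align - 1) & ~(align - 1);

    /* A zeroed preference means the bootloader overrides the kernel. */
    if (preferred == 0)
        return run;

    /* Otherwise never run below the preferred start address. */
    return run < preferred ? preferred : run;
}
```

Under these assumptions a kernel loaded at 1 MB with a 16 MB preference
ends up at 16 MB, while one loaded above 16 MB is only rounded up to its
alignment.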

It's pretty well understood that we can't require changes to the tons
of deployed bootloaders, but at the same time we're stuck with
overloaded semantics that have to be disambiguated.

> On the 64bit kernel 2MB really is required.  We run at a fixed virtual
> address and use 2MB pages. So anything less than 2MB really won't work.
> 
> So I think it would be a bad idea if we had bootloaders ignoring the
> alignment.
> 
> With the suggested start address, it probably makes sense to only
> export our true alignment requirement.

On 32 bits (which is the only case where one megabyte could possibly
matter) we *can* run at 1 MB, and that was the main case I was worrying
about there.  On the other hand, even very early Linux just barely ran
in 4 MB of RAM, and perhaps an alignment restriction of 4 MB (the
non-PAE case) handles even the smallest configurations?  If so we can
probably get away with just disallowing alignment < 2 MB and use your
solution.
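
As a rough illustration of what "disallowing alignment < 2 MB" could
look like, here is a sketch in C. The names MIN_KERNEL_ALIGN,
kernel_align_ok() and kernel_align_up() are made up for this example
and do not come from the kernel tree or from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed lower bound on the exported alignment: 2 MB. */
#define MIN_KERNEL_ALIGN (UINT32_C(2) << 20)

/* An alignment is usable if it is a power of two and at least 2 MB. */
static bool kernel_align_ok(uint32_t align)
{
    bool power_of_two = align != 0 && (align & (align - 1)) == 0;

    return power_of_two && align >= MIN_KERNEL_ALIGN;
}

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t kernel_align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

With this check a 1 MB alignment is rejected, 2 MB and 4 MB are
accepted, and a kernel loaded at 1 MB with 2 MB alignment is moved up
to 2 MB.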

>>> I expect we will still want to update kexec to be able to take
>>> advantage of loadtime_size (runtime_size seems like the wrong name).
>> Well, it is the amount of memory the kernel needs during runtime (as
>> opposed to during loading.)  I admit it's not an ideal name, though.  On
>> the other hand, simply calling it kernel_start and kernel_size seemed
>> ambiguous.
> 
> It is the amount of memory we need before a true memory allocator is
> initialized.  Essentially text+data+bss.  How about we call it init_size?
> 
> Perhaps we should have:
> init_size
> best start (As a 64bit field please)
> optimum align  (Or we flip it around)

I did think about that (64 bits), but I came to the conclusion that in
any case where we're supporting loading over 4 GB we need to be fully
relocatable anyway -- plus we need a whole bunch of other protocol
changes.  This is not in itself a reason not to do it, but the size of
the initialized header is limited to just over 127 bytes without a much
bigger change (since the size of the structure has to fit inside a
single signed byte at 0x201).
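
The arithmetic behind that limit can be sketched in C. This assumes the
instruction at 0x200 is a two-byte short jump whose signed 8-bit
displacement sits at 0x201, so the header ends where the jump lands;
the macro and function names are mine, for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the byte following the two-byte short jump at 0x200. */
#define SETUP_JUMP_NEXT 0x202u

/*
 * The jump target, and therefore the end of the setup header, is the
 * address after the jump plus the signed displacement stored at 0x201.
 */
static unsigned int setup_header_end(int8_t rel8)
{
    return SETUP_JUMP_NEXT + rel8;
}

/* Largest possible header end: displacement at its maximum, +127. */
static unsigned int setup_header_max_end(void)
{
    return setup_header_end(INT8_MAX);
}
```

With the largest displacement the header can extend to 0x281 and no
further, which is why widening fields has a real cost here.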

	-hpa
