From: ebiederm@xmission.com (Eric W. Biederman)
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org, kexec@lists.infradead.org,
Simon Horman <horms@verge.net.au>,
yinghai@kernel.org, Thomas Renninger <trenn@suse.de>,
vgoyal@redhat.com
Subject: Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
Date: Wed, 06 Feb 2013 15:39:50 -0800
Message-ID: <87obfx8115.fsf@xmission.com>
In-Reply-To: <5112E30C.50707@zytor.com> (H. Peter Anvin's message of "Wed, 06 Feb 2013 15:11:08 -0800")

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 02/06/2013 03:04 PM, Eric W. Biederman wrote:
>>>
>>> There is another important point, why the command line approach
>>> should be preferred:
>>> Backward compatibility and the ability to backport the whole stuff to
>>> fix mmconf in kdump which would be nice for example for SLES11.
>>
>> Backward compatibility argues for editing the e820 map because we can do
>> that at any time, with no dependencies on any kernel changes. Only
>> the E820_RAM type will be treated as ram. Any unrecognized e820 type
>> will be treated as reserved. The code has always been like that.
>>
>> A new reserved value would be a nice way to tell the kernel about areas
>> that are really ram but that it isn't allowed to touch, but it is
>> unnecessary at this point. Even if we just mark the memory regions we
>> don't use as E820_RESERVED, we match what is currently being done.
>>
>> Since a new reserved value has not been selected, let me suggest
>> 0x6b646d70, aka "kdmp" in ASCII.
>>
>
> I (somewhat) would like to keep the reserved numbers in a small(ish)
> range, which argues against that specific constant. I kind of like
> 0x6bxxxxxx ("k") though, it has some flair to it.

Well, if someone doesn't reserve such a constant in a well-known place, the
historical solution is to pick a random number and hope you don't
collide with someone else's random number. We are pretty close to that
right now with the e820 map.

And coming up sometime soonish is the question of how we do this for the
EFI memory map.

We do need to regenerate the map in /sbin/kexec though to handle
the case of memory hotplug (which necessitates reloading our crash
kernel).
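
To make the "edit the e820 map in /sbin/kexec" idea concrete, here is a rough sketch of what such an edit could look like. It is not taken from kexec-tools: the struct layout, the helper names (e820_is_ram, emit_range, restrict_ram_to_window) and the idea of a single usable window are assumptions made for illustration. Only the E820_RAM and E820_RESERVED values and the rule that anything other than E820_RAM is not treated as ram come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM      1u
#define E820_RESERVED 2u

/* Hypothetical in-memory form of one e820 entry. */
struct e820_range {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Only E820_RAM counts as ram; every other type is left alone. */
static int e820_is_ram(uint32_t type)
{
    return type == E820_RAM;
}

/* Append [start, end) to out; empty ranges are silently skipped. */
static int emit_range(struct e820_range *out, size_t max_out, size_t *n,
                      uint64_t start, uint64_t end, uint32_t type)
{
    if (start >= end)
        return 0;
    if (*n >= max_out)
        return -1;
    out[*n].addr = start;
    out[*n].size = end - start;
    out[*n].type = type;
    (*n)++;
    return 0;
}

/*
 * Copy the map, keeping ram only inside [win_start, win_end) and
 * retagging every other piece of ram as E820_RESERVED.
 * Returns the number of entries written, or -1 if out is too small.
 */
static long restrict_ram_to_window(const struct e820_range *in, size_t n_in,
                                   struct e820_range *out, size_t max_out,
                                   uint64_t win_start, uint64_t win_end)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && win_start <= win_end);
    for (size_t i = 0; i < n_in; i++) {
        uint64_t start = in[i].addr;
        uint64_t end = start + in[i].size;

        if (!e820_is_ram(in[i].type)) {
            /* Non-ram entries, including unknown types, pass through. */
            if (emit_range(out, max_out, &n, start, end, in[i].type))
                return -1;
            continue;
        }

        uint64_t lo = start > win_start ? start : win_start;
        uint64_t hi = end < win_end ? end : win_end;

        if (lo >= hi) {
            /* No overlap with the window: the whole entry is off limits. */
            if (emit_range(out, max_out, &n, start, end, E820_RESERVED))
                return -1;
            continue;
        }
        if (emit_range(out, max_out, &n, start, lo, E820_RESERVED) ||
            emit_range(out, max_out, &n, lo, hi, E820_RAM) ||
            emit_range(out, max_out, &n, hi, end, E820_RESERVED))
            return -1;
    }
    return (long)n;
}
```

Rerunning something like this whenever the crash kernel is reloaded would pick up hotplugged memory, since the input map is read fresh each time. A dedicated type could replace E820_RESERVED for the retagged pieces without changing the structure of the loop.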
>> For backwards compatibility I prefer editing the e820 map in
>> /sbin/kexec.
>>
>>
>> My real preference would be to define a command line option that will
>> work on all architectures that implement kdump, as the crashkernel option
>> does. Unfortunately it looks like that ship has sailed, and there isn't
>> enough desire to fix this to come up with a generic option that will
>> work on more than just x86. But if we could get past the kernel
>> versioning and figure out an arch-generic solution it might be worth it.
>>
>
> What would that option look like?
Probably something like "usemem=<size>@<addr>,..."
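
As a purely illustrative sketch of how such an option value might be parsed: no "usemem=" option exists, so the syntax details here are assumptions. In particular, the K/M/G suffixes are borrowed from how existing options such as memmap= accept sizes, and the names mem_range, parse_size and parse_usemem are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: one <size>@<addr> pair. */
struct mem_range {
    uint64_t addr;
    uint64_t size;
};

/* Parse a number with an optional K/M/G suffix and advance *p past it. */
static int parse_size(const char **p, uint64_t *val)
{
    char *end;
    unsigned long long v;

    if (!isdigit((unsigned char)**p))
        return -1;
    errno = 0;
    v = strtoull(*p, &end, 0);  /* base 0: accepts decimal, 0x hex, 0 octal */
    if (end == *p || errno)
        return -1;
    switch (*end) {
    case 'G': case 'g':
        v <<= 10;  /* fall through */
    case 'M': case 'm':
        v <<= 10;  /* fall through */
    case 'K': case 'k':
        v <<= 10;
        end++;
        break;
    default:
        break;
    }
    *p = end;
    *val = v;
    return 0;
}

/*
 * Parse the text after "usemem=", e.g. "64M@16M,0x1000@0x100000".
 * Returns the number of ranges stored in out, or -1 on a syntax
 * error or if more than max_out ranges are given.
 */
static long parse_usemem(const char *arg, struct mem_range *out, size_t max_out)
{
    const char *p = arg;
    size_t n = 0;

    assert(arg != NULL && out != NULL);
    while (*p) {
        uint64_t size, addr;

        if (n >= max_out)
            return -1;
        if (parse_size(&p, &size) || *p++ != '@' || parse_size(&p, &addr))
            return -1;
        out[n].addr = addr;
        out[n].size = size;
        n++;
        if (*p == ',') {
            p++;
            if (!*p)
                return -1;  /* trailing comma */
        } else if (*p) {
            return -1;      /* junk after a range */
        }
    }
    return (long)n;
}
```

Nothing in the parser is x86 specific, which is the point of a generic option: each architecture would only need to turn the resulting ranges into its own notion of usable memory.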
>>> kexec-tools can detect the kernel version of the kernel which is loaded
>>> as kdump/crash kernel. If its version is:
>>> "$MAINLINE_VERSION_THE_CHANGE_GETS_INTRODUCED"
>>> or newer, things are fine.
>>> But if the kernel version is older, there is no way for kexec-tools to
>>> find out whether the older kernel may have the feature included.
>>> That's bad!
>>
>> That is totally unnecessary for the e820 map because anything
>> unrecognized is treated as reserved, and for the sufficiently paranoid
>> we don't need to use a new memory type.
>
> The only issue is if kdump needs the memory it is going to dump to be
> mapped; we don't map reserved memory anymore unless explicitly requested
> via ioremap(). Does it?

I don't think that it makes sense for the memory to be permanently
mapped. Even at 4MB per terabyte with 2M pages, on the bigger systems
that becomes a noticeable amount of our memory to reserve for kdump.

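
For reference, the 4MB-per-terabyte figure can be reproduced with a small calculation. This sketch assumes x86-64 paging with 8-byte entries and one entry per 2M page, and it ignores the upper-level tables, which are tiny by comparison; the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LARGE_PAGE_SIZE (2ULL << 20)  /* 2M page */
#define ENTRY_SIZE      8ULL          /* bytes per page-table entry (assumed x86-64) */

/* Bytes of last-level page table needed to map mem_bytes with 2M pages. */
static uint64_t large_page_table_bytes(uint64_t mem_bytes)
{
    uint64_t pages;

    /* Guard the round-up below against overflow. */
    assert(mem_bytes <= UINT64_MAX - (LARGE_PAGE_SIZE - 1));
    pages = (mem_bytes + LARGE_PAGE_SIZE - 1) / LARGE_PAGE_SIZE;
    return pages * ENTRY_SIZE;
}
```

One terabyte is 2^40 bytes, which is 2^19 pages of 2M each; at 8 bytes per entry that is 2^22 bytes, or 4MB. A 16 terabyte machine would therefore need 64MB of page tables out of the crash kernel's reservation just to keep the old memory mapped.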
In the general picture we do need to track the memory so that we
remember how the memory should be cached or we run into the possibility
of getting the caching bits set into an inconsistent state.
There is presently work to modify /dev/oldmem and /proc/vmcore so
that they are mmapable, so that userspace can control how much is
ioremapped at once. Currently, on the larger systems, there is a
major performance problem with mapping a single page at a time and
copying that to userspace.
>> The existing e820 handling for unknown types is much much better. It
>> just treats them as reserved and goes about its merry way.
>
> It sounds like this is the way to go.
It certainly looks good. We still need someone with the time to write
the patch and test it.
Eric