From: Vivek Goyal <vgoyal@redhat.com>
To: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@elte.hu>,
kexec <kexec@lists.infradead.org>,
caiqian@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: kexec load failure introduced by "x86, memblock: Replace e820_/_early string with memblock_"
Date: Tue, 28 Sep 2010 10:01:12 -0400
Message-ID: <20100928140112.GB8950@redhat.com>
In-Reply-To: <4CA195D7.4010706@kernel.org>
On Tue, Sep 28, 2010 at 12:14:31AM -0700, Yinghai Lu wrote:
> On 09/27/2010 08:46 PM, H. Peter Anvin wrote:
> > On 09/27/2010 05:53 PM, Vivek Goyal wrote:
> >>
> >> Actually, hardcoding the upper limit to 4G is probably not the best idea.
> >> Kexec loads the relocatable binary (purgatory) and I remember that
> >> one of the generated relocation type was signed 32 bit and allowed max value
> >> to be 2G only. So IIRC, purgatory code always needed to be loaded below 2G.
> >>
> >> I liked HPA's other idea better of introducing memblock_find_in_range_lowest()
> >> so that we search bottom up and not rely on a specific upper limit.
> >>
> >
> > No, it's just another crappy hack which is broken in the same way. It's
> > better than open-coding, but it's still a hack.
> >
> > The Right Thing[TM] to do is for kexec to communicate the topmost
> > address it wants to this code, so it has both the upper and the lower
> > boundaries available to it instead of just one.
>
> hope you are happy with this one.
>
> [PATCH -v5] x86, memblock: Fix crashkernel allocation
>
> Cai Qian found crashkernel is broken with x86 memblock changes
> 1. crashkernel=128M@32M always reported that range is used, even first kernel is small
> no one use that range
> 2. always get following report when using "kexec -p"
> Could not find a free area of memory of a000 bytes...
> locate_hole failed
>
> The root cause is that generic memblock_find_in_range() will try to get range from top_down.
> But crashkernel do need from low and specified range.
>
> Let's limit the target range with crash_base + crash_size to make sure that
> We get range from bottom.
>
> -v5: use DEFAULT_BZIMAGE_ADDR_MAX to limit area that could be used by bzImage.
> also second try for vmlinux or new kexec tools will use bzImage 64bit entry
>
> Reported-and-Bisected-by: CAI Qian <caiqian@redhat.com>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>
> ---
> arch/x86/kernel/setup.c | 24 ++++++++++++++++++------
> 1 file changed, 18 insertions(+), 6 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/setup.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/setup.c
> +++ linux-2.6/arch/x86/kernel/setup.c
> @@ -501,6 +501,7 @@ static inline unsigned long long get_tot
> return total << PAGE_SHIFT;
> }
>
> +#define DEFAULT_BZIMAGE_ADDR_MAX 0x37FFFFFF
> static void __init reserve_crashkernel(void)
> {
> unsigned long long total_mem;
> @@ -518,17 +519,28 @@ static void __init reserve_crashkernel(v
> if (crash_base <= 0) {
> const unsigned long long alignment = 16<<20; /* 16M */
>
> - crash_base = memblock_find_in_range(alignment, ULONG_MAX, crash_size,
> - alignment);
> + /*
> + * Assume half crash_size is for bzImage
> + * kexec want bzImage is below DEFAULT_BZIMAGE_ADDR_MAX
> + */
> + crash_base = memblock_find_in_range(alignment,
> + DEFAULT_BZIMAGE_ADDR_MAX + crash_size/2,
> + crash_size, alignment);
> +
IMHO, this kind of hardcoding is worse than finding the lowest possible
address. It assumes that kexec is going to load a bzImage.

So we have the following three options, sorted from best to worst:

- Specify the upper limit in the "crashkernel=" command line syntax.
- Find the lowest possible address for the crashkernel reservation.
- Hardcode the upper limit based on certain factors.

Because the upper limit depends on the image being loaded and can also vary
as kexec-tools changes, knowing it for sure will require an extra reboot. It
also makes the command line syntax more complicated, as we need to introduce
another field to specify the upper limit, especially for the following case:

    crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]

So personally I think we can stick with the second-best option, which is
finding the lowest possible memory area.
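
To illustrate why a bottom-up search avoids the upper-limit question, here
is a rough userspace sketch. It is not the kernel's memblock code: struct
free_region, find_lowest() and find_highest() are made-up names used only
for illustration, and the free-memory layout is assumed to be a sorted,
non-overlapping array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-memory descriptor; NOT the kernel's memblock type. */
struct free_region {
	uint64_t base;
	uint64_t size;
};

/* Round addr up/down to a multiple of align (align is a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/*
 * Bottom-up: lowest aligned address in [start, end) where `size` bytes
 * fit inside a single free region. Returns 0 if nothing fits.
 * Regions must be sorted by base and must not overlap.
 */
static uint64_t find_lowest(const struct free_region *r, size_t n,
			    uint64_t start, uint64_t end,
			    uint64_t size, uint64_t align)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t lo = r[i].base > start ? r[i].base : start;
		uint64_t hi = r[i].base + r[i].size;

		if (hi > end)
			hi = end;
		lo = align_up(lo, align);
		if (lo < hi && hi - lo >= size)
			return lo;
	}
	return 0;
}

/* Top-down: highest aligned address in [start, end) that fits, for contrast. */
static uint64_t find_highest(const struct free_region *r, size_t n,
			     uint64_t start, uint64_t end,
			     uint64_t size, uint64_t align)
{
	for (size_t i = n; i-- > 0; ) {
		uint64_t lo = r[i].base > start ? r[i].base : start;
		uint64_t hi = r[i].base + r[i].size;
		uint64_t cand;

		if (hi > end)
			hi = end;
		if (hi <= lo || hi - lo < size)
			continue;
		cand = align_down(hi - size, align);
		if (cand >= lo)
			return cand;
	}
	return 0;
}
```

With an assumed layout of free RAM at 1M-1G and 4G-8G, a 128M request
aligned to 16M and no upper limit, find_lowest() returns 16M while
find_highest() returns an address just below 8G. The first is usable
whatever limit the loaded image has (as long as anything low is free); the
second is only usable if the caller happens to pass the right upper limit.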
Thanks
Vivek