* 896MB address limit
@ 2012-09-24 23:07 Cliff Wickman
2012-09-25 3:11 ` Eric W. Biederman
0 siblings, 1 reply; 8+ messages in thread
From: Cliff Wickman @ 2012-09-24 23:07 UTC (permalink / raw)
To: kexec
Gentlemen,
In dumping very large memories we are running up against the 896MB
limit in SLES11SP2 (3.0.38 kernel).
arch/x86/kernel/setup.c
/*
 * Keep the crash kernel below this limit. On 32 bits earlier kernels
 * would limit the kernel to the low 512 MiB due to mapping restrictions.
 * On 64 bits, kexec-tools currently limits us to 896 MiB; increase this
 * limit once kexec-tools are fixed.
 */
#ifdef CONFIG_X86_32
# define CRASH_KERNEL_ADDR_MAX (512 << 20)
#else
# define CRASH_KERNEL_ADDR_MAX (896 << 20)
#endif
The /sbin/kexec we are using is from kexec-tools-2.0.0-53.43.10
Can you tell me if this limit has been removed in the current kexec-tools? And
if so, can I use a later release of kexec-tools with this kernel?
Or, perhaps you can point me to a patch that removes this limit.
Thanks.
-Cliff
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: 896MB address limit
2012-09-24 23:07 896MB address limit Cliff Wickman
@ 2012-09-25 3:11 ` Eric W. Biederman
2012-09-25 14:18 ` Cliff Wickman
2012-09-25 17:38 ` Vivek Goyal
0 siblings, 2 replies; 8+ messages in thread
From: Eric W. Biederman @ 2012-09-25 3:11 UTC (permalink / raw)
To: Cliff Wickman; +Cc: kexec
Cliff Wickman <cpw@sgi.com> writes:
> Gentlemen,
>
> In dumping very large memories we are running up against the 896MB
> limit in SLES11SP2 (3.0.38 kernel).
Odd. That limit should be the maximum address in memory at which to load
the crash kernel. That limit should have nothing to do with the dump process
itself.
Are you saying you need more than 512MiB reserved for the crash kernel
to be able to dump all of the memory in your system?
Eric
* Re: 896MB address limit
2012-09-25 3:11 ` Eric W. Biederman
@ 2012-09-25 14:18 ` Cliff Wickman
2012-09-25 15:10 ` Eric W. Biederman
2012-09-25 17:38 ` Vivek Goyal
1 sibling, 1 reply; 8+ messages in thread
From: Cliff Wickman @ 2012-09-25 14:18 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: kexec
Hi Eric, and all,
On Mon, Sep 24, 2012 at 08:11:12PM -0700, Eric W. Biederman wrote:
> Cliff Wickman <cpw@sgi.com> writes:
>
> > Gentlemen,
> >
> > In dumping very large memories we are running up against the 896MB
> > limit in SLES11SP2 (3.0.38 kernel).
>
> Odd. That limit should be the maximum address in memory to load the
> crash kernel. That limit should have nothing to do with the dump process
> itself.
>
> Are you saying you need more than 512MiB reserved for the crash kernel
> to be able to dump all of the memory in your system?
>
> Eric
As I noted to Eric privately, yes we need to bump up to crashkernel=1G
or more for some very large memories.
As an experiment I bumped
+++ linux/arch/x86/kernel/setup.c
@@ -528,7 +528,7 @@ static inline unsigned long long get_tot
#ifdef CONFIG_X86_32
# define CRASH_KERNEL_ADDR_MAX (512 << 20)
#else
-# define CRASH_KERNEL_ADDR_MAX (896 << 20)
+# define CRASH_KERNEL_ADDR_MAX (1700 << 20)
And that seems to work. i.e. I'm currently dumping a system where
crashkernel=1G and it seems to be working.
Am I just living dangerously?
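
A minimal sketch of why crashkernel=1G cannot be satisfied under the 896 MiB
cap but can under the experimental 1700 MiB one. The helper names are
hypothetical, the 16 MiB alignment is an assumption, and the sketch ignores
the real memory map (the kernel has to find memory that is actually free):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB(x) ((uint64_t)(x) << 20)

/* True if a region of `size` bytes can end at or below `limit`. */
bool crash_region_fits(uint64_t limit, uint64_t size)
{
    return size <= limit;
}

/*
 * Highest `align`-aligned base such that base + size <= limit.
 * `align` must be a power of two (16 MiB is an assumed value here),
 * and the caller must already know that the region fits.
 */
uint64_t highest_crash_base(uint64_t limit, uint64_t size, uint64_t align)
{
    assert(size <= limit);
    assert(align != 0 && (align & (align - 1)) == 0);
    return (limit - size) & ~(align - 1);
}
```

With these helpers, crash_region_fits(MIB(896), MIB(1024)) is false, while a
1 GiB region under a 1700 MiB cap can start no higher than 672 MiB.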
-Cliff
--
Cliff Wickman
SGI
cpw@sgi.com
(651) 683-3824
* Re: 896MB address limit
2012-09-25 14:18 ` Cliff Wickman
@ 2012-09-25 15:10 ` Eric W. Biederman
2012-09-25 15:57 ` Maxim Uvarov
2012-09-27 23:07 ` Cliff Wickman
0 siblings, 2 replies; 8+ messages in thread
From: Eric W. Biederman @ 2012-09-25 15:10 UTC (permalink / raw)
To: Cliff Wickman; +Cc: kexec
Cliff Wickman <cpw@sgi.com> writes:
> Hi Eric, and all,
>
> On Mon, Sep 24, 2012 at 08:11:12PM -0700, Eric W. Biederman wrote:
>> Cliff Wickman <cpw@sgi.com> writes:
>>
>> > Gentlemen,
>> >
>> > In dumping very large memories we are running up against the 896MB
>> > limit in SLES11SP2 (3.0.38 kernel).
>>
>> Odd. That limit should be the maximum address in memory to load the
>> crash kernel. That limit should have nothing to do with the dump process
>> itself.
>>
>> Are you saying you need more than 512MiB reserved for the crash kernel
>> to be able to dump all of the memory in your system?
>>
>> Eric
>
> As I noted to Eric privately, yes we need to bump up to crashkernel=1G
> or more for some very large memories.
>
> As an experiment I bumped
> +++ linux/arch/x86/kernel/setup.c
> @@ -528,7 +528,7 @@ static inline unsigned long long get_tot
> #ifdef CONFIG_X86_32
> # define CRASH_KERNEL_ADDR_MAX (512 << 20)
> #else
> -# define CRASH_KERNEL_ADDR_MAX (896 << 20)
> +# define CRASH_KERNEL_ADDR_MAX (1700 << 20)
>
> And that seems to work. i.e. I'm currently dumping a system where
> crashkernel=1G and it seems to be working.
>
> Am I just living dangerously?
So fundamentally this should work. However, there have been a lot of
kinks and silly limitations in the x86 boot protocol.
It used to be that the bootloader protocol variable ramdisk_max was
set to 896M for 32bit kernels, because the ramdisk could not be located
in high memory.
Looking today, it appears that ramdisk_max has been upped to 4G.
I will let you look through the /sbin/kexec source code.
As for testing, I would up the limit to 4G on x86_64 and see how far
you get.
The practical question is whether the system still works with
crashkernel=32M once you have raised the limit much higher.
So I would test with crashkernel=1G@2G and see if that works. If that
works, I figure that in practice all of the bugs are historical and we
can forget them. But a sweep through the /sbin/kexec code for the magic
number 896 might not be out of order.
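
For reference, a minimal sketch of how the two forms used above,
crashkernel=32M and crashkernel=1G@2G, break down into a size and an optional
base address. This is a userspace illustration with hypothetical function
names, not the kernel's own parser, and it handles only the simple
size[@offset] form:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K, M or G suffix (binary multiples). */
uint64_t parse_mem(const char *s, char **end)
{
    uint64_t v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; (*end)++; break;
    default: break;
    }
    return v;
}

/*
 * Parse the value part of "crashkernel=size[@offset]".
 * Returns 0 on success and -1 on malformed input.
 * *base is 0 when no offset is given (placement left to the kernel).
 */
int parse_crashkernel_simple(const char *val, uint64_t *size, uint64_t *base)
{
    char *end;

    assert(val && size && base);
    *size = parse_mem(val, &end);
    if (end == val || *size == 0)
        return -1;
    *base = 0;
    if (*end == '@') {
        const char *p = end + 1;
        *base = parse_mem(p, &end);
        if (end == p)
            return -1;
    }
    return *end == '\0' ? 0 : -1;
}
```

Passing "1G@2G" yields a 1 GiB size at a 2 GiB base; "32M" yields a 32 MiB
size with no fixed base.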
Eric
* Re: 896MB address limit
2012-09-25 15:10 ` Eric W. Biederman
@ 2012-09-25 15:57 ` Maxim Uvarov
2012-09-25 15:58 ` Maxim Uvarov
2012-09-27 23:07 ` Cliff Wickman
1 sibling, 1 reply; 8+ messages in thread
From: Maxim Uvarov @ 2012-09-25 15:57 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: kexec, Cliff Wickman
In most cases you need to boot the crash kernel with the nr_cpus=1
parameter. That way you will not allocate per-CPU buffers for the
other CPUs. Saving the dump on more than one CPU does not give any benefit.
So it is a very rare case where you need such a huge amount of memory just
to save the dump. You might be looking at the problem from the work side.
Maxim.
--
Best regards,
Maxim Uvarov
* Re: 896MB address limit
2012-09-25 15:57 ` Maxim Uvarov
@ 2012-09-25 15:58 ` Maxim Uvarov
0 siblings, 0 replies; 8+ messages in thread
From: Maxim Uvarov @ 2012-09-25 15:58 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: kexec, Cliff Wickman
2012/9/25 Maxim Uvarov <muvarov@gmail.com>:
> In the most cases you need to boot crash kernel with nr_cpus=1
> parameter. In that case you will not allocate per cpus buffers for
> other cpus. Saving dump on more the one cpu does not give any benefit.
> So it's very rare case where you need such huge memory only for save
> dump process. You might be looking to the problem from the work side.
other.
--
Best regards,
Maxim Uvarov
* Re: 896MB address limit
2012-09-25 3:11 ` Eric W. Biederman
2012-09-25 14:18 ` Cliff Wickman
@ 2012-09-25 17:38 ` Vivek Goyal
1 sibling, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2012-09-25 17:38 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: kexec, Cliff Wickman
On Mon, Sep 24, 2012 at 08:11:12PM -0700, Eric W. Biederman wrote:
> Cliff Wickman <cpw@sgi.com> writes:
>
> > Gentlemen,
> >
> > In dumping very large memories we are running up against the 896MB
> > limit in SLES11SP2 (3.0.38 kernel).
>
> Odd. That limit should be the maximum address in memory to load the
> crash kernel. That limit should have nothing to do with the dump process
> itself.
This limit came from the kernel. IIRC, we had a discussion with hpa and
others and came up with the max addresses at which we could load the kernel
for 32bit and 64bit. I wanted it to be exported through the bzImage header,
so that kexec-tools does not have to hard-code it, but I guess it never happened.
>
> Are you saying you need more than 512MiB reserved for the crash kernel
> to be able to dump all of the memory in your system?
Yes, it can take more than 512MB for a large-memory system (I think even
the 512MB case is broken with current upstream). The current dump filtering
utility takes 2 bits of memory per 4K page, so that is 64MB of memory
per terabyte of RAM. With the current initramfs size (this is
distro-specific), that requires us to reserve 192MB for a 1TB system. So
beyond 6TB of RAM, we will cross 512MB of memory.
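
The arithmetic above can be sketched as follows. The 128MB fixed part is
inferred from the 192MB-per-1TB figure (192MB minus the 64MB bitmap) and is
distro-specific; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define TiB (1ULL << 40)

/* Bitmap cost of the dump filter: 2 bits per page of RAM. */
uint64_t bitmap_bytes(uint64_t ram_bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (ram_bytes / page_size) * 2 / 8;
}

/*
 * Rough crashkernel reservation: an assumed fixed 128MB for the kernel
 * and initramfs, plus the bitmap for 4K pages.
 */
uint64_t reserve_estimate(uint64_t ram_bytes)
{
    return 128 * MiB + bitmap_bytes(ram_bytes, 4096);
}
```

Under these assumptions, 1TB of RAM needs 192MB and 6TB needs exactly 512MB,
so anything larger crosses the 512MB mark.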
Having said that, the makedumpfile people are working on making
it work with fixed-size buffers. A basic implementation is available
in version 1.5.0, but it has performance issues. One more set of
patches needs to go in, and after that performance might be acceptable
on large machines.
So hopefully a newer version of makedumpfile will do away with the need
to reserve more than 512MB of memory. Memory is traded off for a slightly
longer dumping time. (I prefer that to memory reservation failing.)
Thanks
Vivek
* Re: 896MB address limit
2012-09-25 15:10 ` Eric W. Biederman
2012-09-25 15:57 ` Maxim Uvarov
@ 2012-09-27 23:07 ` Cliff Wickman
1 sibling, 0 replies; 8+ messages in thread
From: Cliff Wickman @ 2012-09-27 23:07 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: kexec
On Tue, Sep 25, 2012 at 08:10:04AM -0700, Eric W. Biederman wrote:
> Cliff Wickman <cpw@sgi.com> writes:
>
> > Hi Eric, and all,
> >
> > On Mon, Sep 24, 2012 at 08:11:12PM -0700, Eric W. Biederman wrote:
> >> Cliff Wickman <cpw@sgi.com> writes:
> >>
> >> > Gentlemen,
> >> >
> >> > In dumping very large memories we are running up against the 896MB
> >> > limit in SLES11SP2 (3.0.38 kernel).
> >>
> >> Odd. That limit should be the maximum address in memory to load the
> >> crash kernel. That limit should have nothing to do with the dump process
> >> itself.
> >>
> >> Are you saying you need more than 512MiB reserved for the crash kernel
> >> to be able to dump all of the memory in your system?
> >>
> >> Eric
> >
> > As I noted to Eric privately, yes we need to bump up to crashkernel=1G
> > or more for some very large memories.
> >
> > As an experiment I bumped
> > +++ linux/arch/x86/kernel/setup.c
> > @@ -528,7 +528,7 @@ static inline unsigned long long get_tot
> > #ifdef CONFIG_X86_32
> > # define CRASH_KERNEL_ADDR_MAX (512 << 20)
> > #else
> > -# define CRASH_KERNEL_ADDR_MAX (896 << 20)
> > +# define CRASH_KERNEL_ADDR_MAX (1700 << 20)
> >
> > And that seems to work. i.e. I'm currently dumping a system where
> > crashkernel=1G and it seems to be working.
> >
> > Am I just living dangerously?
>
> So fundamentally this should work. However there have been a lot of
> kinks and silly limitations in the x86 boot protocol.
>
> So it used to be that the bootloader protocol variable ramdisk_max was
> set to 896M for 32bit kernels. Because the ramdisk could not be located
> in high memory.
>
> Looking today it appears that ramdisk_max has been upped to 4G.
>
> I will let you look through the /sbin/kexec source code.
>
> As for testing I would up the limit to 4G on x86_64 and see how far
> you get.
>
> The practical question does the system still work with crashkernel=32M
> when you have raised the limit much higher.
>
> So I would test with crashkernel=1G@2G and see if that works. If that
> works I figure that in practice all of the bugs are historical and we
> can forget them. But a sweep through the /sbin/kexec code for the magic
> number 896 might not be out of order.
>
> Eric
I did try setting the limit to 8G. The crashkernel did get loaded there
but it would not execute there.
It works fine on a UV to set the limit to 4G and use a
crashkernel=1280M. We have a hole of almost 2G there.
The memory at 2G is already in use so I can't explicitly place it there.
The kernel patch looks like this:
Index: linux/arch/x86/kernel/setup.c
===================================================================
--- linux.orig/arch/x86/kernel/setup.c
+++ linux/arch/x86/kernel/setup.c
@@ -522,13 +522,12 @@ static inline unsigned long long get_tot
 /*
  * Keep the crash kernel below this limit. On 32 bits earlier kernels
  * would limit the kernel to the low 512 MiB due to mapping restrictions.
- * On 64 bits, kexec-tools currently limits us to 896 MiB; increase this
- * limit once kexec-tools are fixed.
+ * On 64 bits, the boot protocol limits us to 4G.
  */
 #ifdef CONFIG_X86_32
 # define CRASH_KERNEL_ADDR_MAX (512 << 20)
 #else
-# define CRASH_KERNEL_ADDR_MAX (896 << 20)
+# define CRASH_KERNEL_ADDR_MAX (1UL << 32)
 #endif
 
 static void __init reserve_crashkernel(void)
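
A note on the constant: the older limits are plain int expressions such as
(896 << 20), and that style stops working at 4G because 4096 << 20 does not
fit in a 32-bit int, which is presumably why the patch writes 1UL << 32
(unsigned long is 64 bits on x86_64). A minimal sketch with a hypothetical
helper that does the conversion in 64-bit arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Convert MiB to bytes using 64-bit arithmetic, so 4096 MiB is safe. */
uint64_t mib_to_bytes(uint64_t mib)
{
    assert(mib <= (UINT64_MAX >> 20)); /* guard against shifting bits out */
    return mib << 20;
}
```

Here mib_to_bytes(4096) equals 1ULL << 32, the same value as the new
64-bit limit in the patch.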
-Cliff
--
Cliff Wickman
SGI
cpw@sgi.com
(651) 683-3824