* problems in kdump kernel if 'maxcpus=1' not specified?
@ 2008-07-16  1:07 Jay Lan
  2008-07-16 15:12 ` Vivek Goyal
  0 siblings, 1 reply; 15+ messages in thread
From: Jay Lan @ 2008-07-16  1:07 UTC (permalink / raw)
  To: kexec

Are there known problems if you boot the kdump kernel with
multiple CPUs?

It takes an unacceptably long time to run makedumpfile when
saving a dump on a huge-memory system. In my testing it
took 16hr25min to run create_dump_bitmap() on a 1TB system.
PFNs are processed sequentially on a single CPU. We
certainly could use multiple CPUs here ;)

Regards,
 - jay

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16  1:07 problems in kdump kernel if 'maxcpus=1' not specified? Jay Lan
@ 2008-07-16 15:12 ` Vivek Goyal
  2008-07-16 15:25   ` Neil Horman
  2008-07-16 15:26   ` Bernhard Walle
  0 siblings, 2 replies; 15+ messages in thread
From: Vivek Goyal @ 2008-07-16 15:12 UTC (permalink / raw)
  To: Jay Lan; +Cc: Ken'ichi Ohmichi, kexec

On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
> Are there known problems if you boot up kdump kernel with
> multipl cpus?
> 

I ran into one issue: some systems would get reset and
jump to the BIOS.

The reason was that the kdump kernel can boot on a non-boot CPU. When it
tries to bring up the other CPUs it sends INIT, and a non-boot CPU sending
INIT to the "boot" CPU is not acceptable (per the Intel documentation), so
the system re-initialized.

I am not sure how many systems are affected by this behavior. Hence
the reason for using maxcpus=1.

> It takes unacceptably long time to run makedumpfile in
> saving dump at a huge memory system. In my testing it
> took 16hr25min to run create_dump_bitmap() on a 1TB system.
> Pfn's are processed sequentially with single cpu. We
> certainly can use multipl cpus here ;)

That is certainly a very long time. How much memory have you reserved for
the kdump kernel?

I ran some tests on an x86_64 system with 128GB of RAM and it took me 4 minutes
to filter and save the core (maximum filtering level of 31). I had
reserved 128MB of memory for the kdump kernel.

I think something else is seriously wrong here. 1 TB is 8 times
128GB, and even if time scales linearly it should not take more than
40mins.
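
As a rough check of that estimate, the arithmetic can be sketched in C.
The helper names below are made up for illustration; the inputs are the
figures quoted in this thread (128 GB saved in 4 minutes, a 1 TB target,
and 16hr25min = 985 minutes observed). Linear scaling is itself only an
assumption.

```c
#include <assert.h>

/* Expected run time in minutes if dump time scaled linearly with RAM size.
 * base_gib/base_min are the reference measurement. Hypothetical helper. */
static double linear_estimate_min(double base_gib, double base_min,
                                  double target_gib)
{
    assert(base_gib > 0.0);
    return base_min * (target_gib / base_gib);
}

/* How many times slower the observed run is than the linear estimate. */
static double slowdown_factor(double observed_min, double estimate_min)
{
    assert(estimate_min > 0.0);
    return observed_min / estimate_min;
}
```

Calling linear_estimate_min(128.0, 4.0, 1024.0) gives 32 minutes, and
slowdown_factor(985.0, 32.0) gives a factor of roughly 30, which is why
the observed time points at something other than memory size alone.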

You need to dig deeper to find out what is taking so much time.

CCing Ken'ichi.

Thanks
Vivek


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 15:12 ` Vivek Goyal
@ 2008-07-16 15:25   ` Neil Horman
  2008-07-16 16:23     ` Vivek Goyal
  2008-07-16 15:26   ` Bernhard Walle
  1 sibling, 1 reply; 15+ messages in thread
From: Neil Horman @ 2008-07-16 15:25 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Jay Lan, kexec, Ken'ichi Ohmichi

On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
> On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
> > Are there known problems if you boot up kdump kernel with
> > multipl cpus?
> > 
> 
> I had run into one issue and that was some system would get reset and 
> jump to BIOS.
> 
> The reason was that kdump kernel can boot on a non-boot cpu. When it
> tries to bring up other cpus it sends INIT and a non-boot cpu sending
> INIT to "boot" cpu was not acceptable (as per intel documentation) and 
> it re-initialized the system.
> 
> I am not sure how many systems are affected with this behavior. Hence
> the reason for using maxcpus=1.
> 
+1, there are a number of multi-CPU issues with kdump.  I've seen some systems
where you simply can't re-initialize a halted CPU from software, which causes
problems/hangs.

> > It takes unacceptably long time to run makedumpfile in
> > saving dump at a huge memory system. In my testing it
> > took 16hr25min to run create_dump_bitmap() on a 1TB system.
> > Pfn's are processed sequentially with single cpu. We
> > certainly can use multipl cpus here ;)
> 
> This is certainly very long time. How much memory have you reserved for
> kdump kernel?
> 
> I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
> to filter and save the core (maximum filtering level of 31). I had
> reserved 128MB of memory for kdump kernel.
> 
> I think something else is seriously wrong here. 1 TB is almost 10 times of
> 128GM and even if time scales linearly it should not take more than
> 40mins.
> 
> You need to dive deeper to find out what is taking so much of time.
> 
> CCing kenichi.
> 
You know, we might be able to get speedups in makedumpfile without the use of
additional CPUs.  One of the things that concerned me when I read this was the
use of dump targets that need to be sequential, i.e. multiple processes writing
to a local disk make good sense, but not so much if you're dumping over an scp
connection (we don't want to re-order those writes).  From 30,000 feet, the
makedumpfile work cycle goes something like:

1) Inspect a page
2) Decide to filter the page
3) if (2) goto 1
4) else compress page
5) write page to target
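
The five steps above can be sketched as a single loop. This is only an
illustration of the cycle as described here, not makedumpfile's actual
code: PAGE_SZ, should_filter(), compress_page() and dump_pages() are
hypothetical names, the "filter" simply drops all-zero pages, and the
"compression" is a plain copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16  /* toy page size, for illustration only */

/* Hypothetical filter: treat all-zero pages as excludable. */
static int should_filter(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_SZ; i++)
        if (page[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical "compression": a stand-in that just copies the page. */
static size_t compress_page(const unsigned char *in, unsigned char *out)
{
    memcpy(out, in, PAGE_SZ);
    return PAGE_SZ;
}

/* Steps 1-5: inspect, filter, compress, write. Returns pages written. */
static size_t dump_pages(const unsigned char *mem, size_t npages,
                         unsigned char *target, size_t target_cap)
{
    size_t written = 0, off = 0;
    unsigned char buf[PAGE_SZ];

    for (size_t pfn = 0; pfn < npages; pfn++) {
        const unsigned char *page = mem + pfn * PAGE_SZ; /* 1) inspect */
        if (should_filter(page))                         /* 2) and 3) */
            continue;
        size_t n = compress_page(page, buf);             /* 4) compress */
        assert(off + n <= target_cap);
        memcpy(target + off, buf, n);                    /* 5) write */
        off += n;
        written++;
    }
    return written;
}
```

In this shape the write in step 5 is synchronous, so the loop cannot
start on the next page until the previous write has returned.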

I'm sure 4 is going to be the most CPU-intensive task, but I bet we spend a lot
of idle time waiting for I/O to complete (since I'm sure we'll fill up the page cache
quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
target?  That way we could at least free up some CPU cycles to work more quickly
on steps 2, 3, and 4.

Thoughts?

Neil

-- 
/***************************************************
 *Neil Horman
 *Senior Software Engineer
 *Red Hat, Inc.
 *nhorman@redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 15:12 ` Vivek Goyal
  2008-07-16 15:25   ` Neil Horman
@ 2008-07-16 15:26   ` Bernhard Walle
  2008-07-16 16:34     ` Vivek Goyal
  1 sibling, 1 reply; 15+ messages in thread
From: Bernhard Walle @ 2008-07-16 15:26 UTC (permalink / raw)
  To: kexec

* Vivek Goyal [2008-07-16 11:12]:
> 
> You need to dive deeper to find out what is taking so much of time.

I think the main difference is the huge memory gaps that are usual on IA64
systems. makedumpfile processes each PFN sequentially when creating the
1st bitmap.


Bernhard
-- 
Bernhard Walle, SUSE LINUX Products GmbH, Architecture Development


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 15:25   ` Neil Horman
@ 2008-07-16 16:23     ` Vivek Goyal
  2008-07-16 17:03       ` Neil Horman
  2008-07-16 18:00       ` problems in kdump kernel if 'maxcpus=1' not specified? Jay Lan
  0 siblings, 2 replies; 15+ messages in thread
From: Vivek Goyal @ 2008-07-16 16:23 UTC (permalink / raw)
  To: Neil Horman; +Cc: Jay Lan, kexec, Ken'ichi Ohmichi

On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
> On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
> > On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
> > > Are there known problems if you boot up kdump kernel with
> > > multipl cpus?
> > > 
> > 
> > I had run into one issue and that was some system would get reset and 
> > jump to BIOS.
> > 
> > The reason was that kdump kernel can boot on a non-boot cpu. When it
> > tries to bring up other cpus it sends INIT and a non-boot cpu sending
> > INIT to "boot" cpu was not acceptable (as per intel documentation) and 
> > it re-initialized the system.
> > 
> > I am not sure how many systems are affected with this behavior. Hence
> > the reason for using maxcpus=1.
> > 
> +1, there are a number of multi-cpu issues with kdump.  I've seen some systems
> where you simply can't re-inialize a halted cpu from software, which causes
> problems/hangs
> 
> > > It takes unacceptably long time to run makedumpfile in
> > > saving dump at a huge memory system. In my testing it
> > > took 16hr25min to run create_dump_bitmap() on a 1TB system.
> > > Pfn's are processed sequentially with single cpu. We
> > > certainly can use multipl cpus here ;)
> > 
> > This is certainly very long time. How much memory have you reserved for
> > kdump kernel?
> > 
> > I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
> > to filter and save the core (maximum filtering level of 31). I had
> > reserved 128MB of memory for kdump kernel.
> > 
> > I think something else is seriously wrong here. 1 TB is almost 10 times of
> > 128GM and even if time scales linearly it should not take more than
> > 40mins.
> > 
> > You need to dive deeper to find out what is taking so much of time.
> > 
> > CCing kenichi.
> > 
> You know, we might be able to get speedup's in makedumpfile without the use of
> additional cpu's.  One of the things that concerned me when I read this was the
> use of dump targets that need to be sequential.  i.e. multiple processes writing
> to a local disk make good sense, but not so much if you're dumping over an scp
> connection (don't want to re-order those writes).  The makedumpfile work cycle
> goes something from 30000 feet like:
> 
> 1) Inspect a page
> 2) Decide to filter the page
> 3) if (2) goto 1
> 4) else compress page
> 5) write page to target

I thought that it first creates the bitmap. So in the first pass it just
decides which pages are to be dumped or filtered out and marks them
in the bitmap.

Then in the second pass it starts dumping all the pages sequentially along
with metadata, if any.
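
A minimal sketch of that two-pass shape, assuming a byte-array bitmap
with one bit per pfn. The function names and the keep[] input (a
per-page verdict, nonzero meaning "dump this page") are hypothetical and
only stand in for whatever the real filtering decides.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Set bit `pfn` in a byte-array bitmap. */
static void bitmap_set(unsigned char *bm, size_t pfn)
{
    bm[pfn / CHAR_BIT] |= (unsigned char)(1u << (pfn % CHAR_BIT));
}

/* Return 1 if bit `pfn` is set, 0 otherwise. */
static int bitmap_test(const unsigned char *bm, size_t pfn)
{
    return (bm[pfn / CHAR_BIT] >> (pfn % CHAR_BIT)) & 1;
}

/* Pass 1: only decide and mark; nothing is written yet.
 * Returns the number of pages marked for dumping. */
static size_t mark_pages(const int *keep, size_t npages, unsigned char *bm)
{
    size_t marked = 0;
    for (size_t pfn = 0; pfn < npages; pfn++) {
        if (keep[pfn]) {
            bitmap_set(bm, pfn);
            marked++;
        }
    }
    return marked;
}

/* Pass 2: walk the bitmap in pfn order and collect the pfns to dump.
 * A real dumper would read and write each page here instead. */
static size_t collect_marked(const unsigned char *bm, size_t npages,
                             size_t *out)
{
    size_t n = 0;
    for (size_t pfn = 0; pfn < npages; pfn++)
        if (bitmap_test(bm, pfn))
            out[n++] = pfn;
    return n;
}
```

With this split, all filtering decisions are finished before the first
page is written, so the second pass is dominated by I/O.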

> 
> I'm sure 4 is going to be the most cpu intensive task, but I bet we spend a lot
> of idle time waiting for I/O to complete (since I'm sure we'll fill up pagecache
> quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
> target?  That way we could at least free up some cpu cycles to work more quickly
> on steps 2,3, and 4 
> 

If the above assumption is right, then AIO probably will not help, as once we
have marked the pages, we have no job but to wait for completion.

DIO might help a bit because we need not fill the page cache, as we are
not going to need the vmcore pages again.

In Jay's case, it looks like creating the bitmaps itself took a long time.

Vivek

> Thoughts?
> 
> Neil
> 
> -- 
> /***************************************************
>  *Neil Horman
>  *Senior Software Engineer
>  *Red Hat, Inc.
>  *nhorman@redhat.com
>  *gpg keyid: 1024D / 0x92A74FA1
>  *http://pgp.mit.edu
>  ***************************************************/


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 15:26   ` Bernhard Walle
@ 2008-07-16 16:34     ` Vivek Goyal
  2008-07-16 16:45       ` Bernhard Walle
  0 siblings, 1 reply; 15+ messages in thread
From: Vivek Goyal @ 2008-07-16 16:34 UTC (permalink / raw)
  To: Bernhard Walle; +Cc: kexec

On Wed, Jul 16, 2008 at 05:26:01PM +0200, Bernhard Walle wrote:
> * Vivek Goyal [2008-07-16 11:12]:
> > 
> > You need to dive deeper to find out what is taking so much of time.
> 
> I think the main difference are the huge memory gaps usual on IA64
> systems. makedumpfile processes each PFN sequentially when creating the
> 1st bitmap.

By memory gaps do you mean a large amount of memory? If yes, on a 128GB system
it took me 4 min and on a 1TB system (8 times the memory) it takes 16hrs.
That is almost 240 times the duration for 128GB. That sounds horrible.

Perhaps it is some kind of NUMA machine and memory access to remote nodes is
really slow.  I am not sure if that's the case.

Thanks
Vivek


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 16:34     ` Vivek Goyal
@ 2008-07-16 16:45       ` Bernhard Walle
  0 siblings, 0 replies; 15+ messages in thread
From: Bernhard Walle @ 2008-07-16 16:45 UTC (permalink / raw)
  To: kexec

* Vivek Goyal [2008-07-16 12:34]:
>
> On Wed, Jul 16, 2008 at 05:26:01PM +0200, Bernhard Walle wrote:
> > * Vivek Goyal [2008-07-16 11:12]:
> > > 
> > > You need to dive deeper to find out what is taking so much of time.
> > 
> > I think the main difference are the huge memory gaps usual on IA64
> > systems. makedumpfile processes each PFN sequentially when creating the
> > 1st bitmap.
> 
> By memory gaps you mean large amount of memory?

No, I mean that the whole system memory (apart from boot memory) is mapped
beyond 2^32. For example (from /proc/iomem):

6000100000-60003fffff : System RAM
6003000000-6033dfffff : System RAM
6033e00000-6043ffffff : System RAM
6044000000-6045b9efff : System RAM
6045b9f000-60f7ffdfff : System RAM
16000100000-160003fffff : System RAM
16003000000-16033dfffff : System RAM
...



Bernhard
-- 
Bernhard Walle, SUSE LINUX Products GmbH, Architecture Development


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 16:23     ` Vivek Goyal
@ 2008-07-16 17:03       ` Neil Horman
  2008-07-16 19:16         ` Jay Lan
  2008-07-16 18:00       ` problems in kdump kernel if 'maxcpus=1' not specified? Jay Lan
  1 sibling, 1 reply; 15+ messages in thread
From: Neil Horman @ 2008-07-16 17:03 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Jay Lan, Neil Horman, kexec, Ken'ichi Ohmichi

On Wed, Jul 16, 2008 at 12:23:43PM -0400, Vivek Goyal wrote:
> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
> > On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
> > > On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
> > > > Are there known problems if you boot up kdump kernel with
> > > > multipl cpus?
> > > > 
> > > 
> > > I had run into one issue and that was some system would get reset and 
> > > jump to BIOS.
> > > 
> > > The reason was that kdump kernel can boot on a non-boot cpu. When it
> > > tries to bring up other cpus it sends INIT and a non-boot cpu sending
> > > INIT to "boot" cpu was not acceptable (as per intel documentation) and 
> > > it re-initialized the system.
> > > 
> > > I am not sure how many systems are affected with this behavior. Hence
> > > the reason for using maxcpus=1.
> > > 
> > +1, there are a number of multi-cpu issues with kdump.  I've seen some systems
> > where you simply can't re-inialize a halted cpu from software, which causes
> > problems/hangs
> > 
> > > > It takes unacceptably long time to run makedumpfile in
> > > > saving dump at a huge memory system. In my testing it
> > > > took 16hr25min to run create_dump_bitmap() on a 1TB system.
> > > > Pfn's are processed sequentially with single cpu. We
> > > > certainly can use multipl cpus here ;)
> > > 
> > > This is certainly very long time. How much memory have you reserved for
> > > kdump kernel?
> > > 
> > > I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
> > > to filter and save the core (maximum filtering level of 31). I had
> > > reserved 128MB of memory for kdump kernel.
> > > 
> > > I think something else is seriously wrong here. 1 TB is almost 10 times of
> > > 128GM and even if time scales linearly it should not take more than
> > > 40mins.
> > > 
> > > You need to dive deeper to find out what is taking so much of time.
> > > 
> > > CCing kenichi.
> > > 
> > You know, we might be able to get speedup's in makedumpfile without the use of
> > additional cpu's.  One of the things that concerned me when I read this was the
> > use of dump targets that need to be sequential.  i.e. multiple processes writing
> > to a local disk make good sense, but not so much if you're dumping over an scp
> > connection (don't want to re-order those writes).  The makedumpfile work cycle
> > goes something from 30000 feet like:
> > 
> > 1) Inspect a page
> > 2) Decide to filter the page
> > 3) if (2) goto 1
> > 4) else compress page
> > 5) write page to target
> 
> I thought that it first creates the bitmap. So in first pass it just
> decides which are the pages to be dumped or filtered out and marks these
> in bitmap.
> 
> Then in second pass it starts dumping all the pages sequentially along
> with metadata, if any..
> 
It might, but I don't think that's overly relevant, as I expect the major CPU
usage comes during compression and the major wall clock time loss
occurs during I/O.

> > 
> > I'm sure 4 is going to be the most cpu intensive task, but I bet we spend a lot
> > of idle time waiting for I/O to complete (since I'm sure we'll fill up pagecache
> > quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
> > target?  That way we could at least free up some cpu cycles to work more quickly
> > on steps 2,3, and 4 
> > 
> 
> If above assumption if right, then probably AIO might not help as once we
> marked the pages, we have no job but to wait for completion.
> 
I assume that we interleave page compression with I/O (i.e. compress a page from
the bitmap, write the page to disk, repeat).  If that's the case, then AIO would
help because the kernel (or another thread) can wait on I/O completion while we
continue and compress another page.

It will also help if a single context is unable to fill the I/O pipeline.  IIRC
multiple AIO requests can be in flight at the same time, maximizing I/O
bandwidth.  And we can decide at the application level whether our dump target
will allow parallel I/O.

> DIO might help a bit because we need not to fill page cache as we are 
> not going to need vmcore pages again.
> 
We currently do something similar to this in RHEL.  The kdump initrd reduces
dirty_ratio to almost zero, effectively creating a DIO environment.  Numbers
from there would give us an idea of how that performs.

> In case of jay, it looks creating bitmaps itself took a long time. 
> 
Do you have data for this?  I've not seen it.
Neil

> Vivek
> 
> > Thoughts?
> > 
> > Neil
> > 
> > -- 
> > /***************************************************
> >  *Neil Horman
> >  *Senior Software Engineer
> >  *Red Hat, Inc.
> >  *nhorman@redhat.com
> >  *gpg keyid: 1024D / 0x92A74FA1
> >  *http://pgp.mit.edu
> >  ***************************************************/

-- 
/***************************************************
 *Neil Horman
 *Senior Software Engineer
 *Red Hat, Inc.
 *nhorman@redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/


* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 16:23     ` Vivek Goyal
  2008-07-16 17:03       ` Neil Horman
@ 2008-07-16 18:00       ` Jay Lan
  1 sibling, 0 replies; 15+ messages in thread
From: Jay Lan @ 2008-07-16 18:00 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Ken'ichi Ohmichi, Neil Horman, kexec

Vivek Goyal wrote:
> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
>> On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
>>> On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
>>>> Are there known problems if you boot up kdump kernel with
>>>> multipl cpus?
>>>>
>>> I had run into one issue and that was some system would get reset and 
>>> jump to BIOS.
>>>
>>> The reason was that kdump kernel can boot on a non-boot cpu. When it
>>> tries to bring up other cpus it sends INIT and a non-boot cpu sending
>>> INIT to "boot" cpu was not acceptable (as per intel documentation) and 
>>> it re-initialized the system.
>>>
>>> I am not sure how many systems are affected with this behavior. Hence
>>> the reason for using maxcpus=1.
>>>
>> +1, there are a number of multi-cpu issues with kdump.  I've seen some systems
>> where you simply can't re-inialize a halted cpu from software, which causes
>> problems/hangs
>>
>>>> It takes unacceptably long time to run makedumpfile in
>>>> saving dump at a huge memory system. In my testing it
>>>> took 16hr25min to run create_dump_bitmap() on a 1TB system.
>>>> Pfn's are processed sequentially with single cpu. We
>>>> certainly can use multipl cpus here ;)
>>> This is certainly very long time. How much memory have you reserved for
>>> kdump kernel?
>>>
>>> I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
>>> to filter and save the core (maximum filtering level of 31). I had
>>> reserved 128MB of memory for kdump kernel.
>>>
>>> I think something else is seriously wrong here. 1 TB is almost 10 times of
>>> 128GM and even if time scales linearly it should not take more than
>>> 40mins.
>>>
>>> You need to dive deeper to find out what is taking so much of time.
>>>
>>> CCing kenichi.
>>>
>> You know, we might be able to get speedup's in makedumpfile without the use of
>> additional cpu's.  One of the things that concerned me when I read this was the
>> use of dump targets that need to be sequential.  i.e. multiple processes writing
>> to a local disk make good sense, but not so much if you're dumping over an scp
>> connection (don't want to re-order those writes).  The makedumpfile work cycle
>> goes something from 30000 feet like:
>>
>> 1) Inspect a page
>> 2) Decide to filter the page
>> 3) if (2) goto 1
>> 4) else compress page
>> 5) write page to target
> 
> I thought that it first creates the bitmap. So in first pass it just
> decides which are the pages to be dumped or filtered out and marks these
> in bitmap.
> 
> Then in second pass it starts dumping all the pages sequentially along
> with metadata, if any..
> 
>> I'm sure 4 is going to be the most cpu intensive task, but I bet we spend a lot
>> of idle time waiting for I/O to complete (since I'm sure we'll fill up pagecache
>> quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
>> target?  That way we could at least free up some cpu cycles to work more quickly
>> on steps 2,3, and 4 
>>
> 
> If above assumption if right, then probably AIO might not help as once we
> marked the pages, we have no job but to wait for completion.
> 
> DIO might help a bit because we need not to fill page cache as we are 
> not going to need vmcore pages again.
> 
> In case of jay, it looks creating bitmaps itself took a long time. 

Yep. Most of the time was spent creating the bitmaps.

I was running a version of makedumpfile 1.2.6, and the time consumed
broke down as follows:
   create_dump_bitmap            16hr 25min
   excluding unnecessary pages   28min
   write_kdump_pages              2min
   Copying data                  19min

I reserved 3960M memory for the kdump kernel.

Regards,
 - jay


> 
> Vivek
> 
>> Thoughts?
>>
>> Neil
>>
>> -- 
>> /***************************************************
>>  *Neil Horman
>>  *Senior Software Engineer
>>  *Red Hat, Inc.
>>  *nhorman@redhat.com
>>  *gpg keyid: 1024D / 0x92A74FA1
>>  *http://pgp.mit.edu
>>  ***************************************************/



* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 17:03       ` Neil Horman
@ 2008-07-16 19:16         ` Jay Lan
  2008-07-17  4:59           ` Ken'ichi Ohmichi
  0 siblings, 1 reply; 15+ messages in thread
From: Jay Lan @ 2008-07-16 19:16 UTC (permalink / raw)
  To: Neil Horman; +Cc: Ken'ichi Ohmichi, kexec, Vivek Goyal

Neil Horman wrote:
> On Wed, Jul 16, 2008 at 12:23:43PM -0400, Vivek Goyal wrote:
>> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
>>> On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
>>>> On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
>>>>> Are there known problems if you boot up kdump kernel with
>>>>> multipl cpus?
>>>>>
>>>> I had run into one issue and that was some system would get reset and 
>>>> jump to BIOS.
>>>>
>>>> The reason was that kdump kernel can boot on a non-boot cpu. When it
>>>> tries to bring up other cpus it sends INIT and a non-boot cpu sending
>>>> INIT to "boot" cpu was not acceptable (as per intel documentation) and 
>>>> it re-initialized the system.
>>>>
>>>> I am not sure how many systems are affected with this behavior. Hence
>>>> the reason for using maxcpus=1.
>>>>
>>> +1, there are a number of multi-cpu issues with kdump.  I've seen some systems
>>> where you simply can't re-inialize a halted cpu from software, which causes
>>> problems/hangs
>>>
>>>>> It takes unacceptably long time to run makedumpfile in
>>>>> saving dump at a huge memory system. In my testing it
>>>>> took 16hr25min to run create_dump_bitmap() on a 1TB system.
>>>>> Pfn's are processed sequentially with single cpu. We
>>>>> certainly can use multipl cpus here ;)
>>>> This is certainly very long time. How much memory have you reserved for
>>>> kdump kernel?
>>>>
>>>> I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
>>>> to filter and save the core (maximum filtering level of 31). I had
>>>> reserved 128MB of memory for kdump kernel.
>>>>
>>>> I think something else is seriously wrong here. 1 TB is almost 10 times of
>>>> 128GM and even if time scales linearly it should not take more than
>>>> 40mins.
>>>>
>>>> You need to dive deeper to find out what is taking so much of time.
>>>>
>>>> CCing kenichi.
>>>>
>>> You know, we might be able to get speedup's in makedumpfile without the use of
>>> additional cpu's.  One of the things that concerned me when I read this was the
>>> use of dump targets that need to be sequential.  i.e. multiple processes writing
>>> to a local disk make good sense, but not so much if you're dumping over an scp
>>> connection (don't want to re-order those writes).  The makedumpfile work cycle
>>> goes something from 30000 feet like:
>>>
>>> 1) Inspect a page
>>> 2) Decide to filter the page
>>> 3) if (2) goto 1
>>> 4) else compress page
>>> 5) write page to target
>> I thought that it first creates the bitmap. So in first pass it just
>> decides which are the pages to be dumped or filtered out and marks these
>> in bitmap.
>>
>> Then in second pass it starts dumping all the pages sequentially along
>> with metadata, if any..
>>
> It might, but I don't think thats overly relevant, as I expect the major cpu
> usage point comes in during compression and the major wall clock time loss
> occurs during I/O
> 
>>> I'm sure 4 is going to be the most cpu intensive task, but I bet we spend a lot
>>> of idle time waiting for I/O to complete (since I'm sure we'll fill up pagecache
>>> quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
>>> target?  That way we could at least free up some cpu cycles to work more quickly
>>> on steps 2,3, and 4 
>>>
>> If above assumption if right, then probably AIO might not help as once we
>> marked the pages, we have no job but to wait for completion.
>>
> I assume that we interleave page compression with I/O (i.e. compress a page from
> the bitmap, write the page to disk, repeat).  If thats the case, then AIO would
> help because the kernel (or another thread) can wait on i/o completion while we
> continue and compress another page
> 
> It will also help if a single context is unable to fill the I/O pipeline.  IIRC
> multiple aio requests can be in flight at the same time, maximizing I/O
> bandwidth.  And we can decide at the application level if our dump target will
> allow parallel I/O
> 
>> DIO might help a bit because we need not to fill page cache as we are 
>> not going to need vmcore pages again.
>>
> We currently do something simmilar to this in RHEL.  The kdump initrd reduces
> dirty_ratio to almost zero, effectively creating a DIO environment.  Numbers
> from there would give us an idea of how that performs

Upon completion of saving the dump, about 2G of memory was in cache in my
case.

> 
>> In case of jay, it looks creating bitmaps itself took a long time. 
>>
> Do you have data for this?  I've not seen it.

I just posted detailed data. My initial post gave the amount of time
spent in create_dump_bitmap().

The PFN processing rate inside create_dump_bitmap() is about
   184500 pfn/sec  on memory ranges that do not contain data that needs
                   to be saved.
   213700 pfn/sec  on memory ranges that contain data to be saved.

Here is part of the memory map from /proc/iomem:
16003000000-16033dfffff : System RAM
16033e00000-160f7ffffff : System RAM
16800000000-168f7ffffff : System RAM

We do not spend time scanning PFNs between 160f8000000 and
16800000000, do we? I did not try to track it down.
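
To put a number on that hole, here is a small sketch of the arithmetic.
The helper names are hypothetical, and the 16 KiB page size is an
assumption (the thread does not state the page size of this system); the
addresses and the 184500 pfn/sec rate are the ones quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page frames in the physical range [start, end).
 * page_size is an assumption supplied by the caller. */
static uint64_t pfns_in_range(uint64_t start, uint64_t end,
                              uint64_t page_size)
{
    assert(page_size > 0 && end >= start);
    return (end - start) / page_size;
}

/* Seconds needed to visit npfn frames at a given pfn/sec rate. */
static double scan_seconds(uint64_t npfn, double pfn_per_sec)
{
    assert(pfn_per_sec > 0.0);
    return (double)npfn / pfn_per_sec;
}
```

Under that assumption, pfns_in_range(0x160f8000000, 0x16800000000, 16384)
is 1843200 frames, and scan_seconds(1843200, 184500.0) is just under 10
seconds for this one hole. If every pfn up to the highest address is
visited, the much larger holes between nodes would add up in the same
way.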

 - jay



> Neil
> 
>> Vivek
>>
>>> Thoughts?
>>>
>>> Neil
>>>
>>> -- 
>>> /***************************************************
>>>  *Neil Horman
>>>  *Senior Software Engineer
>>>  *Red Hat, Inc.
>>>  *nhorman@redhat.com
>>>  *gpg keyid: 1024D / 0x92A74FA1
>>>  *http://pgp.mit.edu
>>>  ***************************************************/
> 



* Re: problems in kdump kernel if 'maxcpus=1' not specified?
  2008-07-16 19:16         ` Jay Lan
@ 2008-07-17  4:59           ` Ken'ichi Ohmichi
  2008-07-17  6:35             ` [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?) Ken'ichi Ohmichi
  0 siblings, 1 reply; 15+ messages in thread
From: Ken'ichi Ohmichi @ 2008-07-17  4:59 UTC (permalink / raw)
  To: Jay Lan; +Cc: Neil Horman, kexec, Vivek Goyal



Hi Jay,

Jay Lan wrote:
>>> In case of jay, it looks creating bitmaps itself took a long time. 
>>>
>> Do you have data for this?  I've not seen it.
> 
> I just posted detailed data. My initial post gave the amount of time
> spent in create_dump_bitmap().
> 
> The processing rate of pfn inside create_dump_bitmap() is about
>    184500-pfn/sec  on memory map that does not contain data needs to
>                    be saved.
>    213700-pfn/sec  on memory map that contain data to be saved.
> 
> Here is some memory mappend from /proc/iomem:
> 16003000000-16033dfffff : System RAM
> 16033e00000-160f7ffffff : System RAM
> 16800000000-168f7ffffff : System RAM
> 
> We do not spent time in scanning pfn between 160f8000000 and
> 16800000000.

In Bernhard's mail:
> 6045b9f000-60f7ffdfff : System RAM
> 16000100000-160003fffff : System RAM
> 16003000000-16033dfffff : System RAM

I guess that makedumpfile spends time scanning the memory gap
between 0x60f7ffdfff and 0x16000100000 when creating the 1st-bitmap.

I created the attached patch so that makedumpfile does not scan
memory gaps when creating the 1st-bitmap. Could you please try it?
This patch is for makedumpfile-1.2.6.
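
The idea can be shown in isolation with a toy comparison. This is a
sketch under stated assumptions, not the makedumpfile code: struct seg
is a hypothetical stand-in for a PT_LOAD segment, its range is taken as
half-open [phys_start, phys_end), and both functions return how many
loop iterations they needed so the cost difference is visible.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a PT_LOAD segment: [phys_start, phys_end). */
struct seg {
    uint64_t phys_start;
    uint64_t phys_end;
};

static void mark_pfn(unsigned char *bitmap, uint64_t pfn)
{
    bitmap[pfn / CHAR_BIT] |= (unsigned char)(1u << (pfn % CHAR_BIT));
}

static int in_segs(const struct seg *segs, int nsegs, uint64_t paddr)
{
    for (int i = 0; i < nsegs; i++)
        if (paddr >= segs[i].phys_start && paddr < segs[i].phys_end)
            return 1;
    return 0;
}

/* Visit every pfn below max_pfn, holes included. Returns iterations. */
static uint64_t fill_by_full_scan(unsigned char *bitmap,
                                  const struct seg *segs, int nsegs,
                                  uint64_t max_pfn, uint64_t page_size)
{
    uint64_t visited = 0;
    for (uint64_t pfn = 0; pfn < max_pfn; pfn++, visited++)
        if (in_segs(segs, nsegs, pfn * page_size))
            mark_pfn(bitmap, pfn);
    return visited;
}

/* Visit only the pfns that lie inside a segment. Returns iterations. */
static uint64_t fill_by_segments(unsigned char *bitmap,
                                 const struct seg *segs, int nsegs,
                                 uint64_t page_size)
{
    uint64_t visited = 0;
    assert(page_size > 0);
    for (int i = 0; i < nsegs; i++) {
        uint64_t pfn_start = segs[i].phys_start / page_size;
        uint64_t pfn_end = segs[i].phys_end / page_size;
        for (uint64_t pfn = pfn_start; pfn < pfn_end; pfn++, visited++)
            mark_pfn(bitmap, pfn);
    }
    return visited;
}
```

Both functions produce the same bitmap; the second one does work
proportional to the memory that exists rather than to the highest
physical address.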


Thanks
Ken'ichi Ohmichi


diff -puN backup/v1.2.6/makedumpfile.c makedumpfile/makedumpfile.c
--- backup/v1.2.6/makedumpfile.c	2008-06-05 15:17:17.000000000 +0900
+++ makedumpfile/makedumpfile.c	2008-07-17 22:33:58.000000000 +0900
@@ -3987,8 +3987,10 @@ exclude_free_page()
 int
 create_1st_bitmap()
 {
+	int i;
 	char *buf = NULL;
-	unsigned long long pfn, paddr;
+	unsigned long long pfn, pfn_start, pfn_end, pfn_bitmap1;
+	struct pt_load_segment *pls;
 	off_t offset_page;
 	int ret = FALSE;
 
@@ -4021,13 +4023,17 @@ create_1st_bitmap()
 	/*
 	 * If page is on memory hole, set bit on the 1st-bitmap.
 	 */
-	for (pfn = 0, paddr = 0; pfn < info->max_mapnr;
-	    pfn++, paddr += info->page_size) {
-		if (is_in_segs(paddr))
+	for (i = pfn_bitmap1 = 0; i < info->num_load_memory; i++) {
+		pls = &info->pt_load_segments[i];
+		pfn_start = pls->phys_start / info->page_size;
+		pfn_end = pls->phys_end / info->page_size;
+		for (pfn = pfn_start; pfn < pfn_end; pfn++) {
 			set_bit_on_1st_bitmap(pfn);
-		else
-			pfn_memhole++;
+			pfn_bitmap1++;
+		}
 	}
+	pfn_memhole = info->max_mapnr - pfn_bitmap1;
+
 	if (!sync_1st_bitmap())
 		goto out;
 




* [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?)
  2008-07-17  4:59           ` Ken'ichi Ohmichi
@ 2008-07-17  6:35             ` Ken'ichi Ohmichi
  2008-07-17 17:09               ` Jay Lan
  0 siblings, 1 reply; 15+ messages in thread
From: Ken'ichi Ohmichi @ 2008-07-17  6:35 UTC (permalink / raw)
  To: Jay Lan; +Cc: Neil Horman, kexec, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 1548 bytes --]


Hi Jay,

Ken'ichi Ohmichi wrote:
> I created the attached patch so that makedumpfile does not scan
> memory gaps when creating the 1st-bitmap. Could you please try it?
> This patch is for makedumpfile-1.2.6.

I found a bug in the patch I sent earlier and fixed it in the
attached patch. Could you please try this one instead?
Sorry for my mistake.
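
This message does not say what the bug was. Comparing the two diffs, the
visible changes are that pfn_start and pfn_end are now computed with
PAGESHIFT() instead of dividing by info->page_size, and that pfn_start
is advanced by one when is_in_segs() rejects the address of the
truncated pfn. One possible reading, offered as an assumption rather
than the author's explanation, is that this handles a segment whose
start is not page-aligned. A simplified sketch of that rounding, using
the hypothetical helper first_pfn_in_segment() and a page shift passed
in as a parameter:

```c
/*
 * Hypothetical helper, not part of makedumpfile: return the first pfn
 * whose base address is at or after phys_start. Truncating with a
 * shift gives the pfn that contains phys_start; if that pfn begins
 * before the segment does, skip to the next one.
 */
unsigned long long
first_pfn_in_segment(unsigned long long phys_start, unsigned int page_shift)
{
	unsigned long long pfn = phys_start >> page_shift;

	if ((pfn << page_shift) < phys_start)
		pfn++;	/* start is mid-page: round up */
	return pfn;
}
```

With a 4 KiB page (shift 12) the start address 0x6045b9f000 from
Bernhard's mail is already aligned; with a 16 KiB page (shift 14) it is
not, and the first pfn is rounded up. The page size of the systems in
this thread is not stated, so both values are only examples. Note that
the patch itself tests is_in_segs() across all segments, which this
sketch does not model.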


Thanks
Ken'ichi Ohmichi

diff -puN backup/v1.2.6/makedumpfile.c makedumpfile/makedumpfile.c
--- backup/v1.2.6/makedumpfile.c	2008-06-05 15:17:17.000000000 +0900
+++ makedumpfile/makedumpfile.c	2008-07-18 00:14:34.000000000 +0900
@@ -3987,8 +3987,10 @@ exclude_free_page()
 int
 create_1st_bitmap()
 {
+	int i;
 	char *buf = NULL;
-	unsigned long long pfn, paddr;
+	unsigned long long pfn, pfn_start, pfn_end, pfn_bitmap1;
+	struct pt_load_segment *pls;
 	off_t offset_page;
 	int ret = FALSE;
 
@@ -4021,13 +4023,19 @@ create_1st_bitmap()
 	/*
 	 * If page is on memory hole, set bit on the 1st-bitmap.
 	 */
-	for (pfn = 0, paddr = 0; pfn < info->max_mapnr;
-	    pfn++, paddr += info->page_size) {
-		if (is_in_segs(paddr))
+	for (i = pfn_bitmap1 = 0; i < info->num_load_memory; i++) {
+		pls = &info->pt_load_segments[i];
+		pfn_start = pls->phys_start >> PAGESHIFT();
+		pfn_end   = pls->phys_end >> PAGESHIFT();
+		if (!is_in_segs(pfn_start << PAGESHIFT()))
+			pfn_start++;
+		for (pfn = pfn_start; pfn < pfn_end; pfn++) {
 			set_bit_on_1st_bitmap(pfn);
-		else
-			pfn_memhole++;
+			pfn_bitmap1++;
+		}
 	}
+	pfn_memhole = info->max_mapnr - pfn_bitmap1;
+
 	if (!sync_1st_bitmap())
 		goto out;
 




* Re: [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?)
  2008-07-17  6:35             ` [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?) Ken'ichi Ohmichi
@ 2008-07-17 17:09               ` Jay Lan
  2008-07-18  4:00                 ` Ken'ichi Ohmichi
  0 siblings, 1 reply; 15+ messages in thread
From: Jay Lan @ 2008-07-17 17:09 UTC (permalink / raw)
  To: Ken'ichi Ohmichi; +Cc: Neil Horman, kexec, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 1815 bytes --]

Ken'ichi Ohmichi wrote:
> Hi Jay,
> 
> Ken'ichi Ohmichi wrote:
>> I created the attached patch so that makedumpfile does not scan
>> memory gaps when creating the 1st-bitmap. Could you please try it?
>> This patch is for makedumpfile-1.2.6.
> 
> I found a bug in the patch I sent earlier and fixed it in the
> attached patch. Could you please try this one instead?
> Sorry for my mistake.

Hi Ken'ichi San,

Thanks for your patch. I have to compete for time on the test machine.
I will post new data once I have some.

Regards,
 - jay






* Re: [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?)
  2008-07-17 17:09               ` Jay Lan
@ 2008-07-18  4:00                 ` Ken'ichi Ohmichi
  2008-07-18 15:57                   ` Jay Lan
  0 siblings, 1 reply; 15+ messages in thread
From: Ken'ichi Ohmichi @ 2008-07-18  4:00 UTC (permalink / raw)
  To: Jay Lan; +Cc: Neil Horman, kexec, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 2281 bytes --]


Hi,

Jay Lan wrote:
> Ken'ichi Ohmichi wrote:
>> Hi Jay,
>>
>> Ken'ichi Ohmichi wrote:
>>> I created the attached patch so that makedumpfile does not scan
>>> memory gaps when creating the 1st-bitmap. Could you please try it?
>>> This patch is for makedumpfile-1.2.6.
>> I found a bug in the patch I sent earlier and fixed it in the
>> attached patch. Could you please try this one instead?
>> Sorry for my mistake.
> 
> Hi Ken'ichi San,
> 
> Thanks for your patch. I have to compete for time on the test machine.
> I will post new data once I have some.

I think the new makedumpfile (version 1.2.7) will be more useful for
you, because it includes not only this patch but also the other patch
that shortens the time for creating the bitmap.


Thanks
Ken'ichi Ohmichi








* Re: [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?)
  2008-07-18  4:00                 ` Ken'ichi Ohmichi
@ 2008-07-18 15:57                   ` Jay Lan
  0 siblings, 0 replies; 15+ messages in thread
From: Jay Lan @ 2008-07-18 15:57 UTC (permalink / raw)
  To: Ken'ichi Ohmichi; +Cc: Neil Horman, kexec, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 946 bytes --]

Ken'ichi Ohmichi wrote:
> Hi,
> 
> Jay Lan wrote:
>> Ken'ichi Ohmichi wrote:
>>> Hi Jay,
>>>
>>> Ken'ichi Ohmichi wrote:
>>>> I created the attached patch so that makedumpfile does not scan
>>>> memory gaps when creating the 1st-bitmap. Could you please try it?
>>>> This patch is for makedumpfile-1.2.6.
>>> I found a bug in the patch I sent earlier and fixed it in the
>>> attached patch. Could you please try this one instead?
>>> Sorry for my mistake.
>> Hi Ken'ichi San,
>>
>> Thanks for your patch. I have to compete for time on the test machine.
>> I will post new data once I have some.
> 
> I think the new makedumpfile (version 1.2.7) will be more useful for
> you, because it includes not only this patch but also the other patch
> that shortens the time for creating the bitmap.

I certainly will try it, Ken'ichi. Thanks!

I am eager to see the new data; unfortunately the test machine is
booked up until 7/29. :(

Regards,
 - jay




