From: Jay Lan <jlan@sgi.com>
To: Neil Horman <nhorman@redhat.com>
Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>,
	kexec@lists.infradead.org, Vivek Goyal <vgoyal@redhat.com>
Subject: Re: problems in kdump kernel if 'maxcpus=1' not specified?
Date: Wed, 16 Jul 2008 12:16:50 -0700
Message-ID: <487E4922.9050209@sgi.com>
In-Reply-To: <20080716170338.GA11654@hmsendeavour.rdu.redhat.com>

Neil Horman wrote:
> On Wed, Jul 16, 2008 at 12:23:43PM -0400, Vivek Goyal wrote:
>> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
>>> On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
>>>> On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
>>>>> Are there known problems if you boot up kdump kernel with
>>>>> multiple cpus?
>>>>>
>>>> I had run into one issue and that was some system would get reset and 
>>>> jump to BIOS.
>>>>
>>>> The reason was that kdump kernel can boot on a non-boot cpu. When it
>>>> tries to bring up other cpus it sends INIT and a non-boot cpu sending
>>>> INIT to "boot" cpu was not acceptable (as per intel documentation) and 
>>>> it re-initialized the system.
>>>>
>>>> I am not sure how many systems are affected with this behavior. Hence
>>>> the reason for using maxcpus=1.
>>>>
>>> +1, there are a number of multi-cpu issues with kdump.  I've seen some systems
>>> where you simply can't re-initialize a halted cpu from software, which causes
>>> problems/hangs
>>>
>>>>> It takes an unacceptably long time to run makedumpfile when
>>>>> saving a dump on a huge-memory system. In my testing it
>>>>> took 16hr25min to run create_dump_bitmap() on a 1TB system.
>>>>> Pfns are processed sequentially on a single cpu. We
>>>>> certainly can use multiple cpus here ;)
>>>> This is certainly very long time. How much memory have you reserved for
>>>> kdump kernel?
>>>>
>>>> I had run some tests on a x86_64 128GB RAM system and it took me 4 minutes
>>>> to filter and save the core (maximum filtering level of 31). I had
>>>> reserved 128MB of memory for kdump kernel.
>>>>
>>>> I think something else is seriously wrong here. 1 TB is almost 10 times
>>>> 128GB and even if time scales linearly it should not take more than
>>>> 40mins.
>>>>
>>>> You need to dive deeper to find out what is taking so much of time.
>>>>
>>>> CCing kenichi.
>>>>
>>> You know, we might be able to get speedups in makedumpfile without the use of
>>> additional cpus.  One of the things that concerned me when I read this was the
>>> use of dump targets that need to be sequential.  i.e. multiple processes writing
>>> to a local disk make good sense, but not so much if you're dumping over an scp
>>> connection (don't want to re-order those writes).  From 30000 feet, the
>>> makedumpfile work cycle goes something like:
>>>
>>> 1) Inspect a page
>>> 2) Decide to filter the page
>>> 3) if (2) goto 1
>>> 4) else compress page
>>> 5) write page to target
>> I thought that it first creates the bitmap. So in first pass it just
>> decides which are the pages to be dumped or filtered out and marks these
>> in bitmap.
>>
>> Then in second pass it starts dumping all the pages sequentially along
>> with metadata, if any..
>>
> It might, but I don't think that's overly relevant, as I expect the major cpu
> usage point comes in during compression and the major wall clock time loss
> occurs during I/O
> 
>>> I'm sure 4 is going to be the most cpu intensive task, but I bet we spend a lot
>>> of idle time waiting for I/O to complete (since I'm sure we'll fill up pagecache
>>> quickly).  What if makedumpfile used AIO to write out prepared pages to the dump
>>> target?  That way we could at least free up some cpu cycles to work more quickly
>>> on steps 2,3, and 4 
>>>
>> If the above assumption is right, then AIO probably will not help, as once we
>> have marked the pages, we have no job but to wait for completion.
>>
> I assume that we interleave page compression with I/O (i.e. compress a page from
> the bitmap, write the page to disk, repeat).  If that's the case, then AIO would
> help because the kernel (or another thread) can wait on i/o completion while we
> continue and compress another page
> 
> It will also help if a single context is unable to fill the I/O pipeline.  IIRC
> multiple aio requests can be in flight at the same time, maximizing I/O
> bandwidth.  And we can decide at the application level if our dump target will
> allow parallel I/O
> 
>> DIO might help a bit because we need not fill the page cache, as we are
>> not going to need vmcore pages again.
>>
> We currently do something similar to this in RHEL.  The kdump initrd reduces
> dirty_ratio to almost zero, effectively creating a DIO environment.  Numbers
> from there would give us an idea of how that performs

When the dump finished saving, about 2G of memory was sitting in cache
in my case.

> 
>> In Jay's case, it looks like creating the bitmaps itself took a long time.
>>
> Do you have data for this?  I've not seen it.

I just posted detailed data. My initial post gave the amount of time
spent in create_dump_bitmap().

The pfn processing rate inside create_dump_bitmap() is about
   184500-pfn/sec  on a memory map that does not contain data that
                   needs to be saved.
   213700-pfn/sec  on a memory map that contains data to be saved.
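
A rough sketch of what those rates imply over the 16hr25min reported for
create_dump_bitmap(), assuming the rates held for the whole run. The page
sizes tried below are assumptions, since the page size of the test system is
not stated in this thread, and the function names are only for illustration.

```python
# Back-of-the-envelope: how many pfns can be processed in 16hr25min at the
# measured rates, versus how many pfns 1TB of RAM holds.
ELAPSED_SEC = 16 * 3600 + 25 * 60           # 16hr25min, from the initial post
RATES = {                                   # pfn/sec, from the figures above
    "no data to save": 184_500,
    "data to save": 213_700,
}
TIB = 1 << 40                               # 1TB system, taken as 2**40 bytes
ASSUMED_PAGE_SIZES = (4096, 16384, 65536)   # ASSUMPTION: not given in the thread


def pfns_scanned(rate, seconds=ELAPSED_SEC):
    """Number of pfns processed at `rate` pfn/sec over `seconds`."""
    return rate * seconds


def pfns_in_ram(ram_bytes, page_size):
    """Number of page frames needed to cover `ram_bytes` of memory."""
    return ram_bytes // page_size


if __name__ == "__main__":
    for label, rate in RATES.items():
        print(f"{label:>16}: {pfns_scanned(rate):,} pfns in {ELAPSED_SEC} sec")
    slowest = min(RATES.values())
    for page_size in ASSUMED_PAGE_SIZES:
        in_ram = pfns_in_ram(TIB, page_size)
        ratio = pfns_scanned(slowest) / in_ram
        print(f"page size {page_size:>6}: {in_ram:,} pfns in 1TB, "
              f"scanned/ram ratio at slowest rate = {ratio:.1f}")
```

Under any of the assumed page sizes, the number of pfns processed at these
rates is far larger than the number of pfns in 1TB of RAM, which would be
consistent with time being spent on pfns that are not backed by System RAM.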

Here are some memory mappings from /proc/iomem:
16003000000-16033dfffff : System RAM
16033e00000-160f7ffffff : System RAM
16800000000-168f7ffffff : System RAM

We do not spend time scanning the pfns between 160f8000000 and
16800000000, do we? I did not try to track it down.
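
To put a size on that hole, here is a minimal sketch that parses the three
/proc/iomem lines quoted above and reports the gaps between consecutive
System RAM ranges. It only does arithmetic on those lines; it says nothing
about what makedumpfile actually does with the hole, and the helper names
are hypothetical.

```python
# The three System RAM ranges quoted above, in /proc/iomem format.
IOMEM = """\
16003000000-16033dfffff : System RAM
16033e00000-160f7ffffff : System RAM
16800000000-168f7ffffff : System RAM
"""


def parse_ranges(text):
    """Return sorted (start, end, name) tuples; `end` is inclusive."""
    ranges = []
    for line in text.splitlines():
        if not line.strip():
            continue
        span, _, name = line.partition(" : ")
        start, _, end = span.strip().partition("-")
        ranges.append((int(start, 16), int(end, 16), name.strip()))
    return sorted(ranges)


def holes(ranges):
    """Return (hole_start, hole_size_in_bytes) for each gap between ranges."""
    found = []
    for (_, prev_end, _), (next_start, _, _) in zip(ranges, ranges[1:]):
        if next_start > prev_end + 1:
            found.append((prev_end + 1, next_start - (prev_end + 1)))
    return found


if __name__ == "__main__":
    for start, size in holes(parse_ranges(IOMEM)):
        print(f"hole at {start:x}: {size:#x} bytes ({size / 2**30:.3f} GiB)")
```

The first two ranges are contiguous, so the only hole in this excerpt is the
one starting at 160f8000000, which is 0x708000000 bytes (28.125 GiB) wide.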

 - jay



> Neil
> 
>> Vivek
>>
>>> Thoughts?
>>>
>>> Neil
>>>
>>> -- 
>>> /***************************************************
>>>  *Neil Horman
>>>  *Senior Software Engineer
>>>  *Red Hat, Inc.
>>>  *nhorman@redhat.com
>>>  *gpg keyid: 1024D / 0x92A74FA1
>>>  *http://pgp.mit.edu
>>>  ***************************************************/
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2008-07-16 19:16 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-16  1:07 problems in kdump kernel if 'maxcpus=1' not specified? Jay Lan
2008-07-16 15:12 ` Vivek Goyal
2008-07-16 15:25   ` Neil Horman
2008-07-16 16:23     ` Vivek Goyal
2008-07-16 17:03       ` Neil Horman
2008-07-16 19:16         ` Jay Lan [this message]
2008-07-17  4:59           ` Ken'ichi Ohmichi
2008-07-17  6:35             ` [PATCH] makedumpfile: Shrink the time for creating 1st-bitmap (Re: problems in kdump kernel if 'maxcpus=1' not specified?) Ken'ichi Ohmichi
2008-07-17 17:09               ` Jay Lan
2008-07-18  4:00                 ` Ken'ichi Ohmichi
2008-07-18 15:57                   ` Jay Lan
2008-07-16 18:00       ` problems in kdump kernel if 'maxcpus=1' not specified? Jay Lan
2008-07-16 15:26   ` Bernhard Walle
2008-07-16 16:34     ` Vivek Goyal
2008-07-16 16:45       ` Bernhard Walle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=487E4922.9050209@sgi.com \
    --to=jlan@sgi.com \
    --cc=kexec@lists.infradead.org \
    --cc=nhorman@redhat.com \
    --cc=oomichi@mxs.nes.nec.co.jp \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.