From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: cpw <cpw@sgi.com>
Cc: kumagai-atsushi@mxc.nes.nec.co.jp, kexec@lists.infradead.org
Subject: Re: [PATCH 0/2] makedumpfile: for large memories
Date: Tue, 14 Jan 2014 20:33:40 +0900
Message-ID: <52D52094.3050301@jp.fujitsu.com>
In-Reply-To: <E1Vy8l7-0004oL-7v@eag09.americas.sgi.com>
(2014/01/01 8:30), cpw wrote:
> From: Cliff Wickman <cpw@sgi.com>
>
> Gentlemen of kexec,
>
> I have been working on enabling kdump on some very large systems, and
> have found some solutions that I hope you will consider.
>
> The first issue is to work within the restricted size of crashkernel memory
> under 2.6.32-based kernels, such as sles11 and rhel6.
>
> The second issue is to reduce the very large size of a dump of a big memory
> system, even on an idle system.
>
> These are my propositions:
>
> Size of crashkernel memory
> 1) raw i/o for writing the dump
> 2) use root device for the bitmap file (not tmpfs)
> 3) raw i/o for reading/writing the bitmaps
>
> Size of dump (and hence the duration of dumping)
> 4) exclude page structures for unused pages
>
>
> 1) Is quite easy. The cache of pages needs to be aligned on a block
> boundary and written in block multiples, as required by O_DIRECT files.
>
> The use of raw i/o prevents the growing of the crash kernel's page
> cache.
>
> 2) Is also quite easy. My patch finds the path to the crash
> kernel's root device by examining the dump pathname. Storing the bitmaps
> to a file is otherwise not conserving memory, as they are being written
> to tmpfs.
>
> 3) Raw i/o for the bitmaps, is accomplished by caching the
> bitmap file in a similar way to that of the dump file.
>
> I find that the use of direct i/o is not significantly slower than
> writing through the kernel's page cache.
>
> 4) The excluding of unused kernel page structures is very
> important for a large memory system. The kernel otherwise includes
> 3.67 million pages of page structures per TB of memory. By contrast
> the rest of the kernel is only about 1 million pages.
>
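For reference, the 3.67 million figure can be reproduced with simple arithmetic. The sketch below assumes 4 KiB pages and a 56-byte struct page. Both values are assumptions on my side (neither is stated above), and they vary with architecture and kernel configuration.

```python
# Assumed values, not taken from the patch description.
PAGE_SIZE = 4096        # bytes per page (assumption: 4 KiB base pages)
STRUCT_PAGE_SIZE = 56   # bytes per struct page (assumption)
TIB = 1 << 40           # one tebibyte in bytes


def page_struct_pages(mem_bytes, page_size=PAGE_SIZE, struct_size=STRUCT_PAGE_SIZE):
    """Pages needed to hold one struct page for every page of memory."""
    n_pages = mem_bytes // page_size      # pages of physical memory
    total_bytes = n_pages * struct_size   # bytes of page structures
    return -(-total_bytes // page_size)   # ceiling division


pages_per_tib = page_struct_pages(TIB)
print(pages_per_tib)  # 3670016
```

Under these assumptions, page structures alone outnumber the roughly 1 million pages quoted for the rest of the kernel by more than 3 to 1 per TB.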
> Test results are below, for systems of 1TB, 2TB, 8.8TB and 16TB.
> (There are no 'old' numbers for 16TB as time and space requirements
> made those effectively useless.)
>
> Run times were generally reduced 2-3x, and dump size reduced about 8x.
>
> All timings were done using 512M of crashkernel memory.
>
> System memory size
> 1TB                           unpatched    patched
>   OS: rhel6.4 (does a free pages pass)
>     page scan time               1.6min     1.6min
>     dump copy time               2.4min      .4min
>     total time                   4.1min     2.0min
>     dump size                     3014M       364M
>
>   OS: rhel6.5
>     page scan time                .6min      .6min
>     dump copy time               2.3min      .5min
>     total time                   2.9min     1.1min
>     dump size                     3011M       423M
>
>   OS: sles11sp3 (3.0.93)
>     page scan time                .5min      .5min
>     dump copy time               2.3min      .5min
>     total time                   2.8min     1.0min
>     dump size                     2950M       350M
>
> 2TB
>   OS: rhel6.5 (cyclicx3)
>     page scan time               2.0min     1.8min
>     dump copy time               8.0min     1.5min
>     total time                  10.0min     3.3min
>     dump size                     6141M       835M
>
> 8.8TB
>   OS: rhel6.5 (cyclicx5)
>     page scan time               6.6min     5.5min
>     dump copy time              67.8min     6.2min
>     total time                  74.4min    11.7min
>     dump size                     15.8G       2.7G
>
> 16TB
>   OS: rhel6.4
>     page scan time                         125.3min
>     dump copy time                          13.2min
>     total time                             138.5min
>     dump size                                  4.0G
>
>   OS: rhel6.5
>     page scan time                          27.8min
>     dump copy time                          13.3min
>     total time                              41.1min
>     dump size                                  4.1G
>
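To compare the summary with the table, the reduction factors can be computed directly from the quoted figures. This is only a sketch: the row labels and the function name are mine, the numbers are copied from the table above, and the 16TB rows are left out because they have no unpatched figures.

```python
# (label, unpatched total min, patched total min, unpatched dump size, patched dump size)
# Dump sizes use the unit quoted in each row (M or G); the unit cancels in the ratio.
RESULTS = [
    ("1TB rhel6.4",   4.1,  2.0,  3014, 364),
    ("1TB rhel6.5",   2.9,  1.1,  3011, 423),
    ("1TB sles11sp3", 2.8,  1.0,  2950, 350),
    ("2TB rhel6.5",   10.0, 3.3,  6141, 835),
    ("8.8TB rhel6.5", 74.4, 11.7, 15.8, 2.7),
]


def reduction_factors(rows):
    """Return (label, time factor, size factor) as unpatched / patched."""
    return [(label, t_old / t_new, s_old / s_new)
            for label, t_old, t_new, s_old, s_new in rows]


for label, time_factor, size_factor in reduction_factors(RESULTS):
    print(f"{label:15s} time x{time_factor:.1f}  size x{size_factor:.1f}")
```

By this calculation the 1TB and 2TB runs fall in the quoted ranges (about 2-3x in time, 7-8.5x in size), while the 8.8TB run comes out at roughly 6x for both.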
Also, could you please show us the results in more detail?
This benchmark depends on the three parameters below:
- cyclic mode or non-cyclic mode
- cached I/O or direct I/O
- with or without the page structure object array
Please report the results for each parameter separately, so that we can
easily see how each one affects the outcome.
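One possible way to lay this out is a full matrix over the three parameters, so that any two runs differing in a single parameter can be compared directly. The sketch below only enumerates the combinations; the key names and labels are illustrative and do not correspond to makedumpfile options.

```python
from itertools import product

# Illustrative labels only; these are not makedumpfile option names.
PARAMETERS = {
    "mode": ("cyclic", "non-cyclic"),
    "io": ("cached", "direct"),
    "page_structs": ("included", "excluded"),
}


def benchmark_matrix(parameters=PARAMETERS):
    """Return one dict per combination of parameter values."""
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in product(*parameters.values())]


for run in benchmark_matrix():
    print(run)
```

That gives eight runs per system size. If that is too many for the largest machines, varying one parameter at a time from a fixed baseline would need only four.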
--
Thanks.
HATAYAMA, Daisuke
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec