From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: cpw <cpw@sgi.com>
Cc: kumagai-atsushi@mxc.nes.nec.co.jp, kexec@lists.infradead.org
Subject: Re: [PATCH 0/2] makedumpfile: for large memories
Date: Tue, 14 Jan 2014 20:33:40 +0900
Message-ID: <52D52094.3050301@jp.fujitsu.com>
In-Reply-To: <E1Vy8l7-0004oL-7v@eag09.americas.sgi.com>
(2014/01/01 8:30), cpw wrote:
> From: Cliff Wickman <cpw@sgi.com>
>
> Gentlemen of kexec,
>
> I have been working on enabling kdump on some very large systems, and
> have found some solutions that I hope you will consider.
>
> The first issue is to work within the restricted size of crashkernel memory
> under 2.6.32-based kernels, such as sles11 and rhel6.
>
> The second issue is to reduce the very large size of a dump of a big memory
> system, even on an idle system.
>
> These are my propositions:
>
> Size of crashkernel memory
> 1) raw i/o for writing the dump
> 2) use root device for the bitmap file (not tmpfs)
> 3) raw i/o for reading/writing the bitmaps
>
> Size of dump (and hence the duration of dumping)
> 4) exclude page structures for unused pages
>
>
> 1) Is quite easy. The cache of pages needs to be aligned on a block
> boundary and written in block multiples, as required for files opened
> with O_DIRECT.
>
> Using raw i/o keeps the crash kernel's page cache from growing.
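For illustration, this is roughly what a block-aligned O_DIRECT write looks
like (my own minimal sketch using standard POSIX calls, not the patch's
actual code; BLKSIZE and write_direct are made-up names):

#define _GNU_SOURCE          /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLKSIZE 4096         /* assumed device block size */

/* Write 'len' bytes to 'path' with O_DIRECT; buffer address, length
 * (and here, offset 0) are all kept block-aligned as O_DIRECT requires. */
int write_direct(const char *path, const void *data, size_t len)
{
        size_t aligned = (len + BLKSIZE - 1) & ~(size_t)(BLKSIZE - 1);
        void *buf;
        int fd, ret = -1;

        if (posix_memalign(&buf, BLKSIZE, aligned))
                return -1;
        memset(buf, 0, aligned);             /* zero-pad the last block */
        memcpy(buf, data, len);

        fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
        if (fd >= 0) {
                if (write(fd, buf, aligned) == (ssize_t)aligned)
                        ret = 0;
                close(fd);
        }
        free(buf);
        return ret;
}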
>
> 2) Is also quite easy. My patch finds the path to the crash
> kernel's root device by examining the dump pathname. Storing the bitmaps
> in a file would otherwise not conserve memory, because they would be
> written to tmpfs.
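If I read the idea correctly, the block device can be derived from the dump
pathname with stat(2) and /sys/dev/block; a rough sketch of that lookup
(hypothetical helper and names, not the code in the patch):

/* Rough sketch (not the patch's code): map a dump pathname to the
 * block device holding it, via stat(2) and /sys/dev/block/MAJ:MIN. */
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <unistd.h>

int find_block_device(const char *dumpfile, char *dev, size_t devlen)
{
        char dir[PATH_MAX], sysp[PATH_MAX], link[PATH_MAX];
        struct stat st;
        ssize_t n;

        /* stat the directory that will hold the dump file */
        snprintf(dir, sizeof(dir), "%s", dumpfile);
        if (stat(dirname(dir), &st) < 0)
                return -1;

        /* /sys/dev/block/MAJ:MIN is a symlink whose last component
         * names the device, e.g. sda2 */
        snprintf(sysp, sizeof(sysp), "/sys/dev/block/%u:%u",
                 major(st.st_dev), minor(st.st_dev));
        n = readlink(sysp, link, sizeof(link) - 1);
        if (n < 0)
                return -1;
        link[n] = '\0';

        snprintf(dev, devlen, "/dev/%s", basename(link));
        return 0;
}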
>
> 3) Raw i/o for the bitmaps is accomplished by caching the
> bitmap file in a similar way to the dump file.
>
> I find that the use of direct i/o is not significantly slower than
> writing through the kernel's page cache.
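As I understand it, that amounts to buffering the bitmap bytes in an aligned
block and only issuing full-block writes to the O_DIRECT descriptor;
something like this sketch (struct and function names are mine, not the
patch's):

#include <string.h>
#include <unistd.h>

#define BLKSIZE 4096

/* Sketch of a block-sized write cache in front of an O_DIRECT fd. */
struct blkcache {
        int    fd;      /* opened with O_DIRECT */
        char  *buf;     /* BLKSIZE bytes, BLKSIZE-aligned */
        size_t used;    /* bytes buffered so far */
};

static int blkcache_flush(struct blkcache *c)
{
        if (!c->used)
                return 0;
        memset(c->buf + c->used, 0, BLKSIZE - c->used);  /* pad last block */
        if (write(c->fd, c->buf, BLKSIZE) != BLKSIZE)
                return -1;
        c->used = 0;
        return 0;
}

static int blkcache_put(struct blkcache *c, const void *p, size_t len)
{
        while (len) {
                size_t n = BLKSIZE - c->used;

                if (n > len)
                        n = len;
                memcpy(c->buf + c->used, p, n);
                c->used += n;
                p = (const char *)p + n;
                len -= n;
                if (c->used == BLKSIZE && blkcache_flush(c) < 0)
                        return -1;
        }
        return 0;
}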
>
> 4) Excluding unused kernel page structures is very
> important for a large memory system. The kernel otherwise includes
> 3.67 million pages of page structures per TB of memory. By contrast,
> the rest of the kernel occupies only about 1 million pages.
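(For reference, my own arithmetic, assuming 4 KiB pages and a 56-byte
struct page: 1 TB is 2^28 pages, and 2^28 * 56 bytes = 14 GiB of page
structures, i.e. about 3.67 million 4 KiB pages of vmemmap per TB, which
matches the figure above.)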
>
> Test results are below, for systems of 1TB, 2TB, 8.8TB and 16TB.
> (There are no 'old' numbers for 16TB as time and space requirements
> made those effectively useless.)
>
> Run times were generally reduced 2-3x, and dump size reduced about 8x.
>
> All timings were done using 512M of crashkernel memory.
>
> System memory size
> 1TB unpatched patched
> OS: rhel6.4 (does a free pages pass)
> page scan time 1.6min 1.6min
> dump copy time 2.4min .4min
> total time 4.1min 2.0min
> dump size 3014M 364M
>
> OS: rhel6.5
> page scan time .6min .6min
> dump copy time 2.3min .5min
> total time 2.9min 1.1min
> dump size 3011M 423M
>
> OS: sles11sp3 (3.0.93)
> page scan time .5min .5min
> dump copy time 2.3min .5min
> total time 2.8min 1.0min
> dump size 2950M 350M
>
> 2TB
> OS: rhel6.5 (cyclic x3)
> page scan time 2.0min 1.8min
> dump copy time 8.0min 1.5min
> total time 10.0min 3.3min
> dump size 6141M 835M
>
> 8.8TB
> OS: rhel6.5 (cyclic x5)
> page scan time 6.6min 5.5min
> dump copy time 67.8min 6.2min
> total time 74.4min 11.7min
> dump size 15.8G 2.7G
>
> 16TB
> OS: rhel6.4
> page scan time 125.3min
> dump copy time 13.2min
> total time 138.5min
> dump size 4.0G
>
> OS: rhel6.5
> page scan time 27.8min
> dump copy time 13.3min
> total time 41.1min
> dump size 4.1G
>
Also, could you please show us the results in more detail?
This benchmark depends on the three parameters below:
- cyclic mode or non-cyclic mode
- cached I/O or direct I/O
- with or without the unused page structure (vmemmap) exclusion
Please describe the results for each parameter separately, so that we can
easily see how each parameter affects the outcome, without confusion.
--
Thanks.
HATAYAMA, Daisuke
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
Thread overview: 15+ messages
2013-12-31 23:30 [PATCH 0/2] makedumpfile: for large memories cpw
2013-12-31 23:34 ` [PATCH 1/2] makedumpfile: raw i/o and use of root device Cliff Wickman
2013-12-31 23:36 ` [PATCH 2/2] makedumpfile: exclude unused vmemmap pages Cliff Wickman
2014-01-06 9:27 ` [PATCH 0/2] makedumpfile: for large memories Atsushi Kumagai
2014-01-09 0:25 ` Cliff Wickman
2014-01-10 7:48 ` Atsushi Kumagai
2014-01-10 18:23 ` Cliff Wickman
2014-01-14 12:59 ` HATAYAMA Daisuke
2014-01-07 10:14 ` HATAYAMA Daisuke
2014-01-10 17:58 ` [PATCH 1/2 V2] raw i/o and root device to use less memory Cliff Wickman
2014-01-13 9:58 ` Michael Holzheu
2014-01-13 13:30 ` Cliff Wickman
2014-01-13 15:02 ` Michael Holzheu
2014-01-10 18:00 ` [PATCH 2/2 V2] exclude unused vmemmap pages Cliff Wickman
2014-01-14 11:33 ` HATAYAMA Daisuke [this message]