From: Petr Tesarik <ptesarik@suse.cz>
To: kexec@lists.infradead.org
Subject: Re: [PATCH] makedumpfile: request the kernel do page scans
Date: Fri, 7 Dec 2012 17:50:55 +0100 [thread overview]
Message-ID: <20121207175055.4e588fd7@azariah.suse.cz> (raw)
In-Reply-To: <20121119180710.GA16448@sgi.com>
On Mon, 19 Nov 2012 12:07:10 -0600,
Cliff Wickman <cpw@sgi.com> wrote:
> On Fri, Nov 16, 2012 at 03:39:44PM -0500, Vivek Goyal wrote:
> > On Thu, Nov 15, 2012 at 04:52:40PM -0600, Cliff Wickman wrote:
> > >
> > > Gentlemen,
> > >
> > > I know this is rather late to the game, given all the recent work
> > > to speed up makedumpfile and reduce the memory that it consumes.
> > > But I've been experimenting with asking the kernel to scan the
> > > page tables instead of reading all those page structures
> > > through /proc/vmcore.
> > >
> > > The results are rather dramatic -- if they weren't I would not
> > > presume to suggest such a radical path.
> > > On a small, idle UV system: about 4 sec. versus about 40 sec.
> > > On a 8TB UV the unnecessary page scan alone takes 4 minutes, vs.
> > > about 200 min through /proc/vmcore.
> > >
> > > I have not compared it to your version 1.5.1, so I don't know if
> > > your recent work provides similar speedups.
> >
> > I suggest trying 1.5.1-rc. IIUC, we had the logic of going through
> > page tables, but that required a single bitmap to be present, and in
> > a constrained memory environment we will not have that.
> >
> > That's when this idea came up that scan portion of struct page
> > range, filter it, dump it and then move on to next range.
> >
> > If the difference is still this dramatic even with 1.5.1-rc, that
> > means we are not doing something right in makedumpfile and it needs
> > to be fixed/optimized.
> >
> > But moving the logic into the kernel does not make much sense to me
> > at this point in time, unless there is a good explanation of why
> > user space can't do as good a job as the kernel.
>
> I tested a patch in which makedumpfile does nothing but scan all the
> page structures using /proc/vmcore. It is simply reading each
> consecutive range of page structures in readmem() chunks of 512
> structures. And doing nothing more than accumulating a hash total of
> the 'flags' field in each page (for a sanity check). On my test
> machine there are 6 blocks of page structures, totaling 12 million
> structures. This takes 31.1 'units of time' (I won't say seconds, as
> the speed of the clock seems to be way too fast in the crash kernel).
> If I increase the buffer size to 5120 structures: 31.0 units. At
> 51200 structures: 30.9. So buffer size has virtually no effect.
>
> I also request the kernel to do the same thing. Each of the 6
> requests asks the kernel to scan a range of page structures and
> accumulate a hash total of the 'flags' field. (And also copy a
> 10000-element pfn list back to user space, to test that such copies
> don't add significant overhead.) And the 12 million pages are scanned
> in 1.6 'units of time'.
>
> If I compare the time for actual page scanning (unnecessary pages and
> free pages) through /proc/vmcore vs. requesting the kernel to do the
> scanning: 40 units vs. 3.8 units.
>
> My conclusion is that makedumpfile's page scanning procedure is
> extremely dominated by the overhead of copying page structures
> through /proc/vmcore. And that is about 20x slower than using the
> kernel to access pages.
Understood.
I wonder if we can get the same speed if makedumpfile
mmaps /proc/vmcore instead of reading it. It's just a quick idea, so
maybe the first thing to find out is whether /proc/vmcore implements
mmap(), but if the bottleneck is indeed copy_to_user(), then this
should help.
Stay tuned,
Petr Tesarik
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
Thread overview: 11+ messages
2012-11-15 22:52 [PATCH] makedumpfile: request the kernel do page scans Cliff Wickman
2012-11-16 20:39 ` Vivek Goyal
2012-11-19 18:07 ` Cliff Wickman
2012-12-07 16:50 ` Petr Tesarik [this message]
2012-12-10 0:59 ` HATAYAMA Daisuke
2012-12-10 15:36 ` Cliff Wickman
2012-12-20 3:22 ` HATAYAMA Daisuke
2012-12-20 15:51 ` Cliff Wickman
2012-12-21 1:35 ` HATAYAMA Daisuke
2012-12-10 15:43 ` Cliff Wickman
2012-12-10 15:50 ` Cliff Wickman