Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator

From: Ingo Molnar <mingo@kernel.org>
To: Nathan Zimmer <nzimmer@sgi.com>
Cc: Daniel J Blueman <daniel@numascale-asia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Travis <travis@sgi.com>, "H. Peter Anvin" <hpa@zytor.com>,
	holt@sgi.com, rob@landley.net,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	yinghai@kernel.org, Greg KH <gregkh@linuxfoundation.org>,
	x86@kernel.org, linux-doc@vger.kernel.org,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Steffen Persvold <sp@numascale.com>
Subject: Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
Date: Sat, 29 Jun 2013 09:24:41 +0200
Message-ID: <20130629072441.GA15394@gmail.com>
In-Reply-To: <51CDF417.3050406@sgi.com>


* Nathan Zimmer <nzimmer@sgi.com> wrote:

> On 06/26/2013 10:35 PM, Daniel J Blueman wrote:
> >On Wednesday, June 26, 2013 9:30:02 PM UTC+8, Andrew Morton wrote:
> >>
> >> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mi...@kernel.org> wrote:
> >>
> >> > except that on 32 TB
> >> > systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
> >>
> >> That's about a million a second which is crazy slow - even my
> >> prehistoric desktop is 100x faster than that.
> >>
> >> Where's all this time actually being spent?
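
[The figures quoted above can be checked with a little arithmetic. The
sketch below is illustrative only and assumes 4 KiB pages, "32 TB" read
as 32 TiB, and "~2 hours" taken as exactly 7200 seconds; the function
names page_count() and init_rate() are hypothetical helpers, not kernel
interfaces.]

```c
#include <assert.h>
#include <stdint.h>

/* Number of page heads needed to describe mem_bytes of RAM,
 * assuming one page head per page_bytes-sized page. */
static uint64_t page_count(uint64_t mem_bytes, uint64_t page_bytes)
{
    assert(page_bytes != 0);
    return mem_bytes / page_bytes;
}

/* Average initialization rate, in page heads per second. */
static double init_rate(uint64_t pages, double seconds)
{
    assert(seconds > 0.0);
    return (double)pages / seconds;
}
```

[Under those assumptions, page_count(32ULL << 40, 4096) gives 2^33 =
8,589,934,592 page heads, and init_rate() over 7200 seconds comes out
at roughly 1.19 million per second, consistent with the "about a
million a second" estimate.]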
> >
> > The complexity of a directory-lookup architecture to make the 
> > (intrinsically unscalable) cache-coherency protocol scalable gives you 
> > a ~1us roundtrip to remote NUMA nodes.
> >
> > Probably a lot of time is spent in some memsets, and RMW cycles which 
> > are setting page bits, which are intrinsically synchronous, so the 
> > initialising core can't get to 12 or so outstanding memory 
> > transactions.
> >
> > Since EFI memory ranges have a flag to state if they are zeroed (which 
> > may be a fair assumption for memory on non-bootstrap processor NUMA 
> > nodes), we can probably collapse the RMWs to just writes.
> >
> > A normal write will require a coherency cycle, then a fetch and a 
> > writeback when it's evicted from the cache. For this purpose, 
> > non-temporal writes would eliminate the cache line fetch and give a 
> > massive increase in bandwidth. We wouldn't even need a store-fence as 
> > the initialising core is the only one online.
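
[To make the non-temporal write idea concrete, here is a minimal
user-space sketch using the SSE2 streaming-store intrinsic
_mm_stream_si128(). It is not kernel code and not what any posted patch
does; fill_nontemporal() is a hypothetical helper, and it assumes an
x86 target with SSE2, a 16-byte-aligned destination, and a length that
is a multiple of 16.]

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si128, _mm_set1_epi8 */
#include <stddef.h>
#include <stdint.h>

/* Fill len bytes at dst with value using streaming (non-temporal)
 * stores, which write around the cache instead of first fetching
 * each cache line into it. */
static void fill_nontemporal(void *dst, unsigned char value, size_t len)
{
    assert(((uintptr_t)dst % 16) == 0);  /* streaming stores need alignment */
    assert(len % 16 == 0);

    __m128i v = _mm_set1_epi8((char)value);
    __m128i *p = (__m128i *)dst;

    for (size_t i = 0; i < len / 16; i++)
        _mm_stream_si128(&p[i], v);

    /* Order the streaming stores before later stores. The quoted text
     * argues this could be skipped while only one core is online; it
     * is kept here because this is ordinary user-space code. */
    _mm_sfence();
}
```

[Whether this helps struct page initialization in practice is exactly
the open question in this thread; the sketch only shows the store
pattern being discussed.]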
> 
> Could you elaborate a bit more, or suggest a specific area to look at?
> 
> After some experiments with trying to just set some fields in the struct 
> page directly I haven't been able to produce any improvements.  Of 
> course there is lots about the area which I don't have much experience 
> with.

Any such improvement will at most be in the 10-20% range.

I'd suggest first concentrating on the 1000-fold boot time initialization 
speedup that the buddy allocator delayed initialization can offer, and 
speeding up whatever remains after that stage - in a much more 
development-friendly environment. (You'll be able to run 'perf record 
./calloc-1TB' after bootup and get meaningful results, etc.)
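
[The 'calloc-1TB' program named above is not shown in this message. A
minimal sketch of what such a test could look like follows; the helper
name calloc_and_touch() and its stride parameter are assumptions made
for illustration. It allocates zeroed memory and writes one byte per
stride so that every page is actually faulted in, which is where
on-demand initialization cost would show up in a profile.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate bytes of zeroed memory, write one byte every stride bytes
 * to force the pages to be populated, then free the buffer.
 * Returns the number of locations touched, or 0 if allocation fails. */
static size_t calloc_and_touch(size_t bytes, size_t stride)
{
    assert(stride != 0);

    volatile unsigned char *buf = calloc(1, bytes);
    if (buf == NULL)
        return 0;

    size_t touched = 0;
    for (size_t off = 0; off < bytes; off += stride) {
        buf[off] = 1;  /* volatile write so the store is not optimized away */
        touched++;
    }

    free((void *)buf);
    return touched;
}
```

[A main() that calls calloc_and_touch((size_t)1 << 40, 4096) on a
64-bit machine with enough memory would approximate the 1 TB case and
could be run under 'perf record' as suggested.]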

Thanks,

	Ingo

