* [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-25 18:51 ` Mike Travis
@ 2013-06-26 9:22 ` Ingo Molnar
2013-06-26 13:28 ` Andrew Morton
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2013-06-26 9:22 UTC
To: Mike Travis
Cc: H. Peter Anvin, Nathan Zimmer, holt, rob, tglx, mingo, yinghai,
akpm, gregkh, x86, linux-doc, linux-kernel, Linus Torvalds,
Peter Zijlstra
(Changed the subject, to make it more apparent what we are talking about.)
* Mike Travis <travis@sgi.com> wrote:
> On 6/25/2013 11:43 AM, H. Peter Anvin wrote:
> > On 06/25/2013 10:22 AM, Mike Travis wrote:
> >>
> >> On 6/25/2013 12:38 AM, Ingo Molnar wrote:
> >>>
> >>> * Nathan Zimmer <nzimmer@sgi.com> wrote:
> >>>
> >>>> On Sun, Jun 23, 2013 at 11:28:40AM +0200, Ingo Molnar wrote:
> >>>>>
> >>>>> That's 4.5 GB/sec initialization speed - that feels a bit slow and the
> >>>>> boot time effect should be felt on smaller 'a couple of gigabytes'
> >>>>> desktop boxes as well. Do we know exactly where the 2 hours of boot
> >>>>> time on a 32 TB system is spent?
> >>>>
> >>>> There are several other spots that could be improved on a large system
> >>>> but memory initialization is by far the biggest.
> >>>
> >>> My feeling is that deferred/on-demand initialization triggered from the
> >>> buddy allocator is the better long term solution.
> >>
> >> I haven't caught up with all of Nathan's changes yet (just
> >> got back from vacation), but there was an option to either
> >> start the memory insertion on boot, or trigger it later
> >> using the /sys/.../memory interface. There is also a monitor
> >> program that calculates the memory insertion rate. This was
> >> extremely useful to determine how changes in the kernel
> >> affected the rate.
> >>
> >
> > Sorry, I *totally* did not follow that comment. It seemed like a
> > complete non-sequitur?
> >
> > -hpa
>
> It was I who was not following the question. I'm still reverting
> to "work mode".
>
> [There is more code in a separate patch that Nate has not sent
> yet that instructs the kernel to start adding memory as early
> as possible, or not. That way you can start the insertion process
> later and monitor its progress to determine how changes in the
> kernel affect that process. It is controlled by a separate
> CONFIG option.]
So, just to repeat (and expand upon) the solution hpa and I suggest:
it's not based on /sys, delayed initialization lists or any similar
(essentially memory-hotplug-based) approach.
It's a transparent on-demand initialization scheme, based on initializing
the very early memory setup in 1GB (2MB) steps only (not in the 4K steps
we use today).
Any subsequent split-up initialization is done on-demand, in alloc_pages()
et al, initializing a batch of 512 (or 1024) struct page heads when an
uninitialized portion is first encountered.
This leaves the principal logic of early init largely untouched: we still
have the same amount of RAM during and after bootup, except that on 32 TB
systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
This scheme could be implemented by introducing a new PG_initialized flag,
which is seen by an unlikely() branch in alloc_pages() and which triggers
the on-demand initialization of pages.
[ It could probably be made zero-cost for the post-initialization state:
we already check a bunch of rare PG_ flags, one more flag would not
introduce any new branch in the page allocation hot path. ]
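As a rough sketch of the idea (PageInitialized() and init_page_batch()
are hypothetical names used for illustration, not existing kernel
symbols):

	/*
	 * Sketch: allocation-path hook that completes deferred
	 * struct page initialization on first use.
	 */
	static inline void ensure_pages_initialized(struct page *page)
	{
		/*
		 * Rare case: this range was only coarsely set up
		 * during early boot. Initialize a batch of ~512
		 * struct pages now, on first use - most likely on
		 * the local node.
		 */
		if (unlikely(!PageInitialized(page)))
			init_page_batch(page, 512);
	}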
It's a technically different solution from what was submitted in this
thread.
Cons:
- it works after bootup, via GFP. If done in a simple fashion it adds one
more branch to the GFP fastpath. [ If done a bit more cleverly it can
merge into an existing unlikely() branch and become essentially
zero-cost for the fastpath. ]
- it adds an initialization non-determinism to GFP, to the tune of
initializing ~512 page heads when RAM is first utilized.
- initialization is done when memory is needed - not during or shortly
after bootup. This (slightly) increases first-use overhead. [I don't
think this factor is significant - and I think we'll quickly see
speedups to initialization, once the overhead becomes more easily
measurable.]
Pros:
- it's transparent to the boot process. ('free' shows the same full
amount of RAM all the time, there are no weird effects of RAM coming
online asynchronously. You see all the RAM you have - etc.)
- it helps the boot time of every single Linux system, not just large-RAM
ones. On a smallish 4GB system, memory init can take up precious
hundreds of milliseconds, so this is a practical issue.
- it spreads initialization overhead to later portions of the system's
lifetime, when there's typically more idle time and more parallelism
available.
- initialization overhead, because it's a natural part of first-time
memory allocation with this scheme, becomes more measurable (and thus
more prominently optimized) than any deferred lists processed in the
background.
- as an added bonus it probably speeds up your use case even more than the
patches you are providing: on a 32 TB system the primary initialization
would only have to enumerate memory, allocate page heads and buddy
bitmaps, and initialize the 1GB-granular page heads: there are only
32,768 of them.
So unless I overlooked some factor, this scheme would be unconditional
goodness for everyone.
Thanks,
Ingo
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-26 9:22 ` [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator Ingo Molnar
@ 2013-06-26 13:28 ` Andrew Morton
2013-06-26 13:37 ` Ingo Molnar
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2013-06-26 13:28 UTC
To: Ingo Molnar
Cc: Mike Travis, H. Peter Anvin, Nathan Zimmer, holt, rob, tglx,
mingo, yinghai, gregkh, x86, linux-doc, linux-kernel,
Linus Torvalds, Peter Zijlstra
On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mingo@kernel.org> wrote:
> except that on 32 TB
> systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
That's about a million a second which is crazy slow - even my prehistoric desktop
is 100x faster than that.
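(For reference: 32 TB of 4K pages is 8,589,934,592 struct pages; over
~7,200 seconds that works out to roughly 1.2 million initializations
per second.)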
Where's all this time actually being spent?
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-26 13:28 ` Andrew Morton
@ 2013-06-26 13:37 ` Ingo Molnar
2013-06-26 15:02 ` Nathan Zimmer
2013-06-26 16:15 ` Mike Travis
0 siblings, 2 replies; 9+ messages in thread
From: Ingo Molnar @ 2013-06-26 13:37 UTC
To: Andrew Morton
Cc: Mike Travis, H. Peter Anvin, Nathan Zimmer, holt, rob, tglx,
mingo, yinghai, gregkh, x86, linux-doc, linux-kernel,
Linus Torvalds, Peter Zijlstra
* Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mingo@kernel.org> wrote:
>
> > except that on 32 TB
> > systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
>
> That's about a million a second which is crazy slow - even my
> prehistoric desktop is 100x faster than that.
>
> Where's all this time actually being spent?
See the earlier part of the thread - apparently it's spent initializing
the page heads - remote NUMA node misses from a single boot CPU, going
across a zillion cross-connects? I guess there are some other low-hanging
fruit as well - so making this easier to profile would be nice. The
profile posted was not really usable.
Btw., NUMA locality would be another advantage of on-demand
initialization: actual users of RAM tend to allocate node-local
(especially on large clusters), so any overhead will be naturally lower.
Thanks,
Ingo
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-26 13:37 ` Ingo Molnar
@ 2013-06-26 15:02 ` Nathan Zimmer
2013-06-26 16:15 ` Mike Travis
1 sibling, 0 replies; 9+ messages in thread
From: Nathan Zimmer @ 2013-06-26 15:02 UTC
To: Ingo Molnar
Cc: Andrew Morton, Mike Travis, H. Peter Anvin, Nathan Zimmer, holt,
rob, tglx, mingo, yinghai, gregkh, x86, linux-doc, linux-kernel,
Linus Torvalds, Peter Zijlstra
On Wed, Jun 26, 2013 at 03:37:15PM +0200, Ingo Molnar wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mingo@kernel.org> wrote:
> >
> > > except that on 32 TB
> > > systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
> >
> > That's about a million a second which is crazy slow - even my
> > prehistoric desktop is 100x faster than that.
> >
> > Where's all this time actually being spent?
>
> See the earlier part of the thread - apparently it's spent initializing
> the page heads - remote NUMA node misses from a single boot CPU, going
> across a zillion cross-connects? I guess there are some other low-hanging
> fruit as well - so making this easier to profile would be nice. The
> profile posted was not really usable.
>
That is correct. From what I am seeing, using crude cycle counters, there is
far more time spent on the later nodes; i.e., memory near the boot node is
initialized a lot faster than remote memory.
I think the other low-hanging fruit is currently being drowned out by the
lack of locality.
Nate
> Btw., NUMA locality would be another advantage of on-demand
> initialization: actual users of RAM tend to allocate node-local
> (especially on large clusters), so any overhead will be naturally lower.
>
> Thanks,
>
> Ingo
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-26 13:37 ` Ingo Molnar
2013-06-26 15:02 ` Nathan Zimmer
@ 2013-06-26 16:15 ` Mike Travis
1 sibling, 0 replies; 9+ messages in thread
From: Mike Travis @ 2013-06-26 16:15 UTC
To: Ingo Molnar
Cc: Andrew Morton, H. Peter Anvin, Nathan Zimmer, holt, rob, tglx,
mingo, yinghai, gregkh, x86, linux-doc, linux-kernel,
Linus Torvalds, Peter Zijlstra
On 6/26/2013 6:37 AM, Ingo Molnar wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mingo@kernel.org> wrote:
>>
>>> except that on 32 TB
>>> systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
>>
>> That's about a million a second which is crazy slow - even my
>> prehistoric desktop is 100x faster than that.
>>
>> Where's all this time actually being spent?
>
> See the earlier part of the thread - apparently it's spent initializing
> the page heads - remote NUMA node misses from a single boot CPU, going
> across a zillion cross-connects? I guess there are some other low-hanging
> fruit as well - so making this easier to profile would be nice. The
> profile posted was not really usable.
This is one advantage of delayed memory init. I can do it under
the profiler. I will put everything together to accomplish this
and then send a perf report.
>
> Btw., NUMA locality would be another advantage of on-demand
> initialization: actual users of RAM tend to allocate node-local
> (especially on large clusters), so any overhead will be naturally lower.
>
> Thanks,
>
> Ingo
>
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
@ 2013-06-27 3:35 Daniel J Blueman
2013-06-28 20:37 ` Nathan Zimmer
0 siblings, 1 reply; 9+ messages in thread
From: Daniel J Blueman @ 2013-06-27 3:35 UTC
To: Andrew Morton
Cc: Mike Travis, H. Peter Anvin, Nathan Zimmer, holt, rob,
Thomas Gleixner, Ingo Molnar, yinghai, Greg KH, x86, linux-doc,
Linux Kernel, Linus Torvalds, Peter Zijlstra, Steffen Persvold
On Wednesday, June 26, 2013 9:30:02 PM UTC+8, Andrew Morton wrote:
>
> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mi...@kernel.org> wrote:
>
> > except that on 32 TB
> > systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
>
> That's about a million a second which is crazy slow - even my
> prehistoric desktop
> is 100x faster than that.
>
> Where's all this time actually being spent?
The complexity of a directory-lookup architecture to make the
(intrinsically unscalable) cache-coherency protocol scalable gives you a
~1us roundtrip to remote NUMA nodes.
Probably a lot of time is spent in memsets, and in RMW cycles setting
page bits; these are intrinsically synchronous, so the initialising core
can't get to 12 or so outstanding memory transactions.
Since EFI memory ranges have a flag to state whether they are zeroed (which
may be a fair assumption for memory on non-bootstrap-processor NUMA
nodes), we can probably collapse the RMWs to just writes.
A normal write will require a coherency cycle, then a fetch and a
writeback when it's evicted from the cache. For this purpose,
non-temporal writes would eliminate the cache line fetch and give a
massive increase in bandwidth. We wouldn't even need a store-fence as
the initialising core is the only one online.
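A rough userspace sketch of the non-temporal store technique (purely
illustrative - not the kernel code under discussion; assumes x86 with
SSE2):

	#include <emmintrin.h>	/* _mm_stream_si128, _mm_sfence */
	#include <stddef.h>
	#include <stdint.h>

	/*
	 * Fill a 16-byte-aligned buffer with streaming stores, which
	 * bypass the cache and avoid the read-for-ownership a normal
	 * write incurs. 'bytes' must be a multiple of 16.
	 */
	static void fill_nontemporal(void *dst, uint64_t pattern, size_t bytes)
	{
		__m128i v = _mm_set1_epi64x((long long)pattern);
		char *p = dst;
		size_t i;

		for (i = 0; i < bytes; i += 16)
			_mm_stream_si128((__m128i *)(p + i), v);

		/*
		 * Order the streaming stores before later accesses;
		 * as noted above, with only one core online even this
		 * fence could be skipped.
		 */
		_mm_sfence();
	}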
Daniel
--
Daniel J Blueman
Principal Software Engineer, Numascale Asia
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-27 3:35 [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator Daniel J Blueman
@ 2013-06-28 20:37 ` Nathan Zimmer
2013-06-29 7:24 ` Ingo Molnar
0 siblings, 1 reply; 9+ messages in thread
From: Nathan Zimmer @ 2013-06-28 20:37 UTC
To: Daniel J Blueman
Cc: Andrew Morton, Mike Travis, H. Peter Anvin, holt, rob,
Thomas Gleixner, Ingo Molnar, yinghai, Greg KH, x86, linux-doc,
Linux Kernel, Linus Torvalds, Peter Zijlstra, Steffen Persvold
On 06/26/2013 10:35 PM, Daniel J Blueman wrote:
> On Wednesday, June 26, 2013 9:30:02 PM UTC+8, Andrew Morton wrote:
> >
> > On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar <mi...@kernel.org>
> > wrote:
> >
> > > except that on 32 TB
> > > systems we don't spend ~2 hours initializing 8,589,934,592 page
> > > heads.
> >
> > That's about a million a second which is crazy slow - even my
> > prehistoric desktop
> > is 100x faster than that.
> >
> > Where's all this time actually being spent?
>
> The complexity of a directory-lookup architecture to make the
> (intrinsically unscalable) cache-coherency protocol scalable gives you
> a ~1us roundtrip to remote NUMA nodes.
>
> Probably a lot of time is spent in some memsets, and RMW cycles which
> are setting page bits, which are intrinsically synchronous, so the
> initialising core can't get to 12 or so outstanding memory transactions.
>
> Since EFI memory ranges have a flag to state if they are zeroed (which
> may be a fair assumption for memory on non-bootstrap processor NUMA
> nodes), we can probably collapse the RMWs to just writes.
>
> A normal write will require a coherency cycle, then a fetch and a
> writeback when it's evicted from the cache. For this purpose,
> non-temporal writes would eliminate the cache line fetch and give a
> massive increase in bandwidth. We wouldn't even need a store-fence as
> the initialising core is the only one online.
>
> Daniel
Could you elaborate a bit more, or suggest a specific area to look at?
After some experiments with trying to just set some fields in the struct
page directly, I haven't been able to produce any improvements. Of
course, there is a lot about this area that I don't have much experience with.
Nate
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-28 20:37 ` Nathan Zimmer
@ 2013-06-29 7:24 ` Ingo Molnar
2013-06-29 18:03 ` Nathan Zimmer
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2013-06-29 7:24 UTC
To: Nathan Zimmer
Cc: Daniel J Blueman, Andrew Morton, Mike Travis, H. Peter Anvin,
holt, rob, Thomas Gleixner, Ingo Molnar, yinghai, Greg KH, x86,
linux-doc, Linux Kernel, Linus Torvalds, Peter Zijlstra,
Steffen Persvold
* Nathan Zimmer <nzimmer@sgi.com> wrote:
> On 06/26/2013 10:35 PM, Daniel J Blueman wrote:
> >On Wednesday, June 26, 2013 9:30:02 PM UTC+8, Andrew Morton wrote:
> >>
> >> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar
> >> <mi...@kernel.org> wrote:
> >>
> >> > except that on 32 TB
> >> > systems we don't spend ~2 hours initializing 8,589,934,592
> >> > page heads.
> >>
> >> That's about a million a second which is crazy slow - even my
> >> prehistoric desktop
> >> is 100x faster than that.
> >>
> >> Where's all this time actually being spent?
> >
> > The complexity of a directory-lookup architecture to make the
> > (intrinsically unscalable) cache-coherency protocol scalable gives you
> > a ~1us roundtrip to remote NUMA nodes.
> >
> > Probably a lot of time is spent in some memsets, and RMW cycles which
> > are setting page bits, which are intrinsically synchronous, so the
> > initialising core can't get to 12 or so outstanding memory
> > transactions.
> >
> > Since EFI memory ranges have a flag to state if they are zeroed (which
> > may be a fair assumption for memory on non-bootstrap processor NUMA
> > nodes), we can probably collapse the RMWs to just writes.
> >
> > A normal write will require a coherency cycle, then a fetch and a
> > writeback when it's evicted from the cache. For this purpose,
> > non-temporal writes would eliminate the cache line fetch and give a
> > massive increase in bandwidth. We wouldn't even need a store-fence as
> > the initialising core is the only one online.
>
> Could you elaborate a bit more, or suggest a specific area to look at?
>
> After some experiments with trying to just set some fields in the struct
> page directly, I haven't been able to produce any improvements. Of
> course, there is a lot about this area that I don't have much experience
> with.
Any such improvement will at most be in the 10-20% range.
I'd suggest first concentrating on the 1000-fold boot time initialization
speedup that the buddy allocator delayed initialization can offer, and
speeding up whatever remains after that stage - in a much more
development-friendly environment. (You'll be able to run 'perf record
./calloc-1TB' after bootup and get meaningful results, etc.)
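Such a test program might look roughly like this (the file name and
sizes are illustrative):

	/*
	 * calloc-1TB.c: touch one byte per page of a large calloc()ed
	 * region so that first-touch population - and, with the
	 * proposed scheme, on-demand struct page initialization -
	 * shows up under 'perf record'.
	 */
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		size_t bytes = 64UL << 30;	/* 64 GB - scale to the machine */
		size_t page = 4096;
		size_t i;
		char *buf = calloc(1, bytes);

		if (!buf) {
			perror("calloc");
			return 1;
		}

		for (i = 0; i < bytes; i += page)
			buf[i] = 1;

		printf("touched %zu pages\n", bytes / page);
		return 0;
	}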
Thanks,
Ingo
* Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator
2013-06-29 7:24 ` Ingo Molnar
@ 2013-06-29 18:03 ` Nathan Zimmer
0 siblings, 0 replies; 9+ messages in thread
From: Nathan Zimmer @ 2013-06-29 18:03 UTC
To: Ingo Molnar
Cc: Nathan Zimmer, Daniel J Blueman, Andrew Morton, Mike Travis,
H. Peter Anvin, holt, rob, Thomas Gleixner, Ingo Molnar, yinghai,
Greg KH, x86, linux-doc, Linux Kernel, Linus Torvalds,
Peter Zijlstra, Steffen Persvold
On Sat, Jun 29, 2013 at 09:24:41AM +0200, Ingo Molnar wrote:
>
> * Nathan Zimmer <nzimmer@sgi.com> wrote:
>
> > On 06/26/2013 10:35 PM, Daniel J Blueman wrote:
> > >On Wednesday, June 26, 2013 9:30:02 PM UTC+8, Andrew Morton wrote:
> > >>
> > >> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar
> > >> <mi...@kernel.org> wrote:
> > >>
> > >> > except that on 32 TB
> > >> > systems we don't spend ~2 hours initializing 8,589,934,592
> > >> > page heads.
> > >>
> > >> That's about a million a second which is crazy slow - even my
> > >> prehistoric desktop
> > >> is 100x faster than that.
> > >>
> > >> Where's all this time actually being spent?
> > >
> > > The complexity of a directory-lookup architecture to make the
> > > (intrinsically unscalable) cache-coherency protocol scalable gives you
> > > a ~1us roundtrip to remote NUMA nodes.
> > >
> > > Probably a lot of time is spent in some memsets, and RMW cycles which
> > > are setting page bits, which are intrinsically synchronous, so the
> > > initialising core can't get to 12 or so outstanding memory
> > > transactions.
> > >
> > > Since EFI memory ranges have a flag to state if they are zeroed (which
> > > may be a fair assumption for memory on non-bootstrap processor NUMA
> > > nodes), we can probably collapse the RMWs to just writes.
> > >
> > > A normal write will require a coherency cycle, then a fetch and a
> > > writeback when it's evicted from the cache. For this purpose,
> > > non-temporal writes would eliminate the cache line fetch and give a
> > > massive increase in bandwidth. We wouldn't even need a store-fence as
> > > the initialising core is the only one online.
> >
> > Could you elaborate a bit more, or suggest a specific area to look at?
> >
> > After some experiments with trying to just set some fields in the struct
> > page directly, I haven't been able to produce any improvements. Of
> > course, there is a lot about this area that I don't have much experience
> > with.
>
> Any such improvement will at most be in the 10-20% range.
>
> I'd suggest first concentrating on the 1000-fold boot time initialization
> speedup that the buddy allocator delayed initialization can offer, and
> speeding up whatever remains after that stage - in a much more
> development-friendly environment. (You'll be able to run 'perf record
> ./calloc-1TB' after bootup and get meaningful results, etc.)
>
> Thanks,
>
> Ingo
I had been focusing on the bigger gains, but my attention had been diverted by
hope of an easy, albeit smaller, win.
I have been experimenting with the patch proper; I am just doing 2MB pages for
the moment. The improvement is vast. I'll worry about proper numbers once I
think I have a fully working patch.
Some progress is being made on the real patch. I think the memory is
being set up correctly: on aligned pages, the page is set up as normal,
plus the new PG_ flag is set.
Right now I am trying to sort out free_pages_prepare and free_pages_check.
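A rough sketch of that direction (PageInitialized(), init_single_page()
and the function itself are hypothetical names for illustration, not
the actual in-progress patch):

	/*
	 * Sketch: the free-path sanity checks must not treat a
	 * still-deferred page as a bad page, so complete its
	 * initialization before the usual flag/refcount checks run.
	 */
	static inline void fixup_deferred_page(struct page *page)
	{
		if (unlikely(!PageInitialized(page))) {
			init_single_page(page);
			SetPageInitialized(page);
		}
		/* ...existing free_pages_check() logic follows. */
	}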
Thanks,
Nate
Thread overview: 9+ messages
2013-06-27 3:35 [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator Daniel J Blueman
2013-06-28 20:37 ` Nathan Zimmer
2013-06-29 7:24 ` Ingo Molnar
2013-06-29 18:03 ` Nathan Zimmer
-- strict thread matches above, loose matches on Subject: below --
2013-06-21 16:25 [RFC 0/2] Delay initializing of large sections of memory Nathan Zimmer
2013-06-21 16:25 ` [RFC 2/2] x86_64, mm: Reinsert the absent memory Nathan Zimmer
2013-06-23 9:28 ` Ingo Molnar
2013-06-24 20:36 ` Nathan Zimmer
2013-06-25 7:38 ` Ingo Molnar
2013-06-25 17:22 ` Mike Travis
2013-06-25 18:43 ` H. Peter Anvin
2013-06-25 18:51 ` Mike Travis
2013-06-26 9:22 ` [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator Ingo Molnar
2013-06-26 13:28 ` Andrew Morton
2013-06-26 13:37 ` Ingo Molnar
2013-06-26 15:02 ` Nathan Zimmer
2013-06-26 16:15 ` Mike Travis