From: Daniel J Blueman <daniel@numascale.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Steffen Persvold <sp@numascale.com>,
	Linux-MM <linux-mm@kvack.org>, Robin Holt <holt@sgi.com>,
	Nathan Zimmer <nzimmer@sgi.com>, Daniel Rahn <drahn@suse.com>,
	Davidlohr Bueso <dbueso@suse.com>,
	Dave Hansen <dave.hansen@intel.com>, Tom Vaden <tom.vaden@hp.com>,
	Scott Norton <scott.norton@hp.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 0/14] Parallel memory initialisation
Date: Mon, 20 Apr 2015 11:15:02 +0800
Message-ID: <1429499702.19274.3@cpanel21.proisp.no>
In-Reply-To: <1429170665.19274.0@cpanel21.proisp.no>

On Thu, Apr 16, 2015 at 3:51 PM, Daniel J Blueman 
<daniel@numascale.com> wrote:
> On Monday, April 13, 2015 at 6:20:05 PM UTC+8, Mel Gorman wrote:
> > Memory initialisation had been identified as one of the reasons why large
> > machines take a long time to boot. Patches were posted a long time ago
> > that attempted to move deferred initialisation into the page allocator
> > paths. This was rejected on the grounds it should not be necessary to hurt
> > the fast paths to parallelise initialisation. This series reuses much of
> > the work from that time but defers the initialisation of memory to kswapd
> > so that one thread per node initialises memory local to that node. The
> > issue is that on the machines I tested with, memory initialisation was not
> > a major contributor to boot times. I'm posting the RFC to both review the
> > series and see if it actually helps users of very large machines.
> >
> > After applying the series and setting the appropriate Kconfig variable I
> > see this in the boot log on a 64G machine
> >
> > [    7.383764] kswapd 0 initialised deferred memory in 188ms
> > [    7.404253] kswapd 1 initialised deferred memory in 208ms
> > [    7.411044] kswapd 3 initialised deferred memory in 216ms
> > [    7.411551] kswapd 2 initialised deferred memory in 216ms
> >
> > On a 1TB machine, I see
> >
> > [   11.913324] kswapd 0 initialised deferred memory in 1168ms
> > [   12.220011] kswapd 2 initialised deferred memory in 1476ms
> > [   12.245369] kswapd 3 initialised deferred memory in 1500ms
> > [   12.271680] kswapd 1 initialised deferred memory in 1528ms
> >
> > Once booted the machine appears to work as normal. Boot times were measured
> > from the time shutdown was called until ssh was available again.  In the
> > 64G case, the boot time savings are negligible. On the 1TB machine, the
> > savings were 10 seconds (about 8% improvement on kernel times but 1-2%
> > overall as POST takes so long).
> >
> > It would be nice if the people that have access to really large machines
> > would test this series and report back if the complexity is justified.
> 
> Nice work!
> 
> On an older Numascale system with 1TB memory and 256 cores/32 NUMA 
> nodes, platform init takes 52s (cold boot), firmware takes 84s 
> (includes one warm reboot), stock linux 4.0 then takes 732s to boot 
> [1] (due to the 700ns roundtrip, RMW cache-coherent cycles due to the 
> temporal writes for pagetable init and per-core store queue limits), 
> so there is huge potential.

Same 1TB setup (256 cores, 32 NUMA nodes):
unpatched 4.0: 789s [1]
2GB per node up-front: 426s [2]
4GB node 0 up-front, 0GB later nodes: 461s [3]
4GB node 0 up-front, 0.5GB later nodes: 404s [4]

Compelling results at only 1TB! In the last case, we see PMD setup take 
42% (168s) of the time, along with topology_init taking 39% (157s). I 
should be able to get data on a 7TB system this week.
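
For anyone wanting to compare the configurations, here is a minimal sketch 
of the arithmetic behind the numbers above. The boot times are copied from 
the list; the function and variable names are only illustrative, and the 
percentages are rounded.

```python
# Boot times in seconds, copied from the figures listed above.
BASELINE = 789  # unpatched 4.0
boot_times = {
    "2GB per node up-front": 426,
    "4GB node 0 up-front, 0GB later nodes": 461,
    "4GB node 0 up-front, 0.5GB later nodes": 404,
}


def reduction_pct(baseline, patched):
    """Percentage of the baseline boot time that was saved."""
    return 100.0 * (baseline - patched) / baseline


def share_pct(part, total):
    """Percentage of a total boot time spent in one phase."""
    return 100.0 * part / total


for name, secs in boot_times.items():
    print(f"{name}: {BASELINE - secs}s saved, "
          f"{reduction_pct(BASELINE, secs):.1f}% less than unpatched, "
          f"{BASELINE / secs:.2f}x faster")

# Phase shares in the 404s case, using the 168s and 157s figures above.
print(f"PMD setup: {share_pct(168, 404):.0f}% of boot")
print(f"topology_init: {share_pct(157, 404):.0f}% of boot")
```

Together the two phases account for roughly 80% of the 404s boot, which is 
why they look like the next things to examine.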

[1] https://resources.numascale.com/telemetry/defermem/h8qgl-defer-stock.txt
[2] https://resources.numascale.com/telemetry/defermem/h8qgl-defer-2g.txt
[3] https://resources.numascale.com/telemetry/defermem/h8qgl-defer-4+0.txt
[4] https://resources.numascale.com/telemetry/defermem/h8qgl-defer-4+half.txt
