From: Robert Richter <rric@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Robin Holt <holt@sgi.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>, Nate Zimmer <nzimmer@sgi.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Rob Landley <rob@landley.net>,
Mike Travis <travis@sgi.com>,
Daniel J Blueman <daniel@numascale-asia.com>,
Andrew Morton <akpm@linux-foundation.org>,
Greg KH <gregkh@linuxfoundation.org>,
Yinghai Lu <yinghai@kernel.org>, Mel Gorman <mgorman@suse.de>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RFC 0/4] Transparent on-demand struct page initialization embedded in the buddy allocator
Date: Fri, 12 Jul 2013 10:19:09 +0100
Message-ID: <20130712091909.GC8731@rric.localhost>
In-Reply-To: <20130712082756.GA4328@gmail.com>
On 12.07.13 10:27:56, Ingo Molnar wrote:
>
> * Robin Holt <holt@sgi.com> wrote:
>
> > [...]
> >
> > With this patch, we did boot a 16TiB machine. Without the patches, the
> > v3.10 kernel with the same configuration took 407 seconds for
> > free_all_bootmem. With the patches and operating on 2MiB pages instead
> > of 1GiB, it took 26 seconds so performance was improved. I have no feel
> > for how the 1GiB chunk size will perform.
>
> That's pretty impressive.
>
> It's still a 15x speedup instead of a 512x speedup, so I'd say there's
> something else being the current bottleneck, besides page init
> granularity.
>
> Can you boot with just a few gigs of RAM and stuff the rest into hotplug
> memory, and then hot-add that memory? That would allow easy profiling of
> remaining overhead.
>
> Side note:
>
> Robert Richter and Boris Petkov are working on 'persistent events' support
> for perf, which will eventually allow boot time profiling - I'm not sure
> if the patches and the tooling support is ready enough yet for your
> purposes.

The latest patch set is still this:

 git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent-v2

It requires the perf subsystem to be initialized first, which might be
too late for your purposes; see perf_event_init() in start_kernel(). The
patch set is currently also limited to tracepoints.

If this is sufficient for you, you can register persistent events
with the function perf_add_persistent_event_by_id(); see
mcheck_init_tp() for how to do this. Later you can fetch all samples with:

 # perf record -e persistent/<tracepoint>/ sleep 1

> Robert, Boris, the following workflow would be pretty intuitive:
>
> - kernel developer sets boot flag: perf=boot,freq=1khz,size=16MB
>
> - we'd get a single (cycles?) event running once the perf subsystem is up
>   and running, with a sampling frequency of 1 KHz, sending profiling
>   trace events to a sufficiently sized profiling buffer of 16 MB per
>   CPU.

I am not sure which event you want to set up here; if it is a
tracepoint, the sample_period should always be 1. The buffer size
parameter looks interesting; for now it is 512kB per CPU by default
(as the perf tools set up the buffer).

>
> - once the system reaches SYSTEM_RUNNING, profiling is stopped either
>   automatically - or the user stops it via a new tooling command.
>
> - the profiling buffer is extracted into a regular perf.data via a
>   special 'perf record' call or some other, new perf tooling
>   solution/variant.

See the perf-record command above.

>
>   [ Alternatively the kernel could attempt to construct a 'virtual'
>     perf.data from the persistent buffer, available via /sys/debug or
>     elsewhere in /sys - just like the kernel constructs a 'virtual'
>     /proc/kcore, etc. That file could be copied or used directly. ]
>
> - from that point on this workflow joins the regular profiling workflow:
>   perf report, perf script et al can be used to analyze the resulting
>   boot profile.

Ingo, thanks for outlining this workflow. We will look at how this could
fit into the new version of persistent events we are currently working on.

Thanks,
-Robert
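
For reference, the "15x instead of 512x" comparison quoted at the top of
this message can be checked with a few lines of arithmetic. This is only a
sketch: the two timings are the ones reported in the thread, while the
4 KiB base page size and the reading of 512 as a chunk-size ratio are
assumptions, since the thread does not spell out where 512 comes from
(2 MiB / 4 KiB and 1 GiB / 2 MiB both give 512).

```python
# Timings reported in the thread for free_all_bootmem on a 16 TiB machine.
BASELINE_SECONDS = 407  # v3.10 without the patches
PATCHED_SECONDS = 26    # with the patches, operating on 2 MiB chunks

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Assumption: 4 KiB base pages (typical for x86-64, not stated in the thread).
BASE_PAGE_BYTES = 4 * KIB


def observed_speedup(before_s, after_s):
    """Wall-clock speedup as a plain ratio of the two timings."""
    return before_s / after_s


def granularity_ratio(coarse_bytes, fine_bytes):
    """How many fine-grained units fit into one coarse-grained unit."""
    return coarse_bytes // fine_bytes


if __name__ == "__main__":
    speedup = observed_speedup(BASELINE_SECONDS, PATCHED_SECONDS)
    ideal = granularity_ratio(2 * MIB, BASE_PAGE_BYTES)
    print(f"observed speedup:           {speedup:.2f}x")
    print(f"2 MiB / 4 KiB granularity:  {ideal}x")
    print(f"1 GiB / 2 MiB granularity:  {granularity_ratio(GIB, 2 * MIB)}x")
    # Factor left unexplained if init cost scaled purely with granularity.
    print(f"remaining factor:           {ideal / speedup:.1f}x")
```

The gap between the observed ratio and the granularity ratio is the
"something else" that boot-time profiling would help to locate.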