From: William Cohen <wcohen@redhat.com>
To: "linux-perf-users@vger.kernel.org" <linux-perf-users@vger.kernel.org>
Subject: Perf supporting function reordering?
Date: Mon, 17 Mar 2014 12:22:52 -0400
Message-ID: <5327215C.7010005@redhat.com>
Has there been any thought about perf supporting function reordering? The
kernel had a function reorder option that was available in Linux
2.6.17 to 2.6.21. The kernel code has poor performance when compared
to the user-space code. In a simple experiment compiling the kernel,
the kernel code took an L1 icache miss every 60 instructions,
versus one every 260 instructions for user-space code.
The difference in IPC was also significant: the kernel code had 0.55
IPC while user space had 1.59 IPC.
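For anyone wanting to reproduce ratios like these, here is a minimal sketch of the arithmetic. The helper name `ratio` and the raw counts are hypothetical, chosen only to reproduce the kernel figures quoted above. The perf stat event names in the comment are an assumption and vary by CPU.

```shell
#!/bin/sh
# Hypothetical helper: divide two event counts and print the ratio.
# The counts would come from something like the following (assumed event
# names; check `perf list` on the machine in question):
#   perf stat -a -e instructions:k,cycles:k,L1-icache-load-misses:k -- make -j4
ratio() {
    # $1 = numerator count, $2 = denominator count
    awk -v n="$1" -v d="$2" \
        'BEGIN { if (d == 0) print "n/a"; else printf "%.2f\n", n / d }'
}

# Illustrative counts only, not measured data.
ratio 550 1000    # IPC = instructions / cycles
ratio 6000 100    # instructions per L1 icache miss
```

Repeating the same division with the :u modifier on each event would give the user-space side of the comparison.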
The arguments for removal of the function reorder code for Linux 2.6.22 were:
- the linker was slowed too much by the many sections
- the manually generated ordering list got out of date
- the workloads are too diverse
With perf it is easy to collect information about runtime
characteristics. Over the weekend I was able to collect call
graph information for a kernel build on the system with perf, then
render the data with gprof2dot and dot:
export training=training/make_a_g_branch_k
sudo perf record -a -g -e branches:k -o $training.data su wcohen -c "make -j4"
sudo chown wcohen $training.data; sudo chgrp wcohen $training.data
perf script -i $training.data| gprof2dot --format=perf > $training.gv
dot -Tsvg < $training.gv > $training.svg
By default gprof2dot prunes nodes and edges. The following provides a
more complete graph:
perf script -i $training.data| gprof2dot --format=perf -n 0.05 -e 0.01 > training/make_a_g_branch_k_2.gv
The resulting graphs are at:
http://people.redhat.com/wcohen/sediment/make_a_g_branch_k.svg
http://people.redhat.com/wcohen/sediment/make_a_g_branch_k_2.svg
The graphs give some indication of the flow through the kernel
code. Searching for "system_call" will show the kernel entry and where
things branch out from there. Other places of interest are "page_fault" and
"apic_timer_interrupt".
Any thoughts on making it easier for perf to make this statistical
call graph information available and using it to do code reordering? I
have experimented with code reordering on the user-space postgres package,
and it did help performance: about a 5% improvement in IPC
(http://people.redhat.com/wcohen/sediment/html/pop.html)
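As a rough illustration of what "using it to do code reordering" could look like, the sketch below turns a hottest-first list of symbol names into linker-script input-section patterns. It is not what the postgres experiment did. The helper name `emit_order` is hypothetical, and the sketch assumes the objects were built with gcc -ffunction-sections, so that each function lands in its own .text.<name> section. The symbol names are placeholders taken from this message.

```shell
#!/bin/sh
# Hypothetical helper: read symbol names (hottest first, one per line) on
# stdin and emit one linker-script input-section pattern per unique symbol.
# Assumes gcc -ffunction-sections, i.e. one .text.<name> section per function.
emit_order() {
    # Skip blank lines; keep only the first occurrence of each symbol.
    awk 'NF && !seen[$1]++ { printf "*(.text.%s)\n", $1 }'
}

# Illustrative input only; a real list would be derived from the perf data.
printf '%s\n' system_call page_fault system_call apic_timer_interrupt |
    emit_order
```

The emitted lines are meant to be placed inside the .text output section of a linker script, ahead of the catch-all *(.text .text.*) pattern, so the listed functions are laid out first and in that order.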
-Will