From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Subject: Re: [PATCH 1/4] tracing: move __DO_TRACE out of line
Date: Sat, 18 Apr 2009 02:53:31 -0400
Message-ID: <20090418065331.GA1942@Krystal>
In-Reply-To: <49E8D91F.1060005@goop.org>

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> Ingo Molnar wrote:
>> I meant to suggest to Jeremy to measure the effect of this
>> out-of-lining, in terms of instruction count in the hotpath.
>>
>
> OK, here's a comparison for trace_sched_switch, comparing inline and out
> of line tracing functions, with CONFIG_PREEMPT enabled:
>
> The inline __DO_TRACE version of trace_sched_switch inserts 20
> instructions, assembling to 114 bytes of code in the hot path:
>
[...]
>
> __do_trace_sched_switch is a fair bit larger, mostly due to function
> preamble frame and reg save/restore, and some unfortunate and
> unnecessary register thrashing (why not keep rdi,rsi,rdx where they
> are?). But it isn't that much larger than the inline version: 34
> instructions, 118 bytes. This code will also be shared among all
> instances of the tracepoint (not in this case, because sched_switch is
> unique, but other tracepoints have multiple users).
>
[...]
> So, conclusion: putting the tracepoint code out of line significantly
> reduces the hot-path code size at each tracepoint (114 bytes down to 31
> in this case, 27% the size). This should reduce the overhead of having
> tracing configured but not enabled. The saving won't be as large for
> tracepoints with fewer arguments or without CONFIG_PREEMPT, but I chose
> this example because it is realistic and undeniably a hot path. And
> when doing pvops tracing, 80 new events with hundreds of callsites
> around the kernel, this is really going to add up.
>
> The tradeoff is that the actual tracing function is a little larger, but
> not dramatically so. I would expect some performance hit when the
> tracepoint is actually enabled. This may be mitigated by increased icache
> hits when a tracepoint has multiple sites.
>
> (BTW, I realized that we don't need to pass &__tracepoint_FOO to
> __do_trace_FOO(), since it's always going to be the same; this simplifies
> the calling convention at the callsite, and it also makes void
> tracepoints work again.)
>
> J

Yep, keeping "void" tracepoints working is something I would like to
preserve. As for the supposed "near-zero function call impact", I took
LTTng for a little test, comparing tracing of the "core set" of Google
tracepoints with the tracepoints inline and out-of-line. Here are the
results:

tbench test
kernel: 2.6.30-rc1
running on an 8-core x86_64, localhost server

tracepoints inactive:
  2051.20 MB/sec

"google" tracepoints activated, flight recorder mode (overwrite) tracing:

  inline tracepoints
    1704.70 MB/sec (16.9% slower than baseline)

  out-of-line tracepoints
    1635.14 MB/sec (20.3% slower than baseline)

So the overall tracer impact is about 20% bigger (a 20.3% slowdown
instead of 16.9%) just by making the tracepoints out-of-line. This is
going to add up quickly if we add as many function calls as we currently
find in the event tracer fast path. LTTng, OTOH, has been designed to
minimize the number of such function calls, and the numbers above are a
good example of why that has been such an important design goal.
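The percentages quoted above can be re-derived from the three tbench throughput figures. The helpers below are only a small sketch of that arithmetic; the function names are made up for this illustration and are not part of LTTng or the kernel.

```c
#include <assert.h>

/*
 * Throughput loss of `measured` relative to `baseline`, in percent.
 * Both arguments are throughputs in MB/sec.
 */
double slowdown_pct(double baseline, double measured)
{
    assert(baseline > 0.0);
    return (baseline - measured) / baseline * 100.0;
}

/*
 * How much bigger overhead `b` is than overhead `a`, in percent.
 * Used here to compare the out-of-line slowdown with the inline one.
 */
double relative_increase_pct(double a, double b)
{
    assert(a > 0.0);
    return (b / a - 1.0) * 100.0;
}
```

Calling slowdown_pct(2051.20, 1704.70) and slowdown_pct(2051.20, 1635.14) gives roughly 16.9 and 20.3; passing those two results to relative_increase_pct() gives roughly 20, which is where the "20% bigger" figure comes from.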

About cache-line usage, I agree that in some cases gcc does not seem
intelligent enough to move those code paths away from the fast path.
What we would really want there is -freorder-blocks-and-partition, but
I doubt we want this for the whole kernel, as it makes some jumps
slightly larger. One thing we should maybe look into is adding some kind
of "very unlikely" builtin expect to gcc that would teach it to really
put the branch in a cache-cold location, no matter what.
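As a rough illustration of the "very unlikely" idea, the sketch below combines __builtin_expect() with GCC's `cold` function attribute. It is not the kernel's tracepoint implementation: every identifier prefixed with demo_, as well as the very_unlikely() macro, is hypothetical. GCC documents `cold` as optimizing the function for size, grouping cold functions together on many targets, and treating paths that lead to such calls as unlikely. Whether the argument setup at the call site also moves off the hot path is still up to the compiler.

```c
/* Hypothetical hint macro; the kernel's own unlikely() is similar. */
#define very_unlikely(x) __builtin_expect(!!(x), 0)

/* Stand-in for the per-tracepoint "enabled" state (assumed name). */
static int demo_tracepoint_enabled;

/* Counts probe invocations so the effect is observable. */
static unsigned long demo_probe_hits;

/*
 * Out-of-line slow path. `cold` asks GCC to treat the function, and
 * branches leading to it, as rarely executed; `noinline` keeps its
 * body out of the caller.
 */
static void __attribute__((cold, noinline)) demo_do_trace(int prev, int next)
{
    (void)prev;
    (void)next;
    demo_probe_hits++;
}

/* Hot-path check: one load, one test and a branch when disabled. */
static inline void demo_trace_sched_switch(int prev, int next)
{
    if (very_unlikely(demo_tracepoint_enabled))
        demo_do_trace(prev, next);
}
```

With demo_tracepoint_enabled left at 0, demo_trace_sched_switch() never reaches the probe; setting it to 1 routes each call through the cold function.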
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68