From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Ingo Molnar <mingo@elte.hu>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Arjan van de Ven <arjan@infradead.org>
Subject: Re: [patch-early-RFC 00/10] LTTng architecture dependent instrumentation
Date: Sat, 8 Dec 2007 14:05:04 -0500 [thread overview]
Message-ID: <20071208190504.GA30538@Krystal> (raw)
In-Reply-To: <20071206101112.GB17299@elte.hu>
* Ingo Molnar (mingo@elte.hu) wrote:
>
> hi Mathieu,
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Hi,
> >
> > Here is the architecture dependent instrumentation for LTTng. [...]
>
> A fundamental observation about markers, and i raised this point many
> many months ago already, so it might sound repetitive, but i'm unsure
> wether it's addressed. Documentation/markers.txt still says:
>
> | * Purpose of markers
> |
> | A marker placed in code provides a hook to call a function (probe)
> | that you can provide at runtime. A marker can be "on" (a probe is
> | connected to it) or "off" (no probe is attached). When a marker is
> | "off" it has no effect, except for adding a tiny time penalty
> | (checking a condition for a branch) and space penalty (adding a few
> | bytes for the function call at the end of the instrumented function
> | and adds a data structure in a separate section).
>
> could you please eliminate the checking of the flag, and insert a pure
> NOP sequence by default (no extra branches), which is then patched in
> with a function call instruction sequence, when the trace point is
> turned on? (on architectures that have code patching infrastructure -
> such as x86)
>
Hi Ingo,
Here are the results of a test I made, hacking a binary to put nops
instead of a function call.
The test is 20000 loops calling a function that contains a marker with
interrupts disabled. It is performed on a x86 32, Pentium 4 3GHz.
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current->pid,
arg, arg2);
The number here include the function call (present in both cases) the
counter increment/tests and the marker.
* No marker at all
240300 cycles total
12.02 cycles per loop
void test(unsigned long arg, unsigned long arg2)
{
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
asm volatile ("");
}
3: 5d pop %ebp
4: c3 ret
* With my marker implementation (load immediate 0, branch predicted) :
between 200355 and 200580 cycles total (avg 200400 cycles)
10.02 cycles per loop (yes, adding the marker increases performance)
void test(unsigned long arg, unsigned long arg2)
{
4d: 55 push %ebp
4e: 89 e5 mov %esp,%ebp
50: 83 ec 1c sub $0x1c,%esp
53: 89 c1 mov %eax,%ecx
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current-
>pid, arg, arg2);
55: b0 00 mov $0x0,%al
57: 84 c0 test %al,%al
59: 75 02 jne 5d <test+0x10>
}
5b: c9 leave
5c: c3 ret
* With NOPs :
avg around 410000 cycles total
20.5 cycles/loop (slowdown of 2)
void test(unsigned long arg, unsigned long arg2)
{
4d: 55 push %ebp
4e: 89 e5 mov %esp,%ebp
50: 83 ec 1c sub $0x1c,%esp
struct task_struct;
DECLARE_PER_CPU(struct task_struct *, current_task);
static __always_inline struct task_struct *get_current(void)
{
return x86_read_percpu(current_task);
53: 64 8b 0d 00 00 00 00 mov %fs:0x0,%ecx
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current-
>pid, arg, arg2);
5a: 89 54 24 18 mov %edx,0x18(%esp)
5e: 89 44 24 14 mov %eax,0x14(%esp)
62: 8b 81 c4 00 00 00 mov 0xc4(%ecx),%eax
68: 89 44 24 10 mov %eax,0x10(%esp)
6c: c7 44 24 0c 02 00 00 movl $0x2,0xc(%esp)
73: 00
74: c7 44 24 08 0e 00 00 movl $0xe,0x8(%esp)
7b: 00
7c: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
83: 00
84: c7 04 24 00 00 00 00 movl $0x0,(%esp)
8b: 90 nop
8c: 90 nop
8d: 90 nop
8e: 90 nop
8f: 90 nop
}
90: c9 leave
91: c3 ret
Therefore, because of the cost of stack setup, the load immediate and
conditionnal branch seems to be _much_ faster than the NOP alternative.
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2007-12-08 19:05 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-06 2:56 [patch-early-RFC 00/10] LTTng architecture dependent instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 01/10] LTTng - ARM instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 02/10] LTTng - x86_32 instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 03/10] LTTng - MIPS instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 04/10] LTTng instrumentation Powerpc Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 05/10] LTTng instrumentation PPC Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 06/10] LTTng - instrumentation SH Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 07/10] LTTng instrumentation SH64 Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 08/10] LTTng Sparc instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 09/10] LTTng - x86_64 instrumentation Mathieu Desnoyers
2007-12-06 2:57 ` [patch-early-RFC 10/10] LTTng - s390 instrumentation Mathieu Desnoyers
2007-12-06 10:11 ` [patch-early-RFC 00/10] LTTng architecture dependent instrumentation Ingo Molnar
2007-12-06 14:19 ` Mathieu Desnoyers
2007-12-08 19:05 ` Mathieu Desnoyers [this message]
2007-12-10 0:28 ` Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071208190504.GA30538@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=akpm@linux-foundation.org \
--cc=arjan@infradead.org \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox