From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Ingo Molnar <mingo@elte.hu>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Arjan van de Ven <arjan@infradead.org>
Subject: Re: [patch-early-RFC 00/10] LTTng architecture dependent instrumentation
Date: Sat, 8 Dec 2007 14:05:04 -0500 [thread overview]
Message-ID: <20071208190504.GA30538@Krystal> (raw)
In-Reply-To: <20071206101112.GB17299@elte.hu>
* Ingo Molnar (mingo@elte.hu) wrote:
>
> hi Mathieu,
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Hi,
> >
> > Here is the architecture dependent instrumentation for LTTng. [...]
>
> A fundamental observation about markers, and i raised this point many
> many months ago already, so it might sound repetitive, but i'm unsure
> wether it's addressed. Documentation/markers.txt still says:
>
> | * Purpose of markers
> |
> | A marker placed in code provides a hook to call a function (probe)
> | that you can provide at runtime. A marker can be "on" (a probe is
> | connected to it) or "off" (no probe is attached). When a marker is
> | "off" it has no effect, except for adding a tiny time penalty
> | (checking a condition for a branch) and space penalty (adding a few
> | bytes for the function call at the end of the instrumented function
> | and adds a data structure in a separate section).
>
> could you please eliminate the checking of the flag, and insert a pure
> NOP sequence by default (no extra branches), which is then patched in
> with a function call instruction sequence, when the trace point is
> turned on? (on architectures that have code patching infrastructure -
> such as x86)
>
Hi Ingo,
Here are the results of a test I made, hacking a binary to put nops
instead of a function call.
The test is 20000 loops calling a function that contains a marker with
interrupts disabled. It is performed on a x86 32, Pentium 4 3GHz.
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current->pid,
arg, arg2);
The number here include the function call (present in both cases) the
counter increment/tests and the marker.
* No marker at all
240300 cycles total
12.02 cycles per loop
void test(unsigned long arg, unsigned long arg2)
{
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
asm volatile ("");
}
3: 5d pop %ebp
4: c3 ret
* With my marker implementation (load immediate 0, branch predicted) :
between 200355 and 200580 cycles total (avg 200400 cycles)
10.02 cycles per loop (yes, adding the marker increases performance)
void test(unsigned long arg, unsigned long arg2)
{
4d: 55 push %ebp
4e: 89 e5 mov %esp,%ebp
50: 83 ec 1c sub $0x1c,%esp
53: 89 c1 mov %eax,%ecx
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current-
>pid, arg, arg2);
55: b0 00 mov $0x0,%al
57: 84 c0 test %al,%al
59: 75 02 jne 5d <test+0x10>
}
5b: c9 leave
5c: c3 ret
* With NOPs :
avg around 410000 cycles total
20.5 cycles/loop (slowdown of 2)
void test(unsigned long arg, unsigned long arg2)
{
4d: 55 push %ebp
4e: 89 e5 mov %esp,%ebp
50: 83 ec 1c sub $0x1c,%esp
struct task_struct;
DECLARE_PER_CPU(struct task_struct *, current_task);
static __always_inline struct task_struct *get_current(void)
{
return x86_read_percpu(current_task);
53: 64 8b 0d 00 00 00 00 mov %fs:0x0,%ecx
__my_trace_mark(0, kernel_debug_test, NULL, "%d %d %ld %ld", 2, current-
>pid, arg, arg2);
5a: 89 54 24 18 mov %edx,0x18(%esp)
5e: 89 44 24 14 mov %eax,0x14(%esp)
62: 8b 81 c4 00 00 00 mov 0xc4(%ecx),%eax
68: 89 44 24 10 mov %eax,0x10(%esp)
6c: c7 44 24 0c 02 00 00 movl $0x2,0xc(%esp)
73: 00
74: c7 44 24 08 0e 00 00 movl $0xe,0x8(%esp)
7b: 00
7c: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
83: 00
84: c7 04 24 00 00 00 00 movl $0x0,(%esp)
8b: 90 nop
8c: 90 nop
8d: 90 nop
8e: 90 nop
8f: 90 nop
}
90: c9 leave
91: c3 ret
Therefore, because of the cost of stack setup, the load immediate and
conditionnal branch seems to be _much_ faster than the NOP alternative.
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2007-12-08 19:05 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-06 2:56 [patch-early-RFC 00/10] LTTng architecture dependent instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 01/10] LTTng - ARM instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 02/10] LTTng - x86_32 instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 03/10] LTTng - MIPS instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 04/10] LTTng instrumentation Powerpc Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 05/10] LTTng instrumentation PPC Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 06/10] LTTng - instrumentation SH Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 07/10] LTTng instrumentation SH64 Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 08/10] LTTng Sparc instrumentation Mathieu Desnoyers
2007-12-06 2:56 ` [patch-early-RFC 09/10] LTTng - x86_64 instrumentation Mathieu Desnoyers
2007-12-06 2:57 ` [patch-early-RFC 10/10] LTTng - s390 instrumentation Mathieu Desnoyers
2007-12-06 10:11 ` [patch-early-RFC 00/10] LTTng architecture dependent instrumentation Ingo Molnar
2007-12-06 14:19 ` Mathieu Desnoyers
2007-12-08 19:05 ` Mathieu Desnoyers [this message]
2007-12-10 0:28 ` Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071208190504.GA30538@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=akpm@linux-foundation.org \
--cc=arjan@infradead.org \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.