Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108

public inbox for linux-kernel@vger.kernel.org

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15 17:14 Chuck Ebbert
  2006-09-15 18:32 ` Alan Cox
  2006-09-16 10:46 ` Jes Sorensen
  0 siblings, 2 replies; 271+ messages in thread
From: Chuck Ebbert @ 2006-09-15 17:14 UTC (permalink / raw)
  To: Alan Cox
  Cc: Greg Kroah-Hartman, linux-kernel, Roman Zippel, Jes Sorensen,
	Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
	Christoph Hellwig, Andrew Morton, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>

On Fri, 15 Sep 2006 15:37:51 +0100, Alan Cox wrote:

> > $ grep KPROBES arch/*/Kconf*
> > arch/i386/Kconfig:config KPROBES
> > arch/ia64/Kconfig:config KPROBES
> > arch/powerpc/Kconfig:config KPROBES
> > arch/sparc64/Kconfig:config KPROBES
> > arch/x86_64/Kconfig:config KPROBES
>
> Send patches. The fact nobody has them implemented on your platform
> isn't a reason to implement something else, quite the reverse in fact.

Yes, but the point is: until that's done you can't claim kprobes is a
valid tracing tool for everyone.

And things like net/ipv4/tcp_probe.c shouldn't be generally implemented
until every arch is supported.

-- 
Chuck

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
@ 2006-09-15 18:32 ` Alan Cox
  2006-09-16 10:46 ` Jes Sorensen
  1 sibling, 0 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 18:32 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel

Ar Gwe, 2006-09-15 am 13:14 -0400, ysgrifennodd Chuck Ebbert:
> In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>
> > > $ grep KPROBES arch/*/Kconf*
> > > arch/i386/Kconfig:config KPROBES
> > > arch/ia64/Kconfig:config KPROBES
> > > arch/powerpc/Kconfig:config KPROBES
> > > arch/sparc64/Kconfig:config KPROBES
> > > arch/x86_64/Kconfig:config KPROBES
> >
> > Send patches. The fact nobody has them implemented on your platform
> > isn't a reason to implement something else, quite the reverse in fact.
> 
> Yes, but the point is: until that's done you can't claim kprobes is a
> valid tracing tool for everyone.

I can however claim that kprobes is what they should be implementing,
rather than adding large new patches for another infrastructure whose
author has already said that, for dynamic stuff, it is based on the same
things.




* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
  2006-09-15 18:32 ` Alan Cox
@ 2006-09-16 10:46 ` Jes Sorensen
  1 sibling, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 10:46 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Alan Cox, Greg Kroah-Hartman, linux-kernel, Roman Zippel,
	Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
	Christoph Hellwig, Andrew Morton, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Chuck Ebbert wrote:
> In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>
> 
> On Fri, 15 Sep 2006 15:37:51 +0100, Alan Cox wrote:
> 
>>> $ grep KPROBES arch/*/Kconf*
>>> arch/i386/Kconfig:config KPROBES
>>> arch/ia64/Kconfig:config KPROBES
>>> arch/powerpc/Kconfig:config KPROBES
>>> arch/sparc64/Kconfig:config KPROBES
>>> arch/x86_64/Kconfig:config KPROBES
>> Send patches. The fact nobody has them implemented on your platform
>> isn't a reason to implement something else, quite the reverse in fact.
> 
> Yes, but the point is: until that's done you can't claim kprobes is a
> valid tracing tool for everyone.

The fact that the remaining architectures haven't bothered implementing
kprobe support should not be used as an argument for pushing something
inferior out of laziness.

It's the same with syscalls, the kernel infrastructure is there, but if
you don't bother updating the syscall tables and wrap it in with glibc,
then the call isn't available on your architecture.

The core kprobe infrastructure is available to all architectures, it's
up to the developers of the remaining architectures to implement the
remaining bits.

Jes


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-25 15:20 Chuck Ebbert
  2006-09-25 15:39 ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Chuck Ebbert @ 2006-09-25 15:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

In-Reply-To: <20060918151713.GA11495@elte.hu>

On Mon, 18 Sep 2006 17:17:13 +0200, Ingo Molnar wrote:

> yeah - and i dont think the kprobes overhead is a fundamental thing - i 
> posted a few kprobes-speedup patches as a reply to your measurements.

Where is the source code for the kprobes benchmarks you used?

-- 
Chuck



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-25 15:20 Chuck Ebbert
@ 2006-09-25 15:39 ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-25 15:39 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 716 bytes --]


Chuck,

i cannot email you because the mail always bounces ...

the kprobes benchmark is a simple "NOP" function:

 static int counter = 0;

 static int probe_pre_handler (struct kprobe * kp,
                               struct pt_regs * regs)
 {
         counter++;
         return 0;
 }

i've attached it.

	Ingo

* Chuck Ebbert <76306.1226@compuserve.com> wrote:

> In-Reply-To: <20060918151713.GA11495@elte.hu>
> 
> On Mon, 18 Sep 2006 17:17:13 +0200, Ingo Molnar wrote:
> 
> > yeah - and i dont think the kprobes overhead is a fundamental thing - i 
> > posted a few kprobes-speedup patches as a reply to your measurements.
> 
> Where is the source code for the kprobes benchmarks you used?
> 
> -- 
> Chuck

[-- Attachment #2: noop_kprobe.c --]
[-- Type: text/plain, Size: 1014 bytes --]

/*
 * no-op kprobe handler
 * Copyright (c) 2005 Hitachi,Ltd.,
 * Created by Masami Hiramatsu<hiramatu@sdl.hitachi.co.jp>
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kprobes.h>

MODULE_AUTHOR("M.Hiramatsu");
MODULE_LICENSE("GPL");

static unsigned long addr = 0;
module_param(addr, ulong, 0444);

static struct kprobe kp;
static int counter=0;

static int probe_pre_handler (struct kprobe * kp,
			      struct pt_regs * regs)
{
	counter++;
	return 0;
}

static int install_probe(void) 
{
	int ret = -10000;
	if (addr) {
		kp.pre_handler = probe_pre_handler;
		kp.addr = (void *)addr;
		printk("probe install to %p\n", (void*)addr);
		ret = register_kprobe(&kp);
	}
	if (ret) {
		printk("probe install error: %d\n",ret);
	}
	return ret;
}

static void uninstall_probe(void)
{
	if (kp.addr) {
		printk("uninstall from %p\n", (void*)kp.addr);
		unregister_kprobe(&kp);
	}
	printk("count:%d\n",counter);
}

module_init(install_probe);
module_exit(uninstall_probe);


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15  9:17 Richard J Moore
  0 siblings, 0 replies; 271+ messages in thread
From: Richard J Moore @ 2006-09-15  9:17 UTC (permalink / raw)
  To: linux-kernel


Ingo Molnar wrote:

> > I don't think anyone is saying that static tracepoints do not have
> > their limitations, or that dynamic tracepointing is useless. But
> > that's not the point ... why can't we have one infrastructure that
> > supports both? Preferably in a fairly simple, consistent way.
>
> primarily because i fail to see any property of static tracers that are
> not met by dynamic tracers. So to me dynamic tracers like SystemTap are
> a superset of static tracers.

There is one example where dynamic tracing is difficult or very messy to
implement, and that's for tracepoints needed during system and device
initialization. In this sense dynamic is not a practical superset of
static. However, I believe the tooling for dynamic trace should work for
static as well.

-- 
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15  3:10 James Dickens
  0 siblings, 0 replies; 271+ messages in thread
From: James Dickens @ 2006-09-15  3:10 UTC (permalink / raw)
  To: lkml

Static probe points in the mainline kernel should not be there for
kernel programmers. Any kernel programmer that is interested in an
event that a static probe would trace could, with a little work, use
kprobes, SystemTap, printk statements or numerous other methods and
accomplish the same thing, most likely with less impact on the kernel.

If you allow static probe points, do them for the people that use your
code. If static probing is to work in the mainline kernel, it's
necessary for everyone to see the value of them.

I came up with some simple rules that may help the adoption of static
probe points in the kernel. They answer a lot of issues I read in
other threads.

Some simple rules for Static Probing:

- If the probe is not enabled, it turns into a NOP. No probes are
enabled by default.
- Each programmer should provide this as a service to the user.
- There should be at most 1000 static probe points in the entire
kernel including modules, drivers, etc.
- Probes should not pass out any more information than what a user
would need. If the user needs more he needs to find another way to get
it, perhaps dynamic probing.
- If any part of the kernel has more than a dozen probe points there
are too many.
- If a probe would be of little use to a user/sysadmin it should be
removed from the mainline kernel.
- Yes, if a probe point is in the code you are working on, the role of
maintaining it falls on you.
- If you notice your code is doing something that matches a statically
probed event (i.e. your network driver dropped a packet), it's your
responsibility to add the necessary probe in your code.
- If "you" need a probe that would not be needed except for debugging
your code, use one of the other methods mentioned above, or remove it
before your code is submitted to the mainline kernel.
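
As a rough illustration of the first rules above (probes off by default, a
hard cap on their number), here is a minimal user-space C sketch. Every name
in it (MAX_STATIC_PROBES, probe_register, probe_enable, probe_fire, ...) is
hypothetical and not part of LTTng, kprobes or the kernel. A disabled probe
here costs one flag test rather than a true NOP; getting to a real NOP would
need code patching, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STATIC_PROBES 1000	/* hypothetical cap, taken from the rule above */

typedef void (*probe_fn)(const void *event);

struct static_probe {
	const char *name;
	bool enabled;		/* false until someone asks for the probe */
	probe_fn handler;
};

static struct static_probe probes[MAX_STATIC_PROBES];
static size_t nr_probes;

/* Register a probe point; returns its id, or -1 once the cap is reached. */
int probe_register(const char *name)
{
	if (nr_probes >= MAX_STATIC_PROBES)
		return -1;
	probes[nr_probes].name = name;
	probes[nr_probes].enabled = false;
	probes[nr_probes].handler = NULL;
	return (int)nr_probes++;
}

/* Attach a handler and switch the probe on. */
void probe_enable(int id, probe_fn handler)
{
	assert(id >= 0 && (size_t)id < nr_probes && handler != NULL);
	probes[id].handler = handler;
	probes[id].enabled = true;
}

/* Switch the probe off again; the handler stays attached but is not called. */
void probe_disable(int id)
{
	assert(id >= 0 && (size_t)id < nr_probes);
	probes[id].enabled = false;
}

/* Called at the probe site; a disabled probe costs one flag test. */
static inline void probe_fire(int id, const void *event)
{
	if (probes[id].enabled)
		probes[id].handler(event);
}

/* Example handler: count events, e.g. dropped network packets. */
static unsigned long dropped_packets;

static void count_dropped_packet(const void *event)
{
	(void)event;
	dropped_packets++;
}
```

A driver would call probe_register() once at init time and probe_fire() at
the event site; nothing is counted until a user calls probe_enable().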

Some example static probe points

Task is being moved on to a cpu.
Task moving off a cpu

Start of an IO
End of an IO

Network packet received
Packet dropped.

Various lock activities
Lock taken
Spin lock taken

James Dickens
uadmin.blogspot.com


* [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-14  3:38 Mathieu Desnoyers
  2006-09-14 11:27 ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14  3:38 UTC (permalink / raw)
  To: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi
  Cc: ltt-dev, Michel Dagenais

Hi,

Following advice Christoph gave me this summer, I am submitting a smaller,
easier-to-review patch, which should make everybody happier. Here is a
stripped-down version of LTTng: I removed everything that would make
reviewers reluctant (especially kernel instrumentation and the kernel state
dump module). I plan to release this "core" version every few LTTng releases
and post it to LKML.

Comments and reviews are very welcome.

See http://ltt.polymtl.ca > QUICKSTART for information about creating your own
instrumentation set.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14  3:38 Mathieu Desnoyers
@ 2006-09-14 11:27 ` Ingo Molnar
  2006-09-14 13:40   ` Roman Zippel
                     ` (3 more replies)
  0 siblings, 4 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 11:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Following advice Christoph gave me this summer, I am submitting a 
> smaller, easier-to-review patch, which should make everybody happier. 
> Here is a stripped-down version of LTTng: I removed everything that 
> would make reviewers reluctant (especially kernel instrumentation and 
> the kernel state dump module). I plan to release this "core" version 
> every few LTTng releases and post it to LKML.
> 
> Comments and reviews are very welcome.

i have one very fundamental question: why should we do this 
source-intrusive method of adding tracepoints instead of the dynamic, 
unintrusive (and thus zero-overhead) KProbes+SystemTap method?

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 11:27 ` Ingo Molnar
@ 2006-09-14 13:40   ` Roman Zippel
  2006-09-14 13:55     ` Ingo Molnar
  2006-09-14 15:02   ` Mathieu Desnoyers
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 13:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> i have one very fundamental question: why should we do this 
> source-intrusive method of adding tracepoints instead of the dynamic, 
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?

Could you define "zero-overhead"?
Actual implementation aside having a core number of tracepoints is far 
more portable than KProbes.

bye, Roman


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 13:40   ` Roman Zippel
@ 2006-09-14 13:55     ` Ingo Molnar
  2006-09-14 14:33       ` Roman Zippel
  2006-09-14 15:19       ` Mathieu Desnoyers
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 13:55 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Thu, 14 Sep 2006, Ingo Molnar wrote:
> 
> > i have one very fundamental question: why should we do this 
> > source-intrusive method of adding tracepoints instead of the dynamic, 
> > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> 
> Could you define "zero-overhead"?

zero overhead when not used: not a single instruction added to the 
kernel codepath that is to be traced, anywhere. (which will be the case 
on 99% of the systems)

> Actual implementation aside having a core number of tracepoints is far 
> more portable than KProbes.

the key point is that we want _zero_ "static tracepoints". Firstly, 
static tracepoints are fundamentally limited:

 - they can only be added at the source code level

 - modifying them requires a reboot which is not practical in a 
   production environment

 - there can only be a limited set of them, while many problems need 
   finegrained tracepoints tailored to the problem at hand

 - conditional tracepoints are typically either nonexistent or very 
   limited.

But besides the usability problems, the most important problem is that 
static tracepoints add a _constant maintenance overhead_ to the kernel. 
I'm talking from first hand experience: i wrote 'iotrace' (a static 
tracer) in 1996 and have maintained it for many years, and even today 
i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_ 
want static tracepoints in the mainline kernel.

enter KProbes+SystemTap. It needs no changes at the source code level at 
all, so no maintenance overhead to generic kernel code. Tracepoints can 
be added and removed while the system is running. Trace actions and 
filters can be added based on a scripting language, so tracing is as 
dynamic as it gets.

(check out http://lwn.net/Articles/198557/ if you have an lwn 
subscription - it's subscriber-only for a few weeks)

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 13:55     ` Ingo Molnar
@ 2006-09-14 14:33       ` Roman Zippel
  2006-09-14 15:26         ` Michel Dagenais
                           ` (2 more replies)
  2006-09-14 15:19       ` Mathieu Desnoyers
  1 sibling, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 14:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> > On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > 
> > > i have one very fundamental question: why should we do this 
> > > source-intrusive method of adding tracepoints instead of the dynamic, 
> > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> > 
> > Could you define "zero-overhead"?
> 
> zero overhead when not used: not a single instruction added to the 
> kernel codepath that is to be traced, anywhere. (which will be the case 
> on 99% of the systems)

Using alternatives this could be near zero as well and it will likely 
have less overhead when it's actually used.

> > Actual implementation aside having a core number of tracepoints is far 
> > more portable than KProbes.
> 
> the key point is that we want _zero_ "static tracepoints". Firstly, 
> static tracepoints are fundamentally limited:

BTW I don't mind KProbes as an option, but I have a huge problem with making 
it the only option.

> But besides the usability problems, the most important problem is that 
> static tracepoints add a _constant maintenance overhead_ to the kernel. 
> I'm talking from first hand experience: i wrote 'iotrace' (a static 
> tracer) in 1996 and have maintained it for many years, and even today 
> i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_ 
> want static tracepoints in the mainline kernel.

Even dynamic tracepoints have a maintenance overhead and I doubt there is 
much difference. The big problem is having to maintain them outside the 
mainline kernel, that's why it's so important to get them into the 
mainline kernel.
You didn't address my main issue at all - kprobes is only available for a 
few archs...

bye, Roman


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 14:33       ` Roman Zippel
@ 2006-09-14 15:26         ` Michel Dagenais
  2006-09-14 17:48           ` Ingo Molnar
  2006-09-14 18:08           ` Nick Piggin
  2006-09-14 17:13         ` Ingo Molnar
  2006-09-14 17:51         ` Karim Yaghmour
  2 siblings, 2 replies; 271+ messages in thread
From: Michel Dagenais @ 2006-09-14 15:26 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev

On Thu, 2006-14-09 at 16:33 +0200, Roman Zippel wrote:
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > > On Thu, 14 Sep 2006, Ingo Molnar wrote:

> > > > i have one very fundamental question: why should we do this 
> > > > source-intrusive method of adding tracepoints instead of the dynamic, 
> > > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?

> Using alternatives this could be near zero as well and it will likely 
> have less overhead when it's actually used.

This is the crucial point. Using an INT3 at each dynamic tracepoint is
both costly and is a larger perturbation on the system under study.
Static tracepoints can be achieved by various means, including a few
NOPs to reserve space which get patched dynamically for activation. They
may also be compiled out completely. By the way, there are quite a few
tracers already in device drivers in the kernel.
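
To make the "compiled out completely" option concrete, here is a minimal C
sketch of a marker macro that expands to nothing unless tracing is built in.
The names trace_mark, trace_record and CONFIG_TRACE_SKETCH are hypothetical
and are not the LTTng or kernel API. The NOP-patching variant mentioned above
needs runtime code modification and is not shown.

```c
#include <assert.h>

static unsigned long trace_events_recorded;

/* Stand-in for the real tracer back end (hypothetical). */
void trace_record(const char *name, long value)
{
	(void)name;
	(void)value;
	trace_events_recorded++;
}

#ifdef CONFIG_TRACE_SKETCH
#define trace_mark(name, value)	trace_record((name), (value))
#else
/* Compiled out: no call, no argument evaluation, no code at the site. */
#define trace_mark(name, value)	do { } while (0)
#endif

/* An instrumented code path: it returns the same value either way. */
long handle_packet(long len)
{
	assert(len >= 0);
	trace_mark("net_packet_received", len);
	return len;
}
```

Building with -DCONFIG_TRACE_SKETCH turns the marker into a plain function
call; without it the instrumented function is identical to an
uninstrumented one.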

> BTW I don't mind KProbes as an option, but I have a huge problem with making 
> it the only option.

Indeed, KProbes, SystemTap and LTTng are complementary and people
involved in the three projects are cooperating.

> > But besides the usability problems, the most important problem is that 
> > static tracepoints add a _constant maintenance overhead_ to the kernel. 
> > I'm talking from first hand experience: i wrote 'iotrace' (a static 
> > tracer) in 1996 and have maintained it for many years, and even today 
> > i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_ 
> > want static tracepoints in the mainline kernel.
> 
> Even dynamic tracepoints have a maintenance overhead and I doubt there is 
> much difference. The big problem is having to maintain them outside the 
> mainline kernel, that's why it's so important to get them into the 
> mainline kernel.

Indeed, dynamic tracepoints are like code patches: when the kernel
source changes they may or may not apply to newer versions. Mainline kernel
"static" tracepoints are more like the existing 70000+ printk
statements!



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:26         ` Michel Dagenais
@ 2006-09-14 17:48           ` Ingo Molnar
  2006-09-15 15:04             ` Mathieu Desnoyers
  2006-09-14 18:08           ` Nick Piggin
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 17:48 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev


* Michel Dagenais <michel.dagenais@polymtl.ca> wrote:

> This is the crucial point. Using an INT3 at each dynamic tracepoint is 
> both costly and is a larger perturbation on the system under study. 
> [...]

have you measured this?

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:48           ` Ingo Molnar
@ 2006-09-15 15:04             ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 15:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michel Dagenais, Roman Zippel, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Michel Dagenais <michel.dagenais@polymtl.ca> wrote:
> 
> > This is the crucial point. Using an INT3 at each dynamic tracepoint is 
> > both costly and is a larger perturbation on the system under study. 
> > [...]
> 
> have you measured this?
> 

Hi Ingo,

A very quick test (yes, done in user space, but should be accurate enough for
our needs) on a 3 GHz Pentium 4 shows that generating an int3 breakpoint in a
loop (connected to an empty handler) takes an average of 2.01µs per breakpoint.

LTT has an impact of about 0.220µs per probe (10 times smaller).

Please refer to this kind of high event rate workload :
http://www.listserv.shafik.org/pipermail/ltt-dev/2005-December/001139.html

On the same pentium 4, 3 GHz (in the following results, I do not consider the
fact that the CPU had hyperthreading enabled) :

Probe execution time at probe site : 220ns/event

220ns * 9588836 events = 2.11s

Event rate : 749994 events per second

LTT :
749994 events/s * 0.220µs/event = 16.5 % of cpu time

With a breakpoint :
749994 events/s * 2.01µs/event = 150 % of cpu time
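
The arithmetic above can be checked with a few lines of C. This is only a
sketch: the function names are made up for illustration, and the inputs are
the figures quoted in this message (220 ns and 2.01µs per event, 749994
events per second, 9588836 events in total).

```c
#include <assert.h>

/* Share of one CPU spent in probes, in percent. */
double probe_cpu_percent(double events_per_sec, double ns_per_event)
{
	assert(events_per_sec >= 0.0 && ns_per_event >= 0.0);
	return events_per_sec * ns_per_event * 1e-9 * 100.0;
}

/* Total time spent in probes over a whole trace, in seconds. */
double probe_total_seconds(double nr_events, double ns_per_event)
{
	assert(nr_events >= 0.0 && ns_per_event >= 0.0);
	return nr_events * ns_per_event * 1e-9;
}
```

probe_cpu_percent(749994, 220) gives about 16.5, probe_cpu_percent(749994,
2010) gives about 150.7, and probe_total_seconds(9588836, 220) gives about
2.11, matching the numbers above.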

Considering the limitations of these tests :
- int3 timings taken from user space, which implies calling an empty handler in
  user space.
- The machine had hyperthreading enabled, but considered UP here.
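
For reference, a user-space int3 timing loop of the kind described could look
roughly like the sketch below. It is not the program used for the numbers
above; it assumes Linux with POSIX signals, and the names time_breakpoints
and trap_handler are hypothetical. On x86 it executes a real int3; elsewhere
it falls back to raise(SIGTRAP), which measures signal delivery only. The
handler counts traps so the run can be verified, which adds a negligible
cost compared to a truly empty handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t trap_count;

/* Nearly empty handler: it only counts, so each trap can be verified. */
static void trap_handler(int sig)
{
	(void)sig;
	trap_count++;
}

/* Average cost of one breakpoint round trip in ns, or -1.0 on failure. */
double time_breakpoints(long iterations)
{
	struct sigaction sa, old;
	struct timespec start, end;
	long i;

	assert(iterations > 0);
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = trap_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGTRAP, &sa, &old) != 0)
		return -1.0;

	trap_count = 0;
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++) {
#if defined(__i386__) || defined(__x86_64__)
		/* Execution resumes after the int3 once the handler returns. */
		__asm__ volatile("int3" ::: "memory");
#else
		raise(SIGTRAP);	/* fallback on other architectures */
#endif
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	sigaction(SIGTRAP, &old, NULL);	/* restore the previous disposition */

	return ((end.tv_sec - start.tv_sec) * 1e9 +
		(end.tv_nsec - start.tv_nsec)) / (double)iterations;
}
```

Running this under a debugger will not work as intended, since the debugger
intercepts the breakpoint before the handler sees it.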

It shows that tracing the same workload with breakpoints would make the machine
more than twice as slow, whereas a direct memory write has a relatively small
impact (16.5% of cpu time spent in probes).

In high event rate/low perturbation scenarios where instrumentation is put at
arbitrary locations in the code, it appears necessary to use the static
instrumentation alternative because the breakpoint approach is just too slow.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:26         ` Michel Dagenais
  2006-09-14 17:48           ` Ingo Molnar
@ 2006-09-14 18:08           ` Nick Piggin
  2006-09-14 18:38             ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Nick Piggin @ 2006-09-14 18:08 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev

Michel Dagenais wrote:
> On Thu, 2006-14-09 at 16:33 +0200, Roman Zippel wrote:

>>BTW I don't mind KProbes as an option, but I have a huge problem with making 
>>it the only option.
> 
> 
> Indeed, KProbes, SystemTap and LTTng are complementary and people
> involved in the three projects are cooperating.

That doesn't mean we want them all in the kernel.

The best aim would of course be to come up with a solution that has
the advantages of all and disadvantages of none. That may be
impossible, but if we can find one way to do things that is acceptable
to all...

What's the huge problem with making kprobes the only option (that can't
be fixed by doing a bit of coding)?

-- 
SUSE Labs, Novell Inc.


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:08           ` Nick Piggin
@ 2006-09-14 18:38             ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 18:38 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Michel Dagenais, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev

Nick Piggin wrote:
> What's the huge problem with making kprobes the only option (that can't
> be fixed by doing a bit of coding)?

No offense, having been on the receiving end of this for a number
of years, one feels like he's watching a never-ending repeat of a
30-second commercial where the woman is holding up a magic scrub
and says something like "Just use Mr. Scrub" and the product then
twinkles with some light music and then cut, next commercial;
except in this case, it's "Just use Kprobes" and all your
problems will go away, wink-wink!

Sorry, it's just not that straight-forward. There's a reason
why the systemtap folks got interested in the markers proposal,
they actually have to maintain a dynamic instrumentation set.
Mr. Scrub just doesn't scrub as clean as advertised, you
actually have to scrub to make the scum go away. Which goes
back to what I said elsewhere: no matter where you draw the
line someone is doing the heavy lifting. Doing it outside the
kernel only means that there's yet another piece of software
that needs to be updated before you can actually start
profiting from your new and improved kernel ...

Karim


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 14:33       ` Roman Zippel
  2006-09-14 15:26         ` Michel Dagenais
@ 2006-09-14 17:13         ` Ingo Molnar
  2006-09-14 17:55           ` Roman Zippel
                             ` (2 more replies)
  2006-09-14 17:51         ` Karim Yaghmour
  2 siblings, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 17:13 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
> 
> > > On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > > 
> > > > i have one very fundamental question: why should we do this 
> > > > source-intrusive method of adding tracepoints instead of the dynamic, 
> > > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> > > 
> > > Could you define "zero-overhead"?
> > 
> > zero overhead when not used: not a single instruction added to the 
> > kernel codepath that is to be traced, anywhere. (which will be the case 
> > on 99% of the systems)
> 
> Using alternatives this could be near zero as well and it will likely 
> have less overhead when it's actually used.

if there are lots of tracepoints (and the union of _all_ useful 
tracepoints that i ever encountered in my life goes into the thousands) 
then the overhead is not zero at all.

also, the other disadvantages i listed very much count too. Static 
tracepoints are fundamentally limited because:

  - they can only be added at the source code level

  - modifying them requires a reboot which is not practical in a
    production environment

  - there can only be a limited set of them, while many problems need
    finegrained tracepoints tailored to the problem at hand

  - conditional tracepoints are typically either nonexistent or very
    limited.

for me these are all _independent_ grounds for rejection, as a generic 
kernel infrastructure.

> > the key point is that we want _zero_ "static tracepoints". Firstly, 
> > static tracepoints are fundamentally limited:
> 
> BTW I don't mind KProbes as an option, but I have a huge problem with 
> making it the only option.

i'm not arguing for SystemTap to be the only option (KProbes is just the 
infrastructure SystemTap is using - there are other uses for KProbes 
too), but i'm arguing against the inclusion of static tracepoints as an 
infrastructure, precisely because a much better option (SystemTap) is 
already available and is usable on the stock kernel. You are of course 
free to invent other, equally advantageous (or better) options.

> > But besides the usability problems, the most important problem is 
> > that static tracepoints add a _constant maintenance overhead_ to 
> > the kernel. I'm talking from first hand experience: i wrote 
> > 'iotrace' (a static tracer) in 1996 and have maintained it for many 
> > years, and even today i'm maintaining a handful of tracepoints in 
> > the -rt kernel. I _dont_ want static tracepoints in the mainline 
> > kernel.
> 
> Even dynamic tracepoints have a maintenance overhead and I doubt 
> there is much difference. The big problem is having to maintain them 
> outside the mainline kernel, that's why it's so important to get them 
> into the mainline kernel.

i dispute that: for example kernel/sched.c has zero maintenance 
overhead under SystemTap, while it's nonzero with static tracepoints. Of 
course SystemTap _itself_ has maintenance overhead, but it does not 
slow down any other subsystem's speed of progress.

> You didn't address my main issue at all - kprobes is only available 
> for a few archs...

the kprobes infrastructure, despite being fairly young, is widely 
available: powerpc, i386, x86_64, ia64 and sparc64. The other 
architectures are free to implement them too, there's nothing 
hardware-specific about kprobes and the "porting overhead" is in essence 
a one-time cost - while for static tracepoints the maintenance overhead 
goes on forever and scales linearly with the number of tracepoints 
added.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:13         ` Ingo Molnar
@ 2006-09-14 17:55           ` Roman Zippel
  2006-09-14 18:15             ` Ingo Molnar
  2006-09-14 18:12           ` Karim Yaghmour
  2006-09-14 20:25           ` Martin Bligh
  2 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 17:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> also, the other disadvantages i listed very much count too. Static 
> tracepoints are fundamentally limited because:
> 
>   - they can only be added at the source code level
> 
>   - modifying them requires a reboot which is not practical in a
>     production environment
> 
>   - there can only be a limited set of them, while many problems need
>     finegrained tracepoints tailored to the problem at hand
> 
>   - conditional tracepoints are typically either nonexistent or very
>     limited.
> 
> for me these are all _independent_ grounds for rejection, as a generic 
> kernel infrastructure.

Tracepoints of course need to be managed, but that's true for both dynamic 
and static tracepoints. Both have their advantages and disadvantages and 
just hammering on the possible problems of static ones (which are not much 
of a problem for other people) is highly unfair and not a reason for 
rejection. If you don't like them, don't use them, nobody forces you, it's 
that simple...

> > You didn't address my main issue at all - kprobes is only available 
> > for a few archs...
> 
> the kprobes infrastructure, despite being fairly young, is widely 
> available: powerpc, i386, x86_64, ia64 and sparc64. The other 
> architectures are free to implement them too, there's nothing 
> hardware-specific about kprobes and the "porting overhead" is in essence 
> a one-time cost - while for static tracepoints the maintenance overhead 
> goes on forever and scales linearly with the number of tracepoints 
> added.

kprobes are not trivial to implement (especially to reach the level of 
performance and flexibility of static tracepoints) and until then you deny 
their users/developers a useful tool? 
I also think you highly exaggerate the maintenance overhead of static 
tracepoints, once added they hardly need any maintenance, most of the 
time you can just ignore them. Only if the code drastically changes they 
need to be adjusted, but at that point this should be the smallest 
problem. The kernel is full of debug prints, do you seriously suggest to 
throw them out because of their "high maintenance"?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:55           ` Roman Zippel
@ 2006-09-14 18:15             ` Ingo Molnar
  2006-09-14 18:35               ` Mathieu Desnoyers
                                 ` (3 more replies)
  0 siblings, 4 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 18:15 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > for me these are all _independent_ grounds for rejection, as a generic 
> > kernel infrastructure.
> 
> Tracepoints of course need to be managed, but that's true for both 
> dynamic and static tracepoints. [...]

that's not true, and this is the important thing that i believe you are 
missing. A dynamic tracepoint is _detached_ from the normal source code 
and thus is zero maintenance overhead. You dont have to maintain it 
during normal development - only if you need it. You dont see the 
dynamic tracepoints in the source code.
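
As a rough user-space analogy only (real kprobes work differently, by 
planting a breakpoint at the probed instruction), the sketch below shows 
the two properties being argued for: the probed function carries no 
tracing code in its source, and a probe with a condition is attached and 
detached at run time. All names here (submit_io_sketch, submit_io_entry, 
attach_probe_sketch and so on) are hypothetical and exist only for this 
illustration:

```c
/* The probed function: its source contains no tracing code at all. */
int submit_io_sketch(int nbytes)
{
	return nbytes * 2;
}

/*
 * Callers go through this pointer. Redirecting it stands in, very
 * loosely, for patching the entry of the probed function at run time.
 */
int (*submit_io_entry)(int) = submit_io_sketch;

int probe_hits_sketch;       /* how often the probe condition matched */
int probe_min_bytes_sketch;  /* condition chosen when the probe is attached */

/* Probe wrapper: evaluate the condition, record a hit, run the original. */
int submit_io_probed_sketch(int nbytes)
{
	if (nbytes >= probe_min_bytes_sketch)
		probe_hits_sketch++;
	return submit_io_sketch(nbytes);
}

/* Attach a conditional probe without touching submit_io_sketch(). */
void attach_probe_sketch(int min_bytes)
{
	probe_min_bytes_sketch = min_bytes;
	submit_io_entry = submit_io_probed_sketch;
}

/* Detach the probe: callers reach the original function directly again. */
void detach_probe_sketch(void)
{
	submit_io_entry = submit_io_sketch;
}
```

A caller would invoke submit_io_entry(n) throughout; after 
attach_probe_sketch(100) only calls with n >= 100 are counted, and after 
detach_probe_sketch() nothing is counted. The function being traced never 
changes, which is the "detached" property in miniature.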

a static tracepoint, once it's in the mainline kernel, is a nonzero 
maintenance overhead _until eternity_. It is a constant visual 
hindrance and a constant build-correctness and boot-correctness problem 
if you happen to change the code that is being traced by a static 
tracepoint. Again, I am talking out of actual experience with static 
tracepoints: i frequently break my kernel via static tracepoints and i 
have constant maintenance cost from them. So what i do is that i try to 
minimize the number of static tracepoints to _zero_. I.e. i only add 
them when i need them for a given bug.

static tracepoints are inferior to dynamic tracepoints in almost every 
way.

> [...]  Both have their advantages and disadvantages and just hammering 
> on the possible problems of static ones [...]

how about giving a line by line rebuttal to the very real problems of 
static tracepoints i listed (twice already), instead of calling them 
"possible problems"?

i am giving a line by line rebuttal of all arguments that come up. 
Please be fair and do the same. Here are the arguments again, for a 
third time. Thanks!

> > also, the other disadvantages i listed very much count too. Static 
> > tracepoints are fundamentally limited because:
> > 
> >   - they can only be added at the source code level
> > 
> >   - modifying them requires a reboot which is not practical in a
> >     production environment
> > 
> >   - there can only be a limited set of them, while many problems need
> >     finegrained tracepoints tailored to the problem at hand
> > 
> >   - conditional tracepoints are typically either nonexistent or very
> >     limited.

> > the kprobes infrastructure, despite being fairly young, is widely 
> > available: powerpc, i386, x86_64, ia64 and sparc64. The other 
> > architectures are free to implement them too, there's nothing 
> > hardware-specific about kprobes and the "porting overhead" is in 
> > essence a one-time cost - while for static tracepoints the 
> > maintenance overhead goes on forever and scales linearly with the 
> > number of tracepoints added.
> 
> kprobes are not trivial to implement [...]

nor are smp-alternatives, which was suggested as a solution to reduce 
the overhead of static tracepoints. So what's the point? It's a one-off 
development overhead that has already been done for all the major 
arches. If another arch needs it they can certainly implement it.

it's like arguing against ptrace on the grounds of: "application 
developers can add printf if they want to debug their apps, or they can 
add static tracepoints too, and besides, ptrace is hard to implement".

> I also think you highly exaggerate the maintenance overhead of static 
> tracepoints, once added they hardly need any maintenance, most of the 
> time you can just ignore them. [...]

hundreds (or possibly thousands) of tracepoints? Have you ever tried to 
maintain that? I have and it's a nightmare.

Even assuming a rich set of hundreds of static tracepoints, it doesnt 
even solve the problems at hand: people want to do much more when they 
probe the kernel - and today, with DTrace under Solaris people _know_ 
that much better tracing _can be done_, and they _demand_ that Linux 
adopts an intelligent solution. The clock is ticking for dinosaurs like 
static printks and static tracepoints to debug the kernel...

> [...] The kernel is full of debug prints, do you seriously suggest to 
> throw them out because of their "high maintenance"?

oh yes, these days i frequently throw them out when i find them in code 
i modify. (my most recent such zap was rwsemtrace()). Also, obviously 
when most of them were added we didnt have good kernel debugging 
infrastructure (in fact we didnt have any kernel debugging 
infrastructure besides printk), so _something_ had to be used back then. 
But today there's little reason to keep them. Welcome to 2006 :-)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:15             ` Ingo Molnar
@ 2006-09-14 18:35               ` Mathieu Desnoyers
  2006-09-14 18:54               ` Karim Yaghmour
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 18:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> that's not true, and this is the important thing that i believe you are 
> missing. A dynamic tracepoint is _detached_ from the normal source code 
> and thus is zero maintenance overhead. You dont have to maintain it 
> during normal development - only if you need it. You dont see the 
> dynamic tracepoints in the source code.
> 

What happens if someone needs trace points in "normal kernel development" (which
appears to be the case, see blktrace and latency tracer) ?

> a static tracepoint, once it's in the mainline kernel, is a nonzero 
> maintenance overhead _until eternity_. It is a constant visual 
> hindrance and a constant build-correctness and boot-correctness problem 
> if you happen to change the code that is being traced by a static 
> tracepoint. Again, I am talking out of actual experience with static 
> tracepoints: i frequently break my kernel via static tracepoints and i 
> have constant maintenance cost from them. So what i do is that i try to 
> minimize the number of static tracepoints to _zero_. I.e. i only add 
> them when i need them for a given bug.
> 

What kind of code are you calling from your instrumentation sites to break your
kernel so easily ? Or perhaps are you instrumenting the page fault handler
which, yes, can have side effects? My goal is exactly to provide the kind of
code that can be called from any kernel site without breaking it!


Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:15             ` Ingo Molnar
  2006-09-14 18:35               ` Mathieu Desnoyers
@ 2006-09-14 18:54               ` Karim Yaghmour
  2006-09-15  9:20                 ` Jes Sorensen
  2006-09-14 19:40               ` Tim Bird
  2006-09-14 19:47               ` Roman Zippel
  3 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 18:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> that's not true, and this is the important thing that i believe you are 
> missing. A dynamic tracepoint is _detached_ from the normal source code 
> and thus is zero maintenance overhead. You dont have to maintain it 
> during normal development - only if you need it. You dont see the 
> dynamic tracepoints in the source code.

And that's actually a problem for those who maintain such dynamic
trace points.

> a static tracepoint, once it's in the mainline kernel, is a nonzero 
> maintenance overhead _until eternity_. It is a constant visual 
> hindrance and a constant build-correctness and boot-correctness problem 
> if you happen to change the code that is being traced by a static 
> tracepoint. Again, I am talking out of actual experience with static 
> tracepoints: i frequently break my kernel via static tracepoints and i 
> have constant maintenance cost from them. So what i do is that i try to 
> minimize the number of static tracepoints to _zero_. I.e. i only add 
> them when i need them for a given bug.

Bzzt, wrong. This is your own personal experience with tracing. Marked
up code does not need to be active under all build conditions. In
fact trace points can be inactive by default at all times, except
when you choose to build them in.

And as I said elsewhere, your use of instrumentation is solely for
debugging ("i only add them when i need them for a given bug"). I
repeat that there are mortals out there who need this for their
applications.

> static tracepoints are inferior to dynamic tracepoints in almost every 
> way.

Sorry, orthogonal is the word.

> hundreds (or possibly thousands) of tracepoints? Have you ever tried to 
> maintain that? I have and it's a nightmare.

I have, and I've shown you that you're wrong. The only reason you can
make this argument is that you view these things from the point of view
of what use they are for you as a kernel developer and I will repeat
what I've said for years now: static instrumentation of the kernel
isn't meant to be useful for kernel developers. While it may indeed
be in some cases, in most cases it's likely useless, as you've been
very successfully arguing in this thread. Nevertheless there are very
legitimate uses for standardized instrumentation points.

> Even assuming a rich set of hundreds of static tracepoints, it doesnt 
> even solve the problems at hand: people want to do much more when they 
> probe the kernel - and today, with DTrace under Solaris people _know_ 
> that much better tracing _can be done_, and they _demand_ that Linux 
> adopts an intelligent solution. The clock is ticking for dinosaurs like 
> static printks and static tracepoints to debug the kernel...

Thank you, I couldn't have put it better. This paragraph, more than
any other snippet I've seen to date, clearly demonstrates why
tracing is such a contentious issue. Kernel developers use tracing
during their normal development process, and of course their gut
reaction is: why the hell would anybody need this for mainline? But
of course this misses the entire point. Kernel tracing for developers
is but a corner case of kernel tracing in general. There are very valid
and legitimate reasons for userspace to be able to obtain important
events. And of course any infrastructure developed with that in
mind should also be usable by kernel developers.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:54               ` Karim Yaghmour
@ 2006-09-15  9:20                 ` Jes Sorensen
  2006-09-15 12:38                   ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15  9:20 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

>>>>> "Karim" == Karim Yaghmour <karim@opersys.com> writes:

Karim> Ingo Molnar wrote:
>> that's not true, and this is the important thing that i believe you
>> are missing. A dynamic tracepoint is _detached_ from the normal
>> source code and thus is zero maintenance overhead. You dont have
>> to maintain it during normal development - only if you need it. You
>> dont see the dynamic tracepoints in the source code.

Karim> And that's actually a problem for those who maintain such
Karim> dynamic trace points.

And who should pay here? The people who want the tracepoints or the
people who are not interested in them?

>> a static tracepoint, once it's in the mainline kernel, is a nonzero
>> maintenance overhead _until eternity_. It is a constant visual
>> hindrance and a constant build-correctness and boot-correctness
>> problem if you happen to change the code that is being traced by a
>> static tracepoint. Again, I am talking out of actual experience
>> with static tracepoints: i frequently break my kernel via static
>> tracepoints and i have constant maintenance cost from them. So
>> what i do is that i try to minimize the number of static
>> tracepoints to _zero_. I.e. i only add them when i need them for a
>> given bug.

Karim> Bzzt, wrong. This is your own personal experience with
Karim> tracing. Marked up code does not need to be active under all
Karim> build conditions. In fact trace points can be inactive by
Karim> default at all times, except when you choose to build them in.

You have obviously never tried to maintain a codebase for a long
time. Even if the code is not activated, you make a change and
something breaks and people come running and screaming, or the thing
is in the way for the structural code change you want to make.

Not to mention that some of the classical places people wish to add
those static tracepoints are in performance sensitive codepaths,
syscalls for example.

>> static tracepoints are inferior to dynamic tracepoints in almost
>> every way.

Karim> Sorry, orthogonal is the word.

You can do pretty much everything you want to do with dynamic
tracepoints, it's just a matter of whether you want to dump the burden
of maintenance on someone else. Been there done that, had to show
people in the past how to do with dynamic points what they insisted
had to be done with static points.

>> hundreds (or possibly thousands) of tracepoints? Have you ever
>> tried to maintain that? I have and it's a nightmare.

Karim> I have, and I've shown you that you're wrong. The only reason
Karim> you can make this argument is that you view these things from
Karim> the point of view of what use they are for you as a kernel
Karim> developer and I will repeat what I've said for years now:
Karim> static instrumentation of the kernel isn't meant to be useful
Karim> for kernel developers.

So you maintain the tracepoints in the kernel and you are offering to
take over maintenance of all code that now contains these tracepoints?
You add your static tracepoints, next week someone else wants some
very similar but slightly different points, the following week it's
someone else. Thanks, but no thanks.

Karim> Nevertheless there are
Karim> very legitimate uses for standardized instrumentation points.

Some evidence would be useful here, so far you haven't provided any.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15  9:20                 ` Jes Sorensen
@ 2006-09-15 12:38                   ` Karim Yaghmour
  2006-09-15 12:32                     ` Jes Sorensen
  2006-09-15 13:20                     ` Paul Mundt
  0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 12:38 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> Karim> And that's actually a problem for those who maintain such
> Karim> dynamic trace points.
> 
> And who should pay here? The people who want the tracepoints or the
> people who are not interested in them?

If you'd care to read through the thread you'd notice I've demonstrated
time and again that those static trace points we're mostly interested
in are never-changing. Unless something fundamentally changes with the
kernel, there will always be a scheduling change; etc. This
"instrumentation is evil" mantra is only substantiated if you view
it from the point of view of someone who's only used it to debug code.
Yet, and I repeat this again, instrumentation for in-source debugging
is but a corner case of instrumentation in general.

> You have obviously never tried to maintain a codebase for a long
> time.

Please, this is not constructive. I've never really grasped the need
for posturing on LKML. Jes, I'm not going to fight a war of resumes
with you. If you think I'm incompetent then there's very little I can
do to change your mind.

> Not to mention that some of the classical places people wish to add
> those static tracepoints are in performance sensitive codepaths,
> syscalls for example.

And this argument ignores everything I said on how there does not need
to be the limitation known from previous static tracing mechanisms.

> You can do pretty much everything you want to do with dynamic
> tracepoints, it's just a matter of whether you want to dump the burden
> of maintenance on someone else. Been there done that, had to show
> people in the past how to do with dynamic points what they insisted
> had to be done with static points.

Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
get away with this argument is if you view it exclusively from the
point of view of kernel development. And that's why you're wrong.

> So you maintain the tracepoints in the kernel and you are offering to
> take over maintenance of all code that now contains these tracepoints?

Please explain, honestly, why the following instrumentation point is
going to be a maintenance drag on the person modifying the scheduler:
@@ -1709,6 +1712,7 @@ switch_tasks:
   		++*switch_count;

   		prepare_arch_switch(rq, next);
+		TRACE_SCHEDCHANGE(prev, next);
   		prev = context_switch(rq, prev, next);
   		barrier();

And please, don't bother complaining about the semantics, they can
be changed. I'm just arguing about location/meaning/content.
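
To make the cost question concrete, here is a minimal user-space sketch 
(not code from any posted patch) of how a marker such as 
TRACE_SCHEDCHANGE could vanish from the build unless an option is set. 
The names CONFIG_TRACE_SKETCH, TRACE_SCHEDCHANGE_SKETCH, struct 
task_sketch and trace_events_recorded are hypothetical and chosen only 
for illustration:

```c
/* Hypothetical stand-in for the kernel's task structure. */
struct task_sketch {
	int pid;
};

/* Count of recorded events; only incremented when tracing is built in. */
unsigned long trace_events_recorded;

#ifdef CONFIG_TRACE_SKETCH
/* Tracing built in: the marker records one event per call. */
void trace_schedchange_record(const struct task_sketch *prev,
			      const struct task_sketch *next)
{
	(void)prev;
	(void)next;
	trace_events_recorded++;
}
#define TRACE_SCHEDCHANGE_SKETCH(prev, next) \
	trace_schedchange_record((prev), (next))
#else
/* Tracing not built in: the marker is a no-op that only marks its
 * arguments as used, so no call is emitted at the instrumented site. */
#define TRACE_SCHEDCHANGE_SKETCH(prev, next) \
	do { (void)(prev); (void)(next); } while (0)
#endif

/* Toy "context switch" with the marker at the instrumented site. */
const struct task_sketch *switch_tasks_sketch(const struct task_sketch *prev,
					      const struct task_sketch *next)
{
	TRACE_SCHEDCHANGE_SKETCH(prev, next);
	return next;
}
```

Built without -DCONFIG_TRACE_SKETCH the counter stays at zero and the 
marker costs nothing at run time; built with it, every call to 
switch_tasks_sketch() records one event. The line in the source stays 
either way, which is the part of the maintenance argument a sketch like 
this does not settle.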

> You add your static tracepoints, next week someone else wants some
> very similar but slightly different points, the following week it's
> someone else. Thanks, but no thanks.

Obviously there's no point in me spelling out any code of conduct to
anyone, Martin has already pointed out that it's up to the subsystem
maintainers to decide what's appropriate and what's not, as is
customary anyway. But the issue I'm putting forth here is that there
is value for allowing outsiders to understand the dynamic behavior of
your code and the only person who can do that best is the person
writing the code. It is then that person's responsibility to
distinguish between instrumentation they may find important to debug
their code and instrumentation that would be relevant to those using
their code. And if you've maintained code long enough, and I trust
you have, you would see that there is a clear difference between both.

Thanks,

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:38                   ` Karim Yaghmour
@ 2006-09-15 12:32                     ` Jes Sorensen
  2006-09-15 14:09                       ` Karim Yaghmour
  2006-09-15 13:20                     ` Paul Mundt
  1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 12:32 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
>> And who should pay here? The people who want the tracepoints or the
>> people who are not interested in them?
> 
> If you'd care to read through the thread you'd notice I've demonstrated
> time and again that those static trace points we're mostly interested
> in are never-changing. Unless something fundamentally changes with the
> kernel, there will always be a scheduling change; etc.

Except as I pointed out, that everyone wants their info slightly
differently so even trace points in the scheduler will be contentious
and we will end up with a stack of them if we are to satisfy everyone.
So no, you didn't demonstrate anything.

> This
> "instrumentation is evil" mantra is only substantiated if you view
> it from the point of view of someone who's only used it to debug code.
> Yet, and I repeat this again, instrumentation for in-source debugging
> is but a corner case of instrumentation in general.

Given that I have used this stuff to more than just debug code, then
this obviously doesn't apply.

>> You have obviously never tried to maintain a codebase for a long
>> time.
> 
> Please, this is not constructive. I've never really grasped the need
> for posturing on LKML. Jes, I'm not going to fight a war of resumes
> with you. If you think I'm incompetent then there's very little I can
> do to change your mind.

You refuse to take the big picture into account and then claim that
there is no cost of doing things your way. Point being that once you
start maintaining a large project such as the kernel, or just parts of
it, you realize how much those 'zero cost' additions really cost.

>> Not to mention that some of the classical places people wish to add
>> those static tracepoints are in performance sensitive codepaths,
>> syscalls for example.
> 
> And this argument ignores everything I said on how there does not need
> to be the limitation known from previous static tracing mechanisms.

And how does there not? If you want to add tracepoints to the syscall
path, then you will make an impact. It's non trivial to validate, yes
I have seen some scary attempts of adding LTT tracecalls to the ia64
syscall path, and just because it might not be compiled in in most cases
that doesn't mean it doesn't raise the complexity.

>> You can do pretty much everything you want to do with dynamic
>> tracepoints, it's just a matter of whether you want to dump the burden
>> of maintenance on someone else. Been there done that, had to show
>> people in the past how to do with dynamic points what they insisted
>> had to be done with static points.
> 
> Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
> get away with this argument is if you view it exclusively from the
> point of view of kernel development. And that's why you're wrong.

As I said, kprobes are much more than kernel development! But you
obviously haven't bothered looking at those properly! Been there done
that!

>> So you maintain the tracepoints in the kernel and you are offering to
>> take over maintenance of all code that now contains these tracepoints?
> 
> Please explain, honestly, why the following instrumentation point is
> going to be a maintenance drag on the person modifying the scheduler:
> @@ -1709,6 +1712,7 @@ switch_tasks:
>    		++*switch_count;
> 
>    		prepare_arch_switch(rq, next);
> +		TRACE_SCHEDCHANGE(prev, next);
>    		prev = context_switch(rq, prev, next);
>    		barrier();
> 
> And please, don't bother complaining about the semantics, they can
> be changed. I'm just arguing about location/meaning/content.

It will be a drag because next week someone else wants a tracepoint
5 lines further down the code! Again, I have seen people try and do
that on top of the old LTT patchsets, so maybe *you* didn't want the
tracepoint somewhere else, but some people did! Next?

>> You add your static tracepoints, next week someone else wants some
>> very similar but slightly different points, the following week it's
>> someone else. Thanks, but no thanks.
> 
> Obviously there's no point in me spelling out any code of conduct to
> anyone, Martin has already pointed out that it's up to the subsystem
> maintainers to decide what's appropriate and what's not, as is
> customary anyway. But the issue I'm putting forth here is that there
> is value for allowing outsiders to understand the dynamic behavior of
> your code and the only person who can do that best is the person
> writing the code. It is then that person's responsibility to
> distinguish between instrumentation they may find important to debug
> their code and instrumentation that would be relevant to those using
> their code. And if you've maintained code long enough, and I trust
> you have, you would see that there is a clear difference between both.

You are once again ignoring the point that not everyone needs the exact
same view of things that you are looking for. Dynamic probes allow for
that, doing that with static probes is going to turn into maintenance
hell. Guess what, some of us still try to look after code 8-10 years
after we wrote it initially.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:32                     ` Jes Sorensen
@ 2006-09-15 14:09                       ` Karim Yaghmour
  2006-09-15 14:30                         ` Jes Sorensen
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:09 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> Except as I pointed out, that everyone wants their info slightly
> differently so even trace points in the scheduler will be contentious
> and we will end up with a stack of them if we are to satisfy everyone.
> So no, you didn't demonstrate anything.

There is in my view, and this is what this whole debate is really
about, a clear difference between the types of instrumentation
being added. Clearly in the view of others there just isn't. But
bear with me. I submit to you that there are 3 classes of trace
points:

- OS-class: These are trace points which will be found in a given
  kernel regardless of how it is implemented if it belongs to a
  certain family of OSes. Linux being made to mimic Unix, it will
  always have key events. And if you look closely at the initial
  set of points added by ltt, these would be found in any Unix.
  It's not for nothing that my paper on ltt was accepted at Usenix
  2000 - and in fact during the question period somebody asked how
  easy it would be to port it to BSD, and the answer: trivial.

- Subsystem-class: These are trace points which are specific to
  a given implementation. Say block tracing, scsi tracing, etc. as
  they are implemented in Linux. The purpose of these is to allow
  a user of these given subsystems to get more in-depth understanding
  of what's happening inside the box.

- Debug-class: These are trace points required to find difficult
  problems such as race-conditions/etc. which are needed to debug
  the OS.

I'm not arguing for the inclusion of debug tracepoints. I can see
that within a given subsystem there can be disagreement over the
placement of specific tracepoints, and this is where I think your
argument lies and it is not without merit - IOW such tracepoints
should be more carefully scrutinized. However, there are OS-class
tracepoints for which I hardly see any possible debate either in
terms of usefulness or in terms of maintainability.

> Given that I have used this stuff to more than just debug code, then
> this obviously doesn't apply.
...
> You refuse to take the big picture into account and then claim that
> there is no cost of doing things your way. Point being that once you
> start maintaining a large project such as the kernel, or just parts of
> it, you realize how much those 'zero cost' additions really cost.

Someone else alluded to the parallel between in-code comments and
documentation maintained separately. There is a cost to in-code
instrumentation in the same way that there is to in-code documentation.
And they, in fact, are very much alike.

> And how does there not? If you want to add tracepoints to the syscall
> path, then you will make an impact. It's non trivial to validate, yes
> I have seen some scary attempts of adding LTT tracecalls to the ia64
> syscall path, and just because it might not be compiled in in most cases
> that doesn't mean it doesn't raise the complexity.

Again, this is an implementation issue. If we have a way to mark-up
code, then we can at least "hide" much of the scary stuff.
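
As a rough illustration of the mark-up idea, here is a minimal user-space
sketch, not code from the LTT or LTTng patches. The names TRACE_MARK,
CONFIG_TRACE_SKETCH, trace_hook and do_work are all hypothetical. It assumes
a marker that disappears entirely when the build-time switch is off and that
costs one pointer test when it is on but no tracer is attached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch; not a real kernel config symbol. */
#ifndef CONFIG_TRACE_SKETCH
#define CONFIG_TRACE_SKETCH 1
#endif

typedef void (*trace_hook_fn)(const char *name, long arg);

/* NULL means no tracer is attached. */
static trace_hook_fn trace_hook = NULL;

#if CONFIG_TRACE_SKETCH
/* Marker compiled in: call the hook only if one is attached. */
#define TRACE_MARK(name, arg)                     \
	do {                                      \
		trace_hook_fn fn_ = trace_hook;   \
		if (fn_)                          \
			fn_((name), (long)(arg)); \
	} while (0)
#else
/* Marker compiled out: no call is emitted at the marked site. */
#define TRACE_MARK(name, arg) do { (void)(arg); } while (0)
#endif

static long events_seen;
static long last_arg;

/* Example tracer: counts events and remembers the last argument. */
static void counting_hook(const char *name, long arg)
{
	assert(name != NULL);
	events_seen++;
	last_arg = arg;
}

static void trace_attach(trace_hook_fn fn) { trace_hook = fn; }
static void trace_detach(void) { trace_hook = NULL; }

/* Stand-in for an instrumented code path. */
static long do_work(long x)
{
	TRACE_MARK("work_entry", x);
	return x * 2;
}
```

Calling do_work() before trace_attach(counting_hook) leaves events_seen
untouched; after attaching, each call bumps it by one. This says nothing
about alignment or code-size constraints on a real syscall path.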

> As I said, kprobes are much more than kernel development! But you
> obviously haven't bothered looking at those properly! Been there done
> that!

I have, and taking an int3 on every tracepoint wasn't to my liking, nor
was having to chase kernel versions for binary editing. If I was going
to do maintenance I was much happier to work with source than binary.

> It will be a drag because next week someone else wants a tracepoint
> 5 lines further down the code! Again, I have seen people try and do
> that on top of the old LTT patchsets, so maybe *you* didn't want the
> tracepoint somewhere else, but some people did! Next?

Not if you understand the distinction I am making above.

Now, I can understand that you may think: Karim, nobody is going to
fsck'ing care about the distinction you're making once this is in
the kernel. But for me this is a separate, but yet entirely relevant,
part of the debate. The argument here has already been pointed out
elsewhere: There are already subsystem maintainers and they are more
than capable of taking the appropriate decisions. The distinction I
make above is not esoteric.

> You are once again ignoring the point that not everyone needs the exact
> same view of things that you are looking for. Dynamic probes allows for
> that, doing that with static probes is going to turn into maintenance
> hell. Guess what, some of us still try to look after code 8-10 years
> after we wrote it initially.

I'm not ignoring that people have different needs. I'm being depicted
as endorsing static traces all over the place, and I'm not advocating
such a course of action. The only reason any argument against static
instrumentation can be made is if you consider it from the debug
point of view and what drag such instrumentation would have. There is
a big difference of purpose and of persistent-relevance between
debug instrumentation and OS-class instrumentation. It's entirely
disingenuous to suggest otherwise.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:09                       ` Karim Yaghmour
@ 2006-09-15 14:30                         ` Jes Sorensen
  2006-09-15 15:12                           ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:30 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
> There is in my view, and this is what this whole debate is really
> about, a clear difference in between the type of instrumentation
> being added. Clearly in the view of others there just isn't. But
> bear with me. I submit to you that there are 3 classes of trace
> points:
> 
> - OS-class: These are trace points which will be found in a given
>   kernel regardless of how it is implemented if it belongs to a
>   certain family of OSes. Linux being made to mimic Unix, it will
>   always have key events. And if you look closely at the initial
>   set of points added by ltt, these would be found in any Unix.
>   It's not for nothing that my paper on ltt was accepted at Usenix
>   2000 - and in fact during the question period somebody asked how
>   easy it would be to port it to BSD, and the answer: trivial.

There are very few tracepoints in this category; the only things you can
claim are more or less generic are syscalls, and tracing syscall
handling is tricky.

> - Subsystem-class: These are trace points which are specific to
>   a given implementation. Say block tracing, scsi tracing, etc. as
>   they are implemented in Linux. The purpose of these is to allow
>   a user of these given subsystems to get more in-depth understanding
>   of what's happening inside the box.

This is grossly oversimplifying things, which is why the whole thing doesn't
hold water. There is no such thing as 'the place' to put a specific
tracepoint.

Especially when we start talking about things like tracepoints in the
scheduler.

Note that I haven't been referring to debug tracepoints at any point in
this debate.

>> It will be a drag because next week someone else wants a tracepoint
>> 5 lines further down the code! Again, I have seen people try and do
>> that on top of the old LTT patchsets, so maybe *you* didn't want the
>> tracepoint somewhere else, but some people did! Next?
> 
> Not if you understand the distinction I am making above.

Your distinction above doesn't hold water, but I did understand it
very well ....

You seem to think that it's fine to add instrumentation in the syscall
path as an example as long as it's compiled out. Well on some
architectures, the syscall path is very sensitive to alignment and there
may be restrictions on how large the stub of code is allowed to be, like
a few hundred bytes. Just because things work one way on x86, doesn't
mean they work like that everywhere.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:30                         ` Jes Sorensen
@ 2006-09-15 15:12                           ` Karim Yaghmour
  2006-09-16 10:41                             ` Jes Sorensen
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:12 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> There are very few tracepoints in this category,

Wow, that's progress.

> the only things you can
> claim are more or less generic are syscalls, and tracing syscall
> handling is tricky.

If there are implementation issues, I trust an adequate solution can be
found by using the tested-and-proven method of posting stuff on the
lkml for review.

> This is grossly oversimplifying things, which is why the whole thing doesn't
> hold water. There is no such thing as 'the place' to put a specific
> tracepoint.
> 
> Especially when we start talking about things like tracepoints in the
> scheduler.

I do not underestimate the difficulty of selecting such tracepoints.
This is why I chose not to maintain other people's specific tracepoints.
I realize this is a tough problem, but I also trust subsystem maintainers
are smart enough to make the appropriate decision. Obviously for such
things like the scheduler, any fine-grained instrumentation will draw
a barrage of criticism from anyone since a lot of stuff depends on it.
Either the lkml process works or it doesn't, but it isn't for me to
decide.

> Note that I haven't been referring to debug tracepoints at any point in
> this debate.

You're right, but others have happily intermingled the whole lot, and
I just wanted to document my personal categorization on lkml for all
to see.

> You seem to think that it's fine to add instrumentation in the syscall
> path as an example as long as it's compiled out. Well on some
> architectures, the syscall path is very sensitive to alignment and there
> may be restrictions on how large the stub of code is allowed to be, like
> a few hundred bytes. Just because things work one way on x86, doesn't
> mean they work like that everywhere.

If ltt failed to implement such things appropriately, then we apologize.
That fact doesn't preclude proper implementation in the future, however.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 15:12                           ` Karim Yaghmour
@ 2006-09-16 10:41                             ` Jes Sorensen
  2006-09-16 15:28                               ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 10:41 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
>> There are very few tracepoints in this category,
> 
> Wow, that's progress.

Karim,

A personal question: do you feel that being patronising and insulting
is in any way going to put your LTT project in a better light? It
certainly makes it a lot harder for many of us to take your arguments
seriously.

>> the only things you can claim are more or less generic are syscalls,
>> and tracing syscall handling is tricky.
> 
> If there are implementation issues, I trust an adequate solution can be
> found by using the tested-and-proven method of posting stuff on the
> lkml for review.

And how is this going to solve the case where trace code in the syscall
path has a negative impact on cacheline utilization and alignment, even
when the trace data is not being used?

>> This is grossly oversimplifying things, which is why the whole thing doesn't
>> hold water. There is no such thing as 'the place' to put a specific
>> tracepoint.
[snip]
> I do not underestimate the difficulty of selecting such tracepoints.
> This is why I chose not to maintain other people's specific tracepoints.
> I realize this is a tough problem, but I also trust subsystem maintainers
> are smart enough to make the appropriate decision.

So you are back to saying that trace data other people wish to collect
are uninteresting and therefore should just be ignored? If not, what you
are saying there otherwise just backs up the argument that if LTT or
something similar goes into mainline, we will see the amount of
tracepoints grow significantly.

>> You seem to think that it's fine to add instrumentation in the syscall
>> path as an example as long as it's compiled out. Well on some
>> architectures, the syscall path is very sensitive to alignment and there
>> may be restrictions on how large the stub of code is allowed to be, like
>> a few hundred bytes. Just because things work one way on x86, doesn't
>> mean they work like that everywhere.
> 
> If ltt failed to implement such things appropriately, then we apologize.
> That fact doesn't preclude proper implementation in the future, however.

Please read what I wrote above! Touching the syscall path with static
tracepoints is costly and has side effects! The argument that things can
be compiled out is just pointless; end users do not recompile kernels at
random, and many of the 'end user' cases where people wish to visualize
trace data are running on precompiled vendor kernels. Recompiling the
kernel and rebooting is not an option here!

In fact, the users who wish to trace data in self-compiled kernels are a
tiny subset of the potential userbase for this stuff which is primarily
useful to developers .... which in turn makes your argument about debug
tracepoints irrelevant since you are turning all the tracepoints into
debug tracepoints :)

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 10:41                             ` Jes Sorensen
@ 2006-09-16 15:28                               ` Karim Yaghmour
  2006-09-18  8:57                                 ` Jes Sorensen
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 15:28 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> A personal question: do you feel that being patronising and insulting
> is in any way going to put your LTT project in a better light? It
> certainly makes it a lot harder for many of us to take your arguments
> seriously.

ltt isn't *mine* anymore, somebody else is maintaining it at this point,
and it remains to be seen whether any of my input in this thread is:
a) appreciated by them, b) agreed by them.

With regards to the tone of the thread, then please at least read other
people's approach to me, including yourself. I think the casual observer
will see that there was a great deal of animosity aimed at me personally.
I'll admit to being sarcastic and biting back. But that's hardly alien
to lkml.

> And how is this going to solve the case where trace code in the syscall
> path has a negative impact on cacheline utilization and alignment, even
> when the trace data is not being used?

Hmm... and then compare that to the negative impact of kprobes at runtime.
Of course if we could override the syscall table your point disappears.
That's not how ltt does it now, but it could easily be done otherwise.
All the syscall implementations I've looked at so far in Linux involve
a table. If the base of this table were a dynamically modifiable entry,
the problem would be solved, wouldn't it?
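
To make the idea concrete, here is a minimal user-space sketch of dispatch
through a table whose base pointer can be swapped at run time. It is only an
illustration under stated assumptions: plain_table, traced_table, call_table
and dispatch are hypothetical names, this is not how ltt does it, and the
sketch ignores entry-path alignment, locking and anything arch-specific.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CALLS 2

typedef long (*call_fn)(long arg);

/* Plain handlers, standing in for real system calls. */
static long sys_double(long a) { return a * 2; }
static long sys_negate(long a) { return -a; }

static long trace_count;

/* Traced wrappers: record the event, then run the plain handler. */
static long traced_double(long a) { trace_count++; return sys_double(a); }
static long traced_negate(long a) { trace_count++; return sys_negate(a); }

static call_fn plain_table[NR_CALLS]  = { sys_double, sys_negate };
static call_fn traced_table[NR_CALLS] = { traced_double, traced_negate };

/* The modifiable base: every dispatch goes through this pointer. */
static call_fn *call_table = plain_table;

static void tracing_enable(void)  { call_table = traced_table; }
static void tracing_disable(void) { call_table = plain_table; }

/* Look up the handler by number in whichever table is current. */
static long dispatch(unsigned int nr, long arg)
{
	assert(nr < NR_CALLS);
	return call_table[nr](arg);
}
```

With tracing disabled, dispatch() runs exactly the plain handlers and
trace_count stays put; tracing_enable() redirects every call through the
wrappers without touching the handlers themselves.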

> So you are back to saying that trace data other people wish to collect
> are uninteresting and therefore should just be ignored? If not, what you
> are saying there otherwise just backs up the argument that if LTT or
> something similar goes into mainline, we will see the amount of
> tracepoints grow significantly.

I've explained earlier the difference in between these things.

> Please read what I wrote above! Touching the syscall path with static
> tracepoints is costly and has side effects! The argument that things can
> be compiled out is just pointless; end users do not recompile kernels at
> random, and many of the 'end user' cases where people wish to visualize
> trace data are running on precompiled vendor kernels. Recompiling the
> kernel and rebooting is not an option here!

It is for some. And please stop repeating the syscall path stuff. It can
be solved elegantly. The fact that it hasn't up to this point is only an
excuse to keep working harder on it. There is, in fact, no reason that
the solution may not just be a combination of static markup and dynamic
modification.

> In fact, the users who wish to trace data in self-compiled kernels are a
> tiny subset of the potential userbase for this stuff which is primarily
> useful to developers .... which in turn makes your argument about debug
> tracepoints irrelevant since you are turning all the tracepoints into
> debug tracepoints :)

How many embedded Linux projects did you personally work on?

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 15:28                               ` Karim Yaghmour
@ 2006-09-18  8:57                                 ` Jes Sorensen
  2006-09-18 14:48                                   ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:57 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> It is for some. And please stop repeating the syscall path stuff. It can
> be solved elegantly. The fact that it hasn't up to this point is only an
> excuse to keep working harder on it. There is, in fact, no reason that
> the solution may not just be a combination of static markup and dynamic
> modification.

You just don't want to listen, this is *not* a question of a modifiable
table or not. It's a question of *how* code needs to be added to the
syscall path, we both know why a modifiable table is not going to
happen. How do you plan to handle vdso based syscalls with LTT?

>> In fact, the users who wish to trace data in self-compiled kernels are a
>> tiny subset of the potential userbase for this stuff which is primarily
>> useful to developers .... which in turn makes your argument about debug
>> tracepoints irrelevant since you are turning all the tracepoints into
>> debug tracepoints :)
> 
> How many embedded Linux projects did you personally work on?

You know what, I give up. Your primary interest seems to be in attacking
people personally because they didn't start out jumping up and down
clapping their hands in support of your pet project. Even if I wanted to
I couldn't tell you about the number of different projects I have
worked on, partly because I can't remember half of them, partly because of
contract limitations, and most importantly because I do not need to
justify my experience to you.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18  8:57                                 ` Jes Sorensen
@ 2006-09-18 14:48                                   ` Ingo Molnar
  2006-09-18 15:37                                     ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-18 14:48 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: karim, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen <jes@sgi.com> wrote:

> >> tiny subset of the potential userbase for this stuff which is primarily
> >> useful to developers .... which in turn makes your argument about debug
> >> tracepoints irrelevant since you are turning all the tracepoints into
> >> debug tracepoints :)
> > 
> > How many embedded Linux projects did you personally work on?
> 
> You know what, I give up. Your primary interest seems to be in 
> attacking people personally because they didn't start out jumping up 
> and down clapping their hands in support of your pet project. [...]

i'm giving up on Karim too. I did apologize to Karim for the mistake i
made in this thread-of-200-mails, but it's revolting to see that Karim
still goes on and attacks top Linux contributors like you, without 
looking back, without apologizing for anything and without feeling any 
remorse. Karim patronized, attacked and insulted various people dozens 
of times in this thread alone. I just don't see any value in trying to
"work with" Karim anymore, because it's apparently not something he is 
interested in doing. I feel a bit sorry for him too, because at heart he 
must be a deeply lonely person.

( I do see value in working with Mathieu, who has shown a lot of insight,
  patience, ability in cleaning up the LTT codebase and producing LTTng.
  I don't envy him for having to work with Karim though. LTTng still
  needs a lot of work to be upstream-acceptable but my current impression
  is that Mathieu's fundamentally professional approach will be 
  successful. )

> > How many embedded Linux projects did you personally work on?
> >
> [...] Even if I wanted to I couldn't tell you about the number of 
> different projects I have worked on, partly because I can't remember half
> of them, partly because of contract limitations, and most importantly
> because I do not need to justify my experience to you.

you don't need to justify your experience to Karim. Your countless
contributions to the Linux kernel speak for themselves. Most tellingly, 
his boasting aside, the only embedded-related Linux kernel contribution 
i have ever seen from Karim was the 1000-lines relayfs code - and even 
that code took years for Tom Zanussi to clean up and to get upstream. 
Besides that i have not seen a single line of code from Karim - not a 
single patch, not a oneliner fix, nothing. So if someone needs to prove 
his experience in embedded Linux matters on this forum then it's Karim.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18 14:48                                   ` Ingo Molnar
@ 2006-09-18 15:37                                     ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-18 15:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais


Trust me, I don't intend to drag this any longer. I just want to make
sure this issue of "respect" is cleared up.

Ingo Molnar wrote:
> i'm giving up on Karim too. I did apologize to Karim for the mistake i
> made in this thread-of-200-mails, but it's revolting to see that Karim
> still goes on and attacks top Linux contributors like you, without 
> looking back, without apologizing for anything and without feeling any 
> remorse.

If there exists a cult where top contributors are to be venerated, then
I'm not part of it. If my calling individuals to account on their supposed
expertise on tracing, which they use as justification for continued
marginalization of such related projects, has generated so much backlash,
then it is for me but a sign of how entrenched arrogance can be in some
quarters.

Don't get me wrong, I have immense respect for the collective talent of
kernel developers. But no matter how broad collective talent can be, it
cannot be omniscient.

> Karim patronized, attacked and insulted various people dozens 
> of times in this thread alone. I just don't see any value in trying to
> "work with" Karim anymore, because it's apparently not something he is 
> interested in doing. I feel a bit sorry for him too, because at heart he 
> must be a deeply lonely person.

Ditto.

> single patch, not a oneliner fix, nothing. So if someone needs to prove 
> his experience in embedded Linux matters on this forum then it's Karim.

http://www.oreilly.com/catalog/belinuxsys/

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:38                   ` Karim Yaghmour
  2006-09-15 12:32                     ` Jes Sorensen
@ 2006-09-15 13:20                     ` Paul Mundt
  2006-09-15 13:41                       ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Paul Mundt @ 2006-09-15 13:20 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Jes Sorensen, Ingo Molnar, Roman Zippel, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> If you'd care to read through the thread you'd notice I've demonstrated
> time and again that those static trace points we're mostly interested
> in are never-changing. Lest something fundamentally changes with the
> kernel, there will always be a scheduling change; etc. This
> "instrumentation is evil" mantra is only substantiated if you view
> it from the point of view of someone who's only used it to debug code.
> Yet, and I repeat this again, instrumentation for in-source debugging
> is but a corner case of instrumentation in general.
> 
I didn't get the "instrumentation is evil" mantra from this thread,
rather "static tracepoints are good, so long as someone else is
maintaining them". The issue comes down to who ends up maintaining the
trace points, and given how intrusive LTT was in the past, I can't
see anyone wanting to suddenly start littering them around the kernel
now (at least in the areas that they're responsible for, particularly if
it's not something that's going to be useful to most people). Admittedly
LTTng is not as bad as LTT was in this regard, though.

If static tracepoints are something that's useful for you, then you
can continue maintaining them out of tree.

> Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
> get away with this argument is if you view it exclusively from the
> point of view of kernel development. And that's why you're wrong.
> 
kprobes may not be the answer to all life's problems, but it is
non-intrusive once the initial implementation pains are out of the way.

> Please explain, honestly, why the following instrumentation point is
> going to be a maintenance drag on the person modifying the scheduler:
> @@ -1709,6 +1712,7 @@ switch_tasks:
>    		++*switch_count;
> 
>    		prepare_arch_switch(rq, next);
> +		TRACE_SCHEDCHANGE(prev, next);
>    		prev = context_switch(rq, prev, next);
>    		barrier();
> 
> And please, don't bother complaining about the semantics, they can
> be changed. I'm just arguing about location/meaning/content.
> 
For someone complaining about meaningless posturing on the list, posting
this as a representation for the isolated changes involved is rather
interesting. If it were down to a small handful of critical static
tracepoints in-tree and the rest left up to the people that really want
them in out-of-tree patches, I doubt LTT would have ever had half of the
resistance towards it.

It's the intrusiveness that becomes the maintenance burden, and if you
whittle it down to a point where the intrusiveness is not that big of a
deal, then I'm not sure I see what static points would buy you over
dynamic instrumentation.

It's easy to write off the maintenance overhead when you aren't the one
maintaining the code.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:20                     ` Paul Mundt
@ 2006-09-15 13:41                       ` Roman Zippel
  2006-09-15 13:44                         ` Jes Sorensen
  2006-09-15 13:57                         ` Paul Mundt
  0 siblings, 2 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 13:41 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Karim Yaghmour, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Paul Mundt wrote:

> On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> > If you'd care to read through the thread you'd notice I've demonstrated
> > time and again that those static trace points we're mostly interested
> > in are never-changing. Lest something fundamentally changes with the
> > kernel, there will always be a scheduling change; etc. This
> > "instrumentation is evil" mantra is only substantiated if you view
> > it from the point of view of someone who's only used it to debug code.
> > Yet, and I repeat this again, instrumentation for in-source debugging
> > is but a corner case of instrumentation in general.
> > 
> I didn't get the "instrumentation is evil" mantra from this thread,
> rather "static tracepoints are good, so long as someone else is
> maintaining them". The issue comes down to who ends up maintaining the
> trace points,

The claim that these tracepoints would be a maintenance burden is pretty
much unproven so far. The static tracepoint haters just assume the kernel 
will be littered with thousands of unrelated tracepoints, where a good 
tracepoint would only document what already happens in that function, so 
that the tracepoint would be far from something obscure, which only few 
people could understand and maintain.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:41                       ` Roman Zippel
@ 2006-09-15 13:44                         ` Jes Sorensen
  2006-09-15 14:03                           ` Roman Zippel
  2006-09-15 13:57                         ` Paul Mundt
  1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 13:44 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Roman Zippel wrote:
> The claim that these tracepoints would be a maintenance burden is pretty
> much unproven so far. The static tracepoint haters just assume the kernel 
> will be littered with thousands of unrelated tracepoints, where a good 
> tracepoint would only document what already happens in that function, so 
> that the tracepoint would be far from something obscure, which only few 
> people could understand and maintain.

How do you propose to handle the case where two tracepoint clients want
slightly different data from the same function? I saw this with LTT
users where someone wanted things in different places in schedule().

It *is* a nightmare to maintain.

You still haven't explained your argument about kprobes not being
generally available - where?

Cheers,
Jes




^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:44                         ` Jes Sorensen
@ 2006-09-15 14:03                           ` Roman Zippel
  2006-09-15 14:37                             ` Alan Cox
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 14:03 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Jes Sorensen wrote:

> Roman Zippel wrote:
> > The claim that these tracepoints would be a maintenance burden is pretty
> > much unproven so far. The static tracepoint haters just assume the kernel 
> > will be littered with thousands of unrelated tracepoints, where a good 
> > tracepoint would only document what already happens in that function, so 
> > that the tracepoint would be far from something obscure, which only few 
> > people could understand and maintain.
> 
> How do you propose to handle the case where two tracepoint clients want
> slightly different data from the same function? I saw this with LTT
> users where someone wanted things in different places in schedule().
> 
> It *is* a nightmare to maintain.

That nightmare would not be with tracepoints themselves, but with their
users, so you're missing the point.
Tracepoints can be abused of course, but it's quite a leap to conclude 
from this that they are bad in general.

> You still haven't explained your argument about kprobes not being
> generally available - where?

Huh? What kind of explanation do you want?

$ grep KPROBES arch/*/Kconf*
arch/i386/Kconfig:config KPROBES
arch/ia64/Kconfig:config KPROBES
arch/powerpc/Kconfig:config KPROBES
arch/sparc64/Kconfig:config KPROBES
arch/x86_64/Kconfig:config KPROBES

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:03                           ` Roman Zippel
@ 2006-09-15 14:37                             ` Alan Cox
  2006-09-15 14:34                               ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:37 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Jes Sorensen, Paul Mundt, Karim Yaghmour, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 16:03 +0200, ysgrifennodd Roman Zippel:
> Huh? What kind of explanation do you want?
> 
> $ grep KPROBES arch/*/Kconf*
> arch/i386/Kconfig:config KPROBES
> arch/ia64/Kconfig:config KPROBES
> arch/powerpc/Kconfig:config KPROBES
> arch/sparc64/Kconfig:config KPROBES
> arch/x86_64/Kconfig:config KPROBES

Send patches. The fact nobody has them implemented on your platform
isn't a reason to implement something else, quite the reverse in fact.

Alan


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:37                             ` Alan Cox
@ 2006-09-15 14:34                               ` Roman Zippel
  0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 14:34 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jes Sorensen, Paul Mundt, Karim Yaghmour, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:

> Ar Gwe, 2006-09-15 am 16:03 +0200, ysgrifennodd Roman Zippel:
> > Huh? What kind of explanation do you want?
> > 
> > $ grep KPROBES arch/*/Kconf*
> > arch/i386/Kconfig:config KPROBES
> > arch/ia64/Kconfig:config KPROBES
> > arch/powerpc/Kconfig:config KPROBES
> > arch/sparc64/Kconfig:config KPROBES
> > arch/x86_64/Kconfig:config KPROBES
> 
> Send patches. The fact nobody has them implemented on your platform
> isn't a reason to implement something else, quite the reverse in fact.

Alan, you offer no fact at all and all I can think about this is rather 
emotional and potentially offensive, so I'll refrain from further 
comments. The anti-tracepoint league has made up its mind anyway, so 
what's the point... :-(

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:41                       ` Roman Zippel
  2006-09-15 13:44                         ` Jes Sorensen
@ 2006-09-15 13:57                         ` Paul Mundt
  2006-09-15 14:17                           ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Paul Mundt @ 2006-09-15 13:57 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Karim Yaghmour, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, Sep 15, 2006 at 03:41:03PM +0200, Roman Zippel wrote:
> > On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> > I didn't get the "instrumentation is evil" mantra from this thread,
> > rather "static tracepoints are good, so long as someone else is
> > maintaining them". The issue comes down to who ends up maintaining the
> > trace points,
> 
> The claim that these tracepoints would be a maintenance burden is pretty
> much unproven so far. The static tracepoint haters just assume the kernel 
> will be littered with thousands of unrelated tracepoints, where a good 
> tracepoint would only document what already happens in that function, so 
> that the tracepoint would be far from something obscure, which only few 
> people could understand and maintain.
> 
Again, this works fine so long as the number of static tracepoints is
small and manageable, but it seems like there's a division between what
the subsystem developer deems as meaningful and what someone doing the
tracing might want to look at. Static tracepoints are completely
subjective, LTT proved that this was a problem regarding general
code-level intrusiveness when the number of tracepoints in relatively
close locality started piling up based on what people considered
arbitrarily useful, and LTTng doesn't appear to do anything to address
this.

This doesn't really match my definition of a negligible maintenance
burden..

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:57                         ` Paul Mundt
@ 2006-09-15 14:17                           ` Karim Yaghmour
  2006-09-15 14:13                             ` Jes Sorensen
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:17 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Roman Zippel, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Paul Mundt wrote:
> subjective, LTT proved that this was a problem regarding general
> code-level intrusiveness when the number of tracepoints in relatively
> close locality started piling up based on what people considered
> arbitrarily useful, and LTTng doesn't appear to do anything to address
> this.

"LTT proved that ..." what are you talking about? Have you noticed
the posting earlier regarding the fact that the ltt tracepoints did
not change over a 5 year span? **five** years ... Where do you get
this claim that ltt trace points "started piling up"? Have a look
at figure 2 of this article and let me know exactly which of those
tracepoints are actually a problem to you:
http://www.usenix.org/events/usenix2000/general/full_papers/yaghmour/yaghmour_html/index.html

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:17                           ` Karim Yaghmour
@ 2006-09-15 14:13                             ` Jes Sorensen
  2006-09-15 14:31                               ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:13 UTC (permalink / raw)
  To: karim
  Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Karim Yaghmour wrote:
> Paul Mundt wrote:
>> subjective, LTT proved that this was a problem regarding general
>> code-level intrusiveness when the number of tracepoints in relatively
>> close locality started piling up based on what people considered
>> arbitrarily useful, and LTTng doesn't appear to do anything to address
>> this.
> 
> "LTT proved that ..." what are you talking about? Have you noticed
> the posting earlier regarding the fact that the ltt tracepoints did
> not change over a 5 year span? **five** years ... Where do you get
> this claim that ltt trace points "started piling up"? Have a look
> at figure 2 of this article and let me know exactly which of those
> tracepoints are actually a problem to you:

Because other people have tried to use LTT for additional projects,
but said projects haven't been integrated into LTT. In other words,
just because *you* haven't added those, doesn't mean someone else
won't try and do it later, if LTT was integrated.

Nice try!

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:13                             ` Jes Sorensen
@ 2006-09-15 14:31                               ` Karim Yaghmour
  2006-09-15 14:28                                 ` Paul Mundt
  2006-09-15 14:39                                 ` Jes Sorensen
  0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:31 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Jes Sorensen wrote:
> Because other people have tried to use LTT for additional projects,
> but said projects haven't been integrated into LTT. In other words,
> just because *you* haven't added those, doesn't mean someone else
> won't try and do it later, if LTT was integrated.

Thank you. I will take it as a compliment and likely laminate this
email for your suggestion that I've acted responsibly in my
maintenance of ltt. Boy, can you imagine what this debate would
have looked like if I had included precisely those additional
projects ...

C'mon Jes, if I was able to responsibly maintain ltt over 5
years *out* of the tree and I'm being labeled as incompetent all
over this thread, then imagine what the very competent people
maintaining the kernel could actually do.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:31                               ` Karim Yaghmour
@ 2006-09-15 14:28                                 ` Paul Mundt
  2006-09-15 14:46                                   ` Martin J. Bligh
  2006-09-15 14:51                                   ` Karim Yaghmour
  2006-09-15 14:39                                 ` Jes Sorensen
  1 sibling, 2 replies; 271+ messages in thread
From: Paul Mundt @ 2006-09-15 14:28 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, Sep 15, 2006 at 10:31:51AM -0400, Karim Yaghmour wrote:
> Jes Sorensen wrote:
> > Because other people have tried to use LTT for additional projects,
> > but said projects haven't been integrated into LTT. In other words,
> > just because *you* haven't added those, doesn't mean someone else
> > won't try and do it later, if LTT was integrated.
> 
> Thank you. I will take it as a compliment and likely laminate this
> email for your suggestion that I've acted responsibly in my
> maintenance of ltt. Boy, can you imagine what this debate would
> have looked like if I had included precisely those additional
> projects ...
> 
Which brings back the point of static tracepoints being entirely
subjective. By this line of reasoning, you define for other people what
the useful tracepoints are, and couldn't care less which points they're
actually interested in. How exactly is this serving the need of people
looking for instrumentation, rather than a pre-canned view of what they
can trace? If they already have to go with their own tracepoints for the
things they're interested in, then having a few static points
pre-existing doesn't really buy anyone much else either, especially if
by your own admission you're not integrating the points that people
_are_ interested in.

I'm not indicating that you didn't do exactly what you should have in
this situation, only that static tracepoints in general are only going
to be a small part of the picture, and not a complete solution to most
people on their own. Dynamic instrumentation fills the same sort of gap
without worrying about arbitrary maintenance, so what exactly does
shoving static instrumentation in to the kernel buy us?

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:28                                 ` Paul Mundt
@ 2006-09-15 14:46                                   ` Martin J. Bligh
  2006-09-15 15:22                                     ` Alan Cox
  2006-09-15 14:51                                   ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-15 14:46 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Karim Yaghmour, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

> Which brings back the point of static tracepoints being entirely
> subjective. By this line of reasoning, you define for other people what
> the useful tracepoints are, and couldn't care less which points they're
> actually interested in. How exactly is this serving the need of people
> looking for instrumentation, rather than a pre-canned view of what they
> can trace? If they already have to go with their own tracepoints for the
> things they're interested in, then having a few static points
> pre-existing doesn't really buy anyone much else either, especially if
> by your own admission you're not integrating the points that people
> _are_ interested in.

They're not *entirely* subjective, though I agree some are. I find the
fact that Andrew Morton, myself, and apparently several other people 
have all instrumented the memory reclaim code to tell you *why* it's
failing to reclaim pages at various points in time slightly amusing,
but also rather depressing. It's all rather a waste of effort.

Moreover, subsystem experts know what needs to be traced in order to
give useful information, and the users may not. It's a damned sight
easier for them to say "oh, please turn on tracing for VM events
and send me the output" than custom-construct a set of probes for
that user, and send them off. There's a barrier to entry that just
won't happen there.

Hell, look at all the debug printks in the kernel for example, and
the various small ad-hoc tracing facilities. If all we do is unite
those, it'll still be a step forwards.

> I'm not indicating that you didn't do exactly what you should have in
> this situation, only that static tracepoints in general are only going
> to be a small part of the picture, and not a complete solution to most
> people on their own. Dynamic instrumentation fills the same sort of gap
> without worrying about arbitrary maintenance, so what exactly does
> shoving static instrumentation in to the kernel buy us?

Dynamic probes do NOT reduce maintenance, they increase it. They just
push it into somebody else's lap, where it's done more inefficiently.
That's not a solution. The question is what's ad-hoc debug for a
particular problem vs. what's generically useful. I refuse to believe
that the subsystem maintainers are too stupid to be able to make that
judgement call.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:46                                   ` Martin J. Bligh
@ 2006-09-15 15:22                                     ` Alan Cox
  2006-09-15 15:47                                       ` Martin J. Bligh
  0 siblings, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 15:22 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Paul Mundt, Karim Yaghmour, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 07:46 -0700, ysgrifennodd Martin J. Bligh:
> Moreover, subsystem experts know what needs to be traced in order to
> give useful information, and the users may not. It's a damned sight
> easier for them to say "oh, please turn on tracing for VM events
> and send me the output" than custom-construct a set of probes for
> that user, and send them off. There's a barrier to entry that just
> won't happen there.

That has nothing to do with the static or dynamic probe question.
Scriptable dynamic probes do everything your static probes do and more.

> Hell, look at all the debug printks in the kernel for example, and
> the various small ad-hoc tracing facilities. If all we do is unite
> those, it'll still be a step forwards.

Look how many there are, look how they spread, tracepoints will do the
same.

> Dynamic probes do NOT reduce maintenance, they increase it.

That's a logical fallacy to begin with. A dynamic probe can probe
anything a static probe can. So a static probe can be implemented with a
dynamic probe.

In other words if you like static probe lists and your subsystem happens
to be one where it is useful then you can script it with the same effect
and send people the script.

With kprobes you've got a passably good chance (ie if Distros can be
persuaded to package the debug data) that you can say "run this
systemtap script". With static tracepoints it's "recompile your vendor
kernel in your vendor manner with your vendor initrd and add it to the
boot loader"

Alan

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 15:22                                     ` Alan Cox
@ 2006-09-15 15:47                                       ` Martin J. Bligh
  0 siblings, 0 replies; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-15 15:47 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paul Mundt, Karim Yaghmour, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Alan Cox wrote:
> Ar Gwe, 2006-09-15 am 07:46 -0700, ysgrifennodd Martin J. Bligh:
>> Moreover, subsystem experts know what needs to be traced in order to
>> give useful information, and the users may not. It's a damned sight
>> easier for them to say "oh, please turn on tracing for VM events
>> and send me the output" than custom-construct a set of probes for
>> that user, and send them off. There's a barrier to entry that just
>> won't happen there.
> 
> That has nothing to do with the static or dynamic probe question.
> Scriptable dynamic probes do everything your static probes do and more.

No. The point is that they're not *there* and have to be modified
for every kernel version. And do you mean with or without the markers
in the code to tell the dynamic probes where to hook in, and what data
to fetch? That makes a huge difference.

Suppose, as a very real example, I want to instrument shrink_list.
There are 20 or so places where it can switch what we're doing with
a page for different reasons. Potentially we're scanning through many
thousands of pages. If I can keep counters as I go through the function,
and then do one trace entry at the end, that's fairly efficient. If I
have to create 20 separate hooks that all jump out of line, it's going
to be a lot slower. If I log a tracepoint at every damned page every
time it switches, it's going to be a nightmare.

Most things can be done with dynamic probes. Some things will require
markers in the code to tell us sustainably over time where to attach
them. A few things (like the above) probably require some explicit
code.
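
The shrink_list case above can be sketched in user space as follows. This
is only an illustration of the "count as you go, log once at the end" idea,
not kernel code: scan_pages(), struct scan_summary, the skip reasons and
trace_scan_summary() are all hypothetical names invented for the example.

```c
/* Illustrative user-space sketch; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reasons a page might be skipped instead of reclaimed (made up). */
enum skip_reason { REASON_LOCKED, REASON_DIRTY, REASON_MAPPED, NR_REASONS };

struct scan_summary {
	unsigned long scanned;
	unsigned long skipped[NR_REASONS];
};

/* Counts how many trace records were written, to show the cost. */
static int trace_records_emitted;

/* Stand-in for a trace backend: called once per scan, not once per page. */
static void trace_scan_summary(const struct scan_summary *s)
{
	trace_records_emitted++;
	printf("scanned=%lu locked=%lu dirty=%lu mapped=%lu\n",
	       s->scanned, s->skipped[REASON_LOCKED],
	       s->skipped[REASON_DIRTY], s->skipped[REASON_MAPPED]);
}

/*
 * reasons[i] < 0 means page i was reclaimed; otherwise it holds the
 * skip_reason for that page. Counters are bumped inline and a single
 * summary record is emitted when the scan finishes.
 */
static struct scan_summary scan_pages(const int *reasons, size_t n)
{
	struct scan_summary s = {0};
	size_t i;

	for (i = 0; i < n; i++) {
		s.scanned++;
		if (reasons[i] >= 0) {
			assert(reasons[i] < NR_REASONS);
			s.skipped[reasons[i]]++;
		}
	}
	trace_scan_summary(&s);
	return s;
}
```

Scanning thousands of pages this way still produces one record, whereas a
probe at each of the ~20 decision points would fire on every page.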

>> Hell, look at all the debug printks in the kernel for example, and
>> the various small ad-hoc tracing facilities. If all we do is unite
>> those, it'll still be a step forwards.
> 
> Look how many there are, look how they spread, tracepoints will do the
> same.

As long as they all use the same infrastructure, that's an improvement.

>> Dynamic probes do NOT reduce maintenance, they increase it.
> 
> That's a logical fallacy to begin with. A dynamic probe can probe
> anything a static probe can. So a static probe can be implemented with a
> dynamic probe.

In the absence of the markers, I don't think that's true - there's the
maintenance of exactly where they go, plus access to local data. If you
mean with markers, then yes, that's fine. The markers + dynamic probes
seems to be a reasonable compromise between the two. Exactly what we
call that combo, static or dynamic, I don't really care ;-)

> In other words if you like static probe lists and your subsystem happens
> to be one where it is useful then you can script it with the same effect
> and send people the script.
> 
> With kprobes you've got a passably good chance (ie if Distros can be
> persuaded to package the debug data) that you can say "run this
> systemtap script". With static tracepoints it's "recompile your vendor
> kernel in your vendor manner with your vendor initrd and add it to the
> boot loader"

You're thinking of one situation where you can't recompile. I'm thinking
of a situation where it's trivial to recompile. Both exist, neither is
invalid. Of course, where possible, we'd like to be able to add stuff
on the fly, but it's not a panacea.

Without the markers, maintaining a usable set of dynamic probe points
that's always available for every kernel version seems infeasible.
With them, I think it'll cover 99% of the cases, and would be pretty
useful. If people agree on putting tags in there, perhaps we can
discuss things like the logging mechanism, format, and readout.
If not, I suppose we have to drag this debate out even longer.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:28                                 ` Paul Mundt
  2006-09-15 14:46                                   ` Martin J. Bligh
@ 2006-09-15 14:51                                   ` Karim Yaghmour
  2006-09-15 15:00                                     ` Thomas Gleixner
  2006-09-15 15:24                                     ` Alan Cox
  1 sibling, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:51 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Paul Mundt wrote:
> Which brings back the point of static tracepoints being entirely
> subjective. By this line of reasoning, you define for other people what
> the useful tracepoints are, and couldn't care less which points they're
> actually interested in. How exactly is this serving the need of people
> looking for instrumentation, rather than a pre-canned view of what they
> can trace? If they already have to go with their own tracepoints for the
> things they're interested in, then having a few static points
> pre-existing doesn't really buy anyone much else either, especially if
> by your own admission you're not integrating the points that people
> _are_ interested in.
> 
> I'm not indicating that you didn't do exactly what you should have in
> this situation, only that static tracepoints in general are only going
> to be a small part of the picture, and not a complete solution to most
> people on their own. Dynamic instrumentation fills the same sort of gap
> without worrying about arbitrary maintenance, so what exactly does
> shoving static instrumentation in to the kernel buy us?

And this flies in the face of all of those who, for years, have been
satisfied customers for ltt and who were more than looking forward
to not having to depend on me to get a working traceable kernel.

The static tracepoints we maintained were *the* solution for a great
many people. As a maintainer I had two choices with those who
were not content:
a- Maintain their tracepoints for them -- not happening.
b- Suggest they help get a generic tracing
  infrastructure into the kernel and then make their case on the
  lkml as to the pertinence of their instrumentation.

And what I did is "b". I wasn't going to defend anybody else's
choice of tracepoints. Those who were using ltt for its designated
purpose -- allowing normal users and developers to get an accurate
view of the behavior of their system -- were very happy with it.

You want to know who was unhappy with using it: kernel developers.
It just wasn't geared for them. Which goes back to my earlier
arguments ...

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:51                                   ` Karim Yaghmour
@ 2006-09-15 15:00                                     ` Thomas Gleixner
  2006-09-15 15:28                                       ` Karim Yaghmour
  2006-09-15 18:16                                       ` Andrew Morton
  2006-09-15 15:24                                     ` Alan Cox
  1 sibling, 2 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 15:00 UTC (permalink / raw)
  To: karim
  Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, 2006-09-15 at 10:51 -0400, Karim Yaghmour wrote:
> And what I did is "b". I wasn't going to defend anybody else's
> choice of tracepoints. Those who were using ltt for its designated
> purpose -- allowing normal users and developers to get an accurate
> view of the behavior of their system -- were very happy with it.
> 
> You want to know who was unhappy with using it: kernel developers.
> It just wasn't geared for them. Which goes back to my earlier
> arguments ...

What do you want to prove with this rant ? Simply the fact that your
view of tracing is not matching the view of others. Nothing else.

You just made it clear that your solution was and still is targeted at
one single user group.

Nobody is opposing instrumentation per se, we just need to figure out a
good solution suitable for endusers, kernel developers, debug
fetishists ... without splattering ten different tracers all across the
kernel source.

The way to a solid kernel instrumentation is definitely not by pushing a
single purpose solution in, which we have to _maintain_ for a long time
without being convinced that it is the _best_ technical solution we can
have right now.

	tglx

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 15:00                                     ` Thomas Gleixner
@ 2006-09-15 15:28                                       ` Karim Yaghmour
  2006-09-15 18:16                                       ` Andrew Morton
  1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:28 UTC (permalink / raw)
  To: tglx
  Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


Thomas Gleixner wrote:
> You just made it clear that your solution was and still is targeted at
> one single user group.

And that was part of my point. Every time I got in a debate on lkml
regarding ltt, there were crowds screaming in horror at the possibility
of trace points everywhere.

> Nobody is opposing instrumentation per se, we just need to figure out a
> good solution suitable for endusers, kernel developers, debug
> fetishists ... without splattering ten different tracers all across the
> kernel source.

I agree entirely.

> The way to a solid kernel instrumentation is definitely not by pushing a
> single purpose solution in, which we have to _maintain_ for a long time
> without being convinced that it is the _best_ technical solution we can
> have right now.

I think we're in full agreement. A solid kernel instrumentation mechanism
is exactly what is needed. The whole point of posting the ltt stuff on
the lkml is exactly to get the best technical solution. The ltt developers
are more than happy to take suggestions as to how to achieve this.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 15:00                                     ` Thomas Gleixner
  2006-09-15 15:28                                       ` Karim Yaghmour
@ 2006-09-15 18:16                                       ` Andrew Morton
  2006-09-15 18:19                                         ` Ingo Molnar
                                                           ` (3 more replies)
  1 sibling, 4 replies; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 18:16 UTC (permalink / raw)
  To: tglx
  Cc: karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 17:00:47 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Fri, 2006-09-15 at 10:51 -0400, Karim Yaghmour wrote:
> > And what I did is "b". I wasn't going to defend anybody else's
> > choice of tracepoints. Those who were using ltt for its designated
> > purpose -- allowing normal users and developers to get an accurate
> > view of the behavior of their system -- were very happy with it.
> > 
> > You want to know who was unhappy with using it: kernel developers.
> > It just wasn't geared for them. Which goes back to my earlier
> > arguments ...
> 
> What do you want to prove with this rant ? Simply the fact that your
> view of tracing is not matching the view of others. Nothing else.

What Karim is sharing with us here (yet again) is the real in-field
experience of real users (ie: not kernel developers).

I mean, on one hand we have people explaining what they think a tracing
facility should and shouldn't do, and on the other hand we have a guy who
has been maintaining and shipping exactly that thing to (paying!) customers
for many years.

Methinks our time would be best spent trying to benefit from his
experience..

Me, I'm not particularly averse to some 50-100 static tracepoints if
experience tells us that we need such things.  And both Karim's and Frank's
experience does indicate that such things are needed, which carries weight.

What I _am_ concerned about with this patchset is all the infrastructural
goop which backs up those tracepoints.  I'd have thought that a better
approach would be to make those explicit tracepoints be "helpers" for the
existing kprobe code.

Of course, if they are properly designed, the one set of tracepoints could
be used by different tracing backends - that allows us to separate the
concepts of "tracepoints" and "tracing backends".
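
A minimal user-space sketch of that separation, assuming a marker is just a
named call site with a pluggable callback. None of this is from the patchset
or from kprobes: struct marker, marker_attach(), marker_fire() and both
backends are hypothetical names chosen for illustration.

```c
/* Illustrative user-space sketch; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A backend is any function that accepts the marker's name and payload. */
typedef void (*probe_fn)(const char *name, unsigned long arg);

/* The "tracepoint": a named site that knows nothing about backends. */
struct marker {
	const char *name;
	probe_fn probe;		/* NULL while no backend is attached */
};

static struct marker sched_switch_marker = { "sched_switch", NULL };

/* Attach a backend; refuse if one is already attached. */
static int marker_attach(struct marker *m, probe_fn fn)
{
	assert(m != NULL && fn != NULL);
	if (m->probe)
		return -1;
	m->probe = fn;
	return 0;
}

static void marker_detach(struct marker *m)
{
	m->probe = NULL;
}

/* Called from instrumented code; a single pointer test when disabled. */
static void marker_fire(struct marker *m, unsigned long arg)
{
	if (m->probe)
		m->probe(m->name, arg);
}

/* Backend 1: only counts events. */
static unsigned long event_count;

static void counting_backend(const char *name, unsigned long arg)
{
	(void)name;
	(void)arg;
	event_count++;
}

/* Backend 2: prints each event. */
static void printing_backend(const char *name, unsigned long arg)
{
	printf("%s: %lu\n", name, arg);
}

/* Instrumented code: unchanged no matter which backend is attached. */
static void do_work(unsigned long pid)
{
	marker_fire(&sched_switch_marker, pid);
}
```

The instrumented function stays the same whether the counting backend, the
printing backend, or nothing at all is attached, which is the property the
paragraph above is asking for.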

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                                       ` Andrew Morton
@ 2006-09-15 18:19                                         ` Ingo Molnar
  2006-09-15 19:26                                           ` Karim Yaghmour
                                                             ` (2 more replies)
  2006-09-15 19:35                                         ` Thomas Gleixner
                                                           ` (2 subsequent siblings)
  3 siblings, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 18:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Andrew Morton <akpm@osdl.org> wrote:

> What Karim is sharing with us here (yet again) is the real in-field 
> experience of real users (ie: not kernel developers).

well, Jes has that experience and Thomas too.

> I mean, on one hand we have people explaining what they think a 
> tracing facility should and shouldn't do, and on the other hand we 
> have a guy who has been maintaining and shipping exactly that thing to 
> (paying!) customers for many years.

so do Thomas and Jes. So what's the point?

i judge LTT by its current code quality, not by its proponents' shouting
volume - and that quality is still quite poor at the moment. (and then 
there are the conceptual problems too, outlined numerous times) I have 
quoted specific example(s) for that in this thread. Furthermore, LTT 
does this:

 246 files changed, 26207 insertions(+), 71 deletions(-)

and this gives me the shivers, for all the reasons i outlined.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:19                                         ` Ingo Molnar
@ 2006-09-15 19:26                                           ` Karim Yaghmour
  2006-09-15 19:43                                           ` Roman Zippel
  2006-09-15 20:13                                           ` Andrew Morton
  2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 19:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, tglx, Paul Mundt, Jes Sorensen, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> well, Jes has that experience and Thomas too.
...
> so does Thomas and Jes. So what's the point?

Either I'm too stupid for you to bother replying to any of my emails
(which is very possible) or, shall we say politely, you're not
exactly humble. I've responded to half a dozen of your emails, yet
you have not deemed it worthwhile to talk to me directly.

First you came out screaming that static tracepoints are heresy, and
then when there was non-ltt-specific interest being voiced for code
markup, you viciously set out to fud ltt as best you can using your
experience at implementing kernel tracers as ammunition. So answer
this simple question, how many tracers did you actually write which
were geared for non-kernel-developer users? Based on your own
account from yesterday, the answer I conclude is: NONE. I'd say
you've got pretty strong opinions about something you've never
attempted to do. Of course you claim that all tracers are the same,
how could they be different? But that's where experience talks and
hubris walks.

> i judge LTT by its current code quality, not by its proponents shouting 
> volume - and that quality is still quite poor at the moment.

You're either skillfully trying to steer arguments in your direction
or you're simply unaware of the basic rules of debating. You started
by saying that static instrumentation of any kind is evil, yet this
is demonstrably false, if nothing else by the outpour of experience
from those who have had to maintain non-inlined instrumentation. Then
you proceed to try to amalgamate this attack with a vicious attack on
ltt. I'll say it one more time: the ltt code gets posted to lkml
*for review*. If you're that concerned about the code, then go ahead
look at it and tell the maintainers what you'd like to see fixed.

Instead, you run out and come back and conclude "The best that Frank
and me came up ..." and then you present your own nomenclature for
static instrumentation. I mean, if nothing else, have a little
decency for those who have put effort in trying to make this stuff work.

I mean, at least explain to me why you insist on using such a tone
against a project that is now within its 7th year of existence (a
pretty long lifetime if you ask me for something that has been
labeled useless all over this thread.) Do you actually realize that
lkml's past reluctance to admit a standard tracing mechanism
into the kernel has contributed to doing great harm to those
who had put substantial personal and financial investment into getting
something to work? I'll spare you the political debates, but look at
past involvement of major corporate users in ltt and ask yourself why
they've decided to put their efforts elsewhere. We were basically told:
we cannot justify investing any further funds in a project which does
not seem to gain any sort of acceptance by the kernel developers.
I've never complained about this before because I don't like whining.
Do, however, realize that the fact that there are 4 separate teams
working on this in parallel (ltt, lkst, systemtap, lket, off the top
of my head) is directly due to the lack of success ltt has had in
being admitted into the kernel. Do, at least, realize that this is
a huge miscarriage of the lkml process.

And finally, do realize that in 2000 I personally contacted the head
of the DProbes project at IBM in order to foster common development,
following which ltt was effectively modified in order to allow
dynamic instrumentation of the kernel ...

cheesh ...

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:19                                         ` Ingo Molnar
  2006-09-15 19:26                                           ` Karim Yaghmour
@ 2006-09-15 19:43                                           ` Roman Zippel
  2006-09-15 20:05                                             ` Ingo Molnar
  2006-09-15 20:13                                           ` Andrew Morton
  2 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 19:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > What Karim is sharing with us here (yet again) is the real in-field 
> > experience of real users (ie: not kernel developers).
> 
> well, Jes has that experience and Thomas too.
> 
> > I mean, on one hand we have people explaining what they think a 
> > tracing facility should and shouldn't do, and on the other hand we 
> > have a guy who has been maintaining and shipping exactly that thing to 
> > (paying!) customers for many years.
> 
> so does Thomas and Jes. So what's the point?

That only Karim's experience is being questioned here?

> i judge LTT by its current code quality, not by its proponents shouting 
> volume - and that quality is still quite poor at the moment. (and then 
> there are the conceptual problems too, outlined numerous times) I have 
> quoted specific example(s) for that in this thread. Furthermore, LTT 
> does this:
> 
>  246 files changed, 26207 insertions(+), 71 deletions(-)
> 
> and this gives me the shivers, for all the reasons i outlined.

Well, I'm first to admit that LTT needs improvement, but that has never 
been the point.

We need to get to some kind of agreement what level of tracing Linux 
should support in general, preferably something that is easy to 
integrate and usable by everyone. Especially the latter means that there 
is not one true solution, so we need to figure out what kind of common
infrastructure can be implemented, from which all of them can benefit.

At this point you've been rather uncompromising contrary to every single 
argument from either side.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:43                                           ` Roman Zippel
@ 2006-09-15 20:05                                             ` Ingo Molnar
  2006-09-15 20:22                                               ` Mathieu Desnoyers
  2006-09-15 21:12                                               ` Roman Zippel
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 20:05 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Fri, 15 Sep 2006, Ingo Molnar wrote:
> 
> > > What Karim is sharing with us here (yet again) is the real in-field 
> > > experience of real users (ie: not kernel developers).
> > 
> > well, Jes has that experience and Thomas too.
> > 
> > > I mean, on one hand we have people explaining what they think a 
> > > tracing facility should and shouldn't do, and on the other hand we 
> > > have a guy who has been maintaining and shipping exactly that thing to 
> > > (paying!) customers for many years.
> > 
> > so does Thomas and Jes. So what's the point?
> 
> > That only Karim's experience is being questioned here?

i think you misunderstood, please read the paragraphs above. They 
suggest that there's "real in-field experience of real users" against 
"people explaining what they think a tracing facility should and 
shouldn't do". I only pointed out that those people (Thomas, Jes) dont 
just randomly express their opinion but have actual in-field experience 
too (of paying customers), about the very topic at hand.

> > i judge LTT by its current code quality, not by its proponents shouting 
> > volume - and that quality is still quite poor at the moment. (and then 
> > there are the conceptual problems too, outlined numerous times) I have 
> > quoted specific example(s) for that in this thread. Furthermore, LTT 
> > does this:
> > 
> >  246 files changed, 26207 insertions(+), 71 deletions(-)
> > 
> > and this gives me the shivers, for all the reasons i outlined.
> 
> Well, I'm first to admit that LTT needs improvement, but that has 
> never been the point.

that might not be your point, but that very much is my point. I do claim 
that LTT's problems arise out of its fundamental mistake on the kernel 
side: that it is a static tracer that tries to be too many things to too 
many people. SystemTap is available here and today on an unmodified 
upstream kernel. LTT has been in this shape for the past ~8 years. But 
if you wish you can certainly prove me wrong via for example cleaning up 
and shrinking LTT down to a size and impact that is not scary anymore, 
with the same functionality, and the clear future path for the removal 
of its dependencies. I tried to argue that in the abstract, but please 
by all means feel free to prove me wrong. (or argue against my specific 
points)

> We need to get to some kind of agreement what level of tracing Linux 
> should support in general, preferably something that is easy to 
> integrate and usable by everyone. Especially the latter means that 
> there is not one true solution, [...]

sorry, but i disagree. There _is_ a solution that is superior in every 
aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)

> At this point you've been rather uncompromising [...]

yes, i'm rather uncompromising when i sense attempts to push inferior 
concepts into the core kernel _when_ a better concept exists here and 
today. Especially if the concept being pushed adds more than 350 
tracepoints that expose something to user-space that amounts to a 
complex external API, which tracepoints we have little chance of ever 
getting rid of under a static tracing concept.

i'm also looking at it this way too: you already seem to be quite 
reluctant to add kprobes to your architecture today. How reluctant would 
you be tomorrow if you had static tracepoints, which would remove a fair 
chunk of incentive to implement kprobes?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:05                                             ` Ingo Molnar
@ 2006-09-15 20:22                                               ` Mathieu Desnoyers
  2006-09-15 21:08                                                 ` Jose R. Santos
                                                                   ` (2 more replies)
  2006-09-15 21:12                                               ` Roman Zippel
  1 sibling, 3 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Please Ingo, stop repeating false arguments without taking into account people's
corrections:

* Ingo Molnar (mingo@elte.hu) wrote:
> sorry, but i disagree. There _is_ a solution that is superior in every 
> aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> 

I am sorry to have to repeat myself, but this is not true for heavy loads.

> > At this point you've been rather uncompromising [...]
> 
> yes, i'm rather uncompromising when i sense attempts to push inferior 
> concepts into the core kernel _when_ a better concept exists here and 
> today. Especially if the concept being pushed adds more than 350 
> tracepoints that expose something to user-space that amounts to a 
> complex external API, which tracepoints we have little chance of ever 
> getting rid of under a static tracing concept.
> 
From an earlier email from Tim Bird:

"I still think that this is off-topic for the patch posted.  I think we
should debate the implementation of tracepoints/markers when someone posts a
patch for some.  I think it's rather scurrilous to complain about
code NOT submitted.  Ingo has even mis-characterized the not-submitted
instrumentation patch, by saying it has 350 tracepoints when it has no
such thing.  I counted 58 for one architecture (with only 8 being
arch-specific)."

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:22                                               ` Mathieu Desnoyers
@ 2006-09-15 21:08                                                 ` Jose R. Santos
  2006-09-15 21:25                                                   ` Mathieu Desnoyers
  2006-09-15 22:03                                                   ` Ingo Molnar
  2006-09-15 21:32                                                 ` Ingo Molnar
  2006-09-16  9:59                                                 ` Jes Sorensen
  2 siblings, 2 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 21:08 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> Please Ingo, stop repeating false arguments without taking into account people's
> corrections:
>
> * Ingo Molnar (mingo@elte.hu) wrote:
> > sorry, but i disagree. There _is_ a solution that is superior in every 
> > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> > 
>
> I am sorry to have to repeat myself, but this is not true for heavy loads.
>   

This thread has already discussed the merits of static instrumentation 
when it comes to performance impact.  The key now is to find a 
balance between static and dynamic probes.  While it is true that static 
probes will provide less overhead compared to dynamic probes, some probe 
points will see less of a measurable performance impact from 
dynamic probes due to the nature of the probe.  We need to find what 
that balance is.

To some people performance is the #1 priority and to others it is 
flexibility.  I would like to come up with a list of those probe points 
that absolutely need to be inserted into the code statically.  Those 
that are not absolutely critical to have statically should be 
implemented dynamically.

-JRS

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:08                                                 ` Jose R. Santos
@ 2006-09-15 21:25                                                   ` Mathieu Desnoyers
  2006-09-15 22:02                                                     ` Jose R. Santos
  2006-09-15 22:03                                                   ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 21:25 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> To some people performance is the #1 priority and to others it is 
> flexibility.  I would like to come up with a list of those probe points 
> that absolutely need to be inserted into the code statically.  Those 
> that are not absolutely critical to have statically should be 
> implemented dynamically.
> 

I agree with you that only very specific parts of the kernel have this kind of
high throughput. Using kprobes for lower throughput tracepoints is perfectly
acceptable from my point of view, as it does not perturb the system too much.

I would suggest (as a beginning) those "standard" high event rate tracepoints:

(taken from the highest rates in
http://sourceware.org/ml/systemtap/2005-q4/msg00451.html)

- syscall entry/exit
- irq entry/exit
- softirq entry/exit
- tasklet entry/exit
- trap entry/exit
- scheduler change
- wakeup
- network traffic (packet in/out)
- "select" and "poll" system calls
- page_alloc/page_free

(be warned : this list is probably incomplete, too exhaustive or can cause
dizziness under stress condition) :)

However, a tracing infrastructure should still provide the ability for
developers to instrument their own high traffic interrupt handler with a very
low overhead.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:25                                                   ` Mathieu Desnoyers
@ 2006-09-15 22:02                                                     ` Jose R. Santos
  0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 22:02 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> * Jose R. Santos (jrs@us.ibm.com) wrote:
> > To some people performance is the #1 priority and to others it is 
> > flexibility.  I would like to come up with a list of those probe points 
> > that absolutely need to be inserted into the code statically.  Those 
> > that are not absolutely critical to have statically should be 
> > implemented dynamically.
> > 
>
> I agree with you that only very specific parts of the kernel have this kind of
> high throughput. Using kprobes for lower throughput tracepoints is perfectly
> acceptable from my point of view, as it does not perturb the system too much.
>
> I would suggest (as a beginning) those "standard" high event rate tracepoints:
>
> (taken from the highest rates in
> http://sourceware.org/ml/systemtap/2005-q4/msg00451.html)
>
> - syscall entry/exit
> - irq entry/exit
> - softirq entry/exit
> - tasklet entry/exit
> - trap entry/exit
> - scheduler change
> - wakeup
> - network traffic (packet in/out)
> - "select" and "poll" system calls
> - page_alloc/page_free
>
> (be warned : this list is probably incomplete, too exhaustive or can cause
> dizziness under stress condition) :)
>
> However, a tracing infrastructure should still provide the ability for
> developers to instrument their own high traffic interrupt handler with a very
> low overhead.
>   
This is based on a single scenario, which is wrong.  Criteria need to 
be established that describe the justification for a static trace hook.  
Based on the previous comments in the thread, this list already seems 
too big.

If a user of the trace tool absolutely needs to have the best 
performance, then the proposed tool should be smart enough to use static 
hooks if available but fall back to dynamic probes if there is no 
static counterpart available.  This performance static tracepoint patch 
can be maintained outside of the kernel tree without bloating the 
kernel.  This way he can have mostly dynamic trace points but at least 
provide some sort of mechanism for those that absolutely must have 
static hooks in order to get useful data out of the trace tool.
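
The selection logic described above could be sketched roughly as follows.
This is only an illustration in plain user-space C, not code from LTT,
SystemTap or kprobes: the hook table, the event names and every identifier
(choose_hook, has_static_hook, enum hook_kind) are hypothetical, and a real
tool would discover the available static hooks from the running kernel
rather than from a hard-coded array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How the (hypothetical) trace tool will attach to an event. */
enum hook_kind { HOOK_NONE, HOOK_STATIC, HOOK_DYNAMIC };

/* Hypothetical list of events for which this kernel carries a static hook,
 * e.g. because the out-of-tree performance patch has been applied. */
static const char *const static_hooks[] = {
	"syscall_entry",
	"irq_entry",
};

/* Return 1 if a static hook exists for this event, 0 otherwise. */
static int has_static_hook(const char *event)
{
	size_t i;

	for (i = 0; i < sizeof(static_hooks) / sizeof(static_hooks[0]); i++)
		if (strcmp(static_hooks[i], event) == 0)
			return 1;
	return 0;
}

/* Prefer a static hook; otherwise fall back to a dynamic probe if the
 * platform supports one (dynamic_ok != 0); otherwise report no hook. */
static enum hook_kind choose_hook(const char *event, int dynamic_ok)
{
	assert(event != NULL);

	if (has_static_hook(event))
		return HOOK_STATIC;
	return dynamic_ok ? HOOK_DYNAMIC : HOOK_NONE;
}
```

With this table, choose_hook("irq_entry", 1) picks the static hook, while an
event with no static counterpart falls back to a dynamic probe, or to nothing
at all on a platform where dynamic probes are unavailable.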

-JRS



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:08                                                 ` Jose R. Santos
  2006-09-15 21:25                                                   ` Mathieu Desnoyers
@ 2006-09-15 22:03                                                   ` Ingo Molnar
  2006-09-15 22:32                                                     ` Karim Yaghmour
                                                                       ` (2 more replies)
  1 sibling, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 22:03 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Jose R. Santos <jrs@us.ibm.com> wrote:

> [...]  While it is true that static probes will provide less overhead 
> compared to dynamic probes, [...]

that is not true at all. Yes, an INT3 based kprobe might be expensive if 
+0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that 
is "only" an implementation detail, not a conceptual property. 
Especially considering that help (djprobes) is on the way. And in the 
future, as more and more code gets generated (and regenerated) on the 
fly, dynamic probes will be _faster_ than static probes - plainly 
because they adapt better to the environment they plug into.
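
For scale, the +0.5 usec figure can be turned into cycles and CPU share with
some back-of-the-envelope arithmetic. The helpers below are a sketch in plain
user-space C with hypothetical names; only the 0.5 usec cost and the 1GHz
clock come from the text above, and any event rate fed to them is an
illustrative assumption, not a measurement from this thread.

```c
#include <assert.h>

/* Cycles consumed by one probe hit: cost in nanoseconds times the clock
 * rate in GHz (1 GHz is one cycle per nanosecond). */
static double probe_cycles(double cost_ns, double cpu_ghz)
{
	assert(cost_ns >= 0.0 && cpu_ghz > 0.0);
	return cost_ns * cpu_ghz;
}

/* Fraction of one CPU's time spent in probes firing events_per_sec times
 * per second, each hit costing cost_ns nanoseconds. */
static double probe_cpu_fraction(double cost_ns, double events_per_sec)
{
	assert(cost_ns >= 0.0 && events_per_sec >= 0.0);
	return cost_ns * 1e-9 * events_per_sec;
}
```

At 1GHz, 0.5 usec (500 ns) works out to 500 cycles per hit. At an assumed
10,000 events per second that is 0.5% of one CPU; at an assumed 100,000
events per second it is 5%.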

so there's basically nothing to balance. My point is that dynamic probes 
have won or will win on every front, and we shouldn't tie ourselves down with 
static tracers. 5 years ago with no kprobes, had someone submitted a 
clean static tracer patchset, we could probably not have resisted it (i 
though probably would have resisted it on the grounds of maintenance 
overhead) and would have added it because tracing makes sense in 
general. But today there's just no reason to add static tracers anymore.

NOTE: i still accept the temporary (or non-temporary) introduction of 
static markers, to help dynamic tracing. But my expectation is that 
these markers will be less intrusive than static tracepoints, and a lot 
more flexible.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:03                                                   ` Ingo Molnar
@ 2006-09-15 22:32                                                     ` Karim Yaghmour
  2006-09-15 22:43                                                       ` Ingo Molnar
  2006-09-15 22:59                                                     ` Frank Ch. Eigler
  2006-09-15 23:17                                                     ` Jose R. Santos
  2 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 22:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


Ingo Molnar wrote:
> that is not true at all. Yes, an INT3 based kprobe might be expensive if 
> +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that 
> is "only" an implementation detail, not a conceptual property. 
> Especially considering that help (djprobes) is on the way. And in the 

djprobes has been "on the way" for some time now. Why don't you at
least have the intellectual honesty to use the same rules you've
repeatedly used against ltt elsewhere in this thread -- i.e. what
it does today is what it is, and what it does today isn't worth
bragging about. But that would be too much to ask of you Ingo,
wouldn't it?

But, sarcasm aside, even if this mechanism existed it still wouldn't
resolve the need for static markup. It would just make djprobe a
likelier candidate for tools that cannot currently rely on kprobes.

> NOTE: i still accept the temporary (or non-temporary) introduction of 
> static markers, to help dynamic tracing. But my expectation is that 
> these markers will be less intrusive than static tracepoints, and a lot 
> more flexible.

Chalk one up for nice endorsement and another for arbitrary distinction.

Karim


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:32                                                     ` Karim Yaghmour
@ 2006-09-15 22:43                                                       ` Ingo Molnar
  2006-09-15 23:33                                                         ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 22:43 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Karim Yaghmour <karim@opersys.com> wrote:

> Ingo Molnar wrote:
> > that is not true at all. Yes, an INT3 based kprobe might be expensive if 
> > +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that 
> > is "only" an implementation detail, not a conceptual property. 
> > Especially considering that help (djprobes) is on the way. And in the 
> 
> djprobes has been "on the way" for some time now. Why don't you at 
> least have the intellectual honesty to use the same rules you've 
> repeatedly used against ltt elsewhere in this thread -- i.e. what it 
> does today is what it is, and what it does today isn't worth bragging 
> about. [...]

i actually think djprobes are pretty darn inventive. I also think that 
the tracebuffer management portion of LTT is better than the hacks in 
SystemTap, and that LTT's visualization tools are better (for example 
they do exist :-) - so clearly there's synergy possible. But i have no 
faith at all, for the many reasons outlined before, in the concept of 
static tracing, because i see no possible future path out of its many 
limitations and because i see no possible future way to get rid of their 
dependencies. So i'd rather wait some time for dynamic tracers to 
outgrow static tracers in even the last final area, than let static 
tracing into the kernel - which would add dependencies that we'd have to 
live with almost until eternity.

> But, sarcasm aside, even if this mechanism existed it still wouldn't 
> resolve the need for static markup. It would just make djprobe a 
> likelier candidate for tools that cannot currently rely on kprobes.

it would clearly reduce the number of places where static markup would 
still be necessary. With static tracers i see no such mechanism that 
gradually moves the markups out of the kernel.

> > NOTE: i still accept the temporary (or non-temporary) introduction 
> > of static markers, to help dynamic tracing. But my expectation is 
> > that these markers will be less intrusive than static tracepoints, 
> > and a lot more flexible.
> 
> Chalk one up for nice endorsement and another for arbitrary 
> distinction.

So you dispute that markups for dynamic tracing will be more flexible 
and you dispute that they will be less intrusive than markups for static 
tracing?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:43                                                       ` Ingo Molnar
@ 2006-09-15 23:33                                                         ` Karim Yaghmour
  2006-09-15 23:52                                                           ` Ingo Molnar
  2006-09-15 23:53                                                           ` Ingo Molnar
  0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 23:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar wrote:
> i actually think djprobes are pretty darn inventive.

So do I. While there is a language barrier, the Hitachi folks,
especially Hiramatsu-san, are very talented.

> I also think that 
> the tracebuffer management portion of LTT is better than the hacks in 
> SystemTap, and that LTT's visualization tools are better (for example 
> they do exist :-) - so clearly there's synergy possible.

Great, because I believe all those involved would like to see this
happen. I personally am convinced that none of those involved want
to continue wasting their time in parallel.

> But i have no 
> faith at all, for the many reasons outlined before, in the concept of 
> static tracing, because i see no possible future path out of its many 
> limitations and because i see no possible future way to get rid of their 
> dependencies.

Yes, I do so believe that this is what you most sincerely think. And
I'm ok with that. We don't have to approach the problem from the
same direction. In my view we should at least settle for working on
the most basic thing we *do* agree on: having a markup mechanism for
necessary instrumentation.

> So i'd rather wait some time for dynamic tracers to 
> outgrow static tracers in even the last final area, than let static 
> tracing into the kernel - which would add dependencies that we'd have to 
> live with almost until eternity.

I genuinely understand your concern. And I repeat that ltt's initial
design cared little of the provenance of the events. It just needed
key events to present an intelligent picture to the user. The
patches have since grown to include stuff which was essential as
development went ahead. But there's no reason things cannot be
refactored into an acceptable format to all by review on the lkml.

> it would clearly reduce the number of places where static markup would 
> still be necessary. With static tracers i see no such mechanism that 
> gradually moves the markups out of the kernel.

Again, I strongly believe that this issue isn't about static vs.
dynamic. The goal, and that's what's important, is to allow users
to have access to a set of tools they can use on *any* kernel
they get their hands on, without having to edit anything anywhere
or fix any script. Having spent considerable effort on this,
I don't see any other way than using static markup. Here's a
simple case: you ask someone who's got a bug report on a kernel
crashing because of his user-space realtime task, and you ask him
to dump you a trace, and that trace actually ends up misleading
because his out-of-tree instrumentation was inserted in the wrong
location.

Again, the goal is to obtain tools that users can use on *any*
kernel they get their hands on.

> So you dispute that markups for dynamic tracing will be more flexible 
> and you dispute that they will be less intrusive than markups for static 
> tracing?

No, I'm saying that the flexibility of the markup is not tied to the
instrumentation "grab" mechanism (direct call or binary editing.)
That's the "arbitrary" I'm talking about.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:33                                                         ` Karim Yaghmour
@ 2006-09-15 23:52                                                           ` Ingo Molnar
  2006-09-16  2:24                                                             ` Karim Yaghmour
  2006-09-15 23:53                                                           ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:52 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Karim Yaghmour <karim@opersys.com> wrote:

> > So you dispute that markups for dynamic tracing will be more 
> > flexible and you dispute that they will be less intrusive than 
> > markups for static tracing?
> 
> No, I'm saying that the flexibility of the markup is not tied to the 
> instrumentation "grab" mechanism (direct call or binary editing.) 
> That's the "arbitrary" I'm talking about.

ok, then i'd like to dispute your point. Contrary to your statement 
there is a very fundamental difference between "static tracing" (static 
call, which relies on compile-time insertion of trace points) and 
"dynamic tracing" (which can insert trace points almost anywhere) - 
_even if both use in-source markers_.

The fundamental difference is this: dynamic tracing has full access to 
the full environment of the code that it taps into _at the time of 
tracepoint activation_, while static tracing has to get all its context 
during compilation.

To make my point easier to understand, consider the following example: 
we want to tap into the middle of a global_function():

 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2();

         ... [lots of code] ...
 }

We want to trace the function right after 'x' has been assigned, and we 
want to trace an event_A, with parameters: arg1, arg2, arg3 and x. This 
is a pretty common scenario. Ok so far?

here is what the markup looks like under static tracing:

 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2();
         D(event_A, arg1, arg2, arg3, x);

         ... [lots of code] ...
 }

that's what you'd expect, right? This is pretty common too, up to this 
point.

now what could the markup look like for a dynamic tracepoint:

 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2();
         D(event_A, x);

         ... [lots of code] ...
 }

Note: there's no (arg1, arg2, arg3) passed to the markup! Why? Because 
SystemTap has full access to the function's arguments and in this 
particular case it's simply not necessary to reference them explicitly.
So the markup has less of an overhead because it does not 'touch' arg1,
arg2, arg3 if the tracepoint is not active [which is the common case we
optimize for].

Furthermore, the markup is also visually less intrusive.

But better than that, the markup could look like this as well:

 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2();

         ... [lots of code] ...
 }

right, no markup at all, but in a script somewhere we'd have:

  insert.trace(global_function: "x = func2();", after);

or maybe even in a script, annotated in patch format, so that the 
context of the tapped code is captured too.

so, as a result: the dynamic markup() does the same, but has less impact 
on the compiled code (fewer parameters touched), and is more flexible in 
terms of attachment to the source code.

Can we do any of this with the static tracepoint? We cannot, 
fundamentally! So if we allowed static tracers to access that tracepoint 
anytime, we could never make things more intelligent there in the 
future!

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:52                                                           ` Ingo Molnar
@ 2006-09-16  2:24                                                             ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16  2:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar wrote:
> ok, then i'd like to dispute your point. Contrary to your statement 
> there is a very fundamental difference between "static tracing" (static 
> call, which relies on compile-time insertion of trace points) and 
> "dynamic tracing" (which can insert trace points almost anywhere) - 
> _even if both use in-source markers_.

Good, a nice little down-to-earth debate for a change ;)

> The fundamental difference is this: dynamic tracing has full access to 
> the full environment of the code that it taps into _at the time of 
> tracepoint activation_, while static tracing has to get all its context 
> during compilation.

I disagree.

> To make my point easier to understand, consider the following example: 
> we want to tap into the middle of a global_function():
> 
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
> 
>          x = func2();
> 
>          ... [lots of code] ...
>  }
> 
> We want to trace the function right after 'x' has been assigned, and we 
> want to trace an event_A, with parameters: arg1, arg2, arg3 and x. This 
> is a pretty common scenario. Ok so far?

Ok so far.

> here is what the markup looks like under static tracing:
> 
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
> 
>          x = func2();
>          D(event_A, arg1, arg2, arg3, x);
> 
>          ... [lots of code] ...
>  }
> 
> that's what you'd expect, right? This is pretty common too, up to this 
> point.

No, that's not what I'd necessarily expect, though it could be and
definitely does match current standard practice. There's no reason,
though, that D(foo) couldn't call a statically-linked function which
has a pluggable interface (a module-overloadable symbol if you'd like)
and which can then do much more than initially fetch arg1-2-3 using,
as you alluded to earlier, built-in disassemblers and the like.

One nice thing about the above, though, is that you can easily have
type information at build time and can actually create customized
logging info right there. But this is just brain farting, more
substance below.

> now what could the markup look like for a dynamic tracepoint:
> 
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
> 
>          x = func2();
>          D(event_A, x);
> 
>          ... [lots of code] ...
>  }
> 
> Note: there's no (arg1, arg2, arg3) passed to the markup! Why? Because 
> SystemTap has full access to the function's arguments and in this 
> particular case it's simply not necessary to reference them explicitly.
> So the markup has less of an overhead because it does not 'touch' arg1,
> arg2, arg3 if the tracepoint is not active [which is the common case we
> optimize for].

Again, this does not have to be the case. D(arg1, ..., N) could actually
be defined to nothing in *ALL* cases in a header. Nothing precludes
having a special parser that only runs if tracing is enabled and then
generates a special header and corresponding C file which then have
what it takes to make these D() markups meaningful. So in this case,
the compiler never gives a damn about arg1-Z (i.e. no touch or
dependency or anything of the like), yet a compile-time option allows
you to suddenly make D(foo) turn into a SystemTap-usable probe point
or a direct call to a statically-linked function (which is what I refer
to as "static tracing".)
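
As an illustration only (not code from the LTTng or SystemTap patches),
the header side of that idea could be sketched as below. The names
ENABLE_STATIC_TRACE, trace_event_A() and trace_hits are hypothetical;
the generated-header and SystemTap probe-point variants are not shown,
only the "expands to nothing" and "direct call" cases.

```c
#include <stdio.h>

/* Hypothetical names throughout; none of this is from a real patch. */
static int trace_hits;          /* how many times the probe body ran */

/* The statically-linked function a marker calls when tracing is built in. */
void trace_event_A(int arg1, int arg2, int arg3, int x)
{
        trace_hits++;
        printf("event_A: %d %d %d %d\n", arg1, arg2, arg3, x);
}

#ifdef ENABLE_STATIC_TRACE
/* Tracing compiled in: the marker becomes a direct call. */
#define D(event, ...) trace_##event(__VA_ARGS__)
#else
/* Default: the marker expands to nothing, so its arguments are never
 * evaluated and the compiler sees no dependency on them. */
#define D(event, ...) do { } while (0)
#endif

static int func2(void)
{
        return 42;
}

int global_function(int arg1, int arg2, int arg3)
{
        int x = func2();

        D(event_A, arg1, arg2, arg3, x);
        return x;
}
```

Built without -DENABLE_STATIC_TRACE, global_function() contains no call
at all; built with it, the same source line becomes a call to
trace_event_A().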

> Furthermore, the markup is also visually less intrusive.

That's debatable. If you're going to mark something up, you might as
well state right away what's typically interesting about the event.
Sure, you could make the point that arg32 is something you may be
interested in in some cases, but if arg1-3 are the ones most relevant
99% of the time for this function, then you might as well say that
in your trace marker.

> But better than that, the markup could look like this as well:
> 
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
> 
>          x = func2();
> 
>          ... [lots of code] ...
>  }
> 
> right, no markup at all, but in a script somewhere we'd have:
> 
>   insert.trace(global_function: "x = func2();", after);

That's two files. If we're talking funky, and the following is
by no means an endorsement I'm making -- just showing you what
could be possible, then here's a better one:

Look ma, no hands:

 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2(); /*T* @here:arg1,arg2,arg3 */

         ... [lots of code] ...
 }

Now you can't say that's visually wrong: we've already got tons
of outdated comments in the code. And you can't say there's
entirely no precedent: kerneldoc. Yet, this can be used by a
build-time tool which automagically generates either information
for later use by probe inserters or, alternatively, substitutes
the default built file (say foo.c) with an equivalent (foo-trace.c)
which has inlined static tracing.
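
As an illustration only, the first step of such a build-time tool
(spotting the annotation and pulling out the argument list) could be
sketched as below. extract_trace_args() is a hypothetical helper, and
the only thing it assumes is the "@here:" comment syntax shown above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if 'line' carries a trace annotation of the form
 * shown above, copy the argument list that follows "@here:" into 'out'
 * and return 0. Return -1 if there is no annotation, the comment is not
 * closed on the same line, or 'out' is too small.
 */
int extract_trace_args(const char *line, char *out, size_t outlen)
{
        static const char tag[] = "/*T* @here:";
        const char *start = strstr(line, tag);
        const char *end;
        size_t len;

        if (!start || outlen == 0)
                return -1;
        start += sizeof(tag) - 1;       /* skip past the tag itself */

        end = strstr(start, "*/");
        if (!end)
                return -1;
        while (end > start && end[-1] == ' ')
                end--;                  /* trim blanks before the close */

        len = (size_t)(end - start);
        if (len >= outlen)
                return -1;
        memcpy(out, start, len);
        out[len] = '\0';
        return 0;
}
```

A real tool would go on to emit either probe-insertion metadata or the
foo-trace.c variant; this sketch stops at recognizing the marker.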

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:33                                                         ` Karim Yaghmour
  2006-09-15 23:52                                                           ` Ingo Molnar
@ 2006-09-15 23:53                                                           ` Ingo Molnar
  2006-09-16  2:51                                                             ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:53 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Karim Yaghmour <karim@opersys.com> wrote:

> > the tracebuffer management portion of LTT is better than the hacks 
> > in SystemTap, and that LTT's visualization tools are better (for 
> > example they do exist :-) - so clearly there's synergy possible.
> 
> Great, because I believe all those involved would like to see this 
> happen. I personally am convinced that none of those involved want to 
> continue wasting their time in parallel.

a reasonable compromise for me would be what i suggested a few mails 
ago:

 nor do i reject all of LTT: as i said before i like the tools, and i
 think its collection of trace events should be turned into systemtap
 markups and scripts. Furthermore, its ringbuffer implementation looks
 better. So as far as the user is concerned, LTT could (and should) live
 on with full capabilities, but with this crucial difference in how it
 interfaces to the kernel source code.

i.e. could you try to just give SystemTap a chance and attempt to 
integrate a portion of LTT with it ... that shares more of the 
infrastructure and we'd obviously only need "one" markup variant, and 
would have full markup (removal-) flexibility. I'll try to help djprobes 
as much as possible. Hm?

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:53                                                           ` Ingo Molnar
@ 2006-09-16  2:51                                                             ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16  2:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


Ingo Molnar wrote:
>  nor do i reject all of LTT: as i said before i like the tools, and i
>  think its collection of trace events should be turned into systemtap
>  markups and scripts. Furthermore, its ringbuffer implementation looks
>  better. So as far as the user is concerned, LTT could (and should) live
>  on with full capabilities, but with this crucial difference in how it
>  interfaces to the kernel source code.

The interface to the kernel source code can be worked on. I hope my
other email has demonstrated that.

> i.e. could you try to just give SystemTap a chance and attempt to 
> integrate a portion of LTT with it ... that shares more of the 
> infrastructure and we'd obviously only need "one" markup variant, and 
> would have full markup (removal-) flexibility. I'll try to help djprobes 
> as much as possible. Hm?

Preface: I have absolutely nothing against SystemTap. I did have a
bone with the way it was developed (behind closed-doors practically),
but I told the SystemTap people about this and end of story, we
moved on and I've had many enjoyable discussions with the SystemTap
team since. I just have a feeling that part of the team is proceeding
as if ltt was dead and buried. They'd like to interface with us --
at least I think -- but nobody dares to touch ltt with a 10-foot
pole because it's a political hot potato, i.e. for all they care, ltt
could be a liability for SystemTap because of all the fuss about it
amongst kernel developers. But that's my take, I could be entirely
wrong.

Now, on a technical level, SystemTap cannot currently be a substitute
for what the ltt patch provides, especially in terms of performance.
Maybe one day it will be a substitute, with djprobe and other stuff,
but it isn't *now*. Nevertheless, I'm all for encouraging a movement
in a common direction. And in that regard I think that there is
consensus both amongst the SystemTap team and within the ltt team
-- at least I think -- on having a common markers interface. This is
something we can definitely build on. Hopefully dispelling some of
the ltt fud and gathering some positive mantra for the ltt effort
on lkml can help ease people's fears about the possibility of
rubbing the kernel developers the wrong way.

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:03                                                   ` Ingo Molnar
  2006-09-15 22:32                                                     ` Karim Yaghmour
@ 2006-09-15 22:59                                                     ` Frank Ch. Eigler
  2006-09-15 23:40                                                       ` Karim Yaghmour
  2006-09-15 23:17                                                     ` Jose R. Santos
  2 siblings, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 22:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
	tglx, karim, Paul Mundt, Jes Sorensen, linux-kernel,
	Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
	ltt-dev, Michel Dagenais

Ingo Molnar <mingo@elte.hu> writes:

> [...]  NOTE: i still accept the temporary (or non-temporary)
> introduction of static markers, to help dynamic tracing. But my
> expectation is that these markers will be less intrusive than static
> tracepoints, and a lot more flexible.

It seems like an agreement on this is coming together.  You and Karim
may be in violent agreement, even if others haven't quite come around:

Let us design a static marker mechanism that can be coupled at run
time either to a dynamic system such as systemtap, or by a specialized
tracing system such as lttng (!).  Then "markers" === "static
instrumentation", for purposes of the kernel developer.  If the
markers are lightweight enough, then a distribution kernel can afford
keeping them compiled in.
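
As an illustration only, a user-space sketch of such a marker might
look like the following. All names (MARK, marker_attach(),
marker_detach(), example_probe()) are hypothetical, and it ignores
everything a kernel version would need around concurrency (e.g.
safely swapping the hook while other CPUs are running through it).

```c
#include <stddef.h>

/* Hypothetical marker: one hook slot that a tracer fills in at run time. */
typedef void (*marker_probe_t)(const char *name, long value);

static marker_probe_t marker_probe;     /* NULL while no tracer is attached */

/* The marker itself: a load and a test when idle, a call when attached. */
#define MARK(name, value)                               \
        do {                                            \
                marker_probe_t p = marker_probe;        \
                if (p)                                  \
                        p(name, (long)(value));         \
        } while (0)

/* Either kind of tracer couples to the marker through these two calls. */
void marker_attach(marker_probe_t probe) { marker_probe = probe; }
void marker_detach(void) { marker_probe = NULL; }

/* Example consumer standing in for a tracer's probe handler. */
static long last_value;
static int calls;

void example_probe(const char *name, long value)
{
        (void)name;
        last_value = value;
        calls++;
}

int traced_work(int x)
{
        int y = x * 2;

        MARK("work_done", y);
        return y;
}
```

The marker site stays the same whichever tracer attaches; only the
function stored in the hook changes.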

- FChE


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:59                                                     ` Frank Ch. Eigler
@ 2006-09-15 23:40                                                       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 23:40 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Ingo Molnar, Jose R. Santos, Mathieu Desnoyers, Roman Zippel,
	Andrew Morton, tglx, Paul Mundt, Jes Sorensen, linux-kernel,
	Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
	ltt-dev, Michel Dagenais


Frank Ch. Eigler wrote:
> Let us design a static marker mechanism that can be coupled at run
> time either to a dynamic system such as systemtap, or by a specialized
> tracing system such as lttng (!).  Then "markers" === "static
> instrumentation", for purposes of the kernel developer.  If the
> markers are lightweight enough, then a distribution kernel can afford
> keeping them compiled in.

I'm all for it.

Karim



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:03                                                   ` Ingo Molnar
  2006-09-15 22:32                                                     ` Karim Yaghmour
  2006-09-15 22:59                                                     ` Frank Ch. Eigler
@ 2006-09-15 23:17                                                     ` Jose R. Santos
  2 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 23:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar wrote:
> * Jose R. Santos <jrs@us.ibm.com> wrote:
>
> > [...]  While it is true that static probes will provide less overhead 
> > compared to dynamic probes, [...]
>
> that is not true at all. Yes, an INT3 based kprobe might be expensive if 
> +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that 
> is "only" an implementation detail, not a conceptual property. 
> Especially considering that help (djprobes) is on the way. And in the 
> future, as more and more code gets generated (and regenerated) on the 
> fly, dynamic probes will be _faster_ than static probes - plainly 
> because they adapt better to the environment they plug into.
>   
Agree.  And they are details that can be fixed.

One such detail we still see issues with, though, is kretprobes (which
we use in LKET for system call exit).  These have problems scaling due
to spinlock issues, even on small SMP systems.  It's an implementation
issue that can be fixed, but I've been told that the fix is not trivial
and that we should not expect it anytime soon.

> so there's basically nothing to balance. My point is that dynamic probes 
> have won or will win on every front, and we shouldnt tie us down with 
> static tracers. 5 years ago with no kprobes, had someone submitted a 
> clean static tracer patchset, we could probably not have resisted it (i 
> though probably would have resisted it on the grounds of maintenance 
> overhead) and would have added it because tracing makes sense in 
> general. But today there's just no reason to add static tracers anymore.
>
> NOTE: i still accept the temporary (or non-temporary) introduction of 
> static markers, to help dynamic tracing. But my expectation is that 
> these markers will be less intrusive than static tracepoints, and a lot 
> more flexible.
>   
Agree here as well.  Sorry, I was counting static markers as static
tracepoints as well.  Even with static markers, there needs to be a
balance between the things that need to be implemented with markers and
those that can just be done dynamically.

-JRS


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:22                                               ` Mathieu Desnoyers
  2006-09-15 21:08                                                 ` Jose R. Santos
@ 2006-09-15 21:32                                                 ` Ingo Molnar
  2006-09-15 21:58                                                   ` Mathieu Desnoyers
  2006-09-16  9:59                                                 ` Jes Sorensen
  2 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> * Ingo Molnar (mingo@elte.hu) wrote:
> > sorry, but i disagree. There _is_ a solution that is superior in every 
> > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> > 
> 
> I am sorry to have to repeat myself, but this is not true for heavy 
> loads.

djprobes?

> > > At this point you've been rather uncompromising [...]
> > 
> > yes, i'm rather uncompromising when i sense attempts to push inferior 
> > concepts into the core kernel _when_ a better concept exists here and 
> > today. Especially if the concept being pushed adds more than 350 
> > tracepoints that expose something to user-space that amounts to a 
> > complex external API, which tracepoints we have little chance of ever 
> > getting rid of under a static tracing concept.
> > 
> From an earlier email from Tim Bird:
> 
> "I still think that this is off-topic for the patch posted.  I think 
> we should debate the implementation of tracepoints/markers when 
> someone posts a patch for some.  I think it's rather scurrilous to 
> complain about code NOT submitted.  Ingo has even mis-characterized 
> the not-submitted instrumentation patch, by saying it has 350 
> tracepoints when it has no such thing.  I counted 58 for one 
> architecture (with only 8 being arch-specific)."

i missed that (way too many mails in this thread).

Here is how i counted them:

 $ grep "\<trace_.*(" * | wc -l
 359

some of those are not true tracepoints, but there's at least this many 
of them:

 $ grep "\<trace_.*(" *instrumentation* | wc -l
 235

so the real number is somewhere between.

 patch-2.6.17-lttng-0.5.108-instrumentation-arm.diff
 patch-2.6.17-lttng-0.5.108-instrumentation.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-i386.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-mips.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-powerpc.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-ppc.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-s390.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-sh.diff
 patch-2.6.17-lttng-0.5.108-instrumentation-x86_64.diff

when judging kernel maintenance overhead, the sum of all patches 
matters. And i considered all the other patches too (the ones that add 
actual tracepoints) that will come after the currently offered ones, not 
just the ones you submitted to lkml.

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:32                                                 ` Ingo Molnar
@ 2006-09-15 21:58                                                   ` Mathieu Desnoyers
  2006-09-15 22:19                                                     ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 21:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > * Ingo Molnar (mingo@elte.hu) wrote:
> > > sorry, but i disagree. There _is_ a solution that is superior in every 
> > > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> > > 
> > 
> > I am sorry to have to repeat myself, but this is not true for heavy 
> > loads.
> 
> djprobes?
> 

I am fully aware of djprobes' limitations on a fully preemptible kernel (and
around branch instructions? I don't remember if they solved this one). Oh,
yes, and if a trap happens to come at the wrong spot, then the thread gets
scheduled out... well, it cannot be applied everywhere, eh?

> > > > At this point you've been rather uncompromising [...]
> > > 
> > > yes, i'm rather uncompromising when i sense attempts to push inferior 
> > > concepts into the core kernel _when_ a better concept exists here and 
> > > today. Especially if the concept being pushed adds more than 350 
> > > tracepoints that expose something to user-space that amounts to a 
> > > complex external API, which tracepoints we have little chance of ever 
> > > getting rid of under a static tracing concept.
> > > 
> > From an earlier email from Tim Bird:
> > 
> > "I still think that this is off-topic for the patch posted.  I think 
> > we should debate the implementation of tracepoints/markers when 
> > someone posts a patch for some.  I think it's rather scurrilous to 
> > complain about code NOT submitted.  Ingo has even mis-characterized 
> > the not-submitted instrumentation patch, by saying it has 350 
> > tracepoints when it has no such thing.  I counted 58 for one 
> > architecture (with only 8 being arch-specific)."
> 
> i missed that (way too many mails in this thread).
> 
> Here is how i counted them:
> 
>  $ grep "\<trace_.*(" * | wc -l
>  359
> 

This count includes the inline trace function definitions.

> some of those are not true tracepoints, but there's at least this many 
> of them:
> 
>  $ grep "\<trace_.*(" *instrumentation* | wc -l
>  235
> 

1 - This counts per-architecture trace points. It quickly adds up considering
that we support ARM, MIPS, i386, powerpc, ppc and x86_64.
2 - It also counts some experimental trace points that I do not want to submit.
3 - Most of these are instrumentation of the trap handlers, which is
conceptually only one event.

> when judging kernel maintenance overhead, the sum of all patches 
> matters. And i considered all the other patches too (the ones that add 
> actual tracepoints) that will come after the currently offered ones, not 
> just the ones you submitted to lkml.
> 

I plan to rework the instrumentation patches before submitting them to LKML,
don't worry. It just hasn't been my focus until now. Too bad that you take those
as arguments.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:58                                                   ` Mathieu Desnoyers
@ 2006-09-15 22:19                                                     ` Ingo Molnar
  2006-09-15 22:45                                                       ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 22:19 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> > > > sorry, but i disagree. There _is_ a solution that is superior in 
> > > > every aspect: kprobes + SystemTap. (or any other equivalent 
> > > > dynamic tracer)
> > > > 
> > > 
> > > I am sorry to have to repeat myself, but this is not true for 
> > > heavy loads.
> > 
> > djprobes?
> > 
> 
> I am fully aware of djprobes limitations towards fully preemptible 
> kernel [...]

i dont see any fundamental limitation with a preemptible kernel. 
(preemptability was never a showstopper for any kernel feature in the 
past, and i dont expect it to be a showstopper for anything in the 
future either.)

> [...] (and around branches instructions ? I don't remember if they 
> solved this one). Oh, yes, and if a trap happen to come at the wrong 
> spot, then the thread gets scheduled out... well, it cannot be applied 
> everywhere, eh ?

i expect the number of places where dynamic tracers have problems to 
gradually shrink. It has shrunk significantly already. Hence i'm 
supportive of static markers (as i stated it numerous times), as long as 
it's there to ease dynamic probing - _and as long as these static 
markers shrink in number as the capabilities of dynamic tracers 
improve_. With static tracers i just dont see that possibility: a static 
tracer needs all its static tracepoints forever or otherwise it just 
wont work.

> >  $ grep "\<trace_.*(" * | wc -l
> >  359
> > 
> 
> This count includes the inline trace function definitions.

yes, as i stated:

> > some of those are not true tracepoints, but there's at least this many 
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > of them:
> > 
> >  $ grep "\<trace_.*(" *instrumentation* | wc -l
> >  235
> > 
> 
> 1 - This counts per architecture trace points. It quickly adds up 
> considering that we support ARM, MIPS, i386, powerpc, ppc and x86_64.

yes. That's my point: overhead of static tracepoints "quickly adds up". 
The cost goes up linearly, as you grow into more subsystems and into 
more architectures.

btw., an observation: that's 6 LTT architectures in 7 years, while 
kprobes are now on 5 architectures in 2 years.

> 2 - It also counts some experimental trace points that I do not want 
> to submit.
> 3 - Most of these are instrumentation of the trap handlers, which is 
> conceptually only one event.

i counted the number of tracepoints, not the number of unique types of 
events, because:

> > when judging kernel maintenance overhead, the sum of all patches 
> > matters. And i considered all the other patches too (the ones that 
> > add actual tracepoints) that will come after the currently offered 
> > ones, not just the ones you submitted to lkml.
> 
> I plan to rework the instrumentation patches before submitting them to 
> LKML, don't worry. It just hasn't been my focus until now. Too bad that 
> you take those as arguments.

the static tracer patches make little sense without instrumentation, so 
sure i considered them. I also clearly declared that you didnt submit 
them yet:

>>> Let me quote from the latest LTT patch (patch-2.6.17-lttng-0.5.108, 
>>> which is the same version submitted to lkml - although no specific 
                                                  ^^^^^^^^^^^^^^^^^^^^
>>> tracepoints were submitted):
    ^^^^^^^^^^^^^^^^^^^^^^^^^^

	Ingo


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:19                                                     ` Ingo Molnar
@ 2006-09-15 22:45                                                       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, Paul Mundt,
	Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


Ingo Molnar wrote:
> btw., an observation: that's 6 LTT architectures in 7 years, while 
> kprobes are now on 5 architectures in 2 years.

Actually much of ltt underwent a complete rewrite since Mathieu took
over maintainership. Let's see: according to this email, Mathieu became
the maintainer in November 2005:
http://www.listserv.shafik.org/pipermail/ltt-dev/2005-November/001092.html

[ Karim takes out calculator and punches: 10/12 = 0.83 ]

So that's 7 architectures in 0.83 years, compared to 5 in 2 years.

Joke's on you, pal.

Karim



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:22                                               ` Mathieu Desnoyers
  2006-09-15 21:08                                                 ` Jose R. Santos
  2006-09-15 21:32                                                 ` Ingo Molnar
@ 2006-09-16  9:59                                                 ` Jes Sorensen
  2006-09-16 17:24                                                   ` Mathieu Desnoyers
  2006-09-16 17:30                                                   ` Mathieu Desnoyers
  2 siblings, 2 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16  9:59 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> Please Ingo, stop repeating false arguments without taking into account people's
> corrections:
> 
> * Ingo Molnar (mingo@elte.hu) wrote:
>> sorry, but i disagree. There _is_ a solution that is superior in every 
>> aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
>>
> I am sorry to have to repeat myself, but this is not true for heavy loads.

Alan pointed out earlier in the thread that the actual kprobe is noise
in this context, and I have seen similar issues on real workloads. Yes
kprobes are probably a little higher overhead in real life, but you have
to weigh that up against the rest of the system load.

If you want to prove people wrong, I suggest you do some real life
implementation and measure some real workloads with a predefined set of
tracepoints implemented using kprobes and LTT and show us that the
benchmark of the user application suffers in a way that can actually be
measured. Arguing that a syscall takes an extra 50 instructions
because it's traced using kprobes rather than LTT doesn't mean it
actually has any real impact.

"The 'kprobes' are too high overhead that makes them unusable" is one of
these classic myths that the static tracepoint advocates so far have
only been backing up with rhetoric. Give us some hard evidence or stop
repeating this argument please. Just because something is repeated
constantly doesn't transform it into truth.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  9:59                                                 ` Jes Sorensen
@ 2006-09-16 17:24                                                   ` Mathieu Desnoyers
  2006-09-16 17:35                                                     ` Ingo Molnar
                                                                       ` (2 more replies)
  2006-09-16 17:30                                                   ` Mathieu Desnoyers
  1 sibling, 3 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:24 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen (jes@sgi.com) wrote:
> Mathieu Desnoyers wrote:
> >Please Ingo, stop repeating false arguments without taking into account 
> >people's
> >corrections:
> >
> >* Ingo Molnar (mingo@elte.hu) wrote:
> >>sorry, but i disagree. There _is_ a solution that is superior in every 
> >>aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> >>
> >I am sorry to have to repeat myself, but this is not true for heavy loads.
> 
> Alan pointed out earlier in the thread that the actual kprobe is noise
> in this context, and I have seen similar issues on real workloads. Yes
> kprobes are probably a little higher overhead in real life, but you have
> to weigh that up against the rest of the system load.
> 
> If you want to prove people wrong, I suggest you do some real life
> implementation and measure some real workloads with a predefined set of
> tracepoints implemented using kprobes and LTT and show us that the
> benchmark of the user application suffers in a way that can actually be
> measured. Arguing that a syscall takes an extra 50 instructions
> because it's traced using kprobes rather than LTT doesn't mean it
> actually has any real impact.
> 
> "The 'kprobes' are too high overhead that makes them unusable" is one of
> these classic myths that the static tracepoint advocates so far have
> only been backing up with rhetoric. Give us some hard evidence or stop
> repeating this argument please. Just because something is repeated
> constantly doesn't transform it into truth.
> 

Hi,

Here we go. I made a test that we can consider a lower bound for kprobes impact.
Two tests per run.

Simulation of high speed network traffic :

time ping -f localhost

First run : without any tracing activated, LTTng probes compiled in :

39457 packets received in 2.021 seconds : 19523.50 packets/s
142672 packets received in 7.237 seconds : 19714.24 packets/s

Second run : LTTng tracing activated (traces system calls, interrupts and
packet in/out...) :

93051 packets received in 7.395 seconds : 12582.96 packets/s
121585 packets received in 9.703 seconds : 12530.66 packets/s


Third run : same LTTng instrumentation, with a kprobe handler triggered by each
event traced.

56643 packets received in 11.152 seconds : 5079.17 packets/s
50150 packets received in 9.593 seconds : 5227.77 packets/s


The bottom line is :

LTTng impact on the studied phenomenon : 35% slower

LTTng+kprobes impact on the studied phenomenon : 73% slower

Therefore, I conclude that on this type of high event rate workload, kprobes
doubles the tracer impact on the system.
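
[ For anyone checking the arithmetic: the percentages can be reproduced from
the packets/s figures listed above. The following is only a minimal sketch,
assuming the two samples of each run are simply averaged; the post does not
say how the percentages were derived, and the function names are
illustrative. ]

```c
#include <assert.h>

/* Mean of the two throughput samples of one run, in packets per second. */
double mean2(double a, double b)
{
	return (a + b) / 2.0;
}

/*
 * Slowdown of a traced run relative to the untraced baseline, in percent.
 * Both arguments are throughputs in packets per second.
 */
double slowdown_pct(double baseline_pps, double traced_pps)
{
	assert(baseline_pps > 0.0);
	return (1.0 - traced_pps / baseline_pps) * 100.0;
}
```

Fed with the numbers above (19523.50/19714.24 untraced, 12582.96/12530.66
with LTTng, 5079.17/5227.77 with LTTng plus the kprobe), this gives about
36.0% and 73.7%, in line with the 35% and 73% quoted, and a ratio of
roughly two between them.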

Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:24                                                   ` Mathieu Desnoyers
@ 2006-09-16 17:35                                                     ` Ingo Molnar
  2006-09-16 17:56                                                       ` Mathieu Desnoyers
  2006-09-16 18:11                                                       ` Karim Yaghmour
  2006-09-16 17:55                                                     ` Karim Yaghmour
  2006-09-18  8:33                                                     ` Jes Sorensen
  2 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 17:35 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Third run : same LTTng instrumentation, with a kprobe handler 
> triggered by each event traced.

where exactly did you put the kprobe handler?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:35                                                     ` Ingo Molnar
@ 2006-09-16 17:56                                                       ` Mathieu Desnoyers
  2006-09-16 19:10                                                         ` Ingo Molnar
  2006-09-16 23:40                                                         ` Ingo Molnar
  2006-09-16 18:11                                                       ` Karim Yaghmour
  1 sibling, 2 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > Third run : same LTTng instrumentation, with a kprobe handler 
> > triggered by each event traced.
> 
> where exactly did you put the kprobe handler?

ltt_relay_reserve_slot.

See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert the kprobe.
Tests done on LTTng 0.5.111, on an x86 3GHz with hyperthreading.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:56                                                       ` Mathieu Desnoyers
@ 2006-09-16 19:10                                                         ` Ingo Molnar
  2006-09-16 19:37                                                           ` Ingo Molnar
  2006-09-16 19:51                                                           ` Karim Yaghmour
  2006-09-16 23:40                                                         ` Ingo Molnar
  1 sibling, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 19:10 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert 
> the kprobe. Tests done on LTTng 0.5.111, on an x86 3GHz with 
> hyperthreading.

i have done a bit of kprobes and djprobes testing on a 2160 MHz Athlon64 
CPU, UP. I have tested 2 types of almost-NOP tracepoints (on 2.6.17), 
where the probe function only increases a counter:

 static int counter;

 static void probe_func(struct djprobe *djp, struct pt_regs *regs)
 {
         counter++;
 }

and have measured the overhead of an unmodified, kprobes-probed and 
djprobes-probed sys_getpid() system-call:

 sys_getpid() unmodified latency:    317 cycles   [ 0.146 usecs ]
 sys_getpid() kprobes latency:       815 cycles   [ 0.377 usecs ]
 sys_getpid() djprobes latency:      380 cycles   [ 0.176 usecs ]

i.e. the kprobes overhead is +498 cycles (+0.231 usecs), the djprobes 
overhead is +63 cycles (+0.029 usecs).
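
[ The conversion behind these figures, as a minimal sketch: it assumes the
2160 MHz clock stated above and nothing else, and the helper names are
illustrative, not part of any kernel API. ]

```c
#include <assert.h>

/* Convert a cycle count to microseconds for a CPU clocked at cpu_mhz. */
double cycles_to_usecs(unsigned long cycles, double cpu_mhz)
{
	assert(cpu_mhz > 0.0);
	/* 1 MHz is one cycle per microsecond, so cycles / MHz yields usecs. */
	return (double)cycles / cpu_mhz;
}

/* Extra cycles a probed call costs over the same call without a probe. */
long probe_overhead_cycles(unsigned long probed, unsigned long unprobed)
{
	return (long)probed - (long)unprobed;
}
```

probe_overhead_cycles(815, 317) is 498 and probe_overhead_cycles(380, 317)
is 63; cycles_to_usecs(815, 2160.0) comes out at about 0.377 usecs.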

what do these numbers tell us? Firstly, on this CPU the kprobes overhead 
is not 1000-2000 cycles but 500 cycles. Secondly, if that's not fast 
enough, the "next-gen kprobes" code, djprobes, has a really small 
overhead of 63 cycles.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:10                                                         ` Ingo Molnar
@ 2006-09-16 19:37                                                           ` Ingo Molnar
  2006-09-17 10:13                                                             ` Frederik Deweerdt
  2006-09-16 19:51                                                           ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 19:37 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Ingo Molnar <mingo@elte.hu> wrote:

> and have measured the overhead of an unmodified, kprobes-probed and 
> djprobes-probed sys_getpid() system-call:
> 
>  sys_getpid() unmodified latency:    317 cycles   [ 0.146 usecs ]
>  sys_getpid() kprobes latency:       815 cycles   [ 0.377 usecs ]
>  sys_getpid() djprobes latency:      380 cycles   [ 0.176 usecs ]

i have taken a look at the kprobes fastpath, and there are a few things 
we can do to speed it up. The patch below shaves off 75 cycles from the 
kprobes overhead:

   sys_getpid() kprobes-speedup:       740 cycles   [ 0.342 usecs ]

that reduces the kprobes overhead to 423 cycles.
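
[ A minimal sketch of how the remaining overhead and the relative saving
follow from the latencies above; the helper names are illustrative and the
relative figure is derived here, not quoted from the measurements. ]

```c
#include <assert.h>

/* Probe overhead in cycles: probed latency minus unprobed latency. */
long overhead_cycles(long probed, long unprobed)
{
	return probed - unprobed;
}

/* Share of the original probe overhead removed by a speedup, in percent. */
double overhead_saved_pct(long unprobed, long before, long after)
{
	long old_overhead = overhead_cycles(before, unprobed);

	assert(old_overhead > 0);
	return 100.0 * (double)(before - after) / (double)old_overhead;
}
```

overhead_cycles(740, 317) gives the 423 cycles mentioned above, and
overhead_saved_pct(317, 815, 740) is about 15%, i.e. 75 of the original
498 cycles.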

	Ingo

--------------->
Subject: [patch] kprobes: speed INT3 trap handling up on i386
From: Ingo Molnar <mingo@elte.hu>

speed up kprobes trap handling by special-casing kernel-space
INT3 traps (which do not occur otherwise) and doing a kprobes
handler check - instead of redirecting over the i386-die-notifier
chain.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/i386/kernel/kprobes.c |    2 +-
 arch/i386/kernel/traps.c   |   19 ++++++++++++-------
 include/asm-i386/kprobes.h |    2 ++
 3 files changed, 15 insertions(+), 8 deletions(-)

Index: linux/arch/i386/kernel/kprobes.c
===================================================================
--- linux.orig/arch/i386/kernel/kprobes.c
+++ linux/arch/i386/kernel/kprobes.c
@@ -200,7 +200,7 @@ void __kprobes arch_prepare_kretprobe(st
  * Interrupts are disabled on entry as trap3 is an interrupt gate and they
  * remain disabled thorough out this function.
  */
-static int __kprobes kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_handler(struct pt_regs *regs)
 {
 	struct kprobe *p;
 	int ret = 0;
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -802,13 +802,18 @@ EXPORT_SYMBOL_GPL(unset_nmi_callback);
 #ifdef CONFIG_KPROBES
 fastcall void __kprobes do_int3(struct pt_regs *regs, long error_code)
 {
-	if (notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP)
-			== NOTIFY_STOP)
-		return;
-	/* This is an interrupt gate, because kprobes wants interrupts
-	disabled.  Normal trap handlers don't. */
-	restore_interrupts(regs);
-	do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+	/*
+	 * kernel-mode INT3s are likely kprobes:
+	 */
+        if (!user_mode(regs)) {
+                if (kprobe_handler(regs))
+			return;
+		/* This is an interrupt gate, because kprobes wants interrupts
+		disabled.  Normal trap handlers don't. */
+		restore_interrupts(regs);
+		do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+	}
+	notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP);
 }
 #endif
 
Index: linux/include/asm-i386/kprobes.h
===================================================================
--- linux.orig/include/asm-i386/kprobes.h
+++ linux/include/asm-i386/kprobes.h
@@ -88,4 +88,6 @@ static inline void restore_interrupts(st
 
 extern int kprobe_exceptions_notify(struct notifier_block *self,
 				    unsigned long val, void *data);
+extern int kprobe_handler(struct pt_regs *regs);
+
 #endif				/* _ASM_KPROBES_H */

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:37                                                           ` Ingo Molnar
@ 2006-09-17 10:13                                                             ` Frederik Deweerdt
  2006-09-17 14:00                                                               ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Frederik Deweerdt @ 2006-09-17 10:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Sat, Sep 16, 2006 at 09:37:45PM +0200, Ingo Molnar wrote:
> --------------->
> Subject: [patch] kprobes: speed INT3 trap handling up on i386
> From: Ingo Molnar <mingo@elte.hu>
> 
> speed up kprobes trap handling by special-casing kernel-space
> INT3 traps (which do not occur otherwise) and doing a kprobes
> handler check - instead of redirecting over the i386-die-notifier
> chain.
> 
Hi Ingo,

Not that it would make any difference to the actual kprobe performance,
but I think that not using the die-notifier chain makes the DIE_INT3
handling in kprobe_exceptions_notify() useless.

Regards,
Frederik


Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

diff --git a/arch/i386/kernel/kprobes.c b/arch/i386/kernel/kprobes.c
index afe6505..90787ff 100644
--- a/arch/i386/kernel/kprobes.c
+++ b/arch/i386/kernel/kprobes.c
@@ -652,10 +652,6 @@ int __kprobes kprobe_exceptions_notify(s
 		return ret;
 
 	switch (val) {
-	case DIE_INT3:
-		if (kprobe_handler(args->regs))
-			ret = NOTIFY_STOP;
-		break;
 	case DIE_DEBUG:
 		if (post_kprobe_handler(args->regs))
 			ret = NOTIFY_STOP;

^ permalink raw reply related	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 10:13                                                             ` Frederik Deweerdt
@ 2006-09-17 14:00                                                               ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 14:00 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Frederik Deweerdt <deweerdt@free.fr> wrote:

> On Sat, Sep 16, 2006 at 09:37:45PM +0200, Ingo Molnar wrote:
> > --------------->
> > Subject: [patch] kprobes: speed INT3 trap handling up on i386
> > From: Ingo Molnar <mingo@elte.hu>
> > 
> > speed up kprobes trap handling by special-casing kernel-space
> > INT3 traps (which do not occur otherwise) and doing a kprobes
> > handler check - instead of redirecting over the i386-die-notifier
> > chain.
> > 
> Hi Ingo,
> 
> Not that it would make any difference to the actual kprobe 
> performance, but I think that not using the die-notifier chain makes 
> the DIE_INT3 handling in kprobe_exceptions_notify() useless.

yeah, indeed - i'll add your patch to the kprobes patchset.

	Ingo


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:10                                                         ` Ingo Molnar
  2006-09-16 19:37                                                           ` Ingo Molnar
@ 2006-09-16 19:51                                                           ` Karim Yaghmour
  1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 19:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> i have done a bit of kprobes and djprobes testing on a 2160 MHz Athlon64 
> CPU, UP. I have tested 2 types of almost-NOP tracepoints (on 2.6.17), 
> where the probe function only increases a counter:
> 
>  static int counter;
> 
>  static void probe_func(struct djprobe *djp, struct pt_regs *regs)
>  {
>          counter++;
>  }
> 
> and have measured the overhead of an unmodified, kprobes-probed and 
> djprobes-probed sys_getpid() system-call:
> 
>  sys_getpid() unmodified latency:    317 cycles   [ 0.146 usecs ]
>  sys_getpid() kprobes latency:       815 cycles   [ 0.377 usecs ]
>  sys_getpid() djprobes latency:      380 cycles   [ 0.176 usecs ]
> 
> i.e. the kprobes overhead is +498 cycles (+0.231 usecs), the djprobes 
> overhead is +63 cycles (+0.029 usecs).

But that's an entirely hypothetical benchmark. Mathieu was asked for
real-workload benchmarks and he gave you those. In turn, you set up
a simplistic test and then go on to conclude that the measurements
are far less than advertised. You ask that ltt replace its static
instrumentation by what kprobes provides and Mathieu demonstrated
that that's not realistic. If you want to change his mind, at least
reproduce the exact information ltt can provide and then we'll
talk.

> what do these numbers tell us? Firstly, on this CPU the kprobes overhead 
> is not 1000-2000 cycles but 500 cycles. Secondly, if that's not fast 
> enough, the "next-gen kprobes" code, djprobes, has a really small 
> overhead of 63 cycles.

But djprobe isn't even here yet. If you insist on keeping ltt's
_current_ limitations as your single most powerful justification to
reject it, how can you hold kprobes to a different standard with a
straight face? You're only perpetuating the fallacy found
throughout this thread that somehow the shortcomings of dynamic
editing are "easy" to fix while those of static instrumentation are
inherently unrecoverable. That's just plain not true, as I've
demonstrated now countless times in this thread.

And please Ingo, I'm still waiting for your feedback on the static
markup mechanism I proposed earlier. I believe it avoids every
single problem you alluded to with regards to the problems generated
by inline markup.

Thanks,

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:56                                                       ` Mathieu Desnoyers
  2006-09-16 19:10                                                         ` Ingo Molnar
@ 2006-09-16 23:40                                                         ` Ingo Molnar
  2006-09-17  5:33                                                           ` Mathieu Desnoyers
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:40 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> > > Third run : same LTTng instrumentation, with a kprobe handler 
> > > triggered by each event traced.
> > 
> > where exactly did you put the kprobe handler?
> 
> ltt_relay_reserve_slot.
> 
> See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert 
> the kprobe. Tests done on LTTng 0.5.111, on an x86 3GHz with 
> hyperthreading.

ok. In what way did you enable LTTng instrumentation? I have 0.5.108 
installed, and i'd like to make sure i do everything as you did, to make 
the tests comparable. Which kernel config options (default ones?), and 
what precise lttctl commands did you use, were they the usual:

  lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace

? What filesystem does /tmp/trace reside on?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 23:40                                                         ` Ingo Molnar
@ 2006-09-17  5:33                                                           ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-17  5:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > > > Third run : same LTTng instrumentation, with a kprobe handler 
> > > > triggered by each event traced.
> > > 
> > > where exactly did you put the kprobe handler?
> > 
> > ltt_relay_reserve_slot.
> > 
> > See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert 
> > the kprobe. Tests done on LTTng 0.5.111, on an x86 3GHz with 
> > hyperthreading.
> 
> ok. In what way did you enable LTTng instrumentation? I have 0.5.108 
> installed, and i'd like to make sure i do everything as you did, to make 
> the tests comparable. Which kernel config options (default ones?), and 
> what precise lttctl commands did you use, were they the usual:
> 
>   lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace
> 
> ? What filesystem does /tmp/trace reside on?
> 

I used LTTng 0.5.111 (yes, now with debugfs!) ;).

I ran the tests on a Pentium 4 3 GHz, with hyperthreading enabled. The system
has 1GB of RAM. Hard disk : WDC WD1600JD-00H. File system : ext3.

The kernel (2.6.17) is configured with SMP enabled.

Relevant kernel config :

CONFIG_LTT=y
CONFIG_LTT_TRACER=m
CONFIG_LTT_RELAY=m
CONFIG_LTT_ALIGNMENT=y
CONFIG_LTT_HEARTBEAT=y
CONFIG_LTT_HEARTBEAT_EVENT=y
# CONFIG_LTT_SYNTHETIC_TSC is not set
CONFIG_LTT_USERSPACE_GENERIC=y
CONFIG_LTT_NETLINK_CONTROL=m
CONFIG_LTT_STATEDUMP=m
CONFIG_LTT_FACILITY_CORE=y
CONFIG_LTT_FACILITY_FS=y
CONFIG_LTT_FACILITY_FS_DATA=y
CONFIG_LTT_FACILITY_IPC=y
CONFIG_LTT_FACILITY_KERNEL=y
CONFIG_LTT_FACILITY_KERNEL_ARCH=y
# CONFIG_LTT_FACILITY_LOCKING is not set
CONFIG_LTT_FACILITY_MEMORY=y
CONFIG_LTT_FACILITY_NETWORK=y
CONFIG_LTT_FACILITY_NETWORK_IP_INTERFACE=y
CONFIG_LTT_FACILITY_PROCESS=y
CONFIG_LTT_FACILITY_SOCKET=y
CONFIG_LTT_FACILITY_STATEDUMP=y
CONFIG_LTT_FACILITY_TIMER=y
CONFIG_LTT_FACILITY_STACK=y
CONFIG_LTT_PROCESS_STACK=y
CONFIG_LTT_PROCESS_MAX_FUNCTION_STACK=100
CONFIG_LTT_PROCESS_MAX_STACK_LEN=250
CONFIG_LTT_KERNEL_STACK=y
CONFIG_LTT_STACK_SYSCALL=y
CONFIG_LTT_STACK_INTERRUPT=y
CONFIG_LTT_STACK_NMI=y

Huge note : I left CONFIG_LTT_FACILITY_STACK enabled, but THIS IS EXPERIMENTAL.

lttctl commands :

Start tracing :
lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace1
(note : 0.5.111 uses debugfs, 0.5.108 uses relayfs)

Stop tracing :
lttctl -n trace -R

See http://ltt.polymtl.ca > QUICKSTART for other details (modules to load...)


Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:35                                                     ` Ingo Molnar
  2006-09-16 17:56                                                       ` Mathieu Desnoyers
@ 2006-09-16 18:11                                                       ` Karim Yaghmour
  2006-09-16 17:44                                                         ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 18:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


Ingo Molnar wrote:
> where exactly did you put the kprobe handler?

So location matters, huh? If you're keen to ask this question,
then it might be worth asking why should non-experts be
trusted with keeping instrumentation pertinent out of tree.

[ I know you've said that you acknowledge the need for static
markup. I'm just highlighting a fact substantiating the
position I stated to you in my response late last evening. ]

Thanks,

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 18:11                                                       ` Karim Yaghmour
@ 2006-09-16 17:44                                                         ` Ingo Molnar
  2006-09-16 18:15                                                           ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 17:44 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Karim Yaghmour <karim@opersys.com> wrote:

> Ingo Molnar wrote:
> > where exactly did you put the kprobe handler?
> 
> So location matters, huh? [...]

yes, location very much matters if someone wants to reproduce the 
numbers.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:44                                                         ` Ingo Molnar
@ 2006-09-16 18:15                                                           ` Karim Yaghmour
  2006-09-18  8:18                                                             ` Jes Sorensen
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 18:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
	tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


Ingo Molnar wrote:
> yes, location very much matters if someone wants to reproduce the 
> numbers.

Was that really the angle? I'll give you the benefit of the doubt.
But I'm sure you understand the importance of probe placement
with regard to its impact on performance ...

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 18:15                                                           ` Karim Yaghmour
@ 2006-09-18  8:18                                                             ` Jes Sorensen
  0 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:18 UTC (permalink / raw)
  To: karim
  Cc: Ingo Molnar, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Ingo Molnar wrote:
>> yes, location very much matters if someone wants to reproduce the 
>> numbers.
> 
> Was that really the angle? I'll give you the benefit of the doubt.
> But I'm sure you understand the importance of probe placement
> with regard to its impact on performance ...

So now you produce a benchmark, then won't allow someone to reproduce
it ..... do we see a pattern here?

Jes


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:24                                                   ` Mathieu Desnoyers
  2006-09-16 17:35                                                     ` Ingo Molnar
@ 2006-09-16 17:55                                                     ` Karim Yaghmour
  2006-09-18  8:21                                                       ` Jes Sorensen
  2006-09-18  8:33                                                     ` Jes Sorensen
  2 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 17:55 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Ingo Molnar, Roman Zippel, Andrew Morton, tglx,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


Mathieu Desnoyers wrote:
> The bottom line is :
> 
> LTTng impact on the studied phenomenon : 35% slower
> 
> LTTng+kprobes impact on the studied phenomenon : 73% slower
> 
> Therefore, I conclude that on this type of high event rate workload, kprobes
> doubles the tracer impact on the system.

Amen to that. Hopefully this puts to rest the myth of Mr. Scrub.

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:55                                                     ` Karim Yaghmour
@ 2006-09-18  8:21                                                       ` Jes Sorensen
  0 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:21 UTC (permalink / raw)
  To: karim
  Cc: Mathieu Desnoyers, Ingo Molnar, Roman Zippel, Andrew Morton, tglx,
	Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Mathieu Desnoyers wrote:
>> The bottom line is :
>>
>> LTTng impact on the studied phenomenon : 35% slower
>>
>> LTTng+kprobes impact on the studied phenomenon : 73% slower
>>
>> Therefore, I conclude that on this type of high event rate workload, kprobes
>> doubles the tracer impact on the system.
> 
> Amen to that. Hopefully this puts to rest the myth of Mr. Scrub.

If it wasn't because it's so sad, this would be hysterically funny.

Jes



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:24                                                   ` Mathieu Desnoyers
  2006-09-16 17:35                                                     ` Ingo Molnar
  2006-09-16 17:55                                                     ` Karim Yaghmour
@ 2006-09-18  8:33                                                     ` Jes Sorensen
  2006-09-18 15:01                                                       ` Mathieu Desnoyers
  2 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> The bottom line is :
> 
> LTTng impact on the studied phenomenon : 35% slower
> 
> LTTng+kprobes impact on the studied phenomenon : 73% slower
> 
> Therefore, I conclude that on this type of high event rate workload, kprobes
> doubles the tracer impact on the system.

For this specific benchmark, for which we have not seen the code, nor
do we know what system configuration it was run on. Sorry, but even M$'s
sham benchmarks generally tell you which system they used for their
tests.

In addition, some profiling would be interesting so we can see exactly
where things go wrong and fix it. Ingo seems to be doing a good job at
that even without you providing this basic info....

Anyway, despite what Karim likes to claim, this *is* the Linux way!
Things don't get fixed if they are not reported broken and when they
are, whoever is interested in the item will try and fix it. We are not
going to cease Linux kernel development just to please Karim.

The point of this discussion is that the concept of dynamic tracing is
the way to go. If the code isn't 100% there today, then it should be
fixed, that's *not* an excuse to add a lot of cruft based on the wrong
design when we know which path to take. I know it's hard for someone
to accept when he's thrown so much personal time into a project, but as
Ingo keeps saying, there is a lot of value in LTT, the actual markup
isn't the big issue.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18  8:33                                                     ` Jes Sorensen
@ 2006-09-18 15:01                                                       ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-18 15:01 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen (jes@sgi.com) wrote:
> Mathieu Desnoyers wrote:
> > The bottom line is :
> > 
> > LTTng impact on the studied phenomenon : 35% slower
> > 
> > LTTng+kprobes impact on the studied phenomenon : 73% slower
> > 
> > Therefore, I conclude that on this type of high event rate workload, kprobes
> > doubles the tracer impact on the system.
> 
> For this specific benchmark, for which we have not seen the code, nor
> do we know what system configuration it was run on. Sorry, but even M$'s
> sham benchmarks generally tell you which system they used for their
> tests.
> 
> In addition, some profiling would be interesting so we can see exactly
> where things go wrong and fix it. Ingo seems to be doing a good job at
> that even without you providing this basic info....
> 

Hi Jes,

I did not repeat my system configuration from the previous email in the thread
as it seemed redundant. Ingo asked me politely to tell more about my config
and tests, which I have done. Please read on further down this thread to get
that information.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  9:59                                                 ` Jes Sorensen
  2006-09-16 17:24                                                   ` Mathieu Desnoyers
@ 2006-09-16 17:30                                                   ` Mathieu Desnoyers
  2006-09-18  8:15                                                     ` Jes Sorensen
  1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:30 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen (jes@sgi.com) wrote:

> If you want to prove people wrong, I suggest you do some real life
> implementation and measure some real workloads with a predefined set of
> tracepoints implemented using kprobes and LTT and show us that the
> benchmark of the user application suffers in a way that can actually be
> measured. Arguing that a syscall takes an extra 50 instructions
> because it's traced using kprobes rather than LTT doesn't mean it
> actually has any real impact.
>

And about those extra cycles.. according to :
Documentation/kprobes.txt
"6. Probe Overhead

On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
microseconds to process.  Specifically, a benchmark that hits the same
probepoint repeatedly, firing a simple handler each time, reports 1-2
million hits per second, depending on the architecture.  A jprobe or
return-probe hit typically takes 50-75% longer than a kprobe hit.
When you have a return probe set on a function, adding a kprobe at
the entry to that function adds essentially no overhead.

i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40

x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07

ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99"


So, 1 microsecond seems more like 1500-2000 cycles to me, not 50.
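
For reference, the cycle estimate follows from multiplying the probe time by the clock rate, since 1 MHz is one cycle per microsecond. A minimal sketch in C; the function name usec_to_cycles is illustrative and does not come from kprobes.txt:

```c
#include <assert.h>

/*
 * Convert a duration in microseconds to CPU cycles at a given clock.
 * 1 MHz is one cycle per microsecond, so cycles = usec * MHz.
 */
static double usec_to_cycles(double usec, double mhz)
{
	assert(usec >= 0.0 && mhz > 0.0);
	return usec * mhz;
}
```

With the clock rates quoted above, 1.0 usec at 1495 MHz and at 1994 MHz gives 1495 and 1994 cycles; the measured k = 0.57 usec on the Pentium M works out to roughly 850 cycles.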

Mathieu




OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 17:30                                                   ` Mathieu Desnoyers
@ 2006-09-18  8:15                                                     ` Jes Sorensen
  2006-09-18 14:53                                                       ` Mathieu Desnoyers
  0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:15 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> And about those extra cycles.. according to :
> Documentation/kprobes.txt
> "6. Probe Overhead
> 
> On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
> microseconds to process.  Specifically, a benchmark that hits the same
> probepoint repeatedly, firing a simple handler each time, reports 1-2
> million hits per second, depending on the architecture.  A jprobe or
> return-probe hit typically takes 50-75% longer than a kprobe hit.
> When you have a return probe set on a function, adding a kprobe at
> the entry to that function adds essentially no overhead.
[snip]
> So, 1 microsecond seems more like 1500-2000 cycles to me, not 50.

So call it 2000 cycles, now go measure it in *real* life benchmarks
and not some artificial I call this one syscall that hits the probe
every time in a tight loop, kinda thing.

Show us some *real* numbers please.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18  8:15                                                     ` Jes Sorensen
@ 2006-09-18 14:53                                                       ` Mathieu Desnoyers
  2006-09-18 15:17                                                         ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-18 14:53 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen (jes@sgi.com) wrote:
> Mathieu Desnoyers wrote:
> > And about those extra cycles.. according to :
> > Documentation/kprobes.txt
> > "6. Probe Overhead
> > 
> > On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
> > microseconds to process.  Specifically, a benchmark that hits the same
> > probepoint repeatedly, firing a simple handler each time, reports 1-2
> > million hits per second, depending on the architecture.  A jprobe or
> > return-probe hit typically takes 50-75% longer than a kprobe hit.
> > When you have a return probe set on a function, adding a kprobe at
> > the entry to that function adds essentially no overhead.
> [snip]
> > So, 1 microsecond seems more like 1500-2000 cycles to me, not 50.
> 
> So call it 2000 cycles, now go measure it in *real* life benchmarks
> and not some artificial I call this one syscall that hits the probe
> every time in a tight loop, kinda thing.
> 
> Show us some *real* numbers please.
> 

You are late (I don't blame you about it, considering the size of this thread).
It has been posted in the following email :

http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html

Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18 14:53                                                       ` Mathieu Desnoyers
@ 2006-09-18 15:17                                                         ` Ingo Molnar
  2006-09-18 16:54                                                           ` Mathieu Desnoyers
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-18 15:17 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jes Sorensen, Andrew Morton, tglx, Paul Mundt, linux-kernel,
	Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
	ltt-dev, Michel Dagenais


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> You are late (I don't blame you about it, considering the size of this 
> thread). It has been posted in the following email :
> 
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html

yeah - and i dont think the kprobes overhead is a fundamental thing - i 
posted a few kprobes-speedup patches as a reply to your measurements.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18 15:17                                                         ` Ingo Molnar
@ 2006-09-18 16:54                                                           ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-18 16:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jes Sorensen, Andrew Morton, tglx, Paul Mundt, linux-kernel,
	Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > You are late (I don't blame you about it, considering the size of this 
> > thread). It has been posted in the following email :
> > 
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html
> 
> yeah - and i dont think the kprobes overhead is a fundamental thing - i 
> posted a few kprobes-speedup patches as a reply to your measurements.
> 

Hi Ingo,

Yes, and I replied that I really don't think that a few cycles saved here and
there by a predicted branch will change anything significant compared to the
int3 cost. As my test bench is really not that hard to deploy (I have given the
precise instructions to do so), I assume that the burden of the proof is on your
side there.

Anyhow, I prefer to move to a more constructive matter than testing kprobes
branch optimisations.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:05                                             ` Ingo Molnar
  2006-09-15 20:22                                               ` Mathieu Desnoyers
@ 2006-09-15 21:12                                               ` Roman Zippel
  2006-09-15 21:08                                                 ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 21:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> i'm also looking at it this way too: you already seem to be quite 
> reluctant to add kprobes to your architecture today. How reluctant would 
> you be tomorrow if you had static tracepoints, which would remove a fair 
> chunk of incentive to implement kprobes?

If I see that whole teams spend years to implement efficient dynamic 
tracing, do you really think that your "incentive" makes any difference?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:12                                               ` Roman Zippel
@ 2006-09-15 21:08                                                 ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:08 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Fri, 15 Sep 2006, Ingo Molnar wrote:
> 
> > i'm also looking at it this way too: you already seem to be quite 
> > reluctant to add kprobes to your architecture today. How reluctant 
> > would you be tomorrow if you had static tracepoints, which would 
> > remove a fair chunk of incentive to implement kprobes?
> 
> If I see that whole teams spend years to implement efficient dynamic 
> tracing, do you really think that your "incentive" makes any 
> difference?

oh, being the first mover is the hardest part. Finding the right 
solution is hard, it is blind Brownian motion in untested waters. Once 
good solutions have been found and once they have been integrated 
upstream, an architecture 'only' has to follow straight through the 
example. (which is _still_ far from trivial, but it certainly doesnt 
take years.)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:19                                         ` Ingo Molnar
  2006-09-15 19:26                                           ` Karim Yaghmour
  2006-09-15 19:43                                           ` Roman Zippel
@ 2006-09-15 20:13                                           ` Andrew Morton
  2006-09-15 21:49                                             ` Jose R. Santos
  2006-09-16 10:19                                             ` Jes Sorensen
  2 siblings, 2 replies; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 20:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 20:19:07 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Andrew Morton <akpm@osdl.org> wrote:
> 
> > What Karim is sharing with us here (yet again) is the real in-field 
> > experience of real users (ie: not kernel developers).
> 
> well, Jes has that experience and Thomas too.

systemtap and ltt are the only full-scale tracing tools which target
sysadmins and application developers of which I am aware..

> > I mean, on one hand we have people explaining what they think a 
> > tracing facility should and shouldn't do, and on the other hand we 
> > have a guy who has been maintaining and shipping exactly that thing to 
> > (paying!) customers for many years.
> 
> so does Thomas and Jes. So what's the point?

My point is that I respect Karim and Frank's experience.  I in fact
disagree with them (or at least, I want to).  But they've been there, and I
haven't.  So I listen.

> i judge LTT by its current code quality, not by its proponents shouting 
> volume - and that quality is still quite poor at the moment. (and then 
> there are the conceptual problems too, outlined numerous times) I have 
> quoted specific example(s) for that in this thread. Furthermore, LTT 
> does this:
> 
>  246 files changed, 26207 insertions(+), 71 deletions(-)
> 
> and this gives me the shivers, for all the reasons i outlined.
> 

In the bit of text which you snipped I was agreeing with this...

Look, if Karim and Frank (who I assume is a systemtap developer) think that
we need static tracepoints then I have no reason to disagree with them. 
What I would propose is that:

a) Those tracepoints be integrated one at a time on well-understood
   grounds of necessity.  Tracepoints _should_ be added dynamically.  But
   if there are instances where that's not working and cannot be made to
   work then OK, in we go.

b) Saying "we need the static tracepoints because the line numbers keep
   on changing" is not, repeat not a justification for static tracepoints. 
   It's a SMOP to develop tracepoint-adding code which can handle line
   numbers changing.  lwall did it.

c) Any static tracepoints should be seen as corner-case augmentation of
   existing dynamic tracing framework(s).  IOW: I see no justification at
   this time for adding complete new second set of backend
   accumulation/reporting/management infrastructure (ie: LTT core).

Shorter version: I agree with Frank.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:13                                           ` Andrew Morton
@ 2006-09-15 21:49                                             ` Jose R. Santos
  2006-09-16 10:19                                             ` Jes Sorensen
  1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 21:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Andrew Morton wrote:
> On Fri, 15 Sep 2006 20:19:07 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
>
> > 
> > * Andrew Morton <akpm@osdl.org> wrote:
> > 
> > > What Karim is sharing with us here (yet again) is the real in-field 
> > > experience of real users (ie: not kernel developers).
> > 
> > well, Jes has that experience and Thomas too.
>
> systemtap and ltt are the only full-scale tracing tools which target
> sysadmins and application developers of which I am aware..
>   

IMO, I think SystemTap is too generic a tool to be considered a 
tracing tool.  LKET and LKST are more comparable with the functionality 
that LTT provides.  LKET is implemented using SystemTap while LKST has 
both a SystemTap and static kernel patch implementation.


> In the bit of text which you snipped I was agreeing with this...
>
> Look, if Karim and Frank (who I assume is a systemtap developer) think that
> we need static tracepoints then I have no reason to disagree with them. 
> What I would propose is that:
>
> a) Those tracepoints be integrated one at a time on well-understood
>    grounds of necessity.  Tracepoints _should_ be added dynamically.  But
>    if there are instances where that's not working and cannot be made to
>    work then OK, in we go.
>   
Agree.  What would be the criteria that justifies having static probe vs 
a dynamic one?

-JRS


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:13                                           ` Andrew Morton
  2006-09-15 21:49                                             ` Jose R. Santos
@ 2006-09-16 10:19                                             ` Jes Sorensen
  2006-09-16 16:05                                               ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, tglx, karim, Paul Mundt, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Andrew Morton wrote:
> On Fri, 15 Sep 2006 20:19:07 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
>> * Andrew Morton <akpm@osdl.org> wrote:
>>
>>> What Karim is sharing with us here (yet again) is the real in-field 
>>> experience of real users (ie: not kernel developers).
>> well, Jes has that experience and Thomas too.
> 
> systemtap and ltt are the only full-scale tracing tools which target
> sysadmins and application developers of which I am aware..

Just to clarify, the stuff I have looked at in the field was based on
LTT, but not part of the official LTT. It simply goes to show that end
users cannot agree on a small set of fixed tracepoints because someone
always wants a slightly different view of things, like in the cases I
looked at. Not to mention that the changes LTT users make, at times, to
shoehorn their stuff in, especially in sensitive codepaths such as the
syscall path, have side effects which clearly weren't considered.

In one case I ended up doing an alternative implementation using kprobes
to prove that similar results could be achieved in that manner.
Strangely enough I was right :)

I don't have any objections to markers as Ingo suggested. I just don't
buy the repeated argument that LTT has been around for years and barely
changed. It's simply a case of the LTT team not being aware (or deciding
to ignore, I cannot say which) what users have actually done with the
LTT codebase, but it seems obvious they are not aware what everyone is
doing with it. But we have seen before how if an infrastructure like LTT
goes into the kernel, many more users will pop up and want to have their
stuff added.

The other part is the constantly repeated performance claim, which to
this point hasn't been backed up by any hard evidence. If we are to take
that argument seriously, then I strongly encourage the LTT community to
present some real numbers, but until then it can be classified as
nothing but FUD.

I shall be the first to point out that kprobes are less than ideal,
especially the current ia64 implementation suffers from some tricky
limitations, but that's an implementation issue.

Cheers,
Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 10:19                                             ` Jes Sorensen
@ 2006-09-16 16:05                                               ` Karim Yaghmour
  2006-09-17  4:54                                                 ` Ganesan Rajagopal
  2006-09-18  8:13                                                 ` Jes Sorensen
  0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 16:05 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> Just to clarify, the stuff I have looked at in the field was based on
> LTT, but not part of the official LTT. It simply goes to show that end
> users cannot agree on a small set of fixed tracepoints because someone
> always wants a slightly different view of things, like in the cases I
> looked at. Not to mention that the changes LTT users make, at times, to
> shoehorn their stuff in, especially in sensitive codepaths such as the
> syscall path, have side effects which clearly weren't considered.

Good. So give me concrete examples of those cases that you saw and tell
me exactly what those people you were working with were attempting to
achieve.

> I don't have any objections to markers as Ingo suggested. I just don't
> buy the repeated argument that LTT has been around for years and barely
> changed. It's simply a case of the LTT team not being aware (or deciding
> to ignore, I cannot say which) what users have actually done with the
> LTT codebase, but it seems obvious they are not aware what everyone is
> doing with it. But we have seen before how if an infrastructure like LTT
> goes into the kernel, many more users will pop up and want to have their
> stuff added.

Either ltt had a userbase or it didn't. To say that all its users went
out and added their own tracepoints is to not know enough about the project
and so too is it to say that none of its users could actually just use
it out of the box without modifying it. Now, as an outsider, trying to
measure how many users were using it without modifying it is like
trying to figure out how many Linux users there are out there. There's
a silent majority and there's those that need customization. Guess
who you've been talking to?

Strange, come to think of it I don't remember *ever* getting an
email from you while being the maintainer or seeing *any* emails by you
on the ltt lists -- that's indicative of mindset, namely that you
personally assumed you knew all about tracing and didn't need us to make
suggestions to help you AND that you personally never found it relevant
to contribute back. That's like me going off forking the kernel, adding
features to it and then calling the kernel developers incompetent when
they come around saying that what I'm doing is wrong. Who's patronizing
who here?

And I submit to you an idea which I submitted to Ingo yesterday and have
not yet received feedback on. Here's static markup as it could be
implemented:

The plain function:
 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2();

         ... [lots of code] ...
 }

The function with static markup:
 int global_function(int arg1, int arg2, int arg3)
 {
         ... [lots of code] ...

         x = func2(); /*T* @here:arg1,arg2,arg3 */

         ... [lots of code] ...
 }

The semantics are primitive at this stage, and they could definitely
benefit from lkml input, but essentially we have a build-time parser
that goes around the code and automagically does one of two things:
a) create information for binary editors to use
b) generate an alternative C file (foo-trace.c) with inlined static
   function calls.

And there might be other possibilities I haven't thought of.
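
To make the build-time parser idea concrete, here is a rough sketch in C of the first step only: recognizing the marker comment on a source line and pulling out the variable names. The marker syntax is the one shown above; everything else (parse_marker, MARKER_PREFIX, the size limits) is hypothetical and only illustrates the idea, it is not code from ltt or any existing tool.

```c
#include <stddef.h>
#include <string.h>

#define MARKER_PREFIX "/*T* @here:"
#define MAX_ARGS 8	/* at most this many names are collected */
#define MAX_NAME 32

/*
 * Scan one source line for a marker comment and copy the comma-separated
 * variable names into args[]. Returns the number of names found, 0 if the
 * line carries no marker, or -1 if the marker is malformed.
 */
static int parse_marker(const char *line, char args[MAX_ARGS][MAX_NAME])
{
	const char *p = strstr(line, MARKER_PREFIX);
	const char *end;
	int n = 0;

	if (!p)
		return 0;
	p += strlen(MARKER_PREFIX);
	end = strstr(p, "*/");
	if (!end)
		return -1;	/* comment is not closed on this line */

	while (p < end && n < MAX_ARGS) {
		size_t len = 0;

		/* skip separators before the next name */
		while (p < end && (*p == ' ' || *p == ','))
			p++;
		/* measure the name up to the next separator */
		while (p + len < end && p[len] != ',' && p[len] != ' ')
			len++;
		if (len == 0)
			break;
		if (len >= MAX_NAME)
			return -1;	/* name too long for the buffer */
		memcpy(args[n], p, len);
		args[n][len] = '\0';
		n++;
		p += len;
	}
	return n;
}
```

A real tool would then either emit the call-site information for a binary editor (option a) or write out the instrumented foo-trace.c (option b); neither step is sketched here.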

This beats every argument I've seen to date on static instrumentation.
Namely:
- It isn't visually offensive: it's a comment.
- It's not a maintenance drag: outdated comments are not alien.
- It doesn't use weird function names or caps: it's a comment.
- There is precedent: kerneldoc.
And it does preserve most of the key things those who've asked for
static markup are looking for. Namely:
- Static instrumentation
- Mainline maintainability
- Contextualized variables

When I was still part of the ltt development process we had accumulated
a huge amount of ideas of how we could optimize and fix stuff here and
there. We were never actually ever able to reduce these to practice
because folks like you never bothered interfacing with us and the
attitude on the lkml was exactly as I described. We spent our time
chasing kernels.

> The other part is the constantly repeated performance claim, which to
> this point hasn't been backed up by any hard evidence. If we are to take
> that argument seriously, then I strongly encourage the LTT community to
> present some real numbers, but until then it can be classified as
> nothing but FUD.

Hmm... beats me why even the systemtap folks would themselves admit
to performance limitations.

> I shall be the first to point out that kprobes are less than ideal,
> especially the current ia64 implementation suffers from some tricky
> limitations, but that's an implementation issue.

Ah, so it's ok for kprobes to have implementation issues, but not ltt.
Somehow there's this magic thought recurring throughout this thread
that the limitations of dynamic instrumentation are trivial to fix,
but those of static instrumentation are unrecoverable. *That* is a
fallacy if I ever saw one. I'm willing to admit that a combination of
dynamic editing and static instrumentation is a good balance, but Jes
please drop this discourse, it's not constructive.

Karim
-- 
President  / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com  /  1.866.677.4546

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 16:05                                               ` Karim Yaghmour
@ 2006-09-17  4:54                                                 ` Ganesan Rajagopal
  2006-09-18  8:13                                                 ` Jes Sorensen
  1 sibling, 0 replies; 271+ messages in thread
From: Ganesan Rajagopal @ 2006-09-17  4:54 UTC (permalink / raw)
  To: linux-kernel; +Cc: ltt-dev

>>>>> Karim Yaghmour <karim@opersys.com> writes:

> And I submit to you an idea which I submitted to Ingo yesterday and have
> not yet received feedback on. Here's static markup as it could be
> implemented:
>
> The plain function:
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
>
>          x = func2();
>
>          ... [lots of code] ...
>  }
>
> The function with static markup:
>  int global_function(int arg1, int arg2, int arg3)
>  {
>          ... [lots of code] ...
>
>          x = func2(); /*T* @here:arg1,arg2,arg3 */
>
>          ... [lots of code] ...
>  }
>
> The semantics are primitive at this stage, and they could definitely
> benefit from lkml input, but essentially we have a build-time parser
> that goes around the code and automagically does one of two things:
> a) create information for binary editors to use
> b) generate an alternative C file (foo-trace.c) with inlined static
>    function calls.
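
As a rough illustration of option (b), here is a minimal user-space sketch
of what such a build-time parser might do with one source line. Everything
in it is an assumption made for illustration: the struct trace_markup
record, the parse_markup() and emit_trace_call() helpers and the generated
trace_here() call are hypothetical names. Only the "@here:" comment syntax
comes from the proposal quoted above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8
#define MAX_NAME 32

/* Hypothetical record for one markup comment found in a source line. */
struct trace_markup {
	int  nargs;
	char args[MAX_ARGS][MAX_NAME];
};

/*
 * Look for the markup tag in one source line and collect the variable
 * names listed after "@here:". Returns 1 if a tag was found, else 0.
 */
static int parse_markup(const char *line, struct trace_markup *out)
{
	static const char tag[] = "/*T* @here:";
	const char *p = strstr(line, tag);
	const char *end;

	out->nargs = 0;
	if (!p)
		return 0;
	p += sizeof(tag) - 1;
	end = strstr(p, "*/");	/* the comment must be closed on this line */
	if (!end)
		return 0;

	while (out->nargs < MAX_ARGS) {
		size_t len = 0, n;

		while (p < end && (*p == ',' || *p == ' '))
			p++;		/* skip separators */
		while (p + len < end && p[len] != ',' && p[len] != ' ')
			len++;		/* measure one name */
		if (len == 0)
			break;
		n = len < MAX_NAME ? len : MAX_NAME - 1;
		memcpy(out->args[out->nargs], p, n);
		out->args[out->nargs][n] = '\0';
		out->nargs++;
		p += len;
	}
	return 1;
}

/*
 * Option (b): render a hypothetical trace call that a generated
 * foo-trace.c could contain. Returns 0 on success, -1 if buf is too small.
 */
static int emit_trace_call(const struct trace_markup *m, char *buf, size_t size)
{
	size_t used;
	int i, n;

	n = snprintf(buf, size, "trace_here(");
	if (n < 0 || (size_t)n >= size)
		return -1;
	used = (size_t)n;
	for (i = 0; i < m->nargs; i++) {
		n = snprintf(buf + used, size - used, "%s%s",
			     i ? ", " : "", m->args[i]);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	n = snprintf(buf + used, size - used, ");");
	if (n < 0 || (size_t)n >= size - used)
		return -1;
	return 0;
}
```

Fed the marked-up line from the example, parse_markup() collects arg1,
arg2 and arg3, and emit_trace_call() renders a call taking those three
names. Option (a) could write the same record to a side table for a binary
editor instead of generating C.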

This makes sense to me when combined with kprobes. I refer to the DTrace
USENIX paper, http://www.sun.com/bigadmin/content/dtrace/dtrace_usenix.pdf.
The authors argue (Section 4.2, Statically-defined Tracing):

"While FBT (Function Boundary Tracing) allows for comprehensive probe
coverage, one must be familiar with the kernel implementation to use it
effectively. To have probes with semantic meaning, one must allow probes to
be statically declared in the implementation. The mechanism for implementing
this is typically a macro that expands to a conditional call into a tracing
framework if tracing is enabled. While the probe effect of this mechanism is
small, it is observable: even when disabled, the expanded macro introduces a
load, a compare and a taken branch.

In keeping with our philosophy of zero probe effect when disabled, we have
implemented a statically defined tracing (SDT) provider by defining a C macro
that expands to a call to a non-existent function with a well-defined prefix
("__dtrace_probe_"). When the kernel linker sees a relocation against a
function with this prefix, it replaces the call instruction with a
no-operation and records the full name of the bogus function along with the
location of the call site. When the SDT provider loads, it queries the
auxiliary structure and creates a probe with a name specified by the
function name. When a SDT probe is enabled, the no-operation at the call
site is patched to be a call into an SDT-controlled trampoline that
transfers control into DTrace."
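
To make the two mechanisms in that passage concrete, here is a small
user-space model, offered as a sketch under stated assumptions rather than
as how DTrace is implemented. A function pointer stands in for the patched
instruction, so the model shows the bookkeeping (a table of call sites,
enabling by patching) and not the cost: a real implementation rewrites
instruction bytes, and an indirect call is not cheaper than a branch. Only
the "__dtrace_probe_" prefix comes from the paper. The io_start/io_done
probe names, struct sdt_site, sdt_enable(), sdt_disable() and the
TRACE_COND macro are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef void (*probe_fn)(const char *name, long arg);

static long hits;		/* data gathered by the tracing framework */
static int tracing_enabled;	/* flag tested by the conditional form */

/* Stand-in for the no-operation left at a disabled call site. */
static void sdt_nop(const char *name, long arg)
{
	(void)name;
	(void)arg;
}

/* Stand-in for the trampoline that hands control to the tracer. */
static void sdt_trampoline(const char *name, long arg)
{
	(void)name;
	hits += arg;
}

/* Conditional form: a load, a compare and a branch even when disabled. */
#define TRACE_COND(name, arg) \
	do { if (tracing_enabled) sdt_trampoline((name), (arg)); } while (0)

/* One entry per call site, modelling the linker's auxiliary structure. */
struct sdt_site {
	const char *name;	/* full name of the bogus function */
	probe_fn    target;	/* models the instruction at the call site */
};

static struct sdt_site sites[] = {
	{ "__dtrace_probe_io_start", sdt_nop },
	{ "__dtrace_probe_io_done",  sdt_nop },
};

/* "Patch" the named call site; returns 0 on success, -1 if unknown. */
static int sdt_patch(const char *name, probe_fn fn)
{
	size_t i;

	for (i = 0; i < sizeof(sites) / sizeof(sites[0]); i++) {
		if (strcmp(sites[i].name, name) == 0) {
			sites[i].target = fn;
			return 0;
		}
	}
	return -1;
}

static int sdt_enable(const char *name)
{
	return sdt_patch(name, sdt_trampoline);
}

static int sdt_disable(const char *name)
{
	return sdt_patch(name, sdt_nop);
}

/* Instrumented function: both sites do nothing until enabled. */
static long do_io(long nbytes)
{
	sites[0].target(sites[0].name, nbytes);
	/* ... the real work would go here ... */
	sites[1].target(sites[1].name, nbytes);
	return nbytes;
}

/* The same kind of function using the conditional macro, for comparison. */
static long do_io_cond(long nbytes)
{
	TRACE_COND("io_start", nbytes);
	return nbytes;
}
```

Calling sdt_enable("__dtrace_probe_io_start") makes later do_io() calls
add to hits, and sdt_disable() restores the no-op. do_io_cond() only
records anything while tracing_enabled is nonzero, and tests that flag on
every call.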

-- 
Ganesan Rajagopal

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 16:05                                               ` Karim Yaghmour
  2006-09-17  4:54                                                 ` Ganesan Rajagopal
@ 2006-09-18  8:13                                                 ` Jes Sorensen
  2006-09-18 14:46                                                   ` Mathieu Desnoyers
  2006-09-18 17:06                                                   ` Martin Bligh
  1 sibling, 2 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:13 UTC (permalink / raw)
  To: karim
  Cc: Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
> Good. So give me concrete examples of those cases that you saw and tell
> me exactly what those people you were working with were attempting to
> achieve.

I don't have all the details at hand, but it included syscalls and
scheduler points amongst others.

> Either ltt had a userbase or it didn't. To say that all its users went
> out and added their own tracepoints is to not know enough about the project
> and so too is it to say that none of its users could actually just use
> it out of the box without modifying it. Now, as an outsider, trying to
> measure how many users were using it without modifying it is like
> trying to figure out how many Linux users there are out there. There's
> a silent majority and there's those that need customization. Guess
> who you've been talking to?

Or maybe people start looking at it not knowing whether they want to
pursue it to the end for their product.

> Strange, come to think of it I don't remember *ever* getting an
> email from you while being the maintainer or seeing *any* emails by you
> on the ltt lists -- that's indicative of mindset, namely that you
> personally assumed you knew all about tracing and didn't need us to make
> suggestions to help you AND that you personally never found it relevant
> to contribute back.

There's a word for that: *plonk*

Maybe the code was used to evaluate it as an option, maybe they realized
it wasn't worth using in the end, maybe they decided they could make it
work. Maybe the LTT mailing list had been *dead* for 18 months by that
time? You know, reading C code isn't that hard, and it didn't state
anywhere in the LTT license that one is required to take out a paying
contract with a certain Mr. Yaghmour just to be allowed to compile the
code.

> The semantics are primitive at this stage, and they could definitely
> benefit from lkml input, but essentially we have a build-time parser
> that goes around the code and automagically does one of two things:
> a) create information for binary editors to use
> b) generate an alternative C file (foo-trace.c) with inlined static
>    function calls.

You intend to handle inline assembly how? You plan to handle the issue
of debugging the code when the markup is present how?

> And there might be other possibilities I haven't thought of.
> 
> This beats every argument I've seen to date on static instrumentation.
> Namely:
> - It isn't visually offensive: it's a comment.
> - It's not a maintenance drag: outdated comments are not alien.
> - It doesn't use weird function names or caps: it's a comment.
> - There is precedent: kerneldoc.
> And it does preserve most of the key things those who've asked for
> static markup are looking for. Namely:
> - Static instrumentation
> - Mainline maintainability
> - Contextualized variables

And it doesn't address the following issues:

a) The static community providing actual evidence that dynamic tracing
   is noticeably slower.
b) It will not be enabled per default in vendor kernels so in practice
   the information will not be available anywhere, only in debug
   kernels.
c) The point that we will end up with markups all over the place to
   satisfy everybody's needs.

>> The other part is the constantly repeated performance claim, which to
>> this point hasn't been backed up by any hard evidence. If we are to take
>> that argument seriously, then I strongly encourage the LTT community to
>> present some real numbers, but until then it can be classified as
>> nothing but FUD.
> 
> Hmm... beats me why even the systemtap folks would themselves admit
> to performance limitations.

Everything has performance limitations; you keep running around touting
that static is the only thing that's not a problem. Now show us the
numbers!

>> I shall be the first to point out that kprobes are less than ideal,
>> especially the current ia64 implementation suffers from some tricky
>> limitations, but that's an implementation issue.
> 
> Ah, so it's ok for kprobes to have implementation issues, but not ltt.
> Somehow there's this magic thought recurring throughout this thread
> that the limitations of dynamic instrumentation are trivial to fix,
> but those of static instrumentation are unrecoverable. *That* is a
> fallacy if I ever saw one. I'm willing to admit that a combination of
> dynamic editing and static instrumentation is a good balance, but Jes
> please drop this discourse, it's not constructive.

Oh, so bringing facts into a discussion is not allowed. Karim, maybe you
should try using some real arguments. What I am saying about the ia64
implementation is that there are limitations but I am also saying they
can be fixed, it's an implementation issue, not a problem with the
concept.

The problems pointed out with LTT are *conceptual*, but of course you
keep ignoring the facts and refusing to provide real numbers.

Says it all really ....

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18  8:13                                                 ` Jes Sorensen
@ 2006-09-18 14:46                                                   ` Mathieu Desnoyers
  2006-09-18 17:06                                                   ` Martin Bligh
  1 sibling, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-18 14:46 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

Hi Jes,

* Jes Sorensen (jes@sgi.com) wrote:
> Everything has performance limitations, you keep running around touting
> that static is the only thing that's not a problem. Now show us the
> numbers!
> 

If I may: I showed in a previous thread that kprobes' overhead doubled LTTng's
impact on the system. If you are interested in numbers about LTTng, here they
are:

"The LTTng tracer : A Low Impact Performance and Behavior Monitor for GNU/Linux"
(OLS2006)
http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf

(and for Ingo: I haven't rerun the tests on your modified kprobes; it will
come in time. But I do not really expect that 30-50 cycles compared to 1500
will make a very big difference.)

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18  8:13                                                 ` Jes Sorensen
  2006-09-18 14:46                                                   ` Mathieu Desnoyers
@ 2006-09-18 17:06                                                   ` Martin Bligh
  2006-09-20 14:17                                                     ` Jes Sorensen
  1 sibling, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-18 17:06 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

> And it doesn't address the following issues:
> 
> a) The static community providing actual evidence that dynamic tracing
>    is noticeably slower.

...

> Everything has performance limitations, you keep running around touting
> that static is the only thing that's not a problem. Now show us the
> numbers!

When comparing two different approaches to a problem, it is unreasonable
and disingenuous to try to force the onus on the proponents of one
particular approach to do all the benchmarking for both sides. Everybody
has to help try to find the correct solution.

Furthermore, Mathieu already did provide numbers, if you go back and
look.

> The problems pointed out with LTT are *conceptual*, but of course you
> keep ignoring the facts and refusing to provide real numbers.

This is getting very silly, and unnecessarily abusive. Real problems
exist on both sides of the fence, which have been discussed ad nauseam.
If you don't recall them, then go back and read the thread again. The
question is how to strike a compromise between two different sets of
problems, which Ingo and Karim actually seemed to be making progress
on towards the end of the thread.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-18 17:06                                                   ` Martin Bligh
@ 2006-09-20 14:17                                                     ` Jes Sorensen
  0 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-20 14:17 UTC (permalink / raw)
  To: Martin Bligh
  Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Martin Bligh wrote:
>> Everything has performance limitations, you keep running around touting
>> that static is the only thing that's not a problem. Now show us the
>> numbers!
> 
> When comparing two different approaches to a problem, it is unreasonable
> and disingenuous to try to force the onus on the proponents of one
> particular approach to do all the benchmarking for both sides. Everybody
> has to help try to find the correct solution.

Martin,

If you have one side of a discussion stating that the other side's
suggestion is useless for performance reasons, then it is IMHO totally
fair for the second side to ask the first side to back up their
statement with facts. If one wants to get a patch into the kernel,
you also get asked for justification, and if you want to get it into
a vendor kernel, a benchmark proving your patch is not causing any
damage is pretty much standard. Fortunately Mathieu also showed that he
was willing to try and do that.

> This is getting very silly, and unnecessarily abusive. Real problems
> exist on both sides of the fence, which have been discussed ad nauseam.
> If you don't recall them, then go back and read the thread again. The
> question is how to strike a compromise between two different sets of
> problems, which Ingo and Karim actually seemed to be making progress
> on towards the end of the thread.

This got very silly and abusive pretty much from the beginning, at the
very point anyone tried to challenge the justification that was
initially presented with the LTT patches. This isn't how Linux works,
if you want to post a patch, you should be ready to accept public
scrutiny of your design and your actual code. Just because something is
your personal pet project doesn't mean nobody has the right to
challenge it.

Even after Christoph tried to be the neutral middle-man, we had to see
another three follow-ups of 'I must have the last word' postings :(

As I said in my last posting related to this thread, I had had enough,
I haven't even read all the responses to my posting and I doubt I will.
Instead I went back and started writing code (unrelated and really
evil code, but in a very different way, and trust me it's making me
very grumpy :)

Fortunately, we at least now have a situation where Mathieu has shown he
is interested in being constructive on the issue and is able to work
with Ingo on the static markers, which I'd like to applaud.

I am optimistic a useful solution will come out of it finally, but I
will rather stay out of it at this point.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                                       ` Andrew Morton
  2006-09-15 18:19                                         ` Ingo Molnar
@ 2006-09-15 19:35                                         ` Thomas Gleixner
  2006-09-15 19:40                                           ` Ingo Molnar
  2006-09-15 19:56                                           ` Karim Yaghmour
  2006-09-15 20:00                                         ` Mathieu Desnoyers
  2006-09-15 20:37                                         ` Alan Cox
  3 siblings, 2 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 19:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 11:16 -0700, Andrew Morton wrote:
> Me thinks our time would be best spent trying to benefit from his
> experience..

I was involved in tracer development for quite a while and I have used
them in $paying customer projects too.

> Me, I'm not particularly averse to some 50-100 static tracepoints if
> experience tells us that we need such things.  And both Karim's and Frank's
> experience does indicate that such things are needed, which carries weight.

From my experience the tracepoints usually are not at the place where
you need them to track down a particular problem or analyse a particular
usage scenario in detail. This has been true from a kernel and from an
application programmer POV. Also many of the LTT customers I'm aware of
used their own homebrewed set of trace points.

What I always hated about static tracers is the requirement to recompile /
reboot the kernel in order to gather information. Kprobes / systemtap is
really a convenient way to avoid this.

I completely agree that the maintenance of the "out of code" trace
scripts is a task which needs a lot of effort, but it does not offload
the maintenance effort to those modifying the code, and we do not get yet
another pseudo instruction/function set which interferes with the
goal of clear and understandable code. Hell, the code in those code
paths which are of common interest for instrumentation is already
complex enough. We really can do without adding some more obfuscated
macro constructs.

When we can maintain a basic set of tracescripts in the kernel tree and
once the necessary infrastructure is in place, I'm quite sure that quite
a lot of kernel developers would keep those fundamental trace scripts in
shape out of their own interest. It might take a while to get this going
but once it is established, distros will ship the scripts along with
dynamic tracing enabled in the kernels.

I see a major advantage over static tracing in that:

Static tracing is usually not enabled in production kernels, but the
dynamic tracing infrastructure can be enabled without costs. So you
can actually request traces (at least for the standard set of
tracepoints) from Joe User to track down complex problems.

One thing which is much more important IMHO is the availability of
_USEFUL_ postprocessing tools to give users a real value of
instrumentation. This is a much more complex task than this whole kernel
instrumentation business. This also includes the ability to coordinate
user space _and_ kernel space instrumentation, which is necessary to
analyse complex kernel / application code interactions. 

	tglx

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:35                                         ` Thomas Gleixner
@ 2006-09-15 19:40                                           ` Ingo Molnar
  2006-09-15 19:56                                           ` Karim Yaghmour
  1 sibling, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 19:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andrew Morton, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Thomas Gleixner <tglx@linutronix.de> wrote:

> I see a major advantage over static tracing in that:
> 
> Static tracing is usually not enabled in production kernels, but the 
> dynamic tracing infrastructure can be enabled without costs. So you 
> can actually request traces (at least for the standard set of 
> tracepoints) from Joe User to track down complex problems.

FYI, kprobes/SystemTap is already enabled in RHEL4.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:35                                         ` Thomas Gleixner
  2006-09-15 19:40                                           ` Ingo Molnar
@ 2006-09-15 19:56                                           ` Karim Yaghmour
  2006-09-15 20:23                                             ` Thomas Gleixner
  1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 19:56 UTC (permalink / raw)
  To: tglx
  Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Thomas Gleixner wrote:
> One thing which is much more important IMHO is the availability of
> _USEFUL_ postprocessing tools to give users a real value of
> instrumentation. This is a much more complex task than this whole kernel
> instrumentation business. This also includes the ability to coordinate
> user space _and_ kernel space instrumentation, which is necessary to
> analyse complex kernel / application code interactions. 

And of course the usefulness of such postprocessing tools is gated
by the ability of users to use them on _any_ kernel they get their
hands on. Up to this point, this has not been the case for *any* of the
existing toolsets, simply because they require the user to either
recompile his kernel or modify his probe points to match his kernel.
Until users can actually do without either of these steps (which is
only possible with static markup) then the development teams of
the various projects will continue having to invest resources
chasing the kernel.

We don't need separate postprocessing tool teams. The only reason
there are separate project teams is because managers in key
positions made the decision that they'd rather break from existing
projects which had had little success mainlining and instead use
their corporate bodyweight to pressure/seduce kernel developers
working for them into pushing their new great which-absolutely-
has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree
with you kernel developers that this is crap, this is why we're
developing this new amazing thing). That's the truth plain and
simple.

When I started involving myself in Linux development a decade ago,
I honestly did not think I'd ever see this kind of stuff happen,
but, hey, that's life.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:56                                           ` Karim Yaghmour
@ 2006-09-15 20:23                                             ` Thomas Gleixner
  2006-09-15 20:40                                               ` Roman Zippel
  2006-09-15 21:05                                               ` Karim Yaghmour
  0 siblings, 2 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 20:23 UTC (permalink / raw)
  To: karim
  Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, 2006-09-15 at 15:56 -0400, Karim Yaghmour wrote:
> Thomas Gleixner wrote:
> > One thing which is much more important IMHO is the availability of
> > _USEFUL_ postprocessing tools to give users a real value of
> > instrumentation. This is a much more complex task than this whole kernel
> > instrumentation business. This also includes the ability to coordinate
> > user space _and_ kernel space instrumentation, which is necessary to
> > analyse complex kernel / application code interactions. 
> 
> And of course the usefulness of such postprocessing tools is gated
> by the ability of users to use them on _any_ kernel they get their
> hands on. Up to this point, this has not been the case for *any* of the
> existing toolsets, simply because they require the user to either
> recompile his kernel or modify his probe points to match his kernel.

So this has to be changed. And requiring to recompile the kernel is the
wrong answer. Having some nifty tool, which allows you to define the set
of dynamic trace points or use a predefined one is the way to go.

> Until users can actually do without either of these steps (which is
> only possible with static markup) 

Generalizations like that are simply wrong. Static markup is not a
panacea. It might help for some things in the first place, but it is not
flexible enough in the long run. It is an engineering challenge to make
the "static" trace rules autogenerated by some means as Andrew pointed
out several times in this thread (see patch(1)), so we can provide a
useful ad hoc set for the users.

> We don't need separate postprocessing tool teams. The only reason
> there are separate project teams is because managers in key
> positions made the decision that they'd rather break from existing
> projects which had had little success mainlining and instead use
> their corporate bodyweight to pressure/seduce kernel developers
> working for them into pushing their new great which-absolutely-
> has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree
> with you kernel developers that this is crap, this is why we're
> developing this new amazing thing). That's the truth plain and
> simple.

Stop whining! LTT did not manage to solve the problem in a generic,
mainline acceptable way. If you really believe that Kprobes / Systemtap
is just a $corporate maliciousness to kick you out of business, then I
really start to doubt your sanity.

This has nothing to do with postprocessing and tracepoint creation
tools. The postprocessing stuff is not in the scope of mainlining. Once
a halfway future-proof interface is available, tools will come up
in no time. There are a lot of companies out there who have the
interest and the capabilities to do an integration into Eclipse, to name
one example. They will not start to spend a second of work time until
there is a consolidated instrumentation core in the kernel.

> When I started involving myself in Linux development a decade ago,
> I honestly did not think I'd ever see this kind of stuff happen,
> but, hey, that's life.

- ENOPARSE

	tglx

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:23                                             ` Thomas Gleixner
@ 2006-09-15 20:40                                               ` Roman Zippel
  2006-09-15 20:48                                                 ` Ingo Molnar
  2006-09-15 21:05                                               ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 20:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: karim, Andrew Morton, Paul Mundt, Jes Sorensen, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Thomas Gleixner wrote:

> So this has to be changed. And requiring to recompile the kernel is the
> wrong answer. Having some nifty tool, which allows you to define the set
> of dynamic trace points or use a predefined one is the way to go.

Nobody is taking dynamic tracing away!
You make it sound as if tracing is only possible via dynamic traces.
If I want to use static tracepoints, why shouldn't I?

> Stop whining!

So we're back to personal attacks now. :-(

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:40                                               ` Roman Zippel
@ 2006-09-15 20:48                                                 ` Ingo Molnar
  2006-09-15 21:17                                                   ` Karim Yaghmour
  2006-09-15 21:27                                                   ` Roman Zippel
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 20:48 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Fri, 15 Sep 2006, Thomas Gleixner wrote:
> 
> > So this has to be changed. And requiring to recompile the kernel is the
> > wrong answer. Having some nifty tool, which allows you to define the set
> > of dynamic trace points or use a predefined one is the way to go.
> 
> Nobody is taking dynamic tracing away!
> You make it sound as if tracing is only possible via dynamic traces.
> If I want to use static tracepoints, why shouldn't I?

because:

 - static tracepoints, once added, are very hard to remove - up until
   eternity. (On the other hand, markers for dynamic tracers are easily 
   removed, either via making the dynamic tracer smarter, or by 
   detaching the marker via the patch(1) method. In any case, if a 
   marker goes away then hell does not break loose in dynamic tracing 
   land - but it does in static tracing land.)

 - the markers needed for dynamic tracing are different from the LTT
   static tracepoints.

 - a marker for dynamic tracing has lower performance impact than a 
   static tracepoint, on systems that are not being traced. (but which 
   have the tracing infrastructure enabled otherwise)

 - having static tracepoints dilutes the incentive for architectures to
   implement proper kprobes support.

> > > there are separate project teams is because managers in key 
> > > positions made the decision that they'd rather break from existing 
> > > projects which had had little success mainlining and instead use 
> > > their corporate bodyweight to pressure/seduce kernel developers 
> > > working for them into pushing their new great which-absolutely-
> > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree
> > > with you kernel developers that this is crap, this is why we're 
> > > developing this new amazing thing). That's the truth plain and 
> > > simple.
> >
> > Stop whining!
> 
> So we're back to personal attacks now. :-(

hm, so you don't consider the above paragraph a whine. How would you
characterize it then? A measured, balanced, on-topic technical comment? 
I'm truly curious.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:48                                                 ` Ingo Molnar
@ 2006-09-15 21:17                                                   ` Karim Yaghmour
  2006-09-15 21:15                                                     ` Ingo Molnar
  2006-09-15 21:27                                                   ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar wrote:
> hm, so you don't consider the above paragraph a whine. How would you
> characterize it then? A measured, balanced, on-topic technical comment? 
> I'm truly curious.

Take it for what you want. It's yours to disparage. Consider, though,
that I'm factually explaining the real-life result of resistance to
static instrumentation. It's not entirely detached, I'll admit, but
consider that it remained on-topic and entirely respectful of all parties
involved. I've enjoyed very positive relationships with all those
individuals and continue to hold them with high regard. They took the
decisions they thought were best at the time, and I can only respect
them for having acted as responsibly as they found relevant for their
respective organizations. I don't agree with it, but that's life. It
was just important to me to point out to the casual reader the source
of a lot of the FUD that can be found about ltt -- i.e. lots of it is
marketing. For sure ltt initially got a lot of things wrong, but the
progress of kernel tracing overall would have been much better had
the naysayers actually chosen to understand the problem instead of
stonewalling the efforts being invested.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:17                                                   ` Karim Yaghmour
@ 2006-09-15 21:15                                                     ` Ingo Molnar
  2006-09-15 21:56                                                       ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:15 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Karim Yaghmour <karim@opersys.com> wrote:

> [...] Consider, though, that I'm factually explaining the real-life 
> result of resistance to static instrumentation. [...]

with all due respect, do you realize the possibility that this 
resistance might be a genuine technical opinion on my part that is 
driven by the quality of the code being offered and by the conceptual 
problems static tracing introduces in the future, as i see them? And 
thus, maybe, what you wrote:

" and instead use their corporate bodyweight to pressure/seduce kernel
  developers working for them into pushing their new great [...] "

could possibly be total, utter nonsense?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:15                                                     ` Ingo Molnar
@ 2006-09-15 21:56                                                       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar wrote:
> with all due respect, do you realize the possibility that this 
> resistance might be a genuine technical opinion on my part that is 
> driven by the quality of the code being offered and by the conceptual 
> problems static tracing introduces in the future, as i see them?

Wait. What I said could not possibly apply to comments you, or anybody
else for that matter, made within this thread. What I said refers to
events and threads which have long since passed. The "resistance" I
allude to is that faced by ltt early on and for as long as several
parties were actively involved in trying to standardize on it. I'm
merely trying to explain the current status of this: several teams
in "apparent" competition with one another.

> " and instead use their corporate bodyweight to pressure/seduce kernel
>   developers working for them into pushing their new great [...] "
> 
> could possibly be total, utter nonsense?

Please read this in the above context -- past events. Insofar as I
understood the events I was part of, this was the best I could make
of the decision-making thought process at a managerial level. I do
not wish to substantiate that, nor was this meant as
a personal attack against any person or organization. Everyone acted
to the best of their knowledge of the facts at the time and I cannot
fault them for that. I disagreed and was disappointed, obviously,
but that's mine to bear.

Put simply: all parties involved would actually wish things were
different.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:48                                                 ` Ingo Molnar
  2006-09-15 21:17                                                   ` Karim Yaghmour
@ 2006-09-15 21:27                                                   ` Roman Zippel
  2006-09-15 21:51                                                     ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 21:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > Nobody is taking dynamic tracing away!
> > You make it sound that tracing is only possible via dynamic traces.
> > If I want to use static tracepoints, why shouldn't I?
> 
> because:
> 
>  - static tracepoints, once added, are very hard to remove - up until
>    eternity. (On the other hand, markers for dynamic tracers are easily 
>    removed, either via making the dynamic tracer smarter, or by 
>    detaching the marker via the patch(1) method. In any case, if a 
>    marker goes away then hell does not break loose in dynamic tracing 
>    land - but it does in static tracing land.

This is simply not true, at the source level you can remove a static 
tracepoint as easily as a dynamic tracepoint, the effect of the missing 
trace information is the same either way.

>  - the markers needed for dynamic tracing are different from the LTT
>    static tracepoints.

What makes the requirements so different? I would actually think it 
depends on the user, independent of how the tracing is done.

>  - a marker for dynamic tracing has lower performance impact than a 
>    static tracepoint, on systems that are not being traced. (but which 
>    have the tracing infrastructure enabled otherwise)

Anyone using static tracing intends to use it, which makes this point moot.

>  - having static tracepoints dillutes the incentive for architectures to
>    implement proper kprobes support.

Considering the level of work needed to support efficient dynamic tracing 
it only withholds archs from tracing support for no good reason.

> > > > there are separate project teams is because managers in key 
> > > > positions made the decision that they'd rather break from existing 
> > > > projects which had had little success mainlining and instead use 
> > > > their corporate bodyweight to pressure/seduce kernel developers 
> > > > working for them into pushing their new great which-aboslutely- 
> > > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree 
> > > > with you kernel developers that this is crap, this is why we're 
> > > > developing this new amazing thing). That's the truth plain and 
> > > > simple.
> > >
> > > Stop whining!
> > 
> > So we're back to personal attacks now. :-(
> 
> hm, so you dont consider the above paragraph a whine. How would you 
> characterize it then? A measured, balanced, on-topic technical comment? 
> I'm truly curious.

It's sarcastic, but considering the disrespect towards Karim, I don't 
blame him. At some point the "whining" argument was funny, but lately it's 
only used to discredit people.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:27                                                   ` Roman Zippel
@ 2006-09-15 21:51                                                     ` Ingo Molnar
  2006-09-15 22:15                                                       ` Karim Yaghmour
  2006-09-15 22:53                                                       ` Roman Zippel
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:51 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > because:
> > 
> >  - static tracepoints, once added, are very hard to remove - up until
> >    eternity. (On the other hand, markers for dynamic tracers are easily 
> >    removed, either via making the dynamic tracer smarter, or by 
> >    detaching the marker via the patch(1) method. In any case, if a 
> >    marker goes away then hell does not break loose in dynamic tracing 
> >    land - but it does in static tracing land.
> 
> This is simply not true, at the source level you can remove a static 
> tracepoint as easily as a dynamic tracepoint, the effect of the 
> missing trace information is the same either way.

this is not true. I gave you one example already a few mails ago (which 
you did not reply to, neither did you reply the previous time when i 
first mentioned this - perhaps you missed it in the high volume of 
emails):

" i outlined one such specific "removal of static tracepoint" example 
  already: static trace points at the head/prologue of functions (half 
  of the existing tracepoints are such). The sock_sendmsg() example i 
  quoted before is such a case. Those trace points can be replaced with 
  a simple GCC function attribute, which would cause a 5-byte (or 
  whatever necessary) NOP to be inserted at the function prologue. The 
  attribute would be a lot less invasive than an explicit tracepoint (and 
  thus easier to maintain) "

> >  - the markers needed for dynamic tracing are different from the LTT
> >    static tracepoints.
> 
> What makes the requirements so different? I would actually think it 
> depends on the user independent of the tracing is done.

yes, and i mentioned before that they can be merged (i even outlined a 
few APIs for it), but still that is not being offered by LTT today.

> >  - a marker for dynamic tracing has lower performance impact than a 
> >    static tracepoint, on systems that are not being traced. (but which 
> >    have the tracing infrastructure enabled otherwise)
> 
> Anyone using static tracing intents to use, which makes this point 
> moot.

that's not at all true, on multiple grounds:

Firstly, many people use distro kernels. A Linux distribution typically 
wants to offer as few kernel rpms as possible (one per arch to be 
precise), but it also wants to offer as many features as possible. So if 
there was a static tracer in there, a distro would enable it - but 99.9% 
of the users would never use it - still they would see the overhead. 
Hence the user would have it enabled, but does not intend to use it - 
which contradicts your statement.

Secondly, even people who intend to _eventually_ make use of tracing, 
dont use it most of the time. So why should they have more overhead when 
they are not tracing? Again: the point is not moot because even though 
the user intends to use tracing, he does not always want to trace.
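
A minimal sketch of the overhead argument, assuming a static tracepoint is
implemented as a flag test around the recording call. All names below are
hypothetical and are not the LTT API; the point is only that the test is
compiled into the hot path and is executed even when nobody is tracing.

```c
/* Hypothetical names throughout; this is not the LTT interface. */
static int tracing_enabled;           /* runtime on/off switch */
static unsigned long events_recorded; /* stands in for a trace buffer */

static void trace_record(int family, unsigned long size)
{
        (void)family;
        (void)size;
        events_recorded++;
}

/*
 * Static tracepoint: the check is part of the compiled function, so
 * every call pays for a load of tracing_enabled and a conditional
 * branch, even when tracing is switched off.
 */
int demo_send(int family, unsigned long size)
{
        if (__builtin_expect(tracing_enabled, 0))
                trace_record(family, size);
        return (int)size;
}
```

With tracing_enabled left at 0 nothing is recorded, but the load and branch
are still executed on every call; how much that costs in practice depends on
the workload and is not measured here.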

> >  - having static tracepoints dillutes the incentive for architectures to
> >    implement proper kprobes support.
> 
> Considering the level of work needed to support efficient dynamic 
> tracing it only withholds archs from tracing support for no good 
> reason.

5 major architectures (both RISC and CISC) already support kprobes, so 
fortunately this point is largely moot - but you are right to a certain 
degree, it's not totally solved. But the examples are there. It's still 
not trivial to implement a feature like this, but kernel programming 
never is. I much prefer the harder but more intelligent solution 
to the easier but less intelligent solution - even if that means a 
temporary unavailability of a feature for some rarer arch.

> > > > > there are separate project teams is because managers in key 
> > > > > positions made the decision that they'd rather break from existing 
> > > > > projects which had had little success mainlining and instead use 
> > > > > their corporate bodyweight to pressure/seduce kernel developers 
> > > > > working for them into pushing their new great which-aboslutely- 
> > > > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree 
> > > > > with you kernel developers that this is crap, this is why we're 
> > > > > developing this new amazing thing). That's the truth plain and 
> > > > > simple.
> > > >
> > > > Stop whining!
> > > 
> > > So we're back to personal attacks now. :-(
> > 
> > hm, so you dont consider the above paragraph a whine. How would you 
> > characterize it then? A measured, balanced, on-topic technical 
> > comment? I'm truly curious.
> 
> It's sarcastic, [...]

oh, really? Karim's characterization was:

 " I'm factually explaining the real-life result of resistance to static
   instrumentation. "

so whose interpretation of Karim's comments should i accept, yours or 
Karim's? I'm really torn on that issue. (_that_ was sarcastic)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:51                                                     ` Ingo Molnar
@ 2006-09-15 22:15                                                       ` Karim Yaghmour
  2006-09-15 22:53                                                       ` Roman Zippel
  1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 22:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


Ingo Molnar wrote:
> oh, really? Karim's characterization was:
> 
>  " I'm factually explaining the real-life result of resistance to static
>    instrumentation. "
> 
> so whose interpretation of Karim's comments should i accept, yours or 
> Karim's? I'm really torn on that issue. (_that_ was sarcastic)

Hmm ... this might explain why we're having a hard time here ... me
thinks: Ingo don't see that dynamic tracing is orthogonal to static
markup and Ingo don't see that my explanation is orthogonal to
Roman's (i.e. I did factually explain stuff and did resort to
sarcasm as part of said explanation) ... maybe Ingo does not like
orthogonal stuff ...

That _too_ was sarcastic.

Karim


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:51                                                     ` Ingo Molnar
  2006-09-15 22:15                                                       ` Karim Yaghmour
@ 2006-09-15 22:53                                                       ` Roman Zippel
  2006-09-15 23:14                                                         ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 22:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > This is simply not true, at the source level you can remove a static 
> > tracepoint as easily as a dynamic tracepoint, the effect of the 
> > missing trace information is the same either way.
> 
> this is not true. I gave you one example already a few mails ago (which 
> you did not reply to, neither did you reply the previous time when i 
> first mentioned this - perhaps you missed it in the high volume of 
> emails):
> 
> " i outlined one such specific "removal of static tracepoint" example 
>   already: static trace points at the head/prologue of functions (half 
>   of the existing tracepoints are such). The sock_sendmsg() example i 
>   quoted before is such a case. Those trace points can be replaced with 
>   a simple GCC function attribute, which would cause a 5-byte (or 
>   whatever necessary) NOP to be inserted at the function prologue. The 
>   attribute would be alot less invasive than an explicit tracepoint (and 
>   thus easier to maintain) "

As I said before, you're mixing up function tracing with event tracing. Not 
all events are tied to functions; functions can be moved and renamed, while 
the actual event more often stays the same.
Function attributes also don't provide information local to the 
function.
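
A small sketch of the distinction being drawn, with hypothetical names (this
is not LTT code): the event of interest occurs in the middle of the function
and depends on a local variable, so a marker at function entry would see the
arguments but not the value that defines the event.

```c
/* Hypothetical names throughout; a sketch, not LTT code. */
static int last_event_value = -1;   /* stands in for a trace buffer */

static void trace_event(int value)
{
        last_event_value = value;
}

/*
 * The event "slot allocated" is only known after the search loop has
 * run; the value it reports (chosen) is local to the function.
 */
int demo_pick_slot(const int *slots, int n)
{
        int i, chosen = -1;

        for (i = 0; i < n; i++) {
                if (slots[i] == 0) {
                        chosen = i;
                        break;
                }
        }
        if (chosen >= 0)
                trace_event(chosen);   /* mid-function event */
        return chosen;
}
```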

> > >  - the markers needed for dynamic tracing are different from the LTT
> > >    static tracepoints.
> > 
> > What makes the requirements so different? I would actually think it 
> > depends on the user independent of the tracing is done.
> 
> yes, and i mentioned before that they can be merged (i even outlined a 
> few APIs for it), but still that is not being offered by LTT today.

It's possible I missed something, but pretty much anything you outlined 
wouldn't make the life of static tracepoints any easier.

> > >  - a marker for dynamic tracing has lower performance impact than a 
> > >    static tracepoint, on systems that are not being traced. (but which 
> > >    have the tracing infrastructure enabled otherwise)
> > 
> > Anyone using static tracing intents to use, which makes this point 
> > moot.
> 
> that's not at all true, on multiple grounds:
> 
> Firstly, many people use distro kernels. A Linux distribution typically 
> wants to offer as few kernel rpms as possible (one per arch to be 
> precise), but it also wants to offer as many features as possible. So if 
> there was a static tracer in there, a distro would enable it - but 99.9% 
> of the users would never use it - still they would see the overhead. 
> Hence the user would have it enabled, but does not intend to use it - 
> which contradicts your statement.

So if dynamic tracing is available use it, as distributions already do.
OTOH the barrier to use static tracing is drastically different whether 
the user has to deal with external patches or whether it's a simple kernel 
option.
Again, static tracing doesn't exclude the possibility of dynamic tracing, 
that's something you constantly omit and thus make it sound like both 
options were mutually exclusive.

> Secondly, even people who intend to _eventually_ make use of tracing, 
> dont use it most of the time. So why should they have more overhead when 
> they are not tracing? Again: the point is not moot because even though 
> the user intends to use tracing, but does not always want to trace.

I've used kernels which included static tracing and the performance 
overhead is negligible for occasional use.

> > >  - having static tracepoints dillutes the incentive for architectures to
> > >    implement proper kprobes support.
> > 
> > Considering the level of work needed to support efficient dynamic 
> > tracing it only withholds archs from tracing support for no good 
> > reason.
> 
> 5 major architectures (both RISC and CISC) already support kprobes, so 
> fortunately this point is largely moot - but you are right to a certain 
> degree, it's not totally solved. But the examples are there. It's still 
> not trivial to implement a feature like this, but kernel programming 
> never is. I far more prefer the harder but more intelligent solution 
> than the easier but less intelligent solution - even if that means a 
> temporary unavailability of a feature for some rarer arch.

Why don't you leave the choice to the users? Why do you constantly make it 
an exclusive choice? There is a lot of common ground, but you seem to be 
hellbent to make the life of static tracers and thus their users as hard 
as possible. Only in pursuit of some perfect solution while the more 
practical solution is easily available without any ill effects?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 22:53                                                       ` Roman Zippel
@ 2006-09-15 23:14                                                         ` Ingo Molnar
  2006-09-15 23:49                                                           ` Nicholas Miell
  2006-09-16  0:31                                                           ` Roman Zippel
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:14 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > This is simply not true, at the source level you can remove a static 
> > > tracepoint as easily as a dynamic tracepoint, the effect of the 
> > > missing trace information is the same either way.
> > 
> > this is not true. I gave you one example already a few mails ago (which 
> > you did not reply to, neither did you reply the previous time when i 
> > first mentioned this - perhaps you missed it in the high volume of 
> > emails):
> > 
> > " i outlined one such specific "removal of static tracepoint" example 
> >   already: static trace points at the head/prologue of functions (half 
> >   of the existing tracepoints are such). The sock_sendmsg() example i 
> >   quoted before is such a case. Those trace points can be replaced with 
> >   a simple GCC function attribute, which would cause a 5-byte (or 
> >   whatever necessary) NOP to be inserted at the function prologue. The 
> >   attribute would be alot less invasive than an explicit tracepoint (and 
> >   thus easier to maintain) "
> 
> As I said before you're mixing up function tracing with event tracing, 
> not all events are tied to functions, functions can be moved and 
> renamed, the actual event more often stays the same.

you are showing a clear misunderstanding of how tracing is typically 
done. Both for LTT and for blktrace (and for the tracers i've done 
myself), roughly half (50%) of the tracepoints are right at the top of 
the function and trace the function arguments. Let me quote an example 
straight from LTT:

 int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
         struct kiocb iocb;
         struct sock_iocb siocb;
         int ret;

         trace_socket_sendmsg(sock, sock->sk->sk_family,
                 sock->sk->sk_type,
                 sock->sk->sk_protocol,
                 size);

this tracepoint, under a dynamic tracing concept, can be replaced with:

 int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
         struct kiocb iocb;
         struct sock_iocb siocb;
         int ret;

note the "__trace" attribute to the function. (see my previous mails 
where i talked about __trace for more details) SystemTap can hook to 
that point and can access the very same parameters that the markup does, 
in a lot less invasive way.

So a 5-line markup can be replaced with a single function attribute.
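
As a rough illustration only: GCC 8 and later (and recent Clang) provide a
real function attribute, patchable_function_entry, which reserves NOPs at
function entry. Defining __trace in terms of it is an assumption made for
this sketch, not something taken from LTT or SystemTap, and the count of 5
is x86-flavoured (one byte per NOP there; NOP size differs on other
architectures). demo_sendmsg() is a hypothetical stand-in for a traced
kernel function.

```c
/*
 * Hypothetical definition of __trace. The attribute itself is real
 * (GCC 8+), but this mapping is only an illustration.
 */
#define __trace __attribute__((patchable_function_entry(5)))

/* Stand-in for a commonly traced entry point such as sock_sendmsg(). */
__trace int demo_sendmsg(int family, int type, unsigned long size)
{
        /*
         * No explicit trace call in the body: a dynamic tracer would
         * patch the NOPs reserved at function entry and read the
         * arguments from there.
         */
        return (int)size + family + type;
}
```

The function behaves exactly as it would without the attribute; only the
padding at its entry point changes.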

roughly half of the existing tracepoints in blktrace/LTT can be replaced 
that way. A 50% reduction in the number of markups is significant - but 
such a reduction in markups is not possible under the static tracing 
concept. And that method was just off the top of my head - Andrew 
provided other ideas to reduce the number of markups.

> Function attributes also doesn't provide information local to the 
> function.

of course, but where does the above tracepoint i quoted use information 
local to the function? A fair number of markups use global functions 
because, surprise, a lot of interesting activity happens along global 
functions. So a healthy reduction in markups can be achieved.

> > > >  - the markers needed for dynamic tracing are different from the 
> > > >    LTT static tracepoints.
> > >
> > > What makes the requirements so different? I would actually think 
> > > it depends on the user independent of the tracing is done.
> > 
> > yes, and i mentioned before that they can be merged (i even outlined 
> > a few APIs for it), but still that is not being offered by LTT 
> > today.
> 
> It's possible I missed something, but pretty much anything you 
> outlined wouldn't make the live of static tracepoints any easier.

sorry, but if you re-read the above line of argument, your sentence 
seems a non sequitur. I said "the markers needed for dynamic tracing are 
different from the LTT static tracepoints". You asked why they are so 
different, and i replied that i already outlined what the right API 
would be in my opinion to do markups, but that API is different from 
what LTT is offering now. To which you are now replying: "pretty much 
anything you outlined wouldn't make the life of static tracepoints any 
easier." Huh?

> > > >  - a marker for dynamic tracing has lower performance impact than a 
> > > >    static tracepoint, on systems that are not being traced. (but which 
> > > >    have the tracing infrastructure enabled otherwise)
> > > 
> > > Anyone using static tracing intents to use, which makes this point 
> > > moot.
> > 
> > that's not at all true, on multiple grounds:
> > 
> > Firstly, many people use distro kernels. A Linux distribution typically 
> > wants to offer as few kernel rpms as possible (one per arch to be 
> > precise), but it also wants to offer as many features as possible. So if 
> > there was a static tracer in there, a distro would enable it - but 99.9% 
> > of the users would never use it - still they would see the overhead. 
> > Hence the user would have it enabled, but does not intend to use it - 
> > which contradicts your statement.
> 
> So if dynamic tracing is available use it, as distributions already 
> do. OTOH the barrier to use static tracing is drastically different 
> whether the user has to deal with external patches or whether it's a 
> simple kernel option. Again, static tracing doesn't exclude the 
> possibility of dynamic tracing, that's something you constantly omit 
> and thus make it sound like both options were mutually exlusive.

how does this reply to my point that: "a marker for dynamic tracing has 
lower performance impact than a static tracepoint, on systems that are 
not being traced", which point you claimed moot?

> > Secondly, even people who intend to _eventually_ make use of 
> > tracing, dont use it most of the time. So why should they have more 
> > overhead when they are not tracing? Again: the point is not moot 
> > because even though the user intends to use tracing, but does not 
> > always want to trace.
> 
> I've used kernels which included static tracing and the perfomance 
> overhead is negligible for occasional use.

how does this suddenly make my point, that "a marker for dynamic tracing 
has lower performance impact than a static tracepoint, on systems that 
are not being traced", "moot"?

> > > >  - having static tracepoints dillutes the incentive for 
> > > >  architectures to
> > > >    implement proper kprobes support.
> > > 
> > > Considering the level of work needed to support efficient dynamic 
> > > tracing it only withholds archs from tracing support for no good 
> > > reason.
> > 
> > 5 major architectures (both RISC and CISC) already support kprobes, 
> > so fortunately this point is largely moot - but you are right to a 
> > certain degree, it's not totally solved. But the examples are there. 
> > It's still not trivial to implement a feature like this, but kernel 
> > programming never is. I far more prefer the harder but more 
> > intelligent solution than the easier but less intelligent solution - 
> > even if that means a temporary unavailability of a feature for some 
> > rarer arch.
> 
> Why don't you leave the choice to the users? Why do you constantly 
> make it an exclusive choice? [...]

as i outlined it tons of times before: once we add markups for static 
tracers, we cannot remove them. That is a constant kernel maintenance 
drag that i feel uncomfortable about. While with dynamic tracers i see a 
clear path out of any such drag. We can, in a very finegrained way, tune 
the overhead of markups vs. out-of-source scripts. Static tracers dont 
give us this flexibility - and hence limit our future choices.

the user of course does not care about kernel internal design and 
maintenance issues. Think about the many reasons why STREAMS was 
rejected - users wanted that too. And note that users dont want "static 
tracers" or any design detail of LTT in particular: what they want is 
the _functionality_ of LTT.

nor do i reject all of LTT: as i said before i like the tools, and i 
think its collection of trace events should be turned into systemtap 
markups and scripts. Furthermore, its ringbuffer implementation looks 
better. So as far as the user is concerned, LTT could (and should) live 
on with full capabilities, but with this crucial difference in how it 
interfaces to the kernel source code.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:14                                                         ` Ingo Molnar
@ 2006-09-15 23:49                                                           ` Nicholas Miell
  2006-09-15 23:57                                                             ` Ingo Molnar
  2006-09-16  0:31                                                           ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Nicholas Miell @ 2006-09-15 23:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Sat, 2006-09-16 at 01:14 +0200, Ingo Molnar wrote:
> * Roman Zippel <zippel@linux-m68k.org> wrote:
> 
> > > > This is simply not true, at the source level you can remove a static 
> > > > tracepoint as easily as a dynamic tracepoint, the effect of the 
> > > > missing trace information is the same either way.
> > > 
> > > this is not true. I gave you one example already a few mails ago (which 
> > > you did not reply to, neither did you reply the previous time when i 
> > > first mentioned this - perhaps you missed it in the high volume of 
> > > emails):
> > > 
> > > " i outlined one such specific "removal of static tracepoint" example 
> > >   already: static trace points at the head/prologue of functions (half 
> > >   of the existing tracepoints are such). The sock_sendmsg() example i 
> > >   quoted before is such a case. Those trace points can be replaced with 
> > >   a simple GCC function attribute, which would cause a 5-byte (or 
> > >   whatever necessary) NOP to be inserted at the function prologue. The 
> > >   attribute would be alot less invasive than an explicit tracepoint (and 
> > >   thus easier to maintain) "
> > 
> > As I said before you're mixing up function tracing with event tracing, 
> > not all events are tied to functions, functions can be moved and 
> > renamed, the actual event more often stays the same.
> 
> you are showing a clear misunderstanding of how tracing is typically 
> done. Both for LTT and for blktrace (and for the tracers i've done 
> myself), roughly half (50%) of the tracepoints are right at the top of 
> the function and trace the function arguments. Let me quote an example 
> straight from LTT:
> 
>  int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>  {
>          struct kiocb iocb;
>          struct sock_iocb siocb;
>          int ret;
> 
>          trace_socket_sendmsg(sock, sock->sk->sk_family,
>                  sock->sk->sk_type,
>                  sock->sk->sk_protocol,
>                  size);
> 
> this tracepoint, under a dynamic tracing concept, can be replaced with:
> 
>  int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>  {
>          struct kiocb iocb;
>          struct sock_iocb siocb;
>          int ret;
> 
> note the "__trace" attribute to the function. (see my previous mails 
> where i talked about __trace for more details) SystemTap can hook to 
> that point and can access the very same parameters that the markup does, 
> in a lot less invasive way.
> 
> So a 5-line markup can be replaced with a single function attribute.
> 
> roughly half of the existing tracepoints in blktrace/LTT can be replaced 
> that way. A 50% reduction in the number of markups is significant - but 
> such a reduction in markups not possible under the static tracing 
> concept. And that method was just off the top of my head - Andrew 
> provided other ideas to reduce the number of markups.
> 

You're going to want to be able to trace every function in the kernel,
which means they'd all need a __trace -- and in that case, a
-fpad-functions-for-tracing gcc option would make more sense than
per-function attributes.

The option could also insert NOPs before RETs, not just before the
prologue so that function returns are equally easy to trace. (It might
also inhibit tail calls, assuming being able to trace all function
returns is more important than that optimization.)


And SystemTap can already hook into sock_sendmsg() (or any other
function) and examine its arguments -- all of this GCC extension talk
is just performance enhancement.
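
To make the entry/exit idea concrete: GCC's existing
-finstrument-functions option already makes the compiler call a pair of
hooks around every function it builds. That is not the NOP-padding
option suggested above, only a userspace sketch of the same
compiler-assisted approach; traced_add() and the two counters are
made-up names.

```c
/* Hypothetical counters, only for this sketch. */
unsigned long enter_count;
unsigned long exit_count;

/*
 * When a file is built with -finstrument-functions, GCC emits a call to
 * these two hooks on entry to and exit from every function in it.  The
 * hooks must not be instrumented themselves, or they would recurse.
 */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
	enter_count++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
	exit_count++;
}

/* An ordinary function: no markup in its body or in its signature. */
int traced_add(int a, int b)
{
	return a + b;
}
```

Built without the option, the hooks are never called and both counters
stay at zero; the traced function's source is the same either way.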

-- 
Nicholas Miell <nmiell@comcast.net>


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:49                                                           ` Nicholas Miell
@ 2006-09-15 23:57                                                             ` Ingo Molnar
  2006-09-16  0:41                                                               ` Nicholas Miell
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:57 UTC (permalink / raw)
  To: Nicholas Miell
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Nicholas Miell <nmiell@comcast.net> wrote:

> You're going to want to be able to trace every function in the kernel, 
> which means they'd all need a __trace -- and in that case, a 
> -fpad-functions-for-tracing gcc option would make more sense than 
> per-function attributes.

the __trace attribute would be a _specific_ replacement for a _specific_ 
static markup at the entry of a function. So no, we would not want to 
add __trace to _every_ function in the kernel: only those which get 
commonly traced. And note that SystemTap can trace the rest too, just 
with slightly higher overhead.

In that sense __trace is not an enabling infrastructure, it's a 
performance tuning infrastructure.

> The option could also insert NOPs before RETs, not just before the 
> prologue so that function returns are equally easy to trace. (It might 
> also inhibit tail calls, assuming being able to trace all function 
> returns is more important than that optimization.)

yeah. __trace_entry and __trace_exit [or both] attributes. Makes sense.

> And SystemTap can already hook into sock_sendmsg() (or any other 
> function) and examine its arguments -- all of this GCC extension talk 
> is just performance enhancement.

yes, yes, yes, exactly!!! Finally someone reads my mails and understands 
my points. There's hope! ;)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:57                                                             ` Ingo Molnar
@ 2006-09-16  0:41                                                               ` Nicholas Miell
  0 siblings, 0 replies; 271+ messages in thread
From: Nicholas Miell @ 2006-09-16  0:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Sat, 2006-09-16 at 01:57 +0200, Ingo Molnar wrote:
> * Nicholas Miell <nmiell@comcast.net> wrote:
> 
> > You're going to want to be able to trace every function in the kernel, 
> > which means they'd all need a __trace -- and in that case, a 
> > -fpad-functions-for-tracing gcc option would make more sense than 
> > per-function attributes.
> 
> the __trace attribute would be a _specific_ replacement for a _specific_ 
> static markup at the entry of a function. So no, we would not want to 
> add __trace to _every_ function in the kernel: only those which get 
> commonly traced. And note that SystemTap can trace the rest too, just 
> with slightly higher overhead.
> 
> In that sense __trace is not an enabling infrastructure, it's a 
> performance tuning infrastructure.
> 
> > The option could also insert NOPs before RETs, not just before the 
> > prologue so that function returns are equally easy to trace. (It might 
> > also inhibit tail calls, assuming being able to trace all function 
> > returns is more important than that optimization.)
> 
> yeah. __trace_entry and __trace_exit [or both] attributes. Makes sense.
> 
> > And SystemTap can already hook into sock_sendmsg() (or any other 
> > function) and examine its arguments -- all of this GCC extension talk 
> > is just performance enhancement.
> 
> yes, yes, yes, exactly!!! Finally someone reads my mails and understands 
> my points. There's hope! ;)

I'm not sure that I do, actually.

You seem to be opposed to all static probe markers in general, but I
think that they'd be useful for big abstract things like "new thread
created" (which would encompass fork/vfork/clone and probably consist of
a single marker in do_fork) or for similar things that happen all over
the kernel (for example, I imagine that all filesystems would want to
use the same set of probe names just to make I/O tracing easier for
userspace).
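
A rough userspace sketch of what such a named marker could look like,
assuming a made-up API (marker_set_probe(), trace_mark() and the event
names are all hypothetical, not anything from LTT or SystemTap):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker API, invented for this sketch. */
typedef void (*probe_fn)(const char *event, long arg);

struct marker {
	const char *name;
	probe_fn probe;		/* NULL while nobody is tracing */
};

static struct marker markers[] = {
	{ "thread_create", NULL },
	{ "fs_read",       NULL },
};

#define NR_MARKERS (sizeof(markers) / sizeof(markers[0]))

/* Attach (or detach, with fn == NULL) a probe by event name. */
int marker_set_probe(const char *name, probe_fn fn)
{
	size_t i;

	for (i = 0; i < NR_MARKERS; i++) {
		if (strcmp(markers[i].name, name) == 0) {
			markers[i].probe = fn;
			return 0;
		}
	}
	return -1;	/* no such event */
}

/* The marker itself: a single pointer test while tracing is off. */
static inline void trace_mark(struct marker *m, long arg)
{
	if (m->probe)
		m->probe(m->name, arg);
}

/* Stand-in for do_fork(): fork, vfork and clone would all end up here. */
long fake_do_fork(long new_pid)
{
	trace_mark(&markers[0], new_pid);
	return new_pid;
}

/* Example probe: remember the last pid and count the events. */
long last_pid;
int nr_events;

void count_probe(const char *event, long arg)
{
	(void)event;
	last_pid = arg;
	nr_events++;
}
```

Userspace would only ever see the event name, so the function behind it
could be renamed or split without the tracing scripts noticing.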




-- 
Nicholas Miell <nmiell@comcast.net>


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 23:14                                                         ` Ingo Molnar
  2006-09-15 23:49                                                           ` Nicholas Miell
@ 2006-09-16  0:31                                                           ` Roman Zippel
  2006-09-16  8:20                                                             ` Ingo Molnar
                                                                               ` (6 more replies)
  1 sibling, 7 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-16  0:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Sat, 16 Sep 2006, Ingo Molnar wrote:

> > As I said before you're mixing up function tracing with event tracing, 
> > not all events are tied to functions, functions can be moved and 
> > renamed, the actual event more often stays the same.
> 
> you are showing a clear misunderstanding of how tracing is typically 
> done.

Not really, you're missing the point I'm trying to make: we want to trace 
_events_, not functions. Function-specific tracing would still require a 
kernel-specific mapping from function names to events.

> Both for LTT and for blktrace (and for the tracers i've done 
> myself), roughly half (50%) of the tracepoints are right at the top of 
> the function and trace the function arguments. Let me quote an example 
> straight from LTT:
> 
>  int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>  {
>          struct kiocb iocb;
>          struct sock_iocb siocb;
>          int ret;
> 
>          trace_socket_sendmsg(sock, sock->sk->sk_family,
>                  sock->sk->sk_type,
>                  sock->sk->sk_protocol,
>                  size);
> 
> this tracepoint, under a dynamic tracing concept, can be replaced with:
> 
>  int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>  {
>          struct kiocb iocb;
>          struct sock_iocb siocb;
>          int ret;
> 
> note the "__trace" attribute to the function. (see my previous mails 
> where i talked about __trace for more details) SystemTap can hook to 
> that point and can access the very same parameters that the markup does, 
> in a lot less invasive way.
> 
> So a 5-line markup can be replaced with a single function attribute.

A nice example where you make life more difficult for static tracers for 
no reason, whereas a "trace_socket_sendmsg(sock, size);" is just as 
usable. It would also add virtually no maintenance overhead as you like 
to claim - how often does this function change?
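
To spell out what the shorter markup would mean, here is a userspace
sketch with cut-down stand-ins for the kernel structures. The
record_socket_sendmsg() helper, the TRACE_DISABLED switch and
fake_sock_sendmsg() are assumptions made for illustration only; the
point is that the tracer can pull family/type/protocol out of sock
itself, so the call site needs only two arguments:

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; not the real layouts. */
struct sock {
	int sk_family;
	int sk_type;
	int sk_protocol;
};

struct socket {
	struct sock *sk;
};

/* What the tracer ends up recording for one sendmsg event. */
struct sendmsg_event {
	int family, type, protocol;
	size_t size;
};

struct sendmsg_event last_event;

/* The tracer, not the call site, digs the fields out of sock. */
void record_socket_sendmsg(const struct socket *sock, size_t size)
{
	last_event.family   = sock->sk->sk_family;
	last_event.type     = sock->sk->sk_type;
	last_event.protocol = sock->sk->sk_protocol;
	last_event.size     = size;
}

/* TRACE_DISABLED is a hypothetical switch standing in for a Kconfig option. */
#ifndef TRACE_DISABLED
#define trace_socket_sendmsg(sock, size) record_socket_sendmsg(sock, size)
#else
#define trace_socket_sendmsg(sock, size) do { } while (0)
#endif

/* The call site shrinks to a single line. */
int fake_sock_sendmsg(struct socket *sock, size_t size)
{
	trace_socket_sendmsg(sock, size);
	return (int)size;
}
```

With the switch defined, the macro expands to nothing and the call site
costs nothing at all.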

> > Function attributes also don't provide information local to the 
> > function.
> 
> of course, but where does the above tracepoint i quoted use information 
> local to the function? A fair number of markups use global functions 
> because, surprise, a lot of interesting activity happens along global 
> functions. So a healthy reduction in markups can be achieved.

But not completely, which is the whole point.

> > It's possible I missed something, but pretty much anything you 
> > outlined wouldn't make the life of static tracepoints any easier.
> 
> sorry, but if you re-read the above line of argument, your sentence 
> appears a non sequitur. I said "the markers needed for dynamic tracing are 
> different from the LTT static tracepoints". You asked why they are so 
> different, and i replied that i already outlined what the right API 
> would be in my opinion to do markups, but that API is different from 
> what LTT is offering now. To which you are now replying: "pretty much 
> anything you outlined wouldn't make the life of static tracepoints any 
> easier." Huh?

Yeah, huh?
I have no idea what you're trying to tell me. As you demonstrated above 
your "right API" is barely usable for static tracers.

> > So if dynamic tracing is available use it, as distributions already 
> > do. OTOH the barrier to use static tracing is drastically different 
> > whether the user has to deal with external patches or whether it's a 
> > simple kernel option. Again, static tracing doesn't exclude the 
> > possibility of dynamic tracing, that's something you constantly omit 
> > and thus make it sound like both options were mutually exclusive.
> 
> how does this reply to my point that: "a marker for dynamic tracing has 
> lower performance impact than a static tracepoint, on systems that are 
> not being traced", which point you claimed moot?

Because it's pretty much an implementation issue. The point is about 
adding markers at all; it's about having the choice to use static 
tracers in the first place. Both undeniably have their advantages and 
disadvantages, but you prefer to emphasize only the strong points of 
dynamic tracing and constantly declare its problems nonissues.

> > > Secondly, even people who intend to _eventually_ make use of 
> > > tracing, dont use it most of the time. So why should they have more 
> > > overhead when they are not tracing? Again: the point is not moot 
> > > because even though the user intends to use tracing, but does not 
> > > always want to trace.
> > 
> > I've used kernels which included static tracing and the performance 
> > overhead is negligible for occasional use.
> 
> how does this suddenly make my point, that "a marker for dynamic tracing 
> has lower performance impact than a static tracepoint, on systems that 
> are not being traced", "moot"?

Why exactly is the point relevant in the first place? How exactly is the 
added (minor!) overhead such a fundamental problem?

> > Why don't you leave the choice to the users? Why do you constantly 
> > make it an exclusive choice? [...]
> 
> as i outlined it tons of times before: once we add markups for static 
> tracers, we cannot remove them. That is a constant kernel maintenance 
> drag that i feel uncomfortable about.

As many, many people have already said, any tracepoint has a 
maintenance overhead, which barely differs between dynamic and 
static tracing and only increases the further away the tracepoints are 
from the source.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
@ 2006-09-16  8:20                                                             ` Ingo Molnar
  2006-09-16  8:21                                                             ` Ingo Molnar
                                                                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:20 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > this tracepoint, under a dynamic tracing concept, can be replaced with:
> > 
> >  int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>
> A nice example where you make life more difficult for static tracers 
> for no reason, [...]

No, it's simply a clever feature: "halve the impact of static markups".

What you say is _precisely_ the kind of situation that makes me 
very wary of static tracers. Someone does something smart that enables 
us to remove half of the tracepoints from the kernel source code, while 
you will go on and complain: "why do you make the life harder for static 
tracers". You, perhaps unwillingly, are giving the perfect demonstration 
of why static tracepoints are a maintenance problem: once added _they 
can not be removed without breaking static tracers_.

And i see you didnt reply to (and you didnt even quote) the paragraph 
that i believe answers your point:

> > the user of course does not care about kernel internal design and 
> > maintenance issues. Think about the many reasons why STREAMS was 
> > rejected - users wanted that too. And note that users dont want 
> > "static tracers" or any design detail of LTT in particular: what 
> > they want is the _functionality_ of LTT.

The kernel tree is not there to make it easier for inferior approaches. 
How hard is it for the static tracer folks to take a look at dynamic 
tracers and realize that it's the fundamentally better approach, for the 
reasons above and for other reasons, and pick the concept up and 
integrate it with their code? Just like the STREAMS folks had a chance 
to look at the existing TCP/IP implementation in the Linux kernel and 
had the chance to realize that it's the better approach. Yet they 
insisted on just adding a few hooks here and there, to "make the life 
easier for STREAMS".

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
  2006-09-16  8:20                                                             ` Ingo Molnar
@ 2006-09-16  8:21                                                             ` Ingo Molnar
  2006-09-16  8:21                                                             ` Ingo Molnar
                                                                               ` (4 subsequent siblings)
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:21 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> [...] It would also add virtually no maintenance overhead as you like 
> to claim - how often does this function change?

as i said, roughly half of the tracepoints are like this - and some of 
them in functions in frequented places. That's far from "virtually no 
maintenance overhead". In the -rt tree i never have more than a dozen 
static tracepoints, yet even this small amount caused at least 5 extra 
-rt tree iterations due to various breakages (build problems or even 
crashes). Cruft comes in small steps, and my worry is that such 
_unremovable_ markups will be cruft that never shrinks. With dynamic 
tracers i see the _chance_ for cruft to shift to places where it does 
not hurt, if that cruft turns out to become a hindrance.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
  2006-09-16  8:20                                                             ` Ingo Molnar
  2006-09-16  8:21                                                             ` Ingo Molnar
@ 2006-09-16  8:21                                                             ` Ingo Molnar
  2006-09-16  8:22                                                             ` Ingo Molnar
                                                                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:21 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > > > This is simply not true, at the source level you can remove a 
> > > > > static tracepoint as easily as a dynamic tracepoint, the 
> > > > > effect of the missing trace information is the same either way.
> > > >
> > > > this is not true. I gave you one example already a few mails ago
> > > > [...]
> > >
> > > Function attributes also don't provide information local to the 
> > > function.
> > 
> > of course, but where does the above tracepoint i quoted use 
> > information local to the function? A fair number of markups use 
> > global functions because, surprise, a lot of interesting activity 
> > happens along global functions. So a healthy reduction in markups 
> > can be achieved.
> 
> But not completely, which is the whole point.

the point was what you said above, which i claimed and still claim to be 
false: "at the source level you can remove a static tracepoint as easily 
as a dynamic tracepoint, the effect of the missing trace information is 
the same either way."

Your point is still incorrect. I gave you an example of how half of the 
tracepoints could be removed under a dynamic scheme - while they couldnt 
be removed under a static scheme. Hence that directly contradicts your 
contention that "you can remove a static tracepoint as easily as a 
dynamic tracepoint". Nothing more, nothing less. I just pointed out the 
point in your thinking that i believe to be incorrect.

Reality is that you can remove a dynamic tracepoint much easier, due to 
the fundamental flexibility of dynamic tracers. While with static 
tracers, every tracepoint has to be _somewhere_ in the source code, 
otherwise people like you will complain just like you did in this mail: 
"you make life more difficult for static tracers for no reason".

You can concede my point or you can dispute that argument - but what you 
did above was neither: you snipped all the quotations and you claimed a 
totally new point. (which new point i never argued with: _of course_ i 
never claimed that __trace function attributes can remove _all_ markups. 
They can "only" remove half of them.)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
                                                                               ` (2 preceding siblings ...)
  2006-09-16  8:21                                                             ` Ingo Molnar
@ 2006-09-16  8:22                                                             ` Ingo Molnar
  2006-09-16 19:58                                                               ` Roman Zippel
  2006-09-16  8:23                                                             ` Ingo Molnar
                                                                               ` (2 subsequent siblings)
  6 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:22 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > It's possible I missed something, but pretty much anything you 
> > > outlined wouldn't make the life of static tracepoints any easier.
> > 
> > sorry, but if you re-read the above line of argument, your sentence 
> > appears a non sequitur. I said "the markers needed for dynamic tracing are 
> > different from the LTT static tracepoints". You asked why they are so 
> > different, and i replied that i already outlined what the right API 
> > would be in my opinion to do markups, but that API is different from 
> > what LTT is offering now. To which you are now replying: "pretty much 
> > anything you outlined wouldn't make the life of static tracepoints any 
> > easier." Huh?
> 
> Yeah, huh?
>
> I have no idea what you're trying to tell me. As you demonstrated 
> above your "right API" is barely usable for static tracers.

you raise a new point again (without conceding or disputing the point we 
were discussing, which point you snipped from your reply) but i'm happy 
to reply to this new point too: my suggested API is not "barely usable" 
for static tracers but "totally unusable". Did i tell you yet that i 
disagree with the addition of markups for static tracers?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  8:22                                                             ` Ingo Molnar
@ 2006-09-16 19:58                                                               ` Roman Zippel
  2006-09-16 22:50                                                                 ` Ingo Molnar
                                                                                   ` (2 more replies)
  0 siblings, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-16 19:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

I don't know why you split this into multiple subthreads. Instead of 
delving further into secondary issues, please let me get back to the 
primary issues to put everything a little into perspective.

The foremost issue is still that there is only limited kprobes support. 
The way you ignore this and try to make it a non-issue appears rather 
arrogant to me. I appreciate that you want to push technology forward, 
but it's rather ignorant how you leave people who can't keep up behind 
in the dust, by making it very hard for them to get access to tracing 
in the kernel.
Since I have quite a good idea of the amount of work needed to implement 
a second-rate kprobes hack, first-rate kprobes support and first-rate 
ltt(ng) support, it's quite a simple decision what I'm going to do. Since 
your "incentive" to add kprobes support is not very high, it's more likely 
to backfire by making you the jerk denying me easy access to tracing 
technologies.

Since my options are right now limited to a static tracer in the first place, 
most of the issues you mentioned over the various mails become really 
moot, e.g. why should I care about the overhead of inactive traces? We can 
happily discuss the merits of dynamic tracers forever, but it does _not_ 
change my current situation, that I have no access to one on some machines 
I care about.

The main issue in supporting static tracers is the tracepoints, and so far 
I haven't seen any convincing proof that the maintenance overhead of 
dynamic and static tracepoints has to be significantly different. What you 
did is construct a worst-case scenario, which only proves that it's 
possible; what it doesn't prove is that there are no measures to prevent 
this from happening. This means nobody has proved so far that it's not 
possible to create and enforce a set of rules to keep the number and 
effect of tracepoints under control.
Let's take your example of a tracepoint in an area of high development 
activity: since such development should happen in -mm, it would be no 
problem to drop the trace and add it back once development has calmed down, 
exactly like you would do for a dynamic trace. OTOH it's very well 
possible some people might find the trace useful during development.
So the problem here is that you simply work from the unproven premise 
that static tracepoints automatically lead to uncontrolled chaos. This 
makes a reasonable discussion about managing tracepoints impossible, since 
you don't want to support static tracepoints at all.

Ingo, as long as you don't give up this zero tolerance strategy, it 
doesn't make much sense to discuss details and I can only hope there are 
other people who are more reasonable...

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:58                                                               ` Roman Zippel
@ 2006-09-16 22:50                                                                 ` Ingo Molnar
  2006-09-16 23:00                                                                 ` Ingo Molnar
  2006-09-16 23:14                                                                 ` Ingo Molnar
  2 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 22:50 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> I don't know why you split this into multiple subthreads [...]

huh? Maybe because the mail got ... too big?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:58                                                               ` Roman Zippel
  2006-09-16 22:50                                                                 ` Ingo Molnar
@ 2006-09-16 23:00                                                                 ` Ingo Molnar
  2006-09-17  1:15                                                                   ` Roman Zippel
  2006-09-16 23:14                                                                 ` Ingo Molnar
  2 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:00 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Since my options are right now limited to a static tracer in the first 
> place, [...]

Lets see the equation of the current situation. On one side you want 
static tracing but you dont want to implement kprobes on m68k - although 
you probably could. On the other side there is the main kernel, which, 
if it ever accepted static tracepoints, could probably never get rid of 
them.

so, you request the main kernel to accept hundreds of static tracepoints 
that would probably never go away, just because you are reluctant at the 
moment to implement kprobes? And that only to bridge a temporary period 
of time when m68k has no kprobes support yet? Combined with the fact 
that m68k was just fine without tracing for 13 years? Did i get that 
right?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 23:00                                                                 ` Ingo Molnar
@ 2006-09-17  1:15                                                                   ` Roman Zippel
  2006-09-17  8:42                                                                     ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17  1:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> Lets see the equation of the current situation. On one side you want 
> static tracing but you dont want to implement kprobes on m68k - although 
> you probably could.

You would have a point if it were just about m68k.

> On the other side there is the main kernel, which, 
> if it ever accepted static tracepoints, could probably never get rid of 
> them.

If they are useful and not hurting anyone, why should we?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17  1:15                                                                   ` Roman Zippel
@ 2006-09-17  8:42                                                                     ` Ingo Molnar
  2006-09-17 15:16                                                                       ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17  8:42 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > On the other side there is the main kernel, which, if it ever 
> > accepted static tracepoints, could probably never get rid of them.
> 
> If they are useful and not hurting anyone, why should we?

FYI, whether it is true that they are "not hurting anyone" is one of those 
"secondary issues" that I analyzed in great detail in the emails 
yesterday, and which you opted not to "delve further into":

   Message-ID: <20060916082347.GG6317@elte.hu>:

    ' That is a constant kernel maintenance drag that i feel 
      uncomfortable about. '

   Message-ID: <20060916082107.GB6317@elte.hu>:

    'That's far from "virtually no maintenance overhead".'

   Message-ID: <20060916082054.GA6317@elte.hu>:

    'static tracepoints are a maintenance problem: once added _they can 
     not be removed without breaking static tracers_.'

I still very much hold that your claim that static tracepoints are not 
hurting anyone is false: they can cause significant maintenance 
overhead in the long run that we cannot remove, and these costs 
add up over a long period of time.

We have statements from two people who have /used and hacked/ LTT in 
products and have seen LTT's use, indicating that the maintenance 
overhead is nonzero and that the combined number of tracepoints in use 
by actual customers is much larger than posited in this thread. We also 
have LTT proponents disputing that and suggesting that the long-term 
maintenance overhead is very low. So even taking my opinion out of the 
picture, the picture is far from clear. If we put my opinion back into 
the picture: i base it on my first-hand experience with tracers. [**]

so at least to me the rule in such a situation is clear: if we have the 
choice between two approaches that are useful in similar ways [*] but 
one has greater flexibility to decrease the total maintenance cost, 
then we _must_ pick that one.

This really isnt rocket science, we make such decisions every day. We made 
that decision for STREAMS too (an argument you have ignored a number of 
times). STREAMS was a similar situation: people wanted "just a 
few unintrusive hooks which you could compile out" for external STREAMS 
functionality to hook into.

and unlike STREAMS, in the LTT case it's not the totality of the project 
that is being disputed: i only dispute the static tracing aspect of it, 
which is a comparatively small (but intrusive) portion of a project that 
consists of a 26,000-line kernel patchset and a large body of userspace 
tools.

	Ingo

[*] furthermore, dynamic tracing is not only "similarly useful", it is
    _more useful_ because it allows a lot more than static tracing does. 
    That's why i analyzed the "secondary issue" of the usefulness of 
    dynamic tracers: the decision gets easier if one of the technologies 
    is fundamentally more capable.

[**] Also, just yesterday i tried to merge the 2.6.17 version of the LTT 
     patchset to 2.6.18, and it created non-trivial rejects left and 
     right. That is a further objective indicator to me - if something 
     has low maintenance cost, how come its patchset is so intrusive 
     that it cannot survive 3 months of kernel development flux? If it's 
     intrusive, shouldn't we have the fundamental option to shift that 
     maintenance overhead out of the core kernel, back to the people 
     that want those features?

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17  8:42                                                                     ` Ingo Molnar
@ 2006-09-17 15:16                                                                       ` Roman Zippel
  2006-09-17 15:25                                                                         ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 15:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> > If they are useful and not hurting anyone, why should we?
> 
> FYI, whether it is true that "they not hurting anyone" is one of those 
> "secondary issues" that I analyzed in great detail in the emails 
> yesterday, and which you opted not to "further delve into":

Ingo, you happily still ignore my primary issues, how seriously do you 
expect me to take this?

> so at least to me the rule in such a situation is clear: if we have the
> choice between two approaches that are useful in similar ways [*] but
> one has a larger flexibility to decrease the total maintenance cost,
> then we _must_ pick that one.

That would assume the choices are mutually exclusive, which you haven't 
proven at all.

To put everything in yet another perspective: We have the kernel full of 
security hooks, which are likely more invasive than any trace marker ever 
will be. These security hooks are well hated by a few developers, but we 
merged them anyway, because they are useful.
So the big question is now, why should it be impossible to create and 
merge a well defined set of markers, which can be used by any tracer?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 15:16                                                                       ` Roman Zippel
@ 2006-09-17 15:25                                                                         ` Ingo Molnar
  2006-09-17 16:02                                                                           ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:25 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Ingo, you happily still ignore my primary issues, how seriously do you 
> expect me to take this?

I did not ignore your new "primary issues", to the contrary. Please read 
my replies. To recap, your "primary issues" are:

> The foremost issue is still that there is only limited kprobes 
> support.

> The main issue in supporting static tracers is the tracepoints and so 
> far I haven't seen any convincing proof that the maintenance overhead 
> of dynamic and static tracepoints has to be significantly different.

to both points i (and others) already replied in great detail - please 
follow up on them. (I can quote message-IDs if you cannot find them.)

[ Or if it's not these two then let me know if i missed some important 
  point - it's easy to miss a valid point in a sea of replies. 
  For example yesterday i have replied to 7 different issues _you_ 
  raised, partly issues where you have questioned my credibility and 
  competence, so i felt compelled to reply - but still you replied to 
  none of those mails, only declaring them "secondary" in a passing 
  reference. If they were secondary then why did you raise them in the 
  first place? Or do you summarily concede all those points by not 
  replying to them? And is there any guarantee that you will reply to
  any mails i write to you now? Will you declare them "secondary" too 
  once the argument appears to turn unfavorable to your position? ]

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 15:25                                                                         ` Ingo Molnar
@ 2006-09-17 16:02                                                                           ` Roman Zippel
  2006-09-17 16:45                                                                             ` Ingo Molnar
  2006-09-17 16:59                                                                             ` Nick Piggin
  0 siblings, 2 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 16:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> > The foremost issue is still that there is only limited kprobes 
> > support.
> 
> > The main issue in supporting static tracers is the tracepoints and so 
> > far I haven't seen any convincing proof that the maintenance overhead 
> > of dynamic and static tracepoints has to be significantly different.
> 
> to both points i (and others) already replied in great detail - please 
> follow up on them. (I can quote message-IDs if you cannot find them.)

What you basically tell me is (rephrased to make it more clear): Implement 
kprobes support or fuck off! You make it very clear that you're unwilling 
to support static tracers, even to the point of making _any_ static trace 
support impossible. It's impossible to discuss this with you, because 
you're absolutely unwilling to make any concessions. What am I supposed 
to do when it's very clear to me that you don't want to make any 
compromise anyway? You leave me _nothing_ to work with; that's the main 
reason I leave such things unanswered. AFAICT there is nothing I can do 
about that other than just repeating what I told you already, and you'll 
continue to ignore it, and I'm sick and tired of it.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 16:02                                                                           ` Roman Zippel
@ 2006-09-17 16:45                                                                             ` Ingo Molnar
  2006-09-17 16:59                                                                             ` Nick Piggin
  1 sibling, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 16:45 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > to both points i (and others) already replied in great detail - 
> > please follow up on them. (I can quote message-IDs if you cannot 
> > find them.)
> 
> What you basically tell me is (rephrased to make it more clear): 
> Implement kprobes support or fuck off! [...]

What i am saying (again and again) is: "the other option you suggest is 
not acceptable to me because a better solution exists" [for the many 
reasons outlined before]. Think about the STREAMS example: there too 
_that_ particular approach was rejected, because a better solution 
existed. (although it was a _much_ larger body of code that was 
rejected)

I'm not "forcing" kprobes on you: you can invent whatever other approach 
that solves the problems i and others raised, or you can have your own 
separate patchset - this is standard kernel acceptance procedure. 
Granted, kprobes is an existing solution with extensive existing 
infrastructure, so it's IMO the easiest solution technically, but you 
are certainly not 'forced' to do it. If you want the feature on your 
architecture _without_ kprobes, then solve the problems.

> [...] You make it very clear, that you're unwilling to support static 
> tracers even to the point of making _any_ static trace support impossible. 
> It's impossible to discuss this with you, because you're absolutely 
> unwilling to make any concessions. [...]

Because we either accept the concept of static tracing or not - 
unfortunately there's no meaningful middle ground. I'd love it if there 
was some meaningful middle-ground, because then we'd not have this 
lengthy discussion at all. But sometimes such situations do happen. The 
same was true for STREAMS: the only choice was to either accept it or 
reject it. One cannot get a "little bit pregnant".

The "add some static markups" suggestion is IMO just tactical pretense: 
static tracing will only be fully functional once it grows a 
comprehensive set of static tracepoints, so once we accept a "little 
bit" of static tracing where all the tools are built around a full set 
of tracepoints, we've created an expectation to have all of it.

Hence my suggestion: forget static tracing for the LTT engine and 
concentrate on dynamic tracepoints with _static markups_. Do you realize 
that dynamic tracers can insert _function calls_ into static markups, 
today? [and i'm not talking about djprobes here but current existing 
SystemTap behavior.]

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 16:02                                                                           ` Roman Zippel
  2006-09-17 16:45                                                                             ` Ingo Molnar
@ 2006-09-17 16:59                                                                             ` Nick Piggin
  2006-09-17 17:26                                                                               ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Nick Piggin @ 2006-09-17 16:59 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

Roman Zippel wrote:
> Hi,
> 
> On Sun, 17 Sep 2006, Ingo Molnar wrote:
> 
> 
>>>The foremost issue is still that there is only limited kprobes 
>>>support.
>>
>>>The main issue in supporting static tracers is the tracepoints and so 
>>>far I haven't seen any convincing proof that the maintenance overhead 
>>>of dynamic and static tracepoints has to be significantly different.

Above, weren't you asking about static vs dynamic trace-*points*, rather
than the implementation of the tracer itself? I think Ingo said that
some "static tracepoints" (eg. annotation) could be acceptable.

>>to both points i (and others) already replied in great detail - please 
>>follow up on them. (I can quote message-IDs if you cannot find them.)
> 
> 
> What you basically tell me is (rephrased to make it more clear): Implement 
> kprobes support or fuck off! You make it very clear, that you're unwilling 
> to support static tracers even to the point of making _any_ static trace support 

Now it seems you are talking about compiled vs runtime inserted traces,
which is different. And so far I have to agree with Ingo: dynamic seems
to be better in almost every way. Implementation may be more complex,
but that's never stood in the way of a better solution before, and I
don't think anybody has shown it to be prohibitive ("I won't implement
it" notwithstanding)

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 16:59                                                                             ` Nick Piggin
@ 2006-09-17 17:26                                                                               ` Roman Zippel
  2006-09-17 17:56                                                                                 ` Nick Piggin
  2006-09-17 19:23                                                                                 ` Ingo Molnar
  0 siblings, 2 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 17:26 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Mon, 18 Sep 2006, Nick Piggin wrote:

> > > > The foremost issue is still that there is only limited kprobes support.
> > > 
> > > > The main issue in supporting static tracers is the tracepoints and so
> > > > far I haven't seen any convincing proof that the maintenance overhead
> > > > of dynamic and static tracepoints has to be significantly different.
> 
> Above, weren't you asking about static vs dynamic trace-*points*, rather
> than the implementation of the tracer itself. I think Ingo said that
> some "static tracepoints" (eg. annotation) could be acceptable.

No, he made it rather clear, that as far as possible he only wants dynamic 
annotations (e.g. via function attributes).

> > What you basically tell me is (rephrased to make it more clear): Implement
> > kprobes support or fuck off! You make it very clear, that you're unwilling
> > to support static tracers even to the point of making _any_ static trace support 
> 
> Now it seems you are talking about compiled vs runtime inserted traces,
> which is different. And so far I have to agree with Ingo: dynamic seems
> to be better in almost every way. Implementation may be more complex,
> but that's never stood in the way of a better solution before, and I
> don't think anybody has shown it to be prohibitive ("I won't implement
> it" notwithstanding)

I don't deny that dynamic tracers are more flexible, but I simply don't 
have the resources to implement one. If those who demand I use a dynamic 
tracer would also provide the appropriate funding, it would change the 
situation completely, but without that I have to live with the tools 
available to me.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 17:26                                                                               ` Roman Zippel
@ 2006-09-17 17:56                                                                                 ` Nick Piggin
  2006-09-17 18:59                                                                                   ` Roman Zippel
  2006-09-17 21:32                                                                                   ` Ingo Molnar
  2006-09-17 19:23                                                                                 ` Ingo Molnar
  1 sibling, 2 replies; 271+ messages in thread
From: Nick Piggin @ 2006-09-17 17:56 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Roman Zippel wrote:
> Hi,
> 
> On Mon, 18 Sep 2006, Nick Piggin wrote:
> 
> 
>>Above, weren't you asking about static vs dynamic trace-*points*, rather
>>than the implementation of the tracer itself. I think Ingo said that
>>some "static tracepoints" (eg. annotation) could be acceptable.
> 
> 
> No, he made it rather clear, that as far as possible he only wants dynamic 
> annotations (e.g. via function attributes).

OK, we must have interpreted him differently. I won't speak for Ingo,
but he can respond if he likes.

>>Now it seems you are talking about compiled vs runtime inserted traces,
>>which is different. And so far I have to agree with Ingo: dynamic seems
>>to be better in almost every way. Implementation may be more complex,
>>but that's never stood in the way of a better solution before, and I
>>don't think anybody has shown it to be prohibitive ("I won't implement
>>it" notwithstanding)
> 
> 
> I don't deny that dynamic tracers are more flexible, but I simply don't 
> have the resources to implement one. If those who demand I use a dynamic 
> tracer, would also provide the appropriate funding, it would change the 
> situation completely, but without that I have to live with the tools 
> available to me.

You definitely don't have to use a dynamic tracer, nor even implement
one on m68k (that will presumably happen if/when somebody does want a
dynamic tracer enough).

But equally nobody can demand that a feature go into the upstream
kernel. Especially not if there is a more flexible alternative
already available that just requires implementing for their arch.

This shouldn't be surprising: the kernel doesn't have a doctrine of
unlimited choice, nor does it merge features just because they exist.
For example, people wanted pluggable (runtime and/or compile time) CPU
schedulers in the kernel. This was rejected (IIRC by Linus, Andrew, Ingo,
and myself). No doubt it would have been useful for a small number of
people, but it was decided that it would split testing and development
resources. The STREAMS example is another one.

As an aside, there are quite a number of different types of tracing
things (mostly static, compile out) in the kernel. Everything from
blktrace to various userspace notifiers to lots of /proc/stuff could
be considered a type of static event tracing. I don't know what my
point is other than all these big, disjoint frameworks trying to be
pushed into the kernel. Are there any plans for working some things
together, or is that somebody else's problem?

Nick

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 17:56                                                                                 ` Nick Piggin
@ 2006-09-17 18:59                                                                                   ` Roman Zippel
  2006-09-17 21:23                                                                                     ` Ingo Molnar
                                                                                                       ` (2 more replies)
  2006-09-17 21:32                                                                                   ` Ingo Molnar
  1 sibling, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 18:59 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Mon, 18 Sep 2006, Nick Piggin wrote:

> But equally nobody can demand that a feature go into the upstream
> kernel. Especially not if there is a more flexible alternative
> already available that just requires implementing for their arch.

I completely agree with you under the condition that these alternatives 
were mutually exclusive or conflicting with each other.

> This shouldn't be surprising, the kernel doesn't have a doctrine of
> unlimited choice or merge features because they exist.

Do we have a doctrine which forces us to design a feature in such a way 
that it has to be as difficult as possible to make it available to our 
users? In this case it would be very easy to provide some basic 
functionality via static tracing and the full functionality via dynamic 
tracing. Where is the law that forbids this?

> For example
> people wanted pluggable (runtime and/or compile time) CPU schedulers
> in the kernel. This was rejected (IIRC by Linus, Andrew, Ingo, and
> myself). No doubt it would have been useful for a small number of
> people but it was decided that it would split testing and development
> resources. The STREAMS example is another one.

Comparing it to STREAMS is an insult and Ingo should be aware of this. :-(

> As an aside, there are quite a number of different types of tracing
> things (mostly static, compile out) in the kernel. Everything from
> blktrace to various userspace notifiers to lots of /proc/stuff could
> be considered a type of static event tracing. I don't know what my
> point is other than all these big, disjoint frameworks trying to be
> pushed into the kernel. Are there any plans for working some things
> together, or is that somebody else's problem?

All the controversy around static tracing in general and LTT in particular 
has prevented this so far...

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 18:59                                                                                   ` Roman Zippel
@ 2006-09-17 21:23                                                                                     ` Ingo Molnar
  2006-09-17 21:52                                                                                       ` Roman Zippel
  2006-09-17 21:40                                                                                     ` Ingo Molnar
  2006-09-18  8:43                                                                                     ` Jes Sorensen
  2 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 21:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > For example people wanted pluggable (runtime and/or compile time) CPU 
> > schedulers in the kernel. This was rejected (IIRC by Linus, Andrew, 
> > Ingo, and myself). No doubt it would have been useful for a small 
> > number of people but it was decided that it would split testing and 
> > development resources. The STREAMS example is another one.
> 
> Comparing it to STREAMS is an insult and Ingo should be aware of this. 
> :-(

so in your opinion Nick's mentioning of STREAMS is an insult too? I 
certainly do not understand Nick's example as an insult. Is STREAMS now 
a dirty word to you that no-one is allowed to use as an example in 
kernel maintenance discussions?

Let me recap how I mentioned STREAMS for the first time: it was simply 
the best example i could think of when you asked the following question:

> > Why don't you leave the choice to the users? Why do you constantly 
> > make it an exclusive choice? [...]
>
> [...]
>
> the user of course does not care about kernel internal design and 
> maintenance issues. Think about the many reasons why STREAMS was 
> rejected - users wanted that too. And note that users dont want 
> "static tracers" or any design detail of LTT in particular: what they 
> want is the _functionality_ of LTT.

(see <20060915231419.GA24731@elte.hu> for the full context. Tellingly, 
that point of mine you have left unreplied too.)

btw., you still have not retracted or corrected your false suggestion 
that "concessions" or a "compromise" were possible and you did not 
retract or correct your false accusation that i "dont want to make 
them":

> It's impossible to discuss this with you, because you're absolutely 
> unwilling to make any concessions. What am I supposed to do when it's 
> very clear to me, that you don't want to make any compromise anyway?

while, as i explained it before, such a concession simply does not exist 
- so i am not in the position to "make such a concession". There are 
only two choices in essence: either we accept a generic static tracer, 
or we reject it.

(see <Pine.LNX.4.64.0609171744570.6761@scrub.home>)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 21:23                                                                                     ` Ingo Molnar
@ 2006-09-17 21:52                                                                                       ` Roman Zippel
  2006-09-17 22:27                                                                                         ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 21:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> btw., you still have not retracted or corrected your false suggestion 
> that "concessions" or a "compromise" were possible and you did not 
> retract or correct your false accusation that i "dont want to make 
> them":

Sorry, I have nothing to retract and I'm not interested in playing your 
word games. :-(

> > It's impossible to discuss this with you, because you're absolutely 
> > unwilling to make any concessions. What am I supposed to do when it's 
> > very clear to me, that you don't want to make any compromise anyway?
> 
> while, as i explained it before, such a concession simply does not exist 
> - so i am not in the position to "make such a concession". There are 
> only two choices in essence: either we accept a generic static tracer, 
> or we reject it.

Wrong, this is about the minimum support, which can be used by both static 
and dynamic tracers.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 21:52                                                                                       ` Roman Zippel
@ 2006-09-17 22:27                                                                                         ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 22:27 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Sun, 17 Sep 2006, Ingo Molnar wrote:
> 
> > btw., you still have not retracted or corrected your false suggestion 
> > that "concessions" or a "compromise" were possible and you did not 
> > retract or correct your false accusation that i "dont want to make 
> > them":
> 
> Sorry, I have nothing to retract and I'm not interested in playing 
> your word games. :-(

you are wrong if you call my asking you to retract your false suggestion 
and false accusation a "word game". It is my basic right to point out 
misrepresentations, false statements, false accusations and 
misinterpretations when i see them. The sentences i pointed out were not 
just opinions, they were materially false statements of yours. But you 
are of course free to not retract or correct them (or to not dispute my 
characterization of them as such).

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 18:59                                                                                   ` Roman Zippel
  2006-09-17 21:23                                                                                     ` Ingo Molnar
@ 2006-09-17 21:40                                                                                     ` Ingo Molnar
  2006-09-18  8:43                                                                                     ` Jes Sorensen
  2 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 21:40 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > As an aside, there are quite a number of different types of tracing 
> > things (mostly static, compile out) in the kernel. Everything from 
> > blktrace to various userspace notifiers to lots of /proc/stuff could 
> > be considered a type of static event tracing. I don't know what my 
> > point is other than all these big, disjoint frameworks trying to be 
> > pushed into the kernel. Are there any plans for working some things 
> > together, or is that somebody else's problem?
> 
> All the controversy around static tracing in general and LTT in 
> particular has prevented this so far...

BLKTRACE is a special-purpose tracing facility limited to one subsystem 
and written and maintained by the /same/ person (Jens) who maintains 
that subsystem. He maintains the subsystem, the tracer and the userspace 
tool that extracts the tracer data.

LTT on the other hand is a static tracer that affects _all_ subsystems. 
That is a very different situation from a maintenance overhead POV, and 
i believe you must know that.

your suggestion that this controversy has prevented consolidation in 
this area is baseless and misleading, please correct or retract it.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 18:59                                                                                   ` Roman Zippel
  2006-09-17 21:23                                                                                     ` Ingo Molnar
  2006-09-17 21:40                                                                                     ` Ingo Molnar
@ 2006-09-18  8:43                                                                                     ` Jes Sorensen
  2 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18  8:43 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Ingo Molnar, Thomas Gleixner, karim, Andrew Morton,
	Paul Mundt, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Roman Zippel wrote:
> Hi,
> 
> On Mon, 18 Sep 2006, Nick Piggin wrote:
> 
>> But equally nobody can demand that a feature go into the upstream
>> kernel. Especially not if there is a more flexible alternative
>> already available that just requires implementing for their arch.
> 
> I completely agree with you under the condition that these alternatives 
> were mutually exclusive or conflicting with each other.

Roman,

I don't get this, you are arguing that we should put it in because it
doesn't do any damage. First of all it does, by adding a lot of clutter
all over the place. Second, if we take that argument, then we should
allow anybody to put in anything they want, are you also suggesting we
put devfs back in?

Point is that the Linux kernel gets so many proposals: some are good,
some are bad, and some, while maybe looking like a good idea at the
beginning, turn out later to be a bad idea - LTT falls into this
category. *However*, it doesn't mean the knowledge and tools that were
developed with LTT are bad or useless.

To take another related project, look at relayfs. There was so much
noise about it when it was initially pushed, yuck I even remember how it
was suggested that printk should be implemented via relayfs. But look at
it now, there is no fs/relayfs/* these days. The kernel moved on, used
the knowledge obtained and provided the feature in a better way -
exactly like it is being proposed to do for trace points, by using
dynamic probes.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 17:56                                                                                 ` Nick Piggin
  2006-09-17 18:59                                                                                   ` Roman Zippel
@ 2006-09-17 21:32                                                                                   ` Ingo Molnar
  1 sibling, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 21:32 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> As an aside, there are quite a number of different types of tracing 
> things (mostly static, compile out) in the kernel. Everything from 
> blktrace to various userspace notifiers to lots of /proc/stuff could 
> be considered a type of static event tracing. I don't know what my 
> point is other than all these big, disjoint frameworks trying to be 
> pushed into the kernel. Are there any plans for working some things 
> together, or is that somebody else's problem?

AFAIK Jens has indicated interest in seeing experiments that would try 
to replace BLKTRACE with dynamic tracepoints, so it's being worked on.

but yes, that would be the general idea: to turn all existing ad-hoc 
tracing/debugging points in the kernel into static SystemTap markers or 
SystemTap scripts.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 17:26                                                                               ` Roman Zippel
  2006-09-17 17:56                                                                                 ` Nick Piggin
@ 2006-09-17 19:23                                                                                 ` Ingo Molnar
  2006-09-17 19:45                                                                                   ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 19:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > [...] I think Ingo said that some "static tracepoints" (eg. 
> > annotation) could be acceptable.
> 
> No, he made it rather clear, that as far as possible he only wants 
> dynamic annotations (e.g. via function attributes).

what you say is totally and utterly nonsensical misrepresentation of 
what i have said. I always said: i support in-source annotations too (I 
even suggested APIs how to do them), as long as they are not a total 
_guaranteed_ set destined for static tracers, i.e. as long as they are 
there for the purpose of dynamic tracers. I dont _care_ about static 
annotations as long as they are there for dynamic tracers, because they 
can be moved into scripts if they cause problems. But static annotations 
for static tracers are much, much harder to remove. Please go on and 
read my "tracepoint maintainance models" email:

 Message-ID: <20060917143623.GB15534@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 19:23                                                                                 ` Ingo Molnar
@ 2006-09-17 19:45                                                                                   ` Roman Zippel
  2006-09-17 20:56                                                                                     ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 19:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> > > [...] I think Ingo said that some "static tracepoints" (eg. 
> > > annotation) could be acceptable.
> > 
> > No, he made it rather clear, that as far as possible he only wants 
> > dynamic annotations (e.g. via function attributes).
> 
> what you say is totally and utterly nonsensical misrepresentation of 
> what i have said. I always said: i support in-source annotations too (I 
> even suggested APIs how to do them),

Some consistency would certainly help:
'my suggested API is not "barely usable" for static tracers but "totally 
unusable".'

<20060916082214.GD6317@elte.hu>

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 19:45                                                                                   ` Roman Zippel
@ 2006-09-17 20:56                                                                                     ` Ingo Molnar
  2006-09-17 21:36                                                                                       ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 20:56 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Sun, 17 Sep 2006, Ingo Molnar wrote:
> 
> > > > [...] I think Ingo said that some "static tracepoints" (eg. 
> > > > annotation) could be acceptable.
> > > 
> > > No, he made it rather clear, that as far as possible he only wants 
> > > dynamic annotations (e.g. via function attributes).
> > 
> > what you say is totally and utterly nonsensical misrepresentation of 
> > what i have said. I always said: i support in-source annotations too (I 
> > even suggested APIs how to do them),
> 
> Some consistency would certainly help: 'my suggested API is not 
> "barely usable" for static tracers but "totally unusable".'

I am really sorry that you were able to misunderstand and misrepresent 
such a simple sentence. Let me quote the full paragraph of what i said:

> you raise a new point again (without conceding or disputing the point 
> we were discussing, which point you snipped from your reply) but i'm 
> happy to reply to this new point too: my suggested API is not "barely 
> usable" for static tracers but "totally unusable". Did i tell you yet 
> that i disagree with the addition of markups for static tracers?

this makes it clear that i disagree with adding static markups for 
static tracers - but i of course still agree with static markups for 
_dynamic tracers_. The markups would be totally unusable for static 
tracers because there is no guarantee for the existence of static 
markups _everywhere_: the static markups would come and go, as per the 
"tracepoint maintainance model". Do you understand that or should i 
explain it in more detail?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 20:56                                                                                     ` Ingo Molnar
@ 2006-09-17 21:36                                                                                       ` Roman Zippel
  2006-09-17 22:13                                                                                         ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 21:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Sun, 17 Sep 2006, Ingo Molnar wrote:

> > Some consistency would certainly help: 'my suggested API is not 
> > "barely usable" for static tracers but "totally unusable".'
> 
> I am really sorry that you were able to misunderstand and misrepresent 
> such a simple sentence.

Considering the context, which is not exactly full of support for static 
tracers, I think my understanding was and still is quite correct.
Let's take <20060915231419.GA24731@elte.hu>, where you suggest converting 
as many tracepoints as possible to this API, thus excluding a lot of
information from static tracers.

> this makes it clear that i disagree with adding static markups for 
> static tracers - but i of course still agree with static markups for 
> _dynamic tracers_. The markups would be totally unusable for static 
> tracers because there is no guarantee for the existence of static 
> markups _everywhere_: the static markups would come and go, as per the 
> "tracepoint maintainance model". Do you understand that or should i 
> explain it in more detail?

Well, I'd rather just wait for the real patch, where you can show your 
support for all possible users.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 21:36                                                                                       ` Roman Zippel
@ 2006-09-17 22:13                                                                                         ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 22:13 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > I am really sorry that you were able to misunderstand and 
> > misrepresent such a simple sentence.
> 
> Considering the context, which is not exactly full of support for 
> static tracers, I think my understanding was and still is quite 
> correct.

this thought of yours is still false. Nick said:

 ' I think Ingo said that some "static tracepoints" (eg. annotation) 
   could be acceptable. '

to which you replied:

  ' No, he made it rather clear, that as far as possible he only wants 
    dynamic annotations (e.g. via function attributes). '

That "No" word at the beginning of your sentence, by its plain meaning, 
falsely questions Nick's correct interpretation of what I said. I ask 
you to retract or correct this false statement.

Nick is of course correct: i said before that some static markups could 
be acceptable. In fact, i even outlined a possible API for such static 
markups in 20060914231956.GB29229@elte.hu. Would I want to reduce the 
number of such static markups? Of course: not wanting to reduce the 
number of source code lines unrelated to subsystem functionality would 
be foolish.

> > this makes it clear that i disagree with adding static markups for 
> > static tracers - but i of course still agree with static markups for 
> > _dynamic tracers_. The markups would be totally unusable for static 
> > tracers because there is no guarantee for the existence of static 
> > markups _everywhere_: the static markups would come and go, as per 
> > the "tracepoint maintainance model". Do you understand that or 
> > should i explain it in more detail?
> 
> Well, I'd rather just wait for the real patch, where you can show your 
> support for all possible users.

this answer of yours does not rectify the false statement you made.

Your sentence also introduces a new misrepresentation of my intentions: 
my intention with partial static markups (which intention i've written 
to you about before, so it was known to you when you wrote this 
sentence) is not to support "all possible users", but to support 
dynamic tracers. Static tracers cannot use static markups that go away 
into dynamic tracing scripts.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 19:58                                                               ` Roman Zippel
  2006-09-16 22:50                                                                 ` Ingo Molnar
  2006-09-16 23:00                                                                 ` Ingo Molnar
@ 2006-09-16 23:14                                                                 ` Ingo Molnar
  2006-09-17 14:19                                                                   ` Frank Ch. Eigler
       [not found]                                                                   ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
  2 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:14 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> [...] instead of delving further into secondary issues, please let me 
> get back to the primary issues [...]

here's a list of some of those "secondary issues" that we were 
discussing, and which you opted not to "further delve into":

firstly, a factually wrong statement of yours:

> [...] any tracepoints have a maintenance overhead, which is barely 
> different between dynamic and static tracing [...]

secondly, a factually wrong statement of yours:

> [...] at the source level you can remove a static tracepoint as easily 
> as a dynamic tracepoint, [...]

thirdly, a factually wrong statement of yours:

> [...] It would also add virtually no maintenance overhead [...]

[see the previous mails for the full context on these items.]

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16 23:14                                                                 ` Ingo Molnar
@ 2006-09-17 14:19                                                                   ` Frank Ch. Eigler
  2006-09-17 15:31                                                                     ` Ingo Molnar
       [not found]                                                                   ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
  1 sibling, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-17 14:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ingo Molnar <mingo@elte.hu> writes:

> [...]
> firstly, a factually wrong statement of yours:
> 
> > [...] any tracepoints have a maintenance overhead, which is barely 
> > different between dynamic and static tracing [...]

If one totals the fixup effort required across the programmers who
need to do the work, I would concur with the OP; or if there is a
difference, it is in favour of the static markers.  It is unfortunate
that all the talk about maintenance has been almost entirely aloof and
disconnected from empirical examples.  It would be much better if we
were able to sketch out plausible designs for static instrumentation
and similar dynamic probes, and carry out gedanken experiments about
how they would need to adapt to realistic examples of code drift.  It
is not the case that all "maintenance" is alike.

> secondly, a factually wrong statement of yours:
> 
> > [...] at the source level you can remove a static tracepoint as easily 
> > as a dynamic tracepoint, [...]

It is not hard to imagine commenting out a single line; nor inserting
the equivalent of "#define NDEBUG" at the head of the .c file to
disable them all for the whole compilation unit.  The retort that
"this would break the entire tracing system" does not hold water
without far more argument.  Missing events do not necessarily a
totally broken system make.  (Renamed or changed events may even be
mapped back via a translation layer.)  Tracing events need not become
as firmly fixed (unremovable or unchangeable) a user interface as the
syscalls.
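
To make the NDEBUG analogy concrete, here is a minimal userspace sketch of
a marker macro that one compilation unit can switch off by itself.  The
names TRACE_DISABLE and trace_mark() are hypothetical and exist only for
this illustration; they are not an existing kernel, LTTng or SystemTap
interface, and the counter is there just so the effect can be observed.

```c
#include <stdio.h>

/* Hypothetical names throughout: TRACE_DISABLE and trace_mark() are
 * illustrative only, not an existing kernel or LTTng API. */

static unsigned long trace_hits;	/* counts markers that actually fired */

/* A file that wants its markers compiled out would put
 * "#define TRACE_DISABLE" above this point, much as NDEBUG
 * silences assert(). */
#ifdef TRACE_DISABLE
#define trace_mark(name, value) do { } while (0)
#else
#define trace_mark(name, value) \
	do { \
		trace_hits++; \
		fprintf(stderr, "trace: %s=%ld\n", (name), (long)(value)); \
	} while (0)
#endif

/* Example instrumented function: one marker on entry, one on exit. */
static int submit_request(int sector)
{
	int result;

	trace_mark("submit_enter", sector);
	result = sector * 2;	/* stand-in for real work */
	trace_mark("submit_exit", result);
	return result;
}
```

Built as shown, each call to submit_request() fires two markers.  Adding
"#define TRACE_DISABLE" at the head of the file turns both into empty
statements without touching the function body, which is the per-file
disable described above.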

> thirdly, a factually wrong statement of yours:
>
> > [...] It would also add virtually no maintenance overhead [...]

Yes, the knife cuts both ways: both cost ongoing effort.  The question
is how much; who would do the work; who is better able to do the work;
who (users/developers) receives value from the work.  The overall
cost/benefit calculation is far more complicated than pithy lines
about "no maintenance" or its opposite.

As for the possibilities of kprobes performance improvements: bring
them on, they're great.  It is however almost certain that, because
of reasons like debugging-information imperfection or absence, compiler
optimizations, and different deployment scenarios, some un-probeable
blind spots would remain in a kprobes-only probing system.

As for Karim's proposed comment-based markers, I don't have a strong
opinion, not being one whose kernel-side code would be marked up one
way or the other.  My intuition suggests that, if the runtime costs of
a dormant static marker are low enough, they should be just compiled
in by default.  And if they are compiled in, then by golly, compile
them in honestly and don't hide them.  Something like build-time
multilibbing seems like too much effort to trade one eyesore for a
different eyesore.  But that's just my opinion, I could be wrong.

- FChE

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 14:19                                                                   ` Frank Ch. Eigler
@ 2006-09-17 15:31                                                                     ` Ingo Molnar
  2006-09-17 17:15                                                                       ` Mathieu Desnoyers
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:31 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Frank Ch. Eigler <fche@redhat.com> wrote:

> As for Karim's proposed comment-based markers, I don't have a strong 
> opinion, not being one whose kernel-side code would be marked up one 
> way or the other. [...]

What makes the difference isnt just the format of markup (although i 
fully agree that the least visually intrusive markup format should be 
used for static markers, and the range of possibilities includes 
comment-based markers too), but what makes the difference is:

 the /guarantee/ of a full (comprehensive) set to /static tracers/

The moment we allow a static tracer into the upstream kernel, we make 
that guarantee, implicitly and explicitly. (I've expanded on this line 
of argument in the previous few mails, extensively.)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-17 15:31                                                                     ` Ingo Molnar
@ 2006-09-17 17:15                                                                       ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-17 17:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frank Ch. Eigler, Roman Zippel, Thomas Gleixner, karim,
	Andrew Morton, Paul Mundt, Jes Sorensen, linux-kernel,
	Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Frank Ch. Eigler <fche@redhat.com> wrote:
> 
> > As for Karim's proposed comment-based markers, I don't have a strong 
> > opinion, not being one whose kernel-side code would be marked up one 
> > way or the other. [...]
> 
> What makes the difference isnt just the format of markup (although i 
> fully agree that the least visually intrusive markup format should be 
> used for static markers, and the range of possibilities includes 
> comment-based markers too), but what makes the difference is:
> 
>  the /guarantee/ of a full (comprehensive) set to /static tracers/
> 
> The moment we allow a static tracer into the upstream kernel, we make 
> that guarantee, implicitly and explicitly. (I've expanded on this line 
> of argument in the previous few mails, extensively.)
> 

Ingo, your definition of a static tracer seems to be slightly off from LTTng's
reality in two ways :

First, the kernel tracer supports dynamically loadable "event types", which
makes it quite more flexible than a static tracer that would have to guarantee
a full set of trace points. There is a clear difference between statically
adding instrumentation and statically adding new event types in that forcing a
static set of events would indeed break the user space tools when an event is
added or removed.

Second, the user space analysis tools are built so that they can handle missing
information. So, if they lack things like scheduler change or irq entry/exit
events, they will still show the available information. No "breakage" would
result from a missing probe. Moreover, since the LTTV trace analysis tool is
modular and plugin-based, developers can choose whether to load a given analysis
based on the instrumentation present in the traced kernel.

So there is no guarantee of any full instrumentation set : both instrumentation
and analysis tools are extensible by the users when needed.
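
A rough sketch of what "handling missing information" could look like on
the analysis side.  Everything here is an assumption made for
illustration: the struct layouts, count_events() and analysis_available()
are hypothetical and do not reflect the real LTTng trace format or the
LTTV plugin interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event record; not LTTng's real trace format. */
struct trace_event {
	const char *type;	/* e.g. "sched_switch", "irq_entry" */
	long timestamp;
};

/* Count events of one type; zero simply means "not instrumented". */
static size_t count_events(const struct trace_event *ev, size_t n,
			   const char *type)
{
	size_t hits = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(ev[i].type, type) == 0)
			hits++;
	return hits;
}

/* An analysis pass declares which event type it depends on. */
struct analysis {
	const char *name;
	const char *required_type;
};

/* Return 1 if the pass can run on this trace, 0 if it should be skipped.
 * A missing event type disables one pass, not the whole tool. */
static int analysis_available(const struct analysis *a,
			      const struct trace_event *ev, size_t n)
{
	return count_events(ev, n, a->required_type) > 0;
}
```

With a trace that contains only irq events, a pass that needs
"irq_entry" would run while one that needs "sched_switch" would be
skipped, and the remaining data would still be shown.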

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

[parent not found: <y0mu036eglz.fsf@ton.toronto.redhat.com>]

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
       [not found]                                                                   ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
@ 2006-09-17 15:00                                                                     ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:00 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
	Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Frank Ch. Eigler <fche@redhat.com> wrote:

> [...]  It would be much better if we were able to sketch out plausible 
> designs for static instrumentation and similar dynamic probes, and 
> carry out gedanken experiments about how they would need to adapt to 
> realistic examples of code drift.  It is not the case that all 
> "maintenance" is alike.

see my previous mail - hopefully that explains my position even more clearly.

A number of people have expressed doubts about the all-static model (i'm 
amongst them) - and that's all based on actual experience. So there's no 
need for Gedanken-experiments, because we've got real-life experiments
:-) A number of people also have expressed that they think an all-static
markup model is the right one - and that's based on experience as well.

Just looking at the opinions objectively and excluding my opinion i'd 
say that the most likely model will thus be a _hybrid_ one: some 
markups will be static, some will be dynamic.

Whether a tracepoint will be static or dynamic will depend on the 'flux 
of changes' in the tracing code and of the code they trace. If tracing 
code has a high flux, or the traced code has a high flux, then the 
lowest maintenance overhead is to have a dynamic tracepoint. If _both_ 
the tracing code and the traced code have low flux of changes, then the 
lowest maintenance overhead will be a static markup.

Put differently: dynamic markups will turn into static markups if the 
code that they handle "cools down". Static markups will turn into 
dynamic markups if the code where they reside in gets "too hot" or if 
the markups themselves are "too hot".
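
The rule of thumb above can be written down as a tiny decision function.
This is only a sketch of the argument as stated in this mail: the enum,
the function name and the reduction of "flux" to a hot/cold boolean are
all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical names; "hot" stands for a high flux of changes. */
enum markup_kind { MARKUP_STATIC, MARKUP_DYNAMIC };

/* If either the tracing code or the traced code changes a lot, keep the
 * tracepoint dynamic; only when both have cooled down does a static
 * markup become the lower-maintenance choice. */
static enum markup_kind pick_markup(bool tracing_code_hot,
				    bool traced_code_hot)
{
	if (tracing_code_hot || traced_code_hot)
		return MARKUP_DYNAMIC;
	return MARKUP_STATIC;
}
```

Re-evaluating the function as the flux changes gives the migration in
both directions: a markup moves to static once both inputs go cold, and
back to dynamic as soon as either one heats up.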

But one thing is sure: with a static tracer model accepted into the 
kernel we are forced to have a comprehensive, always-maintained, full 
set of static markups in the tree, for a long time. Dynamic tracers will 
still be around, but we wont be able to fully benefit from the more 
flexible tracepoint maintenance models they allow, because we'll always 
have to carry around the static markups, for the sake of static tracers. 
There will likely be periodic friction about how many static markups 
there should be in the source: subsystem maintainers will want them out, 
static-trace-users will want them in. If a crucial static markup is 
removed or damaged then the kernel will regress materially.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
                                                                               ` (3 preceding siblings ...)
  2006-09-16  8:22                                                             ` Ingo Molnar
@ 2006-09-16  8:23                                                             ` Ingo Molnar
  2006-09-16  8:23                                                             ` Ingo Molnar
  2006-09-16  8:23                                                             ` Ingo Molnar
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > > > >  - a marker for dynamic tracing has lower performance impact
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > >    than a static tracepoint, on systems that are not being
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > >    traced. (but which have the tracing infrastructure enabled
               ^^^^^^
> > > > > >    otherwise)
> > > > >
> > > > > Anyone using static tracing intends to use it, which makes this point
> > > > > moot.
> > > >
> > > > that's not at all true, on multiple grounds:
> > > >
> > > > Firstly, many people use distro kernels. A Linux distribution
> > > > typically wants to offer as few kernel rpms as possible (one per
> > > > arch to be precise), but it also wants to offer as many features
> > > > as possible. So if there was a static tracer in there, a distro
> > > > would enable it - but 99.9% of the users would never use it - still
> > > > they would see the overhead. Hence the user would have it enabled,
> > > > but does not intend to use it - which contradicts your statement.
> > >
> > > So if dynamic tracing is available use it, as distributions 
> > > already do. OTOH the barrier to use static tracing is drastically 
> > > different whether the user has to deal with external patches or 
> > > whether it's a simple kernel option. Again, static tracing doesn't 
> > > exclude the possibility of dynamic tracing, that's something you 
> > > constantly omit and thus make it sound like both options were 
> > > mutually exclusive.
> > 
> > how does this reply to my point that: "a marker for dynamic tracing has 
> > lower performance impact than a static tracepoint, on systems that are 
> > not being traced", which point you claimed moot?
> 
> Because it's pretty much an implementation issue. [...]

No, that's my point, it's not an "implementation issue" of static 
tracers, the overhead of markups for static tracers is:

   _inherent to their concept of being compile-time and static_

ok?

> [...] The point is about adding markers at all, it's about the choice 
> being able to use static tracers in the first place. [...]

your characterization of "the point" is at odds with the specific point 
that we are discussing - see the underlined sentence above, right at the 
top of the quotes:

> > > > > >  - a marker for dynamic tracing has lower performance impact
> > > > > >    than a static tracepoint, on systems that are not being
> > > > > >    traced. (but which have the tracing infrastructure enabled

Please either concede the point or dispute it, before shifting to new 
grounds. Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
                                                                               ` (4 preceding siblings ...)
  2006-09-16  8:23                                                             ` Ingo Molnar
@ 2006-09-16  8:23                                                             ` Ingo Molnar
  2006-09-16  8:23                                                             ` Ingo Molnar
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > > Secondly, even people who intend to _eventually_ make use of 
> > > > tracing, dont use it most of the time. So why should they have 
> > > > more overhead when they are not tracing? Again: the point is not 
> > > > moot because even though the user intends to use tracing, he 
> > > > does not always want to trace.
> > > 
> > > I've used kernels which included static tracing and the performance 
> > > overhead is negligible for occasional use.
> > 
> > how does this suddenly make my point, that "a marker for dynamic 
> > tracing has lower performance impact than a static tracepoint, on 
> > systems that are not being traced", "moot"?
> 
> Why exactly is the point relevant in the first place? How exactly is the 
> added (minor!) overhead such a fundamental problem?

how could a fundamental performance difference between two markup 
schemes not be relevant to kernel design decisions? The performance 
difference i claim derives straight from the conceptual difference 
between the two approaches and is thus "unfixable" (and not an 
"implementational issue").

	Ingo

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-16  0:31                                                           ` Roman Zippel
                                                                               ` (5 preceding siblings ...)
  2006-09-16  8:23                                                             ` Ingo Molnar
@ 2006-09-16  8:23                                                             ` Ingo Molnar
  6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16  8:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > Why don't you leave the choice to the users? Why do you constantly 
> > > make it an exclusive choice? [...]
> > 
> > as i outlined it tons of times before: once we add markups for static 
> > tracers, we cannot remove them. That is a constant kernel maintainance 
> > drag that i feel uncomfortable about.
> 
> As many, many people have already said, any tracepoints have an 
> maintainance overhead, which is barely different between dynamic and 
> static tracing and only increases the further away the tracepoints are 
> from the source.

i have demonstrated that with dynamic tracers it's possible to have: 
"half the number of tracepoints" or "no tracepoints at all", right in 
the traced kernel source. That way we are able to shift away the 
maintenance overhead from the subsystem which is being traced to the 
person who _wants_ to do the tracing (instead of on the person who 
maintains the code that is being traced), in a fine-grained way.

But even the secondary metric, the "sum of all maintenance, including 
the maintenance of tracepoints" can become lower with dynamic tracers: 
if a subsystem changes with a much higher frequency than the tracing 
scripts follow.

Let me try to explain it to you in other words: if all tracing is done 
via scripts, with no in-source tracepoints at all, then we could for 
example update the tracing scripts only once per release. A subsystem 
might undergo a heavy cycle of updates, changing functions that are 
traced many times: i call this a "high frequency update to the source 
code".

If tracing is done via tracepoints for static tracers, then such "high 
frequency updates to the source code" have to "carry with them" all the 
markups. It might be zero overhead if a subsystem has no tracepoints, 
but it might be a lot more complex too.

For example, I can tell you that the -rt tree has a number of very 
useful scheduling tracepoints which are also a constant maintenance 
hindrance. For example i even have a separate _function_ that is a 
helper to one of the tracepoints. And this was the _bare minimum_ of 
static tracepoints i needed for the purposes of visualizing and 
analyzing scheduling patterns in the -rt tree (either on my boxes or on 
users' boxes). Occasionally users needed a lot more tracepoints. So i am 
talking from first-hand experience. This maintenance overhead occurred 
(and still occurs) to /me/, so please don't try to tell me that the 
maintenance overhead is minimal. Even "half the tracepoints" would be 
great. And i only have a dozen tracepoints, not hundreds!

	Ingo

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:23                                             ` Thomas Gleixner
  2006-09-15 20:40                                               ` Roman Zippel
@ 2006-09-15 21:05                                               ` Karim Yaghmour
  2006-09-15 21:17                                                 ` Thomas Gleixner
  1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:05 UTC (permalink / raw)
  To: tglx
  Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Thomas Gleixner wrote:
> Stop whining!

I resent that. If your efforts in working on popular kernel topics
met rapid reward then I'm happy for you. The fact that others tackle
unpopular topics and persist despite constant personal attacks should
nevertheless be recognized for what it is.

> LTT did not manage to solve the problem in a generic,

You're entirely correct. I never claimed it to be perfect, that's why I
had approached others early on to try to bridge things together and
that's why I used to post ltt patches to the lkml.

> mainline acceptable way. If you really believe that Kprobes / Systemtap
> is just a $corporate maliciousness to kick you out of business, then I
> really start to doubt your sanity.

If that's how it was read, then it wasn't written right. ltt was never
really a profit center for me, embedded Linux training was -- you
wouldn't believe how much more profitable training is than pure
consulting. But my own business is just beside the point. My point
was that the high barrier to entry for tracing fragmented efforts
around it. As for corporate decisions which culminated from such
resistance, they probably were the sanest decision to take at the
time. Heck if I was a manager at any of those companies I would have
likely taken the same decision. It was, and still is, though,
counterproductive. Fully justifiable, but counterproductive.

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:05                                               ` Karim Yaghmour
@ 2006-09-15 21:17                                                 ` Thomas Gleixner
  2006-09-15 21:31                                                   ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 21:17 UTC (permalink / raw)
  To: karim
  Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, 2006-09-15 at 17:05 -0400, Karim Yaghmour wrote:
> Thomas Gleixner wrote:
> > Stop whining!
>
> I resent that. 

See last sentence of this mail.

> If your efforts in working on popular kernel topics
> met rapid reward then I'm happy for you. The fact that others tackle
> unpopular topics and persist despite constant personal attacks should
> nevertheless be recognized for what it is.

Oh well. I'm working on unpopular and intrusive stuff as long as you do.
Just our ways to work and communicate differ slightly.

> > mainline acceptable way. If you really believe that Kprobes / Systemtap
> > is just a $corporate maliciousness to kick you out of business, then I
> > really start to doubt your sanity.
> 
> If that's how it was read, then it wasn't written right

Ouch. Can you please tell me what's the technical merit of this
paragraph:

"                               ... The only reasons
there are separate project teams is because managers in key
positions made the decision that they'd rather break from existing
projects which had had little success mainlining and instead use
their corporate bodyweight to pressure/seduce kernel developers
working for them into pushing their new great which-aboslutely-
has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree
with you kernel developers that this is crap, this is why we're
developing this new amazing thing). That's the truth plain and
simple."

Sorry, I have not found a way to interpret it usefully.

	tglx

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:17                                                 ` Thomas Gleixner
@ 2006-09-15 21:31                                                   ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:31 UTC (permalink / raw)
  To: tglx
  Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais


Thomas Gleixner wrote:
> Oh well. I'm working on unpopular and intrusive stuff as long as you do.

Well, I won't debate that shall I :)

> Just our ways to work and communicate differ slightly.

Maybe so. Any wisdom would be greatly appreciated.

> Sorry, I have not found a way to interpret it usefully.

See my response to Ingo on this topic.

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                                       ` Andrew Morton
  2006-09-15 18:19                                         ` Ingo Molnar
  2006-09-15 19:35                                         ` Thomas Gleixner
@ 2006-09-15 20:00                                         ` Mathieu Desnoyers
  2006-09-15 20:27                                           ` Jose R. Santos
  2006-09-15 20:37                                         ` Alan Cox
  3 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Andrew Morton (akpm@osdl.org) wrote:
> Of course, it they are properly designed, the one set of tracepoints could
> be used by different tracing backends - that allows us to separate the
> concepts of "tracepoints" and "tracing backends".

If I try to develop your idea a little further, we could think of dividing the
tracing problem into four layers:

- tracepoints (where the code is instrumented)
  - identifying code
  - accessing data surrounding the code
- tracing backend (how to add the tracepoints)
- tracing infrastructure (what code will serialize the information)
- data extraction (getting the data out to disk, network, ...)

I think that, if we agree on this segmentation of the problem, this thread is
generally debating on the tracing backends and their respective limitations.
I just want to point out that the patch I have submitted addresses mainly the
"tracing infrastructure" and "data extraction" topics.
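
As a rough illustration of this layering, here is a minimal user-space C
sketch. None of the names below (TRACE_POINT, trace_backend,
trace_serialize, trace_extract, EV_SCHED_SWITCH) come from the LTTng patch;
they are assumed purely for the example, and the "backend" is reduced to a
function pointer that stays NULL while tracing is off.

```c
#include <stdint.h>
#include <stdio.h>

/* Layer 3, tracing infrastructure: serializes events into a buffer. */
struct trace_event {
	uint16_t id;
	uint32_t payload;
};

#define TRACE_BUF_EVENTS 8
static struct trace_event trace_buf[TRACE_BUF_EVENTS];
static size_t trace_count;

static void trace_serialize(uint16_t id, uint32_t payload)
{
	if (trace_count >= TRACE_BUF_EVENTS)
		return;		/* buffer full: drop the event */
	trace_buf[trace_count].id = id;
	trace_buf[trace_count].payload = payload;
	trace_count++;
}

/* Layer 2, tracing backend: how a tracepoint gets connected.
 * Assumption: a function pointer that is NULL while tracing is off. */
typedef void (*trace_backend_fn)(uint16_t id, uint32_t payload);
static trace_backend_fn trace_backend;

/* Layer 1, tracepoint: marks the code location and the data to record. */
#define TRACE_POINT(id, value)				\
	do {						\
		if (trace_backend)			\
			trace_backend((id), (value));	\
	} while (0)

enum { EV_SCHED_SWITCH = 1 };	/* hypothetical event id */

/* Stand-in for an instrumented kernel function. */
static void fake_schedule(uint32_t next_pid)
{
	TRACE_POINT(EV_SCHED_SWITCH, next_pid);
}

/* Layer 4, data extraction: drains the buffer to a stream. */
static size_t trace_extract(FILE *out)
{
	size_t i, n = trace_count;

	for (i = 0; i < n; i++)
		fprintf(out, "event %u payload %u\n",
			(unsigned)trace_buf[i].id,
			(unsigned)trace_buf[i].payload);
	trace_count = 0;
	return n;
}
```

Setting trace_backend = trace_serialize turns tracing on; until then
fake_schedule() only pays for the NULL check. Swapping in a different
backend or a different extraction path would not touch the tracepoint
itself, which is the separation being discussed.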

Regards,

Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:00                                         ` Mathieu Desnoyers
@ 2006-09-15 20:27                                           ` Jose R. Santos
  0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 20:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Roman Zippel, Ingo Molnar, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Mathieu Desnoyers wrote:
> * Andrew Morton (akpm@osdl.org) wrote:
> > Of course, it they are properly designed, the one set of tracepoints could
> > be used by different tracing backends - that allows us to separate the
> > concepts of "tracepoints" and "tracing backends".
>
> If I try to develop your idea a little further, we could this of dividing the
> tracing problem into four layers :
>
> - tracepoints (where the code is instrumented)
>   - identifying code
>   - accessing data surrounding the code
> - tracing backend (how to add the tracepoints)
> - tracing infrastructure (what code will serialize the information)
> - data extraction (getting the data out to disk, network, ...)
>   

I think you are missing user-space post-processing, which should also be 
considered part of the problem since the capabilities of post-processing 
will be limited by the "tracepoints" available.  Tracepoints and 
post-processing are also the problems which need to be addressed first 
between the other established tracing projects before going forward with 
in-kernel solutions.

> I think that, if we agree on this segmentation of the problem, this thread is
> generally debating on the tracing backends and their respective limitations.
> I just want to point out that the patch I have submitted adresses mainly the
> "tracing infrastructure" and "data extraction" topics.
>   

This seems like a good way to dissect the problem, since it seems like 
other important issues relevant to general tracing are being ignored 
simply because of a dislike of the way LTTng has chosen to implement tracing.

-JRS

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                                       ` Andrew Morton
                                                           ` (2 preceding siblings ...)
  2006-09-15 20:00                                         ` Mathieu Desnoyers
@ 2006-09-15 20:37                                         ` Alan Cox
  2006-09-15 20:26                                           ` Mathieu Desnoyers
                                                             ` (2 more replies)
  3 siblings, 3 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 20:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 11:16 -0700, ysgrifennodd Andrew Morton:
> What Karim is sharing with us here (yet again) is the real in-field
> experience of real users (ie: not kernel developers).

A lot of us have plenty of experience helping customers and end users
trace bugs. That's a good part of why we get paid in the first place.

> What I _am_ concerned about with this patchset is all the infrastructural
> goop which backs up those tracepoints.  I'd have thought that a better
> approach would be to make those explicit tracepoints be "helpers" for the
> existing kprobe code.

If you put explicit tracepoints in they will be compiled out for end
users. If you have a script which hits the standard tracepoint set it'll
be usable by end users.

> Of course, it they are properly designed, the one set of tracepoints could
> be used by different tracing backends - that allows us to separate the
> concepts of "tracepoints" and "tracing backends".

There are more than two layers. The first question is "how do I trace
event XYZ" which seems to be the big debate. The second is "how do I
find XYZ" which seems to have some commonality. The third is "what do I
do when the event is hit", which kprobes provides to all the existing
consumers such as systemtap and can field into arrays for graph plotting
and the like.

Ignoring the question of static compiled in trace points kprobes appears
to have solved the problem space. Everyone else can use the kprobes
interfaces to do pretty much anything computationally viable.

I am sceptical about static tracepoints in critical spots because if
they make the variable easy to access they will reduce optimisations and
that will cost a lot more than 5 or 6 clocks.

In addition, ideally we want a mechanism general enough that printk can
be mapped onto it, so that you can pull all the printk text strings
_out_ of the kernel and into the debug traces for embedded work.

[ie you want printk("Oh dear %s exploded.\n", foo->bar); to end up with
"Oh dear %s exploded.\n" out of kernel and in kernel

		tracepoint_printk(foo->bar);

maybe with minimal type info (although that can be pulled at debug time
from the string spat into the debug data).]
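
A minimal user-space C sketch of that split, under stated assumptions: the
format strings live in a table that only the out-of-kernel decoder sees,
and the in-kernel side records just a call-site index plus the argument.
The extra site parameter to tracepoint_printk(), the record layout and
decode_record() are all hypothetical; in practice the index would have to
be generated by build tooling rather than written by hand.

```c
#include <stdio.h>
#include <string.h>

/* Out-of-kernel side: the format strings, indexed by call site. */
static const char *const fmt_table[] = {
	"Oh dear %s exploded.\n",
};
#define NR_FORMATS (sizeof(fmt_table) / sizeof(fmt_table[0]))

/* In-kernel side: a record carries only the site index and the argument. */
struct printk_record {
	unsigned int site;
	char arg[32];
};

#define RING_SIZE 16
static struct printk_record ring[RING_SIZE];
static unsigned int ring_len;

static void tracepoint_printk(unsigned int site, const char *arg)
{
	struct printk_record *r;

	if (ring_len >= RING_SIZE)
		return;		/* ring full: drop the record */
	r = &ring[ring_len++];
	r->site = site;
	strncpy(r->arg, arg, sizeof(r->arg) - 1);
	r->arg[sizeof(r->arg) - 1] = '\0';
}

/* Decoder: rebuilds the message text outside the kernel.
 * Returns the snprintf() result, or -1 for an unknown site index. */
static int decode_record(const struct printk_record *r,
			 char *out, size_t outlen)
{
	if (r->site >= NR_FORMATS)
		return -1;
	return snprintf(out, outlen, fmt_table[r->site], r->arg);
}
```

The string argument is copied at record time here because the decoder runs
later; a real implementation would need some rule for which argument types
can be captured this way.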

Alan

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:37                                         ` Alan Cox
@ 2006-09-15 20:26                                           ` Mathieu Desnoyers
  2006-09-15 20:51                                           ` Karim Yaghmour
  2006-09-17 17:53                                           ` Mathieu Desnoyers
  2 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:26 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Roman Zippel, Ingo Molnar, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> In addition ideally we want a mechanism that is also sufficient that
> printk can be mangled into so that you can pull all the printk text
> strings _out_ of the kernel and into the debug traces for embedded work.
> 
> [ie you want printk("Oh dear %s exploded.\n", foo->bar); to end up with
> "Oh dear %s exploded.\n" out of kernel and in kernel
> 
> 		tracepoint_printk(foo->bar);
> 

Good idea, trivial to implement on top of LTTng. When seeing printk's reentrancy
limitations, I have thought about doing it a couple of times.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:37                                         ` Alan Cox
  2006-09-15 20:26                                           ` Mathieu Desnoyers
@ 2006-09-15 20:51                                           ` Karim Yaghmour
  2006-09-17 17:53                                           ` Mathieu Desnoyers
  2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 20:51 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, tglx, Paul Mundt, Jes Sorensen, Roman Zippel,
	Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Alan Cox wrote:
> A lot of us have plenty of experience helping customers and end users
> trace bugs. Thats a good part of why we get paid in the first place.

But of course, and I wouldn't dare compare my experience with yours.

FWIW, though, I submit to you that there is a difference between
helping a customer trace something and actually attempting to create
a tool which standard users can use to trace their own stuff.

Then, again, my experience may just be lacking.

Here's an example just for the fun of it: I was giving a class at
a customer's site. It so happened they scheduled this class right
after product delivery (advice: this is a mistake.) And, predictably,
in came the technician asking for Joe, out went Joe, in came Joe,
repeat. They spent quite some time after hours trying to figure
this one out. Midweek, they asked if I could help, they were
having some odd behavior in user-space on a custom-developed board.
Try as I may, none of the standard user-space stuff was effective.
Ok, time to try ltt. Now this was a "vendor" kernel, with
preemption (ok, I'm not telling who, but this was definitely
before Ingo's work) -- the sort of which I hadn't dabbled in
before. I spent the evening trying to figure out how the heck the
thing worked to no avail -- the locking mechanisms were just
wrong for what ltt needed at the time. Last day I asked him if
they could get a *normal* kernel on there and someone somewhere
found an odd-port stable enough to run. So got an ltt patch,
customized it for said kernel (would have had to do something
similar if it were probe points instead of static traces), got a
trace, and within 5 minutes we had found a bug in their custom
hardware (and no, their drivers were just fine). This customer
would not have even needed me or needed to waste their time if he
had been able to get a trace for his bastardized kernel. But
the way the anti-static-instrumentation creed goes this
customer would still have needed me ... or someone else ...
<conspiracy> wait a minute, maybe that's not a coincidence ...
</conspiracy> ;)

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:37                                         ` Alan Cox
  2006-09-15 20:26                                           ` Mathieu Desnoyers
  2006-09-15 20:51                                           ` Karim Yaghmour
@ 2006-09-17 17:53                                           ` Mathieu Desnoyers
  2 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-17 17:53 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
	Roman Zippel, Ingo Molnar, linux-kernel, Christoph Hellwig,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> In addition ideally we want a mechanism that is also sufficient that
> printk can be mangled into so that you can pull all the printk text
> strings _out_ of the kernel and into the debug traces for embedded work.

Hi,

I just implemented a printk instrumentation that logs the printks into LTTng
traces ASAP in order to keep the time causality correct. It can be found in
LTTng 0.5.112.

Regards,

Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:51                                   ` Karim Yaghmour
  2006-09-15 15:00                                     ` Thomas Gleixner
@ 2006-09-15 15:24                                     ` Alan Cox
  2006-09-15 15:23                                       ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 15:24 UTC (permalink / raw)
  To: karim
  Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 10:51 -0400, ysgrifennodd Karim Yaghmour:
> The static tracepoints we maintained were *the* solution for a great

I think you mean "a" solution. You've not proved there are no others.

> deal many people. As a maintainer I had two choices with those who
> were not content:
> a- Maintain their tracepoints for them -- not happening.
> b- Suggest they contribute to helping getting a generic tracing
>   infrastructure into the kernel and then make their case on the
>   lkml as to the pertinence of their instrumentation.

b has been done, it's called kprobes. We just need better tools for the
dynamic probes.

> choice of tracepoints. Those who were using ltt for its designated
> purpose -- allowing normal users and developers to get an accurate
> view of the behavior of their system -- were very happy with it.

and you can maintain "Karim's probe list" which is the dynamic probe set
which matches your old static probes, only of course it's now much more
flexible.


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 15:24                                     ` Alan Cox
@ 2006-09-15 15:23                                       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Alan Cox wrote:
> b has been done, its called kprobes. We just need better tools for the
> dynamic probes.

As long as an outside piece of something needs to be updated, then "b"
hasn't been done. Especially with regard to what this means for
figuring out which of kernel or instrumentation-script is broken when
you get bug reports on lkml.

> and you can maintain "Karim's probe list" which is the dynamic probe set
> which matches your old static probes, only of course its now much more
> flexible.

Sorry, the issue isn't about my probe list. The issue is that there
needs to be a way of pointing out important events without having to
modify things at 3 or 4 different places. The only way this can be
done is if it's in the tree -- regardless of the mechanism. This
isn't about static tracers vs. dynamic tracers, it's about statically
marking code. What goes underneath is secondary. And if the static
markup -- which even the SystemTap people are interested in -- is
but a hook for further selecting the appropriate instrumentation
mechanism, then that's fine too.
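
To make the "markup as a hook" idea concrete, here is a small user-space C
sketch in which the in-tree part is only a table of named markers and any
instrumentation mechanism attaches to a marker by name. The marker names,
marker_attach(), marker_hit() and counting_backend() are all invented for
this example and are not taken from any patch in this thread.

```c
#include <stddef.h>
#include <string.h>

typedef void (*marker_handler)(const char *name, long arg);

/* The in-tree part: a fixed, named list of important events. */
struct marker {
	const char *name;
	marker_handler handler;	/* NULL while nothing is attached */
};

static struct marker markers[] = {
	{ "sched_switch", NULL },
	{ "irq_entry",    NULL },
};
#define NR_MARKERS (sizeof(markers) / sizeof(markers[0]))

/* Any tracer attaches by name; returns 0 on success, -1 if unknown. */
static int marker_attach(const char *name, marker_handler h)
{
	size_t i;

	for (i = 0; i < NR_MARKERS; i++) {
		if (strcmp(markers[i].name, name) == 0) {
			markers[i].handler = h;
			return 0;
		}
	}
	return -1;
}

/* What the marked code location calls when it is reached. */
static void marker_hit(size_t idx, long arg)
{
	if (idx < NR_MARKERS && markers[idx].handler)
		markers[idx].handler(markers[idx].name, arg);
}

/* One possible backend: just counts hits and keeps the last argument. */
static unsigned int hits;
static long last_arg;

static void counting_backend(const char *name, long arg)
{
	(void)name;
	hits++;
	last_arg = arg;
}
```

The marked code only knows the marker index; whether the attached handler
logs to a buffer, feeds a script, or does nothing is decided elsewhere.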

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:31                               ` Karim Yaghmour
  2006-09-15 14:28                                 ` Paul Mundt
@ 2006-09-15 14:39                                 ` Jes Sorensen
  2006-09-15 15:04                                   ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:39 UTC (permalink / raw)
  To: karim
  Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
>> Because other people have tried to use LTT for additional projects,
>> but said projects haven't been integrated into LTT. In other words,
>> just because *you* haven't added those, doesn't mean someone else
>> won't try and do it later, if LTT was integrated.
> 
> Thank you. I will take it as a complement and likely laminate this
> email for your suggestion that I've acted responsibly in my
> maintenance of ltt. Boy, can you imagine what this debate would
> have looked like if I had included precisely those additional
> projects ...

Karim,

Thank you for this, it just proves that taking this discussion any
further is a waste of everybody's time.

> C'mon Jes, if I was able to responsibly maintain ltt over 5
> years *out* of the tree and I'm being labeled as incompetent all
> over this thread, then imagine what the very competent people
> maintaining the kernel could actually do.

Nobody ever said you were irresponsible, but you are claiming that you
are able to define a finite set of static tracepoints that are relevant
to everybody. Or in other words, they are defined as being the ones
relevant to you.

Please read Paul Mundt's response to your email - it's bang on, couldn't
put it any better myself.

Jes


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:39                                 ` Jes Sorensen
@ 2006-09-15 15:04                                   ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:04 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Jes Sorensen wrote:
> Thank you for this, it just proves that taking this discussion any
> further is a waste of everybody's time.

Sorry you feel this way.

> Nobody ever said you were irresponsible, but you are claiming that you
> are able to define a finite set of static tracepoints that are relevant
> to everybody. Or in other words, they are defined as being the ones
> relevant to you.

No, I'm precisely not claiming that the tracepoints I was looking for
were "relevant to everybody". They are, however, very relevant to any
standard sysadmin or developer who wants to get a better picture of
what his kernel is doing. Again, please refer to figure 2 of this
article and explain to me why it's not relevant for standard users
and developers to understand when these events happen inside the
kernel:
http://www.usenix.org/events/usenix2000/general/full_papers/yaghmour/yaghmour_html/index.html

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:15             ` Ingo Molnar
  2006-09-14 18:35               ` Mathieu Desnoyers
  2006-09-14 18:54               ` Karim Yaghmour
@ 2006-09-14 19:40               ` Tim Bird
  2006-09-14 20:00                 ` Ingo Molnar
  2006-09-15 11:40                 ` Alan Cox
  2006-09-14 19:47               ` Roman Zippel
  3 siblings, 2 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-14 19:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> * Roman Zippel <zippel@linux-m68k.org> wrote:
> 
>>> for me these are all _independent_ grounds for rejection, as a generic 
>>> kernel infrastructure.
>> Tracepoints of course need to be managed, but that's true for both 
>> dynamic and static tracepoints. [...]
> 
> that's not true, and this is the important thing that i believe you are 
> missing. A dynamic tracepoint is _detached_ from the normal source code 
> and thus is zero maintainance overhead. You dont have to maintain it
> during normal development - only if you need it. You dont see the
> dynamic tracepoints in the source code.

It's only zero maintenance overhead for you.  Someone has to
maintain it. The party line for years has been that in-tree
maintenance is easier than out-of-tree maintenance.

> 
> a static tracepoint, once it's in the mainline kernel, is a nonzero 
> maintainance overhead _until eternity_. It is a constant visual 
> hindrance and a constant build-correctness and boot-correctness problem 
> if you happen to change the code that is being traced by a static 
> tracepoint. Again, I am talking out of actual experience with static 
> tracepoints: i frequently break my kernel via static tracepoints and i 
> have constant maintainance cost from them. So what i do is that i try to 
> minimize the number of static tracepoints to _zero_. I.e. i only add 
> them when i need them for a given bug.

Ingo - I'm sure you are doing things at a level where static tracepoints
impose a significant perturbation to the code.  However, if you look
historically at the set of static tracepoints that people have used
with Linux (with LTT or LKST), they are really not too bad to maintain.  I'm
repeating what others have said, but I've been working with LTT and
LTTng for several years, and the tracepoints haven't changed very much
in that time.   Heck, I've even brought LTTng up on new kernel versions
and new architectures.  How hard could it be if I can do it? ;-)
(Of course, who knows if I did it right? - since it's out-of-tree it
doesn't get as much testing.)

The set of static tracepoints (or markers) that is envisioned is in the
range of about 30 to 40 key kernel events.  Dynamic tracepoints would
be used for other stuff.

I don't want to offend you, but I suspect your usage model for tracepoints
is different from what the expected (and historical) usage model
would be for LTTng-style static tracepoints.

> 
> static tracepoints are inferior to dynamic tracepoints in almost every 
> way.
>
>> [...]  Both have their advantages and disadvantages and just hammering 
>> on the possible problems of static ones [...]
> 
> how about giving a line by line rebuttal to the very real problems of 
> static tracepoints i listed (twice already), instead of calling them 
> "possible problems"?

I respect your experience, but I think it would be more productive
to have this debate when a patch is submitted with a static tracepoint (or marker)
implementation.  The patch in question, if I understand correctly, provides
infrastructure for tracing activities and should hopefully be useful for
either static or dynamic tracepoints.  I'm hoping someone from the SystemTAP
camp can speak up and give their opinion on whether this is useful.  If it is,
then the whole debate about static vs. dynamic tracepoints is less important.
If not, then that's a different debate.

I maintain Kernel Function Trace (KFT) out-of-tree.  This is a system which
uses compiler flags to instrument every kernel function entry and exit.  For obvious
reasons this type of instrumentation is used only during development, but it has
proven quite handy for certain development tasks (finding long-duration routines and
finding bloated call sequences).   I can imagine KFT using the infrastructure
that is provided by the LTTng-core patch (and relinquishing my own infrastructure
for activation, trace control, event handling etc.)

Regards,
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 19:40               ` Tim Bird
@ 2006-09-14 20:00                 ` Ingo Molnar
  2006-09-14 20:46                   ` Karim Yaghmour
  2006-09-14 21:02                   ` Roman Zippel
  2006-09-15 11:40                 ` Alan Cox
  1 sibling, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 20:00 UTC (permalink / raw)
  To: Tim Bird
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Tim Bird <tim.bird@am.sony.com> wrote:

> > that's not true, and this is the important thing that i believe you 
> > are missing. A dynamic tracepoint is _detached_ from the normal 
> > source code and thus is zero maintenance overhead. You don't have to 
> > maintain it during normal development - only if you need it. You 
> > don't see the dynamic tracepoints in the source code.
> 
> It's only zero maintenance overhead for you.  Someone has to maintain 
> it. The party line for years has been that in-tree maintenance is 
> easier than out-of-tree maintenance.

There's a third option, and that's the one i'm advocating: adding the 
tracepoint rules to the kernel, but in a _detached_ form from the actual 
source code.

yes, someone has to maintain it, but that will be a detached effort, on 
a low-frequency as-needed basis. It doesn't slow down or hinder 
high-frequency fast prototyping work, it does not impact the source code 
visually, and it does not make reading the code harder. Furthermore, 
while a single broken LTT tracepoint prevents the kernel from building 
at all, a single broken dynamic rule just won't be inserted into the 
kernel. All the other rules are still very much intact.
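
The "broken rule is skipped, the rest stay intact" behaviour can be
sketched in user space roughly as follows. This is only an illustration
under stated assumptions: `struct probe_rule`, `known_symbols`,
`symbol_exists()` and `insert_rules()` are hypothetical names made up
for the sketch, not part of kprobes, SystemTap or LTTng.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical detached probe rule: names the symbol it wants to attach to. */
struct probe_rule {
    const char *symbol;   /* attachment point requested by the rule */
    int active;           /* 1 once the rule has been inserted */
};

/* Stand-in for the running kernel's symbol table (illustrative names only). */
const char *const known_symbols[] = { "schedule", "do_fork", "sys_open" };

int symbol_exists(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(known_symbols) / sizeof(known_symbols[0]); i++)
        if (strcmp(known_symbols[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Insert every rule whose symbol still resolves and skip the others.
 * A rule that no longer matches is left inactive; it does not prevent
 * the remaining rules from being inserted.  Returns the number inserted.
 */
size_t insert_rules(struct probe_rule *rules, size_t n)
{
    size_t i, inserted = 0;

    for (i = 0; i < n; i++) {
        rules[i].active = symbol_exists(rules[i].symbol);
        if (rules[i].active)
            inserted++;
    }
    return inserted;
}
```

With three rules of which one names a symbol that has been renamed
away, `insert_rules()` activates the other two and reports the count,
which is the failure mode described above.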

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:00                 ` Ingo Molnar
@ 2006-09-14 20:46                   ` Karim Yaghmour
  2006-09-19 12:05                     ` Christoph Hellwig
  2006-09-14 21:02                   ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 20:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tim Bird, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> There's a third option, and that's the one i'm advocating: adding the 
> tracepoint rules to the kernel, but in a _detached_ form from the actual 
> source code.
> 
> yes, someone has to maintain it, but that will be a detached effort, on 
> a low-frequency as-needed basis. It doesn't slow down or hinder 
> high-frequency fast prototyping work, it does not impact the source code 
> visually, and it does not make reading the code harder. Furthermore, 
> while a single broken LTT tracepoint prevents the kernel from building 
> at all, a single broken dynamic rule just won't be inserted into the 
> kernel. All the other rules are still very much intact.

Actually the way ltt used to add its trace-statements is again an
implementation issue. Broken tracepoints need not lead to kernel
build failure.

That's where the markers idea can be useful. All a marker should
do is provide a location. It doesn't need to specify the variables
being observed or anything local, though that doesn't mean the
infrastructure shouldn't allow for this if the maintainer of the
code wants it.

Ideally, though, markers should be self-contained. IOW, the person
implementing such a marker should not need to edit any other file
than the one being worked on to add an instrumentation point --
at least that's the way I think is easiest. What this means is that
you would be able to add an instrumentation point in the kernel,
build it, run the tracing and view the trace with your new event
without any further intervention on any tool, header, or anything
else.

The only way that I believe this can be done is with a flexible
marker infrastructure that has a few basic properties:
- Markers should be inlined (clearly this is the bone of contention
  at this point of the thread.)
- By default, markers should neither generate a single instruction
  nor modify any instruction path that would be generated were the
  instrumentation not there.
- Allow the person instrumenting to specify which variables they
  are interested in without any possibility of build failure should
  the code change and make the variable obsolete.
- Build options should be added allowing users to:
  - Keep instrumentation disabled.
  - Create inlined trace points.
  - Create dynamic instrumentation markers.
  - Automatically generate appropriate information required for
    tools to be able to deal with the new instrumentation and/or
    display new information properly -- possibly in a new section
    of the binary.
  - etc.

Again, the goal is to have the loop from instrumentation to
visualization as simple as possible. Any instrumentation requiring
more than a single-file modification is bound to fall into bitrot,
and fast.
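
A rough user-space sketch of the "disabled by default" case from the
list above, assuming a hypothetical build option `MARKERS_ENABLED` and
hypothetical names `MARK()` and `marker_record()` (none of these come
from the LTTng patches). It only covers the disabled build: there the
preprocessor drops the marker arguments, so no code is emitted and a
variable that has since been removed cannot break the build. An enabled
build would need more machinery than shown to get the same guarantee.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * MARKERS_ENABLED is a hypothetical build option.  When it is not defined,
 * the preprocessor discards the marker and its arguments, so no code is
 * generated and a stale variable name cannot break the build.
 */
#ifdef MARKERS_ENABLED
#define MARK(name, fmt, ...) marker_record(name, fmt, __VA_ARGS__)
#else
#define MARK(name, fmt, ...) do { } while (0)
#endif

int marker_hits;    /* events recorded so far */

/* Hypothetical recorder, only called when markers are compiled in. */
void marker_record(const char *name, const char *fmt, ...)
{
    va_list ap;

    marker_hits++;
    printf("%s: ", name);
    va_start(ap, fmt);
    vprintf(fmt, ap);
    va_end(ap);
    putchar('\n');
}

/* Stand-in for an instrumented code path; edits stay within this one file. */
int switch_tasks(int prev, int next)
{
    int switched = (prev != next);

    MARK("sched_change", "%d -> %d", prev, next);
    /*
     * "old_rq" no longer exists in this function.  With markers disabled
     * the reference is never seen by the compiler; in this sketch an
     * enabled build would still fail on it.
     */
    MARK("stale_marker", "%p", old_rq);
    return switched;
}
```

The marker sits next to the code it describes and needs no edit to any
other file, which is the self-contained property argued for above.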

Hope this helps.

Thanks,

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:46                   ` Karim Yaghmour
@ 2006-09-19 12:05                     ` Christoph Hellwig
  0 siblings, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:05 UTC (permalink / raw)
  To: Karim Yaghmour
  Cc: Ingo Molnar, Tim Bird, Roman Zippel, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Thu, Sep 14, 2006 at 04:46:21PM -0400, Karim Yaghmour wrote:
> Ideally, though, markers should be self-contained. IOW, the person
> implementing such a marker should not need to edit any other file
> than the one being worked on to add an instrumentation point --
> at least that's the way I think is easiest. What this means is that
> you would be able to add an instrumentation point in the kernel,
> build it, run the tracing and view the trace with your new event
> without any further intervention on any tool, header, or anything
> else.

Just in case my first mail on this subject wasn't clear enough: I
completely agree with that statement.  Complex traces detached from
the actual source code are an utter maintenance nightmare and should
be avoided for anything but spontaneous debugging.  For that case they
are of course immensely useful.   Thus we need two forms to specify
probes, and to not make the tracing an utter mess they need to share
as much infrastructure as possible.


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:00                 ` Ingo Molnar
  2006-09-14 20:46                   ` Karim Yaghmour
@ 2006-09-14 21:02                   ` Roman Zippel
  1 sibling, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tim Bird, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> > It's only zero maintenance overhead for you.  Someone has to maintain 
> > it. The party line for years has been that in-tree maintenance is 
> > easier than out-of-tree maintenance.
> 
> There's a third option, and that's the one i'm advocating: adding the 
> tracepoint rules to the kernel, but in a _detached_ form from the actual 
> source code.
> 
> yes, someone has to maintain it, but that will be a detached effort, on 
> a low-frequency as-needed basis. It doesn't slow down or hinder 
> high-frequency fast prototyping work, it does not impact the source code 
> visually, and it does not make reading the code harder. Furthermore, 
> while a single broken LTT tracepoint prevents the kernel from building 
> at all, a single broken dynamic rule just won't be inserted into the 
> kernel. All the other rules are still very much intact.

This pretty much contradicts existing experience, most core events are 
rather static - a schedule event is a schedule event no matter how the 
actual scheduler is implemented.
Separate tracepoints are like separate documentation: they are forgotten 
by the developers who could easily keep them up to date if they were close 
to the source.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 19:40               ` Tim Bird
  2006-09-14 20:00                 ` Ingo Molnar
@ 2006-09-15 11:40                 ` Alan Cox
  2006-09-15 11:46                   ` Roman Zippel
  1 sibling, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 11:40 UTC (permalink / raw)
  To: Tim Bird
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ar Iau, 2006-09-14 am 12:40 -0700, ysgrifennodd Tim Bird:
> It's only zero maintenance overhead for you.  Someone has to
> maintain it. The party line for years has been that in-tree
> maintenance is easier than out-of-tree maintenance.

That misses the entire point. If you have dynamic tracepoints you don't
have any static tracepoints to maintain because you don't need them.
They may be a clock or three slower but you are then going to branch
into the trace tool code paths, take tlb misses, take cache misses, and
eventually get back, so the cost of it being dynamic is so close to zero
in the bigger picture it doesn't matter.

Alan

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 11:40                 ` Alan Cox
@ 2006-09-15 11:46                   ` Roman Zippel
  2006-09-15 12:38                     ` Alan Cox
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 11:46 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:

> Ar Iau, 2006-09-14 am 12:40 -0700, ysgrifennodd Tim Bird:
> > It's only zero maintenance overhead for you.  Someone has to
> > maintain it. The party line for years has been that in-tree
> > maintenance is easier than out-of-tree maintenance.
> 
> That misses the entire point. If you have dynamic tracepoints you don't
> have any static tracepoints to maintain because you don't need them.

This assumes dynamic tracepoints are generally available, which is wrong.
This assumes that dynamic tracepoints can't benefit from static source 
annotations, which is also wrong.
He doesn't miss the point at all, dynamic tracepoints don't imply zero 
maintenance overhead.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 11:46                   ` Roman Zippel
@ 2006-09-15 12:38                     ` Alan Cox
  2006-09-15 12:39                       ` Roman Zippel
  2006-09-15 17:45                       ` Andrew Morton
  0 siblings, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 12:38 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 13:46 +0200, ysgrifennodd Roman Zippel:
> > That misses the entire point. If you have dynamic tracepoints you don't
> > have any static tracepoints to maintain because you don't need them.
> 
> This assumes dynamic tracepoints are generally available, which is wrong.

Wrong in what sense: you don't have them implemented, or your
architecture is so mindbogglingly braindead that you can't implement them ?

> This assumes that dynamic tracepoints can't benefit from static source 
> annotations, which is also wrong.

gcc -g produces extensive annotations which are then usable by many
tools other than gdb.

Alan


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:38                     ` Alan Cox
@ 2006-09-15 12:39                       ` Roman Zippel
  2006-09-15 13:41                         ` Alan Cox
  2006-09-15 17:45                       ` Andrew Morton
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 12:39 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:

> Ar Gwe, 2006-09-15 am 13:46 +0200, ysgrifennodd Roman Zippel:
> > > That misses the entire point. If you have dynamic tracepoints you don't
> > > have any static tracepoints to maintain because you don't need them.
> > 
> > This assumes dynamic tracepoints are generally available, which is wrong.
> 
> Wrong in what sense: you don't have them implemented, or your
> architecture is so mindbogglingly braindead that you can't implement them ?
> 
> > This assumes that dynamic tracepoints can't benefit from static source 
> > annotations, which is also wrong.
> 
> gcc -g produces extensive annotations which are then usable by many
> tools other than gdb.

Both points have very strong consequences regarding complexity. Why do you 
want to deny me the choice to use something simple, especially since both 
solutions are not mutually exclusive and can even complement each other? 
What's the point in forcing everyone to use a single solution?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:39                       ` Roman Zippel
@ 2006-09-15 13:41                         ` Alan Cox
  2006-09-15 13:34                           ` Roman Zippel
  2006-09-15 18:10                           ` Jose R. Santos
  0 siblings, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 13:41 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 14:39 +0200, ysgrifennodd Roman Zippel:
> Both points have very strong consequences regarding complexity. Why do you 
> want to deny me the choice to use something simple, especially since both 
> solutions are not mutually exclusive and can even complement each other? 

I don't want to deny you the choice, I just don't want to see
unnecessary garbage in the base kernel. What you put in your own toilet
is a private matter. What you leave out in a public place is different.

> What's the point in forcing everyone to use a single solution?

Maintainability ? common good over individual weirdnesses ? Ability for
people to concentrate on getting one good set of interfaces not twelve
bad ones ? Consistency for user space ?

Alan


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:41                         ` Alan Cox
@ 2006-09-15 13:34                           ` Roman Zippel
  2006-09-15 14:41                             ` Alan Cox
  2006-09-15 18:10                           ` Jose R. Santos
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 13:34 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:

> Ar Gwe, 2006-09-15 am 14:39 +0200, ysgrifennodd Roman Zippel:
> > Both points have very strong consequences regarding complexity. Why do you 
> > want to deny me the choice to use something simple, especially since both 
> > solutions are not mutually exclusive and can even complement each other? 
> 
> I don't want to deny you the choice, I just don't want to see
> unnecessary garbage in the base kernel. What you put in your own toilet
> is a private matter. What you leave out in a public place is different.

Now we've already sunk to the toilet level... :-(

> > What's the point in forcing everyone to use a single solution?
> 
> Maintainability ? common good over individual weirdnesses ? Ability for
> people to concentrate on getting one good set of interfaces not twelve
> bad ones ? Consistency for user space ?

Alan, you're making things up without any proof.

Listening to this diatribe against static tracepoints, one could get the idea 
that they are something alien that would pollute the source. Well, 
everything can be abused, but good tracepoints are like good 
documentation, nobody wants to write and maintain it, but in the end 
others benefit from it if it exists.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:34                           ` Roman Zippel
@ 2006-09-15 14:41                             ` Alan Cox
  2006-09-15 14:35                               ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:41 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ar Gwe, 2006-09-15 am 15:34 +0200, ysgrifennodd Roman Zippel:
> > Maintainability ? common good over individual weirdnesses ? Ability for
> > people to concentrate on getting one good set of interfaces not twelve
> > bad ones ? Consistency for user space ?
> 
> Alan, you're making things up without any proof.

Welcome to my killfile. There isn't much point having a discussion with
anyone who considers any view or fact not in agreement as "no proof" and
any view or fact that favours them as "proven".

In the meantime perhaps the saner members of the static trace brigade
can explain why gcc debug data isn't good enough for them when it's good
enough for kgdb to do single stepping at source level and variable
printing ?

Alan

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:41                             ` Alan Cox
@ 2006-09-15 14:35                               ` Karim Yaghmour
  2006-09-15 14:58                                 ` Alan Cox
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais


Alan Cox wrote:
> In the meantime perhaps the saner members of the static trace brigade
> can explain why gcc debug data isn't good enough for them when it's good
> enough for kgdb to do single stepping at source level and variable
> printing ?

Care to explain how I can use it to implement the equivalent of this:

@@ -1709,6 +1712,7 @@ switch_tasks:
   		++*switch_count;

   		prepare_arch_switch(rq, next);
+		TRACE_SCHEDCHANGE(prev, next);
   		prev = context_switch(rq, prev, next);
   		barrier();

Also, care to explain how kprobes can be used to access the same data
without having to actually customize a probe point for every binary?

Thanks,

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:35                               ` Karim Yaghmour
@ 2006-09-15 14:58                                 ` Alan Cox
  2006-09-15 14:57                                   ` Karim Yaghmour
                                                     ` (3 more replies)
  0 siblings, 4 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:58 UTC (permalink / raw)
  To: karim
  Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ar Gwe, 2006-09-15 am 10:35 -0400, ysgrifennodd Karim Yaghmour:
> Care to explain how I can use it to implement the equivalent of this:
> 
> @@ -1709,6 +1712,7 @@ switch_tasks:
>    		++*switch_count;
> 
>    		prepare_arch_switch(rq, next);
> +		TRACE_SCHEDCHANGE(prev, next);
>    		prev = context_switch(rq, prev, next);
>    		barrier();

The gdb debug data lets you find each line and also the variable
assignments (except when highly optimised in some cases). Try
breakpointing there with kgdb and using "where"... A kgdb script is the
wrong way to do instrumentation but it does demonstrate the information
is already out there, automatically generated and self maintaining.

You do need the gcc -g debug data, but equally if it was static you'd
need to recompile with the tracepoint because it would be off by
default, and there is a very small risk in both cases you'll disturb or
change the code behaviour/flow.

> Also, care to explain how kprobes can be used to access the same data
> without having to actually customize a probe point for every binary?

That's why we have things like systemtap.

All we appear to lack is systemtap ability to parse debug data so it can
be told "trace on line 9 of sched.c and record rq and next"

Alan

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:58                                 ` Alan Cox
@ 2006-09-15 14:57                                   ` Karim Yaghmour
  2006-09-15 17:49                                     ` Andrew Morton
  2006-09-15 17:01                                   ` Tim Bird
                                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:57 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais


Alan Cox wrote:
> The gdb debug data lets you find each line and also the variable
> assignments (except when highly optimised in some cases). Try
> breakpointing there with kgdb and using "where"... A kgdb script is the
> wrong way to do instrumentation but it does demonstrate the information
> is already out there, automatically generated and self maintaining.
> 
> You do need the gcc -g debug data, but equally if it was static you'd
> need to recompile with the tracepoint because it would be off by
> default, and there is a very small risk in both cases you'll disturb or
> change the code behaviour/flow.
...
> That's why we have things like systemtap.
> 
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

Thanks for the explanation. But I submit to you that both explanations
actually highlight the argument I was making earlier: dynamic tracing
(and gdb info in this case) requires a non-expert to chase kernel
versions and create appropriate scripts/config-info for the
post-insertion of instrumentation, with the risks to kernel developers
this may have (e.g. a bug report to lkml from a user claiming to have
discovered a problem in a subsystem when, in fact, the trace point was
ill-chosen by an external maintainer.)

Cheers,

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:57                                   ` Karim Yaghmour
@ 2006-09-15 17:49                                     ` Andrew Morton
  2006-09-15 18:20                                       ` Karim Yaghmour
  0 siblings, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:49 UTC (permalink / raw)
  To: karim
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 10:57:29 -0400
Karim Yaghmour <karim@opersys.com> wrote:

> But I submit to you that both explanations
> actually highlight the argument I was making earlier: dynamic
> tracing (and gdb info in this case) requires a non-expert to chase
> kernel versions and create appropriate scripts/config-info for the
> post-insertion of instrumentation
> ...

Again, I don't see this as a huge problem.  patch(1) is able to keep track
of specific places within source code even in the presence of quite violent
changes to that source code.  There's no reason why systemtap support code
cannot do the same.
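
One way to picture the patch(1)-style tracking suggested here is a
search for a remembered anchor line, starting at the old line number and
widening outward. The sketch below is a simplification made for
illustration: real patch(1) matches several context lines and allows
fuzz, while this matches a single exact line, and `relocate_probe()` is
a hypothetical name, not systemtap code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the line equal to `anchor`, searching outward from `hint`
 * (0-based), similar in spirit to how patch(1) tries increasing
 * offsets around the position recorded in a hunk header.
 * Returns the 0-based index of the closest match, or -1 if none.
 */
long relocate_probe(const char *const *lines, size_t nlines,
                    size_t hint, const char *anchor)
{
    size_t off;

    if (nlines == 0)
        return -1;
    if (hint >= nlines)
        hint = nlines - 1;      /* clamp a stale hint into range */

    for (off = 0; off < nlines; off++) {
        /* look `off` lines below the hint */
        if (hint + off < nlines && strcmp(lines[hint + off], anchor) == 0)
            return (long)(hint + off);
        /* then `off` lines above it */
        if (off <= hint && strcmp(lines[hint - off], anchor) == 0)
            return (long)(hint - off);
    }
    return -1;
}
```

If the anchor line itself was rewritten, the search returns -1 and the
probe location has to be fixed by hand.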

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:49                                     ` Andrew Morton
@ 2006-09-15 18:20                                       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 18:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais



Andrew Morton wrote:
> Again, I don't see this as a huge problem.  patch(1) is able to keep track
> of specific places within source code even in the presence of quite violent
> changes to that source code.  There's no reason why systemtap support code
> cannot do the same.

If you don't want to listen to my part of the argument then consider
the point of view of those who have maintained systems entirely based
on binary editing, namely systemtap and LKET. It's telling that
those who have been involved in tracing, be it by static
instrumentation of code or the use of binary editing, all favor some
form of static markup mechanism of the code.

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:58                                 ` Alan Cox
  2006-09-15 14:57                                   ` Karim Yaghmour
@ 2006-09-15 17:01                                   ` Tim Bird
  2006-09-15 17:08                                   ` Frank Ch. Eigler
  2006-09-15 18:18                                   ` Martin Bligh
  3 siblings, 0 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-15 17:01 UTC (permalink / raw)
  To: Alan Cox
  Cc: karim, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Alan Cox wrote:
> Ar Gwe, 2006-09-15 am 10:35 -0400, ysgrifennodd Karim Yaghmour:
>> @@ -1709,6 +1712,7 @@ switch_tasks:
>>    		++*switch_count;
>>
>>    		prepare_arch_switch(rq, next);
>> +		TRACE_SCHEDCHANGE(prev, next);
>>    		prev = context_switch(rq, prev, next);
>>    		barrier();
>
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

If the latter is a suggestion for how an out-of-tree rule for a
tracepoint definition should look, it's a terrible one.
Alan's example is much more fragile, from a maintenance perspective,
than Karim's.  Plus, it's much more difficult to implement, whether
you plan to inject no-ops at compile time, just record locations and
stack offsets, or actually place some tracing code (heaven forbid)
that the compiler could optimize for that context.

I still think that this is off-topic for the patch posted.  I think we
should debate the implementation of tracepoints/markers when someone posts a
patch for some.  I think it's rather scurrilous to complain about
code NOT submitted.  Ingo has even mis-characterized the not-submitted
instrumentation patch, by saying it has 350 tracepoints when it has no
such thing.  I counted 58 for one architecture (with only 8 being
arch-specific).
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:58                                 ` Alan Cox
  2006-09-15 14:57                                   ` Karim Yaghmour
  2006-09-15 17:01                                   ` Tim Bird
@ 2006-09-15 17:08                                   ` Frank Ch. Eigler
  2006-09-15 17:57                                     ` Andrew Morton
  2006-09-15 18:31                                     ` Alan Cox
  2006-09-15 18:18                                   ` Martin Bligh
  3 siblings, 2 replies; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 17:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> [...]
> > 
> >    		prepare_arch_switch(rq, next);
> > +		TRACE_SCHEDCHANGE(prev, next);
> >    		prev = context_switch(rq, prev, next);
> >    		barrier();
> 
> The gdb debug data lets you find each line and also the variable
> assignments (except when highly optimised in some cases). [...]

Unfortunately, variables and even control flow are quite regularly
made non-probe-capable by modern gcc.  Statement boundaries and
variables are not preserved.  There is an arms race within gcc to both
improve code optimization and its own "reverse-engineering" debugging
data generation, and the former is always ahead.

The end result is that there are many spots that we'd like to probe in
systemtap, but can't place exactly or extract all the data we'd like.
Really.

There are also spots that for other reasons cannot tolerate a fully
dynamic kprobes-style probe:

- where 1000-cycle int3-dispatching overheads are too high
- in low-level code such as fault handling or locking, that, if probed
  dynamically, could entail infinite regress
- debugging information may not be available

This is the reason why I'm in favour of some lightweight event-marking
facility: a way of catching those points where dynamic probing is not
sufficiently fast or dependable.
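
To make "lightweight event-marking" a bit more concrete, here is a
minimal user-space sketch.  It is not code from any posted patch: the
names (struct marker, marker_fire, and so on) are hypothetical, and a
kernel version would need proper synchronisation around probe
registration.  The idea is that a disabled marker costs roughly one
load and one branch, and that no debug information is needed to find
the marker or its arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe signature: marker name plus two opaque arguments. */
typedef void (*marker_probe_t)(const char *name, const void *a, const void *b);

/* One static marker. probe == NULL means nothing is attached. */
struct marker {
    const char *name;
    marker_probe_t probe;
};

/* Called at the instrumented site: a pointer test when disabled,
 * an indirect call when a probe is attached. */
void marker_fire(struct marker *m, const void *a, const void *b)
{
    marker_probe_t p = m->probe;

    if (p)
        p(m->name, a, b);
}

/* Attach a probe. No locking here; this is only an illustration. */
int marker_register(struct marker *m, marker_probe_t probe)
{
    assert(m != NULL && probe != NULL);
    m->probe = probe;
    return 0;
}

/* Detach the probe, returning the marker to its cheap disabled state. */
void marker_unregister(struct marker *m)
{
    m->probe = NULL;
}

/* Example probe: count context switches and remember the last pair. */
unsigned long sched_change_count;
const void *last_prev, *last_next;

void count_sched_change(const char *name, const void *prev, const void *next)
{
    (void)name;
    sched_change_count++;
    last_prev = prev;
    last_next = next;
}

struct marker sched_change_marker = { "sched_change", NULL };

/* Stand-in for the scheduler path quoted above. */
void fake_context_switch(const void *prev, const void *next)
{
    marker_fire(&sched_change_marker, prev, next);
}
```

Because the marker names its arguments in the source, the probe keeps
working when the surrounding code moves or gets optimised differently.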

> [...]
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

Actually:

#! stap
probe kernel.function("*@kernel/sched.c:9") { printf("%p %p", $rq, $next) }

- FChE

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:08                                   ` Frank Ch. Eigler
@ 2006-09-15 17:57                                     ` Andrew Morton
  2006-09-15 18:31                                     ` Alan Cox
  1 sibling, 0 replies; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:57 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Alan Cox, karim, Roman Zippel, Tim Bird, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

On 15 Sep 2006 13:08:29 -0400
fche@redhat.com (Frank Ch. Eigler) wrote:

> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> > [...]
> > > 
> > >    		prepare_arch_switch(rq, next);
> > > +		TRACE_SCHEDCHANGE(prev, next);
> > >    		prev = context_switch(rq, prev, next);
> > >    		barrier();
> > 
> > The gdb debug data lets you find each line and also the variable
> > assignments (except when highly optimised in some cases). [...]
> 
> Unfortunately, variables and even control flow are quite regularly
> made non-probe-capable by modern gcc.  Statement boundaries and
> variables are not preserved.  There is an arms race within gcc to both
> improve code optimization and its own "reverse-engineering" debugging
> data generation, and the former is always ahead.
> 
> The end result is that there are many spots that we'd like to probe in
> systemtap, but can't place exactly or extract all the data we'd like.
> Really.

Useful info, thanks.

> There are also spots that for other reasons cannot tolerate a fully
> dynamic kprobes-style probe:
> 
> - where 1000-cycle int3-dispatching overheads too high

Is that still true of the recent kprobes "boosting" changes?

> - in low-level code such as fault handling or locking, that, if probed
>   dynamically, could entail infinite regress
> - debugging information may not be available
> 
> This is the reason why I'm in favour of some lightweight event-marking
> facility: a way of catching those points where dynamic probing is not
> sufficiently fast or dependable.

OK.

> > [...]
> > All we appear to lack is systemtap ability to parse debug data so it can
> > be told "trace on line 9 of sched.c and record rq and next"
> 
> Actually:
> 
> #! stap
> probe kernel.function("*@kernel/sched.c:9") { printf("%p %p", $rq, $next) }
> 

Really.  That's impressive progress.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:08                                   ` Frank Ch. Eigler
  2006-09-15 17:57                                     ` Andrew Morton
@ 2006-09-15 18:31                                     ` Alan Cox
  2006-09-15 18:12                                       ` Ingo Molnar
  2006-09-15 18:24                                       ` Frank Ch. Eigler
  1 sibling, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 18:31 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> - where 1000-cycle int3-dispatching overheads too high

Why are your despatching overheads 1000 cycles? (And if it's due to int3,
why are you using int3? 8))


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:31                                     ` Alan Cox
@ 2006-09-15 18:12                                       ` Ingo Molnar
  2006-09-15 19:10                                         ` Roman Zippel
  2006-09-15 18:24                                       ` Frank Ch. Eigler
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 18:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: Frank Ch. Eigler, karim, Roman Zippel, Tim Bird,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > - where 1000-cycle int3-dispatching overheads too high
> 
> Why are your despatching overheads 1000 cycles ? (and if its due to 
> int3 why are you using int 3 8))

this is being worked on actively: there's the "djprobes" patchset, which 
includes a simplified disassembler to analyze common target code and can 
thus insert much faster, call-a-trampoline-function based tracepoints 
that are just as fast as (or faster than) compile-time, static 
tracepoints.

there's no fundamental reason why INT3 should be the primary model of 
inserting kprobes. Sometimes we are unlucky and the code which we target 
is too complex - then we take a few hundred cycles of a penalty. If that 
piece of code is a really common destination then we can add a static 
marker in the source which both prepares parameters and inserts a 
sufficiently sized NOP (or a function call) to prepare things for fast 
dynamic tracing - but it should only be an optional performance helper 
that we have the freedom to zap.

(kprobes can be thought of as a special "JIT", and there's no 
fundamental reason why it couldn't do almost arbitrary transformations on 
kernel code.)

and there's a lot more that kprobes/systemtap can do: it can be a method 
of extending the kernel along a 'plugin' model - without having to 
impact the kernel source! That way people can experiment with kernel 
extensions on live kernels, without the barrier of recompile/reboot.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:12                                       ` Ingo Molnar
@ 2006-09-15 19:10                                         ` Roman Zippel
  2006-09-15 19:10                                           ` Ingo Molnar
                                                             ` (2 more replies)
  0 siblings, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 19:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Frank Ch. Eigler, karim, Tim Bird, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > > - where 1000-cycle int3-dispatching overheads too high
> > 
> > Why are your despatching overheads 1000 cycles ? (and if its due to 
> > int3 why are you using int 3 8))
> 
> this is being worked on actively: there's the "djprobes" patchset, which 
> includes a simplified disassembler to analyze common target code and can 
> thus insert much faster, call-a-trampoline-function based tracepoints 
> that are just as fast as (or faster than) compile-time, static 
> tracepoints.

Who is going to implement this for every arch?
Is this now the official party line that only archs, which implement all 
of this, can make use of efficient tracing?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:10                                         ` Roman Zippel
@ 2006-09-15 19:10                                           ` Ingo Molnar
  2006-09-15 20:05                                           ` Thomas Gleixner
  2006-09-19 12:29                                           ` Christoph Hellwig
  2 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 19:10 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Frank Ch. Eigler, karim, Tim Bird, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Fri, 15 Sep 2006, Ingo Molnar wrote:
> 
> > this is being worked on actively: there's the "djprobes" patchset, 
> > which includes a simplified disassembler to analyze common target 
> > code and can thus insert much faster, call-a-trampoline-function 
> > based tracepoints that are just as fast as (or faster than) 
> > compile-time, static tracepoints.
> 
> Who is going to implement this for every arch?

someone who is interested enough in that arch growing that capability?

> Is this now the official party line that only archs, which implement 
> all of this, can make use of efficient tracing?

that's certainly my preference - kprobes have lots of other advantages 
besides tracing. Whether that becomes the "official party line" depends 
on the technological analysis of the situation which will ultimately 
shape the outcome of this discussion.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:10                                         ` Roman Zippel
  2006-09-15 19:10                                           ` Ingo Molnar
@ 2006-09-15 20:05                                           ` Thomas Gleixner
  2006-09-15 20:35                                             ` Roman Zippel
  2006-09-15 21:44                                             ` Tim Bird
  2006-09-19 12:29                                           ` Christoph Hellwig
  2 siblings, 2 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 20:05 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

On Fri, 2006-09-15 at 21:10 +0200, Roman Zippel wrote:
> > 
> > this is being worked on actively: there's the "djprobes" patchset, which 
> > includes a simplified disassembler to analyze common target code and can 
> > thus insert much faster, call-a-trampoline-function based tracepoints 
> > that are just as fast as (or faster than) compile-time, static 
> > tracepoints.
> 
> Who is going to implement this for every arch?
> Is this now the official party line that only archs, which implement all 
> of this, can make use of efficient tracing?

In the reverse you are enforcing an ugly - but available for all archs -
solution due to the fact that there is nobody interested enough to
implement it ?

If there is no interest to do that, then this arch can probably live w/o
instrumentation for the next decade too.

	tglx



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:05                                           ` Thomas Gleixner
@ 2006-09-15 20:35                                             ` Roman Zippel
  2006-09-15 21:44                                             ` Tim Bird
  1 sibling, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 20:35 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Thomas Gleixner wrote:

> > Who is going to implement this for every arch?
> > Is this now the official party line that only archs, which implement all 
> > of this, can make use of efficient tracing?
> 
> In the reverse you are enforcing an ugly - but available for all archs -
> solution due to the fact that there is nobody interested enough to
> implement it ?

Where is the proof that such a solution is inherently ugly? (Note that 
just picking some example from LTT doesn't make a general proof.)
I am also not the one who wants to enforce a single solution onto 
everyone.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:05                                           ` Thomas Gleixner
  2006-09-15 20:35                                             ` Roman Zippel
@ 2006-09-15 21:44                                             ` Tim Bird
  1 sibling, 0 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-15 21:44 UTC (permalink / raw)
  To: tglx
  Cc: Roman Zippel, Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
	Michel Dagenais

Thomas Gleixner wrote:
> On Fri, 2006-09-15 at 21:10 +0200, Roman Zippel wrote:
> 
>>>this is being worked on actively: there's the "djprobes" patchset, which 
>>>includes a simplified disassembler to analyze common target code and can 
>>>thus insert much faster, call-a-trampoline-function based tracepoints 
>>>that are just as fast as (or faster than) compile-time, static 
>>>tracepoints.
>>
>>Who is going to implement this for every arch?
>>Is this now the official party line that only archs, which implement all 
>>of this, can make use of efficient tracing?
>  
> In the reverse you are enforcing an ugly - but available for all archs -
> solution due to the fact that there is nobody interested enough to
> implement it ?

????

If there's a solution people are willing to implement, and one
they aren't - doesn't that say something?  Static tracepoint
patches for numerous architectures have existed and been maintained
out-of-tree for years.

> If there is no interest to do that, then this arch can probably live w/o
> instrumentation for the next decade too.

The arches already have instrumentation - just not dynamic 
instrumentation.  The reason static tracepoints have been
implemented and kprobes haven't is that static tracepoints
are sufficient for what those people are doing, and dynamic
tracepoints are a pain to implement.

Let me repeat that, just in case people missed it:
"Static tracepoints work for what I need."  If other people
want to implement something fancier that works for them,
then feel free.

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:10                                         ` Roman Zippel
  2006-09-15 19:10                                           ` Ingo Molnar
  2006-09-15 20:05                                           ` Thomas Gleixner
@ 2006-09-19 12:29                                           ` Christoph Hellwig
  2006-09-19 13:17                                             ` Roman Zippel
  2 siblings, 1 reply; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:29 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

On Fri, Sep 15, 2006 at 09:10:44PM +0200, Roman Zippel wrote:
> Hi,
> 
> On Fri, 15 Sep 2006, Ingo Molnar wrote:
> 
> > > Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > > > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > > > - where 1000-cycle int3-dispatching overheads too high
> > > 
> > > Why are your despatching overheads 1000 cycles ? (and if its due to 
> > > int3 why are you using int 3 8))
> > 
> > this is being worked on actively: there's the "djprobes" patchset, which 
> > includes a simplified disassembler to analyze common target code and can 
> > thus insert much faster, call-a-trampoline-function based tracepoints 
> > that are just as fast as (or faster than) compile-time, static 
> > tracepoints.
> 
> Who is going to implement this for every arch?
> Is this now the official party line that only archs, which implement all 
> of this, can make use of efficient tracing?

Come on, stop trying to be an asshole.  It's always been the case that to
use new functionality you have to add arch code where necessary.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-19 12:29                                           ` Christoph Hellwig
@ 2006-09-19 13:17                                             ` Roman Zippel
  0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-19 13:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
	Mathieu Desnoyers, linux-kernel, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Hi,

On Tue, 19 Sep 2006, Christoph Hellwig wrote:

> > Who is going to implement this for every arch?
> > Is this now the official party line that only archs, which implement all 
> > of this, can make use of efficient tracing?
> 
> Come on, stop trying to be an asshole.  It's always been the case that to
> use new functionality you have to add arch code where nessecary.

On the contrary I'm really trying my best to be reasonable.
If there were no way around implementing kprobes, I would completely agree 
with you.

Let's take an item from my todo list: TLS support for m68k. This is a 
language feature that is becoming more and more important and increasingly 
difficult to work around. Considering the complexity of this feature, it 
will take quite a bit of the time available to me, and somehow I doubt 
someone will beat me to it. I'm not complaining about it, I even enjoy 
hacking on it, but I also don't have to take any shit about how I spend 
my time.

Considering this, I hope you understand how important kprobes are to me. I 
admit it's a nice feature, but it's far from essential.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:31                                     ` Alan Cox
  2006-09-15 18:12                                       ` Ingo Molnar
@ 2006-09-15 18:24                                       ` Frank Ch. Eigler
  2006-09-15 18:23                                         ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 18:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

[-- Attachment #1: Type: text/plain, Size: 734 bytes --]

Hi -

On Fri, Sep 15, 2006 at 07:31:48PM +0100, Alan Cox wrote:

> Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
Yeah, or something. :-)

> > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > - where 1000-cycle int3-dispatching overheads too high
> 
> Why are your despatching overheads 1000 cycles ? (and if its due to int3
> why are you using int 3 8))

Smart teams from IBM and Hitachi have been hammering away at this code
for a year or two now, and yet (roughly) here we are.  There have been
experiments involving plopping branches instead of int3's at probe
locations, but this is self-modifying code involving multiple
instructions, and appears to be tricky on SMP/preempt boxes.

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:24                                       ` Frank Ch. Eigler
@ 2006-09-15 18:23                                         ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 18:23 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Alan Cox, karim, Roman Zippel, Tim Bird, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais


* Frank Ch. Eigler <fche@redhat.com> wrote:

> > Why are your despatching overheads 1000 cycles ? (and if its due to 
> > int3 why are you using int 3 8))
> 
> Smart teams from IBM and Hitachi have been hammering away at this code 
> for a year or two now, and yet (roughly) here we are.  There have been 
> experiments involving plopping branches instead of int3's at probe 
> locations, but this is self-modifying code involving multiple 
> instructions, and appears to be tricky on SMP/preempt boxes.

i am talking to them about that, and i'm 100% sure the solution is much 
easier than the many (much harder) problems that SystemTap has already 
solved. I think you are way too modest to realize how powerful (and 
important) SystemTap is :-)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 14:58                                 ` Alan Cox
                                                     ` (2 preceding siblings ...)
  2006-09-15 17:08                                   ` Frank Ch. Eigler
@ 2006-09-15 18:18                                   ` Martin Bligh
  3 siblings, 0 replies; 271+ messages in thread
From: Martin Bligh @ 2006-09-15 18:18 UTC (permalink / raw)
  To: Alan Cox
  Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

>>Also, care to explain how kprobes can be used to access same data
>>without having to actually customize a probe point for every binary?
> 
> 
> Thats why we have things like systemtap.
> 
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

But that's the whole point - if it's not integrated into a marker as
source code, it requires manual intervention for every bloody release
to do. "line 9 of sched.c" is a farcically stupid way of doing tags
on a dynamically moving project like the linux kernel.

Yes, that may work OK for something that is very static, like a distro
snapshot, but as a general mechanism, it's unsustainable and broken.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 13:41                         ` Alan Cox
  2006-09-15 13:34                           ` Roman Zippel
@ 2006-09-15 18:10                           ` Jose R. Santos
  2006-09-15 19:49                             ` Mathieu Desnoyers
  1 sibling, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 18:10 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Alan Cox wrote:
> Consistency for user space ?
>   

With several other trace tools being implemented for the kernel, there 
is a great problem with consistency among these tools.  It is my 
opinion that traces are of very little use to _most_ people without the 
availability of post-processing tools to analyze them.  While I 
won't say that we need one all-powerful solution, it would be good if all 
solutions were at least able to talk to the same post-processing 
facilities in user space.  Before LTTng is even considered for the 
kernel, there needs to be a discussion to determine whether the trace 
mechanism being proposed is suitable for everyone interested in doing 
trace analysis.  The fact that tools like LKET and LKST also exist seems 
to suggest that there are other things to be considered when it comes to 
implementing a trace mechanism that everyone would be happy with.

It would also be useful for all the trace tools to implement the same 
probe points so that post-processing tools can be interchanged between 
the various trace implementations.

-JRS

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:10                           ` Jose R. Santos
@ 2006-09-15 19:49                             ` Mathieu Desnoyers
  2006-09-15 20:54                               ` Jose R. Santos
  0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 19:49 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Alan Cox wrote:
> 
> With several other trace tools being implemented for the kernel, there 
> is a great problem with consistencies among these tool.  It is my 
> opinion that trace are of very little use to _most_ people with out the 
> availability of post-processing tools to analyses these trace.  While I 
> wont say that we need one all powerful solution, it would be good if all 
> solutions would at least be able to talk to the same post-processing 
> facilities in user-space.  Before LTTng is even considered into the 
> kernel, there need to be discussion to determine if the trace mechanism 
> being propose is suitable for all people interested in doing trace 
> analysis.  The fact the there also exist tool like LKET and LKST seem to 
> suggest that there other things to be considered when it comes to 
> implementing a trace mechanism that everyone would be happy with.
> 
> It would also be useful for all the trace tool to implement the same 
> probe points so that post-processing tools can be interchanged between 
> the various trace implementations.
> 
> 

Hi Jose,

I completely agree that there is a crying need for standardisation there. The
reason why I propose the LTTng infrastructure as a tracing core in the Linux
kernel is this: the fundamental problem I have found with kernel tracers so
far is that they perturb the system too much or do not offer fine-grained
enough protection against reentrancy. Ingo's post about tracing statements
breaking the kernel all the time seems to me like sufficient proof that this
is a real problem.

My goal with LTTng is to provide a reentrant data serialisation mechanism that
can be called from anywhere in the kernel (ok, the vmalloc path of the page
fault handler is _the_ exception) that does not use any lock and can therefore
trace code paths like NMI handlers.
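
To illustrate the general idea (this is a simplified user-space sketch,
not the LTTng code itself), space in the buffer can be reserved with a
compare-and-swap loop instead of a lock. The names and the buffer size
below are made up for the example, and I assume a buffer that simply
stops accepting events when full; wrap-around and telling a reader when
a slot has been completely written are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define TRACE_BUF_SIZE 4096u   /* hypothetical buffer size */

struct trace_buf {
    _Atomic size_t write_off;      /* offset of the next free byte */
    _Atomic unsigned long lost;    /* events dropped because the buffer was full */
    unsigned char data[TRACE_BUF_SIZE];
};

/* Reserve len bytes without taking a lock. If an interrupt or NMI
 * handler reserves space in the middle of the loop, the compare-and-swap
 * fails, old is reloaded, and we retry, so each caller ends up with its
 * own disjoint slot. Returns NULL when the buffer is full. */
void *trace_reserve(struct trace_buf *b, size_t len)
{
    size_t old = atomic_load(&b->write_off);
    size_t new_off;

    assert(len > 0 && len <= TRACE_BUF_SIZE);
    do {
        if (TRACE_BUF_SIZE - old < len) {
            atomic_fetch_add(&b->lost, 1);
            return NULL;
        }
        new_off = old + len;
    } while (!atomic_compare_exchange_weak(&b->write_off, &old, new_off));

    return &b->data[old];
}

/* Reserve a slot and copy the event into it. Returns 0, or -1 if dropped. */
int trace_write(struct trace_buf *b, const void *event, size_t len)
{
    void *slot = trace_reserve(b, len);

    if (!slot)
        return -1;
    memcpy(slot, event, len);
    return 0;
}
```

Since the only shared state is the write offset, a tracing call that
interrupts another tracing call never waits on anything the interrupted
context holds.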

I also implemented code that would serialize any type of data structure I could
think of. If it is too much, well, we can use part of it.

The LTTng trace format is explained here. Your comments on it are very welcome.

http://ltt.polymtl.ca/ > LTTV and LTTng developer documentation > format.html
(http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/format.html)

Regards,

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:49                             ` Mathieu Desnoyers
@ 2006-09-15 20:54                               ` Jose R. Santos
  2006-09-15 21:42                                 ` Karim Yaghmour
  2006-09-15 21:46                                 ` Mathieu Desnoyers
  0 siblings, 2 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 20:54 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> * Jose R. Santos (jrs@us.ibm.com) wrote:
> > Alan Cox wrote:
> > 
> > With several other trace tools being implemented for the kernel, there 
> > is a great problem with consistencies among these tool.  It is my 
> > opinion that trace are of very little use to _most_ people with out the 
> > availability of post-processing tools to analyses these trace.  While I 
> > wont say that we need one all powerful solution, it would be good if all 
> > solutions would at least be able to talk to the same post-processing 
> > facilities in user-space.  Before LTTng is even considered into the 
> > kernel, there need to be discussion to determine if the trace mechanism 
> > being propose is suitable for all people interested in doing trace 
> > analysis.  The fact the there also exist tool like LKET and LKST seem to 
> > suggest that there other things to be considered when it comes to 
> > implementing a trace mechanism that everyone would be happy with.
> > 
> > It would also be useful for all the trace tool to implement the same 
> > probe points so that post-processing tools can be interchanged between 
> > the various trace implementations.
> > 
> > 
>
> Hi Jose,
>
> I completely agree that there is a crying need for standardisation there. The
> reason why I propose the LTTng infrastructure as a tracing core in the Linux
> kernel is this : the fundamental problem I have found with kernel tracers so
> far is that they perturb the system too much or do not offer enough fine
> grained protection against reentrancy. Ingo's post about tracing statement
> breaking the kernel all the time seems to me like a sufficient proof that this
> is a real problem.
>
>   
I agree with your goal for ltt.

> My goal with LTTng is to provide a reentrant data serialisation mechanism that
> can be called from anywhere in the kernel (ok, the vmalloc path of the page
> fault handler is _the_ exception) that does not use any lock and can therefore
> trace code paths like NMI handlers.
>   

One question I've noticed in this thread that neither you nor 
Karim seems to have answered is why LTTng is needed if a suitable 
replacement can be developed using SystemTap with static markers.  I am 
personally interested in this answer as well.  If all the things that 
LTT is proposing can be implemented in SystemTap, what then is the 
advantage of accepting such an interface into the kernel?

I don't really care which method is used as long as it's the right tool 
for the job.  I see several ideas from LTT that could be integrated into 
SystemTap in order to make it a one stop solution for both dynamic and 
static tracing.  Would you care to elaborate why you think having 
separate projects is a better solution?
> I also implemented code that would serialize any type of data structure I could
> think of. If it is too much, well, we can use part of it.
>
> LTTng trace format is explained there. Your comments on it are very welcome.
>
> http://ltt.polymtl.ca/ > LTTV and LTTng developer documentation > format.html
> (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/format.html)
>   

Trace event headers are very similar between both LTT and LKET which is 
good in order to get some synergy between our projects.  One thing that 
LKET has on each trace event that LTT doesn't is the tid and CPU id of 
each event.  We find this extremely useful for post-processing.  Also, 
why have the event_size on every event taken?  Why not describe the 
event in the trace header, remove this redundant information from 
the event header, and save some trace file space?

-JRS

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:54                               ` Jose R. Santos
@ 2006-09-15 21:42                                 ` Karim Yaghmour
  2006-09-15 21:46                                 ` Mathieu Desnoyers
  1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:42 UTC (permalink / raw)
  To: jrs
  Cc: Mathieu Desnoyers, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
	linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Jose R. Santos wrote:
> I don't really care which method is used as long as its the right tool 
> for the job.  I see several idea from LTT that could be integrated into 
> SystemTap in order to make it a one stop solution for both dynamic and 
> static tracing.  Would you care to elaborate why you think having 
> separate projects is a better solution?

We don't -- at least *I* wouldn't care, but I'm not the current
maintainer. ltt's usefulness has always been in the digested information
it can present to the user. The kernel patching part was a necessary
evil. What I object to is the depiction of dynamic tracing as solving
the need for static markup. It doesn't, and, therefore, does not
currently constitute an adequate substitute for ltt's patches. If
someone else can actually provide ltt with the events and surrounding
detail (timestamping and all) it needs while still providing the same
performance we currently get out of the current ltt patches, then I'd
say more power to them -- the current developers may have more relevant
things to say.

Karim

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:54                               ` Jose R. Santos
  2006-09-15 21:42                                 ` Karim Yaghmour
@ 2006-09-15 21:46                                 ` Mathieu Desnoyers
  2006-09-19 15:05                                   ` Jose R. Santos
  1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 21:46 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi Jose,

* Jose R. Santos (jrs@us.ibm.com) wrote:
> >My goal with LTTng is to provide a reentrant data serialisation mechanism 
> >that
> >can be called from anywhere in the kernel (ok, the vmalloc path of the page
> >fault handler is _the_ exception) that does not use any lock and can 
> >therefore
> >trace code paths like NMI handlers.
> >  
> 
> One of the things that I've notice from this thread that neither you or 
> Karim sees to have answer is why is LTTng needed if a suitable 
> replacement can be developed using SystemTap with static markers.  I am 
> personally interested in this answer as well.  If all the things that 
> LTT is proposing can be implemented in SystemTap, what then is the 
> advantage of accenting such an interface into the kernel.
> 

Well, last time I checked, SystemTAP did not have a reentrant serialisation
mechanism to write the information to the buffers. Also, the goals of the
projects differ : SystemTAP finds it acceptable to suffer from the kprobe
performance hit, while that is unacceptable for LTTng.

> I don't really care which method is used as long as its the right tool 
> for the job.  I see several idea from LTT that could be integrated into 
> SystemTap in order to make it a one stop solution for both dynamic and 
> static tracing.  Would you care to elaborate why you think having 
> separate projects is a better solution?

I think that each project focuses on its own goals, but that there is
much to gain in reusing the strengths of each.

SystemTAP is good at dynamic instrumentation.
LTTng is good at data serialisation under a fully reentrant kernel.
LTTng provides logging primitives for any data type, including SystemTAP text
output.

Is someone willing to try to create a small facility that will dump SystemTAP's
output in LTTng ? It is nearly trivial : if I wasn't completing my debugfs port,
I would probably be doing it right now.

> Trace event headers are very similar between both LTT and LKET which is 
> good in other to get some synergy between our projects.  One thing that 
> LKET has on each trace event that LTT doesn't is the tid and CPU id of 
> each event.  We find this extremely useful for post-processing.  Also, 
> why have the event_size on every event taken?  Why not describe the 
> event during the trace header and remove this redundant information from 
> the event header and save some trace file space.
> 

A standard event header has to have only crucial information, nothing more, or
it becomes bloated and quickly grows the trace size. We decided not to put tid
and CPU id in the event header because tid is already available from the
schedchange events at post-processing time and CPU id is already available too,
as we have per CPU buffers.
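
As a rough illustration of that post-processing step, here is a minimal sketch
in C of how a viewer might recover the tid of every event in one per-CPU buffer
from the schedchange events alone. The struct layout, the event ids and the
function name are assumptions made for this example; they are not the LTTng
trace format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event ids; the real LTTng ids differ. */
enum { EV_SCHEDCHANGE = 1, EV_OTHER = 2 };

/* Hypothetical decoded event: neither tid nor CPU id is stored in it. */
struct event {
    uint64_t tsc;      /* timestamp */
    uint16_t id;       /* event type */
    int32_t  next_tid; /* only meaningful for EV_SCHEDCHANGE */
};

/*
 * Walk one CPU's buffer in order and fill tids[i] with the thread that
 * was running when event i was recorded. initial_tid is the thread
 * running at trace start (assumed to be known, e.g. from a state dump).
 */
void annotate_tids(const struct event *ev, size_t n,
                   int32_t initial_tid, int32_t *tids)
{
    int32_t cur = initial_tid;

    for (size_t i = 0; i < n; i++) {
        /* The schedchange itself is attributed to the outgoing thread. */
        tids[i] = cur;
        if (ev[i].id == EV_SCHEDCHANGE)
            cur = ev[i].next_tid;
    }
}
```

The CPU id needs no storage at all in this scheme: it is simply the index of
the buffer the events were read from.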

The event size is, strictly speaking, unnecessary, but in practice it is very
useful for verifying that the size of the data recorded by the kernel matches
the size of data the viewer thinks it is reading. Think of it as a
consistency check between kernel and viewer algorithms.
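
A minimal sketch of such a check on the viewer side could look like the
following. The header layout and the names are assumptions for illustration
only (the real format is in format.html); the point is just that the viewer
compares the recorded size with the size it expects for that event id.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk event header; the real LTTng layout differs. */
struct event_header {
    uint32_t timestamp;
    uint16_t event_id;
    uint16_t event_size; /* payload bytes, as written by the kernel */
};

/*
 * Walk a buffer and compare each recorded event_size with the size the
 * viewer expects for that event id (expected[id]). Returns the number
 * of events checked, or -1 at the first mismatch or truncated record.
 */
long check_sizes(const unsigned char *buf, size_t len,
                 const uint16_t *expected, size_t n_ids)
{
    size_t off = 0;
    long count = 0;

    while (off < len) {
        struct event_header h;

        if (len - off < sizeof(h))
            return -1;                    /* truncated header */
        memcpy(&h, buf + off, sizeof(h)); /* avoids unaligned access */
        off += sizeof(h);

        if (h.event_id >= n_ids || h.event_size != expected[h.event_id])
            return -1;                    /* kernel and viewer disagree */
        if (len - off < h.event_size)
            return -1;                    /* truncated payload */
        off += h.event_size;
        count++;
    }
    return count;
}
```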

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 21:46                                 ` Mathieu Desnoyers
@ 2006-09-19 15:05                                   ` Jose R. Santos
  2006-09-19 15:30                                     ` Mathieu Desnoyers
  0 siblings, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-19 15:05 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> > Trace event headers are very similar between both LTT and LKET which is 
> > good in other to get some synergy between our projects.  One thing that 
> > LKET has on each trace event that LTT doesn't is the tid and CPU id of 
> > each event.  We find this extremely useful for post-processing.  Also, 
> > why have the event_size on every event taken?  Why not describe the 
> > event during the trace header and remove this redundant information from 
> > the event header and save some trace file space.
> > 
>
> A standard event header has to have only crucial information, nothing more, or
> it becomes bloated and quickly grow trace size. We decided not to put tid and
> CPU id in the event header because tid is already available with the schedchange
> events at post-processing time and CPU id is already available too, as we have
> per CPU buffers.
>   

We still keep the CPU id because LKET still supports ASCII tracing, which 
mixes the output of all the CPUs together.  It is still debatable 
whether this is a useful feature or not though.  If we remove ASCII 
event tracing from LKET, we could remove CPU id from the event header as 
well.

The tid we still include because LKET supports turning on individual 
tracepoints unlike LTT, which if I remember correctly turns on all the 
tracepoints that are compiled into the running kernel.  Since the user is 
free to choose which tracepoints he wants to use for his workload, we 
cannot guarantee that scheduler tracepoints are going to be available.  We 
consider the tid to be one of those absolute minimum pieces of data 
required to do meaningful analysis.

We chose to control performance and trace output size by letting users 
control the number of tracepoints they activate at any given time.  
This is important to us since we plan to add many dynamic tracepoints to 
different sub-systems (filesystem, device drivers, core kernel 
facilities, etc...).  Turning on all of these tracepoints at the same 
time would slow down the system too much and change the performance 
characteristics of the environment being studied.
> The event size is completely unnecessary, but in reality very, very useful to
> authenticate the correspondance between the size of the data recorded by the
> kernel and the size of data the viewer thinks it is reading. Think of it as a
> consistency check between kernel and viewer algorithms.
>   

I understand.  But if the size of each event is fixed, why would you 
expect the data sizes that the tool reports in the trace header for each 
event to change over the course of a trace?  If the data on the per-CPU 
buffers is serialized, a similar authentication could be done using the 
timestamp by checking the timestamps of the events before and after the 
current event, thus validating the current timestamp as well as the size 
offset of the previous event.  Just a thought.

-JRS

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-19 15:05                                   ` Jose R. Santos
@ 2006-09-19 15:30                                     ` Mathieu Desnoyers
  2006-09-19 16:39                                       ` Jose R. Santos
  0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-19 15:30 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Mathieu Desnoyers wrote:
> >A standard event header has to have only crucial information, nothing 
> >more, or
> >it becomes bloated and quickly grow trace size. We decided not to put tid 
> >and
> >CPU id in the event header because tid is already available with the 
> >schedchange
> >events at post-processing time and CPU id is already available too, as we 
> >have
> >per CPU buffers.
> >  
> 
> We still keep the CPU id because LKET still support ASCII tracing which 
> mixes the output of all the CPUs together.  It is still debatable 
> whether this is a useful feature or not though.  If we remove ASCII 
> event tracing from LKET, we could remove CPU id from the event header as 
> well.
> 

How hard would it be to make LKET send its ASCII output to multiple "channels"
(buffers) and then fetch and combine them in user space ? Have a look at lttd
and lttv in the ltt-control package from the LTTng project : it would be
trivial to adapt. In fact, there is already a text dump module available.
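
To give an idea of what "combine them in user space" amounts to, here is a
small sketch in C of merging per-CPU channels by timestamp. This is not the
lttd/lttv code; the types and names are made up for the example, and it assumes
the timestamps are comparable across CPUs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical merged record; cpu is implied by the source channel. */
struct record {
    uint64_t tsc;
    int cpu;
};

/* One per-CPU channel, already sorted by timestamp. */
struct channel {
    const uint64_t *tsc; /* timestamps in recording order */
    size_t len;
    size_t pos;          /* read cursor, starts at 0 */
};

/*
 * Merge ncpu channels into out[] in global timestamp order.
 * Returns the number of records written (at most out_cap).
 */
size_t merge_channels(struct channel *ch, int ncpu,
                      struct record *out, size_t out_cap)
{
    size_t n = 0;

    while (n < out_cap) {
        int best = -1;

        /* Pick the channel whose next event has the smallest timestamp. */
        for (int c = 0; c < ncpu; c++) {
            if (ch[c].pos >= ch[c].len)
                continue;
            if (best < 0 ||
                ch[c].tsc[ch[c].pos] < ch[best].tsc[ch[best].pos])
                best = c;
        }
        if (best < 0)
            break; /* all channels drained */

        out[n].tsc = ch[best].tsc[ch[best].pos++];
        out[n].cpu = best;
        n++;
    }
    return n;
}
```

Since the CPU number falls out of the channel index during the merge, the text
output can still print it without it being stored in every event header.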

> The tid we still include because LKET supports turning on individual 
> tracepoints unlike LTT, which if I remember correctly turns on all the 
> tracepoint that are compiled into the running kernel.  Since the user is 
> free to chose which tracepoints he wants to use for his workload, we can 
> not guarantee that scheduler tracepoints are going to be available.  We 
> consider taking the tid as one of those absolute minimum pieces of data 
> required to do meaningful analysis.
> 

I understand, but it does not have to be included in the bare-bones event
header. We could think of an optional "event context" header that would have its
individual parts enabled or not depending on the events recorded in the trace.
For instance :

With scheduler instrumentation activated :

Event Header  |  Variable data

Without scheduler instrumentation activated :

Event Header  |  PID  |  Variable data

Whether or not the optional event context is present in the trace could be
recorded in the trace header.

This way, we would avoid adding data when it is not needed. Furthermore, this
is extensible to other event context information.
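
A minimal sketch of how a reader could handle such an optional context, in C.
The flag names, field widths and struct are hypothetical, chosen only to show
the idea of describing the context once in the trace header:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context flags, stored once in the trace header. */
#define CTX_PID (1u << 0)
#define CTX_CPU (1u << 1) /* example of a further extension */

struct trace_header {
    uint32_t ctx_flags; /* optional fields that follow each event header */
};

/* Size in bytes of the optional context block for this trace. */
size_t ctx_size(const struct trace_header *th)
{
    size_t sz = 0;

    if (th->ctx_flags & CTX_PID)
        sz += sizeof(uint32_t);
    if (th->ctx_flags & CTX_CPU)
        sz += sizeof(uint16_t);
    return sz;
}

/* Offset of the variable data within one event record. */
size_t payload_offset(const struct trace_header *th, size_t event_header_size)
{
    return event_header_size + ctx_size(th);
}
```

With scheduler instrumentation active, ctx_flags would be 0 and the variable
data starts right after the event header; without it, CTX_PID is set and every
record carries the extra field.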

> We chose to control performance and trace output size by letting users 
> have control of number of tracepoint he can activate at any given time.  
> This is important to us since we plan to add many dynamic tracepoints to 
> different sub-systems (filesystem, device drivers, core kernel 
> facilities, etc...).  Turning on all of these tracepoint at the same 
> time would slow down the system to much and change the performance 
> characteristics of the environment being studied.

Yes, I know that overhead is a big problem with dynamic instrumentation ;) I
think we can find a way to have an optimal trace format while still giving
a dynamic probe based tracer enough context when needed.

> >The event size is completely unnecessary, but in reality very, very useful 
> >to
> >authenticate the correspondance between the size of the data recorded by 
> >the
> >kernel and the size of data the viewer thinks it is reading. Think of it 
> >as a
> >consistency check between kernel and viewer algorithms.
> >  
> 
> I understand.  But if the size of each event is fixed, why would you 
> expect the data sizes that the tool reports in the trace header for each 
> event to change over the course of a trace.  If the data on the per-CPU 
> buffers is serialized, a similar authentication could be done using the 
> timestamp by checking the timestamps of the events before and after the 
> current event, thus validating the current timestamp as well as the size 
> offset of the previous event.  Just a thought.
> 

Yes, but if there is a bug with the timestamp (time going backward because of
problematic event record serialization), it becomes harder to pinpoint the
source of the problem (if it is due to a bug in the variable data serialization
mechanism, a bug in the user space "unserialization" mechanism or a bug in event
serialization within the kernel). LTTng hasn't suffered from this kind of issue
for quite some time, but when under heavy development, those indicators of data
consistency have all proven their usefulness.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-19 15:30                                     ` Mathieu Desnoyers
@ 2006-09-19 16:39                                       ` Jose R. Santos
  2006-09-19 18:03                                         ` Mathieu Desnoyers
  0 siblings, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-19 16:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> > We still keep the CPU id because LKET still support ASCII tracing which 
> > mixes the output of all the CPUs together.  It is still debatable 
> > whether this is a useful feature or not though.  If we remove ASCII 
> > event tracing from LKET, we could remove CPU id from the event header as 
> > well.
> > 
>
> How hard would it be to make LKET send its ASCII output to multiple "channels"
> (buffers) and then fetch and combine them in user space ? Have a look at lttd
> and lttv in the ltt-control package from the LTTng project : it would be
> trivial to adapt. In fact, there is already a text dump module available.
>   

Actually, ASCII trace should output to multiple channels if we use bulk 
mode.  The original idea for keeping ASCII trace (this was the original 
output mechanism) was that a user may have wanted to look at trace 
output information in real-time as it was being printed onto the screen 
(which requires merging all the output channels).  Again, I question the 
usability of this feature and if a user really wanted to look at ASCII 
trace data in real time, a better solution would be for the lket-b2a 
conversion tool to have a mode where it could print the output of 
constantly changing trace buffers to the screen.  The ASCII output mode 
in LKET is cryptic and having lket-b2a do this would perform better and 
produce prettier output while also reducing the trace file output size a 
bit.
> > The tid we still include because LKET supports turning on individual 
> > tracepoints unlike LTT, which if I remember correctly turns on all the 
> > tracepoint that are compiled into the running kernel.  Since the user is 
> > free to chose which tracepoints he wants to use for his workload, we can 
> > not guarantee that scheduler tracepoints are going to be available.  We 
> > consider taking the tid as one of those absolute minimum pieces of data 
> > required to do meaningful analysis.
> > 
>
> I understand, but it does not have to be included in the bare-boned event
> header. We could think of an optional "event context" header that would have its
> individual parts enabled or not depending on the events recorded in the trace.
> For instance :
>
> With scheduler instrumentation activated :
>
> Event Header  |  Variable data
>
> Without scheduler instrumentation activated :
>
> Event Header  |  PID  |  Variable data
>
> The information about whether or not the optional event context is present in
> the trace or not could be saved in the trace header.
>
> This way, we could not add unnecessary data when it is not needed. And
> furthermore, this is extensible for other event context information.
>   
That's also possible, and it should not be difficult to implement. 
> > We chose to control performance and trace output size by letting users 
> > have control of number of tracepoint he can activate at any given time.  
> > This is important to us since we plan to add many dynamic tracepoints to 
> > different sub-systems (filesystem, device drivers, core kernel 
> > facilities, etc...).  Turning on all of these tracepoint at the same 
> > time would slow down the system to much and change the performance 
> > characteristics of the environment being studied.
>
> Yes, I know that overhead is a big problem with dynamic instrumentation ;) I
> think we can find a way to both have an optimal trace format while giving
> a dynamic probe based tracer enough context when needed.
>   

Actually, we started doing this six years ago on our internal *static* 
trace tool before we started implementing event tracing using 
SystemTap.  Regardless of whether the tool uses static or dynamic 
probes, if the problem only requires 3 tracepoints to figure out, why 
would you want to activate 50+ hooks?
>
> > I understand.  But if the size of each event is fixed, why would you 
> > expect the data sizes that the tool reports in the trace header for each 
> > event to change over the course of a trace.  If the data on the per-CPU 
> > buffers is serialized, a similar authentication could be done using the 
> > timestamp by checking the timestamps of the events before and after the 
> > current event, thus validating the current timestamp as well as the size 
> > offset of the previous event.  Just a thought.
> > 
>
> Yes, but if there is a bug with the timestamp (time going backward because of
> problematic event record serialization), it becomes harder to pinpoint the
> source of the problem (if it is due to a bug in the variable data serialization
> mechanism, a bug in the user space "unserialization" mechanism or a bug in event
> serialization within the kernel). LTTng hasn't suffered of this kind of issue
> for quite some time, but when under heavy development, those indicators of data
> consistency have all proven their usefulness.
>
>   
Looks like the example you propose above could apply to this as 
well.  You could implement some sort of debug mode for the trace data 
that provides extra information useful for debugging the tool.  If the 
information is really only useful when debugging the trace tool during 
development, wouldn't it make sense to have a way to disable debugging 
junk as needed?

-JRS

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-19 16:39                                       ` Jose R. Santos
@ 2006-09-19 18:03                                         ` Mathieu Desnoyers
  0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-19 18:03 UTC (permalink / raw)
  To: Jose R. Santos
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Look like the example you propose above could also apply to this as 
> well.  You could implement some sort of debug mode to the trace data 
> that provides extra information useful for debugging the tool.  If the 
> information is really only useful when debugging the trace tool during 
> development,  wouldn't it make sense to have a way to disable debugging 
> junk as needed?
> 

You are absolutely right.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 12:38                     ` Alan Cox
  2006-09-15 12:39                       ` Roman Zippel
@ 2006-09-15 17:45                       ` Andrew Morton
  2006-09-15 18:16                         ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:45 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 13:38:58 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> gcc -g produces extensive annotations which are then usably by many
> tools other than gdb.

This is something I'm curious about.  AFAICT there are two(*) reasons for
wanting static tracepoints:

a) to be able to get at local variables and

b) as a "marker" somewhere within the body of a function - the
   expectation here is that identifying that particular spot in the
   function would be hard without some marker which moves around as the
   function itself is modified over time.

If a) is true, then isn't this simply a feature request against the
systemtap infrastructure?  There's no reason per-se why a kprobe point
cannot access locals, using the dwarf debug info.  It'll be somewhat
unreliable, because stack slots and registers go out of scope and get
reused for other things.  But as any gdb user will know, it's still
useful.

As for b), if it was _really_ an advantage to be able to identify
particular places within the body of a function then one could concoct a
macro which inserts some info into a separate elf section and which adds no
code at all to actual .text.
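
A rough user-space sketch of that idea, in C. It assumes GCC or Clang on an
ELF target linked with GNU ld, which defines __start_<section> and
__stop_<section> symbols for sections whose names are valid C identifiers. The
macro, struct and section names are invented for this example, and it records
only name/file/line; recording the exact instruction address would need
more work (e.g. an asm label) and is left out.

```c
#include <stddef.h>

/* Hypothetical marker descriptor: where a point of interest lives. */
struct marker {
    const char *name;
    const char *file;
    int line;
};

/*
 * Place a pointer to a static descriptor in a dedicated ELF section.
 * Nothing is executed at the marked spot; only data is emitted.
 * Pointers (not structs) go into the section so that the entries are
 * packed without compiler-chosen padding between them.
 */
#define MARK(label)                                                    \
    do {                                                               \
        static const struct marker mark_desc =                         \
            { label, __FILE__, __LINE__ };                             \
        static const struct marker *mark_ptr                           \
            __attribute__((section("trace_markers"), used)) =          \
            &mark_desc;                                                \
    } while (0)

/* Section bounds provided by the linker (GNU ld behaviour). */
extern const struct marker *__start_trace_markers[];
extern const struct marker *__stop_trace_markers[];

/* Count the markers compiled into this program. */
size_t marker_count(void)
{
    return (size_t)(__stop_trace_markers - __start_trace_markers);
}

/* Example instrumented function (hypothetical). */
int do_work(int x)
{
    MARK("do_work_entry");
    x *= 2;
    MARK("do_work_after_double");
    return x;
}
```

A tool could then walk the entries between the two linker symbols to list the
marked locations without the marked function paying anything at run time.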

Although IMO this is a bit lame - it is quite possible to go into
SexySystemTapGUI, click on a particular kernel file-n-line and have
systemtap userspace keep track of that place in the kernel source across
many kernel versions: all it needs to do is to remember the file+line and a
snippet of the surrounding text, for readjustment purposes.

(*) I don't buy the performance arguments: kprobes are quick, and I'd
expect that the CPU consumption of the destination of the probe is
comparable to or higher than the cost of taking the initial trap.

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 17:45                       ` Andrew Morton
@ 2006-09-15 18:16                         ` Karim Yaghmour
  2006-09-15 19:20                           ` Jose R. Santos
  2006-09-15 19:59                           ` Andrew Morton
  0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 18:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais


Andrew Morton wrote:
> This is something I'm curious about.  AFAICT there are two(*) reasons for
> wanting static tracepoints:
> 
> a) to be able to get at local variables and
> 
> b) as a "marker" somewhere within the body of a function - the
>    expectation here is that identifiying that particular spot in the
>    function would be hard without some marker which moves around as the
>    functions itself is modified over time.
> 
> 
> If a) is true, then isn't this simply a feature request against the
> systemtap infrastructure?  There's no reason per-se why a kprobe point
> cannot access locals, using the dwarf debug info.  It'll be somewhat
> unreliable, because stack slots and registers go out of scope and get
> reused for other things.  But as any gdb user will know, it's still
> useful.

I believe this has been addressed by Frank in his other email, so I'll
skip.

> As for b), if it was _really_ an advantage to be able to identify
> particular places within the body of a function then one could concoct a
> macro which inserts some info into a separate elf section and which adds no
> code at all to actual .text.

Yes, and this specific suggestion has been made a number of times.
Though, then, this is an implementation debate and there are a number
of things which could be made available as build-time options. The
emerging consensus in this thread, however, is that there is a clear
need for a way of statically marking up important events, and this
point has been emphasized both by those who have maintained
infrastructure based on "static" tracepoints and those maintaining
such infrastructure based on "dynamic" tracepoints.

> Although IMO this is a bit lame - it is quite possible to go into
> SexySystemTapGUI, click on a particular kernel file-n-line and have
> systemtap userspace keep track of that place in the kernel source across
> many kernel versions: all it needs to do is to remember the file+line and a
> snippet of the surrounding text, for readjustment purposes.

Sure, if you're a kernel developer, but as I've explained numerous
times in this thread, there are far more users of tracing than
kernel developers.

> (*) I don't buy the performance arguments: kprobes are quick, and I'd
> expect that the CPU consumption of the destination of the probe is
> comparable to or higher than the cost of taking the initial trap.

Please see Mathieu's earlier posting of numbers comparing kprobes to
static points. Nevertheless, I do not believe that the use of kprobes
should be pitted against static instrumentation, the two are
orthogonal.

Karim


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                         ` Karim Yaghmour
@ 2006-09-15 19:20                           ` Jose R. Santos
  2006-09-15 19:59                           ` Andrew Morton
  1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 19:20 UTC (permalink / raw)
  To: karim
  Cc: Andrew Morton, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais

Karim Yaghmour wrote:
> > Although IMO this is a bit lame - it is quite possible to go into
> > SexySystemTapGUI, click on a particular kernel file-n-line and have
> > systemtap userspace keep track of that place in the kernel source across
> > many kernel versions: all it needs to do is to remember the file+line and a
> > snippet of the surrounding text, for readjustment purposes.
>
> Sure, if you're a kernel developer, but as I've explained numberous
> times in this thread, there are far more many users of tracing than
> kernel developers.
>   

This is so true (and the main reason we implemented a trace utility in 
SystemTap).

Several of the people who work on my team are _not_ kernel 
developers.  They do not necessarily know the Linux kernel code well 
enough to insert their own instrumentation.  On the other hand, they do 
possess very good knowledge of things specific to a particular 
software stack or a HW subsystem.  Structured predefined probe points 
(dynamic or static) allow people with limited kernel hacking skills to 
feed useful information back to developers of the kernel.

I agree with Karim that a trace tool (while useful to developers) is 
mostly targeted at a non kernel developer audience.  Such tools are 
mostly meant to enhance the communication between developers and regular 
users.  Any solution that is intended to be a dynamic replacement for 
LTTng needs to take these kinds of users into account.

-JRS

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 18:16                         ` Karim Yaghmour
  2006-09-15 19:20                           ` Jose R. Santos
@ 2006-09-15 19:59                           ` Andrew Morton
  2006-09-15 20:24                             ` Karim Yaghmour
  1 sibling, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 19:59 UTC (permalink / raw)
  To: karim
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 14:16:18 -0400
Karim Yaghmour <karim@opersys.com> wrote:

> > Although IMO this is a bit lame - it is quite possible to go into
> > SexySystemTapGUI, click on a particular kernel file-n-line and have
> > systemtap userspace keep track of that place in the kernel source across
> > many kernel versions: all it needs to do is to remember the file+line and a
> > snippet of the surrounding text, for readjustment purposes.
> 
> Sure, if you're a kernel developer, but as I've explained numerous
> times in this thread, there are far more users of tracing than
> kernel developers.

Disagree.  I was describing a means by which a set of systemtap trace
points could be described.  A means which would allow those tracepoints to
be maintained without human intervention as the kernel source changes. 
(ie: use a similar algorithm and representation as patch(1)).

Presumably those tracepoints would have been provided by a kernel developer
and delivered to non-developers, just like static tracepoints.
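
A minimal sketch of the readjustment idea, under stated assumptions: it is
not anything SystemTap is known to implement. It assumes the tool remembered
a line number and a snippet of text from the probed line, and that the new
source is available as an array of lines. relocate_probe() and its "fuzz"
parameter are hypothetical names; the search simply widens outwards from the
remembered line, loosely in the spirit of patch(1):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find where a remembered probe location moved to.
 *   lines, nlines: the new version of the source file, one string per line
 *   hint:          0-based line number recorded against the old source
 *   snippet:       text remembered from the probed line
 *   fuzz:          how many lines either side of the hint to search
 * Returns the 0-based line number of the nearest match, or -1 if none.
 */
static long relocate_probe(const char *const *lines, size_t nlines,
                           size_t hint, const char *snippet, size_t fuzz)
{
    size_t d;

    assert(lines != NULL && snippet != NULL);

    for (d = 0; d <= fuzz; d++) {
        /* Try hint + d first, then hint - d, widening outwards. */
        if (hint + d < nlines && strstr(lines[hint + d], snippet))
            return (long)(hint + d);
        if (d <= hint && hint - d < nlines &&
            strstr(lines[hint - d], snippet))
            return (long)(hint - d);
    }
    return -1;
}
```

A probe recorded at line 1 whose text now sits at line 3 is found with a
fuzz of 3 and reported as lost with a fuzz of 1. A real tool would need to
match several lines of context rather than one substring.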

> > (*) I don't buy the performance arguments: kprobes are quick, and I'd
> > expect that the CPU consumption of the destination of the probe is
> > comparable to or higher than the cost of taking the initial trap.
> 
> Please see Mathieu's earlier posting of numbers comparing kprobes to
> static points. Nevertheless, I do not believe that the use of kprobes
> should be pitted against static instrumentation, the two are
> orthogonal.

People have been speeding up kprobes in recent kernels, to avoid the int3
overhead.  I don't recall seeing how effective that has been.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 19:59                           ` Andrew Morton
@ 2006-09-15 20:24                             ` Karim Yaghmour
  2006-09-15 20:25                               ` Thomas Gleixner
  0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 20:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
	linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais


Andrew Morton wrote:
> People have been speeding up kprobes in recent kernels, to avoid the int3
> overhead.  I don't recall seeing how effective that has been.

I don't want to microdebate this one, but here's the quote from Frank
on the topic of djprobe:
> Smart teams from IBM and Hitachi have been hammering away at this code
> for a year or two now, and yet (roughly) here we are.  There have been
> experiments involving plopping branches instead of int3's at probe
> locations, but this is self-modifying code involving multiple
> instructions, and appears to be tricky on SMP/preempt boxes.

The idea behind this mechanism is neat. But every step along the way
there seem to be ever more complex corner cases where it can't be
used.

Should this mechanism ever be made to work, the need for static
markup would still be felt however.

Karim


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15 20:24                             ` Karim Yaghmour
@ 2006-09-15 20:25                               ` Thomas Gleixner
  0 siblings, 0 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 20:25 UTC (permalink / raw)
  To: karim
  Cc: Andrew Morton, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
	Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
	Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 16:24 -0400, Karim Yaghmour wrote:
> Should this mechanism ever be made to work, the need for static
> markup would still be felt however.

This might apply to some exotic points, but for 98% of the
instrumentation scenarios static markup is not necessary.

	tglx



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 18:15             ` Ingo Molnar
                                 ` (2 preceding siblings ...)
  2006-09-14 19:40               ` Tim Bird
@ 2006-09-14 19:47               ` Roman Zippel
  2006-09-14 20:24                 ` Ingo Molnar
  3 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 19:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> > > for me these are all _independent_ grounds for rejection, as a generic 
> > > kernel infrastructure.
> > 
> > Tracepoints of course need to be managed, but that's true for both 
> > dynamic and static tracepoints. [...]
> 
> that's not true, and this is the important thing that i believe you are 
> missing. A dynamic tracepoint is _detached_ from the normal source code 
> and thus is zero maintenance overhead. You don't have to maintain it 
> during normal development - only if you need it. You don't see the 
> dynamic tracepoints in the source code.
> 
> a static tracepoint, once it's in the mainline kernel, is a nonzero 
> maintenance overhead _until eternity_.

I hope you do realize that this is a rather selfish point of view. The zero 
maintenance overhead is a myth; it only looks like zero because _you_ 
don't have to do it.
OTOH maintaining the trace points along with the corresponding source is 
barely noticeable noise and is certainly less work than having to 
maintain them separately.

> It is a constant visual 
> hindrance and a constant build-correctness and boot-correctness problem 
> if you happen to change the code that is being traced by a static 
> tracepoint. Again, I am talking out of actual experience with static 
> tracepoints: i frequently break my kernel via static tracepoints and i 
> have constant maintenance cost from them.

Sorry, but you're not the only one with actual experience and in my 
experience the value far outweighs the occasional need for adjustments. If 
you don't use them, they are of course a nuisance, but is your personal 
dislike really reason enough to deny others a useful tool?

> i am giving a line by line rebuttal of all arguments that come up. 
> Please be fair and do the same. Here are the arguments again, for a 
> third time. Thanks!

Ingo, maybe you should try to understand the point I'm trying to make?
You mostly emphasize your personal dislike of static tracepoints.

> > > also, the other disadvantages i listed very much count too. Static 
> > > tracepoints are fundamentally limited because:
> > > 
> > >   - they can only be added at the source code level
> > > 
> > >   - modifying them requires a reboot which is not practical in a
> > >     production environment
> > > 
> > >   - there can only be a limited set of them, while many problems need
> > >     finegrained tracepoints tailored to the problem at hand
> > > 
> > >   - conditional tracepoints are typically either nonexistent or very
> > >     limited.

Sorry, but I fail to see the point you're trying to make (besides your 
personal preferences). None of this is an unsolvable problem that would 
prevent making good use of static tracepoints.

> > > the kprobes infrastructure, despite being fairly young, is widely 
> > > available: powerpc, i386, x86_64, ia64 and sparc64. The other 
> > > architectures are free to implement them too, there's nothing 
> > > hardware-specific about kprobes and the "porting overhead" is in 
> > > essence a one-time cost - while for static tracepoints the 
> > > maintenance overhead goes on forever and scales linearly with the 
> > > number of tracepoints added.
> > 
> > kprobes are not trivial to implement [...]
> 
> nor are smp-alternatives, which was suggested as a solution to reduce 
> the overhead of static tracepoints. So what's the point? It's a one-off 
> development overhead that has already been done for all the major 
> arches. If another arch needs it they can certainly implement it.

Static tracepoints don't have to be implemented via alternatives and
you continue to ignore that kprobes are nontrivial, you continue to ignore 
that both can coexist just fine. You just want to force your personal 
preferences onto others. :-(

> it's like arguing against ptrace on the grounds of: "application 
> developers can add printf if they want to debug their apps, or they can 
> add static tracepoints too, and besides, ptrace is hard to implement".

Sorry, I don't understand this point. Ptrace support would match kernel 
gdb support, which would be a completely different discussion...

> > I also think you highly exaggerate the maintenance overhead of static 
> > tracepoints, once added they hardly need any maintenance, most of the 
> > time you can just ignore them. [...]
> 
> hundreds (or possibly thousands) of tracepoints? Have you ever tried to 
> maintain that? I have and it's a nightmare.

_This_ discussion is about a core set of trace points! Yes, you can have 
thousands of trace points in drivers, but they don't have to be enabled by 
default and are no reason at all against a few core trace points, which 
can be used by _all_ archs to trace core events as _cheaply_ as possible.

> Even assuming a rich set of hundreds of static tracepoints, it doesnt 
> even solve the problems at hand: people want to do much more when they 
> probe the kernel - and today, with DTrace under Solaris people _know_ 
> that much better tracing _can be done_, and they _demand_ that Linux 
> adopts an intelligent solution. The clock is ticking for dinosaurs like 
> static printks and static tracepoints to debug the kernel...

Huh? How exactly do static tracepoints prevent you from doing this?
Different problems require different solutions, nobody is taking Kprobes 
away, but why should Kprobes be the only solution?

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 19:47               ` Roman Zippel
@ 2006-09-14 20:24                 ` Ingo Molnar
  2006-09-14 20:54                   ` Roman Zippel
  2006-09-15  1:47                   ` Mathieu Desnoyers
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 20:24 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > > also, the other disadvantages i listed very much count too. Static 
> > > > tracepoints are fundamentally limited because:
> > > > 
> > > >   - they can only be added at the source code level
> > > > 
> > > >   - modifying them requires a reboot which is not practical in a
> > > >     production environment
> > > > 
> > > >   - there can only be a limited set of them, while many problems need
> > > >     finegrained tracepoints tailored to the problem at hand
> > > > 
> > > >   - conditional tracepoints are typically either nonexistent or very
> > > >     limited.
> 
> Sorry, but I fail to see the point you're trying to make (besides your 
> personal preferences), none of this is an unsolvable problem, which 
> would prevent making good use of static tracepoints.

those are technical arguments - i'm not sure how you can understand them 
to be "personal preferences". The only personal preference i have is 
that in the end the technically superior solution should be merged 
(be that one project or the other, or a hybrid of the two). The analysis 
of which one is a better solution depends on pros and cons - exactly 
like the ones listed above. If they are solvable problems then please 
let me know how you would solve them and when you (or others) would 
solve them, preferably before merging the code. Right now they are 
pretty heavy cons as far as LTT goes, so obviously they have a primary 
impact on the topic at hand (which is whether to merge LTT or not).

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:24                 ` Ingo Molnar
@ 2006-09-14 20:54                   ` Roman Zippel
  2006-09-14 21:08                     ` Daniel Walker
  2006-09-15  1:47                   ` Mathieu Desnoyers
  1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 20:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> those are technical arguments - i'm not sure how you can understand them 
> to be "personal preferences". The only personal preference i have is 
> that in the end a technically most superior solution should be merged. 

Ingo, so far you have made not a single argument why they can't coexist 
except for your personal dislike.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:54                   ` Roman Zippel
@ 2006-09-14 21:08                     ` Daniel Walker
  2006-09-14 21:30                       ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Daniel Walker @ 2006-09-14 21:08 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

On Thu, 2006-09-14 at 22:54 +0200, Roman Zippel wrote:
> Hi,
> 
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
> 
> > those are technical arguments - i'm not sure how you can understand them 
> > to be "personal preferences". The only personal preference i have is 
> > that in the end a technically most superior solution should be merged. 
> 
> Ingo, so far you have made not a single argument why they can't coexist 
> except for your personal dislike.

Not to put too fine a point on it, but I think there's not a small number
of us that "prefer" the best solution.

Daniel


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 21:08                     ` Daniel Walker
@ 2006-09-14 21:30                       ` Roman Zippel
  2006-09-14 22:15                         ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:30 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Daniel Walker wrote:

> > Ingo, so far you have made not a single argument why they can't coexist 
> > except for your personal dislike.
> 
> Not to put too fine a point on it, but I think there's not a small number
> of us that "prefer" the best solution.

You can have it.
OTOH I would also like to know what's going on in my m68k kernel without 
having to implement some rather complex infrastructure, which I don't need 
otherwise. There hasn't been a single argument so far, why we can't have 
both.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 21:30                       ` Roman Zippel
@ 2006-09-14 22:15                         ` Ingo Molnar
  2006-09-14 23:39                           ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:15 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Thu, 14 Sep 2006, Daniel Walker wrote:
> 
> > > Ingo, so far you have made not a single argument why they can't coexist 
> > > except for your personal dislike.
> > 
> > Not to put too fine a point on it, but I think there's not a small number
> > of us that "prefer" the best solution.
> 
> You can have it.
> OTOH I would also like to know what's going on in my m68k kernel without 
> having to implement some rather complex infrastructure, which I don't 
> need otherwise. There hasn't been a single argument so far, why we 
> can't have both.

the argument is very simple: LTT creates strong coupling, it is almost a 
set of 350+ system-calls, moved into the heart of the kernel. Once moved 
in, it's very hard to remove it. "Why did you remove that trace 
information, you broke my LTT script!"

While with SystemTap the coupling is a lot smaller. With dynamic tracing 
there's no _fundamental requirement_ for _any_ tracepoint to be in the 
source code, hence we have the present and future flexibility to 
eliminate most of them. So my point is: shape all the static tracepoints 
in a "provide data to dynamic tracers" way. If they are removed (which 
we should have the freedom to do), the removal is not a showstopper.

Flexibility of future choices, especially for user/developer-visible 
features, is one of the most important factors of kernel maintenance.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:15                         ` Ingo Molnar
@ 2006-09-14 23:39                           ` Roman Zippel
  2006-09-14 23:43                             ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 23:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > OTOH I would also like to know what's going on in my m68k kernel without 
> > having to implement some rather complex infrastructure, which I don't 
> > need otherwise. There hasn't been a single argument so far, why we 
> > can't have both.
> 
> the argument is very simple: LTT creates strong coupling, it is almost a 
> set of 350+ system-calls, moved into the heart of the kernel. Once moved 
> in, it's very hard to remove it. "Why did you remove that trace 
> information, you broke my LTT script!"

You are changing the topic. Nobody said the current LTT tracepoints have 
to be merged as is. You generalize from a work in progress to static trace 
points in general.

> While with SystemTap the coupling is a lot smaller.

What guarantees we don't have similar problems with dynamic tracepoints?
As soon as any tracing is merged, users will have some kind of expectation 
and thus you can expect "Why did you change this source? It broke my 
SystemTap script!" here as well.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 23:39                           ` Roman Zippel
@ 2006-09-14 23:43                             ` Ingo Molnar
  2006-09-15  0:27                               ` Roman Zippel
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 23:43 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > While with SystemTap the coupling is a lot smaller.
> 
> What guarantees we don't have similar problems with dynamic 
> tracepoints? As soon as any tracing is merged, users will have some 
> kind of expectation [...]

because users rely on the functionality, not on the implementation 
details. As i outlined it before: with dynamic tracers, static 
tracepoints _are not a necessity_. With static tracers, _static 
tracepoints are the only game in town_.

i outlined one such specific "removal of static tracepoint" example 
already: static trace points at the head/prologue of functions (half of 
the existing tracepoints are such). The sock_sendmsg() example i quoted 
before is such a case. Those trace points can be replaced with a simple 
GCC function attribute, which would cause a 5-byte (or whatever 
necessary) NOP to be inserted at the function prologue. The attribute 
would be a lot less invasive than an explicit tracepoint (and thus easier 
to maintain):

 int __trace function(char arg1, char arg2)
 {
 }

where kprobes can be used to attach a lightweight tracepoint that does a 
call, not a break (INT3) instruction. With static tracers we couldn't do 
this so we'd have to stick with the static tracepoints forever! It's 
always hard to remove features, so we have to make sure we add the 
feature that we know is the best long-term solution.
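
As a rough illustration only: GCC 8 and later (well after this thread), and
recent Clang, provide a patchable_function_entry attribute that reserves NOP
instructions at function entry. Mapping the __trace annotation above onto
that attribute is an assumption made here for the sketch, as are the count
of 5 and the function body; nothing in this thread defines __trace.

```c
#include <assert.h>

/*
 * Hypothetical mapping of __trace onto a real GCC/Clang attribute.
 * patchable_function_entry(5) asks the compiler to emit five NOP
 * instructions at function entry; fall back to nothing if unsupported.
 */
#if defined(__has_attribute)
# if __has_attribute(patchable_function_entry)
#  define __trace __attribute__((patchable_function_entry(5)))
# endif
#endif
#ifndef __trace
# define __trace
#endif

/* The annotated function: no explicit tracepoint appears in its body. */
static int __trace function(char arg1, char arg2)
{
    return arg1 + arg2;
}

/* The annotation must not change what the function computes. */
static void check_traced_call(void)
{
    assert(function(1, 2) == 3);
}
```

The reserved NOPs only make room for a tracer to patch in a call later; the
patching itself, and doing it safely on SMP, is not shown.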

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 23:43                             ` Ingo Molnar
@ 2006-09-15  0:27                               ` Roman Zippel
  0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

>  int __trace function(char arg1, char arg2)
>  {
>  }
> 
> where kprobes can be used to attach a lightweight tracepoint that does a 
> call, not a break (INT3) instruction. With static tracers we couldn't do 
> this so we'd have to stick with the static tracepoints forever! It's 
> always hard to remove features, so we have to make sure we add the 
> feature that we know is the best long-term solution.

Where is the proof for that? Why can't the same rules apply to dynamic and 
static trace points?
You're also mixing up function tracing with event tracing. Most of the LTT 
trace points log rather high level events, which are rather unlikely to 
disappear. It's more likely that the place where they are generated is 
moved, and then it's only advantageous if the marker is moved as well at 
the same time. OTOH if the actual event really is not generated anymore, 
there is also no need for the marker anymore.

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:24                 ` Ingo Molnar
  2006-09-14 20:54                   ` Roman Zippel
@ 2006-09-15  1:47                   ` Mathieu Desnoyers
  2006-09-15  5:47                     ` Vara Prasad
  1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15  1:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Roman Zippel <zippel@linux-m68k.org> wrote:
> 
> > > > > also, the other disadvantages i listed very much count too. Static 
> > > > > tracepoints are fundamentally limited because:
> > > > > 
[...]
> Right now they are 
> pretty heavy cons as far as LTT goes, so obviously they have a primary 
> impact on the topic at hand (which is whether to merge LTT or not).
> 

Ingo, why are you arguing about static instrumentation when I don't submit any
static instrumentation in my patch? You can argue about static vs. dynamic
instrumentation all you want, but please don't apply this debate to a decision
about whether to include a core tracing infrastructure that has nothing to do
with the way instrumentation or probes are inserted.

Mathieu


OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15  1:47                   ` Mathieu Desnoyers
@ 2006-09-15  5:47                     ` Vara Prasad
  0 siblings, 0 replies; 271+ messages in thread
From: Vara Prasad @ 2006-09-15  5:47 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Roman Zippel, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, systemtap

Mathieu Desnoyers wrote:

> * Ingo Molnar (mingo@elte.hu) wrote:
> > * Roman Zippel <zippel@linux-m68k.org> wrote:
> > > > > > also, the other disadvantages i listed very much count too. Static
> > > > > > tracepoints are fundamentally limited because:
> [...]
> > Right now they are
> > pretty heavy cons as far as LTT goes, so obviously they have a primary
> > impact on the topic at hand (which is whether to merge LTT or not).
>
> Ingo, why are you arguing about static instrumentation when I don't submit any
> static instrumentation in my patch? You can argue about static vs. dynamic
> instrumentation all you want, but please don't apply this debate to a decision
> about whether to include a core tracing infrastructure that has nothing to do
> with the way instrumentation or probes are inserted.
>
> Mathieu

I think Ingo is right in saying that what we really need first is a generic 
mechanism for specifying static markers in the kernel, which can be 
used to place dynamic probes on demand or as real static function 
calls if one chooses. Once we agree on the marker mechanism, dynamic 
tracing and static tracing can both co-exist happily.

Coming to the rest of your patches, I really don't think we need a whole 
lot more than the facilities we already have in the kernel. Frank 
successfully demonstrated at OLS how one can use static markers using 
only existing facilities in the kernel.


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:13         ` Ingo Molnar
  2006-09-14 17:55           ` Roman Zippel
@ 2006-09-14 18:12           ` Karim Yaghmour
  2006-09-14 20:25           ` Martin Bligh
  2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 18:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> also, the other disadvantages i listed very much count too. Static 
> tracepoints are fundamentally limited because:
> 
>   - they can only be added at the source code level

Non-issue. See below. This is actually a feature, as can be seen
by browsing the source code of various subsystems/filesystems/etc.
whose authors saw fit to include their own static tracepoints.
Darn, they must've been all misguided, so too were those who
reviewed the code and let it in.

>   - modifying them requires a reboot which is not practical in a
>     production environment

Non-issue. See below.

>   - there can only be a limited set of them, while many problems need
>     finegrained tracepoints tailored to the problem at hand

Non-issue. See below.

>   - conditional tracepoints are typically either nonexistent or very
>     limited.

I don't get this one. What's a "conditional tracepoint" for you?

> for me these are all _independent_ grounds for rejection, as a generic 
> kernel infrastructure.

I've addressed other issues in another posting, but I want to
reiterate something here that Roman said that keeps getting
forgotten:

There is no competition between static and dynamic trace points.
They are both useful and complementary. If some set of existing
static trace points are insufficient at runtime for you to
resolve an issue, nothing precludes you from using the dynamic
mechanisms for adding more localized instrumentation.

Side point: you may be a kernel god, but there are mere mortals
out there who use Linux. The point I've been making for years
now is that there are legitimate reasons why normal non-kernel-
developer users would benefit greatly from having access to
tools that generate digested information regarding key kernel
events. You can argue all you want about maintainability, and I
continue to think you're wrong, but you should know that the
development and usefulness of any such tools is gated by the
continued lack of a standard set of known-to-be-good sources of
key kernel events. And I repeat, the use of dynamic tracing does
*not* solve this issue.

At OLS2005 I had suggested the development of a markers infrastructure
that users could use just to mark up their code, with the decision
to tie such markers to a given type of instrumentation being kept
separate from the markers themselves. At OLS this
year a very good talk was given on this topic by Frank from the
systemtap team and it was very well received by the jam-packed
audience. IOW, while there used to be a time when people pitted
static instrumentation against dynamic instrumentation, there's
been an ever growing consensus that no such choice need be made.
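
To make the marker idea concrete, here is a minimal userspace sketch, not
the interface proposed at OLS or anything in LTTng or SystemTap. All names
(struct marker, MARK, marker_set_probe, sched_switch_marker) are
hypothetical. The marked site only names an event and passes a value; what
gets attached to it, if anything, is decided elsewhere:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe signature: marker name plus one integer payload. */
typedef void (*marker_probe_fn)(const char *name, long arg);

/* One descriptor per marked site; probe == NULL means nothing attached. */
struct marker {
    const char *name;
    marker_probe_fn probe;
};

/* The mark-up itself: a pointer test, and a call only when attached. */
#define MARK(m, arg)                                  \
    do {                                              \
        marker_probe_fn probe_ = (m)->probe;          \
        if (probe_)                                   \
            probe_((m)->name, (long)(arg));           \
    } while (0)

/* Attach or detach (fn == NULL) instrumentation; not SMP-safe as written. */
static void marker_set_probe(struct marker *m, marker_probe_fn fn)
{
    assert(m != NULL);
    m->probe = fn;
}

/* Example marked site and an example probe that just counts events. */
static struct marker sched_switch_marker = { "sched_switch", NULL };
static long events_seen;

static void count_probe(const char *name, long arg)
{
    (void)name;
    (void)arg;
    events_seen++;
}

static long do_switch(long next_pid)
{
    MARK(&sched_switch_marker, next_pid);
    return next_pid;
}
```

With no probe attached, do_switch() costs one pointer test; after
marker_set_probe(&sched_switch_marker, count_probe) each call bumps
events_seen. A dynamic tracer could equally use the marker's location as a
probe address and leave the pointer NULL.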

Thanks,

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:13         ` Ingo Molnar
  2006-09-14 17:55           ` Roman Zippel
  2006-09-14 18:12           ` Karim Yaghmour
@ 2006-09-14 20:25           ` Martin Bligh
  2006-09-14 20:34             ` Ingo Molnar
  2 siblings, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

> if there are lots of tracepoints (and the union of _all_ useful 
> tracepoints that i ever encountered in my life goes into the thousands) 
> then the overhead is not zero at all.
> 
> also, the other disadvantages i listed very much count too. Static 
> tracepoints are fundamentally limited because:
> 
>   - they can only be added at the source code level
> 
>   - modifying them requires a reboot which is not practical in a
>     production environment
> 
>   - there can only be a limited set of them, while many problems need
>     finegrained tracepoints tailored to the problem at hand
> 
>   - conditional tracepoints are typically either nonexistent or very
>     limited.
> 
> for me these are all _independent_ grounds for rejection, as a generic 
> kernel infrastructure.

I don't think anyone is saying that static tracepoints do not have their
limitations, or that dynamic tracepointing is useless. But that's not
the point ... why can't we have one infrastructure that supports both?
Preferably in a fairly simple, consistent way.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:25           ` Martin Bligh
@ 2006-09-14 20:34             ` Ingo Molnar
  2006-09-14 20:55               ` Martin Bligh
                                 ` (2 more replies)
  0 siblings, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 20:34 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche


* Martin Bligh <mbligh@mbligh.org> wrote:

> >if there are lots of tracepoints (and the union of _all_ useful 
> >tracepoints that i ever encountered in my life goes into the thousands) 
> >then the overhead is not zero at all.
> >
> >also, the other disadvantages i listed very much count too. Static 
> >tracepoints are fundamentally limited because:
> >
> >  - they can only be added at the source code level
> >
> >  - modifying them requires a reboot which is not practical in a
> >    production environment
> >
> >  - there can only be a limited set of them, while many problems need
> >    finegrained tracepoints tailored to the problem at hand
> >
> >  - conditional tracepoints are typically either nonexistent or very
> >    limited.
> >
> >for me these are all _independent_ grounds for rejection, as a generic 
> >kernel infrastructure.
> 
> I don't think anyone is saying that static tracepoints do not have 
> their limitations, or that dynamic tracepointing is useless. But 
> that's not the point ... why can't we have one infrastructure that 
> supports both? Preferably in a fairly simple, consistent way.

primarily because i fail to see any property of static tracers that is
not met by dynamic tracers. So to me dynamic tracers like SystemTap are 
a superset of static tracers.

So my position is that what we should concentrate on is to make the life 
of dynamic tracers easier (be that a handful of generic, parametric 
hooks that gather debuginfo information and add NOPs for easy patching), 
while realizing that static tracers have no advantage over dynamic 
tracers.

i.e. why add infrastructure for the sake of something that is clearly 
inferior? I have no problem with adding infrastructure for SystemTap, 
but i am asking the question: is it worth adding a static tracer?

I would of course accept static tracers too if someone proved that
they offer something that dynamic tracers cannot do.

(Just like i would accept the reintroduction of the Big Kernel Lock too, 
if someone proved that it's the right thing to do.)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:34             ` Ingo Molnar
@ 2006-09-14 20:55               ` Martin Bligh
  2006-09-14 21:31                 ` Ingo Molnar
  2006-09-19 12:08                 ` Christoph Hellwig
  2006-09-14 21:07               ` Roman Zippel
  2006-09-15  9:29               ` Jes Sorensen
  2 siblings, 2 replies; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 20:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

Ingo Molnar wrote:
> * Martin Bligh <mbligh@mbligh.org> wrote:
> 
> 
>>>if there are lots of tracepoints (and the union of _all_ useful 
>>>tracepoints that i ever encountered in my life goes into the thousands) 
>>>then the overhead is not zero at all.
>>>
>>>also, the other disadvantages i listed very much count too. Static 
>>>tracepoints are fundamentally limited because:
>>>
>>> - they can only be added at the source code level
>>>
>>> - modifying them requires a reboot which is not practical in a
>>>   production environment
>>>
>>> - there can only be a limited set of them, while many problems need
>>>   finegrained tracepoints tailored to the problem at hand
>>>
>>> - conditional tracepoints are typically either nonexistent or very
>>>   limited.
>>>
>>>for me these are all _independent_ grounds for rejection, as a generic 
>>>kernel infrastructure.
>>
>>I don't think anyone is saying that static tracepoints do not have 
>>their limitations, or that dynamic tracepointing is useless. But 
>>that's not the point ... why can't we have one infrastructure that 
>>supports both? Preferably in a fairly simple, consistent way.
> 
> 
> primarily because i fail to see any property of static tracers that is
> not met by dynamic tracers. So to me dynamic tracers like SystemTap are 
> a superset of static tracers.

1. They're harder to maintain out of tree.
2. they're written in some gibberish awk crap
3. They're slower. If you're doing thousands of tracepoints a second,
	into a circular 8GB log buffer, that *does* matter. You want
	to perturb what you're measuring as little as possible.

If you're running across thousands of systems, in live production, in
order to catch a rare race condition, the performance does matter.

> So my position is that what we should concentrate on is to make the life 
> of dynamic tracers easier (be that a handful of generic, parametric 
> hooks that gather debuginfo information and add NOPs for easy patching), 
> while realizing that static tracers have no advantage over dynamic 
> tracers.

I'm confused. You're saying that the dynamic tracers need help by
adding some static data to the kernel, and yet at the same time
rejecting static additions to the kernel on the grounds they have
no value???

Perhaps we're just meaning different things by static tracing. To me,
what is important is that there is a well-defined place in the source
code where the data needed to be logged, and the exact place to log
it at, is defined. If all that macro does to the compilation is add
a couple of nops, and make an entry in a symbol data, or other debug
data, for something to hook into later that's *fine*. The point is
to maintain the location and intelligence about *what* to trace.

Perhaps I'm calling that static, and you're calling it dynamic? Would
explain why we're disagreeing ;-) Seems to be exactly what you're
suggesting above?

If we want it to be superfast, we could compile with a different config 
option to insert some tracing statically in there or something, but I
agree it should not be the default.

> i.e. why add infrastructure for the sake of something that is clearly 
> inferior? I have no problem with adding infrastructure for SystemTap, 
> but i am asking the question: is it worth adding a static tracer?

Yes ;-) Realise that your usage model is not exactly the same as
everyone else's, and I don't give a damn if I have to recompile. I
realise other people do, but ....

> I would of course accept static tracers too if someone proved that
> they offer something that dynamic tracers cannot do.

Can you *really* trace *any* variable (stack variables, etc) at *any*
point within *any* function with kprobes? It didn't do that before,
and I find it hard to see how it could, given compiler optimizations,
etc.

> (Just like i would accept the reintroduction of the Big Kernel Lock too, 
> if someone proved that it's the right thing to do.)

Surely it's still there at the moment? ;-)

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:55               ` Martin Bligh
@ 2006-09-14 21:31                 ` Ingo Molnar
  2006-09-14 22:25                   ` Martin Bligh
  2006-09-19 12:08                 ` Christoph Hellwig
  1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 21:31 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

* Martin Bligh <mbligh@mbligh.org> wrote:

> > primarily because i fail to see any property of static tracers that 
> > is not met by dynamic tracers. So to me dynamic tracers like
> > SystemTap are a superset of static tracers.
> 
> 1. They're harder to maintain out of tree.

as i mentioned before, SystemTap should be in tree. Relayfs was added 
for the sake of SystemTap for example, i have no problem with moving 
SystemTap into the tree either.

> 2. they're written in some gibberish awk crap

You can write embedded-C SystemTap scripts too. There's an "EMBEDDED C" 
section in "man stap".

> 3. They're slower. If you're doing thousands of tracepoints a second,
> 	into a circular 8GB log buffer, that *does* matter. You want
> 	to perturb what you're measuring as little as possible.

i very much agree that they should become as fast as possible. So to 
rephrase the question: can we make dynamic tracepoints as fast (or 
nearly as fast) as static tracepoints? If yes, should we care about 
static tracers at all?

> >So my position is that what we should concentrate on is to make the life 
> >of dynamic tracers easier (be that a handful of generic, parametric 
> >hooks that gather debuginfo information and add NOPs for easy patching), 
> >while realizing that static tracers have no advantage over dynamic 
> >tracers.
> 
> I'm confused. You're saying that the dynamic tracers need help by 
> adding some static data to the kernel, and yet at the same time 
> rejecting static additions to the kernel on the grounds they have no 
> value???

no. I'm saying that dynamic tracers are fundamentally more advanced, and 
that _iff_ we are to add static info to the kernel we should add it _for 
the sole sake of speeding up dynamic tracers_. If static tracers can 
live off the same hooks then fine, but we should architect primarily for 
the needs of the dynamic tracers.

> Perhaps we're just meaning different things by static tracing. To me, 
> what is important is that there is a well-defined place in the source 
> code where the data needed to be logged, and the exact place to log it 
> at, is defined. If all that macro does to the compilation is add a 
> couple of nops, and make an entry in a symbol data, or other debug 
> data, for something to hook into later that's *fine*. The point is to 
> maintain the location and intelligence about *what* to trace.

ok. For me 'static tracepoints' are like the sort of stuff that LTT 
adds: funky function names littering the tree.

i see the point behind 'data extraction point' hooks mentioned by you as 
a compromise, which incidentally will also speed up dynamic tracepoints 
to the level of static tracepoints. But they should be very much 
constructed as data extraction points for the purposes of dynamic 
tracers. (which the LTT hooks currently are not)

> If we want it to be superfast, we could compile with a different 
> config option to insert some tracing statically in there or something, 
> but I agree it should not be the default.

for a dynamic tracer all that is needed is a 5-byte NOP (even on 
64-bit), and the availability of all the data. Maybe even a function 
call that can be patched out after bootup, with NOPs. But the current 
LTT stuff has lots of inlined crap that just bloats the kernel.

> >i.e. why add infrastructure for the sake of something that is clearly 
> >inferior? I have no problem with adding infrastructure for SystemTap, 
> >but i am asking the question: is it worth adding a static tracer?
> 
> Yes ;-) Realise that your usage model is not exactly the same as 
> everyone else's, and I don't give a damn if I have to recompile. I 
> realise other people do, but ....

So you don't care about recompiling: that's fine - but others care, so as
long as all your needs are met (which we are working on meeting :-) then 
we'll go for the solution that is better - instead of having some dual 
debugging infrastructure.

> > (Just like i would accept the reintroduction of the Big Kernel Lock
> >  too, if someone proved that it's the right thing to do.)
> 
> Surely it's still there at the moment? ;-)

no - at least for me it's the Big Kernel Semaphore ;-)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 21:31                 ` Ingo Molnar
@ 2006-09-14 22:25                   ` Martin Bligh
  2006-09-14 22:36                     ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 22:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

Ingo Molnar wrote:
> * Martin Bligh <mbligh@mbligh.org> wrote:
> 
> 
>>>primarily because i fail to see any property of static tracers that 
>>>is not met by dynamic tracers. So to me dynamic tracers like
>>>SystemTap are a superset of static tracers.
>>
>>1. They're harder to maintain out of tree.
> 
> as i mentioned before, SystemTap should be in tree. Relayfs was added 
> for the sake of SystemTap for example, i have no problem with moving 
> SystemTap into the tree either.

Right, but I'm not talking about the infrastructure, I'm talking about
the placement of the trace points, and the local variables they need
to access in order to get useful data.

>>2. they're written in some gibberish awk crap
> 
> You can write embedded-C SystemTap scripts too. There's an "EMBEDDED C" 
> section in "man stap".

OK, that helps - thanks. Will try to find some time to go back and look
again.

> 
>>3. They're slower. If you're doing thousands of tracepoints a second,
>>	into a circular 8GB log buffer, that *does* matter. You want
>>	to perturb what you're measuring as little as possible.
> 
> i very much agree that they should become as fast as possible. So to 
> rephrase the question: can we make dynamic tracepoints as fast (or 
> nearly as fast) as static tracepoints? If yes, should we care about 
> static tracers at all?

Depends how many nops you're willing to add, I guess. Anything, even
the static tracepoints really needs at least a branch to be useful,
IMHO. At least for what I've been doing with it, you need to stop
the data flow after a while (when the event you're interested in
happens, I'm using it like a flight data recorder, so we can go back
and do postmortem on what went wrong). I should imagine branch
prediction makes it very cheap on most modern CPUs, but don't have
hard data to hand.
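
A minimal userspace sketch of that flight-recorder pattern, for
illustration only. Every name in it (flight_recorder, fr_log, fr_freeze,
FR_SLOTS) is hypothetical and not taken from any patch in this thread.
Each trace point tests one flag, writes into a circular buffer while the
flag is set, and the flag is cleared once the event of interest has
happened, so the history leading up to it survives for the postmortem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FR_SLOTS 8u   /* hypothetical size; tiny so the wrap is easy to see */

/* The index is wrapped with a mask, which only works for powers of two. */
static_assert((FR_SLOTS & (FR_SLOTS - 1)) == 0, "FR_SLOTS must be a power of two");

struct flight_recorder {
	uint64_t slot[FR_SLOTS];
	size_t   next;      /* total number of events accepted so far */
	bool     enabled;   /* the one branch every trace point tests */
};

static inline void fr_log(struct flight_recorder *fr, uint64_t event)
{
	if (!fr->enabled)   /* tracing stopped: drop the event */
		return;
	/* Overwrite the oldest entry once the buffer has wrapped. */
	fr->slot[fr->next & (FR_SLOTS - 1)] = event;
	fr->next++;
}

/* Called when the interesting event fires; freezes the recorded history. */
static inline void fr_freeze(struct flight_recorder *fr)
{
	fr->enabled = false;
}
```

After fr_freeze() the buffer holds the last FR_SLOTS events seen before
the trigger, and further fr_log() calls cost only the flag test.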

OTOH, if you don't know in advance how big the tracing point is
(ie what it's having to do within there to log), you have a problem.
I believe the usual way kprobes/systemtap does this is to do a jump
out of line, which is significantly slower. If we could get a good
estimate on how large the trace point was *likely* to be, maybe we
could leave enough space in nop's inline? OTOH, if we do that a lot,
we end up increasing code size ....

So I suspect the correct compromise is to have macros that normally
are extremely non-invasive, either just entries in a data table (no
code impact) or that plus enough nops to do a jump (as I understand
it, you sometimes need the nops because it's not always possible to
relocate certain bits of code ... perhaps we can detect when?). But
it *will* be slower at trace time, because we're still jumping.
OTOH, if you want it to be fast, you recompile with the "I actually
need tracing to be superfast" option, and it leaves more space.
Seems to give the best of both worlds, as needed.

>>>So my position is that what we should concentrate on is to make the life 
>>>of dynamic tracers easier (be that a handful of generic, parametric 
>>>hooks that gather debuginfo information and add NOPs for easy patching), 
>>>while realizing that static tracers have no advantage over dynamic 
>>>tracers.
>>
>>I'm confused. You're saying that the dynamic tracers need help by 
>>adding some static data to the kernel, and yet at the same time 
>>rejecting static additions to the kernel on the grounds they have no 
>>value???
> 
> no. I'm saying that dynamic tracers are fundamentally more advanced, and 
> that _iff_ we are to add static info to the kernel we should add it _for 
> the sole sake of speeding up dynamic tracers_. If static tracers can 
> live off the same hooks then fine, but we should architect primarily for 
> the needs of the dynamic tracers.

OK. Not too fussed about the exact details ... would it be fair to say
that you agree that we may need to add *some* instrumentation / hooks
into the codebase in order to locate where and what to trace? Beyond
that, it seems like little bits of implementation detail to me. What
we ended up with was basically:

	ktrace(major_type, minor_type, data, ...)

The minor and major types were enums, but given descriptive names, they
actually seem to help, rather than hinder, code readability. I'd send
out the code, but it needs a major cleanup first ;-)
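
Since the real code is not posted, here is a purely hypothetical
userspace sketch of what a call of that shape could look like. The enum
names, the kt_record layout and the extra nargs argument are all
assumptions made for the example (plain C varargs need some way to know
the argument count); none of it is the actual ktrace implementation.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical event classes; descriptive names document the call site. */
enum kt_major { KT_MAJOR_SCHED, KT_MAJOR_MM };
enum kt_sched_minor { KT_SCHED_WAKEUP, KT_SCHED_SWITCH };

#define KT_MAX_ARGS 4

struct kt_record {
	int major;
	int minor;
	int nargs;
	unsigned long arg[KT_MAX_ARGS];
};

/* Stand-in for the log buffer: only the most recent record is kept. */
static struct kt_record kt_last;

/* Every variadic argument must be passed as unsigned long. */
static void ktrace(int major, int minor, int nargs, ...)
{
	va_list ap;
	int i;

	assert(nargs >= 0 && nargs <= KT_MAX_ARGS);
	kt_last.major = major;
	kt_last.minor = minor;
	kt_last.nargs = nargs;

	va_start(ap, nargs);
	for (i = 0; i < nargs; i++)
		kt_last.arg[i] = va_arg(ap, unsigned long);
	va_end(ap);
}
```

A call site would then read something like
ktrace(KT_MAJOR_SCHED, KT_SCHED_WAKEUP, 2, pid, state), which says what
is being traced without needing a comment.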

> ok. For me 'static tracepoints' are like the sort of stuff that LTT 
> adds: funky function names littering the tree.

I think it can be done in different ways, some cleaner than others.
What's important, to me at least, is that the tags are in tree to make
them maintained along with the code, and we can get at all local
variable data, etc, easily. Obviously, beyond that, it should be
as clean and uninvasive as possible. Maybe others have different views,
not sure.

> i see the point behind 'data extraction point' hooks mentioned by you as 
> a compromise, which incidentally will also speed up dynamic tracepoints 
> to the level of static tracepoints. But they should be very much 
> constructed as data extraction points for the purposes of dynamic 
> tracers. (which the LTT hooks currently are not)

OK. Not sure I care too much what the purpose is, as long as they tag
where and what needs extracting, people can use them for whatever ...
as handbags to dance round, as far as I care ;-)

>>If we want it to be superfast, we could compile with a different 
>>config option to insert some tracing statically in there or something, 
>>but I agree it should not be the default.
> 
> for a dynamic tracer all that is needed is a 5-byte NOP (even on 
> 64-bit), and the availability of all the data. Maybe even a function 
> call that can be patched out after bootup, with NOPs. But the current 
> LTT stuff has lots of inlined crap that just bloats the kernel.

OK. But I don't think that's inherent to tracing hooks ... sounds like
more of an implementation detail? Worst case, it's a config option as
to whether to put a nop or inlined stuff in there, if we decide that
the extra speed of not doing a jump may be important?

> So you don't care about recompiling: that's fine - but others care, so as
> long as all your needs are met (which we are working on meeting :-) then 
> we'll go for the solution that is better - instead of having some dual 
> debugging infrastructure.

Sounds absolutely correct to me. Even if we had some static points, I
think we'd still want the ability to mix both in *one* infrastructure.

>>>(Just like i would accept the reintroduction of the Big Kernel Lock
>>> too, if someone proved that it's the right thing to do.)
>>
>>Surely it's still there at the moment? ;-)
> 
> no - at least for me it's the Big Kernel Semaphore ;-)

Ah, semantics ;-) Fair enough. It still needs to die though ...

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:25                   ` Martin Bligh
@ 2006-09-14 22:36                     ` Ingo Molnar
  2006-09-14 22:59                       ` Martin Bligh
  2006-09-15 15:37                       ` Michel Dagenais
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:36 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche


* Martin Bligh <mbligh@mbligh.org> wrote:

> > i very much agree that they should become as fast as possible. So to 
> > rephrase the question: can we make dynamic tracepoints as fast (or 
> > nearly as fast) as static tracepoints? If yes, should we care about 
> > static tracers at all?
> 
> Depends how many nops you're willing to add, I guess. Anything, even 
> the static tracepoints really needs at least a branch to be useful, 
> IMHO. At least for what I've been doing with it, you need to stop the 
> data flow after a while (when the event you're interested in happens, 
> I'm using it like a flight data recorder, so we can go back and do 
> postmortem on what went wrong). I should imagine branch prediction 
> makes it very cheap on most modern CPUs, but don't have hard data to 
> hand.

only 5 bytes of NOP are needed by default, so that a kprobe can insert a 
call/callq instruction. The easiest way in practice is to insert a 
_single_, unconditional function call that is patched out to NOPs upon 
its first occurrence (doing this is not a performance issue at all). That
way the only cost is the NOP and the function parameter preparation 
side-effects. (which might or might not be significant - with register 
calling conventions and most parameters being readily available it 
should be small.)
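
To illustrate why 5 bytes are enough even on 64-bit: the x86 near call
with a relative operand is opcode 0xE8 followed by a signed 32-bit
displacement, measured from the end of the instruction, and that encoding
is the same length in 64-bit mode. The sketch below only builds those 5
bytes in an ordinary buffer. The function name and the idea of passing
addresses as plain integers are assumptions for the example; it is not
how kprobes or any patch in this thread does the patching.

```c
#include <assert.h>
#include <stdint.h>

#define CALL_INSN_LEN 5   /* 1 opcode byte (0xE8) + 4 displacement bytes */

/*
 * Encode "call target" as it would appear at address 'site'.
 * The displacement is relative to the instruction that follows the call.
 */
static void encode_call_rel32(uint8_t buf[CALL_INSN_LEN],
			      uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)(target - (site + CALL_INSN_LEN));
	uint32_t rel;

	/* rel32 only reaches +/- 2GB from the call site. */
	assert(disp >= INT32_MIN && disp <= INT32_MAX);
	rel = (uint32_t)(int32_t)disp;

	buf[0] = 0xE8;
	buf[1] = (uint8_t)(rel & 0xff);          /* little-endian displacement */
	buf[2] = (uint8_t)((rel >> 8) & 0xff);
	buf[3] = (uint8_t)((rel >> 16) & 0xff);
	buf[4] = (uint8_t)((rel >> 24) & 0xff);
}
```

The range check is the practical constraint: the handler has to sit
within 2GB of the patched site for a single 5-byte slot to suffice.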

note that such a limited, minimally invasive 'data extraction point' 
infrastructure is not actually what the LTT patches are doing. It's not 
even close, and i think you'll be surprised. Let me quote from the 
latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version 
submitted to lkml - although no specific tracepoints were submitted):

+/* Event wakeup logging function */
+static inline void trace_process_wakeup(
+		unsigned int lttng_param_pid,
+		int lttng_param_state)
+#if (!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+{
+}
+#else
+{
+	unsigned int index;
+	struct ltt_channel_struct *channel;
+	struct ltt_trace_struct *trace;
+	void *transport_data;
+	char *buffer = NULL;
+	size_t real_to_base = 0; /* The buffer is allocated on arch_size alignment */
+	size_t *to_base = &real_to_base;
+	size_t real_to = 0;
+	size_t *to = &real_to;
+	size_t real_len = 0;
+	size_t *len = &real_len;
+	size_t reserve_size;
+	size_t slot_size;
+	size_t align;
+	const char *real_from;
+	const char **from = &real_from;
+	u64 tsc;
+	size_t before_hdr_pad, after_hdr_pad, header_size;
+
+	if(ltt_traces.num_active_traces == 0) return;
+
+	/* For each field, calculate the field size. */
+	/* size = *to_base + *to + *len */
+	/* Assume that the padding for alignment starts at a
+	 * sizeof(void *) address. */
+
+	*from = (const char*)&lttng_param_pid;
+	align = sizeof(unsigned int);
+
+	if(*len == 0) {
+		*to += ltt_align(*to, align); /* align output */
+	} else {
+		*len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+	}
+
+	*len += sizeof(unsigned int);
+
+	*from = (const char*)&lttng_param_state;
+	align = sizeof(int);
+
+	if(*len == 0) {
+		*to += ltt_align(*to, align); /* align output */
+	} else {
+		*len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+	}
+
+	*len += sizeof(int);
+
+	reserve_size = *to_base + *to + *len;
+	preempt_disable();
+	ltt_nesting[smp_processor_id()]++;
+	index = ltt_get_index_from_facility(ltt_facility_process_2905B6EB,
+						event_process_wakeup);
+
+	list_for_each_entry_rcu(trace, &ltt_traces.head, list) {
+		if(!trace->active) continue;
+
+		channel = ltt_get_channel_from_index(trace, index);
+
+		slot_size = 0;
+		buffer = ltt_reserve_slot(trace, channel, &transport_data,
+			reserve_size, &slot_size, &tsc,
+			&before_hdr_pad, &after_hdr_pad, &header_size);
+		if(!buffer) continue; /* buffer full */
+
+		*to_base = *to = *len = 0;
+
+		ltt_write_event_header(trace, channel, buffer,
+			ltt_facility_process_2905B6EB, event_process_wakeup,
+			reserve_size, before_hdr_pad, tsc);
+		*to_base += before_hdr_pad + after_hdr_pad + header_size;
+
+		*from = (const char*)&lttng_param_pid;
+		align = sizeof(unsigned int);
+
+		if(*len == 0) {
+			*to += ltt_align(*to, align); /* align output */
+		} else {
+			*len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+		}
+
+		*len += sizeof(unsigned int);
+
+		/* Flush pending memcpy */
+		if(*len != 0) {
+			memcpy(buffer+*to_base+*to, *from, *len);
+			*to += *len;
+			*len = 0;
+		}
+
+		*from = (const char*)&lttng_param_state;
+		align = sizeof(int);
+
+		if(*len == 0) {
+			*to += ltt_align(*to, align); /* align output */
+		} else {
+			*len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+		}
+
+		*len += sizeof(int);
+
+		/* Flush pending memcpy */
+		if(*len != 0) {
+			memcpy(buffer+*to_base+*to, *from, *len);
+			*to += *len;
+			*len = 0;
+		}
+
+		ltt_commit_slot(channel, &transport_data, buffer, slot_size);
+
+	}
+
+	ltt_nesting[smp_processor_id()]--;
+	preempt_enable_no_resched();
+}
+#endif //(!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+

believe it or not, this is inlined into: kernel/sched.c ...

'enuff said. LTT is so far from being even considerable that it's not 
even funny.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:36                     ` Ingo Molnar
@ 2006-09-14 22:59                       ` Martin Bligh
  2006-09-14 23:19                         ` Ingo Molnar
  2006-09-15  7:00                         ` Vara Prasad
  2006-09-15 15:37                       ` Michel Dagenais
  1 sibling, 2 replies; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 22:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

Ingo Molnar wrote:
> * Martin Bligh <mbligh@mbligh.org> wrote:
> 
>>>i very much agree that they should become as fast as possible. So to 
>>>rephrase the question: can we make dynamic tracepoints as fast (or 
>>>nearly as fast) as static tracepoints? If yes, should we care about 
>>>static tracers at all?
>>
>>Depends how many nops you're willing to add, I guess. Anything, even 
>>the static tracepoints really needs at least a branch to be useful, 
>>IMHO. At least for what I've been doing with it, you need to stop the 
>>data flow after a while (when the event you're interested in happens, 
>>I'm using it like a flight data recorder, so we can go back and do 
>>postmortem on what went wrong). I should imagine branch prediction 
>>makes it very cheap on most modern CPUs, but don't have hard data to 
>>hand.
> 
> only 5 bytes of NOP are needed by default, so that a kprobe can insert a 
> call/callq instruction. The easiest way in practice is to insert a 
> _single_, unconditional function call that is patched out to NOPs upon 
> its first occurrence (doing this is not a performance issue at all). That
> way the only cost is the NOP and the function parameter preparation 
> side-effects. (which might or might not be significant - with register 
> calling conventions and most parameters being readily available it 
> should be small.)
> 
> note that such a limited, minimally invasive 'data extraction point' 
> infrastructure is not actually what the LTT patches are doing. It's not 
> even close, and i think you'll be surprised. Let me quote from the 
> latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version 
> submitted to lkml - although no specific tracepoints were submitted):

OK, I grant you that's pretty scary ;-) However, it's not the only way
to do it. Most things we're using write a statically sized 64-bit event
into a relayfs buffer, with a timestamp, a minor and major event type,
and a byte of data payload.
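
A rough sketch of such a fixed-size event, for illustration. The message
only says the 64 bits hold a timestamp, a major and minor type and one
payload byte; the 40/8/8/8 split and all names below are assumptions,
not the real format.

```c
#include <assert.h>
#include <stdint.h>

#define EV_TS_BITS 40   /* assumed width of the timestamp field */

/* Assumed layout, high to low: 40-bit timestamp | major | minor | payload. */
static inline uint64_t ev_pack(uint64_t ts, uint8_t major,
			       uint8_t minor, uint8_t payload)
{
	assert(ts < (UINT64_C(1) << EV_TS_BITS));   /* must fit in 40 bits */
	return (ts << 24) |
	       ((uint64_t)major << 16) |
	       ((uint64_t)minor << 8) |
	       (uint64_t)payload;
}

static inline uint64_t ev_timestamp(uint64_t ev) { return ev >> 24; }
static inline uint8_t  ev_major(uint64_t ev)     { return (uint8_t)(ev >> 16); }
static inline uint8_t  ev_minor(uint64_t ev)     { return (uint8_t)(ev >> 8); }
static inline uint8_t  ev_payload(uint64_t ev)   { return (uint8_t)ev; }
```

Logging an event is then a single 8-byte store into the buffer, with no
per-field alignment or length bookkeeping at the trace point.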

> believe it or not, this is inlined into: kernel/sched.c ...
> 
> 'enuff said. LTT is so far from being even considerable that it's not 
> even funny.

Particularly if we're doing more complex things like that, I'd agree
that the overhead of doing the out of line jump is nonexistent by
comparison. Even with the relayfs logging alone, perhaps the jump is
not that heavy ... hmmm.

If we put the NOPs in (at least as an option on some architectures)
from a macro, you don't really need the full kprobes implemented to
do tracing, even ... just overwrite the nops with a jump, so presumably
would be easier to port. However, not sure how local variable data
is specified in that case ... perhaps the kprobes guys know better.
Most of the complexity seemed to be with relocating existing code
because you didn't have nops.

To me, the main thing is to have hooks for at least some of the
basic needs maintained in-kernel - from the dtrace paper Val pointed
me to, that seems to be exactly what they do too, and it integrates
with the newly added dynamic ones where necessary. Plus I hate the
whole awk thing, and general complexity of systemtap, but we can
probably avoid that easily enough - either the embedded C option
you mentioned, or just a different definition for the same hook macros
under a config option.

So perhaps it'll all work. Still need a little bit of data maintained
in tree though.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:59                       ` Martin Bligh
@ 2006-09-14 23:19                         ` Ingo Molnar
  2006-09-15  0:19                           ` Nicholas Miell
  2006-09-15  1:04                           ` Martin J. Bligh
  2006-09-15  7:00                         ` Vara Prasad
  1 sibling, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 23:19 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

* Martin Bligh <mbligh@mbligh.org> wrote:

> > note that such a limited, minimally invasive 'data extraction point' 
> > infrastructure is not actually what the LTT patches are doing. It's 
> > not even close, and i think you'll be surprised. Let me quote from 
> > the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same 
> > version submitted to lkml - although no specific tracepoints were 
> > submitted):
> 
> OK, I grant you that's pretty scary ;-) However, it's not the only way 
> to do it. Most things we're using write a statically sized 64-bit 
> event into a relayfs buffer, with a timestamp, a minor and major event 
> type, and a byte of data payload.

oh, no need to tell me. I wrote ktrace 10 years ago, iotrace 8 years ago 
and latency-trace 2 years ago. (The latter even does extensive mcount 
based tracing, which is as demanding on the ringbuffer as it gets - on 
my testbox i routinely get 10-20 million trace events per second, where 
each trace entry includes: type, cpu, flags, preempt_count, pid, 
timestamp and 4 words of arbitrary payload, all fit into 32 bytes. It 
has static tracepoints too, in addition to the 20,000-40,000 mcount 
tracepoints a typical kernel has.)
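
One way the listed fields could add up to 32 bytes, as a sketch only.
The field widths, their order and the choice of 32-bit payload words are
assumptions made for the arithmetic; this is not the actual
latency-trace structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 4 + 4 + 8 + 16 = 32 bytes, no padding. */
struct trace_entry {
	uint8_t  type;
	uint8_t  cpu;
	uint8_t  flags;
	uint8_t  preempt_count;
	uint32_t pid;
	uint64_t timestamp;
	uint32_t payload[4];   /* "4 words", assumed to be 32 bits each */
};

/* Fail the build if a compiler inserts padding or a field is resized. */
static_assert(sizeof(struct trace_entry) == 32, "trace_entry must stay 32 bytes");
static_assert(offsetof(struct trace_entry, timestamp) == 8, "timestamp must be 8-byte aligned");
```

With that size, two entries fit in a 64-byte cache line and the index
into the ring buffer is a shift rather than a multiply.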

So i think i know the advantages and disadvantages of static tracers, 
their maintenance and performance impact.

but i think (and i think now you'll be surprised) the way to go is to do 
all this in SystemTap ;-) If we add any static points to the kernel then 
it should have a pure 'local data preparation for extraction' purpose - 
nothing more. Static tracing can be built around that too, but at that 
point it will be unnecessary because SystemTap will be able to do that 
too, with the same (or better, considering the LTT mess) performance.

i.e. we should have macros to prepare local information, with macro 
arities of 2, 3, 4 and 5:

    _(name, data1);
   __(name, data1, data2);
  ___(name, data1, data2, data3);
 ____(name, data1, data2, data3, data4);

that and nothing more. But no guarantees that these trace points will 
always be there and usable for static tracers: for example about 50% of 
all tracepoints can be eliminated via a function attribute. (which 
function attribute tells GCC to generate a 5-byte NOP as the first 
instruction of the function prologue.) That will be invariant to things 
like function renames, etc.

> So perhaps it'll all work. Still need a little bit of data maintained 
> in tree though.

ok. And i think SystemTap itself should be in tree too, with a couple of 
examples and helper scripts all around tracing and probing - and of 
course an LTT-compatible trace output so that all the nice LTT userspace 
code and visualization can live on.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 23:19                         ` Ingo Molnar
@ 2006-09-15  0:19                           ` Nicholas Miell
  2006-09-15  1:04                           ` Martin J. Bligh
  1 sibling, 0 replies; 271+ messages in thread
From: Nicholas Miell @ 2006-09-15  0:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

On Fri, 2006-09-15 at 01:19 +0200, Ingo Molnar wrote: 
> but i think (and i think now you'll be surprised) the way to go is to do 
> all this in SystemTap ;-) If we add any static points to the kernel then 
> it should have a pure 'local data preparation for extraction' purpose - 
> nothing more. Static tracing can be built around that too, but at that 
> point it will be unnecessary because SystemTap will be able to do that 
> too, with the same (or better, considering the LTT mess) performance.
> 
> i.e. we should have macros to prepare local information, with macro 
> arities of 2, 3, 4 and 5:
> 
>     _(name, data1);
>    __(name, data1, data2);
>   ___(name, data1, data2, data3);
>  ____(name, data1, data2, data3, data4);
> 
> that and nothing more. But no guarantees that these trace points will 
> always be there and usable for static tracers: for example about 50% of 
> all tracepoints can be eliminated via a function attribute. (which 
> function attribute tells GCC to generate a 5-byte NOP as the first 
> instruction of the function prologue.) That will be invariant to things 
> like function renames, etc.

Another interesting idea would be the addition to gcc of a:

__builtin_trace_point(char *name, ...)

It would output a function call sized NOP at its call site, and store
in another section the trace point name, location, and (this is the
important part) a series of DWARF expressions to reconstruct the trace
point's argument list from the stack frame and saved registers.

This would completely eliminate the argument passing overhead of a
patched-out function call in the cases where the trace point takes
arguments.

It'd also make your __trace function attribute unnecessary, because gcc
could presumably figure out that the trace point is at the beginning of
the function.

It "only" requires compiler support on every architecture that the
kernel cares about and compiler upgrades for everyone who wants to use
static trace points, which is no mean feat.

(Roman Zippel was trimmed from the CC list because his server is
rejecting mail from me and/or Comcast. If the first attempts actually
make it through and this is yet another duplicate, sorry.)

-- 
Nicholas Miell <nmiell@comcast.net>

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 23:19                         ` Ingo Molnar
  2006-09-15  0:19                           ` Nicholas Miell
@ 2006-09-15  1:04                           ` Martin J. Bligh
  2006-09-15 12:38                             ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-15  1:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

> i.e. we should have macros to prepare local information, with macro 
> arities of 2, 3, 4 and 5:
> 
>     _(name, data1);
>    __(name, data1, data2);
>   ___(name, data1, data2, data3);
>  ____(name, data1, data2, data3, data4);

Personally I think that's way more visually offensive than something
that looks like a function call, but still ;-) We do it as a caps macro

KTRACE(foo, bar)

internally, which I suppose makes it not look like a function call.
But at the end of the day, it's all just a matter of visual taste,
what's actually in there is way more important.

> that and nothing more. But no guarantees that these trace points will 
> always be there and usable for static tracers: for example about 50% of 
> all tracepoints can be eliminated via a function attribute. (which 
> function attribute tells GCC to generate a 5-byte NOP as the first 
> instruction of the function prologue.) That will be invariant to things 
> like function renames, etc.

Yup, sometimes you just want to know when a function is called, and
there's no real need to add that. The hook for system calls should be
pretty generic too. But things like instrumenting the reclaim code need
more work - I ended up incrementing some counters for each type of page
recovery failure in shrink_list() and then just logging one compound
event on the stats structure at the end. That's pretty specific, but
does give you a lot of useful data when the box is dying from mem
pressure.
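
The counting approach described above might look something like the sketch below. It is a userspace illustration: the failure reasons, reclaim_stats, scan_pages and format_reclaim_event are hypothetical names, and the real categories would follow whatever shrink_list() actually distinguishes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical failure reasons, for illustration only. */
enum reclaim_fail {
	FAIL_LOCKED,
	FAIL_WRITEBACK,
	FAIL_REFERENCED,
	FAIL_TYPES
};

#define PAGE_RECLAIMED (-1)

struct reclaim_stats {
	unsigned long scanned;
	unsigned long reclaimed;
	unsigned long failed[FAIL_TYPES];
};

/* Walk the per-page outcomes, bumping counters only; nothing is logged here. */
static void scan_pages(const int *outcome, size_t n, struct reclaim_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		st->scanned++;
		if (outcome[i] == PAGE_RECLAIMED)
			st->reclaimed++;
		else
			st->failed[outcome[i]]++;
	}
}

/* Format the single compound event that is emitted once per scan. */
static int format_reclaim_event(char *buf, size_t len,
				const struct reclaim_stats *st)
{
	return snprintf(buf, len,
			"reclaim scanned=%lu reclaimed=%lu locked=%lu "
			"writeback=%lu referenced=%lu",
			st->scanned, st->reclaimed,
			st->failed[FAIL_LOCKED],
			st->failed[FAIL_WRITEBACK],
			st->failed[FAIL_REFERENCED]);
}
```

The point of the design is that the hot loop only increments counters, and the trace buffer sees one event per scan rather than one per page.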

>> So perhaps it'll all work. Still need a little bit of data maintained 
>> in tree though.
> 
> ok. And i think SystemTap itself should be in tree too, with a couple of 
> examples and helper scripts all around tracing and probing - and of 
> course an LTT-compatible trace output so that all the nice LTT userspace 
> code and visualization can live on.

I have to figure out how to graft the internal Google stuff onto the
same mechanism ... I definitely want to be able to combine the static
points with dynamic ones. And then add schedstats and blktrace into
the same thing so it interleaves properly ... seeing the blktrace type
data interact with memory reclaim debugging was very useful to me, for
instance. All these little fragmented tools are way more difficult to
deal with.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-15  1:04                           ` Martin J. Bligh
@ 2006-09-15 12:38                             ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 12:38 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais, fche


* Martin J. Bligh <mbligh@mbligh.org> wrote:

> >i.e. we should have macros to prepare local information, with macro 
> >arities of 2, 3, 4 and 5:
> >
> >    _(name, data1);
> >   __(name, data1, data2);
> >  ___(name, data1, data2, data3);
> > ____(name, data1, data2, data3, data4);
> 
> Personally I think that's way more visually offensive than something 
> that looks like a function call, but still ;-) We do it as a caps 
> macro
> 
> KTRACE(foo, bar)
> 
> internally, which I suppose makes it not look like a function call. 
> But at the end of the day, it's all just a matter of visual taste, 
> what's actually in there is way more important.

i disagree with the naming, for the reasons stated before: if we add any 
static info to the kernel, it's an "easier data extraction" thing (for 
the purposes of speeding up dynamic tracing), not a tracepoint. That way 
there's no dispute whether what i remove is a tracepoint (on which 
static tracers might rely in a hard way), or just a speedup for 
SystemTap. So a better name would be what SystemTap has implemented 
today:

  STAP_MARK_NN(kernel_context_switch, prev, next);

or what makes this even more explicit:

  DEBUG_DATA(kernel_context_switch, prev, next);

(but i'm flexible about the naming - as long as it doesn't say 'trace' 
and as long as there are no guarantees at all that those points remain, 
when a better method of accessing the same data for dynamic tracers is 
implemented.)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:59                       ` Martin Bligh
  2006-09-14 23:19                         ` Ingo Molnar
@ 2006-09-15  7:00                         ` Vara Prasad
  1 sibling, 0 replies; 271+ messages in thread
From: Vara Prasad @ 2006-09-15  7:00 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche,
	systemtap

Martin Bligh wrote:

> Ingo Molnar wrote:
>
>> * Martin Bligh <mbligh@mbligh.org> wrote:
>>
>>>> i very much agree that they should become as fast as possible. So 
>>>> to rephrase the question: can we make dynamic tracepoints as fast 
>>>> (or nearly as fast) as static tracepoints? If yes, should we care 
>>>> about static tracers at all?
>>>
>>>
>>> Depends how many nops you're willing to add, I guess. Anything, even 
>>> the static tracepoints really needs at least a branch to be useful, 
>>> IMHO. At least for what I've been doing with it, you need to stop 
>>> the data flow after a while (when the event you're interested in 
>>> happens, I'm using it like a flight data recorder, so we can go back 
>>> and do postmortem on what went wrong). I should imagine branch 
>>> prediction makes it very cheap on most modern CPUs, but don't have 
>>> hard data to hand.
>>
>>
>> only 5 bytes of NOP are needed by default, so that a kprobe can 
>> insert a call/callq instruction. The easiest way in practice is to 
>> insert a _single_, unconditional function call that is patched out to 
>> NOPs upon its first occurrence (doing this is not a performance issue 
>> at all). That way the only cost is the NOP and the function parameter 
>> preparation side-effects. (which might or might not be significant - 
>> with register calling conventions and most parameters being readily 
>> available it should be small.)
>>
>> note that such a limited, minimally invasive 'data extraction point' 
>> infrastructure is not actually what the LTT patches are doing. It's 
>> not even close, and i think you'll be surprised. Let me quote from 
>> the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same 
>> version submitted to lkml - although no specific tracepoints were 
>> submitted):
>
>
> OK, I grant you that's pretty scary ;-) However, it's not the only way
> to do it. Most things we're using write a statically sized 64-bit event
> into a relayfs buffer, with a timestamp, a minor and major event type,
> and a byte of data payload.
>
>> believe it or not, this is inlined into: kernel/sched.c ...
>>
>> 'enuff said. LTT is so far from being even considerable that it's not 
>> even funny.
>
>
> Particularly if we're doing more complex things like that, I'd agree
> that the overhead of doing the out of line jump is non-existent by
> comparison. Even with the relayfs logging alone, perhaps the jump is
> not that heavy ... hmmm.
>
> If we put the NOPs in (at least as an option on some architectures)
> from a macro, you don't really need the full kprobes implemented to
> do tracing, even ... just overwrite the nops with a jump, so presumably
> would be easier to port. However, not sure how local variable data
> is specified in that case ... perhaps the kprobes guys know better.
> Most of the complexity seemed to be with relocating existing code
> because you didn't have nops.


With kprobes one can place probes anywhere, but probes placed in the 
middle of a function are not maintainable because they are tied to a 
location in the code.  Having a NOP leaves a maintainable address that 
we can hook into when needed. 

AFAIK writing portable code that uses local variables is not easy 
without DWARF information, hence we don't handle that in kprobes. 
Jprobes is a special case where you can have access to function 
arguments at the function entry point. SystemTap can be used to specify 
probes anywhere in the function and local variables can also be used in 
the probe handlers. The problem still is maintainability as probes are 
specified using line numbers.

>
> To me, the main thing is to have hooks for the at least some of the
> basic needs maintained in-kernel - from the dtrace paper Val pointed
> me to, that seems to be exactly what they do too, and it integrates
> with the newly added dynamic ones where necessary. 


Once we have these static markers, one can intermix dynamic probes and 
static probes and get the best of both worlds, as Frank demonstrated at 
OLS.

Here are a couple of proposals for how to specify static markers that 
were discussed on the systemtap mailing list; we could use these ideas, 
along with the rest, in deciding on a marker proposal.
http://sources.redhat.com/ml/systemtap/2006-q3/msg00273.html
http://sourceware.org/ml/systemtap/2005-q4/msg00415.html

> Plus I hate the
> whole awk thing, and general complexity of systemtap, but we can
> probably avoid that easily enough - either the embedded C option
> you mentioned, or just a different definition for the same hook macros
> under a config option.
>
> So perhaps it'll all work. Still need a little bit of data maintained
> in tree though.


For placing probes at the beginning and end of a function we don't 
really need markers, as the function boundary works as a marker.
I think we only need markers in the few places where an important 
decision is made in the middle of a function.

>
> M.



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:36                     ` Ingo Molnar
  2006-09-14 22:59                       ` Martin Bligh
@ 2006-09-15 15:37                       ` Michel Dagenais
  1 sibling, 0 replies; 271+ messages in thread
From: Michel Dagenais @ 2006-09-15 15:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Bligh, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, fche


> only 5 bytes of NOP are needed by default, so that a kprobe can insert a 
> call/callq instruction. The easiest way in practice is to insert a 
> _single_, unconditional function call that is patched out to NOPs upon 
> its first occurrence (doing this is not a performance issue at all). That 
> way the only cost is the NOP and the function parameter preparation 
> side-effects. (which might or might not be significant - with register 
> calling conventions and most parameters being readily available it 
> should be small.)

Interestingly, while this whole thread is full of diverging views, there
is nevertheless considerable common ground.

- Getting a trace output is very useful, whether it is generated from
dynamic or static tracepoints. You need some infrastructure (e.g.
relayfs + a few things) to get the data out efficiently.

- Some sort of static markers make sense in key locations. Whether they
are there "primarily" for dynamic or static tracepoints is mostly
irrelevant. Interesting suggestions were made for a syntax clearly
identifying their "probe point" status.

From there we can get onto a constructive debate about the technical
details of each of these components.

> note that such a limited, minimally invasive 'data extraction point' 
> infrastructure is not actually what the LTT patches are doing. It's not 
> even close, and i think you'll be surprised. Let me quote from the 
> latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version 
> submitted to lkml - although no specific tracepoints were submitted):

This is a case where it started with inline code but as you take into
account SMP and eventually multiple traces (e.g. the sysadmin is tracing
the system and a user is generating a trace for his processes) it
becomes larger and inlining may not be such a good idea any more, to say
the least. However, this is relatively easy to change.

It is also worth mentioning that code patching NOPs to minimize the cost
of inactive tracepoints was envisioned quite some time ago. Again you
might call these "static low overhead placeholders for optimized dynamic
tracepoints" or "optimized low overhead static tracepoints"... You need
however to be careful when code patching instructions on SMP as it may
not be trivial to atomically replace 5 NOPs by a call.
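
To make the 5-byte figure concrete, here is a sketch of the bytes involved on x86. It only computes the instruction encoding into a buffer; it does not patch running code and says nothing about how to install the bytes safely on SMP, which is the hard part noted above. The names nop5 and encode_call are made up for the example; the opcode (0xe8 followed by a 32-bit displacement relative to the next instruction) is the standard near call encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 5

/* One common 5-byte x86 NOP: nopl 0x0(%rax,%rax,1). */
static const uint8_t nop5[PATCH_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Encode "call rel32" as it would sit at address 'site', calling 'target'.
 * The displacement is relative to the end of the 5-byte instruction.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_call(uint8_t out[PATCH_LEN], uint64_t site, uint64_t target)
{
	int64_t rel = (int64_t)(target - (site + PATCH_LEN));
	uint32_t rel32;

	if (rel < INT32_MIN || rel > INT32_MAX)
		return -1;

	rel32 = (uint32_t)(int32_t)rel;
	out[0] = 0xe8;				/* CALL rel32 opcode */
	out[1] = (uint8_t)(rel32 & 0xff);	/* little-endian displacement */
	out[2] = (uint8_t)((rel32 >> 8) & 0xff);
	out[3] = (uint8_t)((rel32 >> 16) & 0xff);
	out[4] = (uint8_t)((rel32 >> 24) & 0xff);
	return 0;
}
```

Because the replacement is five bytes wide, it is not a single naturally aligned store, so another CPU executing through the site could in principle see a half-written instruction; the sketch deliberately leaves that problem out.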


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:55               ` Martin Bligh
  2006-09-14 21:31                 ` Ingo Molnar
@ 2006-09-19 12:08                 ` Christoph Hellwig
  1 sibling, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:08 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche

On Thu, Sep 14, 2006 at 01:55:44PM -0700, Martin Bligh wrote:
> 1. They're harder to maintain out of tree.
> 2. they're written in some gibberish awk crap
> 3. They're slower. If you're doing thousands of tracepoints a second,
> 	into a circular 8GB log buffer, that *does* matter. You want
> 	to perturb what you're measuring as little as possible.

agreed to all these and I'd like to add:

 4.  If you merge proper dynamic tracing infrastructure you get static
     traces for free.  It's just a bunch of macros directly calling
     the trace function also used by the dynamic tracing code, maybe
     keyed off an enable variable.
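
A minimal sketch of such a macro, assuming a global enable flag and a shared trace function. TRACE_POINT, trace_enabled, trace_event and events_logged are hypothetical names used only for illustration, and __builtin_expect is a GCC/Clang builtin rather than standard C.

```c
#include <assert.h>

/* Hypothetical global switch; tracing is off by default. */
static volatile int trace_enabled;

/* Counts calls so the effect of the switch can be observed. */
static unsigned long events_logged;

/* Stand-in for the trace function shared with the dynamic tracer. */
static void trace_event(const char *name, unsigned long data)
{
	(void)name;
	(void)data;
	events_logged++;
}

/*
 * Static trace point keyed off the enable variable. When tracing is off
 * the cost is one predicted-not-taken branch, and the argument expression
 * is not evaluated at all.
 */
#define TRACE_POINT(name, data)						\
	do {								\
		if (__builtin_expect(trace_enabled, 0))			\
			trace_event(#name, (unsigned long)(data));	\
	} while (0)
```

Clearing trace_enabled is also a cheap way to stop the data flow once the event of interest has happened, as in the flight-recorder use discussed earlier in the thread.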


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:34             ` Ingo Molnar
  2006-09-14 20:55               ` Martin Bligh
@ 2006-09-14 21:07               ` Roman Zippel
  2006-09-15  9:29               ` Jes Sorensen
  2 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais, fche

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> primarily because i fail to see any property of static tracers that are 
> not met by dynamic tracers. So to me dynamic tracers like SystemTap are 
> a superset of static tracers.

You keep ignoring that a dynamic tracer is nontrivial... :-(
A static tracer is easy to implement and sufficient for many uses and 
most important it doesn't prevent anyone from using a dynamic tracer. 
Having a choice is good!

bye, Roman

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:34             ` Ingo Molnar
  2006-09-14 20:55               ` Martin Bligh
  2006-09-14 21:07               ` Roman Zippel
@ 2006-09-15  9:29               ` Jes Sorensen
  2 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15  9:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Bligh, Roman Zippel, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche

>>>>> "Ingo" == Ingo Molnar <mingo@elte.hu> writes:

Ingo> * Martin Bligh <mbligh@mbligh.org> wrote:

>> I don't think anyone is saying that static tracepoints do not have
>> their limitations, or that dynamic tracepointing is useless. But
>> that's not the point ... why can't we have one infrastructure that
>> supports both? Preferably in a fairly simple, consistent way.

Ingo> primarily because i fail to see any property of static tracers
Ingo> that are not met by dynamic tracers. So to me dynamic tracers
Ingo> like SystemTap are a superset of static tracers.

Ingo> So my position is that what we should concentrate on is to make
Ingo> the life of dynamic tracers easier (be that a handful of
Ingo> generic, parametric hooks that gather debuginfo information and
Ingo> add NOPs for easy patching), while realizing that static tracers
Ingo> have no advantage over dynamic tracers.

The parallel that springs to mind here is C++ kernel components 'I
promise to only use the good parts', then next week someone else adds
another pile in a worse place. Once the points are in we will never
get rid of them, look at how long it took to get rid of devfs :( In
addition it is guaranteed that people will not be able to agree on
which points to put where, despite the claim that there will be only
30 points - sorry, I am not buying that, we have plenty of evidence to
show the opposite.

I looked at the old LTT code a while ago and it was pretty appalling,
maybe LTTng is better, but I can't say the old code gave me a warm
fuzzy feeling.

Jes

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 14:33       ` Roman Zippel
  2006-09-14 15:26         ` Michel Dagenais
  2006-09-14 17:13         ` Ingo Molnar
@ 2006-09-14 17:51         ` Karim Yaghmour
  2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 17:51 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

Roman Zippel wrote:
> Even dynamic tracepoints have a maintenance overhead and I doubt there is 
> much difference. The big problem is having to maintain them outside the 
> mainline kernel, that's why it's so important to get them into the 
> mainline kernel.

Thanks for pointing this out. This is indeed the nugget. We can try
slicing the pie in any direction we think is best, but the bottom
line is that there's somebody somewhere that is matching source code
to important events (regardless of whether the instrumentation is
static or dynamic.) For a very long time the mantra on LKML was
"instrumentation is evil: it's a maintenance nightmare." Try as I
may, every argument I put forth was countered by this mantra.

Unfortunately for me, but fortunately for the current ltt maintainers,
time is a powerful argument. So, with that in mind, here are some
excerpts of a discussion I had with Andrew back in the summer of
2004:

Here's Andrew pulling the "instrumentation is evil" mantra:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108873232414895&w=2

Here's me demonstrating that the mantra is wrong by comparing a
patch against 2.2.13 dated 1999/11/18 and a patch against 2.6.3
dated 2004/03/15:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108874078111041&w=2

And here's Andrew, to his credit, saying "Fair enough."
http://marc.theaimsgroup.com/?l=linux-kernel&m=108874940728542&w=2

Now, this is 2 years ago and I haven't done the analysis recently,
but I'd bet the comparison would probably yield very similar
results. The 1st ltt patch was made in July 1999, that's more
than **7** years ago. How much longer can anybody continue saying
with a straight face that static instrumentation is a maintenance
problem? In my opinion the real problem is what impact the fact
that this issue has lingered on for so long has in encouraging people
and/or companies in investing any sort of effort in the kernel
development process. There's just no excuse for Linux not to have
something that is clearly as essential as this.

I think now is a good time to put this issue to rest and drop the
misleading mantra.

Cheers,

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 13:55     ` Ingo Molnar
  2006-09-14 14:33       ` Roman Zippel
@ 2006-09-14 15:19       ` Mathieu Desnoyers
  2006-09-14 19:39         ` Frank Ch. Eigler
  2006-09-15 17:13         ` Jose R. Santos
  1 sibling, 2 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 15:19 UTC (permalink / raw)
  To: Ingo Molnar, Karim Yaghmour
  Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais, Douglas Niehaus

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Roman Zippel <zippel@linux-m68k.org> wrote:
> 
> the key point is that we want _zero_ "static tracepoints". Firstly, 
> static tracepoints are fundamentally limited:
> 
>  - they can only be added at the source code level
> 
>  - modifying them requires a reboot which is not practical in a 
>    production environment

Not for kernel modules : unload/load is enough.

>  - there can only be a limited set of them, while many problems need 
>    finegrained tracepoints tailored to the problem at hand

Not true with the dynamic facility loading. LTTng can register new events upon
module load/unload.

> 
>  - conditional tracepoints are typically either nonexistent or very 
>    limited.
> 
Maybe, but it can be useful to have static instrumentation available for those
limited conditional tracepoints.

> But besides the usability problems, the most important problem is that 
> static tracepoints add a _constant maintenance overhead_ to the kernel. 
> I'm talking from first hand experience: i wrote 'iotrace' (a static 
> tracer) in 1996 and have maintained it for many years, and even today 
> i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_ 
> want static tracepoints in the mainline kernel.
> 

If the trace points are modified with the code by the ones who make the
original code changes, it lessens the maintenance overhead. Furthermore, if
there is a major change in a code path that requires rethinking the trace
points, the person introducing the change has the best knowledge of what to do
with the trace point. I think that trace point maintenance should be left to
subsystem maintainers, not a centralised task done by distributions once in a
while.

Talking about experience, Karim has maintained the original LTT trace points,
which targeted key kernel events, for years without major trace point changes
between kernel versions. I think he already proved that maintenance of static
trace points is not an issue.

However, I restate that my position is that both static and dynamic
instrumentation of the kernel are a necessity and that a tracer core should be
usable by both.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:19       ` Mathieu Desnoyers
@ 2006-09-14 19:39         ` Frank Ch. Eigler
  2006-09-15 17:13         ` Jose R. Santos
  1 sibling, 0 replies; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-14 19:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Karim Yaghmour, Roman Zippel, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais,
	Douglas Niehaus

Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> writes:

> [...]  However, I restate that my position is that both static and
> dynamic instrumentation of the kernel are a necessity and that a
> tracer core should be usable by both.

On a complementary note, it would be nice if whatever static
instrumentation hooks are deemed worthwhile were themselves generic so
they could be coupled to either a fixed or dynamic "core" or back-end.

- FChE

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:19       ` Mathieu Desnoyers
  2006-09-14 19:39         ` Frank Ch. Eigler
@ 2006-09-15 17:13         ` Jose R. Santos
  1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 17:13 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Karim Yaghmour, Roman Zippel, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais,
	Douglas Niehaus

Mathieu Desnoyers wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> > 
> > * Roman Zippel <zippel@linux-m68k.org> wrote:
> > 
> > the key point is that we want _zero_ "static tracepoints". Firstly, 
> > static tracepoints are fundamentally limited:
> > 
> >  - they can only be added at the source code level
> > 
> >  - modifying them requires a reboot which is not practical in a 
> >    production environment
>
> Not for kernel modules : unload/load is enough.
>   

This assumes that the module can be unloaded in the first place.  
Inserting a new probe on the disk controller for your boot drive or in 
the filesystem module would still require a reboot.

> If the trace points are modified with the code by the ones who make the
> original code changes, it lessens the maintenance overhead. Furthermore, if
> there is a major change in a code path that requires rethinking the trace
> points, the person introducing the change has the best knowledge of what to do
> with the trace point. I think that trace point maintenance should be left to
> subsystem maintainers, not a centralised task done by distributions once in a
> while.
>   

I agree with you here. I think it is silly to claim dynamic instrumentation 
as a fix for the "constant maintenance overhead" of static trace points.  
Working on LKET, one of the biggest burdens that we've had is maintaining 
the probe points when something in the kernel changes enough to break 
the dynamic instrumentation.  The solution to this is having 
the SystemTap tapsets maintained by the subsystem maintainers so that 
changes in the code can be applied to the dynamic instrumentation as 
well.  This of course means that the subsystem maintainer would need to 
maintain two pieces of code instead of one.  There are a lot of 
advantages to dynamic vs static instrumentation, but I don't think 
maintenance overhead is one of them.

-JRS

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 11:27 ` Ingo Molnar
  2006-09-14 13:40   ` Roman Zippel
@ 2006-09-14 15:02   ` Mathieu Desnoyers
  2006-09-14 15:14   ` Martin J. Bligh
  2006-09-19 11:59   ` Christoph Hellwig
  3 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 15:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
	Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
	Michel Dagenais, Douglas Niehaus

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > Following an advice Christoph gave me this summer, submitting a 
> > smaller, easier to review patch should make everybody happier. Here is 
> > a stripped down version of LTTng : I removed everything that would 
> > make the code review reluctant (especially kernel instrumentation and 
> > kernel state dump module). I plan to release this "core" version every 
> > few LTTng releases and post it to LKML.
> > 
> > Comments and reviews are very welcome.
> 
> i have one very fundamental question: why should we do this 
> source-intrusive method of adding tracepoints instead of the dynamic, 
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> 

Hi Ingo,

First, I never said that this tracing infrastructure was tied to static trace
points in any way. My goal is to provide a robust data serialisation mechanism
that could be used both from static and dynamic trace points.

Zero-overhead for static tracepoints can be achieved by compiling them out.
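
A minimal sketch of the compile-out approach, assuming a hypothetical
build-time switch TRACE_ENABLED standing in for a Kconfig option (the
macro and function names are illustrative, not taken from LTTng):

```c
#include <stdio.h>

/*
 * TRACE_ENABLED is a hypothetical build-time switch standing in for a
 * Kconfig option. It is not a name taken from LTTng.
 */
#ifdef TRACE_ENABLED
#define trace_event(name, value) \
	fprintf(stderr, "trace: %s=%ld\n", (name), (long)(value))
#else
/* Compiled out: expands to an empty statement, arguments are not evaluated. */
#define trace_event(name, value) do { } while (0)
#endif

static long events_handled;

/* Hypothetical hot path carrying one static trace point. */
long handle_event(long payload)
{
	trace_event("handle_event", payload);
	events_handled++;
	return payload * 2;
}
```

Built without -DTRACE_ENABLED, the trace point leaves no code behind in
handle_event(); built with it, every call emits one line on stderr.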

One problem with the KProbes approach is that it limits what can be instrumented
because of its performance impact when active : traps are very costly and can
limit instrumentation of often triggered code paths : scheduler change, traps,
interrupts...

Also, a major issue with dynamic instrumentation is that it will never be useful
to kernel developers who keep current with the git HEAD. Dynamic instrumentation
has to be defined outside of the kernel tree and cannot follow the code changes
quickly enough to be useful for a developer without himself maintaining his own
dynamic instrumentation.

I do not advocate for a particular approach : I think that dynamic
instrumentation is very well suited for distributions which stick to a
particular kernel version for a long time. However, static probes can be very
useful for kernel developers as they can follow the kernel HEAD because they
are part of the code.

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 11:27 ` Ingo Molnar
  2006-09-14 13:40   ` Roman Zippel
  2006-09-14 15:02   ` Mathieu Desnoyers
@ 2006-09-14 15:14   ` Martin J. Bligh
  2006-09-14 17:43     ` Ingo Molnar
                       ` (2 more replies)
  2006-09-19 11:59   ` Christoph Hellwig
  3 siblings, 3 replies; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-14 15:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
>> Following an advice Christoph gave me this summer, submitting a 
>> smaller, easier to review patch should make everybody happier. Here is 
>> a stripped down version of LTTng : I removed everything that would 
>> make the code review reluctant (especially kernel instrumentation and 
>> kernel state dump module). I plan to release this "core" version every 
>> few LTTng releases and post it to LKML.
>>
>> Comments and reviews are very welcome.
> 
> i have one very fundamental question: why should we do this 
> source-intrusive method of adding tracepoints instead of the dynamic, 
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?

Because:

1. Kprobes are more overhead when they *are* being used.
2. You can get zero overhead by CONFIG'ing things out.
3. (most importantly) it's a bitch to maintain tracepoints out
    of tree on a rapidly moving kernel
4. I believe kprobes still doesn't have full access to local variables.


Now (3) is possibly solvable by putting the points in as no-ops (either
insert a few nops or just a marker entry in the symbol table?), but full
dynamic just isn't sustainable. What would be really nice is one trace
infrastructure, that allowed both static and dynamic tracepoints without
all the awk-style language crap that seems to come with systemtap.

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:14   ` Martin J. Bligh
@ 2006-09-14 17:43     ` Ingo Molnar
  2006-09-14 18:25       ` Karim Yaghmour
  2006-09-14 20:03       ` Martin Bligh
  2006-09-14 19:03     ` grundig
  2006-09-14 19:48     ` Frank Ch. Eigler
  2 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 17:43 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

* Martin J. Bligh <mbligh@mbligh.org> wrote:

> >>Comments and reviews are very welcome.
> >
> > i have one very fundamental question: why should we do this 
> > source-intrusive method of adding tracepoints instead of the 
> > dynamic, unintrusive (and thus zero-overhead) KProbes+SystemTap 
> > method?
> 
> Because:
> 
> 1. Kprobes are more overhead when they *are* being used.

minimally so - at least on i386 and x86_64. In that sense tracing is a 
_slowpath_, and it _will_ slow things down if done excessively. I dont 
care about the tracepoint being slower by a few instructions as long as 
it has _zero effect_ on normal code, be that source code or binary code.

> 2. You can get zero overhead by CONFIG'ing things out.

but that's not how a fair chunk of people want to use tracing. People 
(enterprise customers trying to figure out performance problems, 
engineers trying to debug things on a live, production system) want to 
be able to insert a tracepoint anywhere and anytime - and also they want 
to have zero overhead from tracing if no tracepoints are used on a 
system.

> 3. (most importantly) it's a bitch to maintain tracepoints out
>    of-tree on a rapidly moving kernel

wrong: the original demo tracepoints that came with SystemTap still work 
on the current kernel, because the 'coupling' is loose: based on 
function names.

Static tracepoints on the other hand, if added via an external patch, do 
depend on the target function not moving around and the context of the 
tracepoint not being changed. (and static tracepoints if in the source 
all the time are a constant hindrance to development and code 
readability.)

and of course the big advantage of dynamic probing is its flexibility: 
you can add ad-hoc tracepoints to thousands of functions, instead of 
having to maintain hundreds (or thousands) of static tracepoints all the 
time. (and if we wont end up with hundreds/thousands of static 
tracepoints then it wont be usable enough as a generic solution.)

> 4. I believe kprobes still doesn't have full access to local 
> variables.

wrong: with SystemTap you can probe local variables too (via 
jprobes/kretprobes, all in the upstream kernel already).

> Now (3) is possibly solvable by putting the points in as no-ops 
> (either insert a few nops or just a marker entry in the symbol 
> table?), but full dynamic just isn't sustainable. [...]

i'm not sure i follow. Could you explain where SystemTap has this 
difficulty?

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:43     ` Ingo Molnar
@ 2006-09-14 18:25       ` Karim Yaghmour
  2006-09-14 20:03       ` Martin Bligh
  1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 18:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin J. Bligh, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais


Ingo Molnar wrote:
> but that's not how a fair chunk of people want to use tracing. People 
> (enterprise customers trying to figure out performance problems, 
> engineers trying to debug things on a live, production system) want to 
> be able to insert a tracepoint anywhere and anytime - and also they want 
> to have zero overhead from tracing if no tracepoints are used on a 
> system.

This is an implementation issue. You can easily have it so that at
the site of a marker you generate some code in a special "trace"
section of the binary which does the actual tracing and insert
noops at the marker site. Therefore the only penalty until the
tracing is enabled is the execution of additional noops.

[ note: this comes from a suggestion made by Hiramatsu-san at
this year's OLS. ]
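
Real noop patching needs architecture-specific code modification, so the
sketch below swaps in a simpler, portable approximation: the marker site
is a predicted-not-taken branch on a flag, and the tracing body lives in
a separate out-of-line function. All names are hypothetical, and the
attribute and builtin used are GCC/Clang extensions:

```c
#include <stdio.h>

/*
 * Hypothetical names throughout. This is a branch on a flag, not real
 * noop patching; it only approximates the "cheap until enabled" idea.
 */
static int marker_enabled;		/* 0 until tracing is switched on */
static unsigned long marker_hits;

/* Out-of-line tracing body, kept away from the hot path. */
static void __attribute__((noinline)) marker_slowpath(int prev, int next)
{
	marker_hits++;
	fprintf(stderr, "sched_change: %d -> %d\n", prev, next);
}

/* Marker site: one flag test, hinted to the compiler as rarely true. */
#define MARK_SCHEDCHANGE(prev, next)				\
	do {							\
		if (__builtin_expect(marker_enabled, 0))	\
			marker_slowpath((prev), (next));	\
	} while (0)

/* Hypothetical stand-in for the context-switch path. */
int switch_task(int prev, int next)
{
	MARK_SCHEDCHANGE(prev, next);
	return next;
}
```

While marker_enabled is 0 the only cost at the marker site is the flag
test; setting it to 1 routes every switch_task() call through the
out-of-line body.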

> wrong: the original demo tracepoints that came with SystemTap still work 
> on the current kernel, because the 'coupling' is loose: based on 
> function names.
> 
> Static tracepoints on the other hand, if added via an external patch, do 
> depend on the target function not moving around and the context of the 
> tracepoint not being changed. (and static tracepoints if in the source 
> all the time are a constant hindrance to development and code 
> readability.)

Instrumentation of function boundaries is usually not much of an issue.
Instrumentation of key events, though, is different. Here's the classic:
@@ -1709,6 +1712,7 @@ switch_tasks:
   		++*switch_count;

   		prepare_arch_switch(rq, next);
+		TRACE_SCHEDCHANGE(prev, next);
   		prev = context_switch(rq, prev, next);
   		barrier();

This is the kind of thing for which the instrumentation, be it static
or dynamic, requires some kind of intelligent analysis of where to
get the info. Now, answer honestly, wouldn't it be simpler to have
such an event marker instead of having to figure out for every kernel
binary you get where the darned probe needs to be inserted?

Karim

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 17:43     ` Ingo Molnar
  2006-09-14 18:25       ` Karim Yaghmour
@ 2006-09-14 20:03       ` Martin Bligh
  2006-09-14 20:14         ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 20:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> * Martin J. Bligh <mbligh@mbligh.org> wrote:
> 
> 
>>>>Comments and reviews are very welcome.
>>>
>>>i have one very fundamental question: why should we do this 
>>>source-intrusive method of adding tracepoints instead of the 
>>>dynamic, unintrusive (and thus zero-overhead) KProbes+SystemTap 
>>>method?
>>
>>Because:
>>
>>1. Kprobes are more overhead when they *are* being used.
> 
> 
> minimally so - at least on i386 and x86_64. In that sense tracing is a 
> _slowpath_, and it _will_ slow things down if done excessively. I dont 
> care about the tracepoint being slower by a few instructions as long as 
> it has _zero effect_ on normal code, be that source code or binary code.

Would be interesting to see some measurements. But jumping is slower
than a simple branch (or noops to skip over that can be overwritten).

>>2. You can get zero overhead by CONFIG'ing things out.
> 
> but that's not how a fair chunk of people want to use tracing. People 
> (enterprise customers trying to figure out performance problems, 
> engineers trying to debug things on a live, production system) want to 
> be able to insert a tracepoint anywhere and anytime - and also they want 
> to have zero overhead from tracing if no tracepoints are used on a 
> system.

I'm fine with that ... "a fair chunk of people" - but it's not everyone,
by any means. We need both static and dynamic tracepoints, in one
infrastructure.

>>3. (most importantly) it's a bitch to maintain tracepoints out
>>   of-tree on a rapidly moving kernel
> 
> wrong: the original demo tracepoints that came with SystemTap still work 
> on the current kernel, because the 'coupling' is loose: based on 
> function names.

And what do those trace? I bet not half the stuff we want to do.
I've been migrating Google's tracepoints around between different
kernel versions, and it's not a mechanical port. Just stupid things
like renaming of functions inside memory reclaim creates pain, for
starters. (shrink_cache/shrink_list, refill_inactive_zone, etc).

> Static tracepoints on the other hand, if added via an external patch, do 
> depend on the target function not moving around and the context of the 
> tracepoint not being changed. (and static tracepoints if in the source 
> all the time are a constant hindrance to development and code 
> readability.)

an external patch is, indeed, pretty useless. Merging a few simple
tracepoints should not be a problem - see blktrace and schedstats,
for instance.

> and of course the big advantage of dynamic probing is its flexibility: 
> you can add ad-hoc tracepoints to thousands of functions, instead of 
> having to maintain hundreds (or thousands) of static tracepoints all the 
> time. (and if we wont end up with hundreds/thousands of static 
> tracepoints then it wont be usable enough as a generic solution.)

I wasn't saying that dynamic tracepoints are useless - I agree it's
valuable to add stuff on the fly. But some things are better done
statically.

>>4. I believe kprobes still doesn't have full access to local 
>>variables. 
> 
> wrong: with SystemTap you can probe local variables too (via 
> jprobes/kretprobes, all in the upstream kernel already).

I'll look again, but last time I looked it didn't do this, and
when I spoke to the kprobes/systemtap people at OLS, IIRC they
said it still couldn't.

>>Now (3) is possibly solvable by putting the points in as no-ops 
>>(either insert a few nops or just a marker entry in the symbol 
>>table?), but full dynamic just isn't sustainable. [...]
> 
> i'm not sure i follow. Could you explain where SystemTap has this 
> difficulty?

If you have an extremely limited set of probes, on a static area
of the kernel, then yes, they may work for a long time. But try
tracing something like the scheduler, which people seem to delight
in rewriting every month or two ...

It amuses me that we're so opposed to external patches to the tree
(for perfectly understandable reasons), but we somehow think tracepoints
are magically different and should be maintained out of tree somehow.
You yourself made the argument that it's a maintenance burden to
keep the trace points *in* the tree ... if that's true, how is it
any easier to keep them outside of the tree?

If we really want to, we can still keep the hooks inside the code,
and have them do absolutely nothing at all - putting markers into
the symbol table is pretty much free. It also reuses the well
structured code-sharing mechanism we already have in place - the
linux kernel tree.

I really don't want to deal with all the systemtap crap - I just
want something that works, and I don't particularly care if I have
to recompile the kernel to get it. I know that doesn't suit everyone,
but there are requirements on both sides, and we should not dismiss
each other's requirements out of hand.

Having one consistent collection mechanism for all these
different types of tracing data seems both logical and very important
to me ...

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:03       ` Martin Bligh
@ 2006-09-14 20:14         ` Ingo Molnar
  2006-09-14 20:40           ` Martin Bligh
  2006-09-14 21:05           ` Michel Dagenais
  0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 20:14 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais, fche

* Martin Bligh <mbligh@mbligh.org> wrote:

> an external patch is, indeed, pretty useless. Merging a few simple 
> tracepoints should not be a problem [...]

the problem is, LTT is not about a 'few' tracepoints: it adds a whopping 
350 tracepoints, a fair portion of it is multi-line with tons of 
arguments.

 $ diffstat patch-2.6.17-lttng-0.5.108-instrumentation*
 98 files changed, 1450 insertions(+), 64 deletions(-)

saying "it's just a few lightweight tracepoints" misses two points: it's 
not just a few, and it's not lightweight.

and the set of tracepoints never gets smaller. People who start to rely 
on a tracepoint will scream bloody murder if it goes away or breaks. 
Static tracepoints are a maintenance PITA that will rarely get smaller, 
and will easily grow ...

> [...] - see blktrace and schedstats, for instance.

yes, i do want to remove the 34 schedstats tracepoints too, once a 
feasible alternative is present. I already have to do two compilations 
when changing something substantial in the scheduler - once with and 
once without schedstats.

same for blktrace: once SystemTap can provide a compatible replacement, 
it should.

> It amuses me that we're so opposed to external patches to the tree 
> (for perfectly understandable reasons), but we somehow think 
> tracepoints are magically different and should be maintained out of 
> tree somehow.

i think you misunderstood what i meant. SystemTap should very much be 
integrated into the kernel proper, but i dont think the _rules_ (and 
scripts) should become part of the _source code files themselves_. So 
yes, there's advantage to kernel integration, but there's disadvantage 
to littering the kernel source with countless static tracepoints, if 
dynamic tracepoints can offer the same benefits (or more).

the question is: what is more maintenance, hundreds of static 
tracepoints (with long parameter lists) all around the (core) kernel, or 
hundreds of detached dynamic rules that need an update every now and 
then? [but of which most would still be usable even if some of them 
"broke"] To me the answer is clear: having hundreds of tracepoints 
_within_ the source code is higher cost. But please prove me wrong :-)

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:14         ` Ingo Molnar
@ 2006-09-14 20:40           ` Martin Bligh
  2006-09-14 21:05           ` Michel Dagenais
  1 sibling, 0 replies; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 20:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais, fche

Ingo Molnar wrote:
> * Martin Bligh <mbligh@mbligh.org> wrote:
> 
> 
>>an external patch is, indeed, pretty useless. Merging a few simple 
>>tracepoints should not be a problem [...]
> 
> 
> the problem is, LTT is not about a 'few' tracepoints: it adds a whopping 
> 350 tracepoints, a fair portion of it is multi-line with tons of 
> arguments.

"static tracepoints" does not equate directly to "all of LTT". I'm not
saying we should accept LTT as-is. I'm saying we should not reject the
concept of static tracepoints.

>  $ diffstat patch-2.6.17-lttng-0.5.108-instrumentation*
>  98 files changed, 1450 insertions(+), 64 deletions(-)
> 
> saying "it's just a few lightweight tracepoints" misses two points: it's 
> not just a few, and it's not lightweight.
> 
> and the set of tracepoints never gets smaller. People who start to rely 
> on a tracepoint will scream bloody murder if it goes away or breaks. 
> Static tracepoints are a maintenance PITA that will rarely get smaller, 
> and will easily grow ...

If people are *using* them, it's no easier to maintain them outside of
tree than in-tree. It's significantly harder.

>>[...] - see blktrace and schedstats, for instance.
> 
> yes, i do want to remove the 34 schedstats tracepoints too, once a 
> feasible alternative is present. I already have to do two compilations 
> when changing something substantial in the scheduler - once with and 
> once without schedstats.
> 
> same for blktrace: once SystemTap can provide a compatible replacement, 
> it should.

Your argument about schedstats only seems to illustrate the flaws in the
arguments for dynamic tracepointing - you've put your finger on exactly
what the problem is, when the code changes, the tracing HAS to change
too. The best time to do this is when the code itself changes.

It's the same argument for putting documentation in the C file against
the source itself.

>>It amuses me that we're so opposed to external patches to the tree 
>>(for perfectly understandable reasons), but we somehow think 
>>tracepoints are magically different and should be maintained out of 
>>tree somehow.
> 
> i think you misunderstood what i meant. SystemTap should very much be 
> integrated into the kernel proper, but i dont think the _rules_ (and 
> scripts) should become part of the _source code files themselves_. So 
> yes, there's advantage to kernel integration, but there's disadvantage 
> to littering the kernel source with countless static tracepoints, if 
> dynamic tracepoints can offer the same benefits (or more).

If you're talking about the scriptable awk-like "stuff" that comes with
Systemtap, yes I agree it should not be in the C code, it's foul.
However, I don't think simple macro hooks are a burden.

> the question is: what is more maintenance, hundreds of static 
> tracepoints (with long parameter lists) all around the (core) kernel, or 
> hundreds of detached dynamic rules that need an update every now and 
> then? [but of which most would still be usable even if some of them 
> "broke"] To me the answer is clear: having hundreds of tracepoints 
> _within_ the source code is higher cost. But please prove me wrong :-)

How can you possibly say that maintaining the same set of data in two
dis-coupled trees is easier than doing it in the same place? You don't
require any *less* information to do it with systemtap than you do with
some form of static tracing.

If you're talking about the effort of maintaining just what's in the
kernel tree, then of course it's a little easier, but that's only half
the equation. And I don't think it's much of a burden, frankly. Yes,
if we have 2 billion tracepoints, it'll be a pain in the arse, but the
taste of the subsystem maintainers is what would regulate this, along
with everything else that we do. They'll accept a few important ones,
and reject the rest. If it's not valuable in general, they won't take
it. I don't see what the big problem is.

What *is* a problem is having two separate mechanisms for doing
dynamic and static tracing. They should share the same logging 
facilities and readback mechanisms so we can read both types
consistently from userspace, and the data is correctly interspersed.
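
A rough sketch of what a shared logging path could look like, assuming
a single in-memory ring buffer and one global sequence counter. Every
name here is hypothetical, and the code ignores locking and per-CPU
buffers, which a real tracer would need:

```c
#include <stddef.h>

/* Hypothetical record layout shared by both kinds of trace point. */
enum trace_source { TRACE_STATIC, TRACE_DYNAMIC };

struct trace_entry {
	unsigned long seq;		/* global order across both sources */
	enum trace_source source;
	int event_id;
	long arg;
};

#define TRACE_BUF_ENTRIES 8
static struct trace_entry trace_buf[TRACE_BUF_ENTRIES];
static unsigned long trace_seq;

/* Single entry point used by static and dynamic probes alike. */
void trace_log(enum trace_source source, int event_id, long arg)
{
	struct trace_entry *e = &trace_buf[trace_seq % TRACE_BUF_ENTRIES];

	e->seq = trace_seq++;
	e->source = source;
	e->event_id = event_id;
	e->arg = arg;
}

/* Copy out retained entries, oldest first; returns how many were copied. */
size_t trace_read(struct trace_entry *out, size_t max)
{
	unsigned long s;
	size_t n = 0;

	s = trace_seq > TRACE_BUF_ENTRIES ? trace_seq - TRACE_BUF_ENTRIES : 0;
	for (; s < trace_seq && n < max; s++)
		out[n++] = trace_buf[s % TRACE_BUF_ENTRIES];
	return n;
}
```

Because both kinds of probe stamp their records from the same counter,
a reader gets static and dynamic events back already interleaved in the
order they were logged.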

M.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 20:14         ` Ingo Molnar
  2006-09-14 20:40           ` Martin Bligh
@ 2006-09-14 21:05           ` Michel Dagenais
  2006-09-14 22:23             ` Ingo Molnar
  1 sibling, 1 reply; 271+ messages in thread
From: Michel Dagenais @ 2006-09-14 21:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, fche


> the question is: what is more maintenance, hundreds of static 
> tracepoints (with long parameter lists) all around the (core) kernel, or 
> hundreds of detached dynamic rules that need an update every now and 
> then? [but of which most would still be usable even if some of them 
> "broke"] To me the answer is clear: having hundreds of tracepoints 
> _within_ the source code is higher cost. But please prove me wrong :-)

Actually I rarely find that any of the 70 000 printk is such a huge
nuisance to code readability. They may even help understand what is
going on in a code area you are less familiar with.


^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 21:05           ` Michel Dagenais
@ 2006-09-14 22:23             ` Ingo Molnar
  2006-09-14 22:46               ` Martin Bligh
  0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:23 UTC (permalink / raw)
  To: Michel Dagenais
  Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, fche


* Michel Dagenais <michel.dagenais@polymtl.ca> wrote:

> > the question is: what is more maintenance, hundreds of static 
> > tracepoints (with long parameter lists) all around the (core) kernel, or 
> > hundreds of detached dynamic rules that need an update every now and 
> > then? [but of which most would still be usable even if some of them 
> > "broke"] To me the answer is clear: having hundreds of tracepoints 
> > _within_ the source code is higher cost. But please prove me wrong :-)
> 
> Actually I rarely find that any of the 70 000 printk is such a huge 
> nuisance to code readability. They may even help understand what is 
> going on in a code area you are less familiar with.

i disagree. Consider the following example from LTT:

 int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
         struct kiocb iocb;
         struct sock_iocb siocb;
         int ret;

         trace_socket_sendmsg(sock, sock->sk->sk_family,
                 sock->sk->sk_type,
                 sock->sk->sk_protocol,
                 size);

         init_sync_kiocb(&iocb, NULL);
         iocb.private = &siocb;
         ret = __sock_sendmsg(&iocb, sock, msg, size);
         if (-EIOCBQUEUED == ret)
                 ret = wait_on_sync_kiocb(&iocb);
         return ret;
 }

what do the 5 extra lines introduced by trace_socket_sendmsg() tell us? 
Nothing. They mostly just duplicate the information i already have from 
the function declaration. They obscure the clear view of the function:

 int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
         struct kiocb iocb;
         struct sock_iocb siocb;
         int ret;

         init_sync_kiocb(&iocb, NULL);
         iocb.private = &siocb;
         ret = __sock_sendmsg(&iocb, sock, msg, size);
         if (-EIOCBQUEUED == ret)
                 ret = wait_on_sync_kiocb(&iocb);
         return ret;
 }

the resulting visual and structural redundancy hurts.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:23             ` Ingo Molnar
@ 2006-09-14 22:46               ` Martin Bligh
  2006-09-14 22:56                 ` Ingo Molnar
  0 siblings, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 22:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michel Dagenais, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, fche

> i disagree. Consider the following example from LTT:
...
>          trace_socket_sendmsg(sock, sock->sk->sk_family,
>                  sock->sk->sk_type,
>                  sock->sk->sk_protocol,
>                  size);
...

> what do the 5 extra lines introduced by trace_socket_sendmsg() tell us? 
> Nothing. They mostly just duplicate the information i already have from 
> the function declaration. They obscure the clear view of the function:
  ...
> the resulting visual and structural redundancy hurts.

Couldn't that be easily fixed by just doing

	trace_socket_sendmsg(sock, size);

and have it work out which esoteric parts of the sock we want to trace,
and which we don't? Is much less visually invasive, and gives the same
effect.
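
Something along these lines, as a sketch only: the structures below are
cut-down, hypothetical stand-ins for the kernel's struct socket and
struct sock, and trace_record_sendmsg() is an invented backend, not an
LTT function:

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; only the fields used here. */
struct sock {
	int sk_family;
	int sk_type;
	int sk_protocol;
};

struct socket {
	struct sock *sk;
};

/* Hypothetical record handed to the tracing backend. */
struct sendmsg_record {
	int family;
	int type;
	int protocol;
	size_t size;
};

static struct sendmsg_record last_record;

/* Hypothetical backend: here it just remembers the last record. */
static void trace_record_sendmsg(const struct sendmsg_record *rec)
{
	last_record = *rec;
}

/* The wrapper picks the interesting fields, so call sites stay short. */
static inline void trace_socket_sendmsg(const struct socket *sock, size_t size)
{
	struct sendmsg_record rec = {
		.family = sock->sk->sk_family,
		.type = sock->sk->sk_type,
		.protocol = sock->sk->sk_protocol,
		.size = size,
	};

	trace_record_sendmsg(&rec);
}

/* Call site: one short line instead of five. */
int demo_sendmsg(struct socket *sock, size_t size)
{
	trace_socket_sendmsg(sock, size);
	return (int)size;
}
```

The choice of fields then lives in one place next to the tracer, and
changing what gets recorded does not touch the call sites.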

M.



^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 22:46               ` Martin Bligh
@ 2006-09-14 22:56                 ` Ingo Molnar
  0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:56 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Michel Dagenais, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, fche


* Martin Bligh <mbligh@mbligh.org> wrote:

> >i disagree. Consider the following example from LTT:
> ...
> >         trace_socket_sendmsg(sock, sock->sk->sk_family,
> >                 sock->sk->sk_type,
> >                 sock->sk->sk_protocol,
> >                 size);
> ...
> 
> >what do the 5 extra lines introduced by trace_socket_sendmsg() tell us? 
> >Nothing. They mostly just duplicate the information i already have from 
> >the function declaration. They obscure the clear view of the function:
>  ...
> >the resulting visual and structural redundancy hurts.
> 
> Couldn't that be easily fixed by just doing
> 
> 	trace_socket_sendmsg(sock, size);
> 
> and have it work out which esoteric parts of the sock we want to 
> trace, and which we don't? Is much less visually invasive, and gives 
> the same effect.

yeah, visual impact is everything. The best that Frank and I came up 
with is:

	_(socket_sendmsg, sock, size);

we could quickly learn to visually skip over lines like that, they have 
a pretty unique geometric form. While if it's called:

	trace_socket_sendmsg(sock, size);

it always looks like a function call in the corner of the eye and 
attracts attention.

the '_()' macro is defined as:

	#define _(x,y,z) STAP_MARK(x,y,z)

(STAP_MARK is an existing SystemTap helper to insert static tracepoints 
into the kernel.)
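
To show how the alias expands, here is a self-contained sketch. The
STAP_MARK definition in it is a hypothetical stand-in that only counts
and prints; it is not SystemTap's real definition, and mark_hit() and
demo_send() are invented for the illustration:

```c
#include <stdio.h>
#include <stddef.h>

static unsigned long mark_count;
static unsigned long last_mark_size;

/* Hypothetical backend: count the hit and print the marker name. */
static void mark_hit(const char *name, const void *obj, unsigned long size)
{
	mark_count++;
	last_mark_size = size;
	fprintf(stderr, "mark %s obj=%p size=%lu\n", name, obj, size);
}

/*
 * Stand-in for STAP_MARK, for illustration only. The real SystemTap
 * helper is defined differently.
 */
#define STAP_MARK(name, a, b) \
	mark_hit(#name, (const void *)(a), (unsigned long)(b))

/* The short alias from the mail above. */
#define _(x, y, z) STAP_MARK(x, y, z)

/* Hypothetical call site. */
int demo_send(const void *sock, size_t size)
{
	_(socket_sendmsg, sock, size);
	return (int)size;
}
```

The first argument is never evaluated as an expression: it is turned
into the marker name by the preprocessor, so socket_sendmsg does not
have to be a declared symbol.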

but the other property of dynamic tracing is still very important too: 
we have the technological freedom to remove static tracepoints, if we 
decide so. With static tracers, once they are in the tree, we are stuck 
with these APIs.

	Ingo

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:14   ` Martin J. Bligh
  2006-09-14 17:43     ` Ingo Molnar
@ 2006-09-14 19:03     ` grundig
  2006-09-14 19:21       ` Karim Yaghmour
  2006-09-14 19:48     ` Frank Ch. Eigler
  2 siblings, 1 reply; 271+ messages in thread
From: grundig @ 2006-09-14 19:03 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: mingo, mathieu.desnoyers, linux-kernel, hch, akpm, mingo, gregkh,
	tglx, zanussi, ltt-dev, michel.dagenais

On Thu, 14 Sep 2006 08:14:19 -0700,
"Martin J. Bligh" <mbligh@mbligh.org> wrote:

> 2. You can get zero overhead by CONFIG'ing things out.

IOW, no distro will enable it by default to avoid the overhead,
making it useless for lots of real-world working systems where
you need to guess what's happening to software running real
workloads that can't just be stopped.

I guess there's no problem in having both LTT and Kprobes merged in 
the main tree at the same time. But Kprobes + systemtap will get
enabled and used by distros massively just because users can start
using it immediately, without recompiling or installing extra
kernels and rebooting. There are cases where distros may want to
enable automatic tracing at every boot, and only at boot, but
don't want to suffer an extra performance hit after booting...

I don't mean that LTT sucks, has no advantages, or doesn't deserve
to be merged/used; it just looks like kprobes+systemtap will get
way more real-world users no matter how much you discuss here.

^ permalink raw reply	[flat|nested] 271+ messages in thread

* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 19:03     ` grundig
@ 2006-09-14 19:21       ` Karim Yaghmour
  0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 19:21 UTC (permalink / raw)
  To: grundig
  Cc: Martin J. Bligh, mingo, mathieu.desnoyers, linux-kernel, hch,
	akpm, mingo, gregkh, tglx, zanussi, ltt-dev, michel.dagenais


grundig wrote:
> IOW, no distro will enable it by default to avoid the overhead,

Please bear in mind that this is an implementation issue. As I've
explained elsewhere, there are ways to implement this where even
compiled-in static tracepoints have practically no cost at all
-- being noops until enabled. There is thereby no justification for
not actually shipping kernels built this way and, therefore,
no reason why tools such as ltt can't see real-world usage.

Karim
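
As an illustration of the "noop until enabled" idea discussed above,
here is a minimal user-space sketch in C. It is not the mechanism LTT
or any kernel patch uses: the names trace_enabled, trace_record and
TRACE_POINT are hypothetical, and a plain flag test stands in for
whatever cheaper technique (such as patching a nop) a real
implementation might choose. The disabled path costs one predictable
branch here, rather than literally nothing.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical global switch. A real kernel implementation would more
 * likely patch the instruction stream than test a variable. */
static volatile bool trace_enabled = false;

/* Counts how many events were actually recorded. */
static unsigned long trace_hits = 0;

/* Hypothetical probe: records one event. */
static void trace_record(const char *name, long value)
{
    trace_hits++;
    printf("trace: %s=%ld\n", name, value);
}

/* The tracepoint itself: always compiled in, but it only calls the
 * probe when tracing has been switched on at run time. */
#define TRACE_POINT(name, value)            \
    do {                                    \
        if (trace_enabled)                  \
            trace_record((name), (value));  \
    } while (0)

/* Example of an instrumented function. */
static long do_work(long x)
{
    TRACE_POINT("do_work_entry", x);
    return x * 2;
}
```

Calling do_work() with trace_enabled left false records nothing;
setting trace_enabled to true makes the same binary start recording,
with no rebuild or reboot.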



* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 15:14   ` Martin J. Bligh
  2006-09-14 17:43     ` Ingo Molnar
  2006-09-14 19:03     ` grundig
@ 2006-09-14 19:48     ` Frank Ch. Eigler
  2006-09-15 16:32       ` Jose R. Santos
  2 siblings, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-14 19:48 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
	Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
	Tom Zanussi, ltt-dev, Michel Dagenais

"Martin J. Bligh" <mbligh@mbligh.org> writes:

> [...] What would be really nice is one trace infrastructure, that
> allowed both static and dynamic tracepoints

We in systemtap land hope to encounter *some* static tracepoint
structure, perhaps like the one I presented at OLS, via which
systemtap could become your unified static+dynamic "infrastructure".
Even in that universe, using LTT-derived code for high-performance
tracing is within the realm of reason.

> without all the awk-style language crap that seems to come with
> systemtap.

I'm sorry to hear you dislike the scripting language.  But that's
okay, you Real Men can embed literal C code inside systemtap scripts
to do the Real Work, and leave to systemtap only sundry duties such as
probe placement and removal.

- FChE


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 19:48     ` Frank Ch. Eigler
@ 2006-09-15 16:32       ` Jose R. Santos
  0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 16:32 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Frank Ch. Eigler, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
	Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
	Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Frank Ch. Eigler wrote:
> "Martin J. Bligh" <mbligh@mbligh.org> writes:
>
> > without all the awk-style language crap that seems to come with
> > systemtap.
>
> I'm sorry to hear you dislike the scripting language.  But that's
> okay, you Real Men can embed literal C code inside systemtap scripts
> to do the Real Work, and leave to systemtap only sundry duties such as
> probe placement and removal.
>   

There are also a couple of projects within SystemTap that provide
trace-like functionality without the need to use the SystemTap language.  In
the case of LKET, we've tried to make this as simple as possible by
predefining probe points using the SystemTap language and embedded C
code, but from a user's perspective all he really needs to do is
invoke a simple script like:

#! stap
process_snapshot() {}
addevent.tskdispatch.cpuidle {}
addevent.process {}
addevent.syscall.entry { printf ("%4b", $flags) }
addevent.syscall.exit {}
addevent.tskdispatch.cpuidle {}

The data can later be analysed in user-space with whatever method you
like.  The developer instrumenting the probe point needs to know the
SystemTap language, but the user of the trace just needs to know which
events are available to him.

We also plan to do static tracing once SystemTap supports static markers.  This may not be the perfect solution, but I'm interested in knowing how we can get there.

-JRS


* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
  2006-09-14 11:27 ` Ingo Molnar
                     ` (2 preceding siblings ...)
  2006-09-14 15:14   ` Martin J. Bligh
@ 2006-09-19 11:59   ` Christoph Hellwig
  3 siblings, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 11:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
	Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
	ltt-dev, Michel Dagenais

On Thu, Sep 14, 2006 at 01:27:18PM +0200, Ingo Molnar wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > Following an advice Christoph gave me this summer, submitting a 
> > smaller, easier to review patch should make everybody happier. Here is 
> > a stripped down version of LTTng : I removed everything that would 
> > make the code review reluctant (especially kernel instrumentation and 
> > kernel state dump module). I plan to release this "core" version every 
> > few LTTng releases and post it to LKML.
> > 
> > Comments and reviews are very welcome.
> 
> i have one very fundamental question: why should we do this 
> source-intrusive method of adding tracepoints instead of the dynamic, 
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?

Coming a little late to this thread because I've been travelling for the
last three weeks, I'll answer here before wading through hundreds of mails.

I'll categorize tracing methods into a few categories:

  a) static and in-line

     These are tracepoints directly in the kernel source, always compiled
     in (or under a CONFIG option).  We have various ad-hoc tracers of
     this type already in the kernel, e.g. blktrace or xfs's ktrace

  b) dynamic and in-line (markers)

     These are in-line but normally don't do anything in the code except
     maybe adding a nop.  We currently don't support this at all.

  c) dynamic and out-of-line

     These are maintained as external modules or things that need to be
     translated to modules.  We have various low-level mechanisms to
     implement the hooking up of those currently (*probes) but no other
     infrastructure in the kernel to help with those.  There's an external
     project, systemtap, which supports probes like those but has a bunch
     of problems:

       - it doesn't allow writing scripts in C but only in some odd scripting
	 language
       - it doesn't actually put support code into the kernel tree but keeps
	 it separate, so probes can't be kept with the kernel either.
	 In addition it also needs quite frequent updates because it has to
	 poke deep into kernel internals by its nature.
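
To make category a) above concrete, here is a minimal sketch in C of a
tracepoint that sits directly in the source under a build option. It is
an illustration only, not code from blktrace, ktrace or LTT:
CONFIG_DEMO_TRACE, DEMO_TRACE and demo_submit are hypothetical names,
and a preprocessor define stands in for a kernel CONFIG option.

```c
#include <stdio.h>

/* Hypothetical build switch standing in for a kernel CONFIG option.
 * Build with -DCONFIG_DEMO_TRACE to compile the tracepoints in. */
#ifdef CONFIG_DEMO_TRACE
static unsigned long demo_trace_count;
#define DEMO_TRACE(fmt, ...)                              \
    do {                                                  \
        demo_trace_count++;                               \
        fprintf(stderr, "trace: " fmt "\n", __VA_ARGS__); \
    } while (0)
#else
/* Configured out: the tracepoint expands to an empty statement, so no
 * code is generated and its arguments are never evaluated. */
#define DEMO_TRACE(fmt, ...) do { } while (0)
#endif

/* Example of an instrumented function; it behaves the same either way. */
static int demo_submit(int sector)
{
    DEMO_TRACE("submit sector=%d", sector);
    return sector + 1;
}
```

The trade-off is the one raised earlier in the thread: a tracepoint
that is configured out costs nothing, but turning it on requires a
rebuilt kernel.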

So what's the right way of tracing for us?   I'd say pretty clearly all three,
and most importantly we need to have a common infrastructure for all of those.

The most important bit we need right now is a reliable framework to transfer
trace data to userspace - once we have that we support a) and a subset of
b) above.  LTT might be that missing bit, but I'd need to look at the actual
patches to see if it's suitable.  b) is something people have talked about
a lot and we've seen lots of prototypes; in my eyes it's the second priority.
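
As a rough picture of what the core of such a transfer framework has to
do, here is a minimal single-producer, single-consumer ring buffer in C.
It is a sketch under simplifying assumptions and not LTT's design: all
names (trace_event, trace_ring, ring_write, ring_read, RB_SLOTS) are
hypothetical, events are fixed-size, a full buffer drops new events and
counts them, and there is no locking, no memory barriers and no
per-CPU handling, all of which a real kernel facility would need.

```c
#include <stddef.h>
#include <stdint.h>

#define RB_SLOTS 8u  /* number of slots; must be a power of two */

struct trace_event {
    uint32_t id;       /* which tracepoint fired */
    uint64_t payload;  /* one event-specific value */
};

struct trace_ring {
    struct trace_event slot[RB_SLOTS];
    unsigned int head;  /* total events written so far */
    unsigned int tail;  /* total events read so far */
    unsigned int lost;  /* events dropped because the ring was full */
};

/* Producer side (the tracepoint). Returns 1 if the event was stored,
 * 0 if the ring was full and the event was dropped. */
static int ring_write(struct trace_ring *r, uint32_t id, uint64_t payload)
{
    if (r->head - r->tail == RB_SLOTS) {
        r->lost++;
        return 0;
    }
    r->slot[r->head % RB_SLOTS] = (struct trace_event){ id, payload };
    r->head++;
    return 1;
}

/* Consumer side (the reader draining to userspace). Copies up to max
 * events into out, oldest first, and returns how many were copied. */
static size_t ring_read(struct trace_ring *r, struct trace_event *out,
                        size_t max)
{
    size_t n = 0;

    while (n < max && r->tail != r->head) {
        out[n++] = r->slot[r->tail % RB_SLOTS];
        r->tail++;
    }
    return n;
}
```

Counting dropped events rather than silently overwriting is one
possible policy; a flight-recorder style buffer would overwrite the
oldest entries instead.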

But even after that the way we support c) is very rudimentary - we need
helpers to look at data and put probes at points outside of function
entry/return, we need things like a dwarf parser, and so on.

I think the systemtap approach of the external package is the very last
thing we need.  Unlike what you said elsewhere, having the tracepoints
externally does not eliminate maintenance overhead - it shifts it to someone
else.  Shifting maintenance overhead to someone else is a valid concept in
linux kernel development, we do this all the time for things we don't care
about.  I think it's fundamentally wrong for traces, though.  Traces are
very important for debugging complex problems, and I've grown very tired
of maintaining all my ad-hoc scripts.  Having them in the kernel tree,
or having traces that are static by nature in-line, would allow and force
kernel developers to always keep them up to date with the kernel's changes.


end of thread, other threads:[~2006-09-25 15:47 UTC | newest]

Thread overview: 271+ messages
2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
2006-09-15 18:32 ` Alan Cox
2006-09-16 10:46 ` Jes Sorensen
  -- strict thread matches above, loose matches on Subject: below --
2006-09-25 15:20 Chuck Ebbert
2006-09-25 15:39 ` Ingo Molnar
2006-09-15  9:17 Richard J Moore
2006-09-15  3:10 James Dickens
2006-09-14  3:38 Mathieu Desnoyers
2006-09-14 11:27 ` Ingo Molnar
2006-09-14 13:40   ` Roman Zippel
2006-09-14 13:55     ` Ingo Molnar
2006-09-14 14:33       ` Roman Zippel
2006-09-14 15:26         ` Michel Dagenais
2006-09-14 17:48           ` Ingo Molnar
2006-09-15 15:04             ` Mathieu Desnoyers
2006-09-14 18:08           ` Nick Piggin
2006-09-14 18:38             ` Karim Yaghmour
2006-09-14 17:13         ` Ingo Molnar
2006-09-14 17:55           ` Roman Zippel
2006-09-14 18:15             ` Ingo Molnar
2006-09-14 18:35               ` Mathieu Desnoyers
2006-09-14 18:54               ` Karim Yaghmour
2006-09-15  9:20                 ` Jes Sorensen
2006-09-15 12:38                   ` Karim Yaghmour
2006-09-15 12:32                     ` Jes Sorensen
2006-09-15 14:09                       ` Karim Yaghmour
2006-09-15 14:30                         ` Jes Sorensen
2006-09-15 15:12                           ` Karim Yaghmour
2006-09-16 10:41                             ` Jes Sorensen
2006-09-16 15:28                               ` Karim Yaghmour
2006-09-18  8:57                                 ` Jes Sorensen
2006-09-18 14:48                                   ` Ingo Molnar
2006-09-18 15:37                                     ` Karim Yaghmour
2006-09-15 13:20                     ` Paul Mundt
2006-09-15 13:41                       ` Roman Zippel
2006-09-15 13:44                         ` Jes Sorensen
2006-09-15 14:03                           ` Roman Zippel
2006-09-15 14:37                             ` Alan Cox
2006-09-15 14:34                               ` Roman Zippel
2006-09-15 13:57                         ` Paul Mundt
2006-09-15 14:17                           ` Karim Yaghmour
2006-09-15 14:13                             ` Jes Sorensen
2006-09-15 14:31                               ` Karim Yaghmour
2006-09-15 14:28                                 ` Paul Mundt
2006-09-15 14:46                                   ` Martin J. Bligh
2006-09-15 15:22                                     ` Alan Cox
2006-09-15 15:47                                       ` Martin J. Bligh
2006-09-15 14:51                                   ` Karim Yaghmour
2006-09-15 15:00                                     ` Thomas Gleixner
2006-09-15 15:28                                       ` Karim Yaghmour
2006-09-15 18:16                                       ` Andrew Morton
2006-09-15 18:19                                         ` Ingo Molnar
2006-09-15 19:26                                           ` Karim Yaghmour
2006-09-15 19:43                                           ` Roman Zippel
2006-09-15 20:05                                             ` Ingo Molnar
2006-09-15 20:22                                               ` Mathieu Desnoyers
2006-09-15 21:08                                                 ` Jose R. Santos
2006-09-15 21:25                                                   ` Mathieu Desnoyers
2006-09-15 22:02                                                     ` Jose R. Santos
2006-09-15 22:03                                                   ` Ingo Molnar
2006-09-15 22:32                                                     ` Karim Yaghmour
2006-09-15 22:43                                                       ` Ingo Molnar
2006-09-15 23:33                                                         ` Karim Yaghmour
2006-09-15 23:52                                                           ` Ingo Molnar
2006-09-16  2:24                                                             ` Karim Yaghmour
2006-09-15 23:53                                                           ` Ingo Molnar
2006-09-16  2:51                                                             ` Karim Yaghmour
2006-09-15 22:59                                                     ` Frank Ch. Eigler
2006-09-15 23:40                                                       ` Karim Yaghmour
2006-09-15 23:17                                                     ` Jose R. Santos
2006-09-15 21:32                                                 ` Ingo Molnar
2006-09-15 21:58                                                   ` Mathieu Desnoyers
2006-09-15 22:19                                                     ` Ingo Molnar
2006-09-15 22:45                                                       ` Karim Yaghmour
2006-09-16  9:59                                                 ` Jes Sorensen
2006-09-16 17:24                                                   ` Mathieu Desnoyers
2006-09-16 17:35                                                     ` Ingo Molnar
2006-09-16 17:56                                                       ` Mathieu Desnoyers
2006-09-16 19:10                                                         ` Ingo Molnar
2006-09-16 19:37                                                           ` Ingo Molnar
2006-09-17 10:13                                                             ` Frederik Deweerdt
2006-09-17 14:00                                                               ` Ingo Molnar
2006-09-16 19:51                                                           ` Karim Yaghmour
2006-09-16 23:40                                                         ` Ingo Molnar
2006-09-17  5:33                                                           ` Mathieu Desnoyers
2006-09-16 18:11                                                       ` Karim Yaghmour
2006-09-16 17:44                                                         ` Ingo Molnar
2006-09-16 18:15                                                           ` Karim Yaghmour
2006-09-18  8:18                                                             ` Jes Sorensen
2006-09-16 17:55                                                     ` Karim Yaghmour
2006-09-18  8:21                                                       ` Jes Sorensen
2006-09-18  8:33                                                     ` Jes Sorensen
2006-09-18 15:01                                                       ` Mathieu Desnoyers
2006-09-16 17:30                                                   ` Mathieu Desnoyers
2006-09-18  8:15                                                     ` Jes Sorensen
2006-09-18 14:53                                                       ` Mathieu Desnoyers
2006-09-18 15:17                                                         ` Ingo Molnar
2006-09-18 16:54                                                           ` Mathieu Desnoyers
2006-09-15 21:12                                               ` Roman Zippel
2006-09-15 21:08                                                 ` Ingo Molnar
2006-09-15 20:13                                           ` Andrew Morton
2006-09-15 21:49                                             ` Jose R. Santos
2006-09-16 10:19                                             ` Jes Sorensen
2006-09-16 16:05                                               ` Karim Yaghmour
2006-09-17  4:54                                                 ` Ganesan Rajagopal
2006-09-18  8:13                                                 ` Jes Sorensen
2006-09-18 14:46                                                   ` Mathieu Desnoyers
2006-09-18 17:06                                                   ` Martin Bligh
2006-09-20 14:17                                                     ` Jes Sorensen
2006-09-15 19:35                                         ` Thomas Gleixner
2006-09-15 19:40                                           ` Ingo Molnar
2006-09-15 19:56                                           ` Karim Yaghmour
2006-09-15 20:23                                             ` Thomas Gleixner
2006-09-15 20:40                                               ` Roman Zippel
2006-09-15 20:48                                                 ` Ingo Molnar
2006-09-15 21:17                                                   ` Karim Yaghmour
2006-09-15 21:15                                                     ` Ingo Molnar
2006-09-15 21:56                                                       ` Karim Yaghmour
2006-09-15 21:27                                                   ` Roman Zippel
2006-09-15 21:51                                                     ` Ingo Molnar
2006-09-15 22:15                                                       ` Karim Yaghmour
2006-09-15 22:53                                                       ` Roman Zippel
2006-09-15 23:14                                                         ` Ingo Molnar
2006-09-15 23:49                                                           ` Nicholas Miell
2006-09-15 23:57                                                             ` Ingo Molnar
2006-09-16  0:41                                                               ` Nicholas Miell
2006-09-16  0:31                                                           ` Roman Zippel
2006-09-16  8:20                                                             ` Ingo Molnar
2006-09-16  8:21                                                             ` Ingo Molnar
2006-09-16  8:21                                                             ` Ingo Molnar
2006-09-16  8:22                                                             ` Ingo Molnar
2006-09-16 19:58                                                               ` Roman Zippel
2006-09-16 22:50                                                                 ` Ingo Molnar
2006-09-16 23:00                                                                 ` Ingo Molnar
2006-09-17  1:15                                                                   ` Roman Zippel
2006-09-17  8:42                                                                     ` Ingo Molnar
2006-09-17 15:16                                                                       ` Roman Zippel
2006-09-17 15:25                                                                         ` Ingo Molnar
2006-09-17 16:02                                                                           ` Roman Zippel
2006-09-17 16:45                                                                             ` Ingo Molnar
2006-09-17 16:59                                                                             ` Nick Piggin
2006-09-17 17:26                                                                               ` Roman Zippel
2006-09-17 17:56                                                                                 ` Nick Piggin
2006-09-17 18:59                                                                                   ` Roman Zippel
2006-09-17 21:23                                                                                     ` Ingo Molnar
2006-09-17 21:52                                                                                       ` Roman Zippel
2006-09-17 22:27                                                                                         ` Ingo Molnar
2006-09-17 21:40                                                                                     ` Ingo Molnar
2006-09-18  8:43                                                                                     ` Jes Sorensen
2006-09-17 21:32                                                                                   ` Ingo Molnar
2006-09-17 19:23                                                                                 ` Ingo Molnar
2006-09-17 19:45                                                                                   ` Roman Zippel
2006-09-17 20:56                                                                                     ` Ingo Molnar
2006-09-17 21:36                                                                                       ` Roman Zippel
2006-09-17 22:13                                                                                         ` Ingo Molnar
2006-09-16 23:14                                                                 ` Ingo Molnar
2006-09-17 14:19                                                                   ` Frank Ch. Eigler
2006-09-17 15:31                                                                     ` Ingo Molnar
2006-09-17 17:15                                                                       ` Mathieu Desnoyers
     [not found]                                                                   ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
2006-09-17 15:00                                                                     ` Ingo Molnar
2006-09-16  8:23                                                             ` Ingo Molnar
2006-09-16  8:23                                                             ` Ingo Molnar
2006-09-16  8:23                                                             ` Ingo Molnar
2006-09-15 21:05                                               ` Karim Yaghmour
2006-09-15 21:17                                                 ` Thomas Gleixner
2006-09-15 21:31                                                   ` Karim Yaghmour
2006-09-15 20:00                                         ` Mathieu Desnoyers
2006-09-15 20:27                                           ` Jose R. Santos
2006-09-15 20:37                                         ` Alan Cox
2006-09-15 20:26                                           ` Mathieu Desnoyers
2006-09-15 20:51                                           ` Karim Yaghmour
2006-09-17 17:53                                           ` Mathieu Desnoyers
2006-09-15 15:24                                     ` Alan Cox
2006-09-15 15:23                                       ` Karim Yaghmour
2006-09-15 14:39                                 ` Jes Sorensen
2006-09-15 15:04                                   ` Karim Yaghmour
2006-09-14 19:40               ` Tim Bird
2006-09-14 20:00                 ` Ingo Molnar
2006-09-14 20:46                   ` Karim Yaghmour
2006-09-19 12:05                     ` Christoph Hellwig
2006-09-14 21:02                   ` Roman Zippel
2006-09-15 11:40                 ` Alan Cox
2006-09-15 11:46                   ` Roman Zippel
2006-09-15 12:38                     ` Alan Cox
2006-09-15 12:39                       ` Roman Zippel
2006-09-15 13:41                         ` Alan Cox
2006-09-15 13:34                           ` Roman Zippel
2006-09-15 14:41                             ` Alan Cox
2006-09-15 14:35                               ` Karim Yaghmour
2006-09-15 14:58                                 ` Alan Cox
2006-09-15 14:57                                   ` Karim Yaghmour
2006-09-15 17:49                                     ` Andrew Morton
2006-09-15 18:20                                       ` Karim Yaghmour
2006-09-15 17:01                                   ` Tim Bird
2006-09-15 17:08                                   ` Frank Ch. Eigler
2006-09-15 17:57                                     ` Andrew Morton
2006-09-15 18:31                                     ` Alan Cox
2006-09-15 18:12                                       ` Ingo Molnar
2006-09-15 19:10                                         ` Roman Zippel
2006-09-15 19:10                                           ` Ingo Molnar
2006-09-15 20:05                                           ` Thomas Gleixner
2006-09-15 20:35                                             ` Roman Zippel
2006-09-15 21:44                                             ` Tim Bird
2006-09-19 12:29                                           ` Christoph Hellwig
2006-09-19 13:17                                             ` Roman Zippel
2006-09-15 18:24                                       ` Frank Ch. Eigler
2006-09-15 18:23                                         ` Ingo Molnar
2006-09-15 18:18                                   ` Martin Bligh
2006-09-15 18:10                           ` Jose R. Santos
2006-09-15 19:49                             ` Mathieu Desnoyers
2006-09-15 20:54                               ` Jose R. Santos
2006-09-15 21:42                                 ` Karim Yaghmour
2006-09-15 21:46                                 ` Mathieu Desnoyers
2006-09-19 15:05                                   ` Jose R. Santos
2006-09-19 15:30                                     ` Mathieu Desnoyers
2006-09-19 16:39                                       ` Jose R. Santos
2006-09-19 18:03                                         ` Mathieu Desnoyers
2006-09-15 17:45                       ` Andrew Morton
2006-09-15 18:16                         ` Karim Yaghmour
2006-09-15 19:20                           ` Jose R. Santos
2006-09-15 19:59                           ` Andrew Morton
2006-09-15 20:24                             ` Karim Yaghmour
2006-09-15 20:25                               ` Thomas Gleixner
2006-09-14 19:47               ` Roman Zippel
2006-09-14 20:24                 ` Ingo Molnar
2006-09-14 20:54                   ` Roman Zippel
2006-09-14 21:08                     ` Daniel Walker
2006-09-14 21:30                       ` Roman Zippel
2006-09-14 22:15                         ` Ingo Molnar
2006-09-14 23:39                           ` Roman Zippel
2006-09-14 23:43                             ` Ingo Molnar
2006-09-15  0:27                               ` Roman Zippel
2006-09-15  1:47                   ` Mathieu Desnoyers
2006-09-15  5:47                     ` Vara Prasad
2006-09-14 18:12           ` Karim Yaghmour
2006-09-14 20:25           ` Martin Bligh
2006-09-14 20:34             ` Ingo Molnar
2006-09-14 20:55               ` Martin Bligh
2006-09-14 21:31                 ` Ingo Molnar
2006-09-14 22:25                   ` Martin Bligh
2006-09-14 22:36                     ` Ingo Molnar
2006-09-14 22:59                       ` Martin Bligh
2006-09-14 23:19                         ` Ingo Molnar
2006-09-15  0:19                           ` Nicholas Miell
2006-09-15  1:04                           ` Martin J. Bligh
2006-09-15 12:38                             ` Ingo Molnar
2006-09-15  7:00                         ` Vara Prasad
2006-09-15 15:37                       ` Michel Dagenais
2006-09-19 12:08                 ` Christoph Hellwig
2006-09-14 21:07               ` Roman Zippel
2006-09-15  9:29               ` Jes Sorensen
2006-09-14 17:51         ` Karim Yaghmour
2006-09-14 15:19       ` Mathieu Desnoyers
2006-09-14 19:39         ` Frank Ch. Eigler
2006-09-15 17:13         ` Jose R. Santos
2006-09-14 15:02   ` Mathieu Desnoyers
2006-09-14 15:14   ` Martin J. Bligh
2006-09-14 17:43     ` Ingo Molnar
2006-09-14 18:25       ` Karim Yaghmour
2006-09-14 20:03       ` Martin Bligh
2006-09-14 20:14         ` Ingo Molnar
2006-09-14 20:40           ` Martin Bligh
2006-09-14 21:05           ` Michel Dagenais
2006-09-14 22:23             ` Ingo Molnar
2006-09-14 22:46               ` Martin Bligh
2006-09-14 22:56                 ` Ingo Molnar
2006-09-14 19:03     ` grundig
2006-09-14 19:21       ` Karim Yaghmour
2006-09-14 19:48     ` Frank Ch. Eigler
2006-09-15 16:32       ` Jose R. Santos
2006-09-19 11:59   ` Christoph Hellwig
