* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15 17:14 Chuck Ebbert
2006-09-15 18:32 ` Alan Cox
2006-09-16 10:46 ` Jes Sorensen
0 siblings, 2 replies; 271+ messages in thread
From: Chuck Ebbert @ 2006-09-15 17:14 UTC (permalink / raw)
To: Alan Cox
Cc: Greg Kroah-Hartman, linux-kernel, Roman Zippel, Jes Sorensen,
Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
Christoph Hellwig, Andrew Morton, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>
On Fri, 15 Sep 2006 15:37:51 +0100, Alan Cox wrote:
> > $ grep KPROBES arch/*/Kconf*
> > arch/i386/Kconfig:config KPROBES
> > arch/ia64/Kconfig:config KPROBES
> > arch/powerpc/Kconfig:config KPROBES
> > arch/sparc64/Kconfig:config KPROBES
> > arch/x86_64/Kconfig:config KPROBES
>
> Send patches. The fact nobody has them implemented on your platform
> isn't a reason to implement something else, quite the reverse in fact.
Yes, but the point is: until that's done you can't claim kprobes is a
valid tracing tool for everyone.
And things like net/ipv4/tcp_probe.c shouldn't be generally implemented
until every arch is supported.
--
Chuck
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
@ 2006-09-15 18:32 ` Alan Cox
2006-09-16 10:46 ` Jes Sorensen
1 sibling, 0 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 18:32 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-kernel
On Fri, 2006-09-15 at 13:14 -0400, Chuck Ebbert wrote:
> In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>
> > > $ grep KPROBES arch/*/Kconf*
> > > arch/i386/Kconfig:config KPROBES
> > > arch/ia64/Kconfig:config KPROBES
> > > arch/powerpc/Kconfig:config KPROBES
> > > arch/sparc64/Kconfig:config KPROBES
> > > arch/x86_64/Kconfig:config KPROBES
> >
> > Send patches. The fact nobody has them implemented on your platform
> > isn't a reason to implement something else, quite the reverse in fact.
>
> Yes, but the point is: until that's done you can't claim kprobes is a
> valid tracing tool for everyone.
I can however claim that kprobes is what they should be implementing,
not adding new large patches for another infrastructure whose author
has already said that for dynamic stuff it is based on the same things.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
2006-09-15 18:32 ` Alan Cox
@ 2006-09-16 10:46 ` Jes Sorensen
1 sibling, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 10:46 UTC (permalink / raw)
To: Chuck Ebbert
Cc: Alan Cox, Greg Kroah-Hartman, linux-kernel, Roman Zippel,
Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
Christoph Hellwig, Andrew Morton, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
Chuck Ebbert wrote:
> In-Reply-To: <1158331071.29932.63.camel@localhost.localdomain>
>
> On Fri, 15 Sep 2006 15:37:51 +0100, Alan Cox wrote:
>
>>> $ grep KPROBES arch/*/Kconf*
>>> arch/i386/Kconfig:config KPROBES
>>> arch/ia64/Kconfig:config KPROBES
>>> arch/powerpc/Kconfig:config KPROBES
>>> arch/sparc64/Kconfig:config KPROBES
>>> arch/x86_64/Kconfig:config KPROBES
>> Send patches. The fact nobody has them implemented on your platform
>> isn't a reason to implement something else, quite the reverse in fact.
>
> Yes, but the point is: until that's done you can't claim kprobes is a
> valid tracing tool for everyone.
The fact that the remaining architectures haven't bothered implementing
kprobe support should not be used as an argument for pushing something
inferior out of laziness.

It's the same with syscalls: the kernel infrastructure is there, but if
you don't bother updating the syscall tables and wrapping it in glibc,
then the call isn't available on your architecture.

The core kprobe infrastructure is available to all architectures; it's
up to the developers of the remaining architectures to implement the
remaining bits.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-25 15:20 Chuck Ebbert
2006-09-25 15:39 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Chuck Ebbert @ 2006-09-25 15:20 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
In-Reply-To: <20060918151713.GA11495@elte.hu>
On Mon, 18 Sep 2006 17:17:13 +0200, Ingo Molnar wrote:
> yeah - and i dont think the kprobes overhead is a fundamental thing - i
> posted a few kprobes-speedup patches as a reply to your measurements.
Where is the source code for the kprobes benchmarks you used?
--
Chuck
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-25 15:20 Chuck Ebbert
@ 2006-09-25 15:39 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-25 15:39 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 716 bytes --]
Chuck,

i cannot email you because the mail always bounces ...

the kprobes benchmark is a simple "NOP" function:

	static int counter = 0;

	static int probe_pre_handler (struct kprobe * kp,
				      struct pt_regs * regs)
	{
		counter++;
		return 0;
	}

i've attached it.

	Ingo

* Chuck Ebbert <76306.1226@compuserve.com> wrote:
> In-Reply-To: <20060918151713.GA11495@elte.hu>
>
> On Mon, 18 Sep 2006 17:17:13 +0200, Ingo Molnar wrote:
>
> > yeah - and i dont think the kprobes overhead is a fundamental thing - i
> > posted a few kprobes-speedup patches as a reply to your measurements.
>
> Where is the source code for the kprobes benchmarks you used?
>
> --
> Chuck
[-- Attachment #2: noop_kprobe.c --]
[-- Type: text/plain, Size: 1014 bytes --]
/*
 * no-op kprobe handler
 * Copyright (c) 2005 Hitachi,Ltd.,
 * Created by Masami Hiramatsu<hiramatu@sdl.hitachi.co.jp>
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kprobes.h>

MODULE_AUTHOR("M.Hiramatsu");
MODULE_LICENSE("GPL");

static unsigned long addr = 0;
module_param(addr, ulong, 0444);

static struct kprobe kp;
static int counter=0;

static int probe_pre_handler (struct kprobe * kp,
			      struct pt_regs * regs)
{
	counter++;
	return 0;
}

static int install_probe(void)
{
	int ret = -10000;
	if (addr) {
		kp.pre_handler = probe_pre_handler;
		kp.addr = (void *)addr;
		printk("probe install to %p\n", (void*)addr);
		ret = register_kprobe(&kp);
	}
	if (ret) {
		printk("probe install error: %d\n",ret);
	}
	return ret;
}

static void uninstall_probe(void)
{
	if (kp.addr) {
		printk("uninstall from %p\n", (void*)kp.addr);
		unregister_kprobe(&kp);
	}
	printk("count:%d\n",counter);
}

module_init(install_probe);
module_exit(uninstall_probe);
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15 9:17 Richard J Moore
0 siblings, 0 replies; 271+ messages in thread
From: Richard J Moore @ 2006-09-15 9:17 UTC (permalink / raw)
To: linux-kernel
Ingo Molnar wrote:
> > I don't think anyone is saying that static tracepoints do not have
> > their limitations, or that dynamic tracepointing is useless. But
> > that's not the point ... why can't we have one infrastructure that
> > supports both? Preferably in a fairly simple, consistent way.
>
> primarily because i fail to see any property of static tracers that are
> not met by dynamic tracers. So to me dynamic tracers like SystemTap are
> a superset of static tracers.
There is one example where dynamic tracing is difficult or very messy to
implement, and that's for tracepoints needed during system and device
initialization. In this sense dynamic is not a practical superset of
static. However, I believe the tooling for dynamic trace should work for
static as well.
--
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-15 3:10 James Dickens
0 siblings, 0 replies; 271+ messages in thread
From: James Dickens @ 2006-09-15 3:10 UTC (permalink / raw)
To: lkml
Static probe points in the mainline kernel should not be there for
kernel programmers. Any kernel programmer that is interested in an
event that a static probe would trace could, with a little work, use
kprobes, Systemtap, printk statements or numerous other methods and
accomplish the same thing, most likely with less impact on the kernel.

If you allow static probe points, do them for the people that use your
code. If static probing is to work in the mainline kernel, it's
necessary for everyone to see the value of them. I came up with some
simple rules that may help the adoption of static probe points in the
kernel. They answer a lot of issues I read in other threads.

Some simple rules for Static Probing:

- If the probe is not enabled, it turns into a NOP. No probes are
  enabled by default.
- Each programmer should provide this as a service to the user.
- There should be at most 1000 static probe points in the entire
  kernel, including modules, drivers, etc.
- Probes should not pass out any more information than what a user
  would need. If the user needs more he needs to find another way to
  get it, perhaps dynamic probing.
- If any part of the kernel has more than a dozen probe points there
  are too many.
- If a probe would be of little use to a user/sysadmin it should be
  removed from the mainline kernel.
- Yes, if a probe point is in the code you are working on, the role of
  maintaining it falls on you.
- If you notice your code is doing something that matches a statically
  probed event (i.e. your network driver dropped a packet), it's your
  responsibility to add the necessary probe in your code.
- If "you" need a probe that would not be needed except for debugging
  your code, use one of the other methods mentioned above, or remove it
  before your code is submitted to the mainline kernel.

Some example static probe points:

- Task being moved onto a cpu
- Task moving off a cpu
- Start of an IO
- End of an IO
- Network packet received
- Packet dropped
- Various lock activities
  - Lock taken
  - Spin lock taken

James Dickens
uadmin.blogspot.com
^ permalink raw reply [flat|nested] 271+ messages in thread
* [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
@ 2006-09-14 3:38 Mathieu Desnoyers
2006-09-14 11:27 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 3:38 UTC (permalink / raw)
To: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi
Cc: ltt-dev, Michel Dagenais
Hi,

Following an advice Christoph gave me this summer, submitting a
smaller, easier to review patch should make everybody happier. Here is
a stripped down version of LTTng : I removed everything that would
make the code review reluctant (especially kernel instrumentation and
kernel state dump module). I plan to release this "core" version every
few LTTng releases and post it to LKML.

Comments and reviews are very welcome.

See http://ltt.polymtl.ca > QUICKSTART for information about creating
your own instrumentation set.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 3:38 Mathieu Desnoyers
@ 2006-09-14 11:27 ` Ingo Molnar
2006-09-14 13:40 ` Roman Zippel
` (3 more replies)
0 siblings, 4 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 11:27 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Following an advice Christoph gave me this summer, submitting a
> smaller, easier to review patch should make everybody happier. Here is
> a stripped down version of LTTng : I removed everything that would
> make the code review reluctant (especially kernel instrumentation and
> kernel state dump module). I plan to release this "core" version every
> few LTTng releases and post it to LKML.
>
> Comments and reviews are very welcome.
i have one very fundamental question: why should we do this
source-intrusive method of adding tracepoints instead of the dynamic,
unintrusive (and thus zero-overhead) KProbes+SystemTap method?

	Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 11:27 ` Ingo Molnar
@ 2006-09-14 13:40 ` Roman Zippel
2006-09-14 13:55 ` Ingo Molnar
2006-09-14 15:02 ` Mathieu Desnoyers
` (2 subsequent siblings)
3 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 13:40 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:
> i have one very fundamental question: why should we do this
> source-intrusive method of adding tracepoints instead of the dynamic,
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?
Could you define "zero-overhead"?
Actual implementation aside having a core number of tracepoints is far
more portable than KProbes.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 13:40 ` Roman Zippel
@ 2006-09-14 13:55 ` Ingo Molnar
2006-09-14 14:33 ` Roman Zippel
2006-09-14 15:19 ` Mathieu Desnoyers
0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 13:55 UTC (permalink / raw)
To: Roman Zippel
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
>
> > i have one very fundamental question: why should we do this
> > source-intrusive method of adding tracepoints instead of the dynamic,
> > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
>
> Could you define "zero-overhead"?
zero overhead when not used: not a single instruction added to the
kernel codepath that is to be traced, anywhere. (which will be the case
on 99% of the systems)

> Actual implementation aside having a core number of tracepoints is far
> more portable than KProbes.
the key point is that we want _zero_ "static tracepoints". Firstly,
static tracepoints are fundamentally limited:

 - they can only be added at the source code level

 - modifying them requires a reboot which is not practical in a
   production environment

 - there can only be a limited set of them, while many problems need
   fine-grained tracepoints tailored to the problem at hand

 - conditional tracepoints are typically either nonexistent or very
   limited.

But besides the usability problems, the most important problem is that
static tracepoints add a _constant maintenance overhead_ to the kernel.
I'm talking from first hand experience: i wrote 'iotrace' (a static
tracer) in 1996 and have maintained it for many years, and even today
i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_
want static tracepoints in the mainline kernel.

enter KProbes+SystemTap. It needs no changes at the source code level at
all, so no maintenance overhead to generic kernel code. Tracepoints can
be added and removed while the system is running. Trace actions and
filters can be added based on a scripting language, so tracing is as
dynamic as it gets. (check out http://lwn.net/Articles/198557/ if you
have an lwn subscription - it's subscriber-only for a few weeks)

	Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 13:55 ` Ingo Molnar
@ 2006-09-14 14:33 ` Roman Zippel
2006-09-14 15:26 ` Michel Dagenais
` (2 more replies)
2006-09-14 15:19 ` Mathieu Desnoyers
1 sibling, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 14:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > On Thu, 14 Sep 2006, Ingo Molnar wrote:
> >
> > > i have one very fundamental question: why should we do this
> > > source-intrusive method of adding tracepoints instead of the dynamic,
> > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> >
> > Could you define "zero-overhead"?
>
> zero overhead when not used: not a single instruction added to the
> kernel codepath that is to be traced, anywhere. (which will be the case
> on 99% of the systems)
Using alternatives this could be near zero as well and it will likely
have less overhead when it's actually used.

> > Actual implementation aside having a core number of tracepoints is far
> > more portable than KProbes.
>
> the key point is that we want _zero_ "static tracepoints". Firstly,
> static tracepoints are fundamentally limited:
BTW I don't mind KProbes as an option, but I have a huge problem with
making it the only option.

> But besides the usability problems, the most important problem is that
> static tracepoints add a _constant maintenance overhead_ to the kernel.
> I'm talking from first hand experience: i wrote 'iotrace' (a static
> tracer) in 1996 and have maintained it for many years, and even today
> i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_
> want static tracepoints in the mainline kernel.
Even dynamic tracepoints have a maintenance overhead and I doubt there
is much difference. The big problem is having to maintain them outside
the mainline kernel, that's why it's so important to get them into the
mainline kernel.

You didn't address my main issue at all - kprobes is only available for
a few archs...

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 14:33 ` Roman Zippel
@ 2006-09-14 15:26 ` Michel Dagenais
2006-09-14 17:48 ` Ingo Molnar
2006-09-14 18:08 ` Nick Piggin
2006-09-14 17:13 ` Ingo Molnar
2006-09-14 17:51 ` Karim Yaghmour
2 siblings, 2 replies; 271+ messages in thread
From: Michel Dagenais @ 2006-09-14 15:26 UTC (permalink / raw)
To: Roman Zippel
Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev
On Thu, 2006-14-09 at 16:33 +0200, Roman Zippel wrote:
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > > On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > > > i have one very fundamental question: why should we do this
> > > > source-intrusive method of adding tracepoints instead of the dynamic,
> > > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> Using alternatives this could be near zero as well and it will likely
> have less overhead when it's actually used.
This is the crucial point. Using an INT3 at each dynamic tracepoint is
both costly and is a larger perturbation on the system under study.
Static tracepoints can be achieved by various means, including a few
NOPs to reserve space which get patched dynamically for activation.
They may also be compiled out completely. By the way, there are quite a
few tracers already in device drivers in the kernel.

> BTW I don't mind KProbes as an option, but I have huge problem with making
> it the only option.
Indeed, KProbes, SystemTap and LTTng are complementary and people
involved in the three projects are cooperating.

> > But besides the usability problems, the most important problem is that
> > static tracepoints add a _constant maintenance overhead_ to the kernel.
> > I'm talking from first hand experience: i wrote 'iotrace' (a static
> > tracer) in 1996 and have maintained it for many years, and even today
> > i'm maintaining a handful of tracepoints in the -rt kernel. I _dont_
> > want static tracepoints in the mainline kernel.
>
> Even dynamic tracepoints have a maintenance overhead and I doubt there is
> much difference. The big problem is having to maintain them outside the
> mainline kernel, that's why it's so important to get them into the
> mainline kernel.
Indeed, dynamic tracepoints are like code patches: when the kernel
source changes they may or may not apply to newer versions. Mainline
kernel "static" tracepoints are more like the existing 70000+ printk
statements!
^ permalink raw reply [flat|nested] 271+ messages in thread
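The two static-tracepoint options Michel mentions (compiled out completely, or present but inactive until switched on) can be sketched in user-space C. This is only an illustration under stated assumptions: every name in it (TRACE_POINT, TRACE_COMPILED_OUT, trace_enabled, trace_record, handle_packet) is hypothetical and is not the LTTng or kernel API, and a plain run-time flag test stands in for the NOP-patching technique described in the message, which it does not implement.

```c
#include <assert.h>

/* All names here are hypothetical; this is not the LTTng or kernel API. */

int trace_enabled;          /* run-time switch, off by default */
unsigned long trace_hits;   /* number of events recorded so far */

/* Stand-in for writing an event into a trace buffer. */
void trace_record(const char *event, long value)
{
	(void)event;
	(void)value;
	trace_hits++;
}

#ifdef TRACE_COMPILED_OUT
/* Compiled out completely: the call site generates no code. */
#define TRACE_POINT(event, value) do { } while (0)
#else
/* Compiled in: one flag test when disabled, a call when enabled. */
#define TRACE_POINT(event, value)				\
	do {							\
		if (trace_enabled)				\
			trace_record((event), (value));		\
	} while (0)
#endif

/* Example call site; doubling the length stands in for real work. */
long handle_packet(long len)
{
	assert(len >= 0);
	TRACE_POINT("net_rx", len);
	return len * 2;
}
```

With the flag left at zero the call site costs a load and a branch; building with -DTRACE_COMPILED_OUT removes even that, at the price of needing a rebuild to trace.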
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 15:26 ` Michel Dagenais
@ 2006-09-14 17:48 ` Ingo Molnar
2006-09-15 15:04 ` Mathieu Desnoyers
2006-09-14 18:08 ` Nick Piggin
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 17:48 UTC (permalink / raw)
To: Michel Dagenais
Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev
* Michel Dagenais <michel.dagenais@polymtl.ca> wrote:
> This is the crucial point. Using an INT3 at each dynamic tracepoint is
> both costly and is a larger perturbation on the system under study.
> [...]
have you measured this?

	Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 17:48 ` Ingo Molnar
@ 2006-09-15 15:04 ` Mathieu Desnoyers
0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 15:04 UTC (permalink / raw)
To: Ingo Molnar
Cc: Michel Dagenais, Roman Zippel, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Michel Dagenais <michel.dagenais@polymtl.ca> wrote:
>
> > This is the crucial point. Using an INT3 at each dynamic tracepoint is
> > both costly and is a larger perturbation on the system under study.
> > [...]
>
> have you measured this?
>
Hi Ingo,

A very quick test (yes, done in user space, but should be accurate
enough for our needs) on a pentium 4 3 GHz shows that generating an
int3 breakpoint in a loop (connected to an empty handler) takes an
average of 2.01µs per breakpoint. LTT has an impact of about 0.220µs
per probe (10 times smaller).

Please refer to this kind of high event rate workload :
http://www.listserv.shafik.org/pipermail/ltt-dev/2005-December/001139.html

On the same pentium 4, 3 GHz (in the following results, I do not
consider the fact that the CPU had hyperthreading enabled) :

Probe execution time at probe site : 220ns/event
220ns * 9588836 events = 2.11s
Event rate : 749994 events per second
LTT : 749994 events/s * 0.220µs/event = 16.5 % of cpu time
With a breakpoint : 749994 events/s * 2.01µs/event = 150 % of cpu time

Considering the limitations of these tests :
- int3 timings taken from user space, which implies calling an empty
  handler in user space.
- The machine had hyperthreading enabled, but considered UP here.

It shows that tracing the same workload with breakpoints would make the
machine more than twice as slow, whereas a direct memory write has a
relatively small impact (16.5% of cpu time spent in probes).

In high event rate/low perturbation scenarios where instrumentation is
put at arbitrary locations in the code, it is necessary to use the
static instrumentation alternative because the breakpoint approach is
just too slow.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
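The percentages in Mathieu's message follow from multiplying the event rate by the per-event cost. A minimal C sketch of that arithmetic is below; the function names are hypothetical and chosen only for illustration, and the figures fed to them are the ones quoted in the message, not new measurements.

```c
#include <assert.h>

/*
 * Fraction of one CPU spent in probes: events per second multiplied
 * by the per-event cost in microseconds, converted to seconds.
 * A result above 1.0 means more than one full CPU would be needed.
 */
double probe_cpu_fraction(double events_per_sec, double cost_us)
{
	assert(events_per_sec >= 0.0 && cost_us >= 0.0);
	return events_per_sec * cost_us * 1e-6;
}

/* Total seconds spent in probes for a given number of events. */
double probe_total_seconds(double events, double cost_ns)
{
	assert(events >= 0.0 && cost_ns >= 0.0);
	return events * cost_ns * 1e-9;
}
```

Calling probe_cpu_fraction(749994, 0.220) gives roughly 0.165 (the 16.5 % figure), probe_cpu_fraction(749994, 2.01) gives roughly 1.51 (the 150 % figure), and probe_total_seconds(9588836, 220) gives roughly 2.11 seconds.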
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 15:26 ` Michel Dagenais
2006-09-14 17:48 ` Ingo Molnar
@ 2006-09-14 18:08 ` Nick Piggin
2006-09-14 18:38 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Nick Piggin @ 2006-09-14 18:08 UTC (permalink / raw)
To: Michel Dagenais
Cc: Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev
Michel Dagenais wrote:
> On Thu, 2006-14-09 at 16:33 +0200, Roman Zippel wrote:
>>BTW I don't mind KProbes as an option, but I have huge problem with making
>>it the only option.
>
>
> Indeed, KProbes SystemTAP and LTTng are complementary and people
> involved in the three projects are cooperating.
That doesn't mean we want them all in the kernel. The best aim would of
course be to come up with a solution that has the advantages of all and
disadvantages of none. That may be impossible, but if we can find one
way to do things that is acceptable to all...

What's the huge problem with making kprobes the only option (that can't
be fixed by doing a bit of coding)?

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 18:08 ` Nick Piggin
@ 2006-09-14 18:38 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 18:38 UTC (permalink / raw)
To: Nick Piggin
Cc: Michel Dagenais, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev
Nick Piggin wrote:
> What's the huge problem with making kprobes the only option (that can't
> be fixed by doing a bit of coding)?
No offense, but having been on the receiving end of this for a number
of years, one feels like he's watching a never-ending repeat of a
30-second commercial where the woman is holding up a magic scrub and
says something like "Just use Mr. Scrub" and the product then twinkles
with some light music and then cut, next commercial; except in this
case, it's "Just use Kprobes" and all your problems will go away,
wink-wink!

Sorry, it's just not that straight-forward. There's a reason why the
systemtap folks got interested in the markers proposal: they actually
have to maintain a dynamic instrumentation set. Mr. Scrub just doesn't
scrub as clean as advertised, you actually have to scrub to make the
scum go away.

Which goes back to what I said elsewhere: no matter where you draw the
line someone is doing the heavy lifting. Doing it outside the kernel
only means that there's yet another piece of software that needs to be
updated before you can actually start profiting from your new and
improved kernel ...

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 14:33 ` Roman Zippel
2006-09-14 15:26 ` Michel Dagenais
@ 2006-09-14 17:13 ` Ingo Molnar
2006-09-14 17:55 ` Roman Zippel
` (2 more replies)
2006-09-14 17:51 ` Karim Yaghmour
2 siblings, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 17:13 UTC (permalink / raw)
To: Roman Zippel
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> Hi,
>
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
>
> > > On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > >
> > > > i have one very fundamental question: why should we do this
> > > > source-intrusive method of adding tracepoints instead of the dynamic,
> > > > unintrusive (and thus zero-overhead) KProbes+SystemTap method?
> > >
> > > Could you define "zero-overhead"?
> >
> > zero overhead when not used: not a single instruction added to the
> > kernel codepath that is to be traced, anywhere. (which will be the case
> > on 99% of the systems)
>
> Using alternatives this could be near zero as well and it will likely
> have less overhead when it's actually used.
if there are lots of tracepoints (and the union of _all_ useful
tracepoints that i ever encountered in my life goes into the thousands)
then the overhead is not zero at all.

also, the other disadvantages i listed very much count too. Static
tracepoints are fundamentally limited because:

 - they can only be added at the source code level

 - modifying them requires a reboot which is not practical in a
   production environment

 - there can only be a limited set of them, while many problems need
   fine-grained tracepoints tailored to the problem at hand

 - conditional tracepoints are typically either nonexistent or very
   limited.

for me these are all _independent_ grounds for rejection, as a generic
kernel infrastructure.

> > the key point is that we want _zero_ "static tracepoints". Firstly,
> > static tracepoints are fundamentally limited:
>
> BTW I don't mind KProbes as an option, but I have huge problem with
> making it the only option.
i'm not arguing for SystemTap to be the only option (KProbes is just the
infrastructure SystemTap is using - there are other uses for KProbes
too), but i'm arguing against the inclusion of static tracepoints as an
infrastructure, precisely because a much better option (SystemTap) is
already available and is usable on the stock kernel. You are of course
free to invent other, equally advantageous (or better) options.

> > But besides the usability problems, the most important problem is
> > that static tracepoints add a _constant maintenance overhead_ to
> > the kernel. I'm talking from first hand experience: i wrote
> > 'iotrace' (a static tracer) in 1996 and have maintained it for many
> > years, and even today i'm maintaining a handful of tracepoints in
> > the -rt kernel. I _dont_ want static tracepoints in the mainline
> > kernel.
>
> Even dynamic tracepoints have a maintenance overhead and I doubt
> there is much difference. The big problem is having to maintain them
> outside the mainline kernel, that's why it's so important to get them
> into the mainline kernel.
i dispute that: for example kernel/sched.c has zero maintenance
overhead under SystemTap, while it's nonzero with static tracepoints.
Of course SystemTap _itself_ has maintenance overhead, but it does not
slow down any other subsystem's speed of progress.

> You didn't address my main issue at all - kprobes is only available
> for a few archs...
the kprobes infrastructure, despite being fairly young, is widely
available: powerpc, i386, x86_64, ia64 and sparc64. The other
architectures are free to implement them too, there's nothing
hardware-specific about kprobes and the "porting overhead" is in essence
a one-time cost - while for static tracepoints the maintenance overhead
goes on forever and scales linearly with the number of tracepoints
added.

	Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 17:13 ` Ingo Molnar @ 2006-09-14 17:55 ` Roman Zippel 2006-09-14 18:15 ` Ingo Molnar 2006-09-14 18:12 ` Karim Yaghmour 2006-09-14 20:25 ` Martin Bligh 2 siblings, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-14 17:55 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Thu, 14 Sep 2006, Ingo Molnar wrote: > also, the other disadvantages i listed very much count too. Static > tracepoints are fundamentally limited because: > > - they can only be added at the source code level > > - modifying them requires a reboot which is not practical in a > production environment > > - there can only be a limited set of them, while many problems need > finegrained tracepoints tailored to the problem at hand > > - conditional tracepoints are typically either nonexistent or very > limited. > > for me these are all _independent_ grounds for rejection, as a generic > kernel infrastructure. Tracepoints of course need to be managed, but that's true for both dynamic and static tracepoints. Both have their advantages and disadvantages and just hammering on the possible problems of static ones (which are not much of a problem for other people) is highly unfair and not a reason for rejection. If you don't like them, don't use them, nobody forces you, it's that simple... > > You didn't address my main issue at all - kprobes is only available > > for a few archs... > > the kprobes infrastructure, despite being fairly young, is widely > available: powerpc, i386, x86_64, ia64 and sparc64. 
> The other architectures are free to implement them too, there's
> nothing hardware-specific about kprobes and the "porting overhead" is
> in essence a one-time cost - while for static tracepoints the
> maintenance overhead goes on forever and scales linearly with the
> number of tracepoints added.

kprobes are not trivial to implement (especially to reach the level of
performance and flexibility of static tracepoints) and until then you
deny their users/developers a useful tool?

I also think you highly exaggerate the maintenance overhead of static
tracepoints, once added they hardly need any maintenance, most of the
time you can just ignore them. Only if the code drastically changes do
they need to be adjusted, but at that point this should be the smallest
problem.
The kernel is full of debug prints, do you seriously suggest to throw
them out because of their "high maintenance"?

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 17:55 ` Roman Zippel @ 2006-09-14 18:15 ` Ingo Molnar 2006-09-14 18:35 ` Mathieu Desnoyers ` (3 more replies) 0 siblings, 4 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 18:15 UTC (permalink / raw) To: Roman Zippel Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > for me these are all _independent_ grounds for rejection, as a generic > > kernel infrastructure. > > Tracepoints of course need to be managed, but that's true for both > dynamic and static tracepoints. [...] that's not true, and this is the important thing that i believe you are missing. A dynamic tracepoint is _detached_ from the normal source code and thus is zero maintainance overhead. You dont have to maintain it during normal development - only if you need it. You dont see the dynamic tracepoints in the source code. a static tracepoint, once it's in the mainline kernel, is a nonzero maintainance overhead _until eternity_. It is a constant visual hindrance and a constant build-correctness and boot-correctness problem if you happen to change the code that is being traced by a static tracepoint. Again, I am talking out of actual experience with static tracepoints: i frequently break my kernel via static tracepoints and i have constant maintainance cost from them. So what i do is that i try to minimize the number of static tracepoints to _zero_. I.e. i only add them when i need them for a given bug. static tracepoints are inferior to dynamic tracepoints in almost every way. > [...] Both have their advantages and disadvantages and just hammering > on the possible problems of static ones [...] how about giving a line by line rebuttal to the very real problems of static tracepoints i listed (twice already), instead of calling them "possible problems"? 
i am giving a line by line rebuttal of all arguments that come up.
Please be fair and do the same. Here are the arguments again, for a
third time. Thanks!

> > also, the other disadvantages i listed very much count too. Static
> > tracepoints are fundamentally limited because:
> >
> > - they can only be added at the source code level
> >
> > - modifying them requires a reboot which is not practical in a
> >   production environment
> >
> > - there can only be a limited set of them, while many problems need
> >   finegrained tracepoints tailored to the problem at hand
> >
> > - conditional tracepoints are typically either nonexistent or very
> >   limited.

> > the kprobes infrastructure, despite being fairly young, is widely
> > available: powerpc, i386, x86_64, ia64 and sparc64. The other
> > architectures are free to implement them too, there's nothing
> > hardware-specific about kprobes and the "porting overhead" is in
> > essence a one-time cost - while for static tracepoints the
> > maintenance overhead goes on forever and scales linearly with the
> > number of tracepoints added.
>
> kprobes are not trivial to implement [...]

nor are smp-alternatives, which was suggested as a solution to reduce
the overhead of static tracepoints. So what's the point? It's a one-off
development overhead that has already been done for all the major
arches. If another arch needs it they can certainly implement it.

it's like arguing against ptrace on the grounds of: "application
developers can add printf if they want to debug their apps, or they can
add static tracepoints too, and besides, ptrace is hard to implement".

> I also think you highly exaggerate the maintenance overhead of static
> tracepoints, once added they hardly need any maintenance, most of the
> time you can just ignore them. [...]

hundreds (or possibly thousands) of tracepoints? Have you ever tried to
maintain that? I have and it's a nightmare.
Even assuming a rich set of hundreds of static tracepoints, it doesnt
even solve the problems at hand: people want to do much more when they
probe the kernel - and today, with DTrace under Solaris people _know_
that much better tracing _can be done_, and they _demand_ that Linux
adopts an intelligent solution. The clock is ticking for dinosaurs like
static printks and static tracepoints to debug the kernel...

> [...] The kernel is full of debug prints, do you seriously suggest to
> throw them out because of their "high maintenance"?

oh yes, these days i frequently throw them out when i find them in code
i modify. (my most recent such zap was rwsemtrace()). Also, obviously
when most of them were added we didnt have good kernel debugging
infrastructure (in fact we didnt have any kernel debugging
infrastructure besides printk), so _something_ had to be used back
then. But today there's little reason to keep them. Welcome to 2006 :-)

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 18:15 ` Ingo Molnar @ 2006-09-14 18:35 ` Mathieu Desnoyers 2006-09-14 18:54 ` Karim Yaghmour ` (2 subsequent siblings) 3 siblings, 0 replies; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-14 18:35 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais * Ingo Molnar (mingo@elte.hu) wrote: > > that's not true, and this is the important thing that i believe you are > missing. A dynamic tracepoint is _detached_ from the normal source code > and thus is zero maintainance overhead. You dont have to maintain it > during normal development - only if you need it. You dont see the > dynamic tracepoints in the source code. > What happen if someone need trace points in "normal kernel development" (which appears to be the case, see blktrace and latency tracer) ? > a static tracepoint, once it's in the mainline kernel, is a nonzero > maintainance overhead _until eternity_. It is a constant visual > hindrance and a constant build-correctness and boot-correctness problem > if you happen to change the code that is being traced by a static > tracepoint. Again, I am talking out of actual experience with static > tracepoints: i frequently break my kernel via static tracepoints and i > have constant maintainance cost from them. So what i do is that i try to > minimize the number of static tracepoints to _zero_. I.e. i only add > them when i need them for a given bug. > What kind of code are you calling from your instrumentation sites to break your kernel so easily ? Or perhaps are you instrumenting the page fault handler which, yes, can have side effects? My goal is exctly to provide the kind of code that can be called from any kernel site without breaking it! 
Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 18:15 ` Ingo Molnar 2006-09-14 18:35 ` Mathieu Desnoyers @ 2006-09-14 18:54 ` Karim Yaghmour 2006-09-15 9:20 ` Jes Sorensen 2006-09-14 19:40 ` Tim Bird 2006-09-14 19:47 ` Roman Zippel 3 siblings, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-14 18:54 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > that's not true, and this is the important thing that i believe you are > missing. A dynamic tracepoint is _detached_ from the normal source code > and thus is zero maintainance overhead. You dont have to maintain it > during normal development - only if you need it. You dont see the > dynamic tracepoints in the source code. And that's actually a problem for those who maintain such dynamic trace points. > a static tracepoint, once it's in the mainline kernel, is a nonzero > maintainance overhead _until eternity_. It is a constant visual > hindrance and a constant build-correctness and boot-correctness problem > if you happen to change the code that is being traced by a static > tracepoint. Again, I am talking out of actual experience with static > tracepoints: i frequently break my kernel via static tracepoints and i > have constant maintainance cost from them. So what i do is that i try to > minimize the number of static tracepoints to _zero_. I.e. i only add > them when i need them for a given bug. Bzzt, wrong. This is your own personal experience with tracing. Marked up code does not need to be active under all build conditions. In fact trace points can be inactive by default at all times, except when you choose to build them in. 
And as I said elsewhere, your use of instrumentation is solely for
debugging ("i only add them when i need them for a given bug"); I
repeat that there are mortals out there who need this for their
applications.

> static tracepoints are inferior to dynamic tracepoints in almost every
> way.

Sorry, orthogonal is the word.

> hundreds (or possibly thousands) of tracepoints? Have you ever tried to
> maintain that? I have and it's a nightmare.

I have, and I've showed you that you're wrong. The only reason you can
make this argument is that you view these things from the point of view
of what use they are for you as a kernel developer and I will repeat
what I've said for years now: static instrumentation of the kernel
isn't meant to be useful for kernel developers. While it may indeed be
in some cases, in most cases it's likely useless, as you've been very
successfully arguing in this thread. Nevertheless there are very
legitimate uses for standardized instrumentation points.

> Even assuming a rich set of hundreds of static tracepoints, it doesnt
> even solve the problems at hand: people want to do much more when they
> probe the kernel - and today, with DTrace under Solaris people _know_
> that much better tracing _can be done_, and they _demand_ that Linux
> adopts an intelligent solution. The clock is ticking for dinosaurs like
> static printks and static tracepoints to debug the kernel...

Thank you, I couldn't have put it better. This paragraph, more than any
other snippet I've seen to date, clearly demonstrates why tracing is
such a contentious issue. Kernel developers use tracing during their
normal development process, and of course their gut reaction is: why
the hell would anybody need this for mainline? But of course this
misses the entire point. Kernel tracing for developers is but a corner
case of kernel tracing in general. There are very valid and legitimate
reasons for userspace to be able to obtain important events.
And of course any infrastructure developed with that in mind should
also be usable by kernel developers.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 18:54 ` Karim Yaghmour @ 2006-09-15 9:20 ` Jes Sorensen 2006-09-15 12:38 ` Karim Yaghmour 0 siblings, 1 reply; 271+ messages in thread From: Jes Sorensen @ 2006-09-15 9:20 UTC (permalink / raw) To: karim Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais >>>>> "Karim" == Karim Yaghmour <karim@opersys.com> writes: Karim> Ingo Molnar wrote: >> that's not true, and this is the important thing that i believe you >> are missing. A dynamic tracepoint is _detached_ from the normal >> source code and thus is zero maintainance overhead. You dont have >> to maintain it during normal development - only if you need it. You >> dont see the dynamic tracepoints in the source code. Karim> And that's actually a problem for those who maintain such Karim> dynamic trace points. And who should pay here? The people who want the tracepoints or the people who are not interested in them? >> a static tracepoint, once it's in the mainline kernel, is a nonzero >> maintainance overhead _until eternity_. It is a constant visual >> hindrance and a constant build-correctness and boot-correctness >> problem if you happen to change the code that is being traced by a >> static tracepoint. Again, I am talking out of actual experience >> with static tracepoints: i frequently break my kernel via static >> tracepoints and i have constant maintainance cost from them. So >> what i do is that i try to minimize the number of static >> tracepoints to _zero_. I.e. i only add them when i need them for a >> given bug. Karim> Bzzt, wrong. This is your own personal experience with Karim> tracing. Marked up code does not need to be active under all Karim> build conditions. In fact trace points can be inactive by Karim> default at all times, except when you choose to build them in. 
You have obviously never tried to maintain a codebase for a long
time. Even if the code is not activated, you make a change and
something breaks and people come running and screaming, or the thing is
in the way for the structural code change you want to make.

Not to mention that some of the classical places people wish to add
those static tracepoints are in performance sensitive codepaths,
syscalls for example.

>> static tracepoints are inferior to dynamic tracepoints in almost
>> every way.

Karim> Sorry, orthogonal is the word.

You can do pretty much everything you want to do with dynamic
tracepoints, it's just a matter of whether you want to dump the burden
of maintenance on someone else. Been there done that, had to show
people in the past how to do with dynamic points what they insisted
had to be done with static points.

>> hundreds (or possibly thousands) of tracepoints? Have you ever
>> tried to maintain that? I have and it's a nightmare.

Karim> I have, and I've showed you that you're wrong. The only reason
Karim> you can make this argument is that you view these things from
Karim> the point of view of what use they are for you as a kernel
Karim> developer and I will repeat what I've said for years now:
Karim> static instrumentation of the kernel isn't meant to be useful
Karim> for kernel developers.

So you maintain the tracepoints in the kernel and you are offering to
take over maintenance of all code that now contains these tracepoints?

You add your static tracepoints, next week someone else wants some
very similar but slightly different points, the following week it's
someone else. Thanks, but no thanks.

Karim> Nevertheless there are
Karim> very legitimate uses for standardized instrumentation points.

Some evidence would be useful here, so far you haven't provided any.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 9:20 ` Jes Sorensen @ 2006-09-15 12:38 ` Karim Yaghmour 2006-09-15 12:32 ` Jes Sorensen 2006-09-15 13:20 ` Paul Mundt 0 siblings, 2 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 12:38 UTC (permalink / raw) To: Jes Sorensen Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Jes Sorensen wrote: > Karim> And that's actually a problem for those who maintain such > Karim> dynamic trace points. > > And who should pay here? The people who want the tracepoints or the > people who are not interested in them? If you'd care to read through the thread you'd notice I've demonstrated time and again that those static trace points we're mostly interested in a never-changing. Lest something fundamentally changes with the kernel, there will always be a scheduling change; etc. This "instrumentation is evil" mantra is only substantiated if you view it from the point of view of someone who's only used it to debug code. Yet, and I repeat this again, instrumentation for in-source debugging is but a corner case of instrumentation in general. > You have obviously never tried to maintain a codebase for a long > time. Please, this is not constructive. I've never really grasped the need for posturing on LKML. Jes, I'm not going to fight a war of resumes with you. If you think I'm incompetent then there's very little I can do to change your mind. > Not to mention that some of the classical places people wish to add > those static tracepoints are in performance sensitive codepaths, > syscalls for example. And this argument ignores everything I said on how there does not need be the limitation currently known to previous static tracing mechanisms. 
> You can do pretty much everything you want to do with dynamic
> tracepoints, it's just a matter of whether you want to dump the burden
> of maintenance on someone else. Been there done that, had to show
> people in the past how to do with dynamic points what they insisted
> had to be done with static points.

Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
get away with this argument is if you view it exclusively from the
point of view of kernel development. And that's why you're wrong.

> So you maintain the tracepoints in the kernel and you are offering to
> take over maintenance of all code that now contains these tracepoints?

Please explain, honestly, why the following instrumentation point is
going to be a maintenance drag on the person modifying the scheduler:
@@ -1709,6 +1712,7 @@ switch_tasks:
         ++*switch_count;

         prepare_arch_switch(rq, next);
+        TRACE_SCHEDCHANGE(prev, next);
         prev = context_switch(rq, prev, next);
         barrier();

And please, don't bother complaining about the semantics, they can
be changed. I'm just arguing about location/meaning/content.

> You add your static tracepoints, next week someone else wants some
> very similar but slightly different points, the following week it's
> someone else. Thanks, but no thanks.

Obviously there's no point in me spelling out any code of conduct to
anyone, Martin has already pointed out that it's up to the subsystem
maintainers to decide what's appropriate and what's not, as is
customary anyway. But the issue I'm putting forth here is that there
is value in allowing outsiders to understand the dynamic behavior of
your code and the only person who can do that best is the person
writing the code. It is then that person's responsibility to
distinguish between instrumentation they may find important to debug
their code and instrumentation that would be relevant to those using
their code.
And if you've maintained code long enough, and I trust you have, you
would see that there is a clear difference between both.

Thanks,

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 12:38 ` Karim Yaghmour @ 2006-09-15 12:32 ` Jes Sorensen 2006-09-15 14:09 ` Karim Yaghmour 2006-09-15 13:20 ` Paul Mundt 1 sibling, 1 reply; 271+ messages in thread From: Jes Sorensen @ 2006-09-15 12:32 UTC (permalink / raw) To: karim Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Karim Yaghmour wrote: > Jes Sorensen wrote: >>> And who should pay here? The people who want the tracepoints or the >> people who are not interested in them? > > If you'd care to read through the thread you'd notice I've demonstrated > time and again that those static trace points we're mostly interested > in a never-changing. Lest something fundamentally changes with the > kernel, there will always be a scheduling change; etc. Except as I pointed out, that everyone wants their info slightly differently so even trace points in the scheduler will be contentious and we will end up with a stack of them if we are to satisfy everyone. So now, you didn't demonstrate anything. > This > "instrumentation is evil" mantra is only substantiated if you view > it from the point of view of someone who's only used it to debug code. > Yet, and I repeat this again, instrumentation for in-source debugging > is but a corner case of instrumentation in general. Given that I have used this stuff to more than just debug code, then this obviously doesn't apply. >> You have obviously never tried to maintain a codebase for a long >> time. > > Please, this is not constructive. I've never really grasped the need > for posturing on LKML. Jes, I'm not going to fight a war of resumes > with you. If you think I'm incompetent then there's very little I can > do to change your mind. You refuse to take the big picture into account and then claim that there is no cost of doing things your way. 
Point being that once you start maintaining a large project such as
the kernel, or just parts of it, you realize how much those 'zero cost'
additions really cost.

>> Not to mention that some of the classical places people wish to add
>> those static tracepoints are in performance sensitive codepaths,
>> syscalls for example.
>
> And this argument ignores everything I said on how there does not need
> to be the limitation currently known to previous static tracing
> mechanisms.

And how does there not? If you want to add tracepoints to the syscall
path, then you will make an impact. It's non trivial to validate, yes
I have seen some scary attempts at adding LTT tracecalls to the ia64
syscall path, and just because it might not be compiled in in most cases
that doesn't mean it doesn't raise the complexity.

>> You can do pretty much everything you want to do with dynamic
>> tracepoints, it's just a matter of whether you want to dump the burden
>> of maintenance on someone else. Been there done that, had to show
>> people in the past how to do with dynamic points what they insisted
>> had to be done with static points.
>
> Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
> get away with this argument is if you view it exclusively from the
> point of view of kernel development. And that's why you're wrong.

As I said, kprobes are much more than kernel development! But you
obviously haven't bothered looking at those properly! Been there done
that!

>> So you maintain the tracepoints in the kernel and you are offering to
>> take over maintenance of all code that now contains these tracepoints?
>
> Please explain, honestly, why the following instrumentation point is
> going to be a maintenance drag on the person modifying the scheduler:
> @@ -1709,6 +1712,7 @@ switch_tasks:
>          ++*switch_count;
>
>          prepare_arch_switch(rq, next);
> +        TRACE_SCHEDCHANGE(prev, next);
>          prev = context_switch(rq, prev, next);
>          barrier();
>
> And please, don't bother complaining about the semantics, they can
> be changed. I'm just arguing about location/meaning/content.

It will be a drag because next week someone else wants a tracepoint
5 lines further down the code! Again, I have seen people try and do
that on top of the old LTT patchsets, so maybe *you* didn't want the
tracepoint somewhere else, but some people did! Next?

>> You add your static tracepoints, next week someone else wants some
>> very similar but slightly different points, the following week it's
>> someone else. Thanks, but no thanks.
>
> Obviously there's no point in me spelling out any code of conduct to
> anyone, Martin has already pointed out that it's up to the subsystem
> maintainers to decide what's appropriate and what's not, as is
> customary anyway. But the issue I'm putting forth here is that there
> is value in allowing outsiders to understand the dynamic behavior of
> your code and the only person who can do that best is the person
> writing the code. It is then that person's responsibility to
> distinguish between instrumentation they may find important to debug
> their code and instrumentation that would be relevant to those using
> their code. And if you've maintained code long enough, and I trust
> you have, you would see that there is a clear difference between both.

You are once again ignoring the point that not everyone needs the exact
same view of things that you are looking for. Dynamic probes allow for
that, doing that with static probes is going to turn into maintenance
hell. Guess what, some of us still try to look after code 8-10 years
after we wrote it initially.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 12:32 ` Jes Sorensen @ 2006-09-15 14:09 ` Karim Yaghmour 2006-09-15 14:30 ` Jes Sorensen 0 siblings, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 14:09 UTC (permalink / raw) To: Jes Sorensen Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Jes Sorensen wrote: > Except as I pointed out, that everyone wants their info slightly > differently so even trace points in the scheduler will be contentious > and we will end up with a stack of them if we are to satisfy everyone. > So now, you didn't demonstrate anything. There is in my view, and this is what this whole debate is really about, a clear difference in between the type of instrumentation being added. Clearly in the view of others there just isn't. But bare with me. I submit to you that there are 3 classes of trace points: - OS-class: These are trace points which will be found in a given kernel regardless of how it is implemented if it belongs to a certain family of OSes. Linux being made to mimic Unix, it will always have key events. And if you look closely at the initial set of points added by ltt, these would be found in any Unix. It's not for nothing that my paper on ltt was accepted at Usenix 2000 - and in fact during the question period somebody asked how easy it would be to port it to BSD, and the answer: trivial. - Subsystem-class: These are trace points which are specific to a given implementation. Say block tracing, scsi tracing, etc. as they are implemented in Linux. The purpose of these is to allow a user of these given subsystems to get more in-depth understanding of what's happening inside the box. - Debug-class: These are trace points required to find difficult problems such as race-conditions/etc. which are needed to debug the OS. 
I'm not arguing for the inclusion of debug tracepoints. I can see that
within a given subsystem there can be disagreement over the placement
of specific tracepoints, and this is where I think your argument lies
and it is not without merit - IOW such tracepoints should be more
carefully scrutinized. However, there are OS-class tracepoints for
which I hardly see any possible debate either in terms of usefulness
or in terms of maintainability.

> Given that I have used this stuff to more than just debug code, then
> this obviously doesn't apply.
...
> You refuse to take the big picture into account and then claim that
> there is no cost of doing things your way. Point being that once you
> start maintaining a large project such as the kernel, or just parts of
> it, you realize how much those 'zero cost' additions really cost.

Someone else alluded to the parallel between in-code comments and
documentation maintained separately. There is a cost to in-code
instrumentation in the same way that there is to in-code documentation.
And they, in fact, are very much alike.

> And how does there not? If you want to add tracepoints to the syscall
> path, then you will make an impact. It's non trivial to validate, yes
> I have seen some scary attempts at adding LTT tracecalls to the ia64
> syscall path, and just because it might not be compiled in in most cases
> that doesn't mean it doesn't raise the complexity.

Again, this is an implementation issue. If we have a way to mark-up
code, then we can at least "hide" much of the scary stuff.

> As I said, kprobes are much more than kernel development! But you
> obviously haven't bothered looking at those properly! Been there done
> that!

I have, and taking an int3 on every tracepoint wasn't to my liking, nor
was having to chase kernel versions for binary editing. If I was going
to do maintenance I was much happier to work with source than binary.
> It will be a drag because next week someone else wants a tracepoint
> 5 lines further down the code! Again, I have seen people try and do
> that on top of the old LTT patchsets, so maybe *you* didn't want the
> tracepoint somewhere else, but some people did! Next?

Not if you understand the distinction I am making above. Now, I can
understand that you may think: Karim, nobody is going to fsck'ing
care about the distinction you're making once this is in the kernel.
But for me this is a separate, but yet entirely relevant, part of the
debate. The argument here has already been pointed out elsewhere:
There are already subsystem maintainers and they are more than capable
of taking the appropriate decisions. The distinction I make above is
not esoteric.

> You are once again ignoring the point that not everyone needs the exact
> same view of things that you are looking for. Dynamic probes allow for
> that, doing that with static probes is going to turn into maintenance
> hell. Guess what, some of us still try to look after code 8-10 years
> after we wrote it initially.

I'm not ignoring that people have different needs. I'm being depicted
as endorsing static traces all over the place, and I'm not advocating
such a course of action. The only reason any argument against static
instrumentation can be made is if you consider it from the debug point
of view and what drag such instrumentation would have. There is a big
difference of purpose and of persistent relevance between debug
instrumentation and OS-class instrumentation. It's entirely
disingenuous to suggest otherwise.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:09 ` Karim Yaghmour
@ 2006-09-15 14:30 ` Jes Sorensen
2006-09-15 15:12 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:30 UTC (permalink / raw)
To: karim
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
> There is in my view, and this is what this whole debate is really
> about, a clear difference in between the type of instrumentation
> being added. Clearly in the view of others there just isn't. But
> bear with me. I submit to you that there are 3 classes of trace
> points:
>
> - OS-class: These are trace points which will be found in a given
> kernel regardless of how it is implemented if it belongs to a
> certain family of OSes. Linux being made to mimic Unix, it will
> always have key events. And if you look closely at the initial
> set of points added by ltt, these would be found in any Unix.
> It's not for nothing that my paper on ltt was accepted at Usenix
> 2000 - and in fact during the question period somebody asked how
> easy it would be to port it to BSD, and the answer: trivial.

There are very few tracepoints in this category, the only things you
can claim are more or less generic are syscalls, and tracing syscall
handling is tricky.

> - Subsystem-class: These are trace points which are specific to
> a given implementation. Say block tracing, scsi tracing, etc. as
> they are implemented in Linux. The purpose of these is to allow
> a user of these given subsystems to get more in-depth understanding
> of what's happening inside the box.

This is grossly oversimplifying things and why the whole thing doesn't
hold water. There is no such thing as 'the place' to put a specific
tracepoint.

Especially when we start talking about things like tracepoints in the
scheduler.

Note that I haven't been referring to debug tracepoints at any point in
this debate.

>> It will be a drag because next week someone else wants a tracepoint
>> 5 lines further down the code! Again, I have seen people try and do
>> that on top of the old LTT patchsets, so maybe *you* didn't want the
>> tracepoint somewhere else, but some people did! Next?
>
> Not if you understand the distinction I am making above.

Your distinction above doesn't hold water, but I did understand it very
well ....

You seem to think that it's fine to add instrumentation in the syscall
path as an example as long as it's compiled out. Well on some
architectures, the syscall path is very sensitive to alignment and there
may be restrictions on how large the stub of code is allowed to be, like
a few hundred bytes. Just because things work one way on x86, doesn't
mean they work like that everywhere.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:30 ` Jes Sorensen
@ 2006-09-15 15:12 ` Karim Yaghmour
2006-09-16 10:41 ` Jes Sorensen
0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:12 UTC (permalink / raw)
To: Jes Sorensen
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> There are very few tracepoints in this category,

Wow, that's progress.

> the only things you can
> claim are more or less generic are syscalls, and tracing syscall
> handling is tricky.

If there are implementation issues, I trust an adequate solution can be
found by using the tested-and-proven method of posting stuff on the
lkml for review.

> This is grossly oversimplifying things and why the whole thing doesn't
> hold water. There is no such thing as 'the place' to put a specific
> tracepoint.
>
> Especially when we start talking about things like tracepoints in the
> scheduler.

I do not underestimate the difficulty of selecting such tracepoints.
This is why I chose not to maintain other people's specific tracepoints.
I realize this is a tough problem, but I also trust subsystem maintainers
are smart enough to make the appropriate decision. Obviously for such
things like the scheduler, any fine-grained instrumentation will draw
a barrage of criticism from anyone since a lot of stuff depends on it.
Either the lkml process works or it doesn't, but it isn't for me to
decide.

> Note that I haven't been referring to debug tracepoints at any point in
> this debate.

You're right, but others have happily intermingled the whole lot, and
I just wanted to document my personal categorization on lkml for all
to see.

> You seem to think that it's fine to add instrumentation in the syscall
> path as an example as long as it's compiled out. Well on some
> architectures, the syscall path is very sensitive to alignment and there
> may be restrictions on how large the stub of code is allowed to be, like
> a few hundred bytes. Just because things work one way on x86, doesn't
> mean they work like that everywhere.

If ltt failed to implement such things appropriately, then we apologize.
That fact doesn't preclude proper implementation in the future, however.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 15:12 ` Karim Yaghmour
@ 2006-09-16 10:41 ` Jes Sorensen
2006-09-16 15:28 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 10:41 UTC (permalink / raw)
To: karim
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
>> There are very few tracepoints in this category,
>
> Wow, that's progress.

Karim,

A personal question: do you feel that being patronising and insulting
is in any way going to put your LTT project in a better light? It
certainly makes it a lot harder for many of us to take your arguments
seriously.

>> the only things you can claim are more or less generic are syscalls,
>> and tracing syscall handling is tricky.
>
> If there are implementation issues, I trust an adequate solution can be
> found by using the tested-and-proven method of posting stuff on the
> lkml for review.

And how is this going to solve the case where trace code in the syscall
path has a negative impact on cacheline utilization and alignment, even
when the trace data is not being used?

>> This is grossly oversimplifying things and why the whole thing doesn't
>> hold water. There is no such thing as 'the place' to put a specific
>> tracepoint.
[snip]
> I do not underestimate the difficulty of selecting such tracepoints.
> This is why I chose not to maintain other people's specific tracepoints.
> I realize this is a tough problem, but I also trust subsystem maintainers
> are smart enough to make the appropriate decision.

So you are back to saying that trace data other people wish to collect
is uninteresting and therefore should just be ignored? If not, what you
are saying there otherwise just backs up the argument that if LTT or
something similar goes into mainline, we will see the number of
tracepoints grow significantly.

>> You seem to think that it's fine to add instrumentation in the syscall
>> path as an example as long as it's compiled out. Well on some
>> architectures, the syscall path is very sensitive to alignment and there
>> may be restrictions on how large the stub of code is allowed to be, like
>> a few hundred bytes. Just because things work one way on x86, doesn't
>> mean they work like that everywhere.
>
> If ltt failed to implement such things appropriately, then we apologize.
> That fact doesn't preclude proper implementation in the future, however.

Please read what I wrote above! Touching the syscall path with static
tracepoints is costly and has side effects! The argument that things can
be compiled out is just pointless, end users do not recompile kernels at
random and many of the 'end user' cases where people wish to visualize
trace data, are running on precompiled vendor kernels. Recompiling the
kernel and rebooting is not an option here!

In fact, the users who wish to trace data in self-compiled kernels are a
tiny subset of the potential userbase for this stuff which is primarily
useful to developers .... which in turn makes your argument about debug
tracepoints irrelevant since you are turning all the tracepoints into
debug tracepoints :)

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 10:41 ` Jes Sorensen
@ 2006-09-16 15:28 ` Karim Yaghmour
2006-09-18 8:57 ` Jes Sorensen
0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 15:28 UTC (permalink / raw)
To: Jes Sorensen
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Jes Sorensen wrote:
> A personal question: do you feel that being patronising and insulting
> is in any way going to put your LTT project in a better light? It
> certainly makes it a lot harder for many of us to take your arguments
> seriously.

ltt isn't *mine* anymore, somebody else is maintaining it at this point,
and it remains to be seen whether any of my input in this thread is:
a) appreciated by them, b) agreed with by them.

With regards to the tone of the thread, then please at least read other
people's approach to me, including yourself. I think the casual observer
will see that there was a great deal of animosity aimed at me personally.
I'll admit to being sarcastic and biting back. But that's hardly alien
to lkml.

> And how is this going to solve the case where trace code in the syscall
> path has a negative impact on cacheline utilization and alignment, even
> when the trace data is not being used?

Hmm... and then compare that to the negative impact of kprobes at
runtime. Of course if we could override the syscall table your point
disappears. That's not how ltt does it now, but it could easily be done
otherwise. All implementations I've looked at so far of syscall in
Linux involve a table. If the base of this table was a dynamically
modifiable entry, then the problem would be solved, wouldn't it?

> So you are back to saying that trace data other people wish to collect
> is uninteresting and therefore should just be ignored? If not, what you
> are saying there otherwise just backs up the argument that if LTT or
> something similar goes into mainline, we will see the number of
> tracepoints grow significantly.

I've explained earlier the difference between these things.

> Please read what I wrote above! Touching the syscall path with static
> tracepoints is costly and has side effects! The argument that things can
> be compiled out is just pointless, end users do not recompile kernels at
> random and many of the 'end user' cases where people wish to visualize
> trace data, are running on precompiled vendor kernels. Recompiling the
> kernel and rebooting is not an option here!

It is for some. And please stop repeating the syscall path stuff. It can
be solved elegantly. The fact that it hasn't up to this point is only an
excuse to keep working harder on it. There is, in fact, no reason that
the solution may not just be a combination of static markup and dynamic
modification.

> In fact, the users who wish to trace data in self-compiled kernels are a
> tiny subset of the potential userbase for this stuff which is primarily
> useful to developers .... which in turn makes your argument about debug
> tracepoints irrelevant since you are turning all the tracepoints into
> debug tracepoints :)

How many embedded Linux projects did you personally work on?

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
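For readers trying to picture the "dynamically modifiable table base"
idea floated in the message above, here is a minimal user-space sketch
in C. It is an illustration only, under loud assumptions: every name in
it (syscall_fn, plain_table, traced_table, table_base, set_tracing,
do_syscall) is hypothetical, it is not how Linux, LTT or LTTng dispatch
syscalls, and it says nothing about the alignment or vdso concerns
raised elsewhere in the thread.

```c
#include <assert.h>

/* Hypothetical handler type; real syscalls take more arguments. */
typedef long (*syscall_fn)(long arg);

/* Two made-up "syscalls" for the sake of the example. */
enum { SYS_DOUBLE, SYS_NEGATE, NR_CALLS };

static long sys_double(long a) { return a * 2; }
static long sys_negate(long a) { return -a; }

static long trace_hits;	/* how many traced calls have been made */

/* Traced wrappers: record the event, then call the real handler. */
static long traced_double(long a) { trace_hits++; return sys_double(a); }
static long traced_negate(long a) { trace_hits++; return sys_negate(a); }

static syscall_fn plain_table[NR_CALLS]  = { sys_double, sys_negate };
static syscall_fn traced_table[NR_CALLS] = { traced_double, traced_negate };

/* The "modifiable base": dispatch always goes through this pointer. */
static syscall_fn *table_base = plain_table;

/* Switching tracing on or off only changes which table is used. */
static void set_tracing(int on)
{
	table_base = on ? traced_table : plain_table;
}

static long do_syscall(int nr, long arg)
{
	assert(nr >= 0 && nr < NR_CALLS);
	return table_base[nr](arg);
}
```

With tracing off, the untraced handlers run and the dispatch path only
pays for the extra pointer load; whether that cost, and a writable table
pointer at all, would be acceptable in a real kernel is exactly what the
participants disagree about.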
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 15:28 ` Karim Yaghmour
@ 2006-09-18 8:57 ` Jes Sorensen
2006-09-18 14:48 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18 8:57 UTC (permalink / raw)
To: karim
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Karim Yaghmour wrote:
> It is for some. And please stop repeating the syscall path stuff. It can
> be solved elegantly. The fact that it hasn't up to this point is only an
> excuse to keep working harder on it. There is, in fact, no reason that
> the solution may not just be a combination of static markup and dynamic
> modification.

You just don't want to listen, this is *not* a question of a modifiable
table or not. It's a question of *how* code needs to be added to the
syscall path, we both know why a modifiable table is not going to
happen. How do you plan to handle vdso based syscalls with LTT?

>> In fact, the users who wish to trace data in self-compiled kernels are a
>> tiny subset of the potential userbase for this stuff which is primarily
>> useful to developers .... which in turn makes your argument about debug
>> tracepoints irrelevant since you are turning all the tracepoints into
>> debug tracepoints :)
>
> How many embedded Linux projects did you personally work on?

You know what, I give up. Your primary interest seems to be in attacking
people personally because they didn't start out jumping up and down
clapping their hands in support of your pet project. Even if I wanted
to I couldn't tell you about the number of different projects I have
worked on, partly because I can't remember half of them, partly because
of contract limitations, and most importantly because I do not need to
justify my experience to you.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-18 8:57 ` Jes Sorensen
@ 2006-09-18 14:48 ` Ingo Molnar
2006-09-18 15:37 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-18 14:48 UTC (permalink / raw)
To: Jes Sorensen
Cc: karim, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

* Jes Sorensen <jes@sgi.com> wrote:

> >> tiny subset of the potential userbase for this stuff which is primarily
> >> useful to developers .... which in turn makes your argument about debug
> >> tracepoints irrelevant since you are turning all the tracepoints into
> >> debug tracepoints :)
> >
> > How many embedded Linux projects did you personally work on?
>
> You know what, I give up. Your primary interest seems to be in
> attacking people personally because they didn't start out jumping up
> and down clapping their hands in support of your pet project. [...]

i'm giving up on Karim too. I did apologize to Karim for the mistake i
made in this thread-of-200-mails, but it's revolting to see that Karim
still goes on and attacks top Linux contributors like you, without
looking back, without apologizing for anything and without feeling any
remorse. Karim patronized, attacked and insulted various people dozens
of times in this thread alone. I just don't see any value in trying to
"work with" Karim anymore, because it's apparently not something he is
interested in doing. I feel a bit sorry for him too, because at heart he
must be a deeply lonely person.

( I do see value in working with Mathieu, who has shown a lot of
insight, patience, ability in cleaning up the LTT codebase and
producing LTTng. I don't envy him for having to work with Karim though.
LTTng still needs a lot of work to be upstream-acceptable but my
current impression is that Mathieu's fundamentally professional
approach will be successful. )

> > How many embedded Linux projects did you personally work on?
> >
> [...] Even if I wanted to I couldn't tell you about the number of
> different projects I have worked on, partly because I can't remember half
> of them, partly because of contract limitations, and most importantly
> because I do not need to justify my experience to you.

you don't need to justify your experience to Karim. Your countless
contributions to the Linux kernel speak for themselves.

Most tellingly, his boasting aside, the only embedded-related Linux
kernel contribution i have ever seen from Karim was the 1000-lines
relayfs code - and even that code took years for Tom Zanussi to clean up
and to get upstream. Besides that i have not seen a single line of code
from Karim - not a single patch, not a oneliner fix, nothing. So if
someone needs to prove his experience in embedded Linux matters on this
forum then it's Karim.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-18 14:48 ` Ingo Molnar
@ 2006-09-18 15:37 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-18 15:37 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Trust me, I don't intend to drag this any longer. I just want to make
sure this issue of "respect" is cleared up.

Ingo Molnar wrote:
> i'm giving up on Karim too. I did apologize to Karim for the mistake i
> made in this thread-of-200-mails, but it's revolting to see that Karim
> still goes on and attacks top Linux contributors like you, without
> looking back, without apologizing for anything and without feeling any
> remorse.

If there exists a cult where top contributors are to be venerated, then
I'm not part of it. If my calling individuals to account on their
supposed expertise on tracing, which they use as justification for
continued marginalization of such related projects, has generated so
much backlash, then it is for me but a sign of how entrenched arrogance
can be in some quarters. Don't get me wrong, I have immense respect for
the collective talent of kernel developers. But no matter how broad
collective talent can be, it cannot be omniscient.

> Karim patronized, attacked and insulted various people dozens
> of times in this thread alone. I just don't see any value in trying to
> "work with" Karim anymore, because it's apparently not something he is
> interested in doing. I feel a bit sorry for him too, because at heart he
> must be a deeply lonely person.

Ditto.

> single patch, not a oneliner fix, nothing. So if someone needs to prove
> his experience in embedded Linux matters on this forum then it's Karim.

http://www.oreilly.com/catalog/belinuxsys/

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 12:38 ` Karim Yaghmour
2006-09-15 12:32 ` Jes Sorensen
@ 2006-09-15 13:20 ` Paul Mundt
2006-09-15 13:41 ` Roman Zippel
1 sibling, 1 reply; 271+ messages in thread
From: Paul Mundt @ 2006-09-15 13:20 UTC (permalink / raw)
To: Karim Yaghmour
Cc: Jes Sorensen, Ingo Molnar, Roman Zippel, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> If you'd care to read through the thread you'd notice I've demonstrated
> time and again that those static trace points we're mostly interested
> in are never-changing. Lest something fundamentally changes with the
> kernel, there will always be a scheduling change; etc. This
> "instrumentation is evil" mantra is only substantiated if you view
> it from the point of view of someone who's only used it to debug code.
> Yet, and I repeat this again, instrumentation for in-source debugging
> is but a corner case of instrumentation in general.
>
I didn't get the "instrumentation is evil" mantra from this thread,
rather "static tracepoints are good, so long as someone else is
maintaining them". The issue comes down to who ends up maintaining the
trace points, and given how intrusive LTT was in the past, I can't see
anyone wanting to suddenly start littering them around the kernel now
(at least in the areas that they're responsible for, particularly if
it's not something that's going to be useful to most people).
Admittedly LTTng is not as bad as LTT was in this regard, though.

If static tracepoints are something that's useful for you, then you can
continue maintaining them out of tree.

> Yes, Mr. Scrub, I mean kprobes is your answer. The only reason you can
> get away with this argument is if you view it exclusively from the
> point of view of kernel development. And that's why you're wrong.
>
kprobes may not be the answer to all life's problems, but it is
non-intrusive once the initial implementation pains are out of the way..

> Please explain, honestly, why the following instrumentation point is
> going to be a maintenance drag on the person modifying the scheduler:
> @@ -1709,6 +1712,7 @@ switch_tasks:
>          ++*switch_count;
>
>          prepare_arch_switch(rq, next);
> +        TRACE_SCHEDCHANGE(prev, next);
>          prev = context_switch(rq, prev, next);
>          barrier();
>
> And please, don't bother complaining about the semantics, they can
> be changed. I'm just arguing about location/meaning/content.
>
For someone complaining about meaningless posturing on the list, posting
this as a representation for the isolated changes involved is rather
interesting. If it were down to a small handful of critical static
tracepoints in-tree and the rest left up to the people that really want
them in out-of-tree patches, I doubt LTT would have ever had half of the
resistance towards it.

It's the intrusiveness that becomes the maintenance burden, and if you
whittle it down to a point where the intrusiveness is not that big of a
deal, then I'm not sure I see what static points would buy you over
dynamic instrumentation.

It's easy to write off the maintenance overhead when you aren't the one
maintaining the code..
^ permalink raw reply [flat|nested] 271+ messages in thread
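To make the "compiled out" argument that runs through this subthread
concrete, here is a minimal user-space sketch in C, not kernel code.
TRACE_SCHEDCHANGE is the macro name from the hunk quoted above;
everything else (CONFIG_TRACE_SKETCH, struct task, trace_schedchange(),
switch_tasks() and the stub context_switch()) is a hypothetical name
invented for this illustration, and LTT/LTTng may well define the macro
quite differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch standing in for a Kconfig option (an assumption). */
#ifndef CONFIG_TRACE_SKETCH
#define CONFIG_TRACE_SKETCH 1
#endif

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
	int pid;
};

static int trace_events;	/* number of events recorded so far */

#if CONFIG_TRACE_SKETCH
static void trace_schedchange(const struct task *prev, const struct task *next)
{
	trace_events++;
	printf("sched change: %d -> %d\n", prev->pid, next->pid);
}
#define TRACE_SCHEDCHANGE(prev, next) trace_schedchange((prev), (next))
#else
/* Disabled: expands to an empty statement, so the call site emits no code. */
#define TRACE_SCHEDCHANGE(prev, next) do { } while (0)
#endif

/* Stub mirroring the shape of the quoted hunk; does no real switching. */
static struct task *context_switch(struct task *prev, struct task *next)
{
	assert(prev != NULL && next != NULL);
	return prev;
}

/* Mirrors the instrumented lines: trace first, then switch. */
static struct task *switch_tasks(struct task *prev, struct task *next)
{
	TRACE_SCHEDCHANGE(prev, next);
	return context_switch(prev, next);
}
```

Building with -DCONFIG_TRACE_SKETCH=0 removes the call from
switch_tasks() entirely. That only speaks to the code generated at the
call site; it does not settle who keeps the marker in the right place
as the surrounding function changes, which is the maintenance question
being argued here.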
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:20 ` Paul Mundt
@ 2006-09-15 13:41 ` Roman Zippel
2006-09-15 13:44 ` Jes Sorensen
2006-09-15 13:57 ` Paul Mundt
0 siblings, 2 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 13:41 UTC (permalink / raw)
To: Paul Mundt
Cc: Karim Yaghmour, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Paul Mundt wrote:

> On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> > If you'd care to read through the thread you'd notice I've demonstrated
> > time and again that those static trace points we're mostly interested
> > in are never-changing. Lest something fundamentally changes with the
> > kernel, there will always be a scheduling change; etc. This
> > "instrumentation is evil" mantra is only substantiated if you view
> > it from the point of view of someone who's only used it to debug code.
> > Yet, and I repeat this again, instrumentation for in-source debugging
> > is but a corner case of instrumentation in general.
> >
> I didn't get the "instrumentation is evil" mantra from this thread,
> rather "static tracepoints are good, so long as someone else is
> maintaining them". The issue comes down to who ends up maintaining the
> trace points,

The claim that these tracepoints would be a maintenance burden is pretty
much unproven so far. The static tracepoint haters just assume the kernel
will be littered with thousands of unrelated tracepoints, where a good
tracepoint would only document what already happens in that function, so
that the tracepoint would be far from something obscure, which only few
people could understand and maintain.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:41 ` Roman Zippel
@ 2006-09-15 13:44 ` Jes Sorensen
2006-09-15 14:03 ` Roman Zippel
2006-09-15 13:57 ` Paul Mundt
1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 13:44 UTC (permalink / raw)
To: Roman Zippel
Cc: Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Roman Zippel wrote:
> The claim that these tracepoints would be a maintenance burden is pretty
> much unproven so far. The static tracepoint haters just assume the kernel
> will be littered with thousands of unrelated tracepoints, where a good
> tracepoint would only document what already happens in that function, so
> that the tracepoint would be far from something obscure, which only few
> people could understand and maintain.

How do you propose to handle the case where two tracepoint clients want
slightly different data from the same function? I saw this with LTT
users where someone wanted things in different places in schedule().

It *is* a nightmare to maintain.

You still haven't explained your argument about kprobes not being
generally available - where?

Cheers,
Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:44 ` Jes Sorensen
@ 2006-09-15 14:03 ` Roman Zippel
2006-09-15 14:37 ` Alan Cox
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 14:03 UTC (permalink / raw)
To: Jes Sorensen
Cc: Paul Mundt, Karim Yaghmour, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Jes Sorensen wrote:

> Roman Zippel wrote:
> > The claim that these tracepoints would be a maintenance burden is pretty
> > much unproven so far. The static tracepoint haters just assume the kernel
> > will be littered with thousands of unrelated tracepoints, where a good
> > tracepoint would only document what already happens in that function, so
> > that the tracepoint would be far from something obscure, which only few
> > people could understand and maintain.
>
> How do you propose to handle the case where two tracepoint clients want
> slightly different data from the same function? I saw this with LTT
> users where someone wanted things in different places in schedule().
>
> It *is* a nightmare to maintain.

That nightmare would not be with tracepoints themselves, but with the
users of them, so you're missing the point.
Tracepoints can be abused of course, but it's quite a leap to conclude
from this that they are bad in general.

> You still haven't explained your argument about kprobes not being
> generally available - where?

Huh? What kind of explanation do you want?

$ grep KPROBES arch/*/Kconf*
arch/i386/Kconfig:config KPROBES
arch/ia64/Kconfig:config KPROBES
arch/powerpc/Kconfig:config KPROBES
arch/sparc64/Kconfig:config KPROBES
arch/x86_64/Kconfig:config KPROBES

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:03 ` Roman Zippel
@ 2006-09-15 14:37 ` Alan Cox
2006-09-15 14:34 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:37 UTC (permalink / raw)
To: Roman Zippel
Cc: Jes Sorensen, Paul Mundt, Karim Yaghmour, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 16:03 +0200, Roman Zippel wrote:
> Huh? What kind of explanation do you want?
>
> $ grep KPROBES arch/*/Kconf*
> arch/i386/Kconfig:config KPROBES
> arch/ia64/Kconfig:config KPROBES
> arch/powerpc/Kconfig:config KPROBES
> arch/sparc64/Kconfig:config KPROBES
> arch/x86_64/Kconfig:config KPROBES

Send patches. The fact nobody has them implemented on your platform
isn't a reason to implement something else, quite the reverse in fact.

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:37 ` Alan Cox
@ 2006-09-15 14:34 ` Roman Zippel
0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 14:34 UTC (permalink / raw)
To: Alan Cox
Cc: Jes Sorensen, Paul Mundt, Karim Yaghmour, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:

> On Fri, 2006-09-15 at 16:03 +0200, Roman Zippel wrote:
> > Huh? What kind of explanation do you want?
> >
> > $ grep KPROBES arch/*/Kconf*
> > arch/i386/Kconfig:config KPROBES
> > arch/ia64/Kconfig:config KPROBES
> > arch/powerpc/Kconfig:config KPROBES
> > arch/sparc64/Kconfig:config KPROBES
> > arch/x86_64/Kconfig:config KPROBES
>
> Send patches. The fact nobody has them implemented on your platform
> isn't a reason to implement something else, quite the reverse in fact.

Alan, you offer no fact at all and all I can think about this is rather
emotional and potentially offensive, so I'll refrain from further
comments. The anti-tracepoint league has made up its mind anyway, so
what's the point... :-(

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:41 ` Roman Zippel
2006-09-15 13:44 ` Jes Sorensen
@ 2006-09-15 13:57 ` Paul Mundt
2006-09-15 14:17 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Paul Mundt @ 2006-09-15 13:57 UTC (permalink / raw)
To: Roman Zippel
Cc: Karim Yaghmour, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

On Fri, Sep 15, 2006 at 03:41:03PM +0200, Roman Zippel wrote:
> > On Fri, Sep 15, 2006 at 08:38:33AM -0400, Karim Yaghmour wrote:
> > I didn't get the "instrumentation is evil" mantra from this thread,
> > rather "static tracepoints are good, so long as someone else is
> > maintaining them". The issue comes down to who ends up maintaining the
> > trace points,
>
> The claim that these tracepoints would be a maintenance burden is pretty
> much unproven so far. The static tracepoint haters just assume the kernel
> will be littered with thousands of unrelated tracepoints, where a good
> tracepoint would only document what already happens in that function, so
> that the tracepoint would be far from something obscure, which only few
> people could understand and maintain.
>
Again, this works fine so long as the number of static tracepoints is
small and manageable, but it seems like there's a division between what
the subsystem developer deems as meaningful and what someone doing the
tracing might want to look at. Static tracepoints are completely
subjective, LTT proved that this was a problem regarding general
code-level intrusiveness when the number of tracepoints in relatively
close locality started piling up based on what people considered
arbitrarily useful, and LTTng doesn't appear to do anything to address
this.

This doesn't really match my definition of a negligible maintenance
burden..
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:57 ` Paul Mundt
@ 2006-09-15 14:17 ` Karim Yaghmour
2006-09-15 14:13 ` Jes Sorensen
0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:17 UTC (permalink / raw)
To: Paul Mundt
Cc: Roman Zippel, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Paul Mundt wrote:
> subjective, LTT proved that this was a problem regarding general
> code-level intrusiveness when the number of tracepoints in relatively
> close locality started piling up based on what people considered
> arbitrarily useful, and LTTng doesn't appear to do anything to address
> this.

"LTT proved that ..." what are you talking about? Have you noticed
the posting earlier regarding the fact that the ltt tracepoints did
not change over a 5 year span? **five** years ... Where do you get
this claim that ltt trace points "started piling up"? Have a look
at figure 2 of this article and let me know exactly which of those
tracepoints are actually a problem to you:
http://www.usenix.org/events/usenix2000/general/full_papers/yaghmour/yaghmour_html/index.html

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:17 ` Karim Yaghmour
@ 2006-09-15 14:13 ` Jes Sorensen
2006-09-15 14:31 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:13 UTC (permalink / raw)
To: karim
Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Karim Yaghmour wrote:
> Paul Mundt wrote:
>> subjective, LTT proved that this was a problem regarding general
>> code-level intrusiveness when the number of tracepoints in relatively
>> close locality started piling up based on what people considered
>> arbitrarily useful, and LTTng doesn't appear to do anything to address
>> this.
>
> "LTT proved that ..." what are you talking about? Have you noticed
> the posting earlier regarding the fact that the ltt tracepoints did
> not change over a 5 year span? **five** years ... Where do you get
> this claim that ltt trace points "started piling up"? Have a look
> at figure 2 of this article and let me know exactly which of those
> tracepoints are actually a problem to you:

Because other people have tried to use LTT for additional projects,
but said projects haven't been integrated into LTT. In other words,
just because *you* haven't added those, doesn't mean someone else
won't try and do it later, if LTT was integrated.

Nice try!

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:13 ` Jes Sorensen @ 2006-09-15 14:31 ` Karim Yaghmour 2006-09-15 14:28 ` Paul Mundt 2006-09-15 14:39 ` Jes Sorensen 0 siblings, 2 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 14:31 UTC (permalink / raw) To: Jes Sorensen Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Jes Sorensen wrote: > Because other people have tried to use LTT for additional projects, > but said projects haven't been integrated into LTT. In other words, > just because *you* haven't added those, doesn't mean someone else > won't try and do it later, if LTT was integrated. Thank you. I will take it as a compliment and likely laminate this email for your suggestion that I've acted responsibly in my maintenance of ltt. Boy, can you imagine what this debate would have looked like if I had included precisely those additional projects ... C'mon Jes, if I was able to responsibly maintain ltt over 5 years *out* of the tree and I'm being labeled as incompetent all over this thread, then imagine what the very competent people maintaining the kernel could actually do. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:31 ` Karim Yaghmour @ 2006-09-15 14:28 ` Paul Mundt 2006-09-15 14:46 ` Martin J. Bligh 2006-09-15 14:51 ` Karim Yaghmour 2006-09-15 14:39 ` Jes Sorensen 1 sibling, 2 replies; 271+ messages in thread From: Paul Mundt @ 2006-09-15 14:28 UTC (permalink / raw) To: Karim Yaghmour Cc: Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais On Fri, Sep 15, 2006 at 10:31:51AM -0400, Karim Yaghmour wrote: > Jes Sorensen wrote: > > Because other people have tried to use LTT for additional projects, > > but said projects haven't been integrated into LTT. In other words, > > just because *you* haven't added those, doesn't mean someone else > > won't try and do it later, if LTT was integrated. > > Thank you. I will take it as a complement and likely laminate this > email for your suggestion that I've acted responsibly in my > maintenance of ltt. Boy, can you imagine what this debate would > have looked like if I had included precisely those additional > projects ... > Which brings back the point of static tracepoints being entirely subjective. By this line of reasoning, you define for other people what the useful tracepoints are, and couldn't care less which points they're actually interested in. How exactly is this serving the need of people looking for instrumentation, rather than a pre-canned view of what they can trace? If they already have to go with their own tracepoints for the things they're interested in, then having a few static points pre-existing doesn't really buy anyone much else either, especially if by your own admission you're not integrating the points that people _are_ interested in. 
I'm not indicating that you didn't do exactly what you should have in this situation, only that static tracepoints in general are only going to be a small part of the picture, and not a complete solution to most people on their own. Dynamic instrumentation fills the same sort of gap without worrying about arbitrary maintenance, so what exactly does shoving static instrumentation in to the kernel buy us? ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:28 ` Paul Mundt @ 2006-09-15 14:46 ` Martin J. Bligh 2006-09-15 15:22 ` Alan Cox 2006-09-15 14:51 ` Karim Yaghmour 1 sibling, 1 reply; 271+ messages in thread From: Martin J. Bligh @ 2006-09-15 14:46 UTC (permalink / raw) To: Paul Mundt Cc: Karim Yaghmour, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais > Which brings back the point of static tracepoints being entirely > subjective. By this line of reasoning, you define for other people what > the useful tracepoints are, and couldn't care less which points they're > actually interested in. How exactly is this serving the need of people > looking for instrumentation, rather than a pre-canned view of what they > can trace? If they already have to go with their own tracepoints for the > things they're interested in, then having a few static points > pre-existing doesn't really buy anyone much else either, especially if > by your own admission you're not integrating the points that people > _are_ interested in. They're not *entirely* subjective, though I agree some are. I find the fact that Andrew Morton, myself, and apparently several other people have all instrumented the memory reclaim code to tell you *why* it's failing to reclaim pages at various points in time slightly amusing, but also rather depressing. It's all rather a waste of effort. Moreover, subsystem experts know what needs to be traced in order to give useful information, and the users may not. It's a damned sight easier for them to say "oh, please turn on tracing for VM events and send me the output" than custom-construct a set of probes for that user, and send them off. There's a barrier to entry that just won't happen there. 
Hell, look at all the debug printks in the kernel for example, and the various small ad hoc tracing facilities. If all we do is unite those, it'll still be a step forwards. > I'm not indicating that you didn't do exactly what you should have in > this situation, only that static tracepoints in general are only going > to be a small part of the picture, and not a complete solution to most > people on their own. Dynamic instrumentation fills the same sort of gap > without worrying about arbitrary maintenance, so what exactly does > shoving static instrumentation in to the kernel buy us? Dynamic probes do NOT reduce maintenance, they increase it. They just push it into somebody else's lap, where it's done more inefficiently. That's not a solution. The question is what's ad hoc debug for a particular problem vs. what's generically useful. I refuse to believe that the subsystem maintainers are too stupid to be able to make that judgement call. M. ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:46 ` Martin J. Bligh @ 2006-09-15 15:22 ` Alan Cox 2006-09-15 15:47 ` Martin J. Bligh 0 siblings, 1 reply; 271+ messages in thread From: Alan Cox @ 2006-09-15 15:22 UTC (permalink / raw) To: Martin J. Bligh Cc: Paul Mundt, Karim Yaghmour, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Ar Gwe, 2006-09-15 am 07:46 -0700, ysgrifennodd Martin J. Bligh: > Moreover, subsystem experts know what needs to be traced in order to > give useful information, and the users may not. It's a damned sight > easier for them to say "oh, please turn on tracing for VM events > and send me the output" than custom-construct a set of probes for > that user, and send them off. There's a barrier to entry that just > won't happen there. That has nothing to do with the static or dynamic probe question. Scriptable dynamic probes do everything your static probes do and more. > Hell, look at all the debug printks in the kernel for example, and > the various small add-hoc tracing facilities. If all we do is unite > those, it'll still be a step forwards. Look how many there are, look how they spread, tracepoints will do the same. > Dynamic probes do NOT reduce maintenance, they increase it. Thats a logical fallacy to begin with. A dynamic probe can probe anything a static probe can. So a static probe can be implemented with a dynamic probe. In other words if you like static probe lists and your subsystem happens to be one where it is useful then you can script it with the same effect and send people the script. With kprobes you've got a passably good chance (ie if Distros can be persuaded to package the debug data) that you can say "run this systemtap script". 
With static tracepoints it's "recompile your vendor kernel in your vendor manner with your vendor initrd and add it to the boot loader" Alan ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 15:22 ` Alan Cox @ 2006-09-15 15:47 ` Martin J. Bligh 0 siblings, 0 replies; 271+ messages in thread From: Martin J. Bligh @ 2006-09-15 15:47 UTC (permalink / raw) To: Alan Cox Cc: Paul Mundt, Karim Yaghmour, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Alan Cox wrote: > Ar Gwe, 2006-09-15 am 07:46 -0700, ysgrifennodd Martin J. Bligh: >> Moreover, subsystem experts know what needs to be traced in order to >> give useful information, and the users may not. It's a damned sight >> easier for them to say "oh, please turn on tracing for VM events >> and send me the output" than custom-construct a set of probes for >> that user, and send them off. There's a barrier to entry that just >> won't happen there. > > That has nothing to do with the static or dynamic probe question. > Scriptable dynamic probes do everything your static probes do and more. No. The point is that they're not *there* and have to be modified for every kernel version. And do you mean with or without the markers in the code to tell the dynamic probes where to hook in, and what data to fetch? that makes a huge difference. Suppose, as a very real example, I want to instrument shrink_list. There are 20 or so places where it can switch what we're doing with a page for different reasons. Potentially we're scanning through many thousands of pages. If I can keep counters as I go through the function, and then do one trace entry at the end, that's fairly efficient. If I have to create 20 separate hooks that all jump out of line, it's going to be a lot slower. If I log a tracepoint at every damned page every time it switches, it's going to be a nightmare. Most things can be done with dynamic probes. 
Some things will require markers in the code to tell us sustainably over time where to attach them. A few things (like the above) probably require some explicit code. >> Hell, look at all the debug printks in the kernel for example, and >> the various small add-hoc tracing facilities. If all we do is unite >> those, it'll still be a step forwards. > > Look how many there are, look how they spread, tracepoints will do the > same. As long as they all use the same infrastructure, that's an improvement. >> Dynamic probes do NOT reduce maintenance, they increase it. > > Thats a logical fallacy to begin with. A dynamic probe can probe > anything a static probe can. So a static probe can be implemented with a > dynamic probe. In the absence of the markers, I don't think that's true - there's the maintenance of exactly where they go, plus access to local data. If you mean with markers, then yes, that's fine. The markers + dynamic probes seem to be a reasonable compromise between the two. Exactly what we call that combo, static or dynamic, I don't really care ;-) > In other words if you like static probe lists and your subsystem happens > to be one where it is useful then you can script it with the same effect > and send people the script. > > With kprobes you've got a passably good chance (ie if Distros can be > persuaded to package the debug data) that you can say "run this > systemtap script". With static tracepoints its "recompile your vendor > kernel in your vendor manner with your vendor initrd and add it to the > boot loader" You're thinking of one situation where you can't recompile. I'm thinking of a situation where it's trivial to recompile. Both exist, neither is invalid. Of course, where possible, we'd like to be able to add stuff on the fly, but it's not a panacea. Without the markers, maintaining a usable set of dynamic probe points that's always available for every kernel version seems infeasible.
With them, I think it'll cover 99% of the cases, and would be pretty useful. If people agree on putting tags in there, perhaps we can discuss things like the logging mechanism, format, and readout. If not, I suppose we have to drag this debate out even longer. M. ^ permalink raw reply [flat|nested] 271+ messages in thread
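The shrink_list point in the message above (keep counters while walking the pages, then log one trace entry at the end) can be sketched in user-space C. This is only an illustration under stated assumptions: the outcome names, the flag bits, and the helpers `classify_page`, `scan_pages` and `emit_scan_summary` are all hypothetical and are not taken from the kernel or from the LTTng patches.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-page outcomes; the real function has many more cases. */
enum page_outcome {
    PAGE_RECLAIMED,
    PAGE_LOCKED,
    PAGE_DIRTY,
    PAGE_REFERENCED,
    NR_OUTCOMES
};

/* Counters accumulated during one scan, instead of one event per page. */
struct scan_stats {
    unsigned long count[NR_OUTCOMES];
};

/* Hypothetical classifier: bit 0 = locked, bit 1 = dirty, bit 2 = referenced. */
static enum page_outcome classify_page(unsigned int flags)
{
    if (flags & 0x1)
        return PAGE_LOCKED;
    if (flags & 0x2)
        return PAGE_DIRTY;
    if (flags & 0x4)
        return PAGE_REFERENCED;
    return PAGE_RECLAIMED;
}

/* Walk all pages, bumping a counter for each decision taken. */
static void scan_pages(const unsigned int *flags, size_t n, struct scan_stats *s)
{
    memset(s, 0, sizeof(*s));
    for (size_t i = 0; i < n; i++) {
        enum page_outcome o = classify_page(flags[i]);
        assert(o < NR_OUTCOMES);
        s->count[o]++;
    }
}

/* Emit a single summary record for the whole scan into buf. */
static int emit_scan_summary(const struct scan_stats *s, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "scan: reclaimed=%lu locked=%lu dirty=%lu referenced=%lu",
                    s->count[PAGE_RECLAIMED], s->count[PAGE_LOCKED],
                    s->count[PAGE_DIRTY], s->count[PAGE_REFERENCED]);
}
```

The cost per page is one counter increment, and the trace buffer sees one record per scan rather than one per page or one out-of-line hook per decision point.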
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:28 ` Paul Mundt 2006-09-15 14:46 ` Martin J. Bligh @ 2006-09-15 14:51 ` Karim Yaghmour 2006-09-15 15:00 ` Thomas Gleixner 2006-09-15 15:24 ` Alan Cox 1 sibling, 2 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 14:51 UTC (permalink / raw) To: Paul Mundt Cc: Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Paul Mundt wrote: > Which brings back the point of static tracepoints being entirely > subjective. By this line of reasoning, you define for other people what > the useful tracepoints are, and couldn't care less which points they're > actually interested in. How exactly is this serving the need of people > looking for instrumentation, rather than a pre-canned view of what they > can trace? If they already have to go with their own tracepoints for the > things they're interested in, then having a few static points > pre-existing doesn't really buy anyone much else either, especially if > by your own admission you're not integrating the points that people > _are_ interested in. > > I'm not indicating that you didn't do exactly what you should have in > this situation, only that static tracepoints in general are only going > to be a small part of the picture, and not a complete solution to most > people on their own. Dynamic instrumentation fills the same sort of gap > without worrying about arbitrary maintenance, so what exactly does > shoving static instrumentation in to the kernel buy us? And this flies in the face of all of those who, for years, have been satisfied customers for ltt and who were more than looking forward to not having to depend on me to get a working traceable kernel. The static tracepoints we maintained were *the* solution for a great many people.
As a maintainer I had two choices with those who were not content: a- Maintain their tracepoints for them -- not happening. b- Suggest they contribute to helping getting a generic tracing infrastructure into the kernel and then make their case on the lkml as to the pertinence of their instrumentation. And what I did is "b". I wasn't going to defend anybody else's choice of tracepoints. Those who were using ltt for its designated purpose -- allowing normal users and developers to get an accurate view of the behavior of their system -- were very happy with it. You want to know who was unhappy with using it: kernel developers. It just wasn't geared for them. Which goes back to my earlier arguments ... Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 14:51 ` Karim Yaghmour @ 2006-09-15 15:00 ` Thomas Gleixner 2006-09-15 15:28 ` Karim Yaghmour 2006-09-15 18:16 ` Andrew Morton 2006-09-15 15:24 ` Alan Cox 1 sibling, 2 replies; 271+ messages in thread From: Thomas Gleixner @ 2006-09-15 15:00 UTC (permalink / raw) To: karim Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais On Fri, 2006-09-15 at 10:51 -0400, Karim Yaghmour wrote: > And what I did is "b". I wasn't going to defend anybody else's > choice of tracepoints. Those who were using ltt for its designated > purpose -- allowing normal users and developers to get an accurate > view of the behavior of their system -- were very happy with it. > > You want to know who was unhappy with using it: kernel developers. > It just wasn't geared for them. Which goes back to my earlier > arguments ... What do you want to prove with this rant ? Simply the fact that your view of tracing is not matching the view of others. Nothing else. You just made it clear, that your solution was and still is targeted on one single user group. Nobody is opposing instrumentation per se, we just need to figure out a good solution suitable for endusers, kernel developers, debug fetishists ... without splattering ten different tracers all across the kernel source. The way to a solid kernel instrumentation is definitely not by pushing a single purpose solution in, which we have to _maintain_ for a long time without being convinced that it is the _best_ technical solution we can have right now. tglx ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 15:00 ` Thomas Gleixner @ 2006-09-15 15:28 ` Karim Yaghmour 2006-09-15 18:16 ` Andrew Morton 1 sibling, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 15:28 UTC (permalink / raw) To: tglx Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Thomas Gleixner wrote: > You just made it clear, that your solution was and still is targeted on > one single user group. And that was part of my point. Every time I got in a debate on lkml regarding ltt, there were crowds screaming in horror at the possibility of trace points everywhere. > Nobody is opposing instrumentation per se, we just need to figure out a > good solution suitable for endusers, kernel developers, debug > fetishists ... without splattering ten different tracers all across the > kernel source. I agree entirely. > The way to a solid kernel instrumentation is definitely not by pushing a > single purpose solution in, which we have to _maintain_ for a long time > without being convinced that it is the _best_ technical solution we can > have right now. I think we're in full agreement. A solid kernel instrumentation mechanism is exactly what is needed. The whole point of posting the ltt stuff on the lkml is exactly to get the best technical solution. The ltt developers are more than happy to take suggestions as to how to achieve this. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 15:00 ` Thomas Gleixner 2006-09-15 15:28 ` Karim Yaghmour @ 2006-09-15 18:16 ` Andrew Morton 2006-09-15 18:19 ` Ingo Molnar ` (3 more replies) 1 sibling, 4 replies; 271+ messages in thread From: Andrew Morton @ 2006-09-15 18:16 UTC (permalink / raw) To: tglx Cc: karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais On Fri, 15 Sep 2006 17:00:47 +0200 Thomas Gleixner <tglx@linutronix.de> wrote: > On Fri, 2006-09-15 at 10:51 -0400, Karim Yaghmour wrote: > > And what I did is "b". I wasn't going to defend anybody else's > > choice of tracepoints. Those who were using ltt for its designated > > purpose -- allowing normal users and developers to get an accurate > > view of the behavior of their system -- were very happy with it. > > > > You want to know who was unhappy with using it: kernel developers. > > It just wasn't geared for them. Which goes back to my earlier > > arguments ... > > What do you want to prove with this rant ? Simply the fact that your > view of tracing is not matching the view of others. Nothing else. What Karim is sharing with us here (yet again) is the real in-field experience of real users (ie: not kernel developers). I mean, on one hand we have people explaining what they think a tracing facility should and shouldn't do, and on the other hand we have a guy who has been maintaining and shipping exactly that thing to (paying!) customers for many years. Me thinks our time would be best spent trying to benefit from his experience.. Me, I'm not particularly averse to some 50-100 static tracepoints if experience tells us that we need such things. And both Karim's and Frank's experience does indicate that such things are needed, which carries weight. 
What I _am_ concerned about with this patchset is all the infrastructural goop which backs up those tracepoints. I'd have thought that a better approach would be to make those explicit tracepoints be "helpers" for the existing kprobe code. Of course, if they are properly designed, the one set of tracepoints could be used by different tracing backends - that allows us to separate the concepts of "tracepoints" and "tracing backends". ^ permalink raw reply [flat|nested] 271+ messages in thread
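The separation suggested in the message above, one set of tracepoints that different tracing backends can consume, might look roughly like the following user-space C sketch. Everything here is an assumption made for illustration: `trace_mark`, `trace_set_backend`, the two backends and `do_work` are hypothetical names, and this is not the marker interface of any posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* A backend is any function that consumes an event name and one argument. */
typedef void (*trace_backend_fn)(const char *event, long arg);

/* Currently attached backend; NULL means tracing is switched off. */
static trace_backend_fn active_backend = NULL;

/* Attach or detach (pass NULL) a backend at run time. */
static void trace_set_backend(trace_backend_fn fn)
{
    active_backend = fn;
}

/* The "tracepoint": marks a location and its data, knows nothing of backends. */
static inline void trace_mark(const char *event, long arg)
{
    assert(event != NULL);
    if (active_backend)
        active_backend(event, arg);
}

/* Hypothetical backend 1: only counts how many events fired. */
static unsigned long counted_events;
static void backend_count(const char *event, long arg)
{
    (void)event;
    (void)arg;
    counted_events++;
}

/* Hypothetical backend 2: remembers the argument of the latest event. */
static long last_arg;
static void backend_record_last(const char *event, long arg)
{
    (void)event;
    last_arg = arg;
}

/* Instrumented code: the call sites stay the same whichever backend is used. */
static long do_work(long x)
{
    trace_mark("work_start", x);
    long r = x * 2;
    trace_mark("work_done", r);
    return r;
}
```

With no backend attached, each marker costs one pointer test. Swapping `backend_count` for `backend_record_last` changes what is collected without touching `do_work`, which is the tracepoint/backend split in miniature.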
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 18:16 ` Andrew Morton @ 2006-09-15 18:19 ` Ingo Molnar 2006-09-15 19:26 ` Karim Yaghmour ` (2 more replies) 2006-09-15 19:35 ` Thomas Gleixner ` (2 subsequent siblings) 3 siblings, 3 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 18:19 UTC (permalink / raw) To: Andrew Morton Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Andrew Morton <akpm@osdl.org> wrote: > What Karim is sharing with us here (yet again) is the real in-field > experience of real users (ie: not kernel developers). well, Jes has that experience and Thomas too. > I mean, on one hand we have people explaining what they think a > tracing facility should and shouldn't do, and on the other hand we > have a guy who has been maintaining and shipping exactly that thing to > (paying!) customers for many years. so does Thomas and Jes. So what's the point? i judge LTT by its current code quality, not by its proponents shouting volume - and that quality is still quite poor at the moment. (and then there are the conceptual problems too, outlined numerous times) I have quoted specific example(s) for that in this thread. Furthermore, LTT does this: 246 files changed, 26207 insertions(+), 71 deletions(-) and this gives me the shivers, for all the reasons i outlined. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 18:19 ` Ingo Molnar @ 2006-09-15 19:26 ` Karim Yaghmour 2006-09-15 19:43 ` Roman Zippel 2006-09-15 20:13 ` Andrew Morton 2 siblings, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 19:26 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, tglx, Paul Mundt, Jes Sorensen, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > well, Jes has that experience and Thomas too. ... > so does Thomas and Jes. So what's the point? Either I'm too stupid for you to bother replying to any of my emails (which is very possible) or, shall we say politely, you're not exactly humble. I've responded to half a dozen of your emails, yet you have not deemed it worthwhile to talk to me directly. First you came out screaming that static tracepoints are heresy, and then when there was non-ltt-specific interest being voiced for code markup, you viciously set out to fud ltt as best you can using your experience at implementing kernel tracers as ammunition. So answer this simple question, how many tracers did you actually write which were geared for non-kernel-developer users? Based on your own account from yesterday, the answer I conclude is: NONE. I'd say you've got pretty strong opinions about something you've never attempted to do. Of course you claim that all tracers are the same, how could they be different? But that's where experience talks and hubris walks. > i judge LTT by its current code quality, not by its proponents shouting > volume - and that quality is still quite poor at the moment. You're either skillfully trying to steer arguments in your direction or you're simply unaware of the basic rules of debating. 
You started by saying that static instrumentation of any kind is evil, yet this is demonstrably false, if nothing else by the outpour of experience from those who have had to maintain non-inlined instrumentation. Then you proceed to try to amalgamate this attack with a vicious attack on ltt. I'll say it one more time: the ltt code gets posted to lkml *for review*. If you're that concerned about the code, then go ahead, look at it and tell the maintainers what you'd like to see fixed. Instead, you run out and come back and conclude "The best that Frank and me came up ..." and then you present your own nomenclature for static instrumentation. I mean, if nothing else, have a little decency for those who have put effort in trying to make this stuff work. I mean, at least explain to me why you insist on using such a tone against a project that is now within its 7th year of existence (a pretty long lifetime if you ask me for something that has been labeled useless all over this thread.) Do you actually realize the lkml's past reluctance to admitting a standard tracing mechanism into the kernel has actually contributed in doing great harm to those who had put substantial personal and financial investment in getting something to work? I'll spare you the political debates, but look at past involvement of major corporate users in ltt and ask yourself why they've decided to put their efforts elsewhere. We were basically told: we cannot justify investing any further funds in a project which does not seem to gain any sort of acceptance by the kernel developers. I've never complained about this before because I don't like whining. Do, however, realize that the fact that there are 4 separate teams working on this in parallel (ltt, lkst, systemtap, lket, off the top of my head) is directly due to the lack of success ltt has had in being admitted into the kernel. Do, at least, realize that this is a huge miscarriage of the lkml process.
And finally, do realize that in 2000 I personally contacted the head of the DProbes project at IBM in order to foster common development, following which ltt was effectively modified in order to allow dynamic instrumentation of the kernel ... cheesh ... Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 18:19 ` Ingo Molnar 2006-09-15 19:26 ` Karim Yaghmour @ 2006-09-15 19:43 ` Roman Zippel 2006-09-15 20:05 ` Ingo Molnar 2006-09-15 20:13 ` Andrew Morton 2 siblings, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-15 19:43 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Fri, 15 Sep 2006, Ingo Molnar wrote: > > What Karim is sharing with us here (yet again) is the real in-field > > experience of real users (ie: not kernel developers). > > well, Jes has that experience and Thomas too. > > > I mean, on one hand we have people explaining what they think a > > tracing facility should and shouldn't do, and on the other hand we > > have a guy who has been maintaining and shipping exactly that thing to > > (paying!) customers for many years. > > so does Thomas and Jes. So what's the point? That only Karim's experience is being in question here? > i judge LTT by its current code quality, not by its proponents shouting > volume - and that quality is still quite poor at the moment. (and then > there are the conceptual problems too, outlined numerous times) I have > quoted specific example(s) for that in this thread. Furthermore, LTT > does this: > > 246 files changed, 26207 insertions(+), 71 deletions(-) > > and this gives me the shivers, for all the reasons i outlined. Well, I'm first to admit that LTT needs improvement, but that has never been the point. We need to get to some kind of agreement what level of tracing Linux should support in general, preferably something that is easy to integrate and usable by everyone. Especially the latter means that there is not one true solution, so we need to figure out what kind of common infrastructure can be implemented, from which all of them can benefit. 
At this point you've been rather uncompromising contrary to every single argument from either side. bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 19:43 ` Roman Zippel @ 2006-09-15 20:05 ` Ingo Molnar 2006-09-15 20:22 ` Mathieu Desnoyers 2006-09-15 21:12 ` Roman Zippel 0 siblings, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 20:05 UTC (permalink / raw) To: Roman Zippel Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > Hi, > > On Fri, 15 Sep 2006, Ingo Molnar wrote: > > > > What Karim is sharing with us here (yet again) is the real in-field > > > experience of real users (ie: not kernel developers). > > > > well, Jes has that experience and Thomas too. > > > > > I mean, on one hand we have people explaining what they think a > > > tracing facility should and shouldn't do, and on the other hand we > > > have a guy who has been maintaining and shipping exactly that thing to > > > (paying!) customers for many years. > > > > so does Thomas and Jes. So what's the point? > > That only Karim's experience is being in question here? i think you misunderstood, please read the paragraphs above. They suggest that there's "real in-field experience of real users" against "people explaining what they think a tracing facility should and shouldn't do". I only pointed out that those people (Thomas, Jes) dont just randomly express their opinion but have actual in-field experience too (of paying customers), about the very topic at hand. > > i judge LTT by its current code quality, not by its proponents shouting > > volume - and that quality is still quite poor at the moment. (and then > > there are the conceptual problems too, outlined numerous times) I have > > quoted specific example(s) for that in this thread. 
Furthermore, LTT > > does this: > > > > 246 files changed, 26207 insertions(+), 71 deletions(-) > > > > and this gives me the shivers, for all the reasons i outlined. > > Well, I'm first to admit that LTT needs improvement, but that has > never been the point. that might not be your point, but that very much is my point. I do claim that LTT's problems arise out of its fundamental mistake on the kernel side: that it is a static tracer that tries to be too many things to too many people. SystemTap is available here and today on an unmodified upstream kernel. LTT has been in this shape for the past ~8 years. But if you wish you can certainly prove me wrong via for example cleaning up and shrinking LTT down to a size and impact that is not scary anymore, with the same functionality, and the clear future path for the removal of its dependencies. I tried to argue that in the abstract, but please by all means feel free to prove me wrong. (or argue against my specific points) > We need to get to some kind of agreement what level of tracing Linux > should support in general, preferably something that is easy to > integrate and usable by everyone. Especially the latter means that > there is not one true solution, [...] sorry, but i disagree. There _is_ a solution that is superior in every aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer) > At this point you've been rather uncompromising [...] yes, i'm rather uncompromising when i sense attempts to push inferior concepts into the core kernel _when_ a better concept exists here and today. Especially if the concept being pushed adds more than 350 tracepoints that expose something to user-space that amounts to a complex external API, which tracepoints we have little chance of ever getting rid of under a static tracing concept. i'm also looking at it this way too: you already seem to be quite reluctant to add kprobes to your architecture today. 
How reluctant would you be tomorrow if you had static tracepoints, which would remove a fair chunk of incentive to implement kprobes? Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:05 ` Ingo Molnar
@ 2006-09-15 20:22 ` Mathieu Desnoyers
2006-09-15 21:08 ` Jose R. Santos
` (2 more replies)
2006-09-15 21:12 ` Roman Zippel
1 sibling, 3 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:22 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais

Please Ingo, stop repeating false arguments without taking into account
people's corrections:

* Ingo Molnar (mingo@elte.hu) wrote:
> sorry, but i disagree. There _is_ a solution that is superior in every
> aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
>

I am sorry to have to repeat myself, but this is not true for heavy loads.

> > At this point you've been rather uncompromising [...]
>
> yes, i'm rather uncompromising when i sense attempts to push inferior
> concepts into the core kernel _when_ a better concept exists here and
> today. Especially if the concept being pushed adds more than 350
> tracepoints that expose something to user-space that amounts to a
> complex external API, which tracepoints we have little chance of ever
> getting rid of under a static tracing concept.
>

From an earlier email from Tim Bird:

"I still think that this is off-topic for the patch posted. I think we
should debate the implementation of tracepoints/markers when someone posts
a patch for some. I think it's rather scurrilous to complain about code
NOT submitted. Ingo has even mis-characterized the not-submitted
instrumentation patch, by saying it has 350 tracepoints when it has no
such thing. I counted 58 for one architecture (with only 8 being
arch-specific)."

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:22 ` Mathieu Desnoyers
@ 2006-09-15 21:08 ` Jose R. Santos
2006-09-15 21:25 ` Mathieu Desnoyers
2006-09-15 22:03 ` Ingo Molnar
2006-09-15 21:32 ` Ingo Molnar
2006-09-16 9:59 ` Jes Sorensen
2 siblings, 2 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 21:08 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> Please Ingo, stop repeating false arguments without taking into account
> people's corrections:
>
> * Ingo Molnar (mingo@elte.hu) wrote:
> > sorry, but i disagree. There _is_ a solution that is superior in every
> > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> >
>
> I am sorry to have to repeat myself, but this is not true for heavy loads.
>

This thread has already discussed the merits of static instrumentation
when it comes to performance impact. The key now is to find a balance
between static and dynamic probes. While it is true that static probes
will provide less overhead compared to dynamic probes, some probe points
will see less of a measurable performance impact from dynamic probes due
to the nature of the probe. We need to find what that balance is.

To some people performance is the #1 priority and to others it is
flexibility. I would like to come up with a list of those probe points
that absolutely need to be inserted into the code statically. Those
that are not absolutely critical to have statically should be
implemented dynamically.

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:08 ` Jose R. Santos
@ 2006-09-15 21:25 ` Mathieu Desnoyers
2006-09-15 22:02 ` Jose R. Santos
2006-09-15 22:03 ` Ingo Molnar
1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 21:25 UTC (permalink / raw)
To: Jose R. Santos
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> To some people performance is the #1 priority and to others it is
> flexibility. I would like to come up with a list of those probe points
> that absolutely need to be inserted into the code statically. Those
> that are not absolutely critical to have statically should be
> implemented dynamically.
>

I agree with you that only very specific parts of the kernel have this
kind of high throughput. Using kprobes for lower-throughput tracepoints is
perfectly acceptable from my point of view, as it does not perturb the
system too much.

I would suggest (as a beginning) those "standard" high event rate
tracepoints:

(taken from the highest rates in
http://sourceware.org/ml/systemtap/2005-q4/msg00451.html)

- syscall entry/exit
- irq entry/exit
- softirq entry/exit
- tasklet entry/exit
- trap entry/exit
- scheduler change
- wakeup
- network traffic (packet in/out)
- "select" and "poll" system calls
- page_alloc/page_free

(be warned: this list is probably incomplete, too exhaustive or can cause
dizziness under stress condition) :)

However, a tracing infrastructure should still provide the ability for
developers to instrument their own high traffic interrupt handler with a
very low overhead.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:25 ` Mathieu Desnoyers
@ 2006-09-15 22:02 ` Jose R. Santos
0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 22:02 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> * Jose R. Santos (jrs@us.ibm.com) wrote:
> > To some people performance is the #1 priority and to others it is
> > flexibility. I would like to come up with a list of those probe points
> > that absolutely need to be inserted into the code statically. Those
> > that are not absolutely critical to have statically should be
> > implemented dynamically.
> >
>
> I agree with you that only very specific parts of the kernel have this
> kind of high throughput. Using kprobes for lower-throughput tracepoints is
> perfectly acceptable from my point of view, as it does not perturb the
> system too much.
>
> I would suggest (as a beginning) those "standard" high event rate
> tracepoints:
>
> (taken from the highest rates in
> http://sourceware.org/ml/systemtap/2005-q4/msg00451.html)
>
> - syscall entry/exit
> - irq entry/exit
> - softirq entry/exit
> - tasklet entry/exit
> - trap entry/exit
> - scheduler change
> - wakeup
> - network traffic (packet in/out)
> - "select" and "poll" system calls
> - page_alloc/page_free
>
> (be warned: this list is probably incomplete, too exhaustive or can cause
> dizziness under stress condition) :)
>
> However, a tracing infrastructure should still provide the ability for
> developers to instrument their own high traffic interrupt handler with a
> very low overhead.
>

This is based on a single scenario, which is wrong. A criterion needs to
be established that describes the justification for a static trace hook.
Based on the previous comments on the thread, this list already seems
too big.

If a user of the trace tool absolutely needs to have the best
performance, then the proposed tool should be smart enough to use static
hooks if available but revert back to dynamic probes if there is no
available static counterpart. This performance static tracepoint patch
can be maintained outside of the kernel tree without bloating the
kernel. This way he can have mostly dynamic trace points but at least
provide some sort of mechanism for those that absolutely must have
static hooks in order to get useful data out of the trace tool.

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
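The fallback Jose describes (use a static hook when one is available,
otherwise attach a dynamic probe) could be sketched in user-space C
roughly as follows. This is only an illustration under stated
assumptions: the hook table, choose_probe() and the event names are
hypothetical and do not correspond to any LTTng, kprobes or SystemTap
interface.

```c
#include <stddef.h>
#include <string.h>

/* How a given event ends up being instrumented. */
enum probe_kind { PROBE_STATIC, PROBE_DYNAMIC };

/* Hypothetical list of events that have a compiled-in static hook. */
static const char *const static_hooks[] = {
	"syscall_entry",
	"irq_entry",
	"sched_switch",
};

/* Return 1 if a static hook exists for this event, 0 otherwise. */
static int has_static_hook(const char *event)
{
	size_t i;

	for (i = 0; i < sizeof(static_hooks) / sizeof(static_hooks[0]); i++)
		if (strcmp(static_hooks[i], event) == 0)
			return 1;
	return 0;
}

/* Prefer the static hook; otherwise fall back to a dynamic probe. */
enum probe_kind choose_probe(const char *event)
{
	return has_static_hook(event) ? PROBE_STATIC : PROBE_DYNAMIC;
}
```

With this table, choose_probe("irq_entry") selects the static hook, while
an event without one, such as "page_alloc" here, falls back to a dynamic
probe.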
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:08 ` Jose R. Santos
2006-09-15 21:25 ` Mathieu Desnoyers
@ 2006-09-15 22:03 ` Ingo Molnar
2006-09-15 22:32 ` Karim Yaghmour
` (2 more replies)
1 sibling, 3 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 22:03 UTC (permalink / raw)
To: Jose R. Santos
Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, karim,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos <jrs@us.ibm.com> wrote:
> [...] While it is true that static probes will provide less overhead
> compared to dynamic probes, [...]

that is not true at all. Yes, an INT3 based kprobe might be expensive if
+0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that
is "only" an implementation detail, not a conceptual property.
Especially considering that help (djprobes) is on the way. And in the
future, as more and more code gets generated (and regenerated) on the
fly, dynamic probes will be _faster_ than static probes - plainly
because they adapt better to the environment they plug into.

so there's basically nothing to balance. My point is that dynamic probes
have won or will win on every front, and we shouldn't tie ourselves down
with static tracers. 5 years ago with no kprobes, had someone submitted
a clean static tracer patchset, we could probably not have resisted it
(though i probably would have resisted it on the grounds of maintenance
overhead) and would have added it because tracing makes sense in
general. But today there's just no reason to add static tracers anymore.

NOTE: i still accept the temporary (or non-temporary) introduction of
static markers, to help dynamic tracing. But my expectation is that
these markers will be less intrusive than static tracepoints, and a lot
more flexible.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
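To put the +0.5 usecs figure in perspective: on a 1GHz CPU it is about
500 cycles per probe hit, and the total cost scales with the event rate.
A small sketch of that arithmetic in C; the function names are made up
for illustration, and the event rates in the note below are assumed
examples, not measurements from this thread.

```c
#include <stdint.h>

/* Convert a per-probe cost in nanoseconds to CPU cycles at a clock in MHz. */
uint64_t probe_cost_cycles(uint64_t cost_ns, uint64_t cpu_mhz)
{
	return cost_ns * cpu_mhz / 1000;
}

/* CPU time spent in probes, in parts per million of one CPU. */
uint64_t probe_overhead_ppm(uint64_t events_per_sec, uint64_t cost_ns)
{
	/*
	 * events/s * ns/event = ns of probe time per second of wall time.
	 * One second is 1e9 ns, so dividing by 1000 yields parts per million.
	 */
	return events_per_sec * cost_ns / 1000;
}
```

With cost_ns = 500, a hypothetical 100,000 events per second works out to
50,000 ppm, i.e. 5% of one CPU, while 1,000 events per second is 500 ppm,
i.e. 0.05%.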
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 22:03 ` Ingo Molnar @ 2006-09-15 22:32 ` Karim Yaghmour 2006-09-15 22:43 ` Ingo Molnar 2006-09-15 22:59 ` Frank Ch. Eigler 2006-09-15 23:17 ` Jose R. Santos 2 siblings, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 22:32 UTC (permalink / raw) To: Ingo Molnar Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > that is not true at all. Yes, an INT3 based kprobe might be expensive if > +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that > is "only" an implementation detail, not a conceptual property. > Especially considering that help (djprobes) is on the way. And in the djprobes has been "on the way" for some time now. Why don't you at least have the intellectual honesty to use the same rules you've repeatedly used against ltt elsewhere in this thread -- i.e. what it does today is what it is, and what it does today isn't worth bragging about. But that would be too much to ask of you Ingo, wouldn't it? But, sarcasm aside, even if this mechanism existed it still wouldn't resolve the need for static markup. It would just make djprobe a likelier candidate for tools that cannot currently rely on kprobes. > NOTE: i still accept the temporary (or non-temporary) introduction of > static markers, to help dynamic tracing. But my expectation is that > these markers will be less intrusive than static tracepoints, and a lot > more flexible. Chalk one up for nice endorsement and another for arbitrary distinction. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 22:32 ` Karim Yaghmour @ 2006-09-15 22:43 ` Ingo Molnar 2006-09-15 23:33 ` Karim Yaghmour 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 22:43 UTC (permalink / raw) To: Karim Yaghmour Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Karim Yaghmour <karim@opersys.com> wrote: > Ingo Molnar wrote: > > that is not true at all. Yes, an INT3 based kprobe might be expensive if > > +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that > > is "only" an implementation detail, not a conceptual property. > > Especially considering that help (djprobes) is on the way. And in the > > djprobes has been "on the way" for some time now. Why don't you at > least have the intellectual honesty to use the same rules you've > repeatedly used against ltt elsewhere in this thread -- i.e. what it > does today is what it is, and what it does today isn't worth bragging > about. [...] i actually think djprobes are pretty darn inventive. I also think that the tracebuffer management portion of LTT is better than the hacks in SystemTap, and that LTT's visualization tools are better (for example they do exist :-) - so clearly there's synergy possible. But i have no faith at all, for the many reasons outlined before, in the concept of static tracing, because i see no possible future path out of its many limitations and because i see no possible future way to get rid of their dependencies. So i'd rather wait some time for dynamic tracers to outgrow static tracers in even the last final area, than let static tracing into the kernel - which would add dependencies that we'd have to live with almost until eternity. > But, sarcasm aside, even if this mechanism existed it still wouldn't > resolve the need for static markup. 
It would just make djprobe a > likelier candidate for tools that cannot currently rely on kprobes. it would clearly reduce the number of places where static markup would still be necessary. With static tracers i see no such mechanism that gradually moves the markups out of the kernel. > > NOTE: i still accept the temporary (or non-temporary) introduction > > of static markers, to help dynamic tracing. But my expectation is > > that these markers will be less intrusive than static tracepoints, > > and a lot more flexible. > > Chalk one up for nice endorsement and another for arbitrary > distinction. So you dispute that markups for dynamic tracing will be more flexible and you dispute that they will be less intrusive than markups for static tracing? Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 22:43 ` Ingo Molnar
@ 2006-09-15 23:33 ` Karim Yaghmour
2006-09-15 23:52 ` Ingo Molnar
2006-09-15 23:53 ` Ingo Molnar
0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 23:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> i actually think djprobes are pretty darn inventive.

So do I. While there is a language barrier, the Hitachi folks, especially
Hiramatsu-san, are very talented.

> I also think that
> the tracebuffer management portion of LTT is better than the hacks in
> SystemTap, and that LTT's visualization tools are better (for example
> they do exist :-) - so clearly there's synergy possible.

Great, because I believe all those involved would like to see this
happen. I personally am convinced that none of those involved want to
continue wasting their time in parallel.

> But i have no
> faith at all, for the many reasons outlined before, in the concept of
> static tracing, because i see no possible future path out of its many
> limitations and because i see no possible future way to get rid of their
> dependencies.

Yes, I do so believe that this is what you most sincerely think. And I'm
ok with that. We don't have to approach the problem from the same
direction. In my view we should at least settle for working on the most
basic thing we *do* agree on: having a markup mechanism for necessary
instrumentation.

> So i'd rather wait some time for dynamic tracers to
> outgrow static tracers in even the last final area, than let static
> tracing into the kernel - which would add dependencies that we'd have to
> live with almost until eternity.

I genuinely understand your concern. And I repeat that ltt's initial
design cared little about the provenance of the events. It just needed
key events to present an intelligent picture to the user. The patches
have since grown to include stuff which was essential as development
went ahead. But there's no reason things cannot be refactored into a
format acceptable to all by review on the lkml.

> it would clearly reduce the number of places where static markup would
> still be necessary. With static tracers i see no such mechanism that
> gradually moves the markups out of the kernel.

Again, I strongly believe that this issue isn't about static vs.
dynamic. The goal, and that's what's important, is to allow users to
have access to a set of tools they can use on *any* kernel they get
their hands on, without having to edit anything anywhere or fix any
script. Having spent considerable effort on this, I don't see any
other way than using static markup. Here's a simple case: you ask
someone who's got a bug report on a kernel crashing because of his
user-space realtime task, and you ask him to dump you a trace, and
that trace actually ends up misleading because his out-of-tree
instrumentation was inserted in the wrong location. Again, the goal
is to obtain tools that users can use on *any* kernel they get their
hands on.

> So you dispute that markups for dynamic tracing will be more flexible
> and you dispute that they will be less intrusive than markups for static
> tracing?

No, I'm saying that the flexibility of the markup is not tied to the
instrumentation "grab" mechanism (direct call or binary editing.) That's
the "arbitrary" I'm talking about.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:33 ` Karim Yaghmour
@ 2006-09-15 23:52 ` Ingo Molnar
2006-09-16 2:24 ` Karim Yaghmour
2006-09-15 23:53 ` Ingo Molnar
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:52 UTC (permalink / raw)
To: Karim Yaghmour
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Karim Yaghmour <karim@opersys.com> wrote:
> > So you dispute that markups for dynamic tracing will be more
> > flexible and you dispute that they will be less intrusive than
> > markups for static tracing?
>
> No, I'm saying that the flexibility of the markup is not tied to the
> instrumentation "grab" mechanism (direct call or binary editing.)
> That's the "arbitrary" I'm talking about.

ok, then i'd like to dispute your point. Contrary to your statement
there is a very fundamental difference between "static tracing" (static
call, which relies on compile-time insertion of trace points) and
"dynamic tracing" (which can insert trace points almost anywhere) -
_even if both use in-source markers_.

The fundamental difference is this: dynamic tracing has full access to
the full environment of the code that it taps into _at the time of
tracepoint activation_, while static tracing has to get all its context
during compilation.

To make my point easier to understand, consider the following example:
we want to tap into the middle of a global_function():

	int global_function(int arg1, int arg2, int arg3)
	{
		... [lots of code] ...

		x = func2();

		... [lots of code] ...
	}

We want to trace the function right after 'x' has been assigned, and we
want to trace an event_A, with parameters: arg1, arg2, arg3 and x. This
is a pretty common scenario. Ok so far?

here is how the markup looks like under static tracing:

	int global_function(int arg1, int arg2, int arg3)
	{
		... [lots of code] ...

		x = func2();
		D(event_A, arg1, arg2, arg3, x);

		... [lots of code] ...
	}

that's what you'd expect, right? This is pretty common too, up to this
point.

now how could the markup look like for a dynamic tracepoint:

	int global_function(int arg1, int arg2, int arg3)
	{
		... [lots of code] ...

		x = func2();
		D(event_A, x);

		... [lots of code] ...
	}

Note: there's no (arg1, arg2, arg3) passed to the markup! Why? Because
SystemTap has full access to the function's arguments and in this
particular case it's simply not necessary to reference them explicitly.
So the markup has less of an overhead because it does not 'touch' arg1,
arg2, arg3 if the tracepoint is not active [which is the common case we
optimize for]. Furthermore, the markup is also visually less intrusive.

But better than that, the markup could look like this as well:

	int global_function(int arg1, int arg2, int arg3)
	{
		... [lots of code] ...

		x = func2();

		... [lots of code] ...
	}

right, no markup at all, but in a script somewhere we'd have:

	insert.trace(global_function: "x = func2();", after);

or maybe even in a script, annotated in patch format, so that the
context of the tapped code is captured too.

so, as a result: the dynamic markup() does the same, but has less impact
on the compiled code (less parameters touched), and is more flexible in
terms of attachment to the source code.

Can we do any of this with the static tracepoint? We cannot,
fundamentally! So if we allowed static tracers to access that tracepoint
anytime, we could never make things more intelligent there in the
future!

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:52 ` Ingo Molnar
@ 2006-09-16 2:24 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 2:24 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> ok, then i'd like to dispute your point. Contrary to your statement
> there is a very fundamental difference between "static tracing" (static
> call, which relies on compile-time insertion of trace points) and
> "dynamic tracing" (which can insert trace points almost anywhere) -
> _even if both use in-source markers_.

Good, a nice little down-to-earth debate for a change ;)

> The fundamental difference is this: dynamic tracing has full access to
> the full environment of the code that it taps into _at the time of
> tracepoint activation_, while static tracing has to get all its context
> during compilation.

I disagree.

> To make my point easier to understand, consider the following example:
> we want to tap into the middle of a global_function():
>
> 	int global_function(int arg1, int arg2, int arg3)
> 	{
> 		... [lots of code] ...
>
> 		x = func2();
>
> 		... [lots of code] ...
> 	}
>
> We want to trace the function right after 'x' has been assigned, and we
> want to trace an event_A, with parameters: arg1, arg2, arg3 and x. This
> is a pretty common scenario. Ok so far?

Ok so far.

> here is how the markup looks like under static tracing:
>
> 	int global_function(int arg1, int arg2, int arg3)
> 	{
> 		... [lots of code] ...
>
> 		x = func2();
> 		D(event_A, arg1, arg2, arg3, x);
>
> 		... [lots of code] ...
> 	}
>
> that's what you'd expect, right? This is pretty common too, up to this
> point.

No, that's not what I'd necessarily expect, though it could be and
definitely does match current standard practice. There's no reason,
though, that D(foo) couldn't be calling a statically-linked function
which has a pluggable interface (a module-overloadable symbol if you'd
like) which can then do much more than initially fetch arg1-2-3 using,
as you alluded to earlier, built-in disassemblers and the likes.

One nice thing about the above, though, is that you can easily have
type information at build time and can actually create customized
logging info right there.

But this is just brain farting, more substance below.

> now how could the markup look like for a dynamic tracepoint:
>
> 	int global_function(int arg1, int arg2, int arg3)
> 	{
> 		... [lots of code] ...
>
> 		x = func2();
> 		D(event_A, x);
>
> 		... [lots of code] ...
> 	}
>
> Note: there's no (arg1, arg2, arg3) passed to the markup! Why? Because
> SystemTap has full access to the function's arguments and in this
> particular case it's simply not necessary to reference them explicitly.
> So the markup has less of an overhead because it does not 'touch' arg1,
> arg2, arg3 if the tracepoint is not active [which is the common case we
> optimize for].

Again, this does not have to be the case. D(arg1, ..., N) could actually
be defined to nothing in *ALL* cases in a header. Nothing precludes
having a special parser that only runs if tracing is enabled and then
generates a special header and corresponding C file which then have
what it takes to make these D() markups meaningful. So in this case,
the compiler never gives a damn about arg1-Z (i.e. no touch or
dependency or anything of the likes), yet a compile-time option allows
you to suddenly make D(foo) turn into a system-tap usable probe point
or a direct call to a statically-linked function (which is what I
refer to as "static tracing".)

> Furthermore, the markup is also visually less intrusive.

That's debatable. If you're going to mark something up, you might as
well state right away what's typically interesting about the event.
Sure, you could make a point that arg32 is something you may be
interested in in some cases, but if arg1-3 are the ones most relevant
99% of the time for this function, then you might as well say that in
your trace marker.

> But better than that, the markup could look like this as well:
>
> 	int global_function(int arg1, int arg2, int arg3)
> 	{
> 		... [lots of code] ...
>
> 		x = func2();
>
> 		... [lots of code] ...
> 	}
>
> right, no markup at all, but in a script somewhere we'd have:
>
> 	insert.trace(global_function: "x = func2();", after);

That's two files. If we're talking funky, and the following is by no
means an endorsement I'm making -- just showing you what could be
possible, then here's a better one:

Look ma, no hands:

	int global_function(int arg1, int arg2, int arg3)
	{
		... [lots of code] ...

		x = func2(); /*T* @here:arg1,arg2,arg3 */

		... [lots of code] ...
	}

Now you can't say that's visually wrong: we've already got tons of
outdated comments in the code. And you can't say there's entirely
no precedent: kerneldoc. Yet, this can be used by a build-time tool
which automagically generates either information for later use by
probe inserters or, alternatively, substitutes the default built
file (say foo.c) with an equivalent (foo-trace.c) which has inlined
static tracing.

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
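Karim's remark that D() "could actually be defined to nothing" unless
tracing is selected at build time can be sketched with an ordinary
preprocessor switch. Everything named here (TRACE_MARKERS,
trace_marker(), next_value()) is hypothetical and only stands in for
whatever a real marker header would provide; the "special parser"
variant he mentions is not modeled.

```c
#include <stdio.h>

/* Counts how often a marker argument was actually evaluated. */
static int evaluations;

/* Stand-in for an argument expression whose evaluation we can observe. */
int next_value(void)
{
	return ++evaluations;
}

#ifdef TRACE_MARKERS
/* Tracing build: the marker becomes a direct call to a trace function. */
static void trace_marker(const char *event, int value)
{
	printf("%s: %d\n", event, value);
}
#define D(event, value) trace_marker(#event, (value))
#else
/*
 * Default build: the marker expands to an empty statement, so its
 * arguments are never evaluated and the compiled code does not
 * depend on them.
 */
#define D(event, value) do { } while (0)
#endif

int global_function(int arg1)
{
	int x = arg1 * 2;

	D(event_A, next_value());
	return x;
}
```

Built without -DTRACE_MARKERS, global_function() runs without ever
calling next_value(); built with it, the same source line turns into a
direct call, which is the "static tracing" case.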
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:33 ` Karim Yaghmour
2006-09-15 23:52 ` Ingo Molnar
@ 2006-09-15 23:53 ` Ingo Molnar
2006-09-16 2:51 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:53 UTC (permalink / raw)
To: Karim Yaghmour
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Karim Yaghmour <karim@opersys.com> wrote:
> > the tracebuffer management portion of LTT is better than the hacks
> > in SystemTap, and that LTT's visualization tools are better (for
> > example they do exist :-) - so clearly there's synergy possible.
>
> Great, because I believe all those involved would like to see this
> happen. I personally am convinced that none of those involved want to
> continue wasting their time in parallel.

a reasonable compromise for me would be what i suggested a few mails
ago:

nor do i reject all of LTT: as i said before i like the tools, and i
think its collection of trace events should be turned into systemtap
markups and scripts. Furthermore, its ringbuffer implementation looks
better. So as far as the user is concerned, LTT could (and should) live
on with full capabilities, but with this crucial difference in how it
interfaces to the kernel source code.

i.e. could you try to just give SystemTap a chance and attempt to
integrate a portion of LTT with it ... that shares more of the
infrastructure and we'd obviously only need "one" markup variant, and
would have full markup (removal-) flexibility. I'll try to help djprobes
as much as possible. Hm?

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:53 ` Ingo Molnar
@ 2006-09-16 2:51 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-16 2:51 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx,
Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> nor do i reject all of LTT: as i said before i like the tools, and i
> think its collection of trace events should be turned into systemtap
> markups and scripts. Furthermore, its ringbuffer implementation looks
> better. So as far as the user is concerned, LTT could (and should) live
> on with full capabilities, but with this crucial difference in how it
> interfaces to the kernel source code.

The interface to the kernel source code can be worked on. I hope my
other email has demonstrated that.

> i.e. could you try to just give SystemTap a chance and attempt to
> integrate a portion of LTT with it ... that shares more of the
> infrastructure and we'd obviously only need "one" markup variant, and
> would have full markup (removal-) flexibility. I'll try to help djprobes
> as much as possible. Hm?

Preface: I have absolutely nothing against SystemTap. I did have a
bone to pick with the way it was developed (behind closed doors,
practically), but I told the SystemTap people about this and end of
story, we moved on and I've had many enjoyable discussions with the
SystemTap team since. I just have a feeling that part of the team is
proceeding as if ltt was dead and buried. They'd like to interface
with us -- at least I think -- but nobody dares to touch ltt with a
10-foot pole because it's a political hot potato, i.e. for all they
care, ltt could be a liability for SystemTap because of all the fuss
about it amongst kernel developers. But that's my take, I could be
entirely wrong.

Now, on a technical level, SystemTap cannot currently be a substitute
for what the ltt patch provides, especially in terms of performance.
Maybe one day it will be a substitute, with djprobe and other stuff,
but it isn't *now*.

Nevertheless, I'm all for encouraging a movement in a common
direction. And in that regard I think that there is consensus both
amongst the SystemTap team and within the ltt team -- at least I
think -- for having a common markers interface. This is something we
can definitely build on. Hopefully dispelling some of the ltt fud
and gathering some positive mantra for the ltt effort on lkml can
help ease people's fears about the possibility of rubbing the
kernel developers the wrong way.

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 22:03 ` Ingo Molnar
2006-09-15 22:32 ` Karim Yaghmour
@ 2006-09-15 22:59 ` Frank Ch. Eigler
2006-09-15 23:40 ` Karim Yaghmour
2006-09-15 23:17 ` Jose R. Santos
2 siblings, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 22:59 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jose R. Santos, Mathieu Desnoyers, Roman Zippel, Andrew Morton,
tglx, karim, Paul Mundt, Jes Sorensen, linux-kernel,
Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
ltt-dev, Michel Dagenais
Ingo Molnar <mingo@elte.hu> writes:
> [...] NOTE: i still accept the temporary (or non-temporary)
> introduction of static markers, to help dynamic tracing. But my
> expectation is that these markers will be less intrusive than static
> tracepoints, and a lot more flexible.
It seems like an agreement on this is coming together. You and Karim
may be in violent agreement, even if others haven't quite come around:
Let us design a static marker mechanism that can be coupled at run
time either to a dynamic system such as systemtap, or to a specialized
tracing system such as lttng (!). Then "markers" === "static
instrumentation", for purposes of the kernel developer. If the
markers are lightweight enough, then a distribution kernel can afford
keeping them compiled in.
- FChE
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 22:59 ` Frank Ch. Eigler
@ 2006-09-15 23:40 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 23:40 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Ingo Molnar, Jose R. Santos, Mathieu Desnoyers, Roman Zippel,
Andrew Morton, tglx, Paul Mundt, Jes Sorensen, linux-kernel,
Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
ltt-dev, Michel Dagenais
Frank Ch. Eigler wrote:
> Let us design a static marker mechanism that can be coupled at run
> time either to a dynamic system such as systemtap, or by a specialized
> tracing system such as lttnng (!). Then "markers" === "static
> instrumentation", for purposes of the kernel developer. If the
> markers are lightweight enough, then a distribution kernel can afford
> keeping them compiled in.
I'm all for it.
Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 22:03 ` Ingo Molnar 2006-09-15 22:32 ` Karim Yaghmour 2006-09-15 22:59 ` Frank Ch. Eigler @ 2006-09-15 23:17 ` Jose R. Santos 2 siblings, 0 replies; 271+ messages in thread From: Jose R. Santos @ 2006-09-15 23:17 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > * Jose R. Santos <jrs@us.ibm.com> wrote: > > > [...] While it is true that static probes will provide less overhead > > compared to dynamic probes, [...] > > that is not true at all. Yes, an INT3 based kprobe might be expensive if > +0.5 usecs per tracepoint (on a 1GHz CPU) is an issue to you - but that > is "only" an implementation detail, not a conceptual property. > Especially considering that help (djprobes) is on the way. And in the > future, as more and more code gets generated (and regenerated) on the > fly, dynamic probes will be _faster_ than static probes - plainly > because they adapt better to the environment they plug into. > Agree. And they are details that can be fixed. One such detail we still see issue with is kretprobes though (which we use on LKET for systemcall exit). These have problem scaling due to spinlock issues even on small smp systems. Its an implementation issue that can be fixed but I've been told that the fix is not trivial and should not expect it anytime soon. > so there's basically nothing to balance. My point is that dynamic probes > have won or will win on every front, and we shouldnt tie us down with > static tracers. 5 years ago with no kprobes, had someone submitted a > clean static tracer patchset, we could probably not have resisted it (i > though probably would have resisted it on the grounds of maintainance > overhead) and would have added it because tracing makes sense in > general. 
But today there's just no reason to add static tracers anymore. > > NOTE: i still accept the temporary (or non-temporary) introduction of > static markers, to help dynamic tracing. But my expectation is that > these markers will be less intrusive than static tracepoints, and a lot > more flexible. > Agree here as well. Sorry, I was also counting static markers as static tracepoint as well. Even with static markers, there need to be balance of what thing need to be implemented with markers vs those that can just be done dynamically. -JRS ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:22 ` Mathieu Desnoyers
2006-09-15 21:08 ` Jose R. Santos
@ 2006-09-15 21:32 ` Ingo Molnar
2006-09-15 21:58 ` Mathieu Desnoyers
2006-09-16 9:59 ` Jes Sorensen
2 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:32 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> > sorry, but i disagree. There _is_ a solution that is superior in every
> > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> >
>
> I am sorry to have to repeat myself, but this is not true for heavy
> loads.
djprobes?
> > > At this point you've been rather uncompromising [...]
> > >
> > yes, i'm rather uncompromising when i sense attempts to push inferior
> > concepts into the core kernel _when_ a better concept exists here and
> > today. Especially if the concept being pushed adds more than 350
> > tracepoints that expose something to user-space that amounts to a
> > complex external API, which tracepoints we have little chance of ever
> > getting rid of under a static tracing concept.
> >
> From an earlier email from Tim bird :
>
> "I still think that this is off-topic for the patch posted. I think
> we should debate the implementation of tracepoints/markers when
> someone posts a patch for some. I think it's rather scurrilous to
> complain about code NOT submitted. Ingo has even mis-characterized
> the not-submitted instrumentation patch, by saying it has 350
> tracepoints when it has no such thing. I counted 58 for one
> architecture (with only 8 being arch-specific)."
i missed that (way too many mails in this thread).
Here is how i counted them:
    $ grep "\<trace_.*(" * | wc -l
    359
some of those are not true tracepoints, but there's at least this many
of them:
    $ grep "\<trace_.*(" *instrumentation* | wc -l
    235
so the real number is somewhere between.
    patch-2.6.17-lttng-0.5.108-instrumentation-arm.diff
    patch-2.6.17-lttng-0.5.108-instrumentation.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-i386.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-mips.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-powerpc.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-ppc.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-s390.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-sh.diff
    patch-2.6.17-lttng-0.5.108-instrumentation-x86_64.diff
when judging kernel maintenance overhead, the sum of all patches
matters. And i considered all the other patches too (the ones that add
actual tracepoints) that will come after the currently offered ones, not
just the ones you submitted to lkml.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
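The counting method in the message above (one hit per matching line, as with grep piped to wc -l) can be sketched in Python. This is only an illustration under stated assumptions: the sample lines and the name trace_example_event are hypothetical and are not taken from the LTTng patches. It shows why such a count includes definitions as well as call sites.

```python
import re

# Same idea as the grep pattern in the message: a word that starts with
# "trace_", followed somewhere later on the line by an opening parenthesis.
TRACE_CALL = re.compile(r"\btrace_.*\(")


def count_trace_lines(lines):
    """Count lines containing at least one match, like `grep PATTERN | wc -l`."""
    return sum(1 for line in lines if TRACE_CALL.search(line))


# Hypothetical sample lines (assumption, for illustration only).
sample = [
    "+\ttrace_example_event(irq, regs);",         # call site: counted
    "+static inline void trace_example_event(",   # definition: also counted
    "+\tlocal_irq_save(flags);",                  # no match
    "+\tmy_trace_helper(x);",                     # "trace_" not at a word start
]

print(count_trace_lines(sample))
```

A line count of this kind is an upper bound on the number of tracepoints, since it cannot tell a definition from a use.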
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:32 ` Ingo Molnar @ 2006-09-15 21:58 ` Mathieu Desnoyers 2006-09-15 22:19 ` Ingo Molnar 0 siblings, 1 reply; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-15 21:58 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Ingo Molnar (mingo@elte.hu) wrote: > > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > > > * Ingo Molnar (mingo@elte.hu) wrote: > > > sorry, but i disagree. There _is_ a solution that is superior in every > > > aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer) > > > > > > > I am sorry to have to repeat myself, but this is not true for heavy > > loads. > > djprobes? > I am fully aware of djprobes limitations towards fully preemptible kernel (and around branches instructions ? I don't remember if they solved this one). Oh, yes, and if a trap happen to come at the wrong spot, then the thread gets scheduled out... well, it cannot be applied everywhere, eh ? > > > > At this point you've been rather uncompromising [...] > > > > > > yes, i'm rather uncompromising when i sense attempts to push inferior > > > concepts into the core kernel _when_ a better concept exists here and > > > today. Especially if the concept being pushed adds more than 350 > > > tracepoints that expose something to user-space that amounts to a > > > complex external API, which tracepoints we have little chance of ever > > > getting rid of under a static tracing concept. > > > > > From an earlier email from Tim bird : > > > > "I still think that this is off-topic for the patch posted. I think > > we should debate the implementation of tracepoints/markers when > > someone posts a patch for some. I think it's rather scurrilous to > > complain about code NOT submitted. 
Ingo has even mis-characterized > > the not-submitted instrumentation patch, by saying it has 350 > > tracepoints when it has no such thing. I counted 58 for one > > architecture (with only 8 being arch-specific)." > > i missed that (way too many mails in this thread). > > Here is how i counted them: > > $ grep "\<trace_.*(" * | wc -l > 359 > This count includes the inline trace functions definitions. > some of those are not true tracepoints, but there's at least this many > of them: > > $ grep "\<trace_.*(" *instrumentation* | wc -l > 235 > 1 - This counts per architecture trace points. It quickly adds up considering that we support ARM, MIPS, i386, powerpc, ppc and x86_64. 2 - It also counts some experimental trace points that I do not want to submit. 3 - Most of these are instrumentation of the traps handlers, which is conceptually only one event. > when judging kernel maintainance overhead, the sum of all patches > matters. And i considered all the other patches too (the ones that add > actual tracepoints) that will come after the currently offered ones, not > just the ones you submitted to lkml. > I plan to rework the instrumentation patches before submitting them to LKML, don't worry. I just hasn't been my focus until now. Too bad that you take those as arguments. Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:58 ` Mathieu Desnoyers @ 2006-09-15 22:19 ` Ingo Molnar 2006-09-15 22:45 ` Karim Yaghmour 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 22:19 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > > > > sorry, but i disagree. There _is_ a solution that is superior in > > > > every aspect: kprobes + SystemTap. (or any other equivalent > > > > dynamic tracer) > > > > > > > > > > I am sorry to have to repeat myself, but this is not true for > > > heavy loads. > > > > djprobes? > > > > I am fully aware of djprobes limitations towards fully preemptible > kernel [...] i dont see any fundamental limitation with a preemptible kernel. (preemptability was never a showstopper for any kernel feature in the past, and i dont expect it to be a showstopper for anything in the future either.) > [...] (and around branches instructions ? I don't remember if they > solved this one). Oh, yes, and if a trap happen to come at the wrong > spot, then the thread gets scheduled out... well, it cannot be applied > everywhere, eh ? i expect the number of places where dynamic tracers have problems to gradually shrink. It has shrunk significantly already. Hence i'm supportive of static markers (as i stated it numerous times), as long as it's there to ease dynamic probing - _and as long as these static markers shrink in number as the capabilities of dynamic tracers improve_. With static tracers i just dont see that possibility: a static tracer needs all its static tracepoints forever or otherwise it just wont work. > > $ grep "\<trace_.*(" * | wc -l > > 359 > > > > This count includes the inline trace functions definitions. 
yes, as i stated: > > some of those are not true tracepoints, but there's at least this many ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > of them: > > > > $ grep "\<trace_.*(" *instrumentation* | wc -l > > 235 > > > > 1 - This counts per architecture trace points. It quickly adds up > considering that we support ARM, MIPS, i386, powerpc, ppc and x86_64. yes. That's my point: overhead of static tracepoints "quickly adds up". The cost goes up linearly, as you grow into more subsystems and into more architectures. btw., an observation: that's 6 LTT architectures in 7 years, while kprobes are now on 5 architectures in 2 years. > 2 - It also counts some experimental trace points that I do not want > to submit. > 3 - Most of these are instrumentation of the traps handlers, which is > conceptually only one event. i counted the number of tracepoints, not the number of unique types of events, because: > > when judging kernel maintainance overhead, the sum of all patches > > matters. And i considered all the other patches too (the ones that > > add actual tracepoints) that will come after the currently offered > > ones, not just the ones you submitted to lkml. > > I plan to rework the instrumentation patches before submitting them to > LKML, don't worry. I just hasn't been my focus until now. Too bad that > you take those as arguments. the static tracer patches make little sense without instrumentation, so sure i considered them. I also clearly declared that you didnt submit them yet: >>> Let me quote from the latest LTT patch (patch-2.6.17-lttng-0.5.108, >>> which is the same version submitted to lkml - although no specific ^^^^^^^^^^^^^^^^^^^^ >>> tracepoints were submitted): ^^^^^^^^^^^^^^^^^^^^^^^^^^ Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 22:19 ` Ingo Molnar
@ 2006-09-15 22:45 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 22:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, Paul Mundt,
Jes Sorensen, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Ingo Molnar wrote:
> btw., an observation: that's 6 LTT architectures in 7 years, while
> kprobes are now on 5 architectures in 2 years.
Actually much of ltt underwent a complete rewrite since Mathieu took
over maintainership. Let's see: according to this email, Mathieu became
the maintainer in November 2005:
http://www.listserv.shafik.org/pipermail/ltt-dev/2005-November/001092.html
[ Karim takes out calculator and punches: 10/12 = 0.83 ]
So that's 7 architectures in 0.83 years, compared to 5 in 2 years.
Joke's on you, pal.
Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:22 ` Mathieu Desnoyers
2006-09-15 21:08 ` Jose R. Santos
2006-09-15 21:32 ` Ingo Molnar
@ 2006-09-16 9:59 ` Jes Sorensen
2006-09-16 17:24 ` Mathieu Desnoyers
2006-09-16 17:30 ` Mathieu Desnoyers
2 siblings, 2 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-16 9:59 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais
Mathieu Desnoyers wrote:
> Please Ingo, stop repeating false argument without taking in account people's
> corrections :
>
> * Ingo Molnar (mingo@elte.hu) wrote:
>> sorry, but i disagree. There _is_ a solution that is superior in every
>> aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
>>
> I am sorry to have to repeat myself, but this is not true for heavy loads.
Alan pointed out earlier in the thread that the actual kprobe is noise
in this context, and I have seen similar issues on real workloads. Yes
kprobes are probably a little higher overhead in real life, but you have
to weigh that up against the rest of the system load.
If you want to prove people wrong, I suggest you do some real life
implementation and measure some real workloads with a predefined set of
tracepoints implemented using kprobes and LTT and show us that the
benchmark of the user application suffers in a way that can actually be
measured. Arguing that a syscall takes an extra 50 instructions
because it's traced using kprobes rather than LTT doesn't mean it
actually has any real impact.
"The 'kprobes' are too high overhead that makes them unusable" is one of
these classic myths that the static tracepoint advocates so far have
only been backing up with rhetoric. Give us some hard evidence or stop
repeating this argument please. Just because something is repeated
constantly doesn't transform it into truth.
Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 9:59 ` Jes Sorensen
@ 2006-09-16 17:24 ` Mathieu Desnoyers
2006-09-16 17:35 ` Ingo Molnar
` (2 more replies)
2006-09-16 17:30 ` Mathieu Desnoyers
1 sibling, 3 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:24 UTC (permalink / raw)
To: Jes Sorensen
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais
* Jes Sorensen (jes@sgi.com) wrote:
> Mathieu Desnoyers wrote:
> >Please Ingo, stop repeating false argument without taking in account
> >people's
> >corrections :
> >
> >* Ingo Molnar (mingo@elte.hu) wrote:
> >>sorry, but i disagree. There _is_ a solution that is superior in every
> >>aspect: kprobes + SystemTap. (or any other equivalent dynamic tracer)
> >>
> >I am sorry to have to repeat myself, but this is not true for heavy loads.
>
> Alan pointed out earlier in the thread that the actual kprobe is noise
> in this context, and I have seen similar issues on real workloads. Yes
> kprobes are probably a little higher overhead in real life, but you have
> to way that up against the rest of the system load.
>
> If you want to prove people wrong, I suggest you do some real life
> implementation and measure some real workloads with a predefined set of
> tracepoints implemented using kprobes and LTT and show us that the
> benchmark of the user application suffers in a way that can actually be
> measured. Argueing that a syscall takes an extra 50 instructions
> because it's traced using kprobes rather than LTT doesn't mean it
> actually has any real impact.
>
> "The 'kprobes' are too high overhead that makes them unusable" is one of
> these classic myths that the static tracepoint advocates so far have
> only been backing up with rhetoric. Give us some hard evidence or stop
> repeating this argument please. Just because something is repeated
> constantly doesn't transform it into truth.
>
Hi,
Here we go. I made a test that we can consider a lower bound for kprobes
impact. Two tests per run.
Simulation of high speed network traffic :
    time ping -f localhost
First run : without any tracing activated, LTTng probes compiled in :
    39457 packets received in 2.021 seconds : 19523.50 packets/s
    142672 packets received in 7.237 seconds : 19714.24 packets/s
Second run : LTTng tracing activated (traces system calls, interrupts
and packet in/out...) :
    93051 packets received in 7.395 seconds : 12582.96 packets/s
    121585 packets received in 9.703 seconds : 12530.66 packets/s
Third run : same LTTng instrumentation, with a kprobe handler triggered
by each event traced.
    56643 packets received in 11.152 seconds : 5079.17 packets/s
    50150 packets received in 9.593 seconds : 5227.77 packets/s
The bottom line is :
    LTTng impact on the studied phenomenon : 35% slower
    LTTng+kprobes impact on the studied phenomenon : 73% slower
Therefore, I conclude that on this type of high event rate workload,
kprobes doubles the tracer impact on the system.
Mathieu
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
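The bottom-line percentages in the message above can be re-derived from the reported packet rates. The following is a minimal Python sketch, not something posted in the thread; it assumes each run is summarised by the plain average of its two packets/s figures, and the function and variable names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def slowdown(baseline_rate, traced_rate):
    """Fractional throughput loss relative to the untraced baseline."""
    return 1.0 - traced_rate / baseline_rate


# Packets per second, two measurements per run, as reported in the message.
baseline = mean([19523.50, 19714.24])      # no tracing active
lttng = mean([12582.96, 12530.66])         # LTTng tracing active
lttng_kprobe = mean([5079.17, 5227.77])    # LTTng plus a kprobe per event

lttng_loss = slowdown(baseline, lttng)
kprobe_loss = slowdown(baseline, lttng_kprobe)

print(f"LTTng:         {lttng_loss:.1%} slower")
print(f"LTTng+kprobes: {kprobe_loss:.1%} slower")
print(f"ratio:         {kprobe_loss / lttng_loss:.2f}x")
```

Under that averaging assumption the losses come out at roughly 36% and just under 74%, close to the 35% and 73% quoted, and the ratio between them is a little over 2, which is the "doubles the tracer impact" figure.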
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 17:24 ` Mathieu Desnoyers
@ 2006-09-16 17:35 ` Ingo Molnar
2006-09-16 17:56 ` Mathieu Desnoyers
2006-09-16 18:11 ` Karim Yaghmour
2006-09-16 17:55 ` Karim Yaghmour
2006-09-18 8:33 ` Jes Sorensen
2 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 17:35 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Third run : same LTTng instrumentation, with a kprobe handler
> triggered by each event traced.
where exactly did you put the kprobe handler?
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 17:35 ` Ingo Molnar
@ 2006-09-16 17:56 ` Mathieu Desnoyers
2006-09-16 19:10 ` Ingo Molnar
2006-09-16 23:40 ` Ingo Molnar
2006-09-16 18:11 ` Karim Yaghmour
1 sibling, 2 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:56 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Third run : same LTTng instrumentation, with a kprobe handler
> > triggered by each event traced.
>
> where exactly did you put the kprobe handler?
ltt_relay_reserve_slot.
See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert the
kprobe. Tests done on LTTng 0.5.111, on a x86 3GHz with hyperthreading.
Mathieu
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 17:56 ` Mathieu Desnoyers
@ 2006-09-16 19:10 ` Ingo Molnar
2006-09-16 19:37 ` Ingo Molnar
2006-09-16 19:51 ` Karim Yaghmour
2006-09-16 23:40 ` Ingo Molnar
1 sibling, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 19:10 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert
> the kprobe. Tests done on LTTng 0.5.111, on a x86 3GHz with
> hyperthreading.
i have done a bit of kprobes and djprobes testing on a 2160 MHz Athlon64
CPU, UP. I have tested 2 types of almost-NOP tracepoints (on 2.6.17),
where the probe function only increases a counter:
    static int counter;
    static void probe_func(struct djprobe *djp, struct pt_regs *regs)
    {
        counter++;
    }
and have measured the overhead of an unmodified, kprobes-probed and
djprobes-probed sys_getpid() system-call:
    sys_getpid() unmodified latency: 317 cycles [ 0.146 usecs ]
    sys_getpid() kprobes latency: 815 cycles [ 0.377 usecs ]
    sys_getpid() djprobes latency: 380 cycles [ 0.176 usecs ]
i.e. the kprobes overhead is +498 cycles (+0.231 usecs), the djprobes
overhead is +63 cycles (+0.029 usecs).
what do these numbers tell us? Firstly, on this CPU the kprobes overhead
is not 1000-2000 cycles but 500 cycles. Secondly, if that's not fast
enough, the "next-gen kprobes" code, djprobes have a really small
overhead of 63 cycles.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
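The cycle counts and microsecond figures in the message above are related by the stated 2160 MHz clock (2160 cycles per microsecond). A short Python sketch of that conversion follows; it is an illustration only, and the names are assumptions rather than anything from the thread.

```python
CPU_MHZ = 2160  # clock rate stated in the message, i.e. cycles per microsecond


def cycles_to_usecs(cycles, mhz=CPU_MHZ):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / mhz


# sys_getpid() latencies in cycles, as reported in the message.
latency_cycles = {"unmodified": 317, "kprobes": 815, "djprobes": 380}
base = latency_cycles["unmodified"]

for name, cycles in latency_cycles.items():
    overhead = cycles - base
    print(
        f"{name:10s} {cycles:4d} cycles = {cycles_to_usecs(cycles):.3f} usecs, "
        f"overhead {overhead:+d} cycles = {cycles_to_usecs(overhead):+.3f} usecs"
    )
```

The computed values agree with the quoted ones to within the last printed digit; the small differences look like truncation rather than rounding in the original figures.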
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 19:10 ` Ingo Molnar
@ 2006-09-16 19:37 ` Ingo Molnar
2006-09-17 10:13 ` Frederik Deweerdt
2006-09-16 19:51 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 19:37 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim,
Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Ingo Molnar <mingo@elte.hu> wrote:
> and have measured the overhead of an unmodified, kprobes-probed and
> djprobes-probed sys_getpid() system-call:
>
> sys_getpid() unmodified latency: 317 cycles [ 0.146 usecs ]
> sys_getpid() kprobes latency: 815 cycles [ 0.377 usecs ]
> sys_getpid() djprobes latency: 380 cycles [ 0.176 usecs ]
i have taken a look at the kprobes fastpath, and there are a few things
we can do to speed it up. The patch below shaves off 75 cycles from the
kprobes overhead:
    sys_getpid() kprobes-speedup: 740 cycles [ 0.342 usecs ]
that reduces the kprobes overhead to 423 cycles.
Ingo
--------------->
Subject: [patch] kprobes: speed INT3 trap handling up on i386
From: Ingo Molnar <mingo@elte.hu>
speed up kprobes trap handling by special-casing kernel-space
INT3 traps (which do not occur otherwise) and doing a kprobes
handler check - instead of redirecting over the i386-die-notifier
chain.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/i386/kernel/kprobes.c |    2 +-
 arch/i386/kernel/traps.c   |   19 ++++++++++++-------
 include/asm-i386/kprobes.h |    2 ++
 3 files changed, 15 insertions(+), 8 deletions(-)
Index: linux/arch/i386/kernel/kprobes.c
===================================================================
--- linux.orig/arch/i386/kernel/kprobes.c
+++ linux/arch/i386/kernel/kprobes.c
@@ -200,7 +200,7 @@ void __kprobes arch_prepare_kretprobe(st
  * Interrupts are disabled on entry as trap3 is an interrupt gate and they
  * remain disabled thorough out this function.
  */
-static int __kprobes kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_handler(struct pt_regs *regs)
 {
 	struct kprobe *p;
 	int ret = 0;
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -802,13 +802,18 @@ EXPORT_SYMBOL_GPL(unset_nmi_callback);
 #ifdef CONFIG_KPROBES
 fastcall void __kprobes do_int3(struct pt_regs *regs, long error_code)
 {
-	if (notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP)
-			== NOTIFY_STOP)
-		return;
-	/* This is an interrupt gate, because kprobes wants interrupts
-	disabled. Normal trap handlers don't. */
-	restore_interrupts(regs);
-	do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+	/*
+	 * kernel-mode INT3s are likely kprobes:
+	 */
+	if (!user_mode(regs)) {
+		if (kprobe_handler(regs))
+			return;
+		/* This is an interrupt gate, because kprobes wants interrupts
+		disabled. Normal trap handlers don't. */
+		restore_interrupts(regs);
+		do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+	}
+	notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP);
 }
 #endif
 
Index: linux/include/asm-i386/kprobes.h
===================================================================
--- linux.orig/include/asm-i386/kprobes.h
+++ linux/include/asm-i386/kprobes.h
@@ -88,4 +88,6 @@ static inline void restore_interrupts(st
 
 extern int kprobe_exceptions_notify(struct notifier_block *self,
 				    unsigned long val, void *data);
+extern int kprobe_handler(struct pt_regs *regs);
+
 #endif /* _ASM_KPROBES_H */
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 19:37 ` Ingo Molnar
@ 2006-09-17 10:13 ` Frederik Deweerdt
2006-09-17 14:00 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Frederik Deweerdt @ 2006-09-17 10:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
Michel Dagenais
On Sat, Sep 16, 2006 at 09:37:45PM +0200, Ingo Molnar wrote:
> --------------->
> Subject: [patch] kprobes: speed INT3 trap handling up on i386
> From: Ingo Molnar <mingo@elte.hu>
>
> speed up kprobes trap handling by special-casing kernel-space
> INT3 traps (which do not occur otherwise) and doing a kprobes
> handler check - instead of redirecting over the i386-die-notifier
> chain.
>
Hi Ingo,
Not that it would make any difference to the actual kprobe
performance, but I think that not using the die-notifier chain makes
the DIE_INT3 handling in kprobe_exceptions_notify() useless.
Regards,
Frederik
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
diff --git a/arch/i386/kernel/kprobes.c b/arch/i386/kernel/kprobes.c
index afe6505..90787ff 100644
--- a/arch/i386/kernel/kprobes.c
+++ b/arch/i386/kernel/kprobes.c
@@ -652,10 +652,6 @@ int __kprobes kprobe_exceptions_notify(s
 		return ret;
 
 	switch (val) {
-	case DIE_INT3:
-		if (kprobe_handler(args->regs))
-			ret = NOTIFY_STOP;
-		break;
 	case DIE_DEBUG:
 		if (post_kprobe_handler(args->regs))
 			ret = NOTIFY_STOP;
^ permalink raw reply related [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 10:13 ` Frederik Deweerdt
@ 2006-09-17 14:00 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 14:00 UTC (permalink / raw)
To: Frederik Deweerdt
Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton,
tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
Michel Dagenais
* Frederik Deweerdt <deweerdt@free.fr> wrote:
> On Sat, Sep 16, 2006 at 09:37:45PM +0200, Ingo Molnar wrote:
> > --------------->
> > Subject: [patch] kprobes: speed INT3 trap handling up on i386
> > From: Ingo Molnar <mingo@elte.hu>
> >
> > speed up kprobes trap handling by special-casing kernel-space
> > INT3 traps (which do not occur otherwise) and doing a kprobes
> > handler check - instead of redirecting over the i386-die-notifier
> > chain.
> >
> Hi Ingo,
>
> Not that it would make any difference to the actual kprobe
> performance, but I think that not using the die-notifier chain makes
> the DIE_INT3 handling in kprobe_exceptions_notify() useless.
yeah, indeed - i'll add your patch to the kprobes patchset.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 19:10 ` Ingo Molnar 2006-09-16 19:37 ` Ingo Molnar @ 2006-09-16 19:51 ` Karim Yaghmour 1 sibling, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-16 19:51 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > i have done a bit of kprobes and djprobes testing on a 2160 MHz Athlon64 > CPU, UP. I have tested 2 types of almost-NOP tracepoints (on 2.6.17), > where the probe function only increases a counter: > > static int counter; > > static void probe_func(struct djprobe *djp, struct pt_regs *regs) > { > counter++; > } > > and have measured the overhead of an unmodified, kprobes-probed and > djprobes-probed sys_getpid() system-call: > > sys_getpid() unmodified latency: 317 cycles [ 0.146 usecs ] > sys_getpid() kprobes latency: 815 cycles [ 0.377 usecs ] > sys_getpid() djprobes latency: 380 cycles [ 0.176 usecs ] > > i.e. the kprobes overhead is +498 cycles (+0.231 usecs), the djprobes > overhead is +63 cycles (+0.029 usecs). But that's an entirely hypothetical benchmark. Mathieu was asked for real-workload benchmarks and he gave you those. In turn, you set up a simplistic test and then go on to conclude that the measurements are far less than advertised. You ask that ltt replace its static instrumentation by what kprobes provides and Mathieu demonstrated that that's not realistic. If you want to change his mind, at least reproduce the exact information ltt can provide and then we'll talk. > what do these numbers tell us? Firstly, on this CPU the kprobes overhead > is not 1000-2000 cycles but 500 cycles. Secondly, if that's not fast > enough, the "next-gen kprobes" code, djprobes have a really small > overhead of 63 cycles. But djprobe isn't even here yet. 
If you insist on keeping ltt's _current_ limitations as your single most
powerful justification to reject it, how do you hold kprobes to a
different standard with a straight face? You're only perpetuating the
fallacy found throughout this thread that somehow the shortcomings of
dynamic editing are "easy" to fix while those of static instrumentation
are inherently unrecoverable. That's just plain not true, as I've
demonstrated now countless times in this thread.

And please Ingo, I'm still waiting for your feedback on the static
markup mechanism I proposed earlier. I believe it avoids every single
problem you alluded to regarding inline markup.

Thanks,

Karim
--
President / Opersys Inc.
Embedded Linux Training and Expertise
www.opersys.com / 1.866.677.4546
^ permalink raw reply [flat|nested] 271+ messages in thread
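As a cross-check of the figures quoted in this message: a cycle count
converts to wall-clock time by dividing by the clock rate in MHz. The
sketch below is plain userspace C, not kernel code. The names
cycles_to_usec() and probe_overhead_cycles() are made up for
illustration, and the conversion assumes the quoted 2160 MHz clock
stays constant for the duration of the measurement.

```c
#include <assert.h>

/* Cycles divided by MHz (i.e. cycles per microsecond) gives microseconds. */
double cycles_to_usec(double cycles, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return cycles / cpu_mhz;
}

/* Extra cycles a probed call costs over the unprobed baseline. */
long probe_overhead_cycles(long probed_cycles, long baseline_cycles)
{
    return probed_cycles - baseline_cycles;
}
```

With the quoted numbers, probe_overhead_cycles(815, 317) gives 498 and
cycles_to_usec(498, 2160.0) is about 0.231, which agrees with the
kprobes overhead stated above. The djprobes case works out the same way
from 380 cycles.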
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 17:56 ` Mathieu Desnoyers
2006-09-16 19:10 ` Ingo Molnar
@ 2006-09-16 23:40 ` Ingo Molnar
2006-09-17 5:33 ` Mathieu Desnoyers
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:40 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> > > Third run : same LTTng instrumentation, with a kprobe handler
> > > triggered by each event traced.
> >
> > where exactly did you put the kprobe handler?
>
> ltt_relay_reserve_slot.
>
> See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert
> the kprobe. Tests done on LTTng 0.5.111, on a x86 3GHz with
> hyperthreading.

ok. In what way did you enable LTTng instrumentation? I have 0.5.108
installed, and i'd like to make sure i do everything as you did, to make
the tests comparable. Which kernel config options (default ones?), and
what precise lttctl commands did you use, were they the usual:

  lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace

? What filesystem does /tmp/trace reside on?

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 23:40 ` Ingo Molnar @ 2006-09-17 5:33 ` Mathieu Desnoyers 0 siblings, 0 replies; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-17 5:33 UTC (permalink / raw) To: Ingo Molnar Cc: Jes Sorensen, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Ingo Molnar (mingo@elte.hu) wrote: > > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > > > > > Third run : same LTTng instrumentation, with a kprobe handler > > > > triggered by each event traced. > > > > > > where exactly did you put the kprobe handler? > > > > ltt_relay_reserve_slot. > > > > See http://ltt.polymtl.ca/svn/tests/kernel/test-kprobes.c to insert > > the kprobe. Tests done on LTTng 0.5.111, on a x86 3GHz with > > hyperthreading. > > ok. In what way did you enable LTTng instrumentation? I have 0.5.108 > installed, and i'd like to make sure i do everything as you did, to make > the tests comparable. Which kernel config options (default ones?), and > what precise lttcl commands did you use, were they the usual: > > lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace > > ? What filesystem does /tmp/trace reside on? > I used LTTng 0.5.111 (yes, now with debugfs!) ;). I ran the tests on a Pentium 4 3 GHz, with hyperthreading enabled. The system has 1GB of ram. Hard disk : WDC WD1600JD-00H. File system : ext3. The kernel (2.6.17) is configured with SMP enabled. 
Relevant kernel config :

CONFIG_LTT=y
CONFIG_LTT_TRACER=m
CONFIG_LTT_RELAY=m
CONFIG_LTT_ALIGNMENT=y
CONFIG_LTT_HEARTBEAT=y
CONFIG_LTT_HEARTBEAT_EVENT=y
# CONFIG_LTT_SYNTHETIC_TSC is not set
CONFIG_LTT_USERSPACE_GENERIC=y
CONFIG_LTT_NETLINK_CONTROL=m
CONFIG_LTT_STATEDUMP=m
CONFIG_LTT_FACILITY_CORE=y
CONFIG_LTT_FACILITY_FS=y
CONFIG_LTT_FACILITY_FS_DATA=y
CONFIG_LTT_FACILITY_IPC=y
CONFIG_LTT_FACILITY_KERNEL=y
CONFIG_LTT_FACILITY_KERNEL_ARCH=y
# CONFIG_LTT_FACILITY_LOCKING is not set
CONFIG_LTT_FACILITY_MEMORY=y
CONFIG_LTT_FACILITY_NETWORK=y
CONFIG_LTT_FACILITY_NETWORK_IP_INTERFACE=y
CONFIG_LTT_FACILITY_PROCESS=y
CONFIG_LTT_FACILITY_SOCKET=y
CONFIG_LTT_FACILITY_STATEDUMP=y
CONFIG_LTT_FACILITY_TIMER=y
CONFIG_LTT_FACILITY_STACK=y
CONFIG_LTT_PROCESS_STACK=y
CONFIG_LTT_PROCESS_MAX_FUNCTION_STACK=100
CONFIG_LTT_PROCESS_MAX_STACK_LEN=250
CONFIG_LTT_KERNEL_STACK=y
CONFIG_LTT_STACK_SYSCALL=y
CONFIG_LTT_STACK_INTERRUPT=y
CONFIG_LTT_STACK_NMI=y

Huge note : I left CONFIG_LTT_FACILITY_STACK enabled, but THIS IS
EXPERIMENTAL.

lttctl commands :

Start tracing :
lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace1
(note : 0.5.111 uses debugfs, 0.5.108 uses relayfs)

Stop tracing :
lttctl -n trace -R

See http://ltt.polymtl.ca > QUICKSTART for other details (modules to load...)

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:35 ` Ingo Molnar 2006-09-16 17:56 ` Mathieu Desnoyers @ 2006-09-16 18:11 ` Karim Yaghmour 2006-09-16 17:44 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-16 18:11 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > where exactly did you put the kprobe handler? So location matters, huh? If you're keen to ask this question, then it might be worth asking why should non-experts be trusted with keeping instrumentation pertinent out of tree. [ I know you've said that you acknowledge the need for static markup. I'm just highlighting a fact substantiating the position I stated to you in my response late last evening. ] Thanks, Karim -- President / Opersys Inc. Embedded Linux Training and Expertise www.opersys.com / 1.866.677.4546 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 18:11 ` Karim Yaghmour @ 2006-09-16 17:44 ` Ingo Molnar 2006-09-16 18:15 ` Karim Yaghmour 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-16 17:44 UTC (permalink / raw) To: Karim Yaghmour Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Karim Yaghmour <karim@opersys.com> wrote: > Ingo Molnar wrote: > > where exactly did you put the kprobe handler? > > So location matters, huh? [...] yes, location very much matters if someone wants to reproduce the numbers. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:44 ` Ingo Molnar @ 2006-09-16 18:15 ` Karim Yaghmour 2006-09-18 8:18 ` Jes Sorensen 0 siblings, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-16 18:15 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, Jes Sorensen, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > yes, location very much matters if someone wants to reproduce the > numbers. Was that really the angle? I'll give you the benefit of the doubt. But I'm sure you understand the importance of probe placement with regards to impact of performance ... Karim -- President / Opersys Inc. Embedded Linux Training and Expertise www.opersys.com / 1.866.677.4546 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 18:15 ` Karim Yaghmour @ 2006-09-18 8:18 ` Jes Sorensen 0 siblings, 0 replies; 271+ messages in thread From: Jes Sorensen @ 2006-09-18 8:18 UTC (permalink / raw) To: karim Cc: Ingo Molnar, Mathieu Desnoyers, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Karim Yaghmour wrote: > Ingo Molnar wrote: >> yes, location very much matters if someone wants to reproduce the >> numbers. > > Was that really the angle? I'll give you the benefit of the doubt. > But I'm sure you understand the importance of probe placement > with regards to impact of performance ... So now you produce a benchmark, then won't allow someone to reproduce it ..... do we see a pattern here? Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:24 ` Mathieu Desnoyers 2006-09-16 17:35 ` Ingo Molnar @ 2006-09-16 17:55 ` Karim Yaghmour 2006-09-18 8:21 ` Jes Sorensen 2006-09-18 8:33 ` Jes Sorensen 2 siblings, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-16 17:55 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Jes Sorensen, Ingo Molnar, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Mathieu Desnoyers wrote: > The bottom line is : > > LTTng impact on the studied phenomenon : 35% slower > > LTTng+kprobes impact on the studied phenomenon : 73% slower > > Therefore, I conclude that on this type of high event rate workload, kprobes > doubles the tracer impact on the system. Amen to that. Hopefully this puts to rest the myth of Mr. Scrub. Karim -- President / Opersys Inc. Embedded Linux Training and Expertise www.opersys.com / 1.866.677.4546 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:55 ` Karim Yaghmour @ 2006-09-18 8:21 ` Jes Sorensen 0 siblings, 0 replies; 271+ messages in thread From: Jes Sorensen @ 2006-09-18 8:21 UTC (permalink / raw) To: karim Cc: Mathieu Desnoyers, Ingo Molnar, Roman Zippel, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Karim Yaghmour wrote: > Mathieu Desnoyers wrote: >> The bottom line is : >> >> LTTng impact on the studied phenomenon : 35% slower >> >> LTTng+kprobes impact on the studied phenomenon : 73% slower >> >> Therefore, I conclude that on this type of high event rate workload, kprobes >> doubles the tracer impact on the system. > > Amen to that. Hopefully this puts to rest the myth of Mr. Scrub. If it wasn't because it's so sad, this would be hysterically funny. Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:24 ` Mathieu Desnoyers 2006-09-16 17:35 ` Ingo Molnar 2006-09-16 17:55 ` Karim Yaghmour @ 2006-09-18 8:33 ` Jes Sorensen 2006-09-18 15:01 ` Mathieu Desnoyers 2 siblings, 1 reply; 271+ messages in thread From: Jes Sorensen @ 2006-09-18 8:33 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Mathieu Desnoyers wrote: > The bottom line is : > > LTTng impact on the studied phenomenon : 35% slower > > LTTng+kprobes impact on the studied phenomenon : 73% slower > > Therefore, I conclude that on this type of high event rate workload, kprobes > doubles the tracer impact on the system. For this specific benchmark, for which we have not seen the code, nor do we know what system configuration it was run on. Sorry, but even M$'s sham benchmarks generally tell you which system they used for their tests. In addition, some profiling would be interesting so we can see exactly where things go wrong and fix it. Ingo seems to be doing a good job at that even without you providing this basic info.... Anyway, despite what Karim likes to claim, this *is* the Linux way! Things don't get fixed if they are not reported broken and when they are, whoever is interested in the item will try and fix it. We are not going to cease Linux kernel development just to please Karim. The point of this discussion is that the concept of dynamic tracing is the way to go. If the code isn't 100% there today, then it should be fixed, thats *not* an excuse to add a lot of cruft based on the wrong design when we know which path to take. I know it's hard for someone to accept when he's thrown so much personal time into a project, but as Ingo keeps saying, there is a lot of value in LTT, the actual markup isn't the big issue. 
Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-18 8:33 ` Jes Sorensen @ 2006-09-18 15:01 ` Mathieu Desnoyers 0 siblings, 0 replies; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-18 15:01 UTC (permalink / raw) To: Jes Sorensen Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Jes Sorensen (jes@sgi.com) wrote: > Mathieu Desnoyers wrote: > > The bottom line is : > > > > LTTng impact on the studied phenomenon : 35% slower > > > > LTTng+kprobes impact on the studied phenomenon : 73% slower > > > > Therefore, I conclude that on this type of high event rate workload, kprobes > > doubles the tracer impact on the system. > > For this specific benchmark, for which we have not seen the code, nor > do we know what system configuration it was run on. Sorry, but even M$'s > sham benchmarks generally tell you which system they used for their > tests. > > In addition, some profiling would be interesting so we can see exactly > where things go wrong and fix it. Ingo seems to be doing a good job at > that even without you providing this basic info.... > Hi Jes, I did not repeat my system configuration from the previous email in the thread as it seemed redundant. Ingo asked me politely to tell more about my config and tests, which I have done. Please read on further down this thread to get that information. Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 9:59 ` Jes Sorensen
2006-09-16 17:24 ` Mathieu Desnoyers
@ 2006-09-16 17:30 ` Mathieu Desnoyers
2006-09-18 8:15 ` Jes Sorensen
1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-16 17:30 UTC (permalink / raw)
To: Jes Sorensen
Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais
* Jes Sorensen (jes@sgi.com) wrote:
> If you want to prove people wrong, I suggest you do some real life
> implementation and measure some real workloads with a predefined set of
> tracepoints implemented using kprobes and LTT and show us that the
> benchmark of the user application suffers in a way that can actually be
> measured. Arguing that a syscall takes an extra 50 instructions
> because it's traced using kprobes rather than LTT doesn't mean it
> actually has any real impact.
>

And about those extra cycles.. according to :
Documentation/kprobes.txt
"6. Probe Overhead

On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
microseconds to process. Specifically, a benchmark that hits the same
probepoint repeatedly, firing a simple handler each time, reports 1-2
million hits per second, depending on the architecture. A jprobe or
return-probe hit typically takes 50-75% longer than a kprobe hit.
When you have a return probe set on a function, adding a kprobe at
the entry to that function adds essentially no overhead.

i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40

x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07

ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99"

So, 1 microsecond seems more like 1500-2000 cycles to me, not 50.
Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
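The step from microseconds to cycles in the message above is a
multiplication by the clock rate in MHz. A small userspace C sketch of
that conversion follows; usec_to_cycles() is a made-up helper name, and
the inputs used in the note below are the kprobes.txt figures quoted
above.

```c
#include <assert.h>

/* Microseconds times MHz (i.e. cycles per microsecond) gives cycles. */
double usec_to_cycles(double usec, double cpu_mhz)
{
    assert(usec >= 0.0 && cpu_mhz > 0.0);
    return usec * cpu_mhz;
}
```

Under that conversion the quoted kprobe hit times come to roughly 852
cycles (0.57 usec at 1495 MHz), 977 cycles (0.49 usec at 1994 MHz) and
1275 cycles (0.77 usec at 1656 MHz). A full microsecond corresponds to
1500-2000 cycles on a 1.5 to 2 GHz part.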
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 17:30 ` Mathieu Desnoyers @ 2006-09-18 8:15 ` Jes Sorensen 2006-09-18 14:53 ` Mathieu Desnoyers 0 siblings, 1 reply; 271+ messages in thread From: Jes Sorensen @ 2006-09-18 8:15 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Mathieu Desnoyers wrote: > And about those extra cycles.. according to : > Documentation/kprobes.txt > "6. Probe Overhead > > On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0 > microseconds to process. Specifically, a benchmark that hits the same > probepoint repeatedly, firing a simple handler each time, reports 1-2 > million hits per second, depending on the architecture. A jprobe or > return-probe hit typically takes 50-75% longer than a kprobe hit. > When you have a return probe set on a function, adding a kprobe at > the entry to that function adds essentially no overhead. [snip] > So, 1 microsecond seems more like 1500-2000 cycles to me, not 50. So call it 2000 cycles, now go measure it in *real* life benchmarks and not some artificial I call this one syscall that hits the probe every time in a tight loop, kinda thing. Show us some *real* numbers please. Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-18 8:15 ` Jes Sorensen @ 2006-09-18 14:53 ` Mathieu Desnoyers 2006-09-18 15:17 ` Ingo Molnar 0 siblings, 1 reply; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-18 14:53 UTC (permalink / raw) To: Jes Sorensen Cc: Ingo Molnar, Roman Zippel, Andrew Morton, tglx, karim, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Jes Sorensen (jes@sgi.com) wrote: > Mathieu Desnoyers wrote: > > And about those extra cycles.. according to : > > Documentation/kprobes.txt > > "6. Probe Overhead > > > > On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0 > > microseconds to process. Specifically, a benchmark that hits the same > > probepoint repeatedly, firing a simple handler each time, reports 1-2 > > million hits per second, depending on the architecture. A jprobe or > > return-probe hit typically takes 50-75% longer than a kprobe hit. > > When you have a return probe set on a function, adding a kprobe at > > the entry to that function adds essentially no overhead. > [snip] > > So, 1 microsecond seems more like 1500-2000 cycles to me, not 50. > > So call it 2000 cycles, now go measure it in *real* life benchmarks > and not some artificial I call this one syscall that hits the probe > every time in a tight loop, kinda thing. > > Show us some *real* numbers please. > You are late (I don't blame you about it, considering the size of this thread). It has been posted in the following email : http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-18 14:53 ` Mathieu Desnoyers @ 2006-09-18 15:17 ` Ingo Molnar 2006-09-18 16:54 ` Mathieu Desnoyers 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-18 15:17 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Jes Sorensen, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > You are late (I don't blame you about it, considering the size of this > thread). It has been posted in the following email : > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html yeah - and i dont think the kprobes overhead is a fundamental thing - i posted a few kprobes-speedup patches as a reply to your measurements. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-18 15:17 ` Ingo Molnar @ 2006-09-18 16:54 ` Mathieu Desnoyers 0 siblings, 0 replies; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-18 16:54 UTC (permalink / raw) To: Ingo Molnar Cc: Jes Sorensen, Andrew Morton, tglx, Paul Mundt, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Ingo Molnar (mingo@elte.hu) wrote: > > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > > > You are late (I don't blame you about it, considering the size of this > > thread). It has been posted in the following email : > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-09/msg04492.html > > yeah - and i dont think the kprobes overhead is a fundamental thing - i > posted a few kprobes-speedup patches as a reply to your measurements. > Hi Ingo, Yes, and I replied that I really don't think that a few cycles saved here and there by a predicted branch will change anything significant compared to the int3 cost. As my test bench is really not that hard to deploy (I have given the precise instructions to do so), I assume that the burden of the proof is on your side there. Anyhow, I prefer to move to a more constructive matter than testing kprobes branch optimisations. Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 20:05 ` Ingo Molnar 2006-09-15 20:22 ` Mathieu Desnoyers @ 2006-09-15 21:12 ` Roman Zippel 2006-09-15 21:08 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-15 21:12 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Fri, 15 Sep 2006, Ingo Molnar wrote: > i'm also looking at it this way too: you already seem to be quite > reluctant to add kprobes to your architecture today. How reluctant would > you be tomorrow if you had static tracepoints, which would remove a fair > chunk of incentive to implement kprobes? If I see that whole teams spend years to implement efficient dynamic tracing, do you really think that your "incentive" makes any difference? byem Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:12 ` Roman Zippel
@ 2006-09-15 21:08 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:08 UTC (permalink / raw)
To: Roman Zippel
Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> Hi,
>
> On Fri, 15 Sep 2006, Ingo Molnar wrote:
>
> > i'm also looking at it this way too: you already seem to be quite
> > reluctant to add kprobes to your architecture today. How reluctant
> > would you be tomorrow if you had static tracepoints, which would
> > remove a fair chunk of incentive to implement kprobes?
>
> If I see that whole teams spend years to implement efficient dynamic
> tracing, do you really think that your "incentive" makes any
> difference?

oh, being the first mover is the hardest part. Finding the right
solution is hard: it is blind Brownian motion in untested waters. Once
good solutions have been found and once they have been integrated
upstream, an architecture 'only' has to follow straight through the
example. (which is _still_ far from trivial, but it certainly doesn't
take years.)

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:19 ` Ingo Molnar
2006-09-15 19:26 ` Karim Yaghmour
2006-09-15 19:43 ` Roman Zippel
@ 2006-09-15 20:13 ` Andrew Morton
2006-09-15 21:49 ` Jose R. Santos
2006-09-16 10:19 ` Jes Sorensen
2 siblings, 2 replies; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 20:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
On Fri, 15 Sep 2006 20:19:07 +0200
Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Morton <akpm@osdl.org> wrote:
>
> > What Karim is sharing with us here (yet again) is the real in-field
> > experience of real users (ie: not kernel developers).
>
> well, Jes has that experience and Thomas too.

systemtap and ltt are the only full-scale tracing tools which target
sysadmins and application developers of which I am aware..

> > I mean, on one hand we have people explaining what they think a
> > tracing facility should and shouldn't do, and on the other hand we
> > have a guy who has been maintaining and shipping exactly that thing to
> > (paying!) customers for many years.
>
> so does Thomas and Jes. So what's the point?

My point is that I respect Karim and Frank's experience. I in fact
disagree with them (or at least, I want to). But they've been there, and I
haven't. So I listen.

> i judge LTT by its current code quality, not by its proponents shouting
> volume - and that quality is still quite poor at the moment. (and then
> there are the conceptual problems too, outlined numerous times) I have
> quoted specific example(s) for that in this thread. Furthermore, LTT
> does this:
>
> 246 files changed, 26207 insertions(+), 71 deletions(-)
>
> and this gives me the shivers, for all the reasons i outlined.
>

In the bit of text which you snipped I was agreeing with this...
Look, if Karim and Frank (who I assume is a systemtap developer) think that
we need static tracepoints then I have no reason to disagree with them.
What I would propose is that:

a) Those tracepoints be integrated one at a time on well-understood
   grounds of necessity. Tracepoints _should_ be added dynamically. But
   if there are instances where that's not working and cannot be made to
   work then OK, in we go.

b) Saying "we need the static tracepoints because the line numbers keep on
   changing" is not, repeat not a justification for static tracepoints.
   It's a SMOP to develop tracepoint-adding code which can handle line
   numbers changing. lwall did it.

c) Any static tracepoints should be seen as corner-case augmentation of
   existing dynamic tracing framework(s). IOW: I see no justification at
   this time for adding complete new second set of backend
   accumulation/reporting/management infrastructure (ie: LTT core).

Shorter version: I agree with Frank.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:13 ` Andrew Morton
@ 2006-09-15 21:49 ` Jose R. Santos
2006-09-16 10:19 ` Jes Sorensen
1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 21:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Andrew Morton wrote:
> On Fri, 15 Sep 2006 20:19:07 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
>
> >
> > * Andrew Morton <akpm@osdl.org> wrote:
> >
> > > What Karim is sharing with us here (yet again) is the real in-field
> > > experience of real users (ie: not kernel developers).
> >
> > well, Jes has that experience and Thomas too.
>
> systemtap and ltt are the only full-scale tracing tools which target
> sysadmins and application developers of which I am aware..
>

IMO, SystemTap is too generic a tool to be considered a tracing tool.
LKET and LKST are more comparable with the functionality that LTT
provides. LKET is implemented using SystemTap while LKST has both a
SystemTap and a static kernel patch implementation.

> In the bit of text which you snipped I was agreeing with this...
>
> Look, if Karim and Frank (who I assume is a systemtap developer) think that
> we need static tracepoints then I have no reason to disagree with them.
> What I would propose is that:
>
> a) Those tracepoints be integrated one at a time on well-understood
>    grounds of necessity. Tracepoints _should_ be added dynamically. But
>    if there are instances where that's not working and cannot be made to
>    work then OK, in we go.
>

Agree. What would be the criteria that justify having a static probe vs
a dynamic one?

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 20:13 ` Andrew Morton 2006-09-15 21:49 ` Jose R. Santos @ 2006-09-16 10:19 ` Jes Sorensen 2006-09-16 16:05 ` Karim Yaghmour 1 sibling, 1 reply; 271+ messages in thread From: Jes Sorensen @ 2006-09-16 10:19 UTC (permalink / raw) To: Andrew Morton Cc: Ingo Molnar, tglx, karim, Paul Mundt, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Andrew Morton wrote: > On Fri, 15 Sep 2006 20:19:07 +0200 > Ingo Molnar <mingo@elte.hu> wrote: > >> * Andrew Morton <akpm@osdl.org> wrote: >> >>> What Karim is sharing with us here (yet again) is the real in-field >>> experience of real users (ie: not kernel developers). >> well, Jes has that experience and Thomas too. > > systemtap and ltt are the only full-scale tracing tools which target > sysadmins and applciation developers of which I am aware.. Just to clarify, the stuff I have looked at in the field was based on LTT, but not part of the official LTT. It simply goes to show that end users cannot agree on a small set of fixed tracepoints because someone always wants a slightly different view of things, like in the cases I looked at. Not to mention that the changes LTT users make, at times, to shoehorn their stuff in, especially in sensitive codepaths such as the syscall path, have side effects which clearly weren't considered. In one case I ended up doing an alternative implementation using kprobes to prove that similar results could be achieved in that manner. Strangely enough I was right :) I don't have any objections to markers as Ingo suggested. I just don't buy the repeated argument that LTT has been around for years and barely changed. It's simply a case of the LTT team not being aware (or deciding to ignore, I cannot say which) what users have actually done with the LTT codebase, but it seems obvious they are not aware what everyone is doing with it. 
But we have seen before how, if an infrastructure like LTT goes into the
kernel, many more users will pop up and want to have their stuff added.

The other part is the constantly repeated performance claim, which to
this point hasn't been backed up by any hard evidence. If we are to take
that argument seriously, then I strongly encourage the LTT community to
present some real numbers, but until then it can be classified as
nothing but FUD.

I shall be the first to point out that kprobes are less than ideal,
especially the current ia64 implementation suffers from some tricky
limitations, but that's an implementation issue.

Cheers,
Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 10:19 ` Jes Sorensen @ 2006-09-16 16:05 ` Karim Yaghmour 2006-09-17 4:54 ` Ganesan Rajagopal 2006-09-18 8:13 ` Jes Sorensen 0 siblings, 2 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-16 16:05 UTC (permalink / raw) To: Jes Sorensen Cc: Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Jes Sorensen wrote: > Just to clarify, the stuff I have looked at in the field was based on > LTT, but not part of the official LTT. It simply goes to show that end > users cannot agree on a small set of fixed tracepoints because someone > always wants a slightly different view of things, like in the cases I > looked at. Not to mention that the changes LTT users make, at times, to > shoehorn their stuff in, especially in sensitive codepaths such as the > syscall path, have side effects which clearly weren't considered. Good. So give me concrete examples of those cases that you saw and tell me exactly what those people you were working with were attempting to achieve. > I don't have any objections to markers as Ingo suggested. I just don't > buy the repeated argument that LTT has been around for years and barely > changed. It's simply a case of the LTT team not being aware (or deciding > to ignore, I cannot say which) what users have actually done with the > LTT codebase, but it seems obvious they are not aware what everyone is > doing with it. But we have seen before how if an infrastructure like LTT > goes into the kernel, many more users will pop up and want to have their > stuff added. Either ltt had a userbase or it didn't. To say that all its users went out and added their own tracepoints is to not know enough about the project and so too is it to say that none of its users could actually just use it out of the box without modifying it. 
Now, as an outsider, trying to measure how many users were using it without modifying it is like trying to figure out how many Linux users there are out there. There's a silent majority and there's those that need customization. Guess who you've been talking to? Strange, come to think of it I don't remember *ever* getting an email from you while being the maintainer or seing *any* emails by you on the ltt lists -- that's indicative of mindset, namely that you personally assumed you knew all about tracing and didn't need us to make suggestions to help you AND that you personally never found it relevant to contribute back. That's like me going off forking the kernel, adding features to it and then calling the kernel developers incompetent when they come around saying that what I'm doing is wrong. Who's patronizing who here? And I submit to you an idea which I submitted to Ingo yesterday and have not yet received feedback on. Here's static markup as it could be implemented: The plain function: int global_function(int arg1, int arg2, int arg3) { ... [lots of code] ... x = func2(); ... [lots of code] ... } The function with static markup: int global_function(int arg1, int arg2, int arg3) { ... [lots of code] ... x = func2(); /*T* @here:arg1,arg2,arg3 */ ... [lots of code] ... } The semantics are primitive at this stage, and they could definitely benefit from lkml input, but essentially we have a build-time parser that goes around the code and automagically does one of two things: a) create information for binary editors to use b) generate an alternative C file (foo-trace.c) with inlined static function calls. And there might be other possibilities I haven't thought of. This beats every argument I've seen to date on static instrumentation. Namely: - It isn't visually offensive: it's a comment. - It's not a maintenance drag: outdated comments are not alien. - It doesn't use weird function names or caps: it's a comment. - There is precedent: kerneldoc. 
And it does preserve most of the key things those who've asked for static markup are looking for. Namely: - Static instrumentation - Mainline maintainability - Contextualized variables When I was still part of the ltt development process we had accumulated a huge amount of ideas of how we could optimize and fix stuff here and there. We were never actually ever able to reduce these to practice because folks like you never bothered interfacing with us and the attitude on the lkml was exactly as I described. We spent our time chasing kernels. > The other part is the constantly repeated performance claim, which to > this point hasn't been backed up by any hard evidence. If we are to take > that argument serious, then I strongly encourage the LTT community to > present some real numbers, but until then it can be classified as > nothing but FUD. Hmm... beats me why even the systemtap folks would themselves admit to performance limitations. > I shall be the first to point out that kprobes are less than ideal, > especially the current ia64 implementation suffers from some tricky > limitations, but thats an implementation issue. Ah, so it's ok for kprobes to have implementation issues, but not ltt. Somehow there's this magic thought recurring throughout this thread that the limitations of dynamic instrumentation are trivial to fix, but those of static instrumentation are unrecoverable. *That* is a fallacy if I ever saw one. I'm willing to admit that a combination of dynamic editing and static instrumentation is a good balance, but Jes please drop this discourse, it's not constructive. Karim -- President / Opersys Inc. Embedded Linux Training and Expertise www.opersys.com / 1.866.677.4546 ^ permalink raw reply [flat|nested] 271+ messages in thread
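The build-time parser described in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool Karim had in mind: the `/*T* @name:arg,...` syntax is taken from his example, while `struct marker`, `parse_marker()`, `emit_call()` and the `trace_` prefix on the generated call are hypothetical names invented for this sketch. It handles one marker per source line and ignores everything a real parser would need (multi-line comments, strings, preprocessor output).

```c
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8
#define MAX_NAME 32

/* One parsed marker comment (hypothetical structure for this sketch). */
struct marker {
    char name[MAX_NAME];
    char args[MAX_ARGS][MAX_NAME];
    int nargs;
};

/* Copy a token from *p into dst, stopping at any character in stops,
 * at whitespace, or at end of string. Advances *p past the token.
 * Returns 0 on success, -1 if the token is empty or too long. */
static int copy_token(const char **p, char *dst, const char *stops)
{
    size_t n = 0;

    while (**p && !strchr(stops, **p) && **p != ' ' && **p != '\t') {
        if (n + 1 >= MAX_NAME)
            return -1;
        dst[n++] = *(*p)++;
    }
    dst[n] = '\0';
    return n ? 0 : -1;
}

/* Parse one source line. Returns 1 and fills *m if the line carries a
 * marker comment, 0 if it carries none, -1 if the marker is malformed. */
static int parse_marker(const char *line, struct marker *m)
{
    const char *p = strstr(line, "/*T*");

    if (!p)
        return 0;
    p += 4;
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p++ != '@')
        return -1;
    if (copy_token(&p, m->name, ":,*") < 0)
        return -1;

    m->nargs = 0;
    if (*p == ':') {
        do {
            p++; /* skip the ':' or ',' separator */
            if (m->nargs == MAX_ARGS)
                return -1;
            if (copy_token(&p, m->args[m->nargs], ",*") < 0)
                return -1;
            m->nargs++;
        } while (*p == ',');
    }
    while (*p == ' ' || *p == '\t')
        p++;
    return strncmp(p, "*/", 2) == 0 ? 1 : -1;
}

/* Write the static call that option (b) would place in foo-trace.c.
 * The "trace_" prefix is an assumption of this sketch.
 * Returns 0 on success, -1 if out is too small. */
static int emit_call(const struct marker *m, char *out, size_t sz)
{
    int n = snprintf(out, sz, "trace_%s(", m->name);

    for (int i = 0; i < m->nargs && n >= 0 && (size_t)n < sz; i++)
        n += snprintf(out + n, sz - n, "%s%s", i ? ", " : "", m->args[i]);
    if (n < 0 || (size_t)n >= sz)
        return -1;
    n += snprintf(out + n, sz - n, ");");
    return (size_t)n < sz ? 0 : -1;
}
```

Fed the marked-up line from the example, `parse_marker()` would yield the name `here` with three arguments, and `emit_call()` would produce a call taking `arg1`, `arg2` and `arg3`. Option (a), emitting information for binary editors, would reuse the same parse step and differ only in what is written out.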
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 16:05 ` Karim Yaghmour
@ 2006-09-17 4:54 ` Ganesan Rajagopal
2006-09-18 8:13 ` Jes Sorensen
1 sibling, 0 replies; 271+ messages in thread
From: Ganesan Rajagopal @ 2006-09-17 4:54 UTC (permalink / raw)
To: linux-kernel; +Cc: ltt-dev

>>>>> Karim Yaghmour <karim@opersys.com> writes:

> And I submit to you an idea which I submitted to Ingo yesterday and have
> not yet received feedback on. Here's static markup as it could be
> implemented:
>
> The plain function:
> int global_function(int arg1, int arg2, int arg3)
> {
> ... [lots of code] ...
>
> x = func2();
>
> ... [lots of code] ...
> }
>
> The function with static markup:
> int global_function(int arg1, int arg2, int arg3)
> {
> ... [lots of code] ...
>
> x = func2(); /*T* @here:arg1,arg2,arg3 */
>
> ... [lots of code] ...
> }
>
> The semantics are primitive at this stage, and they could definitely
> benefit from lkml input, but essentially we have a build-time parser
> that goes around the code and automagically does one of two things:
> a) create information for binary editors to use
> b) generate an alternative C file (foo-trace.c) with inlined static
> function calls.

This makes sense to me, when combined with kprobes. I refer to the DTrace Usenix paper, http://www.sun.com/bigadmin/content/dtrace/dtrace_usenix.pdf. They argue (Section 4.2 Statically-defined Tracing):

"While FBT (Function Boundary Tracing) allows for comprehensive probe coverage, one must be familiar with the kernel implementation to use it effectively. To have probes with semantic meaning, one must allow probes to be statically declared in the implementation. The mechanism for implementing this is typically a macro that expands to a conditional call into a tracing framework if tracing is enabled. While the probe effect of this mechanism is small, it is observable: even when disabled, the expanded macro introduces a load, a compare and a taken branch.
In keeping with our philosophy of zero probe effect when disabled, we have implemented a statically defined tracing (SDT) provider by defining a C macro that expands to a call to a non-existent function with a well-defined prefix ("__dtrace_probe_"). When the kernel linker sees a relocation against a function with this prefix, it replaces the call instruction with a no-operation and records the full name of the bogus function along with the location of the call site. When the SDT provider loads, it queries the auxiliary structure and creates a probe with a name specified by the function name. When a SDT probe is enabled, the no-operation at the call site is patched to be a call into an SDT-controlled trampoline that transfers control into DTrace."

--
Ganesan Rajagopal

^ permalink raw reply [flat|nested] 271+ messages in thread
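For contrast, the conventional mechanism that the quoted passage describes first (a macro expanding to a conditional call) might look like the sketch below. It is a plain userspace illustration, not DTrace's SDT provider and not LTT's code: `tracing_enabled`, `trace_event()`, `TRACE_PROBE()` and `do_work()` are all hypothetical names. The point is only to show where the load, compare and branch come from even when tracing is off; the call-site patching that SDT uses to avoid them cannot be expressed in portable C.

```c
/* Global switch: every probe site loads and tests this flag, which is
 * the "load, compare, branch" cost the paper attributes to this style. */
static int tracing_enabled;

/* Counts how many times a probe actually fired (for demonstration). */
static unsigned long probe_hits;

/* Stand-in for a call into a tracing framework (hypothetical). */
static void trace_event(const char *name, long value)
{
    (void)name;
    (void)value;
    probe_hits++;
}

/* Statically declared probe: expands to a conditional call. */
#define TRACE_PROBE(name, value)            \
    do {                                    \
        if (tracing_enabled)                \
            trace_event((name), (value));   \
    } while (0)

/* An instrumented function; its result must not depend on tracing. */
static long do_work(long x)
{
    TRACE_PROBE("do_work_entry", x);
    return x * 2;
}
```

With `tracing_enabled` left at zero, `do_work()` still pays for the flag test on every call but never reaches `trace_event()`; setting the flag makes each call fire the probe once.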
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-16 16:05 ` Karim Yaghmour 2006-09-17 4:54 ` Ganesan Rajagopal @ 2006-09-18 8:13 ` Jes Sorensen 2006-09-18 14:46 ` Mathieu Desnoyers 2006-09-18 17:06 ` Martin Bligh 1 sibling, 2 replies; 271+ messages in thread From: Jes Sorensen @ 2006-09-18 8:13 UTC (permalink / raw) To: karim Cc: Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Karim Yaghmour wrote: > Jes Sorensen wrote: > Good. So give me concrete examples of those cases that you saw and tell > me exactly what those people you were working with were attempting to > achieve. I don't have all the details at hand, but it included syscalls and scheduler points amongst others. > Either ltt had a userbase or it didn't. To say that all its users went > out and added their own tracepoints is to not know enough about the project > and so too is it to say that none of its users could actually just use > it out of the box without modifying it. Now, as an outsider, trying to > measure how many users were using it without modifying it is like > trying to figure out how many Linux users there are out there. There's > a silent majority and there's those that need customization. Guess > who you've been talking to? Or maybe people start looking at it not knowing whether the want to pursue it to the end for their product. > Strange, come to think of it I don't remember *ever* getting an > email from you while being the maintainer or seing *any* emails by you > on the ltt lists -- that's indicative of mindset, namely that you > personally assumed you knew all about tracing and didn't need us to make > suggestions to help you AND that you personally never found it relevant > to contribute back. 
There's a word for that: *plonk* Maybe the code was used to evaluate it as an option, maybe they realized it wasn't worth using in the end, maybe they decided they could make it work. Maybe the LTT mailing list had been *dead* for 18 months by the time? You know, reading C code isn't that hard, and it didn't state anywhere in the LTT license that one is required to take out a paying contract with a certain Mr. Yaghmour just to be allowed to compile the code. > The semantics are primitive at this stage, and they could definitely > benefit from lkml input, but essentially we have a build-time parser > that goes around the code and automagically does one of two things: > a) create information for binary editors to use > b) generate an alternative C file (foo-trace.c) with inlined static > function calls. You intend to handle inline assembly how? You plan to handle the issue of debugging the code when the markup is present how? > And there might be other possibilities I haven't thought of. > > This beats every argument I've seen to date on static instrumentation. > Namely: > - It isn't visually offensive: it's a comment. > - It's not a maintenance drag: outdated comments are not alien. > - It doesn't use weird function names or caps: it's a comment. > - There is precedent: kerneldoc. > And it does preserve most of the key things those who've asked for > static markup are looking for. Namely: > - Static instrumentation > - Mainline maintainability > - Contextualized variables And it doesn't address the following issues: a) The static community providing actual evidence that dynamic tracing is noticably slower. b) It will not be enabled per default in vendor kernels so in practice the information will not be available anywhere, only in debug kernels. c) The point that we will end up with markups all over the place to satisfy everybody's needs. 
>> The other part is the constantly repeated performance claim, which to >> this point hasn't been backed up by any hard evidence. If we are to take >> that argument serious, then I strongly encourage the LTT community to >> present some real numbers, but until then it can be classified as >> nothing but FUD. > > Hmm... beats me why even the systemtap folks would themselves admit > to performance limitations. Everything has performance limitations, you keep running around touting that static is the only thing thats not a problem. Now show us the numbers! >> I shall be the first to point out that kprobes are less than ideal, >> especially the current ia64 implementation suffers from some tricky >> limitations, but thats an implementation issue. > > Ah, so it's ok for kprobes to have implementation issues, but not ltt. > Somehow there's this magic thought recurring throughout this thread > that the limitations of dynamic instrumentation are trivial to fix, > but those of static instrumentation are unrecoverable. *That* is a > fallacy if I ever saw one. I'm willing to admit that a combination of > dynamic editing and static instrumentation is a good balance, but Jes > please drop this discourse, it's not constructive. Oh so bringing fact into a discussion is not allowed. Karim, maybe you should try using some real arguments. What I am saying about the ia64 implementation is that there are limitations but I am also saying they can be fixed, it's an implementation issue, not a problem with the concept. The problems pointed out with LTT are *conceptual*, but of course you keep ignoring the facts and refusing to provide real numbers. Says it all really .... Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-18 8:13 ` Jes Sorensen
@ 2006-09-18 14:46 ` Mathieu Desnoyers
2006-09-18 17:06 ` Martin Bligh
1 sibling, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-18 14:46 UTC (permalink / raw)
To: Jes Sorensen
Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais

Hi Jes,

* Jes Sorensen (jes@sgi.com) wrote:
> Everything has performance limitations, you keep running around touting
> that static is the only thing thats not a problem. Now show us the
> numbers!
>

If I may: I showed in a previous thread that the kprobes overhead doubled LTTng's impact on the system.

If you are interested in numbers about LTTng, here they are:

"The LTTng tracer : A Low Impact Performance and Behavior Monitor for GNU/Linux" (OLS2006)
http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf

(and for Ingo: I haven't rerun the tests on your modified kprobes; that will come in time. But I do not really expect that 30-50 cycles compared to 1500 will make a very big difference.)

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-18 8:13 ` Jes Sorensen
2006-09-18 14:46 ` Mathieu Desnoyers
@ 2006-09-18 17:06 ` Martin Bligh
2006-09-20 14:17 ` Jes Sorensen
1 sibling, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-18 17:06 UTC (permalink / raw)
To: Jes Sorensen
Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

> And it doesn't address the following issues:
>
> a) The static community providing actual evidence that dynamic tracing
> is noticably slower.

...

> Everything has performance limitations, you keep running around touting
> that static is the only thing thats not a problem. Now show us the
> numbers!

When comparing two different approaches to a problem, it is unreasonable and disingenuous to try to force the onus on the proponents of one particular approach to do all the benchmarking for both sides. Everybody has to help try to find the correct solution.

Furthermore, Mathieu already did provide numbers, if you go back and look.

> The problems pointed out with LTT are *conceptual*, but of course you
> keep ignoring the facts and refusing to provide real numbers.

This is getting very silly, and unnecessarily abusive. Real problems exist on both sides of the fence, which have been discussed ad nauseam. If you don't recall them, then go back and read the thread again. The question is how to strike a compromise between two different sets of problems, which Ingo and Karim actually seemed to be making progress on towards the end of the thread.

M.

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-18 17:06 ` Martin Bligh @ 2006-09-20 14:17 ` Jes Sorensen 0 siblings, 0 replies; 271+ messages in thread From: Jes Sorensen @ 2006-09-20 14:17 UTC (permalink / raw) To: Martin Bligh Cc: karim, Andrew Morton, Ingo Molnar, tglx, Paul Mundt, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Martin Bligh wrote: >> Everything has performance limitations, you keep running around touting >> that static is the only thing thats not a problem. Now show us the >> numbers! > > When comparing two different approaches to a problem, it is unreasonable > and disingenuous to try to force the onus on the proponents of one > particular approach to do all the benchmarking for both sides. Everybody > has to help try to find the correct solution. Martin, If you have one side of a discussion stating that the other side's suggestion is useless for performance reasons, then it is IMHO totally fair for the second side to ask the first side to back up their statement with facts. If one wants to get a patch into the kernel, you also get asked for justication, and if you want to get it into a vendor kernel, a benchmark proving your patch is not causing any damage is pretty much standard. Fortunately Mathieu also showed that he was willing to try and do that. > This is getting very silly, and unnecessarily abusive. Real problems > exist on both sides of the fence, which have been discussed ad nauseam. > If you don't recall them, then go back and read the thread again. The > question is how to strike a comprimise between two different set of > problems, which Ingo and Karim actually seemed to be making progress > on towards the end of the thread. This got very silly and abuse pretty much from the beginning, at the very point anyone tried to challenge the justification that was initially presented with the LTT patches. 
This isn't how Linux works: if you want to post a patch, you should be ready to accept public scrutiny of your design and your actual code. Just because something is your personal pet project doesn't mean nobody has the right to challenge it. Even after Christoph tried to be the neutral middle-man, we had to see another three follow-ups of 'I must have the last word' postings :(

As I said in my last posting related to this thread, I had had enough. I haven't even read all the responses to my posting and I doubt I will. Instead I went back and started writing code (unrelated and really evil code, but in a very different way, and trust me it's making me very grumpy :)

Fortunately, we at least now have a situation where Mathieu has shown he is interested in being constructive on the issue and is able to work with Ingo on the static markers, which I'd like to applaud. I am optimistic a useful solution will finally come out of it, but I would rather stay out of it at this point.

Jes

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 18:16 ` Andrew Morton 2006-09-15 18:19 ` Ingo Molnar @ 2006-09-15 19:35 ` Thomas Gleixner 2006-09-15 19:40 ` Ingo Molnar 2006-09-15 19:56 ` Karim Yaghmour 2006-09-15 20:00 ` Mathieu Desnoyers 2006-09-15 20:37 ` Alan Cox 3 siblings, 2 replies; 271+ messages in thread From: Thomas Gleixner @ 2006-09-15 19:35 UTC (permalink / raw) To: Andrew Morton Cc: karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais On Fri, 2006-09-15 at 11:16 -0700, Andrew Morton wrote: > Me thinks our time would be best spent trying to benefit from his > experience.. I was involved in tracer development for quite a while and I have used them in $paying customer projects too. > Me, I'm not particularly averse to some 50-100 static tracepoints if > experience tells us that we need such things. And both Karim's and Frank's > experience does indicate that such things are needed, which carries weight. >From my experience the tracepoints usually are not at the place where you need them to track down a particular problem or analyse a particular usage scenario in detail. This has been true from a kernel and from an application programmer POV. Also many of the LTT customer I'm aware of used their own homebrewed set of trace points. What I always hated on static tracers is the requirement to recompile / reboot the kernel in order to gather information. Kprobes / systemtap is really a conveniant way to avoid this. I completely agree that the maintenance of the "out of code" trace scripts is a task which needs a lot of effort, but it does not offload the maintenance effort to those modifying the code and we have not yet another pseudo instruction/function set which is interfering with the goal to have clear and understandable code. 
Hell, the code in those code paths which are of common interest for instrumentation is already complex enough. We really can do without adding some more obfuscated macro constructs. When we can maintain a basic set of tracescripts in the kernel tree and once the necessary infrastructure is in place, I'm quite sure that quite a lot of kernel developers would keep those fundamental trace scripts in shape out of their own interest. It might take a while to get this going but once it is established, distros will ship the scripts along with dynamic tracing enabled in the kernels. I see a major advantage over static tracing in that: Static tracing is usually not enabled in production kernels, but the dynamic tracing infrastructure can be enabled without costs. So you can actually request traces (at least for the standard set of tracepoints) from Joe User to track down complex problems. One thing which is much more important IMHO is the availablity of _USEFUL_ postprocessing tools to give users a real value of instrumentation. This is a much more complex task than this whole kernel instrumentation business. This also includes the ability to coordinate user space _and_ kernel space instrumentation, which is necessary to analyse complex kernel / application code interactions. tglx ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 19:35 ` Thomas Gleixner @ 2006-09-15 19:40 ` Ingo Molnar 2006-09-15 19:56 ` Karim Yaghmour 1 sibling, 0 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 19:40 UTC (permalink / raw) To: Thomas Gleixner Cc: Andrew Morton, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Thomas Gleixner <tglx@linutronix.de> wrote: > I see a major advantage over static tracing in that: > > Static tracing is usually not enabled in production kernels, but the > dynamic tracing infrastructure can be enabled without costs. So you > can actually request traces (at least for the standard set of > tracepoints) from Joe User to track down complex problems. FYI, kprobes/SystemTap is already enabled in RHEL4. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 19:35 ` Thomas Gleixner 2006-09-15 19:40 ` Ingo Molnar @ 2006-09-15 19:56 ` Karim Yaghmour 2006-09-15 20:23 ` Thomas Gleixner 1 sibling, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 19:56 UTC (permalink / raw) To: tglx Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Thomas Gleixner wrote: > One thing which is much more important IMHO is the availablity of > _USEFUL_ postprocessing tools to give users a real value of > instrumentation. This is a much more complex task than this whole kernel > instrumentation business. This also includes the ability to coordinate > user space _and_ kernel space instrumentation, which is necessary to > analyse complex kernel / application code interactions. And of course the usefulness of such postprocessing tools is gated by the ability of users to use them on _any_ kernel they get their hands on. Up to this point, this has not been for *any* of the existing toolsets, simply because they require the user to either recompile his kernel or modify his probe points to match his kernel. Until users can actually do without either of these steps (which is only possible with static markup) then the development teams of the various projects will continue having to invest resources chasing the kernel. We don't need separate popstprocessing tool teams. 
The only reasons there are separate project teams is because managers in key positions made the decision that they'd rather break from existing projects which had had little success mainlining and instead use their corporate bodyweight to pressure/seduce kernel developers working for them into pushing their new great which-aboslutely- has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree with you kernel developers that this is crap, this is why we're developing this new amazing thing). That's the truth plain and simple. When I started involving myself in Linux development a decade ago, I honestly did not think I'd ever see this kind of stuff happen, but, hey, that's life. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 19:56 ` Karim Yaghmour @ 2006-09-15 20:23 ` Thomas Gleixner 2006-09-15 20:40 ` Roman Zippel 2006-09-15 21:05 ` Karim Yaghmour 0 siblings, 2 replies; 271+ messages in thread From: Thomas Gleixner @ 2006-09-15 20:23 UTC (permalink / raw) To: karim Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais On Fri, 2006-09-15 at 15:56 -0400, Karim Yaghmour wrote: > Thomas Gleixner wrote: > > One thing which is much more important IMHO is the availablity of > > _USEFUL_ postprocessing tools to give users a real value of > > instrumentation. This is a much more complex task than this whole kernel > > instrumentation business. This also includes the ability to coordinate > > user space _and_ kernel space instrumentation, which is necessary to > > analyse complex kernel / application code interactions. > > And of course the usefulness of such postprocessing tools is gated > by the ability of users to use them on _any_ kernel they get their > hands on. Up to this point, this has not been for *any* of the > existing toolsets, simply because they require the user to either > recompile his kernel or modify his probe points to match his kernel. So this has to be changed. And requiring to recompile the kernel is the wrong answer. Having some nifty tool, which allows you to define the set of dynamic trace points or use a predefined one is the way to go. > Until users can actually do without either of these steps (which is > only possible with static markup) Generalization like that are simply wrong. Static markup is not a panacea. It might help for some things in the first place, but it is not flexible enough in the long run. 
It is an engineering challenge to make the "static" trace rules autogenerated by some means as Andrew pointed out several times in this thread (see patch(1)), so we can provide a useful ad hoc set for the users. > We don't need separate popstprocessing tool teams. The only reasons > there are separate project teams is because managers in key > positions made the decision that they'd rather break from existing > projects which had had little success mainlining and instead use > their corporate bodyweight to pressure/seduce kernel developers > working for them into pushing their new great which-aboslutely- > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree > with you kernel developers that this is crap, this is why we're > developing this new amazing thing). That's the truth plain and > simple. Stop whining! LTT did not manage to solve the problem in a generic, mainline acceptable way. If you really believe that Kprobes / Systemtap is just a $corporate maliciousness to kick you out of business, then I really start to doubt your sanity. This has nothing to do with postprocessing and tracepoint creation tools. The postprocessing stuff is not in the scope of mainlining. Once a halfways future proof interface is available, tools will come up within no time. There are a lot of companies out there who have the interest and the capabilites to do an intergration into Eclipse to name one example. They will not start to spend a second of work time until there is a consolidated instrumentation core in the kernel. > When I started involving myself in Linux development a decade ago, > I honestly did not think I'd ever see this kind of stuff happen, > but, hey, that's life. - ENOPARSE tglx ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 20:23 ` Thomas Gleixner @ 2006-09-15 20:40 ` Roman Zippel 2006-09-15 20:48 ` Ingo Molnar 2006-09-15 21:05 ` Karim Yaghmour 1 sibling, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-15 20:40 UTC (permalink / raw) To: Thomas Gleixner Cc: karim, Andrew Morton, Paul Mundt, Jes Sorensen, Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Fri, 15 Sep 2006, Thomas Gleixner wrote: > So this has to be changed. And requiring to recompile the kernel is the > wrong answer. Having some nifty tool, which allows you to define the set > of dynamic trace points or use a predefined one is the way to go. Nobody is taking dynamic tracing away! You make it sound that tracing is only possible via dynamic traces. If I want to use static tracepoints, why shouldn't I? > Stop whining! So we're back to personal attacks now. :-( bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:40 ` Roman Zippel
@ 2006-09-15 20:48 ` Ingo Molnar
2006-09-15 21:17 ` Karim Yaghmour
2006-09-15 21:27 ` Roman Zippel
0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 20:48 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Fri, 15 Sep 2006, Thomas Gleixner wrote:
>
> > So this has to be changed. And requiring to recompile the kernel is the
> > wrong answer. Having some nifty tool, which allows you to define the set
> > of dynamic trace points or use a predefined one is the way to go.
>
> Nobody is taking dynamic tracing away!
> You make it sound that tracing is only possible via dynamic traces.
> If I want to use static tracepoints, why shouldn't I?

because:

- static tracepoints, once added, are very hard to remove - up until eternity. (On the other hand, markers for dynamic tracers are easily removed, either via making the dynamic tracer smarter, or by detaching the marker via the patch(1) method. In any case, if a marker goes away then hell does not break loose in dynamic tracing land - but it does in static tracing land.)

- the markers needed for dynamic tracing are different from the LTT static tracepoints.

- a marker for dynamic tracing has lower performance impact than a static tracepoint, on systems that are not being traced. (but which have the tracing infrastructure enabled otherwise)

- having static tracepoints dilutes the incentive for architectures to implement proper kprobes support.

> > > there are separate project teams is because managers in key > > > positions made the decision that they'd rather break from existing > > > projects which had had little success mainlining and instead use > > > their corporate bodyweight to pressure/seduce kernel developers > > > working for them into pushing their new great which-aboslutely- > > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree > > > with you kernel developers that this is crap, this is why we're > > > developing this new amazing thing). That's the truth plain and > > > simple. > > > > Stop whining! > > So we're back to personal attacks now. :-( hm, so you dont consider the above paragraph a whine. How would you characterize it then? A measured, balanced, on-topic technical comment? I'm truly curious. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 20:48 ` Ingo Molnar @ 2006-09-15 21:17 ` Karim Yaghmour 2006-09-15 21:15 ` Ingo Molnar 2006-09-15 21:27 ` Roman Zippel 1 sibling, 1 reply; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 21:17 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > hm, so you dont consider the above paragraph a whine. How would you > characterize it then? A measured, balanced, on-topic technical comment? > I'm truly curious. Take it for what you want. It's yours to disparage. Consider, though, that I'm factually explaining the real-life result of resistance to static instrumentation. It's not entirely detached, I'll admit, but consider that it remained on-topic and entirely respectful of all parties involved. I've enjoyed very positive relationships with all those individuals and continue to hold them with high regard. They took the decisions they thought were best at the time, and I can only respect them for having acted as responsibly as they found relevant for their respective organizations. I don't agree with it, but that's life. It was just important to me to point out to the casual reader the source of a lot of the fud than can be found on ltt -- i.e. lots of it is marketing. For sure ltt initially got a lot of things wrong, but the progress of kernel tracing overall would have been much better had the naysayers actually chose to understand the problem instead of stonewalling the efforts being invested. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:17 ` Karim Yaghmour @ 2006-09-15 21:15 ` Ingo Molnar 2006-09-15 21:56 ` Karim Yaghmour 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-15 21:15 UTC (permalink / raw) To: Karim Yaghmour Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Karim Yaghmour <karim@opersys.com> wrote: > [...] Consider, though, that I'm factually explaining the real-life > result of resistance to static instrumentation. [...] with all due respect, do you realize the possibility that this resistance might be a genuine technical opinion on my part that is driven by the quality of the code being offered and by the conceptual problems static tracing introduces in the future, as i see them? And thus, maybe, what you wrote: " and instead use their corporate bodyweight to pressure/seduce kernel developers working for them into pushing their new great [...] " could possibly be total, utter nonsense? Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:15 ` Ingo Molnar @ 2006-09-15 21:56 ` Karim Yaghmour 0 siblings, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 21:56 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > with all due respect, do you realize the possibility that this > resistance might be a genuine technical opinion on my part that is > driven by the quality of the code being offered and by the conceptual > problems static tracing introduces in the future, as i see them? Wait. What I said could not possibly apply to comments you, or anybody else for that matter, made within this thread. What I said refers to events and threads which have long since passed. The "resistance" I allude to is that faced by ltt early on and for as long as several parties were actively involved in trying to standardize on it. I'm merely trying to explain the current status of this: several teams in "apparent" competition one another. > " and instead use their corporate bodyweight to pressure/seduce kernel > developers working for them into pushing their new great [...] " > > could possibly be total, utter nonsense? Please read this in the above context -- passed events. In as far as my understanding of events as I was part of them, this was the best I made of the decision-making thought process at a managerial level. And I do not wish to substantiate that nor was this meant as a personal attack against any person or organization. Everyone acted to the best of their knowledge of the facts at the time and I cannot fault them for that. I disagreed and was disappointed, obviously, but that's mine to bear. Put simply: all parties involved would actually wish things were different. 
Karim
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 20:48 ` Ingo Molnar 2006-09-15 21:17 ` Karim Yaghmour @ 2006-09-15 21:27 ` Roman Zippel 2006-09-15 21:51 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-15 21:27 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Fri, 15 Sep 2006, Ingo Molnar wrote: > > Nobody is taking dynamic tracing away! > > You make it sound that tracing is only possible via dynamic traces. > > If I want to use static tracepoints, why shouldn't I? > > because: > > - static tracepoints, once added, are very hard to remove - up until > eternity. (On the other hand, markers for dynamic tracers are easily > removed, either via making the dynamic tracer smarter, or by > detaching the marker via the patch(1) method. In any case, if a > marker goes away then hell does not break loose in dynamic tracing > land - but it does in static tracing land. This is simply not true, at the source level you can remove a static tracepoint as easily as a dynamic tracepoint, the effect of the missing trace information is the same either way. > - the markers needed for dynamic tracing are different from the LTT > static tracepoints. What makes the requirements so different? I would actually think it depends on the user independent of the tracing is done. > - a marker for dynamic tracing has lower performance impact than a > static tracepoint, on systems that are not being traced. (but which > have the tracing infrastructure enabled otherwise) Anyone using static tracing intents to use, which makes this point moot. > - having static tracepoints dillutes the incentive for architectures to > implement proper kprobes support. 
Considering the level of work needed to support efficient dynamic tracing it only withholds archs from tracing support for no good reason. > > > > there are separate project teams is because managers in key > > > > positions made the decision that they'd rather break from existing > > > > projects which had had little success mainlining and instead use > > > > their corporate bodyweight to pressure/seduce kernel developers > > > > working for them into pushing their new great which-aboslutely- > > > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree > > > > with you kernel developers that this is crap, this is why we're > > > > developing this new amazing thing). That's the truth plain and > > > > simple. > > > > > > Stop whining! > > > > So we're back to personal attacks now. :-( > > hm, so you dont consider the above paragraph a whine. How would you > characterize it then? A measured, balanced, on-topic technical comment? > I'm truly curious. It's sarcastic, but considering the disrespect towards Karim, I don't blame him. At some point the "whining" argument was funny, but lately it's only used to descredit people. bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:27 ` Roman Zippel
@ 2006-09-15 21:51 ` Ingo Molnar
2006-09-15 22:15 ` Karim Yaghmour
2006-09-15 22:53 ` Roman Zippel
0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 21:51 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > because:
> >
> > - static tracepoints, once added, are very hard to remove - up until
> > eternity. (On the other hand, markers for dynamic tracers are easily
> > removed, either via making the dynamic tracer smarter, or by
> > detaching the marker via the patch(1) method. In any case, if a
> > marker goes away then hell does not break loose in dynamic tracing
> > land - but it does in static tracing land.
>
> This is simply not true, at the source level you can remove a static
> tracepoint as easily as a dynamic tracepoint, the effect of the
> missing trace information is the same either way.

this is not true. I gave you one example already a few mails ago (which
you did not reply to, neither did you reply the previous time when i
first mentioned this - perhaps you missed it in the high volume of
emails):

" i outlined one such specific "removal of static tracepoint" example
  already: static trace points at the head/prologue of functions (half
  of the existing tracepoints are such). The sock_sendmsg() example i
  quoted before is such a case. Those trace points can be replaced with
  a simple GCC function attribute, which would cause a 5-byte (or
  whatever necessary) NOP to be inserted at the function prologue. The
  attribute would be a lot less invasive than an explicit tracepoint
  (and thus easier to maintain) "

> > - the markers needed for dynamic tracing are different from the LTT
> > static tracepoints.
>
> What makes the requirements so different? I would actually think it
> depends on the user independent of the tracing is done.

yes, and i mentioned before that they can be merged (i even outlined a
few APIs for it), but still that is not being offered by LTT today.

> > - a marker for dynamic tracing has lower performance impact than a
> > static tracepoint, on systems that are not being traced. (but which
> > have the tracing infrastructure enabled otherwise)
>
> Anyone using static tracing intents to use, which makes this point
> moot.

that's not at all true, on multiple grounds:

Firstly, many people use distro kernels. A Linux distribution typically
wants to offer as few kernel rpms as possible (one per arch to be
precise), but it also wants to offer as many features as possible. So if
there was a static tracer in there, a distro would enable it - but 99.9%
of the users would never use it - still they would see the overhead.
Hence the user would have it enabled, but does not intend to use it -
which contradicts your statement.

Secondly, even people who intend to _eventually_ make use of tracing
don't use it most of the time. So why should they have more overhead
when they are not tracing? Again: the point is not moot, because even a
user who intends to use tracing does not always want to trace.

> > - having static tracepoints dillutes the incentive for architectures to
> > implement proper kprobes support.
>
> Considering the level of work needed to support efficient dynamic
> tracing it only withholds archs from tracing support for no good
> reason.

5 major architectures (both RISC and CISC) already support kprobes, so
fortunately this point is largely moot - but you are right to a certain
degree, it's not totally solved. But the examples are there. It's still
not trivial to implement a feature like this, but kernel programming
never is. I far prefer the harder but more intelligent solution to the
easier but less intelligent one - even if that means a temporary
unavailability of a feature for some rarer arch.

> > > > > there are separate project teams is because managers in key
> > > > > positions made the decision that they'd rather break from existing
> > > > > projects which had had little success mainlining and instead use
> > > > > their corporate bodyweight to pressure/seduce kernel developers
> > > > > working for them into pushing their new great which-aboslutely-
> > > > > has-nothing-to-do-with-this-ltt-crap-(no,no, we actually agree
> > > > > with you kernel developers that this is crap, this is why we're
> > > > > developing this new amazing thing). That's the truth plain and
> > > > > simple.
> > > >
> > > > Stop whining!
> > >
> > > So we're back to personal attacks now. :-(
> >
> > hm, so you dont consider the above paragraph a whine. How would you
> > characterize it then? A measured, balanced, on-topic technical
> > comment? I'm truly curious.
>
> It's sarcastic, [...]

oh, really? Karim's characterization was:

" I'm factually explaining the real-life result of resistance to static
  instrumentation. "

so whose interpretation of Karim's comments should i accept, yours or
Karim's? I'm really torn on that issue. (_that_ was sarcastic)

Ingo
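The overhead in dispute, for a kernel that has static tracepoints compiled in but is not currently tracing, can be pictured with a small userspace sketch. This is not LTT or kernel code: tracing_enabled, record_event, TRACE_SENDMSG and demo_sendmsg are hypothetical names made up for illustration, and the sketch assumes the simplest possible design, a single global flag tested at every call site.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global switch; a real tracer would likely keep one
 * flag per event or per channel rather than a single global. */
static volatile bool tracing_enabled = false;

/* Hypothetical sink: counts events instead of writing to a ring buffer. */
static unsigned long events_recorded = 0;

static void record_event(int family, int type, size_t size)
{
    (void)family;
    (void)type;
    (void)size;
    events_recorded++;
}

/* A static tracepoint in its simplest form: the check is compiled into
 * the call site, so the load and branch run even when tracing is off. */
#define TRACE_SENDMSG(family, type, size)               \
    do {                                                \
        if (tracing_enabled)                            \
            record_event((family), (type), (size));     \
    } while (0)

/* Stand-in for an instrumented function such as sock_sendmsg(). */
static int demo_sendmsg(int family, int type, size_t size)
{
    TRACE_SENDMSG(family, type, size);
    return (int)size;
}
```

With the flag clear, demo_sendmsg(2, 1, 64) records nothing, yet the test of tracing_enabled still runs on every call. That per-call-site check is the kind of residual cost being argued about; how large it is in practice would have to be measured on real workloads.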
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:51 ` Ingo Molnar @ 2006-09-15 22:15 ` Karim Yaghmour 2006-09-15 22:53 ` Roman Zippel 1 sibling, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-15 22:15 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Thomas Gleixner, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > oh, really? Karim's characterization was: > > " I'm factually explaining the real-life result of resistance to static > instrumentation. " > > so whose interpretation of Karim's comments should i accept, yours or > Karim's? I'm really torn on that issue. (_that_ was sarcastic) Hmm ... this might explain why we're having a hard time here ... me thinks: Ingo don't see that dynamic tracing is orthogonal to static markup and Ingo don't see that my explanation is orthogonal to Roman's (i.e. I did factually explain stuff and did resort to sarcasm as part of said explanation) ... maybe Ingo does not like orthogonal stuff ... That _too_ was sarcastic. Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 21:51 ` Ingo Molnar 2006-09-15 22:15 ` Karim Yaghmour @ 2006-09-15 22:53 ` Roman Zippel 2006-09-15 23:14 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-15 22:53 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Fri, 15 Sep 2006, Ingo Molnar wrote: > > This is simply not true, at the source level you can remove a static > > tracepoint as easily as a dynamic tracepoint, the effect of the > > missing trace information is the same either way. > > this is not true. I gave you one example already a few mails ago (which > you did not reply to, neither did you reply the previous time when i > first mentioned this - perhaps you missed it in the high volume of > emails): > > " i outlined one such specific "removal of static tracepoint" example > already: static trace points at the head/prologue of functions (half > of the existing tracepoints are such). The sock_sendmsg() example i > quoted before is such a case. Those trace points can be replaced with > a simple GCC function attribute, which would cause a 5-byte (or > whatever necessary) NOP to be inserted at the function prologue. The > attribute would be alot less invasive than an explicit tracepoint (and > thus easier to maintain) " As I said before you're mixing up function tracing with event tracing, not all events are tied to functions, functions can be moved and renamed, the actual event more often stays the same. Function attributes also doesn't provide information local to the function. > > > - the markers needed for dynamic tracing are different from the LTT > > > static tracepoints. > > > > What makes the requirements so different? 
I would actually think it > > depends on the user independent of the tracing is done. > > yes, and i mentioned before that they can be merged (i even outlined a > few APIs for it), but still that is not being offered by LTT today. It's possible I missed something, but pretty much anything you outlined wouldn't make the live of static tracepoints any easier. > > > - a marker for dynamic tracing has lower performance impact than a > > > static tracepoint, on systems that are not being traced. (but which > > > have the tracing infrastructure enabled otherwise) > > > > Anyone using static tracing intents to use, which makes this point > > moot. > > that's not at all true, on multiple grounds: > > Firstly, many people use distro kernels. A Linux distribution typically > wants to offer as few kernel rpms as possible (one per arch to be > precise), but it also wants to offer as many features as possible. So if > there was a static tracer in there, a distro would enable it - but 99.9% > of the users would never use it - still they would see the overhead. > Hence the user would have it enabled, but does not intend to use it - > which contradicts your statement. So if dynamic tracing is available use it, as distributions already do. OTOH the barrier to use static tracing is drastically different whether the user has to deal with external patches or whether it's a simple kernel option. Again, static tracing doesn't exclude the possibility of dynamic tracing, that's something you constantly omit and thus make it sound like both options were mutually exlusive. > Secondly, even people who intend to _eventually_ make use of tracing, > dont use it most of the time. So why should they have more overhead when > they are not tracing? Again: the point is not moot because even though > the user intends to use tracing, but does not always want to trace. I've used kernels which included static tracing and the perfomance overhead is negligible for occasional use. 
> > > - having static tracepoints dillutes the incentive for architectures to > > > implement proper kprobes support. > > > > Considering the level of work needed to support efficient dynamic > > tracing it only withholds archs from tracing support for no good > > reason. > > 5 major architectures (both RISC and CISC) already support kprobes, so > fortunately this point is largely moot - but you are right to a certain > degree, it's not totally solved. But the examples are there. It's still > not trivial to implement a feature like this, but kernel programming > never is. I far more prefer the harder but more intelligent solution > than the easier but less intelligent solution - even if that means a > temporary unavailability of a feature for some rarer arch. Why don't you leave the choice to the users? Why do you constantly make it an exclusive choice? There is a lot of common ground, but you seem to be hellbent to make the life of static tracers and thus their users as hard possible. Only for pursuit of some perfect solution while the more practical solution is easily available without any ill effects? bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 22:53 ` Roman Zippel
@ 2006-09-15 23:14 ` Ingo Molnar
2006-09-15 23:49 ` Nicholas Miell
2006-09-16 0:31 ` Roman Zippel
0 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:14 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > > This is simply not true, at the source level you can remove a static
> > > tracepoint as easily as a dynamic tracepoint, the effect of the
> > > missing trace information is the same either way.
> >
> > this is not true. I gave you one example already a few mails ago (which
> > you did not reply to, neither did you reply the previous time when i
> > first mentioned this - perhaps you missed it in the high volume of
> > emails):
> >
> > " i outlined one such specific "removal of static tracepoint" example
> > already: static trace points at the head/prologue of functions (half
> > of the existing tracepoints are such). The sock_sendmsg() example i
> > quoted before is such a case. Those trace points can be replaced with
> > a simple GCC function attribute, which would cause a 5-byte (or
> > whatever necessary) NOP to be inserted at the function prologue. The
> > attribute would be alot less invasive than an explicit tracepoint (and
> > thus easier to maintain) "
>
> As I said before you're mixing up function tracing with event tracing,
> not all events are tied to functions, functions can be moved and
> renamed, the actual event more often stays the same.

you are showing a clear misunderstanding of how tracing is typically
done. Both for LTT and for blktrace (and for the tracers i've done
myself), roughly half (50%) of the tracepoints are right at the top of
the function and trace the function arguments. Let me quote an example
straight from LTT:

int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
        struct kiocb iocb;
        struct sock_iocb siocb;
        int ret;

        trace_socket_sendmsg(sock, sock->sk->sk_family,
                sock->sk->sk_type,
                sock->sk->sk_protocol,
                size);

this tracepoint, under a dynamic tracing concept, can be replaced with:

int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
        struct kiocb iocb;
        struct sock_iocb siocb;
        int ret;

note the "__trace" attribute to the function. (see my previous mails
where i talked about __trace for more details) SystemTap can hook to
that point and can access the very same parameters that the markup does,
in a lot less invasive way.

So a 5-line markup can be replaced with a single function attribute.

roughly half of the existing tracepoints in blktrace/LTT can be replaced
that way. A 50% reduction in the number of markups is significant - but
such a reduction in markups is not possible under the static tracing
concept. And that method was just off the top of my head - Andrew
provided other ideas to reduce the number of markups.

> Function attributes also doesn't provide information local to the
> function.

of course, but where does the above tracepoint i quoted use information
local to the function? A fair number of markups use global functions
because, surprise, a lot of interesting activity happens along global
functions. So a healthy reduction in markups can be achieved.

> > > > - the markers needed for dynamic tracing are different from the
> > > > LTT static tracepoints.
> > >
> > > What makes the requirements so different? I would actually think
> > > it depends on the user independent of the tracing is done.
> >
> > yes, and i mentioned before that they can be merged (i even outlined
> > a few APIs for it), but still that is not being offered by LTT
> > today.
>
> It's possible I missed something, but pretty much anything you
> outlined wouldn't make the live of static tracepoints any easier.

sorry, but if you re-read the above line of argument, your sentence
appears to be a non sequitur. I said "the markers needed for dynamic
tracing are different from the LTT static tracepoints". You asked why
they are so different, and i replied that i already outlined what the
right API would be in my opinion to do markups, but that API is
different from what LTT is offering now. To which you are now replying:
"pretty much anything you outlined wouldn't make the life of static
tracepoints any easier." Huh?

> > > > - a marker for dynamic tracing has lower performance impact than a
> > > > static tracepoint, on systems that are not being traced. (but which
> > > > have the tracing infrastructure enabled otherwise)
> > >
> > > Anyone using static tracing intents to use, which makes this point
> > > moot.
> >
> > that's not at all true, on multiple grounds:
> >
> > Firstly, many people use distro kernels. A Linux distribution typically
> > wants to offer as few kernel rpms as possible (one per arch to be
> > precise), but it also wants to offer as many features as possible. So if
> > there was a static tracer in there, a distro would enable it - but 99.9%
> > of the users would never use it - still they would see the overhead.
> > Hence the user would have it enabled, but does not intend to use it -
> > which contradicts your statement.
>
> So if dynamic tracing is available use it, as distributions already
> do. OTOH the barrier to use static tracing is drastically different
> whether the user has to deal with external patches or whether it's a
> simple kernel option. Again, static tracing doesn't exclude the
> possibility of dynamic tracing, that's something you constantly omit
> and thus make it sound like both options were mutually exlusive.

how does this reply to my point that: "a marker for dynamic tracing has
lower performance impact than a static tracepoint, on systems that are
not being traced", which point you claimed moot?

> > Secondly, even people who intend to _eventually_ make use of
> > tracing, dont use it most of the time. So why should they have more
> > overhead when they are not tracing? Again: the point is not moot
> > because even though the user intends to use tracing, but does not
> > always want to trace.
>
> I've used kernels which included static tracing and the perfomance
> overhead is negligible for occasional use.

how does this suddenly make my point, that "a marker for dynamic tracing
has lower performance impact than a static tracepoint, on systems that
are not being traced", "moot"?

> > > > - having static tracepoints dillutes the incentive for
> > > > architectures to
> > > > implement proper kprobes support.
> > >
> > > Considering the level of work needed to support efficient dynamic
> > > tracing it only withholds archs from tracing support for no good
> > > reason.
> >
> > 5 major architectures (both RISC and CISC) already support kprobes,
> > so fortunately this point is largely moot - but you are right to a
> > certain degree, it's not totally solved. But the examples are there.
> > It's still not trivial to implement a feature like this, but kernel
> > programming never is. I far more prefer the harder but more
> > intelligent solution than the easier but less intelligent solution -
> > even if that means a temporary unavailability of a feature for some
> > rarer arch.
>
> Why don't you leave the choice to the users? Why do you constantly
> make it an exclusive choice? [...]

as i outlined it tons of times before: once we add markups for static
tracers, we cannot remove them. That is a constant kernel maintenance
drag that i feel uncomfortable about. While with dynamic tracers i see a
clear path out of any such drag. We can, in a very fine-grained way,
tune the overhead of markups vs. out-of-source scripts. Static tracers
don't give us this flexibility - and hence limit our future choices.

the user of course does not care about kernel internal design and
maintenance issues. Think about the many reasons why STREAMS was
rejected - users wanted that too. And note that users don't want "static
tracers" or any design detail of LTT in particular: what they want is
the _functionality_ of LTT.

nor do i reject all of LTT: as i said before i like the tools, and i
think its collection of trace events should be turned into systemtap
markups and scripts. Furthermore, its ringbuffer implementation looks
better. So as far as the user is concerned, LTT could (and should) live
on with full capabilities, but with this crucial difference in how it
interfaces to the kernel source code.

Ingo
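The contrast between an explicit markup in the function body and a padded function entry can be tried out in userspace. The sketch below is an illustration under stated assumptions, not the __trace attribute proposed in this message (which is hypothetical here): it uses the patchable_function_entry attribute that much later GCC and Clang releases provide, and falls back to nothing when the compiler or target lacks it. TRACE_PAD, struct demo_sock, trace_socket_sendmsg_demo, sendmsg_static and sendmsg_padded are made-up names.

```c
#include <stddef.h>

/* TRACE_PAD stands in for the hypothetical __trace attribute. Where the
 * compiler offers patchable_function_entry, request five NOPs at the
 * function entry (five bytes on x86, assuming one-byte NOPs). Restricting
 * this to ELF targets is a conservative assumption of this sketch. */
#if defined(__has_attribute) && defined(__ELF__)
#  if __has_attribute(patchable_function_entry)
#    define TRACE_PAD __attribute__((patchable_function_entry(5)))
#  endif
#endif
#ifndef TRACE_PAD
#  define TRACE_PAD /* no padding available: plain function */
#endif

/* Minimal stand-in for the socket fields the LTT markup passes along. */
struct demo_sock {
    int family;
    int type;
    int protocol;
};

static unsigned long explicit_hits;

/* Hypothetical handler behind the explicit markup. */
static void trace_socket_sendmsg_demo(const struct demo_sock *s, size_t size)
{
    (void)s;
    (void)size;
    explicit_hits++;
}

/* Variant 1: static markup, a multi-argument call inside the body. */
static int sendmsg_static(const struct demo_sock *s, size_t size)
{
    trace_socket_sendmsg_demo(s, size);
    return (int)size;
}

/* Variant 2: nothing in the body; only the entry is padded so that an
 * external tool could patch a hook in and read the arguments there. */
static int TRACE_PAD sendmsg_padded(const struct demo_sock *s, size_t size)
{
    (void)s;
    return (int)size;
}
```

Both variants behave the same from the caller's side; the difference is where the instrumentation lives. In the padded variant the source carries one annotation and the probe logic stays outside the tree, which is the maintenance argument made above. The same compilers also accept -fpatchable-function-entry=N on the command line to pad every function in a translation unit.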
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:14 ` Ingo Molnar
@ 2006-09-15 23:49 ` Nicholas Miell
2006-09-15 23:57 ` Ingo Molnar
2006-09-16 0:31 ` Roman Zippel
1 sibling, 1 reply; 271+ messages in thread
From: Nicholas Miell @ 2006-09-15 23:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
On Sat, 2006-09-16 at 01:14 +0200, Ingo Molnar wrote:
> * Roman Zippel <zippel@linux-m68k.org> wrote:
>
> > > > This is simply not true, at the source level you can remove a static
> > > > tracepoint as easily as a dynamic tracepoint, the effect of the
> > > > missing trace information is the same either way.
> > >
> > > this is not true. I gave you one example already a few mails ago (which
> > > you did not reply to, neither did you reply the previous time when i
> > > first mentioned this - perhaps you missed it in the high volume of
> > > emails):
> > >
> > > " i outlined one such specific "removal of static tracepoint" example
> > > already: static trace points at the head/prologue of functions (half
> > > of the existing tracepoints are such). The sock_sendmsg() example i
> > > quoted before is such a case. Those trace points can be replaced with
> > > a simple GCC function attribute, which would cause a 5-byte (or
> > > whatever necessary) NOP to be inserted at the function prologue. The
> > > attribute would be a lot less invasive than an explicit tracepoint (and
> > > thus easier to maintain) "
> >
> > As I said before you're mixing up function tracing with event tracing,
> > not all events are tied to functions, functions can be moved and
> > renamed, the actual event more often stays the same.
>
> you are showing a clear misunderstanding of how tracing is typically
> done. Both for LTT and for blktrace (and for the tracers i've done
> myself), roughly half (50%) of the tracepoints are right at the top of
> the function and trace the function arguments. Let me quote an example
> straight from LTT:
>
> int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> {
>	struct kiocb iocb;
>	struct sock_iocb siocb;
>	int ret;
>
>	trace_socket_sendmsg(sock, sock->sk->sk_family,
>		sock->sk->sk_type,
>		sock->sk->sk_protocol,
>		size);
>
> this tracepoint, under a dynamic tracing concept, can be replaced with:
>
> int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> {
>	struct kiocb iocb;
>	struct sock_iocb siocb;
>	int ret;
>
> note the "__trace" attribute to the function. (see my previous mails
> where i talked about __trace for more details) SystemTap can hook to
> that point and can access the very same parameters that the markup does,
> in a lot less invasive way.
>
> So a 5-line markup can be replaced with a single function attribute.
>
> roughly half of the existing tracepoints in blktrace/LTT can be replaced
> that way. A 50% reduction in the number of markups is significant - but
> such a reduction in markups is not possible under the static tracing
> concept. And that method was just off the top of my head - Andrew
> provided other ideas to reduce the number of markups.
>
You're going to want to be able to trace every function in the kernel,
which means they'd all need a __trace -- and in that case, a
-fpad-functions-for-tracing gcc option would make more sense than
per-function attributes.
The option could also insert NOPs before RETs, not just before the
prologue so that function returns are equally easy to trace. (It might
also inhibit tail calls, assuming being able to trace all function
returns is more important than that optimization.)
And SystemTap can already hook into sock_sendmsg() (or any other
function) and examine its arguments -- all of this GCC extension talk
is just performance enhancement.
--
Nicholas Miell <nmiell@comcast.net>
^ permalink raw reply [flat|nested] 271+ messages in thread
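The entry-marker pattern argued over in the message above can be sketched in
userspace C: a static marker at the top of a function, guarded by a runtime
flag so that an untraced run pays only for one test, and removable entirely
at build time. Every name here (sketch_sendmsg, sketch_trace_sendmsg,
SKETCH_TRACING and so on) is hypothetical and merely stands in for the
sock_sendmsg()/trace_socket_sendmsg() pair quoted in the thread. It is not
LTTng, kprobes or SystemTap code, and it does not model the NOP-padding idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace sketch; no name below exists in the kernel. */

/* Build-time switch: compile with -DSKETCH_TRACING=0 to remove the marker. */
#ifndef SKETCH_TRACING
#define SKETCH_TRACING 1
#endif

static int sketch_trace_enabled;	/* runtime switch, off by default */
static unsigned long sketch_trace_hits;	/* number of recorded events */
static size_t sketch_last_size;		/* size argument of the last event */

/* The "tracer": here it only counts events and remembers the last size. */
static void sketch_record_sendmsg(int family, int type, size_t size)
{
	assert(sketch_trace_enabled);	/* only reached while tracing is on */
	(void)family;
	(void)type;
	sketch_trace_hits++;
	sketch_last_size = size;
}

#if SKETCH_TRACING
/* Static marker: one flag test when idle, one call when tracing. */
#define sketch_trace_sendmsg(family, type, size)			\
	do {								\
		if (sketch_trace_enabled)				\
			sketch_record_sendmsg((family), (type), (size)); \
	} while (0)
#else
/* Marker compiled out: the call site expands to an empty statement. */
#define sketch_trace_sendmsg(family, type, size) do { } while (0)
#endif

/* Stand-in for a function that carries a marker right at its entry. */
static int sketch_sendmsg(int family, int type, size_t size)
{
	sketch_trace_sendmsg(family, type, size);
	return (int)size;
}
```

With the flag clear, a call such as sketch_sendmsg(2, 1, 100) leaves
sketch_trace_hits untouched; after setting sketch_trace_enabled to 1, the
next call records its size argument.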
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:49 ` Nicholas Miell
@ 2006-09-15 23:57 ` Ingo Molnar
2006-09-16 0:41 ` Nicholas Miell
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 23:57 UTC (permalink / raw)
To: Nicholas Miell
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Nicholas Miell <nmiell@comcast.net> wrote:
> You're going to want to be able to trace every function in the kernel,
> which means they'd all need a __trace -- and in that case, a
> -fpad-functions-for-tracing gcc option would make more sense than
> per-function attributes.
the __trace attribute would be a _specific_ replacement for a _specific_
static markup at the entry of a function. So no, we would not want to
add __trace to _every_ function in the kernel: only those which get
commonly traced. And note that SystemTap can trace the rest too, just
with slightly higher overhead.
In that sense __trace is not an enabling infrastructure, it's a
performance tuning infrastructure.
> The option could also insert NOPs before RETs, not just before the
> prologue so that function returns are equally easy to trace. (It might
> also inhibit tail calls, assuming being able to trace all function
> returns is more important than that optimization.)
yeah. __trace_entry and __trace_exit [or both] attributes. Makes sense.
> And SystemTap can already hook into sock_sendmsg() (or any other
> function) and examine its arguments -- all of this GCC extension talk
> is just performance enhancement.
yes, yes, yes, exactly!!! Finally someone reads my mails and understands
my points. There's hope! ;)
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:57 ` Ingo Molnar
@ 2006-09-16 0:41 ` Nicholas Miell
0 siblings, 0 replies; 271+ messages in thread
From: Nicholas Miell @ 2006-09-16 0:41 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
On Sat, 2006-09-16 at 01:57 +0200, Ingo Molnar wrote:
> * Nicholas Miell <nmiell@comcast.net> wrote:
>
> > You're going to want to be able to trace every function in the kernel,
> > which means they'd all need a __trace -- and in that case, a
> > -fpad-functions-for-tracing gcc option would make more sense than
> > per-function attributes.
>
> the __trace attribute would be a _specific_ replacement for a _specific_
> static markup at the entry of a function. So no, we would not want to
> add __trace to _every_ function in the kernel: only those which get
> commonly traced. And note that SystemTap can trace the rest too, just
> with slightly higher overhead.
>
> In that sense __trace is not an enabling infrastructure, it's a
> performance tuning infrastructure.
>
> > The option could also insert NOPs before RETs, not just before the
> > prologue so that function returns are equally easy to trace. (It might
> > also inhibit tail calls, assuming being able to trace all function
> > returns is more important than that optimization.)
>
> yeah. __trace_entry and __trace_exit [or both] attributes. Makes sense.
>
> > And SystemTap can already hook into sock_sendmsg() (or any other
> > function) and examine its arguments -- all of this GCC extension talk
> > is just performance enhancement.
>
> yes, yes, yes, exactly!!! Finally someone reads my mails and understands
> my points. There's hope! ;)
I'm not sure that I do, actually. You seem to be opposed to all static
probe markers in general, but I think that they'd be useful for big
abstract things like "new thread created" (which would encompass
fork/vfork/clone and probably consist of a single marker in do_fork) or
for similar things that happen all over the kernel (for example, I
imagine that all filesystems would want to use the same set of probe
names just to make I/O tracing easier for userspace).
--
Nicholas Miell <nmiell@comcast.net>
^ permalink raw reply [flat|nested] 271+ messages in thread
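The "single marker for an abstract event" idea in the message above can be
sketched as follows: several entry points funnel into one common function,
and that function carries the only marker, so a consumer sees one named
event regardless of which entry point was used. This is a userspace
illustration under stated assumptions; sketch_mark(), the "thread_create"
event name and the sketch_fork()/sketch_vfork()/sketch_clone() stand-ins are
all hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace sketch; none of these names exist in the kernel. */
#define SKETCH_MAX_EVENTS 8

struct sketch_event {
	const char *name;	/* event name shared by all call sites */
	unsigned long count;	/* how often the event fired */
};

static struct sketch_event sketch_events[SKETCH_MAX_EVENTS];
static int sketch_nr_events;

/* Look up an event by name without creating it. */
static struct sketch_event *sketch_event_find(const char *name)
{
	int i;

	for (i = 0; i < sketch_nr_events; i++)
		if (strcmp(sketch_events[i].name, name) == 0)
			return &sketch_events[i];
	return NULL;
}

/* The marker: count one occurrence of the named event. */
static void sketch_mark(const char *name)
{
	struct sketch_event *ev;

	assert(name != NULL);
	ev = sketch_event_find(name);
	if (!ev) {
		if (sketch_nr_events == SKETCH_MAX_EVENTS)
			return;		/* table full: drop the event */
		ev = &sketch_events[sketch_nr_events++];
		ev->name = name;
		ev->count = 0;
	}
	ev->count++;
}

/* Read back a counter; unknown events report zero. */
static unsigned long sketch_event_count(const char *name)
{
	struct sketch_event *ev = sketch_event_find(name);

	return ev ? ev->count : 0;
}

/* Common path: the only place that carries the marker. */
static int sketch_do_fork(int flags)
{
	sketch_mark("thread_create");
	return flags;
}

/* Three entry points, none of which needs a marker of its own. */
static int sketch_fork(void)       { return sketch_do_fork(0); }
static int sketch_vfork(void)      { return sketch_do_fork(1); }
static int sketch_clone(int flags) { return sketch_do_fork(flags); }
```

Calling each of the three entry points once leaves a single "thread_create"
entry in the table with a count of three, which is the property the message
is after: the event name stays stable even though the callers differ.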
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 23:14 ` Ingo Molnar
2006-09-15 23:49 ` Nicholas Miell
@ 2006-09-16 0:31 ` Roman Zippel
2006-09-16 8:20 ` Ingo Molnar
` (6 more replies)
1 sibling, 7 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-16 0:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sat, 16 Sep 2006, Ingo Molnar wrote:
> > As I said before you're mixing up function tracing with event tracing,
> > not all events are tied to functions, functions can be moved and
> > renamed, the actual event more often stays the same.
>
> you are showing a clear misunderstanding of how tracing is typically
> done.
Not really, you're missing the point I'm trying to make, we want to
trace _events_ not functions. Function specific tracing would still
require kernel specific mapping to map function names to events.
> Both for LTT and for blktrace (and for the tracers i've done
> myself), roughly half (50%) of the tracepoints are right at the top of
> the function and trace the function arguments. Let me quote an example
> straight from LTT:
>
> int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> {
>	struct kiocb iocb;
>	struct sock_iocb siocb;
>	int ret;
>
>	trace_socket_sendmsg(sock, sock->sk->sk_family,
>		sock->sk->sk_type,
>		sock->sk->sk_protocol,
>		size);
>
> this tracepoint, under a dynamic tracing concept, can be replaced with:
>
> int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> {
>	struct kiocb iocb;
>	struct sock_iocb siocb;
>	int ret;
>
> note the "__trace" attribute to the function. (see my previous mails
> where i talked about __trace for more details) SystemTap can hook to
> that point and can access the very same parameters that the markup does,
> in a lot less invasive way.
>
> So a 5-line markup can be replaced with a single function attribute.
A nice example where you make life more difficult for static tracers for
no reason, whereas a "trace_socket_sendmsg(sock, size);" is just as
usable. It would also add virtually no maintenance overhead as you like
to claim - how often does this function change?
> > Function attributes also doesn't provide information local to the
> > function.
>
> of course, but where does the above tracepoint i quoted use information
> local to the function? A fair number of markups use global functions
> because, surprise, a lot of interesting activity happens along global
> functions. So a healthy reduction in markups can be achieved.
But not completely, which is the whole point.
> > It's possible I missed something, but pretty much anything you
> > outlined wouldn't make the life of static tracepoints any easier.
>
> sorry, but if you re-read the above line of argument, your sentence
> appears to be a non sequitur. I said "the markers needed for dynamic
> tracing are different from the LTT static tracepoints". You asked why
> they are so different, and i replied that i already outlined what the
> right API would be in my opinion to do markups, but that API is
> different from what LTT is offering now. To which you are now replying:
> "pretty much anything you outlined wouldn't make the life of static
> tracepoints any easier." Huh?
Yeah, huh?
I have no idea what you're trying to tell me. As you demonstrated above
your "right API" is barely usable for static tracers.
> > So if dynamic tracing is available use it, as distributions already
> > do. OTOH the barrier to use static tracing is drastically different
> > whether the user has to deal with external patches or whether it's a
> > simple kernel option. Again, static tracing doesn't exclude the
> > possibility of dynamic tracing, that's something you constantly omit
> > and thus make it sound like both options were mutually exclusive.
>
> how does this reply to my point that: "a marker for dynamic tracing has
> lower performance impact than a static tracepoint, on systems that are
> not being traced", which point you claimed moot?
Because it's pretty much an implementation issue. The point is about
adding markers at all, it's about the choice of being able to use static
tracers in the first place. Both undeniably have their advantages and
disadvantages, where you prefer to emphasize only the strong points of
dynamic tracing and constantly declare its problems as nonissues.
> > > Secondly, even people who intend to _eventually_ make use of
> > > tracing, don't use it most of the time. So why should they have more
> > > overhead when they are not tracing? Again: the point is not moot
> > > because even though the user intends to use tracing, he does not
> > > always want to trace.
> >
> > I've used kernels which included static tracing and the performance
> > overhead is negligible for occasional use.
>
> how does this suddenly make my point, that "a marker for dynamic tracing
> has lower performance impact than a static tracepoint, on systems that
> are not being traced", "moot"?
Why exactly is the point relevant in the first place? How exactly is the
added (minor!) overhead such a fundamental problem?
> > Why don't you leave the choice to the users? Why do you constantly
> > make it an exclusive choice? [...]
>
> as i outlined it tons of times before: once we add markups for static
> tracers, we cannot remove them. That is a constant kernel maintenance
> drag that i feel uncomfortable about.
As many, many people have already said, any tracepoints have a
maintenance overhead, which is barely different between dynamic and
static tracing and only increases the further away the tracepoints are
from the source.
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
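The "kernel specific mapping" from function names to events mentioned in the
message above could look roughly like a small per-tree table that a dynamic
tracer consults before placing a probe. The sketch below is an assumption
about the shape of such a table, not how SystemTap or any other tool stores
it: struct sketch_probe_map, sketch_resolve() and the renamed function
"sock_sendmsg_renamed" are invented for illustration, while sock_sendmsg and
do_fork are simply the function names that appear in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: maps a stable event name to a probe location. */
struct sketch_probe_map {
	const char *event;	/* what the user wants to trace */
	const char *function;	/* where this particular tree exposes it */
};

/* Table for one kernel tree. */
static const struct sketch_probe_map sketch_map_a[] = {
	{ "socket_sendmsg", "sock_sendmsg" },
	{ "thread_create",  "do_fork" },
};

/* Table for a later, invented tree in which one function was renamed. */
static const struct sketch_probe_map sketch_map_b[] = {
	{ "socket_sendmsg", "sock_sendmsg_renamed" },
	{ "thread_create",  "do_fork" },
};

/* Resolve an event to a function name; NULL if this tree has no entry. */
static const char *sketch_resolve(const struct sketch_probe_map *map,
				  size_t n, const char *event)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++)
		if (strcmp(map[i].event, event) == 0)
			return map[i].function;
	return NULL;
}
```

The event name stays the same across both tables while the function behind
it changes, which is the maintenance burden under dispute: with a table like
this it sits outside the kernel source, with a static marker it sits at the
call site.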
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
@ 2006-09-16 8:20 ` Ingo Molnar
2006-09-16 8:21 ` Ingo Molnar
` (5 subsequent siblings)
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:20 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > this tracepoint, under a dynamic tracing concept, can be replaced with:
> >
> > int __trace sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>
> A nice example where you make life more difficult for static tracers
> for no reason, [...]
No, it's simply a clever feature: "halve the impact of static markups".
What you say will be _precisely_ the kind of situations that make me
very wary of static tracers. Someone does something smart that enables
us to remove half of the tracepoints from the kernel source code, while
you will go on and complain: "why do you make the life harder for static
tracers". You, perhaps unwillingly, are giving the perfect demonstration
of why static tracepoints are a maintenance problem: once added _they
can not be removed without breaking static tracers_.
And i see you didn't reply to (and you didn't even quote) the paragraph
that i believe answers your point:
> > the user of course does not care about kernel internal design and
> > maintenance issues. Think about the many reasons why STREAMS was
> > rejected - users wanted that too. And note that users don't want
> > "static tracers" or any design detail of LTT in particular: what
> > they want is the _functionality_ of LTT.
The kernel tree is not there to make it easier for inferior approaches.
How hard is it for the static tracer folks to take a look at dynamic
tracers and realize that it's the fundamentally better approach, for the
reasons above and for other reasons, and pick the concept up and
integrate it with their code? Just like the STREAMS folks had a chance
to look at the existing TCP/IP implementation in the Linux kernel and
had the chance to realize that it's the better approach. Yet they
insisted on just adding a few hooks here and there, to "make the life
easier for STREAMS".
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
2006-09-16 8:20 ` Ingo Molnar
@ 2006-09-16 8:21 ` Ingo Molnar
2006-09-16 8:21 ` Ingo Molnar
` (4 subsequent siblings)
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:21 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> [...] It would also add virtually no maintenance overhead as you like
> to claim - how often does this function change?
as i said, roughly half of the tracepoints are like this - and some of
them in functions in frequented places. That's far from "virtually no
maintenance overhead". In the -rt tree i have never had more than a
dozen static tracepoints, yet even this small amount caused at least 5
extra -rt tree iterations due to various breakages (build problems or
even crashes).
Cruft comes in small steps, and my worry is that such _unremovable_
markups will be cruft that never shrinks. With dynamic tracers i see the
_chance_ for cruft to shift to places where it does not hurt, if that
cruft turns out to become a hindrance.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
2006-09-16 8:20 ` Ingo Molnar
2006-09-16 8:21 ` Ingo Molnar
@ 2006-09-16 8:21 ` Ingo Molnar
2006-09-16 8:22 ` Ingo Molnar
` (3 subsequent siblings)
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:21 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > > > This is simply not true, at the source level you can remove a
> > > > > static tracepoint as easily as a dynamic tracepoint, the
> > > > > effect of the missing trace information is the same either way.
> > > >
> > > > this is not true. I gave you one example already a few mails ago
> > > > [...]
> > >
> > > Function attributes also doesn't provide information local to the
> > > function.
> >
> > of course, but where does the above tracepoint i quoted use
> > information local to the function? A fair number of markups use
> > global functions because, surprise, a lot of interesting activity
> > happens along global functions. So a healthy reduction in markups
> > can be achieved.
>
> But not completely, which is the whole point.
the point was what you said above, which i claimed and still claim to be
false: "at the source level you can remove a static tracepoint as easily
as a dynamic tracepoint, the effect of the missing trace information is
the same either way."
Your point is still incorrect. I gave you an example of how half of the
tracepoints could be removed under a dynamic scheme - while they
couldn't be removed under a static scheme. Hence that directly
contradicts your contention that "you can remove a static tracepoint as
easily as a dynamic tracepoint". Nothing more, nothing less. I just
pointed out the point in your thinking that i believe to be incorrect.
Reality is that you can remove a dynamic tracepoint much more easily,
due to the fundamental flexibility of dynamic tracers. While with static
tracers, every tracepoint has to be _somewhere_ in the source code,
otherwise people like you will complain just like you did in this mail:
"you make life more difficult for static tracers for no reason".
You can concede my point or you can dispute that argument - but what you
did above was neither: you snipped all the quotations and you claimed a
totally new point. (which new point i never argued with: _of course_ i
never claimed that __trace function attributes can remove _all_ markups.
They can "only" remove half of them.)
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
` (2 preceding siblings ...)
2006-09-16 8:21 ` Ingo Molnar
@ 2006-09-16 8:22 ` Ingo Molnar
2006-09-16 19:58 ` Roman Zippel
2006-09-16 8:23 ` Ingo Molnar
` (2 subsequent siblings)
6 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:22 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > It's possible I missed something, but pretty much anything you
> > > outlined wouldn't make the life of static tracepoints any easier.
> >
> > sorry, but if you re-read the above line of argument, your sentence
> > appears to be a non sequitur. I said "the markers needed for dynamic
> > tracing are different from the LTT static tracepoints". You asked why
> > they are so different, and i replied that i already outlined what the
> > right API would be in my opinion to do markups, but that API is
> > different from what LTT is offering now. To which you are now
> > replying: "pretty much anything you outlined wouldn't make the life
> > of static tracepoints any easier." Huh?
>
> Yeah, huh?
>
> I have no idea what you're trying to tell me. As you demonstrated
> above your "right API" is barely usable for static tracers.
you raise a new point again (without conceding or disputing the point we
were discussing, which point you snipped from your reply) but i'm happy
to reply to this new point too: my suggested API is not "barely usable"
for static tracers but "totally unusable". Did i tell you yet that i
disagree with the addition of markups for static tracers?
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 8:22 ` Ingo Molnar
@ 2006-09-16 19:58 ` Roman Zippel
2006-09-16 22:50 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-16 19:58 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
I don't know why you split this into multiple subthreads and instead of
delving further into secondary issues, please let me get back to the
primary issues to put everything a little into perspective.
The foremost issue is still that there is only limited kprobes support.
The way you ignore this and try to make this a non-issue makes it appear
to me rather arrogant, I appreciate it that you want to push technology
forward, but it's rather ignorant how you leave people behind in the
dust who can't keep up, by making it very hard for them to easily get
access to tracing in the kernel.
Since I have a quite good idea of the amount of work needed to implement
a second rate kprobes hack, first rate kprobes support and first rate
ltt(ng) support, it's a quite simple decision what I'm going to do.
Since your "incentive" to add kprobes support is not very high, it's
more likely to backfire in making you the jerk denying me easy access to
tracing technologies.
Since my options are right now limited to a static tracer in the first
place, most of the issues you mentioned over the various mails become
really moot, e.g. why should I care about the overhead of inactive
traces? We can happily discuss the merits of dynamic tracers forever,
but it does _not_ change my current situation, that I have no access to
one on some machines I care about.
The main issue in supporting static tracers is the tracepoints and so
far I haven't seen any convincing proof that the maintenance overhead of
dynamic and static tracepoints has to be significantly different. What
you did is construct a worst case scenario, which only proves that it's
possible, what it doesn't prove is that there are no measures to prevent
this from happening. This means nobody has proved so far that it's not
possible to create and enforce a set of rules to keep the amount and
effect of tracepoints under control.
Let's take your example of a tracepoint in an area of high development
activity, since such development should happen in -mm, it would be no
problem to drop the trace and add it back once development has calmed
down, exactly like you would do for a dynamic trace. OTOH it's very well
possible some people might find the trace useful during development.
So the problem here is now that you simply work from the unproven
premise that static tracepoints automatically lead to uncontrolled
chaos. This makes a reasonable discussion about managing tracepoints
impossible, since you don't want to support static tracepoints at all.
Ingo, as long as you don't give up this zero tolerance strategy, it
doesn't make much sense to discuss details and I can only hope there are
other people who are more reasonable...
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 19:58 ` Roman Zippel
@ 2006-09-16 22:50 ` Ingo Molnar
2006-09-16 23:00 ` Ingo Molnar
2006-09-16 23:14 ` Ingo Molnar
2 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 22:50 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> I don't know why you split this into multiple subthreads [...]
huh? Maybe because the mail got ... too big?
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 19:58 ` Roman Zippel
2006-09-16 22:50 ` Ingo Molnar
@ 2006-09-16 23:00 ` Ingo Molnar
2006-09-17 1:15 ` Roman Zippel
2006-09-16 23:14 ` Ingo Molnar
2 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:00 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> Since my options are right now limited to a static tracer in first
> place, [...]
Let's see the equation of the current situation. On one side you want
static tracing but you don't want to implement kprobes on m68k -
although you probably could. On the other side there is the main kernel,
which, if it ever accepted static tracepoints, could probably never get
rid of them.
so, you request the main kernel to accept hundreds of static tracepoints
that would probably never go away, just because you are reluctant at the
moment to implement kprobes? And that only to bridge a temporary period
of time when m68k has no kprobes support yet? Combined with the fact
that m68k was just fine without tracing for 13 years?
Did i get that right?
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 23:00 ` Ingo Molnar
@ 2006-09-17 1:15 ` Roman Zippel
2006-09-17 8:42 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 1:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sun, 17 Sep 2006, Ingo Molnar wrote:
> Lets see the equation of the current situation. On one side you want
> static tracing but you dont want to implement kprobes on m68k - although
> you probably could.
You would have a point if it were just about m68k.
> On the other side there is the main kernel, which,
> if it ever accepted static tracepoints, could probably never get rid of
> them.
If they are useful and not hurting anyone, why should we?
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 1:15 ` Roman Zippel
@ 2006-09-17 8:42 ` Ingo Molnar
2006-09-17 15:16 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 8:42 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > On the other side there is the main kernel, which, if it ever
> > accepted static tracepoints, could probably never get rid of them.
>
> If they are useful and not hurting anyone, why should we?
FYI, whether it is true that "they not hurting anyone" is one of those
"secondary issues" that I analyzed in great detail in the emails
yesterday, and which you opted not to "further delve into":
Message-ID: <20060916082347.GG6317@elte.hu>:
' That is a constant kernel maintenance drag that i feel
uncomfortable about. '
Message-ID: <20060916082107.GB6317@elte.hu>:
'That's far from "virtually no maintenance overhead".'
Message-ID: <20060916082054.GA6317@elte.hu>:
'static tracepoints are a maintenance problem: once added _they
can not be removed without breaking static tracers_.'
I still very much opine that your claim that static tracepoints are not
hurting anyone is false: they can cause significant maintenance overhead
in the long run that we cannot remove, and these costs integrate over a
long period of time.
We have statements from two people who have /used and hacked/ LTT in
products and have seen LTT's use, indicating that the maintenance
overhead is nonzero and that the combined number of tracepoints in use
by actual customers is much larger than posited in this thread. We also
have LTT proponents disputing that and suggesting that the long-term
maintenance overhead is very low.
So even taking my opinion out of the picture, the picture is far from
clear. If we put my opinion back into the picture: i base it on my
first-hand experience with tracers. [**]
so at least to me the rule in such a situation is clear: if we have the
choice between two approaches that are useful in similar ways [*] but
one has a larger flexibility to decrease the total maintenance cost,
then we _must_ pick that one.
This really isn't rocket science, we make such decisions every day. We
made that decision for STREAMS too. (which STREAMS argument you ignored
a number of times.) STREAMS was a similar situation: people wanted "just
a few unintrusive hooks which you could compile out" for external
STREAMS functionality to hook into.
and unlike STREAMS, in the LTT case it's not the totality of the project
that is being disputed: i only dispute the static tracing aspect of it,
which is a comparatively small (but intrusive) portion of a project that
consists of a 26,000 lines kernel patchset and a large body of userspace
tools.
Ingo
[*] furthermore, dynamic tracing is not only "similarly useful", it is
_more useful_ because it allows a lot more than static tracing does.
That's why i analyzed the "secondary issue" of the usefulness of
dynamic tracers: the decision gets easier if one of the technologies
is fundamentally more capable.
[**] Also, just yesterday i tried to merge the 2.6.17 version of the LTT
patchset to 2.6.18, and it created non-trivial rejects left and
right. That is a further objective indicator to me - if something
has low maintenance cost, how come its patchset is so intrusive
that it cannot survive 3 months of kernel development flux? If it's
intrusive, shouldn't we have the fundamental option to shift that
maintenance overhead out of the core kernel, back to the people
that want those features?
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 8:42 ` Ingo Molnar
@ 2006-09-17 15:16 ` Roman Zippel
2006-09-17 15:25 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 15:16 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sun, 17 Sep 2006, Ingo Molnar wrote:
> > If they are useful and not hurting anyone, why should we?
>
> FYI, whether it is true that "they not hurting anyone" is one of those
> "secondary issues" that I analyzed in great detail in the emails
> yesterday, and which you opted not to "further delve into":
Ingo, you happily still ignore my primary issues, how seriously do you
expect me to take this?
> so at least to me the rule in such a situation is clear: if we have the
> choice between two approaches that are useful in similar ways [*] but
> one has a larger flexibility to decrease the total maintenance cost,
> then we _must_ pick that one.
That would assume the choices are mutually exclusive, which you haven't
proven at all.
To put everything in yet another perspective: We have the kernel full of
security hooks, which are likely more invasive than any trace marker
ever will be. These security hooks are well hated by a few developers,
but we merged them anyway, because they are useful.
So the big question is now, why should it be impossible to create and
merge a well defined set of markers, which can be used by any tracer?
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 15:16 ` Roman Zippel
@ 2006-09-17 15:25 ` Ingo Molnar
2006-09-17 16:02 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:25 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> Ingo, you happily still ignore my primary issues, how serious do you
> expect me to take this?
I did not ignore your new "primary issues", to the contrary. Please read
my replies. To recap, your "primary issues" are:
> The foremost issue is still that there is only limited kprobes
> support.
> The main issue in supporting static tracers are the tracepoints and so
> far I haven't seen any convincing proof that the maintenance overhead
> of dynamic and static tracepoints has to be significantly different.
to both points i (and others) already replied in great detail - please
follow up on them. (I can quote message-IDs if you cannot find them.)
[ Or if it's not these two then let me know if i missed some important
point - it's easy to miss a valid point in a sea of replies. For
example yesterday i have replied to 7 different issues _you_ raised,
partly issues where you have questioned my credibility and competence,
so i felt compelled to reply - but still you replied to none of those
mails, only declaring them "secondary" in a passing reference. If they
were secondary then why did you raise them in the first place? Or do
you summarily concede all those points by not replying to them? And is
there any guarantee that you will reply to any mails i write to you
now? Will you declare them "secondary" too once the argument appears
to turn unfavorable to your position? ]
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 15:25 ` Ingo Molnar
@ 2006-09-17 16:02 ` Roman Zippel
2006-09-17 16:45 ` Ingo Molnar
2006-09-17 16:59 ` Nick Piggin
0 siblings, 2 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 16:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sun, 17 Sep 2006, Ingo Molnar wrote:
> > The foremost issue is still that there is only limited kprobes
> > support.
>
> > The main issue in supporting static tracers are the tracepoints and so
> > far I haven't seen any convincing proof that the maintenance overhead
> > of dynamic and static tracepoints has to be significantly different.
>
> to both points i (and others) already replied in great detail - please
> follow up on them. (I can quote message-IDs if you cannot find them.)
What you basically tell me is (rephrased to make it more clear): Implement
kprobes support or fuck off! You make it very clear that you're unwilling
to support static tracers, even to the point of making _any_ static trace
support impossible. It's impossible to discuss this with you, because
you're absolutely unwilling to make any concessions. What am I supposed to
do when it's very clear to me that you don't want to make any compromise
anyway?
You leave me _nothing_ to work with, that's the main reason I leave such
things unanswered. AFAICT there is nothing I can do about that other than
just repeating what I told you already anyway, and you'll continue to
ignore it, and I'm sick and tired of it.
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 16:02 ` Roman Zippel @ 2006-09-17 16:45 ` Ingo Molnar 2006-09-17 16:59 ` Nick Piggin 1 sibling, 0 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 16:45 UTC (permalink / raw) To: Roman Zippel Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > to both points i (and others) already replied in great detail - > > please follow up on them. (I can quote message-IDs if you cannot > > find them.) > > What you basically tell me is (rephrased to make it more clear): > Implement kprobes support or fuck off! [...] What i am saying (again and again) is: "the other option you suggest is not acceptable to me because a better solution exists" [for the many reasons outlined before]. Think about the STREAMS example: there too _that_ particular approach was rejected, because a better solution existed. (although it was a _much_ larger body of code that was rejected) I'm not "forcing" kprobes on you: you can invent whatever other approach that solves the problems i and others raised, or you can have your own separate patchset - this is standard kernel acceptance procedure. Granted, kprobes is an existing solution with extensive existing infrastructure, so it's IMO the easiest solution technically, but you are certainly not 'forced' to do it. You want the feature on your architecture _without_ kprobes, solve the problems. > [...] You make it very clear, that you're unwilling to support static > tracers even to point to make _any_ static trace support impossible. > It's impossible to discuss this with you, because you're absolutely > unwilling to make any concessions. [...] Because we either accept the concept of static tracing or not - unfortunately there's no meaningful middle ground. 
I'd love it if there was some meaningful middle-ground, because then we'd
not have this lengthy discussion at all. But sometimes such situations do
happen. The same was true for STREAMS: the only choice was for it to be
either accepted or rejected. One cannot get a "little bit pregnant".
The "add some static markups" suggestion is IMO just tactical pretense:
static tracing will only be fully functional once it grows a comprehensive
set of static tracepoints, so once we accept a "little bit" of static
tracing where all the tools are built around a full set of tracepoints,
we've created an expectation to have all of it.
Hence my suggestion: forget static tracing for the LTT engine and
concentrate on dynamic tracepoints with _static markups_. Do you realize
that dynamic tracers can insert _function calls_ into static markups,
today? [and i'm not talking about djprobes here but current existing
SystemTap behavior.]
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 16:02 ` Roman Zippel 2006-09-17 16:45 ` Ingo Molnar @ 2006-09-17 16:59 ` Nick Piggin 2006-09-17 17:26 ` Roman Zippel 1 sibling, 1 reply; 271+ messages in thread From: Nick Piggin @ 2006-09-17 16:59 UTC (permalink / raw) To: Roman Zippel Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, Roman Zippel wrote: > Hi, > > On Sun, 17 Sep 2006, Ingo Molnar wrote: > > >>>The foremost issue is still that there is only limited kprobes >>>support. >> >>>The main issue in supporting static tracers are the tracepoints and so >>>far I haven't seen any convincing proof that the maintainance overhead >>>of dynamic and static tracepoints has to be significantly different. Above, weren't you asking about static vs dynamic trace-*points*, rather than the implementation of the tracer itself. I think Ingo said that some "static tracepoints" (eg. annotation) could be acceptable. >>to both points i (and others) already replied in great detail - please >>follow up on them. (I can quote message-IDs if you cannot find them.) > > > What you basically tell me is (rephrased to make it more clear): Implement > kprobes support or fuck off! You make it very clear, that you're unwilling > to support static tracers even to point to make _any_ static trace support Now it seems you are talking about compiled vs runtime inserted traces, which is different. And so far I have to agree with Ingo: dynamic seems to be better in almost every way. Implementation may be more complex, but that's never stood in the way of a better solution before, and I don't think anybody has shown it to be prohibitive ("I won't implement it" notwithstanding) -- SUSE Labs, Novell Inc. 
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 16:59 ` Nick Piggin @ 2006-09-17 17:26 ` Roman Zippel 2006-09-17 17:56 ` Nick Piggin 2006-09-17 19:23 ` Ingo Molnar 0 siblings, 2 replies; 271+ messages in thread From: Roman Zippel @ 2006-09-17 17:26 UTC (permalink / raw) To: Nick Piggin Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Mon, 18 Sep 2006, Nick Piggin wrote: > > > > The foremost issue is still that there is only limited kprobes support. > > > > > > > The main issue in supporting static tracers are the tracepoints and so > > > > far I haven't seen any convincing proof that the maintainance overhead > > > > of dynamic and static tracepoints has to be significantly different. > > Above, weren't you asking about static vs dynamic trace-*points*, rather > than the implementation of the tracer itself. I think Ingo said that > some "static tracepoints" (eg. annotation) could be acceptable. No, he made it rather clear, that as far as possible he only wants dynamic annotations (e.g. via function attributes). > > What you basically tell me is (rephrased to make it more clear): Implement > > kprobes support or fuck off! You make it very clear, that you're unwilling > > to support static tracers even to point to make _any_ static trace support > > Now it seems you are talking about compiled vs runtime inserted traces, > which is different. And so far I have to agree with Ingo: dynamic seems > to be better in almost every way. Implementation may be more complex, > but that's never stood in the way of a better solution before, and I > don't think anybody has shown it to be prohibitive ("I won't implement > it" notwithstanding) I don't deny that dynamic tracer are more flexible, but I simply don't have the resources to implement one. 
If those who demand I use a dynamic tracer, would also provide the appropriate funding, it would change the situation completely, but without that I have to live with the tools available to me. bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 17:26 ` Roman Zippel @ 2006-09-17 17:56 ` Nick Piggin 2006-09-17 18:59 ` Roman Zippel 2006-09-17 21:32 ` Ingo Molnar 2006-09-17 19:23 ` Ingo Molnar 1 sibling, 2 replies; 271+ messages in thread From: Nick Piggin @ 2006-09-17 17:56 UTC (permalink / raw) To: Roman Zippel Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Roman Zippel wrote: > Hi, > > On Mon, 18 Sep 2006, Nick Piggin wrote: > > >>Above, weren't you asking about static vs dynamic trace-*points*, rather >>than the implementation of the tracer itself. I think Ingo said that >>some "static tracepoints" (eg. annotation) could be acceptable. > > > No, he made it rather clear, that as far as possible he only wants dynamic > annotations (e.g. via function attributes). OK we must have him interpreted differently. I won't speak for Ingo, but he can respond if he likes. >>Now it seems you are talking about compiled vs runtime inserted traces, >>which is different. And so far I have to agree with Ingo: dynamic seems >>to be better in almost every way. Implementation may be more complex, >>but that's never stood in the way of a better solution before, and I >>don't think anybody has shown it to be prohibitive ("I won't implement >>it" notwithstanding) > > > I don't deny that dynamic tracer are more flexible, but I simply don't > have the resources to implement one. If those who demand I use a dynamic > tracer, would also provide the appropriate funding, it would change the > situation completely, but without that I have to live with the tools > available to me. You definitely don't have to use a dynamic tracer, nor even implement one on m68k (that will presumably happen if/when somebody does want a dynamic tracer enough). 
But equally nobody can demand that a feature go into the upstream
kernel. Especially not if there is a more flexible alternative
already available that just requires implementing for their arch.
This shouldn't be surprising, the kernel doesn't have a doctrine of
unlimited choice or merge features because they exist. For example
people wanted pluggable (runtime and/or compile time) CPU schedulers
in the kernel. This was rejected (IIRC by Linus, Andrew, Ingo, and
myself). No doubt it would have been useful for a small number of
people but it was decided that it would split testing and development
resources. The STREAMS example is another one.
As an aside, there are quite a number of different types of tracing
things (mostly static, compile out) in the kernel. Everything from
blktrace to various userspace notifiers to lots of /proc/stuff could
be considered a type of static event tracing. I don't know what my
point is other than all these big, disjoint frameworks trying to be
pushed into the kernel. Are there any plans for working some things
together, or is that somebody else's problem?
Nick
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 17:56 ` Nick Piggin @ 2006-09-17 18:59 ` Roman Zippel 2006-09-17 21:23 ` Ingo Molnar ` (2 more replies) 2006-09-17 21:32 ` Ingo Molnar 1 sibling, 3 replies; 271+ messages in thread From: Roman Zippel @ 2006-09-17 18:59 UTC (permalink / raw) To: Nick Piggin Cc: Ingo Molnar, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Mon, 18 Sep 2006, Nick Piggin wrote: > But equally nobody can demand that a feature go into the upstream > kernel. Especially not if there is a more flexible alternative > already available that just requires implementing for their arch. I completely agree with you under the condition that these alternatives were mutually exclusive or conflicting with each other. > This shouldn't be surprising, the kernel doesn't have a doctrine of > unlimited choice or merge features because they exist. Do we have a doctrine which forces us to design a feature in such way that has to be as difficult as possible to make it available to our users? In this case it would be very easy to provide some basic functionality via static tracing and the full functionality via dynamic tracing. Where is the law that forbids this? > For example > people wanted pluggable (runtime and/or compile time CPU scheduler > in the kernel. This was rejected (IIRC by Linus, Andrew, Ingo, and > myself). No doubt it would have been useful for a small number of > people but it was decided that it would split testing and development > resources. The STREAMS example is another one. Comparing it to STREAMS is an insult and Ingo should be aware of this. :-( > As an aside, there are quite a number of different types of tracing > things (mostly static, compile out) in the kernel. 
Everything from
> blktrace to various userspace notifiers to lots of /proc/stuff could
> be considered a type of static event tracing. I don't know what my
> point is other than all these big, disjoint frameworks trying to be
> pushed into the kernel. Are there any plans for working some things
> together, or is that somebody else's problem?
All the controversy around static tracing in general and LTT in
particular has prevented this so far...
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 18:59 ` Roman Zippel @ 2006-09-17 21:23 ` Ingo Molnar 2006-09-17 21:52 ` Roman Zippel 2006-09-17 21:40 ` Ingo Molnar 2006-09-18 8:43 ` Jes Sorensen 2 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 21:23 UTC (permalink / raw) To: Roman Zippel Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > For example people wanted pluggable (runtime and/or compile time CPU > > scheduler in the kernel. This was rejected (IIRC by Linus, Andrew, > > Ingo, and myself). No doubt it would have been useful for a small > > number of people but it was decided that it would split testing and > > development resources. The STREAMS example is another one. > > Comparing it to STREAMS is an insult and Ingo should be aware of this. > :-( so in your opinion Nick's mentioning of STREAMS is an insult too? I certainly do not understand Nick's example as an insult. Is STREAMS now a dirty word to you that no-one is allowed to use as an example in kernel maintanance discussions? Let me recap how I mentioned STREAMS for the first time: it was simply the best example i could think of when you asked the following question: > > Why don't you leave the choice to the users? Why do you constantly > > make it an exclusive choice? [...] > > [...] > > the user of course does not care about kernel internal design and > maintainance issues. Think about the many reasons why STREAMS was > rejected - users wanted that too. And note that users dont want > "static tracers" or any design detail of LTT in particular: what they > want is the _functionality_ of LTT. (see <20060915231419.GA24731@elte.hu> for the full context. Tellingly, that point of mine you have left unreplied too.) 
btw., you still have not retracted or corrected your false suggestion that "concessions" or a "compromise" were possible and you did not retract or correct your false accusation that i "dont want to make them": > It's impossible to discuss this with you, because you're absolutely > unwilling to make any concessions. What am I supposed to do than it's > very clear to me, that you don't want to make any compromise anyway? while, as i explained it before, such a concession simply does not exist - so i am not in the position to "make such a concession". There are only two choices in essence: either we accept a generic static tracer, or we reject it. (see <Pine.LNX.4.64.0609171744570.6761@scrub.home>) Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 21:23 ` Ingo Molnar
@ 2006-09-17 21:52 ` Roman Zippel
2006-09-17 22:27 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 21:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sun, 17 Sep 2006, Ingo Molnar wrote:
> btw., you still have not retracted or corrected your false suggestion
> that "concessions" or a "compromise" were possible and you did not
> retract or correct your false accusation that i "don't want to make
> them":
Sorry, I have nothing to retract and I'm not interested in playing your
word games. :-(
> > It's impossible to discuss this with you, because you're absolutely
> > unwilling to make any concessions. What am I supposed to do when it's
> > very clear to me, that you don't want to make any compromise anyway?
>
> while, as i explained it before, such a concession simply does not exist
> - so i am not in the position to "make such a concession". There are
> only two choices in essence: either we accept a generic static tracer,
> or we reject it.
Wrong, this is about the minimum support, which can be used by both static
and dynamic tracers.
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 21:52 ` Roman Zippel @ 2006-09-17 22:27 ` Ingo Molnar 0 siblings, 0 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 22:27 UTC (permalink / raw) To: Roman Zippel Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > On Sun, 17 Sep 2006, Ingo Molnar wrote: > > > btw., you still have not retracted or corrected your false suggestion > > that "concessions" or a "compromise" were possible and you did not > > retract or correct your false accusation that i "dont want to make > > them": > > Sorry, I have nothing to retract and I'm not interesting in playing > your word games. :-( you are wrong if you call my asking you to retract your false suggestion and false accusation a "word game". It is my basic right to point out misrepresentations, false statements, false accusations and misinterpretations when i see them. The sentences i pointed out were not just opinions, they were materially false statements of yours. But you are of course free to not retract or correct them (or to not dispute my characterization of them as such). Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 18:59 ` Roman Zippel 2006-09-17 21:23 ` Ingo Molnar @ 2006-09-17 21:40 ` Ingo Molnar 2006-09-18 8:43 ` Jes Sorensen 2 siblings, 0 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 21:40 UTC (permalink / raw) To: Roman Zippel Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > As an aside, there are quite a number of different types of tracing > > things (mostly static, compile out) in the kernel. Everything from > > blktrace to various userspace notifiers to lots of /proc/stuff could > > be considered a type of static event tracing. I don't know what my > > point is other than all these big, disjoint frameworks trying to be > > pushed into the kernel. Are there any plans for working some things > > together, or is that somebody else's problem? > > All the controversy around static tracing in general and LTT in > specific has prevented this so far... BLKTRACE is a special-purpose tracing facility limited to one subsystem and written and maintained by the /same/ person (Jens) who maintains that subsystem. He maintains the subsystem, the tracer and the userspace tool that extracts the tracer data. LTT on the other hand is a static tracer that affects _all_ subsystems. That is a very different situation from a maintainance overhead POV, and i believe you must know that. your suggestion that this controversy has prevented consolidation in this area is baseless and misleading, please correct or retract it. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 18:59 ` Roman Zippel
2006-09-17 21:23 ` Ingo Molnar
2006-09-17 21:40 ` Ingo Molnar
@ 2006-09-18 8:43 ` Jes Sorensen
2 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-18 8:43 UTC (permalink / raw)
To: Roman Zippel
Cc: Nick Piggin, Ingo Molnar, Thomas Gleixner, karim, Andrew Morton,
Paul Mundt, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Roman Zippel wrote:
> Hi,
>
> On Mon, 18 Sep 2006, Nick Piggin wrote:
>
>> But equally nobody can demand that a feature go into the upstream
>> kernel. Especially not if there is a more flexible alternative
>> already available that just requires implementing for their arch.
>
> I completely agree with you under the condition that these alternatives
> were mutually exclusive or conflicting with each other.
Roman, I don't get this, you are arguing that we should put it in
because it doesn't do any damage. First of all it does, by adding a lot
of clutter all over the place. Second, if we take that argument, then we
should allow anybody to put in anything they want, are you also
suggesting we put devfs back in?
Point is that the Linux kernel gets so many proposals, some are good,
some are bad, and some, while maybe looking like a good idea at the
beginning, turn out later to be a bad idea - LTT falls into this
category. *However*, it doesn't mean the knowledge and tools that were
developed with LTT are bad or useless.
To take another related project, look at relayfs. There was so much
noise about it when it was initially pushed, yuck I even remember how it
was suggested that printk should be implemented via relayfs. But look at
it now, there is no fs/relayfs/* these days. The kernel moved on, used
the knowledge obtained and provided the feature in a better way -
exactly like it is being proposed to do for trace points, by using
dynamic probes.
Jes ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 17:56 ` Nick Piggin
2006-09-17 18:59 ` Roman Zippel
@ 2006-09-17 21:32 ` Ingo Molnar
1 sibling, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 21:32 UTC (permalink / raw)
To: Nick Piggin
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> As an aside, there are quite a number of different types of tracing
> things (mostly static, compile out) in the kernel. Everything from
> blktrace to various userspace notifiers to lots of /proc/stuff could
> be considered a type of static event tracing. I don't know what my
> point is other than all these big, disjoint frameworks trying to be
> pushed into the kernel. Are there any plans for working some things
> together, or is that somebody else's problem?
AFAIK Jens has indicated interest in seeing experiments that would try
to replace blktrace with dynamic tracepoints, so it's being worked on.
but yes, that would be the general idea: to turn all existing ad-hoc
tracing/debugging points in the kernel into static SystemTap markers or
SystemTap scripts.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 17:26 ` Roman Zippel 2006-09-17 17:56 ` Nick Piggin @ 2006-09-17 19:23 ` Ingo Molnar 2006-09-17 19:45 ` Roman Zippel 1 sibling, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 19:23 UTC (permalink / raw) To: Roman Zippel Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > [...] I think Ingo said that some "static tracepoints" (eg. > > annotation) could be acceptable. > > No, he made it rather clear, that as far as possible he only wants > dynamic annotations (e.g. via function attributes). what you say is totally and utterly nonsensical misrepresentation of what i have said. I always said: i support in-source annotations too (I even suggested APIs how to do them), as long as they are not a total _guaranteed_ set destined for static tracers, i.e. as long as they are there for the purpose of dynamic tracers. I dont _care_ about static annotations as long as they are there for dynamic tracers, because they can be moved into scripts if they cause problems. But static annotations for static tracers are much, much harder to remove. Please go on and read my "tracepoint maintainance models" email: Message-ID: <20060917143623.GB15534@elte.hu> Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 19:23 ` Ingo Molnar @ 2006-09-17 19:45 ` Roman Zippel 2006-09-17 20:56 ` Ingo Molnar 0 siblings, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-17 19:45 UTC (permalink / raw) To: Ingo Molnar Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Sun, 17 Sep 2006, Ingo Molnar wrote: > > > [...] I think Ingo said that some "static tracepoints" (eg. > > > annotation) could be acceptable. > > > > No, he made it rather clear, that as far as possible he only wants > > dynamic annotations (e.g. via function attributes). > > what you say is totally and utterly nonsensical misrepresentation of > what i have said. I always said: i support in-source annotations too (I > even suggested APIs how to do them), Some consistency would certainly help: 'my suggested API is not "barely usable" for static tracers but "totally unusable".' <20060916082214.GD6317@elte.hu> ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-17 19:45 ` Roman Zippel @ 2006-09-17 20:56 ` Ingo Molnar 2006-09-17 21:36 ` Roman Zippel 0 siblings, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-17 20:56 UTC (permalink / raw) To: Roman Zippel Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > Hi, > > On Sun, 17 Sep 2006, Ingo Molnar wrote: > > > > > [...] I think Ingo said that some "static tracepoints" (eg. > > > > annotation) could be acceptable. > > > > > > No, he made it rather clear, that as far as possible he only wants > > > dynamic annotations (e.g. via function attributes). > > > > what you say is totally and utterly nonsensical misrepresentation of > > what i have said. I always said: i support in-source annotations too (I > > even suggested APIs how to do them), > > Some consistency would certainly help: 'my suggested API is not > "barely usable" for static tracers but "totally unusable".' I am really sorry that you were able to misunderstand and misrepresent such a simple sentence. Let me quote the full paragraph of what i said: > you raise a new point again (without conceding or disputing the point > we were discussing, which point you snipped from your reply) but i'm > happy to reply to this new point too: my suggested API is not "barely > usable" for static tracers but "totally unusable". Did i tell you yet > that i disagree with the addition of markups for static tracers? this makes it clear that i disagree with adding static markups for static tracers - but i of course still agree with static markups for _dynamic tracers_. 
The markups would be totally unusable for static tracers because there
is no guarantee for the existence of static markups _everywhere_: the
static markups would come and go, as per the "tracepoint maintenance
model". Do you understand that or should i explain it in more detail?
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 20:56 ` Ingo Molnar
@ 2006-09-17 21:36 ` Roman Zippel
2006-09-17 22:13 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-17 21:36 UTC (permalink / raw)
To: Ingo Molnar
Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
On Sun, 17 Sep 2006, Ingo Molnar wrote:
> > Some consistency would certainly help: 'my suggested API is not
> > "barely usable" for static tracers but "totally unusable".'
>
> I am really sorry that you were able to misunderstand and misrepresent
> such a simple sentence.
Considering the context, which is not exactly full of support for static
tracers, I think my understanding was and still is quite correct.
Let's take <20060915231419.GA24731@elte.hu>, where you suggest converting
as many tracepoints as possible to this API, thus excluding a lot of
information from static tracers.
> this makes it clear that i disagree with adding static markups for
> static tracers - but i of course still agree with static markups for
> _dynamic tracers_. The markups would be totally unusable for static
> tracers because there is no guarantee for the existence of static
> markups _everywhere_: the static markups would come and go, as per the
> "tracepoint maintenance model". Do you understand that or should i
> explain it in more detail?
Well, I'd rather just wait for the real patch, where you can show your
support for all possible users.
bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 21:36 ` Roman Zippel
@ 2006-09-17 22:13 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 22:13 UTC (permalink / raw)
To: Roman Zippel
Cc: Nick Piggin, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > I am really sorry that you were able to misunderstand and
> > misrepresent such a simple sentence.
>
> Considering the context, which is not exactly full of support for
> static tracer, I think my understanding was and still is quite
> correct.
this thought of yours is still false. Nick said:
' I think Ingo said that some "static tracepoints" (eg. annotation)
could be acceptable. '
to which you replied:
' No, he made it rather clear, that as far as possible he only wants
dynamic annotations (e.g. via function attributes). '
That "No" word at the beginning of your sentence, by its plain meaning,
falsely questions Nick's correct interpretation of what I said. I ask
you to retract or correct this false statement.
Nick is of course correct: i said before that some static markups could
be acceptable. In fact, i even outlined a possible API for such static
markups in 20060914231956.GB29229@elte.hu.
Would I want to reduce the number of such static markups: of course,
not wanting to reduce the number of subsystem-functionality unrelated
source code lines would be foolish.
> > this makes it clear that i disagree with adding static markups for
> > static tracers - but i of course still agree with static markups for
> > _dynamic tracers_. The markups would be totally unusable for static
> > tracers because there is no guarantee for the existence of static
> > markups _everywhere_: the static markups would come and go, as per
> > the "tracepoint maintenance model". Do you understand that or
> > should i explain it in more detail?
>
> Well, I rather just wait for the real patch, where you can show your
> support for all possible users.
this answer of yours does not rectify the false statement you made.
Your sentence also introduces a new misrepresentation of my intentions:
my intention with partial static markups (which intention i've written
to you about before, so it was known to you when you wrote this
sentence) is not to support "all possible users", but to support
dynamic tracers. Static tracers cannot use static markups that go away
into dynamic tracing scripts.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 19:58 ` Roman Zippel
2006-09-16 22:50 ` Ingo Molnar
2006-09-16 23:00 ` Ingo Molnar
@ 2006-09-16 23:14 ` Ingo Molnar
2006-09-17 14:19 ` Frank Ch. Eigler
[not found] ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
2 siblings, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 23:14 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> [...] instead of delving further into secondary issues, please let me
> get back to the primary issues [...]
here's a list of some of those "secondary issues" that we were
discussing, and which you opted not to "further delve into":
firstly, a factually wrong statement of yours:
> [...] any tracepoints have a maintenance overhead, which is barely
> different between dynamic and static tracing [...]
secondly, a factually wrong statement of yours:
> [...] at the source level you can remove a static tracepoint as easily
> as a dynamic tracepoint, [...]
thirdly, a factually wrong statement of yours:
> [...] It would also add virtually no maintenance overhead [...]
[see the previous mails for the full context on these items.]
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 23:14 ` Ingo Molnar
@ 2006-09-17 14:19 ` Frank Ch. Eigler
2006-09-17 15:31 ` Ingo Molnar
[not found] ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
1 sibling, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-17 14:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Ingo Molnar <mingo@elte.hu> writes:
> [...]
> firstly, a factually wrong statement of yours:
>
> > [...] any tracepoints have a maintenance overhead, which is barely
> > different between dynamic and static tracing [...]
If one totals the fixup effort required across the programmers who need
to do the work, I would concur with the OP; or if there is a
difference, it is in favour of the static markers.
It is unfortunate that all the talk about maintenance has been almost
entirely aloof and disconnected from empirical examples. It would be
much better if we were able to sketch out plausible designs for static
instrumentation and similar dynamic probes, and carry out gedanken
experiments about how they would need to adapt to realistic examples of
code drift. It is not the case that all "maintenance" is alike.
> secondly, a factually wrong statement of yours:
>
> > [...] at the source level you can remove a static tracepoint as easily
> > as a dynamic tracepoint, [...]
It is not hard to imagine commenting out a single line; nor inserting
the equivalent of "#define NDEBUG" at the head of the .c file to
disable them all for the whole compilation unit.
The retort that "this would break the entire tracing system" does not
hold water without far more argument. Missing events do not
necessarily a totally broken system make. (Renamed or changed events
may even be mapped back via a translation layer.) Tracing events need
not become as firmly fixed (unremovable or unchangeable) a user
interface as the syscalls.
> thirdly, a factually wrong statement of yours:
>
> > [...] It would also add virtually no maintenance overhead [...]
Yes, the knife cuts both ways: both cost ongoing effort. The question
is how much; who would do the work; who is better able to do the work;
who (users/developers) receives value from the work. The overall
cost/benefit calculation is far more complicated than pithy lines about
"no maintenance" or its opposite.
As for the possibilities of kprobes performance improvements: bring
them on, they're great. It is however almost certain that, because of
reasons like debugging-information imperfection or absence, compiler
optimizations, and different deployment scenarios, some un-probable
blind spots would remain in a kprobes-only probing system.
As for Karim's proposed comment-based markers, I don't have a strong
opinion, not being one whose kernel-side code would be marked up one
way or the other. My intuition suggests that, if the runtime costs of
a dormant static marker are low enough, they should be just compiled in
by default. And if they are compiled in, then by golly, compile them
in honestly and don't hide them. Something like build-time
multilibbing seems like too much effort to trade one eyesore for a
different eyesore. But that's just my opinion, I could be wrong.
- FChE
^ permalink raw reply [flat|nested] 271+ messages in thread
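The "#define NDEBUG"-style switch mentioned in the message above can be
sketched in user-space C. This is only an illustration under stated
assumptions: TRACE_MARK, TRACE_DISABLED, trace_emit() and
fake_schedule() are hypothetical names invented here, not part of
LTTng, kprobes, or any kernel API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Counts markers that actually fired (illustration only). */
static int trace_events_emitted;

/* Hypothetical back end: prints "<event name>: <formatted args>". */
static void trace_emit(const char *name, const char *fmt, ...)
{
	va_list ap;

	assert(name != NULL && fmt != NULL);
	trace_events_emitted++;
	fprintf(stderr, "%s: ", name);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
}

/*
 * Defining TRACE_DISABLED above this point (the analogue of
 * "#define NDEBUG" for assert) turns every marker in this
 * compilation unit into an empty statement.
 */
#ifdef TRACE_DISABLED
#define TRACE_MARK(name, ...) do { } while (0)
#else
#define TRACE_MARK(name, ...) trace_emit(#name, __VA_ARGS__)
#endif

/* Stand-in for an instrumented function; not real scheduler code. */
static int fake_schedule(int prev_pid, int next_pid)
{
	TRACE_MARK(sched_switch, "prev=%d next=%d", prev_pid, next_pid);
	return next_pid;
}
```

Building the same file with -DTRACE_DISABLED leaves fake_schedule()
with no marker code at all, which is the per-file removal the message
describes. The sketch says nothing about the cost of a dormant marker
that stays compiled in.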
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 14:19 ` Frank Ch. Eigler
@ 2006-09-17 15:31 ` Ingo Molnar
2006-09-17 17:15 ` Mathieu Desnoyers
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:31 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Frank Ch. Eigler <fche@redhat.com> wrote:
> As for Karim's proposed comment-based markers, I don't have a strong
> opinion, not being one whose kernel-side code would be marked up one
> way or the other. [...]
What makes the difference isn't just the format of markup (although i
fully agree that the least visually intrusive markup format should be
used for static markers, and the range of possibilities includes
comment-based markers too), but what makes the difference is:
the /guarantee/ of a full (comprehensive) set to /static tracers/
The moment we allow a static tracer into the upstream kernel, we make
that guarantee, implicitly and explicitly. (I've expanded on this line
of argument in the previous few mails, extensively.)
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-17 15:31 ` Ingo Molnar
@ 2006-09-17 17:15 ` Mathieu Desnoyers
0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-17 17:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: Frank Ch. Eigler, Roman Zippel, Thomas Gleixner, karim,
Andrew Morton, Paul Mundt, Jes Sorensen, linux-kernel,
Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi,
ltt-dev, Michel Dagenais
Hi,
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Frank Ch. Eigler <fche@redhat.com> wrote:
>
> > As for Karim's proposed comment-based markers, I don't have a strong
> > opinion, not being one whose kernel-side code would be marked up one
> > way or the other. [...]
>
> What makes the difference isn't just the format of markup (although i
> fully agree that the least visually intrusive markup format should be
> used for static markers, and the range of possibilities includes
> comment-based markers too), but what makes the difference is:
>
> the /guarantee/ of a full (comprehensive) set to /static tracers/
>
> The moment we allow a static tracer into the upstream kernel, we make
> that guarantee, implicitly and explicitly. (I've expanded on this line
> of argument in the previous few mails, extensively.)
>
Ingo, your definition of a static tracer seems to be slightly off from
LTTng's reality in two ways:
First, the kernel tracer supports dynamically loadable "event types",
which makes it considerably more flexible than a static tracer that
would have to guarantee a full set of trace points. There is a clear
difference between statically adding instrumentation and statically
adding new event types, in that forcing a static set of events would
indeed break the user space tools when an event is added or removed.
Second, the user space analysis tools are built so that they can handle
missing information. So, if they lack things like scheduler change or
irq entry/exit events, they will still show the available information.
No "breakage" would result from a missing probe. Moreover, the LTTV
trace analysis tool being modular and plugin-based, developers can
choose whether or not to load a given analysis based on the
instrumentation present in the traced kernel.
So there is no guarantee of any full instrumentation set: both
instrumentation and analysis tools are extensible by the users when
needed.
Mathieu
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
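One way to picture the "analysis tools tolerate missing events" point
from the message above is a plugin that declares which event types it
needs and is simply skipped when the trace lacks them. The sketch below
is a guess at the general shape only; struct analysis and both
functions are hypothetical and do not reflect the actual LTTV plugin
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical analysis plugin descriptor (not the LTTV API). */
struct analysis {
	const char *name;
	const char *const *required;	/* NULL-terminated event names */
};

/* True if the NULL-terminated list 'present' contains event 'ev'. */
static bool trace_has_event(const char *const *present, const char *ev)
{
	for (; *present; present++)
		if (strcmp(*present, ev) == 0)
			return true;
	return false;
}

/*
 * An analysis is loadable only when every event type it requires was
 * recorded; other analyses remain usable on the same trace.
 */
static bool analysis_can_run(const struct analysis *a,
			     const char *const *present)
{
	assert(a != NULL && present != NULL);
	for (const char *const *r = a->required; *r; r++)
		if (!trace_has_event(present, *r))
			return false;
	return true;
}
```

With a trace that contains only syscall entry/exit events, an analysis
requiring "syscall_entry" would load while one requiring
"sched_switch" would be skipped, rather than the whole tool failing.
The event names are placeholders chosen for the example.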
[parent not found: <y0mu036eglz.fsf@ton.toronto.redhat.com>]
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
[not found] ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
@ 2006-09-17 15:00 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-17 15:00 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Roman Zippel, Thomas Gleixner, karim, Andrew Morton, Paul Mundt,
Jes Sorensen, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Frank Ch. Eigler <fche@redhat.com> wrote:
> [...] It would be much better if we were able to sketch out plausible
> designs for static instrumentation and similar dynamic probes, and
> carry out gedanken experiments about how they would need to adapt to
> realistic examples of code drift. It is not the case that all
> "maintenance" is alike.
see my previous mail - hopefully that explains my position even
clearer.
A number of people have expressed doubts about the all-static model
(i'm amongst them) - and that's all based on actual experience. So
there's no need for Gedanken-experiments, because we've got real-life
experiments :-)
A number of people also have expressed that they think an all-static
markup model is the right one - and that's based on experience as well.
Just looking at the opinions objectively and excluding my opinion i'd
say that the most likely model will thus be a _hybrid_ one: some
markups will be static, some will be dynamic. Whether a tracepoint will
be static or dynamic will depend on the 'flux of changes' in the
tracing code and of the code they trace. If tracing code has a high
flux, or the traced code has a high flux, then the lowest maintenance
overhead is to have a dynamic tracepoint. If _both_ the tracing code
and the traced code has low flux of changes, then the lowest
maintenance overhead will be a static markup.
Put differently: dynamic markups will turn into static markups if the
code that they handle "cools down". Static markups will turn into
dynamic markups if the code where they reside in gets "too hot" or if
the markups themselves are "too hot".
But one thing is sure: with a static tracer model accepted into the
kernel we are forced to have a comprehensive, always-maintained, full
set of static markups in the tree, for a long time. Dynamic tracers
will still be around, but we won't be able to fully benefit from the
more flexible tracepoint maintenance models they allow, because we'll
always have to carry around the static markups, for the sake of static
tracers. There will likely be periodic friction about how many static
markups there should be in the source: subsystem maintainers will want
them out, static-trace-users will want them in. If a crucial static
markup is removed or damaged then the kernel will regress materially.
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
` (3 preceding siblings ...)
2006-09-16 8:22 ` Ingo Molnar
@ 2006-09-16 8:23 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:23 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > > > > - a marker for dynamic tracing has lower performance impact
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > than a static tracepoint, on systems that are not being
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > traced. (but which have the tracing infrastructure enabled
            ^^^^^^
> > > > > > otherwise)
> > > > >
> > > > > Anyone using static tracing intends to use it, which makes this
> > > > > point moot.
> > > >
> > > > that's not at all true, on multiple grounds:
> > > >
> > > > Firstly, many people use distro kernels. A Linux distribution
> > > > typically wants to offer as few kernel rpms as possible (one per
> > > > arch to be precise), but it also wants to offer as many features
> > > > as possible. So if there was a static tracer in there, a distro
> > > > would enable it - but 99.9% of the users would never use it - still
> > > > they would see the overhead. Hence the user would have it enabled,
> > > > but does not intend to use it - which contradicts your statement.
> > >
> > > So if dynamic tracing is available use it, as distributions
> > > already do. OTOH the barrier to use static tracing is drastically
> > > different whether the user has to deal with external patches or
> > > whether it's a simple kernel option. Again, static tracing doesn't
> > > exclude the possibility of dynamic tracing, that's something you
> > > constantly omit and thus make it sound like both options were
> > > mutually exclusive.
> >
> > how does this reply to my point that: "a marker for dynamic tracing has
> > lower performance impact than a static tracepoint, on systems that are
> > not being traced", which point you claimed moot?
>
> Because it's pretty much an implementation issue. [...]
No, that's my point, it's not an "implementational issue" of static
tracers, the overhead of markups for static tracers is:
_inherent to their concept of being compile-time and static_
ok?
> [...] The point is about adding markers at all, it's about the choice
> being able to use static tracers in the first place. [...]
your characterization of "the point" is at odds with the specific point
that we are discussing - see the underlined sentence above, right at
the top of the quotes:
> > > > > > - a marker for dynamic tracing has lower performance impact
> > > > > > than a static tracepoint, on systems that are not being
> > > > > > traced. (but which have the tracing infrastructure enabled
Please either concede the point or dispute it, before shifting to new
grounds. Thanks,
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
` (4 preceding siblings ...)
2006-09-16 8:23 ` Ingo Molnar
@ 2006-09-16 8:23 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:23 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > > Secondly, even people who intend to _eventually_ make use of
> > > > tracing, dont use it most of the time. So why should they have
> > > > more overhead when they are not tracing? Again: the point is not
> > > > moot because even though the user intends to use tracing, but
> > > > does not always want to trace.
> > >
> > > I've used kernels which included static tracing and the performance
> > > overhead is negligible for occasional use.
> >
> > how does this suddenly make my point, that "a marker for dynamic
> > tracing has lower performance impact than a static tracepoint, on
> > systems that are not being traced", "moot"?
>
> Why exactly is the point relevant in the first place? How exactly is
> the added (minor!) overhead such a fundamental problem?
how could a fundamental performance difference between two markup
schemes be not relevant to kernel design decisions? Which performance
difference i claim derives straight from the conceptual difference
between the two approaches and is thus "unfixable" (and not an
"implementational issue").
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-16 0:31 ` Roman Zippel
` (5 preceding siblings ...)
2006-09-16 8:23 ` Ingo Molnar
@ 2006-09-16 8:23 ` Ingo Molnar
6 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-16 8:23 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, karim, Andrew Morton, Paul Mundt, Jes Sorensen,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > Why don't you leave the choice to the users? Why do you constantly
> > > make it an exclusive choice? [...]
> >
> > as i outlined it tons of times before: once we add markups for static
> > tracers, we cannot remove them. That is a constant kernel maintenance
> > drag that i feel uncomfortable about.
>
> As many, many people have already said, any tracepoints have a
> maintenance overhead, which is barely different between dynamic and
> static tracing and only increases the further away the tracepoints are
> from the source.
i have demonstrated that with dynamic tracers it's possible to have:
"half the number of tracepoints" or "no tracepoints at all", right in
the traced kernel source. That way we are able to shift away the
maintenance overhead from the subsystem which is being traced to the
person who _wants_ to do the tracing (instead of on the person who
maintains the code that is being traced), in a fine-grained way.
But even the secondary metric, the "sum of all maintenance, including
the maintenance of tracepoints" can become lower with dynamic tracers:
if a subsystem changes with a much higher frequency than the tracing
scripts follow.
Let me try to explain it to you with other words: if all tracing is
done via scripts and no in-source tracepoints at all, then we could for
example update the tracing scripts only once per release. A subsystem
might undergo a heavy cycle of updates, changing functions that are
traced many times: i call this a "high frequency update to the source
code". If tracing is done via tracepoints for static tracers, then such
"high frequency updates to the source code" have to "carry with them"
all the markups.
It might be zero overhead if a subsystem has no tracepoints, but it
might be a lot more complex too. For example, I can tell you that the
-rt tree has a number of very useful scheduling tracepoints but which
are also a constant maintenance hindrance. For example i even have a
separate _function_ that is a helper to one of the tracepoints. And
this was the _bare minimum_ of static tracepoints i needed for the
purposes of visualizing and analyzing scheduling patterns in the -rt
tree (either on my boxes or on users' boxes). Occasionally users needed
a lot more tracepoints.
So i am talking from first-hand experience. This maintenance overhead
occurred (and still occurs) to /me/, so please don't try to tell me
that the maintenance overhead is minimal. Even "half the tracepoints"
would be great. And i only have a dozen tracepoints, not hundreds!
Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:23 ` Thomas Gleixner
2006-09-15 20:40 ` Roman Zippel
@ 2006-09-15 21:05 ` Karim Yaghmour
2006-09-15 21:17 ` Thomas Gleixner
1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:05 UTC (permalink / raw)
To: tglx
Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Thomas Gleixner wrote:
> Stop whining!
I resent that. If your efforts in working on popular kernel topics met
rapid reward then I'm happy for you. The fact that others tackle
unpopular topics and persist despite constant personal attacks should
nevertheless be recognized for what it is.
> LTT did not manage to solve the problem in a generic,
You're entirely correct. I never claimed it to be perfect, that's why I
had approached others early on to try to bridge things together and
that's why I used to post ltt patches to the lkml.
> mainline acceptable way. If you really believe that Kprobes / Systemtap
> is just a $corporate maliciousness to kick you out of business, then I
> really start to doubt your sanity.
If that's how it was read, then it wasn't written right. ltt was never
really a profit center for me, embedded Linux training was -- you
wouldn't believe how much more profitable training is than pure
consulting. But my own business is just beside the point.
My point was that the high barrier to entry for tracing fragmented
efforts around it. As for corporate decisions which culminated from
such resistance, they probably were the sanest decision to take at the
time. Heck if I was a manager at any of those companies I would have
likely taken the same decision. It was, and still is, though,
counterproductive. Fully justifiable, but counterproductive.
Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:05 ` Karim Yaghmour
@ 2006-09-15 21:17 ` Thomas Gleixner
2006-09-15 21:31 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 21:17 UTC (permalink / raw)
To: karim
Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
On Fri, 2006-09-15 at 17:05 -0400, Karim Yaghmour wrote:
> Thomas Gleixner wrote:
> > Stop whining!
>
> I resent that.
See last sentence of this mail.
> If your efforts in working on popular kernel topics
> met rapid reward then I'm happy for you. The fact that others tackle
> unpopular topics and persist despite constant personal attacks should
> nevertheless be recognized for what it is.
Oh well. I'm working on unpopular and intrusive stuff as long as you
do. Just our ways to work and communicate differ slightly.
> > mainline acceptable way. If you really believe that Kprobes / Systemtap
> > is just a $corporate maliciousness to kick you out of business, then I
> > really start to doubt your sanity.
>
> If that's how it was read, then it wasn't written right
Ouch. Can you please tell me what's the technical merit of this
paragraph:
" ... The only reasons there are separate project teams is because
managers in key positions made the decision that they'd rather break
from existing projects which had had little success mainlining and
instead use their corporate bodyweight to pressure/seduce kernel
developers working for them into pushing their new great
which-absolutely-has-nothing-to-do-with-this-ltt-crap-(no,no, we
actually agree with you kernel developers that this is crap, this is
why we're developing this new amazing thing). That's the truth plain
and simple."
Sorry, I have not found a way to interpret it usefully.
tglx
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:17 ` Thomas Gleixner
@ 2006-09-15 21:31 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:31 UTC (permalink / raw)
To: tglx
Cc: Andrew Morton, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Thomas Gleixner wrote:
> Oh well. I'm working on unpopular and intrusive stuff as long as you do.
Well, I won't debate that shall I :)
> Just our ways to work and communicate differ slightly.
Maybe so. Any wisdom would be greatly appreciated.
> Sorry, I have not found a way to interpret it usefully.
See my response to Ingo on this topic.
Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:16 ` Andrew Morton
2006-09-15 18:19 ` Ingo Molnar
2006-09-15 19:35 ` Thomas Gleixner
@ 2006-09-15 20:00 ` Mathieu Desnoyers
2006-09-15 20:27 ` Jose R. Santos
2006-09-15 20:37 ` Alan Cox
3 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:00 UTC (permalink / raw)
To: Andrew Morton
Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Tom Zanussi, ltt-dev, Michel Dagenais
* Andrew Morton (akpm@osdl.org) wrote:
> Of course, if they are properly designed, the one set of tracepoints could
> be used by different tracing backends - that allows us to separate the
> concepts of "tracepoints" and "tracing backends".
If I try to develop your idea a little further, we could think of
dividing the tracing problem into four layers:
- tracepoints (where the code is instrumented)
    - identifying code
    - accessing data surrounding the code
- tracing backend (how to add the tracepoints)
- tracing infrastructure (what code will serialize the information)
- data extraction (getting the data out to disk, network, ...)
I think that, if we agree on this segmentation of the problem, this
thread is generally debating the tracing backends and their respective
limitations. I just want to point out that the patch I have submitted
addresses mainly the "tracing infrastructure" and "data extraction"
topics.
Regards,
Mathieu
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:00 ` Mathieu Desnoyers
@ 2006-09-15 20:27 ` Jose R. Santos
0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 20:27 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
Ingo Molnar, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Mathieu Desnoyers wrote:
> * Andrew Morton (akpm@osdl.org) wrote:
> > Of course, if they are properly designed, the one set of tracepoints could
> > be used by different tracing backends - that allows us to separate the
> > concepts of "tracepoints" and "tracing backends".
>
> If I try to develop your idea a little further, we could think of dividing the
> tracing problem into four layers :
>
> - tracepoints (where the code is instrumented)
>     - identifying code
>     - accessing data surrounding the code
> - tracing backend (how to add the tracepoints)
> - tracing infrastructure (what code will serialize the information)
> - data extraction (getting the data out to disk, network, ...)
>
I think you are missing user-space post-processing, which should also
be considered part of the problem since the capabilities of
post-processing will be limited by the "tracepoints" available.
Tracepoints and post-processing are also the problems which need to be
addressed first between the other established tracing projects before
going forward with in-kernel solutions.
> I think that, if we agree on this segmentation of the problem, this thread is
> generally debating on the tracing backends and their respective limitations.
> I just want to point out that the patch I have submitted adresses mainly the
> "tracing infrastructure" and "data extraction" topics.
>
This seems like a good idea to dissect the problem, since it seems like
other important issues relevant to general tracing are being ignored
simply because of a dislike of the way LTTng has chosen to implement
trace.
-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:16 ` Andrew Morton
` (2 preceding siblings ...)
2006-09-15 20:00 ` Mathieu Desnoyers
@ 2006-09-15 20:37 ` Alan Cox
2006-09-15 20:26 ` Mathieu Desnoyers
` (2 more replies)
3 siblings, 3 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 20:37 UTC (permalink / raw)
To: Andrew Morton
Cc: tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
On Fri, 2006-09-15 at 11:16 -0700, Andrew Morton wrote:
> What Karim is sharing with us here (yet again) is the real in-field
> experience of real users (ie: not kernel developers).
A lot of us have plenty of experience helping customers and end users
trace bugs. That's a good part of why we get paid in the first place.
> What I _am_ concerned about with this patchset is all the infrastructural
> goop which backs up those tracepoints. I'd have thought that a better
> approach would be to make those explicit tracepoints be "helpers" for the
> existing kprobe code.
If you put explicit tracepoints in they will be compiled out for end
users. If you have a script which hits the standard tracepoint set
it'll be usable by end users.
> Of course, if they are properly designed, the one set of tracepoints could
> be used by different tracing backends - that allows us to separate the
> concepts of "tracepoints" and "tracing backends".
There are more than two layers. The first question is "how do I trace
event XYZ" which seems to be the big debate. The second is "how do I
find XYZ" which seems to have some commonality. The third is "what do I
do when the event is hit", which kprobes provides to all the existing
consumers such as systemtap and can field into arrays for graph
plotting and the like.
Ignoring the question of static compiled in trace points kprobes
appears to have solved the problem space. Everyone else can use the
kprobes interfaces to do pretty much anything computationally viable.
I am sceptical about static tracepoints in critical spots because if
they make the variable easy to access they will reduce optimisations
and that will cost a lot more than 5 or 6 clocks.
In addition ideally we want a mechanism that is also sufficient that
printk can be mangled into so that you can pull all the printk text
strings _out_ of the kernel and into the debug traces for embedded
work.
[ie you want printk("Oh dear %s exploded.\n", foo->bar); to end up with
"Oh dear %s exploded.\n" out of kernel and in kernel
    tracepoint_printk(foo->bar);
maybe with minimal type info (although that can be pulled at debug time
from the string spat into the debug data).]
Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
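The printk idea at the end of the message above (format strings kept
out of the kernel image, only the arguments recorded at run time) can
be illustrated with a small user-space sketch. Everything here is an
assumption made for illustration: the explicit format id, the
fixed-size record, and the names fmt_table, trace_rec and render() are
invented; the message itself only names tracepoint_printk(foo->bar)
and does not specify how the string and the record are matched up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical ids; a real scheme might derive them at build time. */
enum { FMT_OH_DEAR = 0, FMT_COUNT };

/* Lives only in the post-processing tool, not in the traced image. */
static const char *const fmt_table[FMT_COUNT] = {
	[FMT_OH_DEAR] = "Oh dear %s exploded.\n",
};

/* What the traced side records: an id plus the raw argument. */
struct trace_rec {
	unsigned int fmt_id;
	char arg[32];
};

static struct trace_rec ring[16];
static unsigned int ring_len;

/* Traced side: stores the argument, never touches the format text. */
static void tracepoint_printk(unsigned int fmt_id, const char *arg)
{
	struct trace_rec *r;

	assert(arg != NULL);
	if (ring_len >= sizeof(ring) / sizeof(ring[0]))
		return;			/* buffer full: drop the event */
	r = &ring[ring_len++];
	r->fmt_id = fmt_id;
	strncpy(r->arg, arg, sizeof(r->arg) - 1);
	r->arg[sizeof(r->arg) - 1] = '\0';
}

/* Post-processing side: reunites the record with its format string. */
static int render(const struct trace_rec *r, char *out, size_t outlen)
{
	if (r->fmt_id >= FMT_COUNT)
		return -1;		/* unknown id: nothing to render */
	return snprintf(out, outlen, fmt_table[r->fmt_id], r->arg);
}
```

Calling tracepoint_printk(FMT_OH_DEAR, "foo") and then render() on the
stored record reproduces the original message text outside the traced
code. The sketch handles a single string argument only; the "minimal
type info" question raised in the message is left open.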
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:37 ` Alan Cox
@ 2006-09-15 20:26 ` Mathieu Desnoyers
2006-09-15 20:51 ` Karim Yaghmour
2006-09-17 17:53 ` Mathieu Desnoyers
2 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 20:26 UTC (permalink / raw)
To: Alan Cox
Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
Ingo Molnar, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais
Hi,
* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> In addition ideally we want a mechanism that is also sufficient that
> printk can be mangled into so that you can pull all the printk text
> strings _out_ of the kernel and into the debug traces for embedded work.
>
> [ie you want printk("Oh dear %s exploded.\n", foo->bar); to end up with
> "Oh dear %s exploded.\n" out of kernel and in kernel
>
> tracepoint_printk(foo->bar);
>
Good idea, trivial to implement on top of LTTng. When seeing printk's
reentrancy limitations, I have thought about doing it a couple of
times.
Mathieu
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:37 ` Alan Cox
2006-09-15 20:26 ` Mathieu Desnoyers
@ 2006-09-15 20:51 ` Karim Yaghmour
2006-09-17 17:53 ` Mathieu Desnoyers
2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 20:51 UTC (permalink / raw)
To: Alan Cox
Cc: Andrew Morton, tglx, Paul Mundt, Jes Sorensen, Roman Zippel,
Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

Alan Cox wrote:
> A lot of us have plenty of experience helping customers and end users
> trace bugs. That's a good part of why we get paid in the first place.

But of course, and I wouldn't dare compare my experience with yours.
FWIW, though, I submit to you that there is a difference between
helping a customer trace something and actually attempting to create
a tool which standard users can use to trace their own stuff. Then,
again, my experience may just be lacking.

Here's an example just for the fun of it: I was giving a class at a
customer's site. It so happened they scheduled this class right after
product delivery (advice: this is a mistake.) And, predictably, in came
the technician asking for Joe, out went Joe, in came Joe, repeat. They
spent quite some time after hours trying to figure this one out.
Midweek, they asked if I could help, they were having some odd behavior
in user-space on a custom-developed board. Try as I might, none of the
standard user-space stuff was effective. Ok, time to try ltt. Now this
was a "vendor" kernel, with preemption (ok, I'm not telling who, but
this was definitely before Ingo's work) -- the sort of which I hadn't
dabbled in before. I spent the evening trying to figure out how the
heck the thing worked to no avail -- the locking mechanisms were just
wrong for what ltt needed at the time.

Last day I asked him if they could get a *normal* kernel on there and
someone somewhere found an odd-port stable enough to run. So got an ltt
patch, customized it for said kernel (would have had to do something
similar if it were probe points instead of static traces), got a trace,
and within 5 minutes we had found a bug in their custom hardware (and
no, their drivers were just fine).

This customer would not have even needed me or needed to waste their
time if he had been able to get a trace for his bastardized kernel. But
the way the anti-static-instrumentation creed goes this customer would
still have needed me ... or someone else ... <conspiracy> wait a
minute, maybe that's not a coincidence ... </conspiracy> ;)

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:37 ` Alan Cox
2006-09-15 20:26 ` Mathieu Desnoyers
2006-09-15 20:51 ` Karim Yaghmour
@ 2006-09-17 17:53 ` Mathieu Desnoyers
2 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-17 17:53 UTC (permalink / raw)
To: Alan Cox
Cc: Andrew Morton, tglx, karim, Paul Mundt, Jes Sorensen, Roman Zippel,
Ingo Molnar, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> In addition ideally we want a mechanism that is also sufficient that
> printk can be mangled into so that you can pull all the printk text
> strings _out_ of the kernel and into the debug traces for embedded work.

Hi,

I just implemented a printk instrumentation that logs the printks into
LTTng traces ASAP in order to keep the time causality correct. It can
be found in LTTng 0.5.112.

Regards,

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:51 ` Karim Yaghmour
2006-09-15 15:00 ` Thomas Gleixner
@ 2006-09-15 15:24 ` Alan Cox
2006-09-15 15:23 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 15:24 UTC (permalink / raw)
To: karim
Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 10:51 -0400, Karim Yaghmour wrote:
> The static tracepoints we maintained were *the* solution for a great

I think you mean "a" solution. You've not proved there are no others.

> deal many people. As a maintainer I had two choices with those who
> were not content:
> a- Maintain their tracepoints for them -- not happening.
> b- Suggest they contribute to helping getting a generic tracing
> infrastructure into the kernel and then make their case on the
> lkml as to the pertinence of their instrumentation.

b has been done, it's called kprobes. We just need better tools for the
dynamic probes.

> choice of tracepoints. Those who were using ltt for its designated
> purpose -- allowing normal users and developers to get an accurate
> view of the behavior of their system -- were very happy with it.

and you can maintain "Karim's probe list" which is the dynamic probe set
which matches your old static probes, only of course it's now much more
flexible.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 15:24 ` Alan Cox
@ 2006-09-15 15:23 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:23 UTC (permalink / raw)
To: Alan Cox
Cc: Paul Mundt, Jes Sorensen, Roman Zippel, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

Alan Cox wrote:
> b has been done, it's called kprobes. We just need better tools for the
> dynamic probes.

As long as there needs to be the updating of an outside piece of
something then "b" hasn't been done. Especially with regards to what
this means to figuring out which of kernel or instrumentation-script
is broken when you get bug reports on lkml.

> and you can maintain "Karim's probe list" which is the dynamic probe set
> which matches your old static probes, only of course it's now much more
> flexible.

Sorry, the issue isn't about my probe list. The issue is that there
needs to be a way of pointing important events without having to
modify things at 3 or 4 different places. The only way this can be
done is if it's in the tree -- regardless of the mechanism. This isn't
about static tracers vs. dynamic tracers, it's about statically
marking code. What goes underneath is secondary. And if the static
markup -- which even the SystemTap people are interested in -- is but
a hook for further selecting the appropriate instrumentation
mechanism, then that's fine too.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:31 ` Karim Yaghmour
2006-09-15 14:28 ` Paul Mundt
@ 2006-09-15 14:39 ` Jes Sorensen
2006-09-15 15:04 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 14:39 UTC (permalink / raw)
To: karim
Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Karim Yaghmour wrote:
> Jes Sorensen wrote:
>> Because other people have tried to use LTT for additional projects,
>> but said projects haven't been integrated into LTT. In other words,
>> just because *you* haven't added those, doesn't mean someone else
>> won't try and do it later, if LTT was integrated.
>
> Thank you. I will take it as a compliment and likely laminate this
> email for your suggestion that I've acted responsibly in my
> maintenance of ltt. Boy, can you imagine what this debate would
> have looked like if I had included precisely those additional
> projects ...

Karim,

Thank you for this, it just proves that taking this discussion any
further is a waste of everybody's time.

> C'mon Jes, if I was able to responsibly maintain ltt over 5
> years *out* of the tree and I'm being labeled as incompetent all
> over this thread, then imagine what the very competent people
> maintaining the kernel could actually do.

Nobody ever said you were irresponsible, but you are claiming that you
are able to define a finite set of static tracepoints that are relevant
to everybody. Or in other words, they are defined as being the ones
relevant to you.

Please read Paul Mundt's response to your email - it's bang on, couldn't
put it any better myself.

Jes
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:39 ` Jes Sorensen
@ 2006-09-15 15:04 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 15:04 UTC (permalink / raw)
To: Jes Sorensen
Cc: Paul Mundt, Roman Zippel, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Jes Sorensen wrote:
> Thank you for this, it just proves that taking this discussion any
> further is a waste of everybody's time.

Sorry you feel this way.

> Nobody ever said you were irresponsible, but you are claiming that you
> are able to define a finite set of static tracepoints that are relevant
> to everybody. Or in other words, they are defined as being the ones
> relevant to you.

No, I'm precisely not claiming that the tracepoints I was looking for
were "relevant to everybody". They are, however, very relevant to any
standard sysadmin or developer who wants to get a better picture of
what his kernel is doing. Again, please refer to figure 2 of this
article and explain to me why it's not relevant for standard users and
developers to understand when these events happen inside the kernel:
http://www.usenix.org/events/usenix2000/general/full_papers/yaghmour/yaghmour_html/index.html

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 18:15 ` Ingo Molnar
2006-09-14 18:35 ` Mathieu Desnoyers
2006-09-14 18:54 ` Karim Yaghmour
@ 2006-09-14 19:40 ` Tim Bird
2006-09-14 20:00 ` Ingo Molnar
2006-09-15 11:40 ` Alan Cox
2006-09-14 19:47 ` Roman Zippel
3 siblings, 2 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-14 19:40 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> * Roman Zippel <zippel@linux-m68k.org> wrote:
>
>>> for me these are all _independent_ grounds for rejection, as a generic
>>> kernel infrastructure.
>> Tracepoints of course need to be managed, but that's true for both
>> dynamic and static tracepoints. [...]
>
> that's not true, and this is the important thing that i believe you are
> missing. A dynamic tracepoint is _detached_ from the normal source code
> and thus is zero maintenance overhead. You don't have to maintain it
> during normal development - only if you need it. You don't see the
> dynamic tracepoints in the source code.

It's only zero maintenance overhead for you. Someone has to
maintain it. The party line for years has been that in-tree
maintenance is easier than out-of-tree maintenance.

>
> a static tracepoint, once it's in the mainline kernel, is a nonzero
> maintenance overhead _until eternity_. It is a constant visual
> hindrance and a constant build-correctness and boot-correctness problem
> if you happen to change the code that is being traced by a static
> tracepoint. Again, I am talking out of actual experience with static
> tracepoints: i frequently break my kernel via static tracepoints and i
> have constant maintenance cost from them. So what i do is that i try to
> minimize the number of static tracepoints to _zero_. I.e. i only add
> them when i need them for a given bug.

Ingo - I'm sure you are doing things at a level where static
tracepoints impose a significant perturbation to the code. However,
if you look historically at the set of static tracepoints that people
have used with Linux (with LTT or LKST), they are really not too bad
to maintain. I'm repeating what others have said, but I've been
working with LTT and LTTng for several years, and the tracepoints
haven't changed very much in that time. Heck, I've even brought LTTng
up on new kernel versions and new architectures. How hard could it be
if I can do it? ;-) (Of course, who knows if I did it right? - since
it's out-of-tree it doesn't get as much testing.)

The set of static tracepoints (or markers) that is envisioned is in
the range of about 30 to 40 key kernel events. Dynamic tracepoints
would be used for other stuff. I don't want to offend you, but I
suspect your usage model for tracepoints is different from what the
expected (and historical) usage model would be for LTTng-style static
tracepoints.

>
> static tracepoints are inferior to dynamic tracepoints in almost every
> way.
>
>> [...] Both have their advantages and disadvantages and just hammering
>> on the possible problems of static ones [...]
>
> how about giving a line by line rebuttal to the very real problems of
> static tracepoints i listed (twice already), instead of calling them
> "possible problems"?

I respect your experience, but I think it would be more productive to
have this debate when a patch is submitted with a static tracepoint
(or marker) implementation. The patch in question, if I understand
correctly, provides infrastructure for tracing activities and should
hopefully be useful for either static or dynamic tracepoints.

I'm hoping someone from the SystemTAP camp can speak up and give
their opinion on whether this is useful. If it is, then the whole
debate about static vs. dynamic tracepoints is less important. If not,
then that's a different debate.

I maintain Kernel Function Trace (KFT) out-of-tree. This is a system
which uses compiler flags to instrument every kernel function entry
and exit. For obvious reasons this type of instrumentation is used
only during development, but it has proven quite handy for certain
development tasks (finding long-duration routines and finding bloated
call sequences). I can imagine KFT using the infrastructure that is
provided by the LTTng-core patch (and relinquishing my own
infrastructure for activation, trace control, event handling etc.)

Regards,
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 19:40 ` Tim Bird
@ 2006-09-14 20:00 ` Ingo Molnar
2006-09-14 20:46 ` Karim Yaghmour
2006-09-14 21:02 ` Roman Zippel
2006-09-15 11:40 ` Alan Cox
1 sibling, 2 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 20:00 UTC (permalink / raw)
To: Tim Bird
Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

* Tim Bird <tim.bird@am.sony.com> wrote:
> > that's not true, and this is the important thing that i believe you
> > are missing. A dynamic tracepoint is _detached_ from the normal
> > source code and thus is zero maintenance overhead. You don't have to
> > maintain it during normal development - only if you need it. You
> > don't see the dynamic tracepoints in the source code.
>
> It's only zero maintenance overhead for you. Someone has to maintain
> it. The party line for years has been that in-tree maintenance is
> easier than out-of-tree maintenance.

There's a third option, and that's the one i'm advocating: adding the
tracepoint rules to the kernel, but in a _detached_ form from the actual
source code.

yes, someone has to maintain it, but that will be a detached effort, on
a low-frequency as-needed basis. It doesn't slow down or hinder
high-frequency fast prototyping work, it does not impact the source code
visually, and it does not make reading the code harder. Furthermore,
while a single broken LTT tracepoint prevents the kernel from building
at all, a single broken dynamic rule just won't be inserted into the
kernel. All the other rules are still very much intact.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
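The claim that one broken dynamic rule is simply not inserted while the others stay intact can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `kernel_symbols` stands in for the running kernel's symbol table, and `probe_rule` and `arm_rules` are hypothetical names, not part of kprobes or LTTng.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the symbols exported by the running kernel. */
static const char *const kernel_symbols[] = {
    "schedule", "do_fork", "sys_open",
};

/* A detached rule: it names a probe target instead of living in the
 * traced source file. */
struct probe_rule {
    const char *symbol;
    int armed;
};

static int symbol_exists(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(kernel_symbols) / sizeof(kernel_symbols[0]); i++)
        if (strcmp(kernel_symbols[i], name) == 0)
            return 1;
    return 0;
}

/* Arm every rule whose target still exists; a stale rule is skipped
 * and does not affect its neighbours. Returns the number armed. */
static size_t arm_rules(struct probe_rule *rules, size_t n)
{
    size_t i, armed = 0;

    assert(rules != NULL || n == 0);
    for (i = 0; i < n; i++) {
        rules[i].armed = symbol_exists(rules[i].symbol);
        if (rules[i].armed)
            armed++;
    }
    return armed;
}
```

With three rules where the middle one names a function that has since been renamed, `arm_rules()` arms the other two and leaves the stale one disarmed. A static trace statement referring to a renamed variable would instead stop the build, which is the contrast being drawn.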
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:00 ` Ingo Molnar
@ 2006-09-14 20:46 ` Karim Yaghmour
2006-09-19 12:05 ` Christoph Hellwig
2006-09-14 21:02 ` Roman Zippel
1 sibling, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 20:46 UTC (permalink / raw)
To: Ingo Molnar
Cc: Tim Bird, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> There's a third option, and that's the one i'm advocating: adding the
> tracepoint rules to the kernel, but in a _detached_ form from the actual
> source code.
>
> yes, someone has to maintain it, but that will be a detached effort, on
> a low-frequency as-needed basis. It doesn't slow down or hinder
> high-frequency fast prototyping work, it does not impact the source code
> visually, and it does not make reading the code harder. Furthermore,
> while a single broken LTT tracepoint prevents the kernel from building
> at all, a single broken dynamic rule just won't be inserted into the
> kernel. All the other rules are still very much intact.

Actually the way ltt used to add its trace-statements is again an
implementation issue. Broken tracepoints need not lead to kernel
build failure. That's where the markers idea can be useful. What
a marker should do is but provide location. It doesn't need to
specify the variables being observed or anything local, though it
doesn't mean the infrastructure shouldn't allow for this if the
maintainer of the code wanted to.

Ideally, though, markers should be self-contained. IOW, the person
implementing such a marker should not need to edit any other file
than the one being worked on to add an instrumentation point --
at least that's the way I think is easiest. What this means is that
you would be able to add an instrumentation point in the kernel,
build it, run the tracing and view the trace with your new event
without any further intervention on any tool, header, or anything
else.

The only way that I believe this can be done is with a flexible
marker infrastructure that has a few basic properties:
- Markers should be inlined (clearly this is the bone of contention
  at this point of the thread.)
- By default, all markers should not generate a single instruction,
  nor modify any instruction path that would be generated should
  the instrumentation not be there.
- Allow the person instrumenting to specify which variables they
  are interested in without any possibility of build failure should
  the code change making the variable obsolete.
- Build options should be added allowing users to:
  - Keep instrumentation disabled.
  - Create inlined trace points.
  - Create dynamic instrumentation markers.
- Automatically generate appropriate information required for tools
  to be able to deal with the new instrumentation and/or display
  new information properly -- possibly in a new section of the binary.
- etc.

Again, the goal is to have the loop from instrumentation to
visualization as simple as possible. Any instrumentation requiring
more than a single-file modification is bound to fall into bitrot,
and fast.

Hope this helps.

Thanks,

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
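Two of the marker properties listed above (nothing generated by default, and a build option that turns a marker into an inlined trace point) can be sketched in plain C. This is only a userspace illustration under assumptions: `MARK`, `MARKERS_ENABLED`, `marker_emit` and `fake_schedule` are hypothetical names, not an existing kernel or LTTng interface, and the sketch does not address the requirement that a stale variable must not break the build.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical build option: set to 0 and every MARK() expands to an
 * empty statement, so the instrumented function compiles as if the
 * marker had never been written. */
#define MARKERS_ENABLED 1

static unsigned int marker_hits;
static char marker_last[128];

/* Stand-in for a trace backend: formats the event into a buffer. */
static void marker_emit(const char *name, const char *fmt, ...)
{
    char body[96];
    va_list ap;

    assert(name != NULL);
    va_start(ap, fmt);
    vsnprintf(body, sizeof(body), fmt, ap);
    va_end(ap);
    snprintf(marker_last, sizeof(marker_last), "%s: %s", name, body);
    marker_hits++;
}

#if MARKERS_ENABLED
#define MARK(name, fmt, ...) marker_emit(#name, fmt, __VA_ARGS__)
#else
#define MARK(name, fmt, ...) do { } while (0)
#endif

/* The marker sits in the one file being worked on; no header or
 * tool needs editing to add it. */
static int fake_schedule(int prev_pid, int next_pid)
{
    MARK(sched_change, "prev=%d next=%d", prev_pid, next_pid);
    return next_pid;
}
```

With `MARKERS_ENABLED` set to 1, a call such as `fake_schedule(1, 2)` records one `sched_change` event. With it set to 0 the call site vanishes at preprocessing time, which is the "not a single instruction" default. A dynamic-marker build option would need a different expansion, for instance one that only records the location in a separate section.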
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:46 ` Karim Yaghmour
@ 2006-09-19 12:05 ` Christoph Hellwig
0 siblings, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:05 UTC (permalink / raw)
To: Karim Yaghmour
Cc: Ingo Molnar, Tim Bird, Roman Zippel, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

On Thu, Sep 14, 2006 at 04:46:21PM -0400, Karim Yaghmour wrote:
> Ideally, though, markers should be self-contained. IOW, the person
> implementing such a marker should not need to edit any other file
> than the one being worked on to add an instrumentation point --
> at least that's the way I think is easiest. What this means is that
> you would be able to add an instrumentation point in the kernel,
> build it, run the tracing and view the trace with your new event
> without any further intervention on any tool, header, or anything
> else.

Just in case my first mail on this subject wasn't clear enough I
completely agree with that statement. Complex traces detached from the
actual source code are an utter maintenance nightmare and should be
avoided for anything but spontaneous debugging. For that case they are
of course immensely useful.

Thus we need two forms to specify probes, and to not make the tracing
an utter mess they need to share as much infrastructure as possible.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:00 ` Ingo Molnar
2006-09-14 20:46 ` Karim Yaghmour
@ 2006-09-14 21:02 ` Roman Zippel
1 sibling, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: Tim Bird, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:
> > It's only zero maintenance overhead for you. Someone has to maintain
> > it. The party line for years has been that in-tree maintenance is
> > easier than out-of-tree maintenance.
>
> There's a third option, and that's the one i'm advocating: adding the
> tracepoint rules to the kernel, but in a _detached_ form from the actual
> source code.
>
> yes, someone has to maintain it, but that will be a detached effort, on
> a low-frequency as-needed basis. It doesn't slow down or hinder
> high-frequency fast prototyping work, it does not impact the source code
> visually, and it does not make reading the code harder. Furthermore,
> while a single broken LTT tracepoint prevents the kernel from building
> at all, a single broken dynamic rule just won't be inserted into the
> kernel. All the other rules are still very much intact.

This pretty much contradicts existing experience, most core events are
rather static - a schedule event is a schedule event no matter how the
actual scheduler is implemented.
Separate tracepoints are like separate documentation, they are forgotten
by the developers, who could easily keep them up to date if they were
close to the source.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 19:40 ` Tim Bird
2006-09-14 20:00 ` Ingo Molnar
@ 2006-09-15 11:40 ` Alan Cox
2006-09-15 11:46 ` Roman Zippel
1 sibling, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 11:40 UTC (permalink / raw)
To: Tim Bird
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Thu, 2006-09-14 at 12:40 -0700, Tim Bird wrote:
> It's only zero maintenance overhead for you. Someone has to
> maintain it. The party line for years has been that in-tree
> maintenance is easier than out-of-tree maintenance.

That misses the entire point. If you have dynamic tracepoints you don't
have any static tracepoints to maintain because you don't need them.

They may be a clock or three slower but you are then going to branch
into the trace tool code paths, take tlb misses, take cache misses, and
eventually get back, so the cost of it being dynamic is so close to
zero in the bigger picture it doesn't matter.

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 11:40 ` Alan Cox
@ 2006-09-15 11:46 ` Roman Zippel
2006-09-15 12:38 ` Alan Cox
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 11:46 UTC (permalink / raw)
To: Alan Cox
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:
> On Thu, 2006-09-14 at 12:40 -0700, Tim Bird wrote:
> > It's only zero maintenance overhead for you. Someone has to
> > maintain it. The party line for years has been that in-tree
> > maintenance is easier than out-of-tree maintenance.
>
> That misses the entire point. If you have dynamic tracepoints you don't
> have any static tracepoints to maintain because you don't need them.

This assumes dynamic tracepoints are generally available, which is wrong.
This assumes that dynamic tracepoints can't benefit from static source
annotations, which is also wrong.
He doesn't miss the point at all, dynamic tracepoints don't imply zero
maintenance overhead.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 11:46 ` Roman Zippel
@ 2006-09-15 12:38 ` Alan Cox
2006-09-15 12:39 ` Roman Zippel
2006-09-15 17:45 ` Andrew Morton
0 siblings, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 12:38 UTC (permalink / raw)
To: Roman Zippel
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 13:46 +0200, Roman Zippel wrote:
> > That misses the entire point. If you have dynamic tracepoints you don't
> > have any static tracepoints to maintain because you don't need them.
>
> This assumes dynamic tracepoints are generally available, which is wrong.

Wrong in what sense, you don't have them implemented or your
architecture is mindbogglingly braindead you can't implement them ?

> This assumes that dynamic tracepoints can't benefit from static source
> annotations, which is also wrong.

gcc -g produces extensive annotations which are then usable by many
tools other than gdb.

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 12:38 ` Alan Cox
@ 2006-09-15 12:39 ` Roman Zippel
2006-09-15 13:41 ` Alan Cox
2006-09-15 17:45 ` Andrew Morton
1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 12:39 UTC (permalink / raw)
To: Alan Cox
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:
> On Fri, 2006-09-15 at 13:46 +0200, Roman Zippel wrote:
> > > That misses the entire point. If you have dynamic tracepoints you don't
> > > have any static tracepoints to maintain because you don't need them.
> >
> > This assumes dynamic tracepoints are generally available, which is wrong.
>
> Wrong in what sense, you don't have them implemented or your
> architecture is mindbogglingly braindead you can't implement them ?
>
> > This assumes that dynamic tracepoints can't benefit from static source
> > annotations, which is also wrong.
>
> gcc -g produces extensive annotations which are then usable by many
> tools other than gdb.

Both points have very strong consequences regarding complexity. Why do you
want to deny me the choice to use something simple, especially since both
solutions are not mutually exclusive and can even complement each other?
What's the point in forcing everyone to use a single solution?

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 12:39 ` Roman Zippel
@ 2006-09-15 13:41 ` Alan Cox
2006-09-15 13:34 ` Roman Zippel
2006-09-15 18:10 ` Jose R. Santos
0 siblings, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 13:41 UTC (permalink / raw)
To: Roman Zippel
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 14:39 +0200, Roman Zippel wrote:
> Both points have very strong consequences regarding complexity. Why do you
> want to deny me the choice to use something simple, especially since both
> solutions are not mutually exclusive and can even complement each other?

I don't want to deny you the choice, I just don't want to see
unnecessary garbage in the base kernel. What you put in your own toilet
is a private matter. What you leave out in a public place is different.

> What's the point in forcing everyone to use a single solution?

Maintainability ? common good over individual weirdnesses ? Ability for
people to concentrate on getting one good set of interfaces not twelve
bad ones ? Consistency for user space ?

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:41 ` Alan Cox
@ 2006-09-15 13:34 ` Roman Zippel
2006-09-15 14:41 ` Alan Cox
2006-09-15 18:10 ` Jose R. Santos
1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 13:34 UTC (permalink / raw)
To: Alan Cox
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Alan Cox wrote:
> On Fri, 2006-09-15 at 14:39 +0200, Roman Zippel wrote:
> > Both points have very strong consequences regarding complexity. Why do you
> > want to deny me the choice to use something simple, especially since both
> > solutions are not mutually exclusive and can even complement each other?
>
> I don't want to deny you the choice, I just don't want to see
> unnecessary garbage in the base kernel. What you put in your own toilet
> is a private matter. What you leave out in a public place is different.

Now we've already sunk to the toilet level... :-(

> > What's the point in forcing everyone to use a single solution?
>
> Maintainability ? common good over individual weirdnesses ? Ability for
> people to concentrate on getting one good set of interfaces not twelve
> bad ones ? Consistency for user space ?

Alan, you're making things up without any proof.
Listening to this diatribe against static tracepoints, one could get the
idea they would be something alien, which would pollute the source. Well,
everything can be abused, but good tracepoints are like good
documentation, nobody wants to write and maintain it, but in the end
others benefit from it if it exists.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:34 ` Roman Zippel
@ 2006-09-15 14:41 ` Alan Cox
2006-09-15 14:35 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:41 UTC (permalink / raw)
To: Roman Zippel
Cc: Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 15:34 +0200, Roman Zippel wrote:
> > Maintainability ? common good over individual weirdnesses ? Ability for
> > people to concentrate on getting one good set of interfaces not twelve
> > bad ones ? Consistency for user space ?
>
> Alan, you're making things up without any proof.

Welcome to my killfile. There isn't much point having a discussion with
anyone who considers any view or fact not in agreement as "no proof"
and any view or fact that favours them as "proven".

In the meantime perhaps the saner members of the static trace brigade
can explain why gcc debug data isn't good enough for them when it's good
enough for kgdb to do single stepping at source level and variable
printing ?

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:41 ` Alan Cox
@ 2006-09-15 14:35 ` Karim Yaghmour
2006-09-15 14:58 ` Alan Cox
0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:35 UTC (permalink / raw)
To: Alan Cox
Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Alan Cox wrote:
> In the meantime perhaps the saner members of the static trace brigade
> can explain why gcc debug data isn't good enough for them when it's good
> enough for kgdb to do single stepping at source level and variable
> printing ?

Care to explain how I can use it to implement the equivalent of this:

@@ -1709,6 +1712,7 @@ switch_tasks:
        ++*switch_count;

        prepare_arch_switch(rq, next);
+       TRACE_SCHEDCHANGE(prev, next);
        prev = context_switch(rq, prev, next);
        barrier();

Also, care to explain how kprobes can be used to access the same data
without having to actually customize a probe point for every binary?

Thanks,

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
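As a rough illustration of the kind of static tracepoint shown in the diff above, here is a minimal userspace C sketch. It is not the LTTng implementation: `CONFIG_TRACE_SKETCH`, `trace_schedchange_record()`, `struct task` and `schedule_sketch()` are hypothetical names made up for this example, and only the `TRACE_SCHEDCHANGE(prev, next)` call site comes from the patch. The point it tries to show is that the marker lives in the source next to the code it describes and expands to an empty statement when tracing is configured out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical build-time switch, standing in for a Kconfig option. */
#ifndef CONFIG_TRACE_SKETCH
#define CONFIG_TRACE_SKETCH 1
#endif

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
    int pid;
};

/* Number of events recorded so far (illustration only). */
static int trace_events;

/* Hypothetical recording function; a real tracer would write to a buffer. */
static void trace_schedchange_record(const struct task *prev,
                                     const struct task *next)
{
    trace_events++;
    printf("sched change: %d -> %d\n", prev->pid, next->pid);
}

#if CONFIG_TRACE_SKETCH
/* Tracing configured in: the marker calls the recording function. */
#define TRACE_SCHEDCHANGE(prev, next) trace_schedchange_record((prev), (next))
#else
/* Tracing configured out: the marker expands to an empty statement. */
#define TRACE_SCHEDCHANGE(prev, next) do { } while (0)
#endif

/* Simplified scheduling step: trace the switch, then return the new task. */
static struct task *schedule_sketch(struct task *prev, struct task *next)
{
    assert(prev != NULL && next != NULL);
    TRACE_SCHEDCHANGE(prev, next);
    return next;
}
```

With tracing enabled, `schedule_sketch(&a, &b)` records one event. Building with `-DCONFIG_TRACE_SKETCH=0` keeps the marker in the source while the call itself disappears from the compiled code.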
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:35 ` Karim Yaghmour
@ 2006-09-15 14:58 ` Alan Cox
2006-09-15 14:57 ` Karim Yaghmour
` (3 more replies)
0 siblings, 4 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 14:58 UTC (permalink / raw)
To: karim
Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Ar Gwe, 2006-09-15 am 10:35 -0400, ysgrifennodd Karim Yaghmour:
> Care to explain how I can use it to implement the equivalent of this:
>
> @@ -1709,6 +1712,7 @@ switch_tasks:
>         ++*switch_count;
>
>         prepare_arch_switch(rq, next);
> +       TRACE_SCHEDCHANGE(prev, next);
>         prev = context_switch(rq, prev, next);
>         barrier();

The gdb debug data lets you find each line and also the variable
assignments (except when highly optimised in some cases). Try
breakpointing there with kgdb and using "where"... A kgdb script is the
wrong way to do instrumentation but it does demonstrate the information
is already out there, automatically generated and self maintaining.

You do need the gdb -g debug data, but equally if it was static you'd
need to recompile with the tracepoint because it would be off by
default, and there is a very small risk in both cases you'll disturb or
change the code behaviour/flow.

> Also, care to explain how kprobes can be used to access the same data
> without having to actually customize a probe point for every binary?

That's why we have things like systemtap.

All we appear to lack is systemtap ability to parse debug data so it can
be told "trace on line 9 of sched.c and record rq and next"

Alan
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:58 ` Alan Cox
@ 2006-09-15 14:57 ` Karim Yaghmour
2006-09-15 17:49 ` Andrew Morton
2006-09-15 17:01 ` Tim Bird
` (2 subsequent siblings)
3 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 14:57 UTC (permalink / raw)
To: Alan Cox
Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Alan Cox wrote:
> The gdb debug data lets you find each line and also the variable
> assignments (except when highly optimised in some cases). Try
> breakpointing there with kgdb and using "where"... A kgdb script is the
> wrong way to do instrumentation but it does demonstrate the information
> is already out there, automatically generated and self maintaining.
>
> You do need the gdb -g debug data, but equally if it was static you'd
> need to recompile with the tracepoint because it would be off by
> default, and there is a very small risk in both cases you'll disturb or
> change the code behaviour/flow.
...
> That's why we have things like systemtap.
>
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

Thanks for the explanation. But I submit to you that both explanations
actually highlight the argument I was making earlier: dynamic tracing
(and gdb info in this case) actually requires a non-expert to chase
kernel versions and create appropriate scripts/config-info for the
post-insertion of instrumentation, with the risks to kernel developers
this may have (ex.: bug report to lkml from user claiming to have
discovered problem in subsystem when, in fact, trace point by external
maintainer was ill-chosen.)

Cheers,

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:57 ` Karim Yaghmour
@ 2006-09-15 17:49 ` Andrew Morton
2006-09-15 18:20 ` Karim Yaghmour
0 siblings, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:49 UTC (permalink / raw)
To: karim
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 10:57:29 -0400
Karim Yaghmour <karim@opersys.com> wrote:

> But I submit to you that both explanations
> actually highlight the argument I was making earlier with regards to
> dynamic tracing (and gdb info in this case) actually require a non-
> expert to chase kernel versions and create appropriate
> scripts/config-info for the post-insertion of instrumentation
> ...

Again, I don't see this as a huge problem. patch(1) is able to keep track
of specific places within source code even in the presence of quite violent
changes to that source code. There's no reason why systemtap support code
cannot do the same.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:49 ` Andrew Morton
@ 2006-09-15 18:20 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 18:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Andrew Morton wrote:
> Again, I don't see this as a huge problem. patch(1) is able to keep track
> of specific places within source code even in the presence of quite violent
> changes to that source code. There's no reason why systemtap support code
> cannot do the same.

If you don't want to listen to my part of the argument then consider
the point of view of those who have maintained systems entirely based
on binary editing, namely systemtap and LKET. It's indicative that all
those who have been involved in tracing, be it by static
instrumentation of code or the use of binary editing, all favor some
form of static markup mechanism of the code.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:58 ` Alan Cox
2006-09-15 14:57 ` Karim Yaghmour
@ 2006-09-15 17:01 ` Tim Bird
2006-09-15 17:08 ` Frank Ch. Eigler
2006-09-15 18:18 ` Martin Bligh
3 siblings, 0 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-15 17:01 UTC (permalink / raw)
To: Alan Cox
Cc: karim, Roman Zippel, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Alan Cox wrote:
> Ar Gwe, 2006-09-15 am 10:35 -0400, ysgrifennodd Karim Yaghmour:
>> @@ -1709,6 +1712,7 @@ switch_tasks:
>>         ++*switch_count;
>>
>>         prepare_arch_switch(rq, next);
>> +       TRACE_SCHEDCHANGE(prev, next);
>>         prev = context_switch(rq, prev, next);
>>         barrier();
>
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

If the latter is a suggestion for how an out-of-tree rule for a
tracepoint definition should look, it's a terrible one. Alan's example
is much more fragile, from a maintenance perspective, than Karim's.
Plus, it's much more difficult to implement, whether you plan to inject
no-ops at compile time, just record locations and stack offsets, or
actually place some tracing code (heaven forbid) that the compiler
could optimize for that context.

I still think that this is off-topic for the patch posted. I think we
should debate the implementation of tracepoints/markers when someone
posts a patch for some. I think it's rather scurrilous to complain
about code NOT submitted. Ingo has even mis-characterized the
not-submitted instrumentation patch, by saying it has 350 tracepoints
when it has no such thing. I counted 58 for one architecture (with only
8 being arch-specific).

-- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:58 ` Alan Cox
2006-09-15 14:57 ` Karim Yaghmour
2006-09-15 17:01 ` Tim Bird
@ 2006-09-15 17:08 ` Frank Ch. Eigler
2006-09-15 17:57 ` Andrew Morton
2006-09-15 18:31 ` Alan Cox
2006-09-15 18:18 ` Martin Bligh
3 siblings, 2 replies; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 17:08 UTC (permalink / raw)
To: Alan Cox
Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> [...]
> >
> >         prepare_arch_switch(rq, next);
> > +       TRACE_SCHEDCHANGE(prev, next);
> >         prev = context_switch(rq, prev, next);
> >         barrier();
>
> The gdb debug data lets you find each line and also the variable
> assignments (except when highly optimised in some cases). [...]

Unfortunately, variables and even control flow are quite regularly
made non-probe-capable by modern gcc. Statement boundaries and
variables are not preserved. There is an arms race within gcc to both
improve code optimization and its own "reverse-engineering" debugging
data generation, and the former is always ahead.

The end result is that there are many spots that we'd like to probe in
systemtap, but can't place exactly or extract all the data we'd like.
Really.

There are also spots that for other reasons cannot tolerate a fully
dynamic kprobes-style probe:

- where 1000-cycle int3-dispatching overheads too high
- in low-level code such as fault handling or locking, that, if probed
  dynamically, could entail infinite regress
- debugging information may not be available

This is the reason why I'm in favour of some lightweight event-marking
facility: a way of catching those points where dynamic probing is not
sufficiently fast or dependable.

> [...]
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

Actually:

#! stap
probe kernel.function("*@kernel/sched.c:9") { printf("%p %p", $rq, $next) }

- FChE
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:08 ` Frank Ch. Eigler
@ 2006-09-15 17:57 ` Andrew Morton
2006-09-15 18:31 ` Alan Cox
1 sibling, 0 replies; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:57 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Alan Cox, karim, Roman Zippel, Tim Bird, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

On 15 Sep 2006 13:08:29 -0400
fche@redhat.com (Frank Ch. Eigler) wrote:

> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>
> > [...]
> > >
> > >         prepare_arch_switch(rq, next);
> > > +       TRACE_SCHEDCHANGE(prev, next);
> > >         prev = context_switch(rq, prev, next);
> > >         barrier();
> >
> > The gdb debug data lets you find each line and also the variable
> > assignments (except when highly optimised in some cases). [...]
>
> Unfortunately, variables and even control flow are quite regularly
> made non-probe-capable by modern gcc. Statement boundaries and
> variables are not preserved. There is an arms race within gcc to both
> improve code optimization and its own "reverse-engineering" debugging
> data generation, and the former is always ahead.
>
> The end result is that there are many spots that we'd like to probe in
> systemtap, but can't place exactly or extract all the data we'd like.
> Really.

Useful info, thanks.

> There are also spots that for other reasons cannot tolerate a fully
> dynamic kprobes-style probe:
>
> - where 1000-cycle int3-dispatching overheads too high

Is that still true of the recent kprobes "boosting" changes?

> - in low-level code such as fault handling or locking, that, if probed
>   dynamically, could entail infinite regress
> - debugging information may not be available
>
> This is the reason why I'm in favour of some lightweight event-marking
> facility: a way of catching those points where dynamic probing is not
> sufficiently fast or dependable.

OK.

> > [...]
> > All we appear to lack is systemtap ability to parse debug data so it can
> > be told "trace on line 9 of sched.c and record rq and next"
>
> Actually:
>
> #! stap
> probe kernel.function("*@kernel/sched.c:9") { printf("%p %p", $rq, $next) }
>

Really. That's impressive progress.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:08 ` Frank Ch. Eigler
2006-09-15 17:57 ` Andrew Morton
@ 2006-09-15 18:31 ` Alan Cox
2006-09-15 18:12 ` Ingo Molnar
2006-09-15 18:24 ` Frank Ch. Eigler
1 sibling, 2 replies; 271+ messages in thread
From: Alan Cox @ 2006-09-15 18:31 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> - where 1000-cycle int3-dispatching overheads too high

Why are your despatching overheads 1000 cycles ? (and if it's due to int3
why are you using int 3 8))
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:31 ` Alan Cox
@ 2006-09-15 18:12 ` Ingo Molnar
2006-09-15 19:10 ` Roman Zippel
2006-09-15 18:24 ` Frank Ch. Eigler
1 sibling, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 18:12 UTC (permalink / raw)
To: Alan Cox
Cc: Frank Ch. Eigler, karim, Roman Zippel, Tim Bird, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

* Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > - where 1000-cycle int3-dispatching overheads too high
>
> Why are your despatching overheads 1000 cycles ? (and if it's due to
> int3 why are you using int 3 8))

this is being worked on actively: there's the "djprobes" patchset, which
includes a simplified disassembler to analyze common target code and can
thus insert much faster, call-a-trampoline-function based tracepoints
that are just as fast as (or faster than) compile-time, static
tracepoints.

there's no fundamental reason why INT3 should be the primary model of
inserting kprobes. Sometimes we are unlucky and the code which we
target is too complex - then we take a few hundred cycles of a penalty.
If that piece of code is a really common destination then we can add a
static marker in the source which both prepares parameters and inserts
a sufficiently sized NOP (or a function call) to prepare things for
fast dynamic tracing - but it should only be an optional performance
helper that we have the freedom to zap. (kprobes can be thought of as a
special "JIT", and there's no fundamental reason why it couldn't do
almost arbitrary transformations on kernel code.)

and there's a lot more that kprobes/systemtap can do: it can be a method
of extending the kernel along a 'plugin' model - without having to
impact the kernel source! That way people can experiment with kernel
extensions on live kernels, without the barrier of recompile/reboot.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
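The idea of a static marker that "prepares parameters" and leaves room for a probe to be attached later can be approximated in plain C. This is only a sketch under loud assumptions: the proposal above is about patching a NOP in the instruction stream, whereas this example substitutes a function pointer that stays NULL until a probe is attached. `sched_marker()`, `marker_attach()`, `marker_detach()` and `count_probe()` are hypothetical names, not kprobes, djprobes or SystemTap APIs, and the code ignores SMP and preemption safety entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Signature of a probe attached to the (hypothetical) scheduler marker. */
typedef void (*probe_fn)(int prev_pid, int next_pid);

/* One hook slot per marker; NULL means "no probe attached". */
static probe_fn sched_marker_probe = NULL;

/*
 * The marker itself: parameters are already gathered at the call site,
 * and the cost with no probe attached is a load and a branch.
 */
static inline void sched_marker(int prev_pid, int next_pid)
{
    probe_fn fn = sched_marker_probe;   /* read the slot once */

    if (fn != NULL)
        fn(prev_pid, next_pid);
}

/* Attach a probe at run time (single-threaded sketch, no locking). */
static void marker_attach(probe_fn fn)
{
    assert(fn != NULL);
    sched_marker_probe = fn;
}

/* Detach the probe; the marker goes back to doing nothing. */
static void marker_detach(void)
{
    sched_marker_probe = NULL;
}

/* Example probe: just counts how many times the marker fired. */
static int hits;

static void count_probe(int prev_pid, int next_pid)
{
    (void)prev_pid;
    (void)next_pid;
    hits++;
}
```

Calling `sched_marker(1, 2)` before `marker_attach(count_probe)` leaves `hits` untouched; after attaching, each call increments it, and `marker_detach()` turns the marker back into a near no-op. A patched NOP would avoid even the load and branch, which is the performance argument being made in the message.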
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:12 ` Ingo Molnar
@ 2006-09-15 19:10 ` Roman Zippel
2006-09-15 19:10 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 19:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Alan Cox, Frank Ch. Eigler, karim, Tim Bird, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > > - where 1000-cycle int3-dispatching overheads too high
> >
> > Why are your despatching overheads 1000 cycles ? (and if it's due to
> > int3 why are you using int 3 8))
>
> this is being worked on actively: there's the "djprobes" patchset, which
> includes a simplified disassembler to analyze common target code and can
> thus insert much faster, call-a-trampoline-function based tracepoints
> that are just as fast as (or faster than) compile-time, static
> tracepoints.

Who is going to implement this for every arch?
Is this now the official party line that only archs, which implement all
of this, can make use of efficient tracing?

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 19:10 ` Roman Zippel
@ 2006-09-15 19:10 ` Ingo Molnar
2006-09-15 20:05 ` Thomas Gleixner
2006-09-19 12:29 ` Christoph Hellwig
2 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 19:10 UTC (permalink / raw)
To: Roman Zippel
Cc: Alan Cox, Frank Ch. Eigler, karim, Tim Bird, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> On Fri, 15 Sep 2006, Ingo Molnar wrote:
>
> > this is being worked on actively: there's the "djprobes" patchset,
> > which includes a simplified disassembler to analyze common target
> > code and can thus insert much faster, call-a-trampoline-function
> > based tracepoints that are just as fast as (or faster than)
> > compile-time, static tracepoints.
>
> Who is going to implement this for every arch?

someone who is interested enough in that arch growing that capability?

> Is this now the official party line that only archs, which implement
> all of this, can make use of efficient tracing?

that's certainly my preference - kprobes have lots of other advantages
besides tracing. Whether that becomes the "official party line" depends
on the technological analysis of the situation which will ultimately
shape the outcome of this discussion.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 19:10 ` Roman Zippel
2006-09-15 19:10 ` Ingo Molnar
@ 2006-09-15 20:05 ` Thomas Gleixner
2006-09-15 20:35 ` Roman Zippel
2006-09-15 21:44 ` Tim Bird
2006-09-19 12:29 ` Christoph Hellwig
2 siblings, 2 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 20:05 UTC (permalink / raw)
To: Roman Zippel
Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
Michel Dagenais

On Fri, 2006-09-15 at 21:10 +0200, Roman Zippel wrote:
> >
> > this is being worked on actively: there's the "djprobes" patchset, which
> > includes a simplified disassembler to analyze common target code and can
> > thus insert much faster, call-a-trampoline-function based tracepoints
> > that are just as fast as (or faster than) compile-time, static
> > tracepoints.
>
> Who is going to implement this for every arch?
> Is this now the official party line that only archs, which implement all
> of this, can make use of efficient tracing?

In the reverse you are enforcing an ugly - but available for all archs -
solution due to the fact that there is nobody interested enough to
implement it ?

If there is no interest to do that, then this arch can probably live w/o
instrumentation for the next decade too.

tglx
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:05 ` Thomas Gleixner
@ 2006-09-15 20:35 ` Roman Zippel
2006-09-15 21:44 ` Tim Bird
1 sibling, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 20:35 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Thomas Gleixner wrote:

> > Who is going to implement this for every arch?
> > Is this now the official party line that only archs, which implement all
> > of this, can make use of efficient tracing?
>
> In the reverse you are enforcing an ugly - but available for all archs -
> solution due to the fact that there is nobody interested enough to
> implement it ?

Where is the proof that such a solution is inherently ugly? (Note that
just picking some example from LTT doesn't make a general proof.)
I am also not the one who wants to enforce a single solution onto
everyone.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:05 ` Thomas Gleixner
2006-09-15 20:35 ` Roman Zippel
@ 2006-09-15 21:44 ` Tim Bird
1 sibling, 0 replies; 271+ messages in thread
From: Tim Bird @ 2006-09-15 21:44 UTC (permalink / raw)
To: tglx
Cc: Roman Zippel, Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Tom Zanussi, ltt-dev,
Michel Dagenais

Thomas Gleixner wrote:
> On Fri, 2006-09-15 at 21:10 +0200, Roman Zippel wrote:
>
>>>this is being worked on actively: there's the "djprobes" patchset, which
>>>includes a simplified disassembler to analyze common target code and can
>>>thus insert much faster, call-a-trampoline-function based tracepoints
>>>that are just as fast as (or faster than) compile-time, static
>>>tracepoints.
>>
>>Who is going to implement this for every arch?
>>Is this now the official party line that only archs, which implement all
>>of this, can make use of efficient tracing?
>
> In the reverse you are enforcing an ugly - but available for all archs -
> solution due to the fact that there is nobody interested enough to
> implement it ?

????

If there's a solution people are willing to implement, and one they
aren't - doesn't that say something? Static tracepoint patches for
numerous architectures have existed and been maintained out-of-tree
for years.

> If there is no interest to do that, then this arch can probably live w/o
> instrumentation for the next decade too.

The arches already have instrumentation - just not dynamic
instrumentation. The reason static tracepoints have been implemented
and kprobes haven't is that static tracepoints are sufficient for what
those people are doing, and dynamic tracepoints are a pain to
implement.

Let me repeat that, just in case people missed it: "Static tracepoints
work for what I need." If other people want to implement something
fancier that works for them, then feel free.

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Electronics
=============================
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 19:10 ` Roman Zippel
2006-09-15 19:10 ` Ingo Molnar
2006-09-15 20:05 ` Thomas Gleixner
@ 2006-09-19 12:29 ` Christoph Hellwig
2006-09-19 13:17 ` Roman Zippel
2 siblings, 1 reply; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:29 UTC (permalink / raw)
To: Roman Zippel
Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

On Fri, Sep 15, 2006 at 09:10:44PM +0200, Roman Zippel wrote:
> Hi,
>
> On Fri, 15 Sep 2006, Ingo Molnar wrote:
>
> > > Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:
> > > > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > > > - where 1000-cycle int3-dispatching overheads too high
> > >
> > > Why are your despatching overheads 1000 cycles ? (and if it's due to
> > > int3 why are you using int 3 8))
> >
> > this is being worked on actively: there's the "djprobes" patchset, which
> > includes a simplified disassembler to analyze common target code and can
> > thus insert much faster, call-a-trampoline-function based tracepoints
> > that are just as fast as (or faster than) compile-time, static
> > tracepoints.
>
> Who is going to implement this for every arch?
> Is this now the official party line that only archs, which implement all
> of this, can make use of efficient tracing?

Come on, stop trying to be an asshole. It's always been the case that to
use new functionality you have to add arch code where necessary.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-19 12:29 ` Christoph Hellwig
@ 2006-09-19 13:17 ` Roman Zippel
0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-19 13:17 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ingo Molnar, Alan Cox, Frank Ch. Eigler, karim, Tim Bird,
Mathieu Desnoyers, linux-kernel, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Hi,

On Tue, 19 Sep 2006, Christoph Hellwig wrote:

> > Who is going to implement this for every arch?
> > Is this now the official party line that only archs, which implement all
> > of this, can make use of efficient tracing?
>
> Come on, stop trying to be an asshole. It's always been the case that to
> use new functionality you have to add arch code where necessary.

On the contrary I'm really trying my best to be reasonable. If there
were no way around implementing kprobes, I would completely agree with
you.

Let's take an item from the todo list: TLS support for m68k. This is a
language feature that is becoming more and more important, and it is
increasingly difficult to work around. Considering the complexities of
this feature it will take quite a bit of the time available to me and
somehow I doubt someone will beat me to it. I'm not complaining about
it, I even enjoy hacking on it, but I also have to take no shit on how
I have to spend my time.

Considering this I hope you understand how important kprobes are to me.
I admit it's a nice feature, but it's far from being essential.

bye, Roman
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:31 ` Alan Cox
2006-09-15 18:12 ` Ingo Molnar
@ 2006-09-15 18:24 ` Frank Ch. Eigler
2006-09-15 18:23 ` Ingo Molnar
1 sibling, 1 reply; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-15 18:24 UTC (permalink / raw)
To: Alan Cox
Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

[-- Attachment #1: Type: text/plain, Size: 734 bytes --]

Hi -

On Fri, Sep 15, 2006 at 07:31:48PM +0100, Alan Cox wrote:
> Ar Gwe, 2006-09-15 am 13:08 -0400, ysgrifennodd Frank Ch. Eigler:

Yeah, or something. :-)

> > Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > - where 1000-cycle int3-dispatching overheads too high
>
> Why are your despatching overheads 1000 cycles ? (and if it's due to int3
> why are you using int 3 8))

Smart teams from IBM and Hitachi have been hammering away at this code
for a year or two now, and yet (roughly) here we are. There have been
experiments involving plopping branches instead of int3's at probe
locations, but this is self-modifying code involving multiple
instructions, and appears to be tricky on SMP/preempt boxes.

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:24 ` Frank Ch. Eigler
@ 2006-09-15 18:23 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 18:23 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Alan Cox, karim, Roman Zippel, Tim Bird, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

* Frank Ch. Eigler <fche@redhat.com> wrote:

> > Why are your despatching overheads 1000 cycles ? (and if it's due to
> > int3 why are you using int 3 8))
>
> Smart teams from IBM and Hitachi have been hammering away at this code
> for a year or two now, and yet (roughly) here we are. There have been
> experiments involving plopping branches instead of int3's at probe
> locations, but this is self-modifying code involving multiple
> instructions, and appears to be tricky on SMP/preempt boxes.

i am talking to them about that, and i'm 100% sure the solution is much
easier than the many (much harder) problems that SystemTap has already
solved. I think you are way too modest to realize how powerful (and
important) SystemTap is :-)

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 14:58 ` Alan Cox
` (2 preceding siblings ...)
2006-09-15 17:08 ` Frank Ch. Eigler
@ 2006-09-15 18:18 ` Martin Bligh
3 siblings, 0 replies; 271+ messages in thread
From: Martin Bligh @ 2006-09-15 18:18 UTC (permalink / raw)
To: Alan Cox
Cc: karim, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

>>Also, care to explain how kprobes can be used to access the same data
>>without having to actually customize a probe point for every binary?
>
>
> That's why we have things like systemtap.
>
> All we appear to lack is systemtap ability to parse debug data so it can
> be told "trace on line 9 of sched.c and record rq and next"

But that's the whole point - if it's not integrated into a marker as
source code, it requires manual intervention for every bloody release
to do.

"line 9 of sched.c" is a farcically stupid way of doing tags on a
dynamically moving project like the linux kernel. Yes, that may work
OK for something that is very static, like a distro snapshot, but as
a general mechanism, it's unsustainable and broken.

M.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 13:41 ` Alan Cox
2006-09-15 13:34 ` Roman Zippel
@ 2006-09-15 18:10 ` Jose R. Santos
2006-09-15 19:49 ` Mathieu Desnoyers
1 sibling, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 18:10 UTC (permalink / raw)
To: Alan Cox
Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Alan Cox wrote:
> Consistency for user space ?
>

With several other trace tools being implemented for the kernel, there
is a great problem with consistency among these tools. It is my
opinion that traces are of very little use to _most_ people without the
availability of post-processing tools to analyze these traces. While I
won't say that we need one all-powerful solution, it would be good if all
solutions would at least be able to talk to the same post-processing
facilities in user-space. Before LTTng is even considered for the
kernel, there needs to be discussion to determine if the trace mechanism
being proposed is suitable for all people interested in doing trace
analysis. The fact that there also exist tools like LKET and LKST seems
to suggest that there are other things to be considered when it comes to
implementing a trace mechanism that everyone would be happy with.

It would also be useful for all the trace tools to implement the same
probe points so that post-processing tools can be interchanged between
the various trace implementations.

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:10 ` Jose R. Santos
@ 2006-09-15 19:49 ` Mathieu Desnoyers
2006-09-15 20:54 ` Jose R. Santos
0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 19:49 UTC (permalink / raw)
To: Jose R. Santos
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Alan Cox wrote:
>
> With several other trace tools being implemented for the kernel, there
> is a great problem with consistency among these tools. It is my
> opinion that traces are of very little use to _most_ people without the
> availability of post-processing tools to analyze them. While I
> won't say that we need one all-powerful solution, it would be good if all
> solutions were at least able to talk to the same post-processing
> facilities in user space. Before LTTng is even considered for the
> kernel, there needs to be a discussion to determine whether the trace
> mechanism being proposed is suitable for everyone interested in doing
> trace analysis. The fact that tools like LKET and LKST also exist seems
> to suggest that there are other things to be considered when it comes to
> implementing a trace mechanism that everyone would be happy with.
>
> It would also be useful for all the trace tools to implement the same
> probe points so that post-processing tools can be interchanged between
> the various trace implementations.
>
>

Hi Jose,

I completely agree that there is a crying need for standardisation there. The
reason why I propose the LTTng infrastructure as a tracing core in the Linux
kernel is this: the fundamental problem I have found with kernel tracers so
far is that they perturb the system too much or do not offer enough
fine-grained protection against reentrancy. Ingo's post about tracing
statements breaking the kernel all the time seems to me like sufficient proof
that this is a real problem.

My goal with LTTng is to provide a reentrant data serialisation mechanism that
can be called from anywhere in the kernel (ok, the vmalloc path of the page
fault handler is _the_ exception) that does not use any lock and can therefore
trace code paths like NMI handlers.

I also implemented code that would serialize any type of data structure I could
think of. If it is too much, well, we can use part of it.

The LTTng trace format is explained here. Your comments on it are very welcome.

http://ltt.polymtl.ca/ > LTTV and LTTng developer documentation > format.html
(http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/format.html)

Regards,

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 19:49 ` Mathieu Desnoyers
@ 2006-09-15 20:54 ` Jose R. Santos
2006-09-15 21:42 ` Karim Yaghmour
2006-09-15 21:46 ` Mathieu Desnoyers
0 siblings, 2 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 20:54 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> * Jose R. Santos (jrs@us.ibm.com) wrote:
> > Alan Cox wrote:
> >
> > With several other trace tools being implemented for the kernel, there
> > is a great problem with consistency among these tools. It is my
> > opinion that traces are of very little use to _most_ people without the
> > availability of post-processing tools to analyze them. While I
> > won't say that we need one all-powerful solution, it would be good if all
> > solutions were at least able to talk to the same post-processing
> > facilities in user space. Before LTTng is even considered for the
> > kernel, there needs to be a discussion to determine whether the trace
> > mechanism being proposed is suitable for everyone interested in doing
> > trace analysis. The fact that tools like LKET and LKST also exist seems
> > to suggest that there are other things to be considered when it comes to
> > implementing a trace mechanism that everyone would be happy with.
> >
> > It would also be useful for all the trace tools to implement the same
> > probe points so that post-processing tools can be interchanged between
> > the various trace implementations.
> >
> >
>
> Hi Jose,
>
> I completely agree that there is a crying need for standardisation there. The
> reason why I propose the LTTng infrastructure as a tracing core in the Linux
> kernel is this: the fundamental problem I have found with kernel tracers so
> far is that they perturb the system too much or do not offer enough
> fine-grained protection against reentrancy. Ingo's post about tracing
> statements breaking the kernel all the time seems to me like sufficient proof
> that this is a real problem.
>
>

I agree with your goal for ltt.

> My goal with LTTng is to provide a reentrant data serialisation mechanism that
> can be called from anywhere in the kernel (ok, the vmalloc path of the page
> fault handler is _the_ exception) that does not use any lock and can therefore
> trace code paths like NMI handlers.
>

One of the things that I've noticed from this thread that neither you nor
Karim seems to have answered is why LTTng is needed if a suitable
replacement can be developed using SystemTap with static markers. I am
personally interested in this answer as well. If all the things that
LTT is proposing can be implemented in SystemTap, what then is the
advantage of accepting such an interface into the kernel?

I don't really care which method is used as long as it's the right tool
for the job. I see several ideas from LTT that could be integrated into
SystemTap in order to make it a one-stop solution for both dynamic and
static tracing. Would you care to elaborate why you think having
separate projects is a better solution?

> I also implemented code that would serialize any type of data structure I could
> think of. If it is too much, well, we can use part of it.
>
> The LTTng trace format is explained here. Your comments on it are very welcome.
>
> http://ltt.polymtl.ca/ > LTTV and LTTng developer documentation > format.html
> (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/format.html)
>

Trace event headers are very similar between LTT and LKET, which is
good in order to get some synergy between our projects. One thing that
LKET has on each trace event that LTT doesn't is the tid and CPU id of
each event. We find this extremely useful for post-processing. Also,
why have the event_size on every event taken? Why not describe the
event in the trace header, remove this redundant information from
the event header and save some trace file space?

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:54 ` Jose R. Santos
@ 2006-09-15 21:42 ` Karim Yaghmour
2006-09-15 21:46 ` Mathieu Desnoyers
1 sibling, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 21:42 UTC (permalink / raw)
To: jrs
Cc: Mathieu Desnoyers, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Jose R. Santos wrote:
> I don't really care which method is used as long as it's the right tool
> for the job. I see several ideas from LTT that could be integrated into
> SystemTap in order to make it a one-stop solution for both dynamic and
> static tracing. Would you care to elaborate why you think having
> separate projects is a better solution?

We don't -- at least *I* wouldn't care, but I'm not the current
maintainer. ltt's usefulness has always been in the digested
information it can present to the user. The kernel patching part
was a necessary evil.

What I object to is the depiction of dynamic tracing as solving the
need for static markup. It doesn't, and, therefore, does not currently
constitute an adequate substitute for ltt's patches. If someone else
can actually provide ltt with the events and surrounding detail
(timestamping and all) it needs while still providing the same
performance we currently get out of the current ltt patches, then
I'd say more power to them -- the current developers may have more
relevant things to say.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:54 ` Jose R. Santos
2006-09-15 21:42 ` Karim Yaghmour
@ 2006-09-15 21:46 ` Mathieu Desnoyers
2006-09-19 15:05 ` Jose R. Santos
1 sibling, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-15 21:46 UTC (permalink / raw)
To: Jose R. Santos
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Hi Jose,

* Jose R. Santos (jrs@us.ibm.com) wrote:
> > My goal with LTTng is to provide a reentrant data serialisation mechanism
> > that can be called from anywhere in the kernel (ok, the vmalloc path of
> > the page fault handler is _the_ exception) that does not use any lock and
> > can therefore trace code paths like NMI handlers.
> >
>
> One of the things that I've noticed from this thread that neither you nor
> Karim seems to have answered is why LTTng is needed if a suitable
> replacement can be developed using SystemTap with static markers. I am
> personally interested in this answer as well. If all the things that
> LTT is proposing can be implemented in SystemTap, what then is the
> advantage of accepting such an interface into the kernel?
>

Well, last time I checked, SystemTAP did not have a reentrant
serialisation mechanism to write the information to the buffers. Also, the
goals of the projects differ: SystemTAP finds it acceptable to suffer from the
kprobe performance hit, while it is unacceptable for LTTng.

> I don't really care which method is used as long as it's the right tool
> for the job. I see several ideas from LTT that could be integrated into
> SystemTap in order to make it a one-stop solution for both dynamic and
> static tracing. Would you care to elaborate why you think having
> separate projects is a better solution?

I think that each project focuses on its own goals, but that there is
much to gain in reusing the strengths of each. SystemTAP is good at dynamic
instrumentation. LTTng is good at data serialisation under a fully reentrant
kernel. LTTng provides logging primitives for any data type, including
SystemTAP text output.

Is someone willing to try to create a small facility that will dump SystemTAP's
output in LTTng ? It is nearly trivial: if I wasn't completing my debugfs
port, I would probably be doing it right now.

> Trace event headers are very similar between LTT and LKET, which is
> good in order to get some synergy between our projects. One thing that
> LKET has on each trace event that LTT doesn't is the tid and CPU id of
> each event. We find this extremely useful for post-processing. Also,
> why have the event_size on every event taken? Why not describe the
> event in the trace header, remove this redundant information from
> the event header and save some trace file space?
>

A standard event header has to have only crucial information, nothing more, or
it becomes bloated and quickly grows the trace size. We decided not to put tid
and CPU id in the event header because tid is already available with the
schedchange events at post-processing time and CPU id is already available too,
as we have per-CPU buffers.

The event size is strictly unnecessary, but in practice very, very useful to
verify the correspondence between the size of the data recorded by the
kernel and the size of data the viewer thinks it is reading. Think of it as a
consistency check between kernel and viewer algorithms.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
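As a rough illustration of the post-processing step described in the message above, the following Python sketch recovers the CPU id and tid for each event without storing either in the event header. The event tuples, the `schedchange` event name and the `next_tid` payload field are hypothetical stand-ins chosen for this example; they are not taken from the actual LTTng trace format.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical decoded event: (timestamp, event_name, payload).
Event = Tuple[int, str, dict]


def annotate(per_cpu: Dict[int, Iterable[Event]]) -> List[tuple]:
    """Attach (cpu, tid) to every event using only per-CPU streams.

    The CPU id comes from which buffer the event was read from. The tid
    comes from the most recent schedchange event seen on that CPU; the
    schedchange event itself is attributed to the incoming task.
    """
    out = []
    for cpu, events in per_cpu.items():
        current_tid: Optional[int] = None  # unknown until first schedchange
        for ts, name, payload in events:
            if name == "schedchange":
                current_tid = payload["next_tid"]
            out.append((ts, cpu, current_tid, name))
    out.sort(key=lambda e: e[0])  # merge the per-CPU streams by timestamp
    return out
```

In this sketch, events recorded on a CPU before its first `schedchange` carry no tid (`None`), so the approach depends on scheduler events being present in the trace.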
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 21:46 ` Mathieu Desnoyers
@ 2006-09-19 15:05 ` Jose R. Santos
2006-09-19 15:30 ` Mathieu Desnoyers
0 siblings, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-19 15:05 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> > Trace event headers are very similar between LTT and LKET, which is
> > good in order to get some synergy between our projects. One thing that
> > LKET has on each trace event that LTT doesn't is the tid and CPU id of
> > each event. We find this extremely useful for post-processing. Also,
> > why have the event_size on every event taken? Why not describe the
> > event in the trace header, remove this redundant information from
> > the event header and save some trace file space?
> >
>
> A standard event header has to have only crucial information, nothing more, or
> it becomes bloated and quickly grows the trace size. We decided not to put tid
> and CPU id in the event header because tid is already available with the
> schedchange events at post-processing time and CPU id is already available too,
> as we have per-CPU buffers.
>

We still keep the CPU id because LKET still supports ASCII tracing, which
mixes the output of all the CPUs together. It is still debatable
whether this is a useful feature or not though. If we remove ASCII
event tracing from LKET, we could remove CPU id from the event header as
well.

The tid we still include because LKET supports turning on individual
tracepoints unlike LTT, which if I remember correctly turns on all the
tracepoints that are compiled into the running kernel. Since the user is
free to choose which tracepoints he wants to use for his workload, we
cannot guarantee that scheduler tracepoints are going to be available. We
consider taking the tid as one of those absolute minimum pieces of data
required to do meaningful analysis.

We chose to control performance and trace output size by letting users
have control of the number of tracepoints they can activate at any given
time. This is important to us since we plan to add many dynamic tracepoints
to different sub-systems (filesystem, device drivers, core kernel
facilities, etc...). Turning on all of these tracepoints at the same
time would slow down the system too much and change the performance
characteristics of the environment being studied.

> The event size is strictly unnecessary, but in practice very, very useful to
> verify the correspondence between the size of the data recorded by the
> kernel and the size of data the viewer thinks it is reading. Think of it as a
> consistency check between kernel and viewer algorithms.
>

I understand. But if the size of each event is fixed, why would you
expect the data sizes that the tool reports in the trace header for each
event to change over the course of a trace? If the data on the per-CPU
buffers is serialized, a similar authentication could be done using the
timestamp by checking the timestamps of the events before and after the
current event, thus validating the current timestamp as well as the size
offset of the previous event. Just a thought.

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
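The timestamp-based check suggested at the end of the message above could look roughly like the following Python sketch. It assumes the events of a single per-CPU buffer have already been decoded into a list of timestamps; the function name is hypothetical and is not part of LKET or LTTng.

```python
from typing import Optional, Sequence


def first_backward_step(timestamps: Sequence[int]) -> Optional[int]:
    """Return the index of the first event older than its predecessor.

    Within a single per-CPU buffer, timestamps should never decrease, so a
    backward step hints that the previous event was decoded with the wrong
    size, or that the timestamp itself is wrong. Returns None when the
    whole sequence is non-decreasing.
    """
    for i in range(1, len(timestamps)):
        if timestamps[i] < timestamps[i - 1]:
            return i
    return None
```

A backward step only signals that something is inconsistent at that point in the buffer; on its own it does not say whether the size or the timestamp is at fault.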
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-19 15:05 ` Jose R. Santos
@ 2006-09-19 15:30 ` Mathieu Desnoyers
2006-09-19 16:39 ` Jose R. Santos
0 siblings, 1 reply; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-19 15:30 UTC (permalink / raw)
To: Jose R. Santos
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Mathieu Desnoyers wrote:
> > A standard event header has to have only crucial information, nothing
> > more, or it becomes bloated and quickly grows the trace size. We decided
> > not to put tid and CPU id in the event header because tid is already
> > available with the schedchange events at post-processing time and CPU id
> > is already available too, as we have per-CPU buffers.
> >
>
> We still keep the CPU id because LKET still supports ASCII tracing, which
> mixes the output of all the CPUs together. It is still debatable
> whether this is a useful feature or not though. If we remove ASCII
> event tracing from LKET, we could remove CPU id from the event header as
> well.
>

How hard would it be to make LKET send its ASCII output to multiple "channels"
(buffers) and then fetch and combine them in user space ? Have a look at lttd
and lttv in the ltt-control package from the LTTng project: it would be
trivial to adapt. In fact, there is already a text dump module available.

> The tid we still include because LKET supports turning on individual
> tracepoints unlike LTT, which if I remember correctly turns on all the
> tracepoints that are compiled into the running kernel. Since the user is
> free to choose which tracepoints he wants to use for his workload, we
> cannot guarantee that scheduler tracepoints are going to be available. We
> consider taking the tid as one of those absolute minimum pieces of data
> required to do meaningful analysis.
>

I understand, but it does not have to be included in the bare-bones event
header. We could think of an optional "event context" header that would have its
individual parts enabled or not depending on the events recorded in the trace.
For instance:

With scheduler instrumentation activated:

Event Header | Variable data

Without scheduler instrumentation activated:

Event Header | PID | Variable data

The information about whether the optional event context is present in
the trace could be saved in the trace header.

This way, we would not add unnecessary data when it is not needed. And
furthermore, this is extensible for other event context information.

> We chose to control performance and trace output size by letting users
> have control of the number of tracepoints they can activate at any given
> time. This is important to us since we plan to add many dynamic tracepoints
> to different sub-systems (filesystem, device drivers, core kernel
> facilities, etc...). Turning on all of these tracepoints at the same
> time would slow down the system too much and change the performance
> characteristics of the environment being studied.

Yes, I know that overhead is a big problem with dynamic instrumentation ;) I
think we can find a way to have an optimal trace format while giving
a dynamic probe based tracer enough context when needed.

> > The event size is strictly unnecessary, but in practice very, very useful
> > to verify the correspondence between the size of the data recorded by the
> > kernel and the size of data the viewer thinks it is reading. Think of it
> > as a consistency check between kernel and viewer algorithms.
> >
>
> I understand. But if the size of each event is fixed, why would you
> expect the data sizes that the tool reports in the trace header for each
> event to change over the course of a trace? If the data on the per-CPU
> buffers is serialized, a similar authentication could be done using the
> timestamp by checking the timestamps of the events before and after the
> current event, thus validating the current timestamp as well as the size
> offset of the previous event. Just a thought.
>

Yes, but if there is a bug with the timestamp (time going backward because of
problematic event record serialization), it becomes harder to pinpoint the
source of the problem (if it is due to a bug in the variable data serialization
mechanism, a bug in the user space "unserialization" mechanism or a bug in event
serialization within the kernel). LTTng hasn't suffered from this kind of issue
for quite some time, but when under heavy development, those indicators of data
consistency have all proven their usefulness.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
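The optional "event context" header proposed in the message above could be laid out roughly as in the following Python sketch. This is only an illustration under stated assumptions: the flag name `CTX_PID`, the field widths and the little-endian byte order are invented for the example and are not taken from the LTTng format.

```python
import struct

CTX_PID = 0x1  # hypothetical trace-header flag: every event carries a PID

TRACE_HDR = struct.Struct("<I")    # context flags, stored once per trace
EVENT_HDR = struct.Struct("<QHH")  # timestamp, event id, payload size
PID_CTX = struct.Struct("<I")      # optional per-event context field


def write_trace(flags, events):
    """Serialize events given as (timestamp, event_id, pid, payload_bytes)."""
    buf = bytearray(TRACE_HDR.pack(flags))
    for ts, eid, pid, payload in events:
        buf += EVENT_HDR.pack(ts, eid, len(payload))
        if flags & CTX_PID:  # only pay for the PID when the trace asks for it
            buf += PID_CTX.pack(pid)
        buf += payload
    return bytes(buf)


def read_trace(data):
    """Decode a buffer produced by write_trace; pid is None when absent."""
    (flags,) = TRACE_HDR.unpack_from(data, 0)
    off = TRACE_HDR.size
    out = []
    while off < len(data):
        ts, eid, size = EVENT_HDR.unpack_from(data, off)
        off += EVENT_HDR.size
        pid = None
        if flags & CTX_PID:
            (pid,) = PID_CTX.unpack_from(data, off)
            off += PID_CTX.size
        payload = data[off:off + size]
        if len(payload) != size:  # recorded size disagrees with the data
            raise ValueError("truncated event payload")
        off += size
        out.append((ts, eid, pid, payload))
    return flags, out
```

With the flag cleared, each event is four bytes shorter in this layout, and the reader learns which layout to expect from the single flags word in the trace header rather than from every event.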
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-19 15:30 ` Mathieu Desnoyers
@ 2006-09-19 16:39 ` Jose R. Santos
2006-09-19 18:03 ` Mathieu Desnoyers
0 siblings, 1 reply; 271+ messages in thread
From: Jose R. Santos @ 2006-09-19 16:39 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Mathieu Desnoyers wrote:
> > We still keep the CPU id because LKET still supports ASCII tracing, which
> > mixes the output of all the CPUs together. It is still debatable
> > whether this is a useful feature or not though. If we remove ASCII
> > event tracing from LKET, we could remove CPU id from the event header as
> > well.
> >
>
> How hard would it be to make LKET send its ASCII output to multiple "channels"
> (buffers) and then fetch and combine them in user space ? Have a look at lttd
> and lttv in the ltt-control package from the LTTng project: it would be
> trivial to adapt. In fact, there is already a text dump module available.
>

Actually, ASCII trace should output to multiple channels if we use bulk
mode. The original idea for keeping ASCII trace (this was the original
output mechanism) was that a user may have wanted to look at trace
output information in real time as it was being printed onto the screen
(which requires merging all the output channels). Again, I question the
usability of this feature and if a user really wanted to look at ASCII
trace data in real time, a better solution would be for the lket-b2a
conversion tool to have a mode where it could print the output of
constantly changing trace buffers to the screen. The ASCII output mode
in LKET is cryptic and having lket-b2a do this would perform better and
produce prettier output while also reducing the trace file output size a
bit.

> > The tid we still include because LKET supports turning on individual
> > tracepoints unlike LTT, which if I remember correctly turns on all the
> > tracepoints that are compiled into the running kernel. Since the user is
> > free to choose which tracepoints he wants to use for his workload, we
> > cannot guarantee that scheduler tracepoints are going to be available. We
> > consider taking the tid as one of those absolute minimum pieces of data
> > required to do meaningful analysis.
> >
>
> I understand, but it does not have to be included in the bare-bones event
> header. We could think of an optional "event context" header that would have its
> individual parts enabled or not depending on the events recorded in the trace.
> For instance:
>
> With scheduler instrumentation activated:
>
> Event Header | Variable data
>
> Without scheduler instrumentation activated:
>
> Event Header | PID | Variable data
>
> The information about whether the optional event context is present in
> the trace could be saved in the trace header.
>
> This way, we would not add unnecessary data when it is not needed. And
> furthermore, this is extensible for other event context information.
>

That's also a possibility and it should not be difficult to implement.

> > We chose to control performance and trace output size by letting users
> > have control of the number of tracepoints they can activate at any given
> > time. This is important to us since we plan to add many dynamic tracepoints
> > to different sub-systems (filesystem, device drivers, core kernel
> > facilities, etc...). Turning on all of these tracepoints at the same
> > time would slow down the system too much and change the performance
> > characteristics of the environment being studied.
>
> Yes, I know that overhead is a big problem with dynamic instrumentation ;) I
> think we can find a way to have an optimal trace format while giving
> a dynamic probe based tracer enough context when needed.
>

Actually, we started doing this six years ago on our internal *static*
trace tool before we started implementing event tracing using
SystemTap. Regardless of whether the tool uses static or dynamic
probes, if the problem only requires 3 tracepoints to figure out, why
would you want to activate 50+ hooks?

> >
> > I understand. But if the size of each event is fixed, why would you
> > expect the data sizes that the tool reports in the trace header for each
> > event to change over the course of a trace? If the data on the per-CPU
> > buffers is serialized, a similar authentication could be done using the
> > timestamp by checking the timestamps of the events before and after the
> > current event, thus validating the current timestamp as well as the size
> > offset of the previous event. Just a thought.
> >
>
> Yes, but if there is a bug with the timestamp (time going backward because of
> problematic event record serialization), it becomes harder to pinpoint the
> source of the problem (if it is due to a bug in the variable data serialization
> mechanism, a bug in the user space "unserialization" mechanism or a bug in event
> serialization within the kernel). LTTng hasn't suffered from this kind of issue
> for quite some time, but when under heavy development, those indicators of data
> consistency have all proven their usefulness.
>
>

Looks like the example you propose above could also apply to this as
well. You could implement some sort of debug mode for the trace data
that provides extra information useful for debugging the tool. If the
information is really only useful when debugging the trace tool during
development, wouldn't it make sense to have a way to disable debugging
junk as needed?

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-19 16:39 ` Jose R. Santos
@ 2006-09-19 18:03 ` Mathieu Desnoyers
0 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-19 18:03 UTC (permalink / raw)
To: Jose R. Santos
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

* Jose R. Santos (jrs@us.ibm.com) wrote:
> Looks like the example you propose above could also apply to this as
> well. You could implement some sort of debug mode for the trace data
> that provides extra information useful for debugging the tool. If the
> information is really only useful when debugging the trace tool during
> development, wouldn't it make sense to have a way to disable debugging
> junk as needed?
>

You are absolutely right.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 12:38 ` Alan Cox
2006-09-15 12:39 ` Roman Zippel
@ 2006-09-15 17:45 ` Andrew Morton
2006-09-15 18:16 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 17:45 UTC (permalink / raw)
To: Alan Cox
Cc: Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 13:38:58 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> gcc -g produces extensive annotations which are then usable by many
> tools other than gdb.

This is something I'm curious about. AFAICT there are two(*) reasons for
wanting static tracepoints:

a) to be able to get at local variables and

b) as a "marker" somewhere within the body of a function - the
expectation here is that identifying that particular spot in the
function would be hard without some marker which moves around as the
function itself is modified over time.

If a) is true, then isn't this simply a feature request against the
systemtap infrastructure? There's no reason per se why a kprobe point
cannot access locals, using the dwarf debug info. It'll be somewhat
unreliable, because stack slots and registers go out of scope and get
reused for other things. But as any gdb user will know, it's still
useful.

As for b), if it was _really_ an advantage to be able to identify
particular places within the body of a function then one could concoct a
macro which inserts some info into a separate elf section and which adds no
code at all to actual .text.

Although IMO this is a bit lame - it is quite possible to go into
SexySystemTapGUI, click on a particular kernel file-n-line and have
systemtap userspace keep track of that place in the kernel source across
many kernel versions: all it needs to do is to remember the file+line and a
snippet of the surrounding text, for readjustment purposes.

(*) I don't buy the performance arguments: kprobes are quick, and I'd
expect that the CPU consumption of the destination of the probe is
comparable to or higher than the cost of taking the initial trap.
^ permalink raw reply [flat|nested] 271+ messages in thread
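The idea in the message above of remembering a file+line together with a snippet of surrounding text can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of how systemtap works: the `relocate` helper and its exact-match strategy (ignoring only leading and trailing whitespace) are assumptions made for the example, and a real tool would presumably need fuzzier matching.

```python
from typing import List, Optional


def relocate(new_src: List[str], old_line: int, context: List[str]) -> Optional[int]:
    """Return the 1-based line where `context` now starts in `new_src`.

    `old_line` is the remembered 1-based position. When the snippet occurs
    more than once, the match closest to the old position wins. Returns
    None if the snippet no longer appears in the new source.
    """
    n = len(context)
    wanted = [line.strip() for line in context]
    best = None
    for i in range(len(new_src) - n + 1):
        window = [line.strip() for line in new_src[i:i + n]]
        if window == wanted:
            line_no = i + 1
            if best is None or abs(line_no - old_line) < abs(best - old_line):
                best = line_no
    return best
```

When the snippet itself has been edited, this sketch returns `None`, which is the case where someone would still have to step in and re-place the probe point by hand.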
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 17:45 ` Andrew Morton
@ 2006-09-15 18:16 ` Karim Yaghmour
2006-09-15 19:20 ` Jose R. Santos
2006-09-15 19:59 ` Andrew Morton
0 siblings, 2 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 18:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Andrew Morton wrote:
> This is something I'm curious about. AFAICT there are two(*) reasons for
> wanting static tracepoints:
>
> a) to be able to get at local variables and
>
> b) as a "marker" somewhere within the body of a function - the
> expectation here is that identifying that particular spot in the
> function would be hard without some marker which moves around as the
> function itself is modified over time.
>
>
> If a) is true, then isn't this simply a feature request against the
> systemtap infrastructure? There's no reason per se why a kprobe point
> cannot access locals, using the dwarf debug info. It'll be somewhat
> unreliable, because stack slots and registers go out of scope and get
> reused for other things. But as any gdb user will know, it's still
> useful.

I believe this has been addressed by Frank in his other email, so
I'll skip.

> As for b), if it was _really_ an advantage to be able to identify
> particular places within the body of a function then one could concoct a
> macro which inserts some info into a separate elf section and which adds no
> code at all to actual .text.

Yes, and this specific suggestion has been made a number of times.
Though, then, this is an implementation debate and there are a number
of things which could be made available as build-time options. The
emerging consensus in this thread, however, is that there is a clear
need for a way of statically marking up important events, and this
point has been emphasized both by those who have maintained
infrastructure based on "static" tracepoints and those maintaining
such infrastructure based on "dynamic" tracepoints.

> Although IMO this is a bit lame - it is quite possible to go into
> SexySystemTapGUI, click on a particular kernel file-n-line and have
> systemtap userspace keep track of that place in the kernel source across
> many kernel versions: all it needs to do is to remember the file+line and a
> snippet of the surrounding text, for readjustment purposes.

Sure, if you're a kernel developer, but as I've explained numerous
times in this thread, there are far more users of tracing than
kernel developers.

> (*) I don't buy the performance arguments: kprobes are quick, and I'd
> expect that the CPU consumption of the destination of the probe is
> comparable to or higher than the cost of taking the initial trap.

Please see Mathieu's earlier posting of numbers comparing kprobes to
static points. Nevertheless, I do not believe that the use of kprobes
should be pitted against static instrumentation, the two are
orthogonal.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:16 ` Karim Yaghmour
@ 2006-09-15 19:20 ` Jose R. Santos
2006-09-15 19:59 ` Andrew Morton
1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 19:20 UTC (permalink / raw)
To: karim
Cc: Andrew Morton, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais

Karim Yaghmour wrote:
> > Although IMO this is a bit lame - it is quite possible to go into
> > SexySystemTapGUI, click on a particular kernel file-n-line and have
> > systemtap userspace keep track of that place in the kernel source across
> > many kernel versions: all it needs to do is to remember the file+line and a
> > snippet of the surrounding text, for readjustment purposes.
>
> Sure, if you're a kernel developer, but as I've explained numerous
> times in this thread, there are far more users of tracing than
> kernel developers.
>

This is so true (and the main reason we implemented a trace utility in
SystemTap). Several of the people on my team are _not_ kernel
developers. They do not necessarily know the Linux kernel code well
enough to insert their own instrumentation. On the other hand, they do
possess very good knowledge of things specific to a particular software
stack or a HW subsystem. Structured predefined probe points (dynamic or
static) allow people with limited kernel hacking skills to feed useful
information back to the developers of the kernel.

I agree with Karim that a trace tool (while useful to developers) is
mostly targeted at a non-kernel-developer audience. Such tools are mostly
meant to enhance the communication between developers and regular users.
Any solution that is intended to be a dynamic replacement for LTTng needs
to take these kinds of users into account.

-JRS

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 18:16 ` Karim Yaghmour
2006-09-15 19:20 ` Jose R. Santos
@ 2006-09-15 19:59 ` Andrew Morton
2006-09-15 20:24 ` Karim Yaghmour
1 sibling, 1 reply; 271+ messages in thread
From: Andrew Morton @ 2006-09-15 19:59 UTC (permalink / raw)
To: karim
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 15 Sep 2006 14:16:18 -0400
Karim Yaghmour <karim@opersys.com> wrote:

> > Although IMO this is a bit lame - it is quite possible to go into
> > SexySystemTapGUI, click on a particular kernel file-n-line and have
> > systemtap userspace keep track of that place in the kernel source across
> > many kernel versions: all it needs to do is to remember the file+line and a
> > snippet of the surrounding text, for readjustment purposes.
>
> Sure, if you're a kernel developer, but as I've explained numerous
> times in this thread, there are far more users of tracing than
> kernel developers.

Disagree. I was describing a means by which a set of systemtap trace points
could be described. A means which would allow those tracepoints to be
maintained without human intervention as the kernel source changes (i.e.
using an algorithm and representation similar to patch(1)).

Presumably those tracepoints would have been provided by a kernel developer
and delivered to non-developers, just like static tracepoints.

> > (*) I don't buy the performance arguments: kprobes are quick, and I'd
> > expect that the CPU consumption of the destination of the probe is
> > comparable to or higher than the cost of taking the initial trap.
>
> Please see Mathieu's earlier posting of numbers comparing kprobes to
> static points. Nevertheless, I do not believe that the use of kprobes
> should be pitted against static instrumentation, the two are
> orthogonal.

People have been speeding up kprobes in recent kernels, to avoid the int3 overhead. I don't recall seeing how effective that has been. ^ permalink raw reply [flat|nested] 271+ messages in thread
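[Editorial sketch] The readjustment scheme described in the message above
(remember file+line plus a snippet of the surrounding text, then re-find
the spot in a newer source tree much as patch(1) tolerates line offsets)
could look roughly like this. It is a simplified illustration only: the
function name relocate_probe and its parameters are hypothetical, and the
real patch(1) matches several context lines with fuzz rather than a single
substring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the 0-based index of the line closest to `hint` that contains
 * `snippet`, looking at most `max_offset` lines away in either
 * direction, or -1 if the snippet is not found inside that window.
 */
long relocate_probe(const char *const *lines, size_t nlines,
                    size_t hint, const char *snippet, size_t max_offset)
{
    size_t d;

    assert(lines != NULL && snippet != NULL);

    for (d = 0; d <= max_offset; d++) {
        /* Try `d` lines below the remembered position first... */
        if (hint + d < nlines && strstr(lines[hint + d], snippet))
            return (long)(hint + d);
        /* ...then `d` lines above it. */
        if (d > 0 && d <= hint && hint - d < nlines &&
            strstr(lines[hint - d], snippet))
            return (long)(hint - d);
    }
    return -1;
}
```

A tool built on this idea would store (file, line, snippet) for every probe
point and call something like relocate_probe() against each new kernel
version, flagging for human attention any probe point that returns -1.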
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 19:59 ` Andrew Morton
@ 2006-09-15 20:24 ` Karim Yaghmour
2006-09-15 20:25 ` Thomas Gleixner
0 siblings, 1 reply; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-15 20:24 UTC (permalink / raw)
To: Andrew Morton
Cc: Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar, Mathieu Desnoyers,
linux-kernel, Christoph Hellwig, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Andrew Morton wrote:
> People have been speeding up kprobes in recent kernels, to avoid the int3
> overhead. I don't recall seeing how effective that has been.

I don't want to microdebate this one, but here's the quote from Frank
on the topic of djprobe:
> Smart teams from IBM and Hitachi have been hammering away at this code
> for a year or two now, and yet (roughly) here we are. There have been
> experiments involving plopping branches instead of int3's at probe
> locations, but this is self-modifying code involving multiple
> instructions, and appears to be tricky on SMP/preempt boxes.

The idea behind this mechanism is neat. But every step along the way
there seem to be ever more complex corner cases where it can't be used.

Should this mechanism ever be made to work, the need for static
markup would still be felt however.

Karim

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 20:24 ` Karim Yaghmour
@ 2006-09-15 20:25 ` Thomas Gleixner
0 siblings, 0 replies; 271+ messages in thread
From: Thomas Gleixner @ 2006-09-15 20:25 UTC (permalink / raw)
To: karim
Cc: Andrew Morton, Alan Cox, Roman Zippel, Tim Bird, Ingo Molnar,
Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Ingo Molnar,
Greg Kroah-Hartman, Tom Zanussi, ltt-dev, Michel Dagenais

On Fri, 2006-09-15 at 16:24 -0400, Karim Yaghmour wrote:
> Should this mechanism ever be made to work, the need for static
> markup would still be felt however.

This might apply to some exotic points, but for 98% of the
instrumentation scenarios static markup is not necessary.

tglx

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 18:15 ` Ingo Molnar ` (2 preceding siblings ...) 2006-09-14 19:40 ` Tim Bird @ 2006-09-14 19:47 ` Roman Zippel 2006-09-14 20:24 ` Ingo Molnar 3 siblings, 1 reply; 271+ messages in thread From: Roman Zippel @ 2006-09-14 19:47 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Hi, On Thu, 14 Sep 2006, Ingo Molnar wrote: > > > for me these are all _independent_ grounds for rejection, as a generic > > > kernel infrastructure. > > > > Tracepoints of course need to be managed, but that's true for both > > dynamic and static tracepoints. [...] > > that's not true, and this is the important thing that i believe you are > missing. A dynamic tracepoint is _detached_ from the normal source code > and thus is zero maintainance overhead. You dont have to maintain it > during normal development - only if you need it. You dont see the > dynamic tracepoints in the source code. > > a static tracepoint, once it's in the mainline kernel, is a nonzero > maintainance overhead _until eternity_. I hope you do realize that this a rather selfish point of view. The zero maintainance overhead is a myth, only because _you_ don't have to do it. OTOH maintaining the trace points along with the corresponding source is a barely noticable noise and is certainly less work than having them to maintain separately. > It is a constant visual > hindrance and a constant build-correctness and boot-correctness problem > if you happen to change the code that is being traced by a static > tracepoint. Again, I am talking out of actual experience with static > tracepoints: i frequently break my kernel via static tracepoints and i > have constant maintainance cost from them. 
Sorry, but you're not the only one with actual experience and in my experience the value far outweighs the occasional need for adjustments. If you don't use them, they are of course a nuisance, but is your personal dislike really reason enough to deny others a useful tool? > i am giving a line by line rebuttal of all arguments that come up. > Please be fair and do the same. Here are the arguments again, for a > third time. Thanks! Ingo, maybe you should try to understand the point I'm trying to make? You mostly emphasize your personal dislike of static tracepoints. > > > also, the other disadvantages i listed very much count too. Static > > > tracepoints are fundamentally limited because: > > > > > > - they can only be added at the source code level > > > > > > - modifying them requires a reboot which is not practical in a > > > production environment > > > > > > - there can only be a limited set of them, while many problems need > > > finegrained tracepoints tailored to the problem at hand > > > > > > - conditional tracepoints are typically either nonexistent or very > > > limited. Sorry, but I fail to see the point you're trying to make (beside your personal preferences), none of this is a unsolvable problem, which would prevent making good use of static tracepoints. > > > the kprobes infrastructure, despite being fairly young, is widely > > > available: powerpc, i386, x86_64, ia64 and sparc64. The other > > > architectures are free to implement them too, there's nothing > > > hardware-specific about kprobes and the "porting overhead" is in > > > essence a one-time cost - while for static tracepoints the > > > maintainance overhead goes on forever and scales linearly with the > > > number of tracepoints added. > > > > kprobes are not trivial to implement [...] > > nor are smp-alternatives, which was suggested as a solution to reduce > the overhead of static tracepoints. So what's the point? 
It's a one-off > development overhead that has already been done for all the major > arches. If another arch needs it they can certainly implement it. Static tracepoints don't have to be implemented via alternatives and you continue to ignore that kprobes are nontrivial, you continue to ignore that both can coexist just fine. You just want to force your personal preferences onto others. :-( > it's like arguing against ptrace on the grounds of: "application > developers can add printf if they want to debug their apps, or they can > add static tracepoints too, and besides, ptrace is hard to implement". Sorry, I don't understand this point. Ptrace support would match kernel gdb support, which would be a complete different discussion... > > I also think you highly exaggerate the maintaince overhead of static > > tracepoints, once added they hardly need any maintainance, most of the > > time you can just ignore them. [...] > > hundreds (or possibly thousands) of tracepoints? Have you ever tried to > maintain that? I have and it's a nightmare. _This_ discussion is about a core set of trace points! Yes, you can have thousands of trace points in drivers, but they don't have to be enabled by default and are no reason at all against a few core trace point, which can be used by _all_ archs to trace core events as _cheaply_ as possible. > Even assuming a rich set of hundreds of static tracepoints, it doesnt > even solve the problems at hand: people want to do much more when they > probe the kernel - and today, with DTrace under Solaris people _know_ > that much better tracing _can be done_, and they _demand_ that Linux > adopts an intelligent solution. The clock is ticking for dinosaurs like > static printks and static tracepoints to debug the kernel... Huh? How exactly do static tracepoints prevent you from doing this? Different problems require different solutions, nobody is taking Kprobes away, but why should Kprobes be the only solution? 
bye, Roman ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 19:47 ` Roman Zippel @ 2006-09-14 20:24 ` Ingo Molnar 2006-09-14 20:54 ` Roman Zippel 2006-09-15 1:47 ` Mathieu Desnoyers 0 siblings, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 20:24 UTC (permalink / raw) To: Roman Zippel Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais * Roman Zippel <zippel@linux-m68k.org> wrote: > > > > also, the other disadvantages i listed very much count too. Static > > > > tracepoints are fundamentally limited because: > > > > > > > > - they can only be added at the source code level > > > > > > > > - modifying them requires a reboot which is not practical in a > > > > production environment > > > > > > > > - there can only be a limited set of them, while many problems need > > > > finegrained tracepoints tailored to the problem at hand > > > > > > > > - conditional tracepoints are typically either nonexistent or very > > > > limited. > > Sorry, but I fail to see the point you're trying to make (beside your > personal preferences), none of this is a unsolvable problem, which > would prevent making good use of static tracepoints. those are technical arguments - i'm not sure how you can understand them to be "personal preferences". The only personal preference i have is that in the end a technically most superior solution should be merged. (be that one project or the other, or a hybrid of the two) The analysis of which one is a better solution depends on pros and cons - exactly like the ones listed above. If they are solvable problems then please let me know how you would solve them and when you (or others) would solve them, preferably before merging the code. Right now they are pretty heavy cons as far as LTT goes, so obviously they have a primary impact on the topic at hand (whic is whether to merge LTT or not). 
Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:24 ` Ingo Molnar
@ 2006-09-14 20:54 ` Roman Zippel
2006-09-14 21:08 ` Daniel Walker
2006-09-15 1:47 ` Mathieu Desnoyers
1 sibling, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 20:54 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> those are technical arguments - i'm not sure how you can understand them
> to be "personal preferences". The only personal preference i have is
> that in the end a technically most superior solution should be merged.

Ingo, so far you have made not a single argument why they can't coexist
except for your personal dislike.

bye, Roman

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:54 ` Roman Zippel
@ 2006-09-14 21:08 ` Daniel Walker
2006-09-14 21:30 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Daniel Walker @ 2006-09-14 21:08 UTC (permalink / raw)
To: Roman Zippel
Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

On Thu, 2006-09-14 at 22:54 +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 14 Sep 2006, Ingo Molnar wrote:
>
> > those are technical arguments - i'm not sure how you can understand them
> > to be "personal preferences". The only personal preference i have is
> > that in the end a technically most superior solution should be merged.
>
> Ingo, so far you have made not a single argument why they can't coexist
> except for your personal dislike.

Not to put too fine a point on it, but I think there's not a small number
of us that "prefer" the best solution.

Daniel

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 21:08 ` Daniel Walker
@ 2006-09-14 21:30 ` Roman Zippel
2006-09-14 22:15 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:30 UTC (permalink / raw)
To: Daniel Walker
Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Thu, 14 Sep 2006, Daniel Walker wrote:

> > Ingo, so far you have made not a single argument why they can't coexist
> > except for your personal dislike.
>
> Not to put too fine a point on it, but I think there's not a small number
> of us that "prefer" the best solution.

You can have it.
OTOH I would also like to know what's going on in my m68k kernel without
having to implement some rather complex infrastructure, which I don't
need otherwise. There hasn't been a single argument so far, why we
can't have both.

bye, Roman

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 21:30 ` Roman Zippel
@ 2006-09-14 22:15 ` Ingo Molnar
2006-09-14 23:39 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:15 UTC (permalink / raw)
To: Roman Zippel
Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
>
> On Thu, 14 Sep 2006, Daniel Walker wrote:
>
> > > Ingo, so far you have made not a single argument why they can't coexist
> > > except for your personal dislike.
> >
> > Not to put too fine a point on it, but I think there's not a small number
> > of us that "prefer" the best solution.
>
> You can have it.
> OTOH I would also like to know what's going on in my m68k kernel without
> having to implement some rather complex infrastructure, which I don't
> need otherwise. There hasn't been a single argument so far, why we
> can't have both.

the argument is very simple: LTT creates strong coupling, it is almost a
set of 350+ system-calls, moved into the heart of the kernel. Once moved
in, it's very hard to remove it. "Why did you remove that trace
information, you broke my LTT script!"

While with SystemTap the coupling is a lot smaller. With dynamic tracing
there's no _fundamental requirement_ for _any_ tracepoint to be in the
source code, hence we have the present and future flexibility to
eliminate most of them.

So my point is: shape all the static tracepoints in a "provide data to
dynamic tracers" way. If they are removed (which we should have the
freedom to do), the removal is not a showstopper.

Flexibility of future choices, especially for user/developer-visible
features, is one of the most important factors of kernel maintenance.

Ingo

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 22:15 ` Ingo Molnar
@ 2006-09-14 23:39 ` Roman Zippel
2006-09-14 23:43 ` Ingo Molnar
0 siblings, 1 reply; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 23:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> > OTOH I would also like to know what's going on in my m68k kernel without
> > having to implement some rather complex infrastructure, which I don't
> > need otherwise. There hasn't been a single argument so far, why we
> > can't have both.
>
> the argument is very simple: LTT creates strong coupling, it is almost a
> set of 350+ system-calls, moved into the heart of the kernel. Once moved
> in, it's very hard to remove it. "Why did you remove that trace
> information, you broke my LTT script!"

You are changing the topic. Nobody said the current LTT tracepoints have
to be merged as is. You generalize from a work in progress to static
trace points in general.

> While with SystemTap the coupling is a lot smaller.

What guarantees we don't have similar problems with dynamic
tracepoints? As soon as any tracing is merged, users will have some
kind of expectation and thus you can expect "Why did you change this
source? It broke my SystemTap script!" here as well.

bye, Roman

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 23:39 ` Roman Zippel
@ 2006-09-14 23:43 ` Ingo Molnar
2006-09-15 0:27 ` Roman Zippel
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 23:43 UTC (permalink / raw)
To: Roman Zippel
Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

* Roman Zippel <zippel@linux-m68k.org> wrote:

> > While with SystemTap the coupling is a lot smaller.
>
> What guarantees we don't have similar problems with dynamic
> tracepoints? As soon as any tracing is merged, users will have some
> kind of expectation [...]

because users rely on the functionality, not on the implementation
details. As i outlined it before: with dynamic tracers, static
tracepoints _are not a necessity_. With static tracers, _static
tracepoints are the only game in town_.

i outlined one such specific "removal of static tracepoint" example
already: static trace points at the head/prologue of functions (half of
the existing tracepoints are such). The sock_sendmsg() example i quoted
before is such a case. Those trace points can be replaced with a simple
GCC function attribute, which would cause a 5-byte (or whatever
necessary) NOP to be inserted at the function prologue. The attribute
would be a lot less invasive than an explicit tracepoint (and thus
easier to maintain):

	int __trace function(char arg1, char arg2)
	{
	}

where kprobes can be used to attach a lightweight tracepoint that does a
call, not a break (INT3) instruction. With static tracers we couldn't do
this so we'd have to stick with the static tracepoints forever! It's
always hard to remove features, so we have to make sure we add the
feature that we know is the best long-term solution.

Ingo

^ permalink raw reply [flat|nested] 271+ messages in thread
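[Editorial sketch] The `__trace` annotation in the message above is a
proposal, not an existing GCC feature at the time of this thread. Much
later compilers (GCC 8 and newer, Clang 10 and newer) gained a
`patchable_function_entry` attribute that is close in spirit: it reserves
a requested number of NOP instructions at the function entry, which a
tracer may later overwrite. The mapping below is an assumption made for
illustration only; the function names are hypothetical, and no patching
is performed.

```c
/*
 * Hypothetical spelling of the proposed __trace annotation, mapped onto
 * a compiler attribute that reserves 5 NOP instructions at the function
 * entry. Assumes GCC >= 8 or Clang >= 10.
 */
#define __trace __attribute__((patchable_function_entry(5)))

/* The body is unchanged; only the entry gains patchable padding. */
int __trace traced_add(char arg1, char arg2)
{
    return arg1 + arg2;
}

/* An unannotated function for comparison: no padding is requested. */
int plain_add(char arg1, char arg2)
{
    return arg1 + arg2;
}
```

Both functions behave identically when called. The only difference is in
the generated code, which can be inspected with a disassembler such as
objdump.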
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 23:43 ` Ingo Molnar
@ 2006-09-15 0:27 ` Roman Zippel
0 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-15 0:27 UTC (permalink / raw)
To: Ingo Molnar
Cc: Daniel Walker, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Hi,

On Fri, 15 Sep 2006, Ingo Molnar wrote:

> 	int __trace function(char arg1, char arg2)
> 	{
> 	}
>
> where kprobes can be used to attach a lightweight tracepoint that does a
> call, not a break (INT3) instruction. With static tracers we couldn't do
> this so we'd have to stick with the static tracepoints forever! It's
> always hard to remove features, so we have to make sure we add the
> feature that we know is the best long-term solution.

Where is the proof for that? Why can't the same rules apply to dynamic
and static trace points?

You're also mixing up function tracing with event tracing. Most of the
LTT trace points log rather high-level events, which are rather unlikely
to disappear. It's more likely that the place where they are generated is
moved, and then it's only advantageous if the marker is moved as well at
the same time. OTOH if the actual event really is not generated anymore,
there is also no need for the marker anymore.

bye, Roman

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:24 ` Ingo Molnar 2006-09-14 20:54 ` Roman Zippel @ 2006-09-15 1:47 ` Mathieu Desnoyers 2006-09-15 5:47 ` Vara Prasad 1 sibling, 1 reply; 271+ messages in thread From: Mathieu Desnoyers @ 2006-09-15 1:47 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais * Ingo Molnar (mingo@elte.hu) wrote: > > * Roman Zippel <zippel@linux-m68k.org> wrote: > > > > > > also, the other disadvantages i listed very much count too. Static > > > > > tracepoints are fundamentally limited because: > > > > > [...] > Right now they are > pretty heavy cons as far as LTT goes, so obviously they have a primary > impact on the topic at hand (whic is whether to merge LTT or not). > Ingo, why are you arguing about static instrumentation when I don't submit any static instrumentation in my patch ? You can argue about static VS dynamic instrumentation all you want, but please don't apply this debate to a dicision about including or not a core tracing infrastructure that has nothing to do with the way instrumentation or probes are inserted. Mathieu OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-15 1:47 ` Mathieu Desnoyers @ 2006-09-15 5:47 ` Vara Prasad 0 siblings, 0 replies; 271+ messages in thread From: Vara Prasad @ 2006-09-15 5:47 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Ingo Molnar, Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, systemtap Mathieu Desnoyers wrote: >* Ingo Molnar (mingo@elte.hu) wrote: > > >>* Roman Zippel <zippel@linux-m68k.org> wrote: >> >> >> >>>>>>also, the other disadvantages i listed very much count too. Static >>>>>>tracepoints are fundamentally limited because: >>>>>> >>>>>> >>>>>> >[...] > > >>Right now they are >>pretty heavy cons as far as LTT goes, so obviously they have a primary >>impact on the topic at hand (whic is whether to merge LTT or not). >> >> >> > >Ingo, why are you arguing about static instrumentation when I don't submit any >static instrumentation in my patch ? You can argue about static VS dynamic >instrumentation all you want, but please don't apply this debate to a dicision >about including or not a core tracing infrastructure that has nothing to do >with the way instrumentation or probes are inserted. > >Mathieu > > > > I think Ingo is right in saying what we really need first is a generic mechanism in how to specify static markers in the kernel which can be used to put dynamic probes on demand or use as a real static function calls if one chooses. Once we agree on the marker mechanism dynamic tracing and static tracing can both co-exist happily. Coming to your rest of the patches i really don't think we need whole lot more than the facilities we already got in the kernel. Frank has successfully demonstrated in OLS how one can use static markers by using only existing facilities in the kernel. 
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 17:13 ` Ingo Molnar 2006-09-14 17:55 ` Roman Zippel @ 2006-09-14 18:12 ` Karim Yaghmour 2006-09-14 20:25 ` Martin Bligh 2 siblings, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-14 18:12 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > also, the other disadvantages i listed very much count too. Static > tracepoints are fundamentally limited because: > > - they can only be added at the source code level Non-issue. See below. This is actually a feature, as can be seen by browsing the source code of various subsystems/filesystems/etc. who's authors saw fit to include their own static tracepoints. Darn, they must've been all misguided, so too were those who reviewed the code and let it in. > - modifying them requires a reboot which is not practical in a > production environment Non-issue. See below. > - there can only be a limited set of them, while many problems need > finegrained tracepoints tailored to the problem at hand Non-issue. See below. > - conditional tracepoints are typically either nonexistent or very > limited. I don't get this one. What's a "conditional tracepoint" for you? > for me these are all _independent_ grounds for rejection, as a generic > kernel infrastructure. I've addressed other issues in another posting, but I want to reiterate something here that Roman said that keeps getting forgotten: There is no competition between static and dynamic trace points. They are both useful and complementary. If some set of existing static trace points are insufficient at runtime for you to resolve an issue, nothing precludes you from using the dynamic mechanisms for adding more localized instrumentation. 
Side point: you may be a kernel god, but there are mere mortals out
there who use Linux. The point I've been making for years now is that
there are normal non-kernel-developer users who would legitimately
benefit greatly from having access to tools that generate digested
information regarding key kernel events. You can argue all you want
about maintainability, and I continue to think you're wrong, but you
should know that the development and usefulness of any such tools is
gated by the continued lack of a standard, known-to-be-good source of
key kernel events. And I repeat, the use of dynamic tracing does *not*
solve this issue.

At OLS2005 I had suggested the development of a markers infrastructure
whose users could use it just to mark up their code, with the decision
about which type of instrumentation to attach not being tied to the
markers themselves. At OLS this year a very good talk was given on this
topic by Frank from the systemtap team and it was very well received by
the jam-packed audience.

IOW, while there used to be a time when people pitted static
instrumentation against dynamic instrumentation, there's been an
ever-growing consensus that no such choice need be made.

Thanks,

Karim

^ permalink raw reply [flat|nested] 271+ messages in thread
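[Editorial sketch] The markers idea described in the message above, where
code is merely marked up and the choice of instrumentation is made
separately, might be pictured in userspace as follows. Everything here is
hypothetical (struct trace_marker, TRACE_MARK, attach_probe and so on are
invented for illustration and are not the interface of any posted patch),
and concurrency between attaching a probe and hitting a marker is
ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Signature of whatever instrumentation gets attached later. */
typedef void (*probe_fn)(const char *name, long value);

/* A marker only names an event; it does not say how it is traced. */
struct trace_marker {
    const char *name;
    probe_fn probe;     /* NULL while nothing is attached */
};

#define DEFINE_TRACE_MARKER(id) \
    struct trace_marker marker_##id = { #id, NULL }

/* At the marked spot: call the probe only if one is attached. */
#define TRACE_MARK(id, value)                               \
    do {                                                    \
        probe_fn fn_ = marker_##id.probe;                   \
        if (fn_)                                            \
            fn_(marker_##id.name, (long)(value));           \
    } while (0)

DEFINE_TRACE_MARKER(sched_switch);

static long events_seen;
static long last_value;

/* One possible kind of instrumentation: count events. */
void counting_probe(const char *name, long value)
{
    (void)name;
    events_seen++;
    last_value = value;
}

void attach_probe(struct trace_marker *m, probe_fn fn)
{
    assert(m != NULL);
    m->probe = fn;
}

void detach_probe(struct trace_marker *m)
{
    assert(m != NULL);
    m->probe = NULL;
}

/* Marked-up code: unaware of which probe, if any, is attached. */
void switch_to(long pid)
{
    TRACE_MARK(sched_switch, pid);
}

long probe_events_seen(void) { return events_seen; }
long probe_last_value(void)  { return last_value; }
```

The marked-up function never changes when a different probe is attached,
which is the decoupling the message argues for. A dynamic tracer could use
the same marker record purely as a stable place to insert a probe.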
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 17:13 ` Ingo Molnar
2006-09-14 17:55 ` Roman Zippel
2006-09-14 18:12 ` Karim Yaghmour
@ 2006-09-14 20:25 ` Martin Bligh
2006-09-14 20:34 ` Ingo Molnar
2 siblings, 1 reply; 271+ messages in thread
From: Martin Bligh @ 2006-09-14 20:25 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

> if there are lots of tracepoints (and the union of _all_ useful
> tracepoints that i ever encountered in my life goes into the thousands)
> then the overhead is not zero at all.
>
> also, the other disadvantages i listed very much count too. Static
> tracepoints are fundamentally limited because:
>
>  - they can only be added at the source code level
>
>  - modifying them requires a reboot which is not practical in a
>    production environment
>
>  - there can only be a limited set of them, while many problems need
>    finegrained tracepoints tailored to the problem at hand
>
>  - conditional tracepoints are typically either nonexistent or very
>    limited.
>
> for me these are all _independent_ grounds for rejection, as a generic
> kernel infrastructure.

I don't think anyone is saying that static tracepoints do not have their
limitations, or that dynamic tracepointing is useless. But that's not
the point ... why can't we have one infrastructure that supports both?
Preferably in a fairly simple, consistent way.

M.

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:25 ` Martin Bligh @ 2006-09-14 20:34 ` Ingo Molnar 2006-09-14 20:55 ` Martin Bligh ` (2 more replies) 0 siblings, 3 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 20:34 UTC (permalink / raw) To: Martin Bligh Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche * Martin Bligh <mbligh@mbligh.org> wrote: > >if there are lots of tracepoints (and the union of _all_ useful > >tracepoints that i ever encountered in my life goes into the thousands) > >then the overhead is not zero at all. > > > >also, the other disadvantages i listed very much count too. Static > >tracepoints are fundamentally limited because: > > > > - they can only be added at the source code level > > > > - modifying them requires a reboot which is not practical in a > > production environment > > > > - there can only be a limited set of them, while many problems need > > finegrained tracepoints tailored to the problem at hand > > > > - conditional tracepoints are typically either nonexistent or very > > limited. > > > >for me these are all _independent_ grounds for rejection, as a generic > >kernel infrastructure. > > I don't think anyone is saying that static tracepoints do not have > their limitations, or that dynamic tracepointing is useless. But > that's not the point ... why can't we have one infrastructure that > supports both? Preferably in a fairly simple, consistent way. primarily because i fail to see any property of static tracers that are not met by dynamic tracers. So to me dynamic tracers like SystemTap are a superset of static tracers. 
So my position is that what we should concentrate on is to make the life of dynamic tracers easier (be that a handful of generic, parametric hooks that gather debuginfo information and add NOPs for easy patching), while realizing that static tracers have no advantage over dynamic tracers. i.e. why add infrastructure for the sake of something that is clearly inferior? I have no problem with adding infrastructure for SystemTap, but i am asking the question: is it worth adding a static tracer? I would of course accept static tracers too if someone proved it that they offer something that dynamic tracers cannot do. (Just like i would accept the reintroduction of the Big Kernel Lock too, if someone proved it that it's the right thing to do.) Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:34 ` Ingo Molnar @ 2006-09-14 20:55 ` Martin Bligh 2006-09-14 21:31 ` Ingo Molnar 2006-09-19 12:08 ` Christoph Hellwig 2006-09-14 21:07 ` Roman Zippel 2006-09-15 9:29 ` Jes Sorensen 2 siblings, 2 replies; 271+ messages in thread From: Martin Bligh @ 2006-09-14 20:55 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche Ingo Molnar wrote: > * Martin Bligh <mbligh@mbligh.org> wrote: > > >>>if there are lots of tracepoints (and the union of _all_ useful >>>tracepoints that i ever encountered in my life goes into the thousands) >>>then the overhead is not zero at all. >>> >>>also, the other disadvantages i listed very much count too. Static >>>tracepoints are fundamentally limited because: >>> >>> - they can only be added at the source code level >>> >>> - modifying them requires a reboot which is not practical in a >>> production environment >>> >>> - there can only be a limited set of them, while many problems need >>> finegrained tracepoints tailored to the problem at hand >>> >>> - conditional tracepoints are typically either nonexistent or very >>> limited. >>> >>>for me these are all _independent_ grounds for rejection, as a generic >>>kernel infrastructure. >> >>I don't think anyone is saying that static tracepoints do not have >>their limitations, or that dynamic tracepointing is useless. But >>that's not the point ... why can't we have one infrastructure that >>supports both? Preferably in a fairly simple, consistent way. > > > primarily because i fail to see any property of static tracers that are > not met by dynamic tracers. So to me dynamic tracers like SystemTap are > a superset of static tracers. 1. They're harder to maintain out of tree. 2. they're written in some jibberish awk crap 3. They're slower. 
If you're doing thousands of tracepoints a second, into a circular 8GB log buffer, that *does* matter. You want to perturb what you're measuring as little as possible. If you're running across thousands of systems, in live production, in order to catch a rare race condition, the performance does matter. > So my position is that what we should concentrate on is to make the life > of dynamic tracers easier (be that a handful of generic, parametric > hooks that gather debuginfo information and add NOPs for easy patching), > while realizing that static tracers have no advantage over dynamic > tracers. I'm confused. You're saying that the dynamic tracers need help by adding some static data to the kernel, and yet at the same time rejecting static additions to the kernel on the grounds they have no value??? Perhaps we're just meaning different things by static tracing. To me, what is important is that there is a well-defined place in the source code where the data needed to be logged, and the exact place to log it at, is defined. If all that macro does to the compilation is add a couple of nops, and make an entry in a symbol data, or other debug data, for something to hook into later that's *fine*. The point is to maintain the location and intelligence about *what* to trace. Perhaps I'm calling that static, and you're calling it dynamic? Would explain why we're disagreeing ;-) Seems to be exactly what you're suggesting above? If we want it to be superfast, we could compile with a different config option to insert some tracing statically in there or something, but I agree it should not be the default. > i.e. why add infrastructure for the sake of something that is clearly > inferior? I have no problem with adding infrastructure for SystemTap, > but i am asking the question: is it worth adding a static tracer? Yes ;-) Realise that your usage model is not exactly the same as everyone else's, and I don't give a damn if I have to recompile. I realise other people do, but ....
> I would of course accept static tracers too if someone proved it that > they offer something that dynamic tracers cannot do. Can you *really* trace *any* variable (stack variables, etc) at *any* point within *any* function with kprobes? It didn't do that before, and I find it hard to see how it could, given compiler optimizations, etc. > (Just like i would accept the reintroduction of the Big Kernel Lock too, > if someone proved it that it's the right thing to do.) Surely it's still there at the moment? ;-) M. ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:55 ` Martin Bligh @ 2006-09-14 21:31 ` Ingo Molnar 2006-09-14 22:25 ` Martin Bligh 2006-09-19 12:08 ` Christoph Hellwig 1 sibling, 1 reply; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 21:31 UTC (permalink / raw) To: Martin Bligh Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche * Martin Bligh <mbligh@mbligh.org> wrote: > > primarily because i fail to see any property of static tracers that > > are not met by dynamic tracers. So to me dynamic tracers like > > SystemTap are a superset of static tracers. > > 1. They're harder to maintain out of tree. as i mentioned before, SystemTap should be in tree. Relayfs was added for the sake of SystemTap for example, i have no problem with moving SystemTap into the tree either. > 2. they're written in some jibberish awk crap You can write embedded-C SystemTap scripts too. There's an "EMBEDDED C" section in "man stap". > 3. They're slower. If you're doing thousands of tracepoints a second, > into a circular 8GB log buffer, that *does* matter. You want > to peturb what you're measuring as little as possible. i very much agree that they should become as fast as possible. So to rephrase the question: can we make dynamic tracepoints as fast (or nearly as fast) as static tracepoints? If yes, should we care about static tracers at all? > >So my position is that what we should concentrate on is to make the life > >of dynamic tracers easier (be that a handful of generic, parametric > >hooks that gather debuginfo information and add NOPs for easy patching), > >while realizing that static tracers have no advantage over dynamic > >tracers. > > I'm confused. 
You're saying that the dynamic tracers need help by > adding some static data to the kernel, and yet at the same time > rejecting static additions to the kernel on the grounds they have no > value??? no. I'm saying that dynamic tracers are fundamentally more advanced, and that _iff_ we are to add static info to the kernel we should add it _for the sole sake of speeding up dynamic tracers_. If static tracers can live off the same hooks then fine, but we should architect primarily for the needs of the dynamic tracers. > Perhaps we're just meaning different things by static tracing. To me, > what is important is that there is a well-defined place in the source > code where the data needed to be logged, and the exact place to log it > at, is defined. If all that macro does to the compilation is add a > couple of nops, and make an entry in a symbol data, or other debug > data, for something to hook into later that's *fine*. The point is to > maintain the location and intelligence about *what* to trace. ok. For me 'static tracepoints' are like the sort of stuff that LTT adds: funky function names littering the tree. i see the point behind 'data extraction point' hooks mentioned by you as a compromise, which incidentally will also speed up dynamic tracepoints to the level of static tracepoints. But they should be very much constructed as data extraction points for the purposes of dynamic tracers. (which the LTT hooks currently are not) > If we want it to be superfast, we could compile with a different > config option to insert some tracing statically in there or something, > but I agree it should not be the default. for a dynamic tracer all that is needed is a 5-byte NOP (even on 64-bit), and the availability of all the data. Maybe even a function call that can be patched out after bootup, with NOPs. But the current LTT stuff has lots of inlined crap that just bloats the kernel. > >i.e. why add infrastructure for the sake of something that is clearly > >inferior? 
I have no problem with adding infrastructure for SystemTap, > >but i am asking the question: is it worth adding a static tracer? > > Yes ;-) Realise that your usage model is not exactly the same as > everyone else's, and I don't give a damn if I have to recompile. I > realise other people do, but .... So you dont care about recompiling: that's fine - but others care, so as long as all your needs are met (which we are working on meeting :-) then we'll go for the solution that is better - instead of having some dual debugging infrastructure. > > (Just like i would accept the reintroduction of the Big Kernel Lock > > too, if someone proved it that it's the right thing to do.) > > Surely it's still there at the moment? ;-) no - at least for me it's the Big Kernel Semaphore ;-) Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 21:31 ` Ingo Molnar @ 2006-09-14 22:25 ` Martin Bligh 2006-09-14 22:36 ` Ingo Molnar 0 siblings, 1 reply; 271+ messages in thread From: Martin Bligh @ 2006-09-14 22:25 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche Ingo Molnar wrote: > * Martin Bligh <mbligh@mbligh.org> wrote: > > >>>primarily because i fail to see any property of static tracers that >>>are not met by dynamic tracers. So to me dynamic tracers like >>>SystemTap are a superset of static tracers. >> >>1. They're harder to maintain out of tree. > > as i mentioned before, SystemTap should be in tree. Relayfs was added > for the sake of SystemTap for example, i have no problem with moving > SystemTap into the tree either. Right, but I'm not talking about the infrastructure, I'm talking about the placement of the trace points, and the local variables they need to access in order to get useful data. >>2. they're written in some jibberish awk crap > > You can write embedded-C SystemTap scripts too. There's an "EMBEDDED C" > section in "man stap". OK, that helps - thanks. Will try to find some time to go back and look again. > >>3. They're slower. If you're doing thousands of tracepoints a second, >> into a circular 8GB log buffer, that *does* matter. You want >> to peturb what you're measuring as little as possible. > > i very much agree that they should become as fast as possible. So to > rephrase the question: can we make dynamic tracepoints as fast (or > nearly as fast) as static tracepoints? If yes, should we care about > static tracers at all? Depends how many nops you're willing to add, I guess. Anything, even the static tracepoints really needs at least a branch to be useful, IMHO. 
At least for what I've been doing with it, you need to stop the data flow after a while (when the event you're interested in happens, I'm using it like a flight data recorder, so we can go back and do postmortem on what went wrong). I should imagine branch prediction makes it very cheap on most modern CPUs, but don't have hard data to hand. OTOH, if you don't know in advance how big the tracing point is (ie what it's having to do within there to log), you have a problem. I believe the usual way kprobes/systemtap does this is to do a jump out of line, which is significantly slower. If we could get a good estimate on how large the trace point was *likely* to be, maybe we could leave enough space in nop's inline? OTOH, if we do that a lot, we end up increasing code size .... So I suspect the correct compromise is to have macros that normally are extremely non-invasive, either just entries in a data table (no code impact) or that plus enough nops to do a jump (as I understand it, you sometimes need the nops because it's not always possible to relocate certain bits of code ... perhaps we can detect when?). But it *will* be slower at trace time, because we're still jumping. OTOH, if you want it to be fast, you recompile with the "I actually need tracing to be superfast" option, and it leaves more space. Seems to give the best of both worlds, as needed. >>>So my position is that what we should concentrate on is to make the life >>>of dynamic tracers easier (be that a handful of generic, parametric >>>hooks that gather debuginfo information and add NOPs for easy patching), >>>while realizing that static tracers have no advantage over dynamic >>>tracers. >> >>I'm confused. You're saying that the dynamic tracers need help by >>adding some static data to the kernel, and yet at the same time >>rejecting static additions to the kernel on the grounds they have no >>value??? > > no. 
I'm saying that dynamic tracers are fundamentally more advanced, and > that _iff_ we are to add static info to the kernel we should add it _for > the sole sake of speeding up dynamic tracers_. If static tracers can > live off the same hooks then fine, but we should architect primarily for > the needs of the dynamic tracers. OK. Not too fussed about the exact details ... would it be fair to say that you agree that we may need to add *some* instrumentation / hooks into the codebase in order to locate where and what to trace? Beyond that, it seems like little bits of implementation detail to me. What we ended up with was basically: ktrace(major_type, minor_type, data, ...) The minor and major types were enums, but given descriptive names, they actually seem to help, rather than hinder, code readability. I'd send out the code, but it needs a major cleanup first ;-) > ok. For me 'static tracepoints' are like the sort of stuff that LTT > adds: funky function names littering the tree. I think it can be done in different ways, some cleaner than others. What's important, to me at least, is that the tags are in tree to make them maintained along with the code, and we can get at all local variable data, etc, easily. Obviously, beyond that, it should be as clean and uninvasive as possible. Maybe others have different views, not sure. > i see the point behind 'data extraction point' hooks mentioned by you as > a compromise, which incidentally will also speed up dynamic tracepoints > to the level of static tracepoints. But they should be very much > constructed as data extraction points for the purposes of dynamic > tracers. (which the LTT hooks currently are not) OK. Not sure I care too much what the purpose is, as long as they tag where and what needs extracting, people can use them for whatever ... 
as handbags to dance round, as far as I care ;-) >>If we want it to be superfast, we could compile with a different >>config option to insert some tracing statically in there or something, >>but I agree it should not be the default. > > for a dynamic tracer all that is needed is a 5-byte NOP (even on > 64-bit), and the availability of all the data. Maybe even a function > call that can be patched out after bootup, with NOPs. But the current > LTT stuff has lots of inlined crap that just bloats the kernel. OK. But I don't think that's inherent to tracing hooks ... sounds like more of an implementation detail? Worst case, it's a config option as to whether to put a nop or inlined stuff in there, if we decide that the extra speed of not doing a jump may be important? > So you dont care about recompiling: that's fine - but others care, so as > long as all your needs are met (which we are working on meeting :-) then > we'll go for the solution that is better - instead of having some dual > debugging infrastructure. Sounds absolutely correct to me. Even if we had some static points, I think we'd still want the ability to mix both in *one* infrastructure. >>>(Just like i would accept the reintroduction of the Big Kernel Lock >>> too, if someone proved it that it's the right thing to do.) >> >>Surely it's still there at the moment? ;-) > > no - at least for me it's the Big Kernel Semaphore ;-) Ah, semantics ;-) Fair enough. It still needs to die though ... M. ^ permalink raw reply [flat|nested] 271+ messages in thread
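The `ktrace(major_type, minor_type, data, ...)` hook and the "flight data recorder" usage Martin describes in the message above can be sketched roughly as follows. This is only a minimal userspace illustration under stated assumptions: the enum values, the fixed-size `ktrace_buf` ring, and the `ktrace_stop()` helper are hypothetical names invented here, since the actual code was never posted to the thread. It shows the single branch that stops the data flow once the event of interest has happened, and the wrap-around that keeps only the most recent events.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event classes; the thread only says both types were enums. */
enum ktrace_major { KT_SCHED, KT_MM };
enum ktrace_minor { KT_WAKEUP, KT_RECLAIM_FAIL };

struct ktrace_event {
    enum ktrace_major major;
    enum ktrace_minor minor;
    uint32_t data;
};

#define KTRACE_SLOTS 8          /* tiny on purpose; a real log would be far larger */

static struct ktrace_event ktrace_buf[KTRACE_SLOTS];
static size_t ktrace_next;      /* total number of events logged so far */
static int ktrace_enabled = 1;  /* cleared to freeze the log for postmortem */

/* Log one event, overwriting the oldest slot once the ring is full. */
static inline void ktrace(enum ktrace_major major, enum ktrace_minor minor,
                          uint32_t data)
{
    struct ktrace_event *e;

    if (!ktrace_enabled)        /* the one branch every tracepoint pays for */
        return;

    e = &ktrace_buf[ktrace_next++ % KTRACE_SLOTS];
    e->major = major;
    e->minor = minor;
    e->data = data;
}

/* Freeze the recorder so the events leading up to a failure are preserved. */
static inline void ktrace_stop(void)
{
    ktrace_enabled = 0;
}
```

A call site would look like `ktrace(KT_SCHED, KT_WAKEUP, pid);`, and the code that detects the failure would call `ktrace_stop()` before the interesting history is overwritten. A real kernel implementation would need per-CPU buffers and timestamps, which this sketch leaves out.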
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 22:25 ` Martin Bligh @ 2006-09-14 22:36 ` Ingo Molnar 2006-09-14 22:59 ` Martin Bligh 2006-09-15 15:37 ` Michel Dagenais 0 siblings, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 22:36 UTC (permalink / raw) To: Martin Bligh Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche * Martin Bligh <mbligh@mbligh.org> wrote: > > i very much agree that they should become as fast as possible. So to > > rephrase the question: can we make dynamic tracepoints as fast (or > > nearly as fast) as static tracepoints? If yes, should we care about > > static tracers at all? > > Depends how many nops you're willing to add, I guess. Anything, even > the static tracepoints really needs at least a branch to be useful, > IMHO. At least for what I've been doing with it, you need to stop the > data flow after a while (when the event you're interested in happens, > I'm using it like a flight data recorder, so we can go back and do > postmortem on what went wrong). I should imagine branch prediction > makes it very cheap on most modern CPUs, but don't have hard data to > hand. only 5 bytes of NOP are needed by default, so that a kprobe can insert a call/callq instruction. The easiest way in practice is to insert a _single_, unconditional function call that is patched out to NOPs upon its first occurrence (doing this is not a performance issue at all). That way the only cost is the NOP and the function parameter preparation side-effects. (which might or might not be significant - with register calling conventions and most parameters being readily available it should be small.) note that such a limited, minimally invasive 'data extraction point' infrastructure is not actually what the LTT patches are doing. It's not even close, and i think you'll be surprised. 
Let me quote from the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version submitted to lkml - although no specific tracepoints were submitted):

+/* Event wakeup logging function */
+static inline void trace_process_wakeup(
+        unsigned int lttng_param_pid,
+        int lttng_param_state)
+#if (!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+{
+}
+#else
+{
+    unsigned int index;
+    struct ltt_channel_struct *channel;
+    struct ltt_trace_struct *trace;
+    void *transport_data;
+    char *buffer = NULL;
+    size_t real_to_base = 0; /* The buffer is allocated on arch_size alignment */
+    size_t *to_base = &real_to_base;
+    size_t real_to = 0;
+    size_t *to = &real_to;
+    size_t real_len = 0;
+    size_t *len = &real_len;
+    size_t reserve_size;
+    size_t slot_size;
+    size_t align;
+    const char *real_from;
+    const char **from = &real_from;
+    u64 tsc;
+    size_t before_hdr_pad, after_hdr_pad, header_size;
+
+    if(ltt_traces.num_active_traces == 0) return;
+
+    /* For each field, calculate the field size. */
+    /* size = *to_base + *to + *len */
+    /* Assume that the padding for alignment starts at a
+     * sizeof(void *) address. */
+
+    *from = (const char*)&lttng_param_pid;
+    align = sizeof(unsigned int);
+
+    if(*len == 0) {
+        *to += ltt_align(*to, align); /* align output */
+    } else {
+        *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+    }
+
+    *len += sizeof(unsigned int);
+
+    *from = (const char*)&lttng_param_state;
+    align = sizeof(int);
+
+    if(*len == 0) {
+        *to += ltt_align(*to, align); /* align output */
+    } else {
+        *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+    }
+
+    *len += sizeof(int);
+
+    reserve_size = *to_base + *to + *len;
+    preempt_disable();
+    ltt_nesting[smp_processor_id()]++;
+    index = ltt_get_index_from_facility(ltt_facility_process_2905B6EB,
+                        event_process_wakeup);
+
+    list_for_each_entry_rcu(trace, &ltt_traces.head, list) {
+        if(!trace->active) continue;
+
+        channel = ltt_get_channel_from_index(trace, index);
+
+        slot_size = 0;
+        buffer = ltt_reserve_slot(trace, channel, &transport_data,
+            reserve_size, &slot_size, &tsc,
+            &before_hdr_pad, &after_hdr_pad, &header_size);
+        if(!buffer) continue; /* buffer full */
+
+        *to_base = *to = *len = 0;
+
+        ltt_write_event_header(trace, channel, buffer,
+            ltt_facility_process_2905B6EB, event_process_wakeup,
+            reserve_size, before_hdr_pad, tsc);
+        *to_base += before_hdr_pad + after_hdr_pad + header_size;
+
+        *from = (const char*)&lttng_param_pid;
+        align = sizeof(unsigned int);
+
+        if(*len == 0) {
+            *to += ltt_align(*to, align); /* align output */
+        } else {
+            *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+        }
+
+        *len += sizeof(unsigned int);
+
+        /* Flush pending memcpy */
+        if(*len != 0) {
+            memcpy(buffer+*to_base+*to, *from, *len);
+            *to += *len;
+            *len = 0;
+        }
+
+        *from = (const char*)&lttng_param_state;
+        align = sizeof(int);
+
+        if(*len == 0) {
+            *to += ltt_align(*to, align); /* align output */
+        } else {
+            *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+        }
+
+        *len += sizeof(int);
+
+        /* Flush pending memcpy */
+        if(*len != 0) {
+            memcpy(buffer+*to_base+*to, *from, *len);
+            *to += *len;
+            *len = 0;
+        }
+
+        ltt_commit_slot(channel, &transport_data, buffer, slot_size);
+
+    }
+
+    ltt_nesting[smp_processor_id()]--;
+    preempt_enable_no_resched();
+}
+#endif //(!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+

believe it or not, this is inlined into: kernel/sched.c ...

'enuff said. LTT is so far from being even considerable that it's not even funny.

Ingo

^ permalink raw reply [flat|nested] 271+ messages in thread
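The 5-byte figure Ingo gives in the message above comes from the x86 near-call encoding: one opcode byte (`0xE8`) followed by a 32-bit displacement measured from the end of the instruction. The sketch below illustrates only that arithmetic on a plain byte array in userspace. The `probe_arm()` and `probe_disarm()` helpers are hypothetical names, not kprobes API, and a real kernel would additionally have to make the text page writable and synchronize other CPUs, none of which is modelled here.

```c
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 5  /* length of both the NOP and a near call on x86 */

/* A standard 5-byte x86 NOP: nopl 0x0(%rax,%rax,1). */
static const uint8_t nop5[PATCH_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Encode "call target" as it would appear at address `site`.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_call(uint8_t out[PATCH_LEN], uint64_t site, uint64_t target)
{
    /* Displacement is relative to the address of the next instruction. */
    int64_t rel = (int64_t)(target - (site + PATCH_LEN));
    uint32_t disp;

    if (rel > INT32_MAX || rel < INT32_MIN)
        return -1;

    disp = (uint32_t)(int32_t)rel;
    out[0] = 0xe8;                     /* CALL rel32 opcode */
    out[1] = disp & 0xff;              /* displacement, little-endian */
    out[2] = (disp >> 8) & 0xff;
    out[3] = (disp >> 16) & 0xff;
    out[4] = (disp >> 24) & 0xff;
    return 0;
}

/* Hypothetical helper: overwrite a probe site with a call to `handler`. */
static int probe_arm(uint8_t *code, uint64_t site, uint64_t handler)
{
    uint8_t insn[PATCH_LEN];

    if (encode_call(insn, site, handler) != 0)
        return -1;
    memcpy(code, insn, PATCH_LEN);
    return 0;
}

/* Hypothetical helper: restore the 5-byte NOP at a probe site. */
static void probe_disarm(uint8_t *code)
{
    memcpy(code, nop5, PATCH_LEN);
}
```

Because the NOP and the call are the same length, arming and disarming never moves the surrounding instructions, which is why no code relocation is needed when the NOP is already in place.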
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 22:36 ` Ingo Molnar @ 2006-09-14 22:59 ` Martin Bligh 2006-09-14 23:19 ` Ingo Molnar 2006-09-15 7:00 ` Vara Prasad 2006-09-15 15:37 ` Michel Dagenais 1 sibling, 2 replies; 271+ messages in thread From: Martin Bligh @ 2006-09-14 22:59 UTC (permalink / raw) To: Ingo Molnar Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche Ingo Molnar wrote: > * Martin Bligh <mbligh@mbligh.org> wrote: > >>>i very much agree that they should become as fast as possible. So to >>>rephrase the question: can we make dynamic tracepoints as fast (or >>>nearly as fast) as static tracepoints? If yes, should we care about >>>static tracers at all? >> >>Depends how many nops you're willing to add, I guess. Anything, even >>the static tracepoints really needs at least a branch to be useful, >>IMHO. At least for what I've been doing with it, you need to stop the >>data flow after a while (when the event you're interested in happens, >>I'm using it like a flight data recorder, so we can go back and do >>postmortem on what went wrong). I should imagine branch prediction >>makes it very cheap on most modern CPUs, but don't have hard data to >>hand. > > only 5 bytes of NOP are needed by default, so that a kprobe can insert a > call/callq instruction. The easiest way in practice is to insert a > _single_, unconditional function call that is patched out to NOPs upon > its first occurance (doing this is not a performance issue at all). That > way the only cost is the NOP and the function parameter preparation > side-effects. (which might or might not be significant - with register > calling conventions and most parameters being readily available it > should be small.) 
> > note that such a limited, minimally invasive 'data extraction point' > infrastructure is not actually what the LTT patches are doing. It's not > even close, and i think you'll be surprised. Let me quote from the > latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version > submitted to lkml - although no specific tracepoints were submitted): OK, I grant you that's pretty scary ;-) However, it's not the only way to do it. Most things we're using write a statically sized 64-bit event into a relayfs buffer, with a timestamp, a minor and major event type, and a byte of data payload. > believe it or not, this is inlined into: kernel/sched.c ... > > 'enuff said. LTT is so far from being even considerable that it's not > even funny. Particularly if we're doing more complex things like that, I'd agree that the overhead of doing the out of line jump is non-existent by comparison. Even with the relayfs logging alone, perhaps the jump is not that heavy ... hmmm. If we put the NOPs in (at least as an option on some architectures) from a macro, you don't really need the full kprobes implemented to do tracing, even ... just overwrite the nops with a jump, so presumably would be easier to port. However, not sure how local variable data is specified in that case ... perhaps the kprobes guys know better. Most of the complexity seemed to be with relocating existing code because you didn't have nops. To me, the main thing is to have hooks for at least some of the basic needs maintained in-kernel - from the dtrace paper Val pointed me to, that seems to be exactly what they do too, and it integrates with the newly added dynamic ones where necessary. Plus I hate the whole awk thing, and general complexity of systemtap, but we can probably avoid that easily enough - either the embedded C option you mentioned, or just a different definition for the same hook macros under a config option. So perhaps it'll all work. 
Still need a little bit of data maintained in tree though. M. ^ permalink raw reply [flat|nested] 271+ messages in thread
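The "statically sized 64-bit event" Martin mentions in the message above (a timestamp, a major and a minor event type, and one byte of payload) could be packed along the following lines. The exact bit layout was not given in the thread, so the split used here (40 bits of timestamp, then 8 bits each for major type, minor type and payload) and all the `ev_*` names are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Assumed layout, high to low bits: 40-bit timestamp | major | minor | payload. */
#define EV_TS_BITS 40
#define EV_TS_MASK ((UINT64_C(1) << EV_TS_BITS) - 1)

/* Pack one event into a single 64-bit word; the timestamp is truncated to 40 bits. */
static inline uint64_t ev_pack(uint64_t ts, uint8_t major, uint8_t minor,
                               uint8_t payload)
{
    return ((ts & EV_TS_MASK) << 24) |
           ((uint64_t)major << 16) |
           ((uint64_t)minor << 8) |
           (uint64_t)payload;
}

/* Accessors for the individual fields of a packed event. */
static inline uint64_t ev_ts(uint64_t ev)      { return ev >> 24; }
static inline uint8_t  ev_major(uint64_t ev)   { return (uint8_t)(ev >> 16); }
static inline uint8_t  ev_minor(uint64_t ev)   { return (uint8_t)(ev >> 8); }
static inline uint8_t  ev_payload(uint64_t ev) { return (uint8_t)ev; }
```

A fixed-size record like this means the logging path needs no length calculation or alignment handling: it is a single 8-byte store into the buffer, which is the contrast Martin is drawing with the generated LTT code quoted earlier.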
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 22:59 ` Martin Bligh @ 2006-09-14 23:19 ` Ingo Molnar 2006-09-15 0:19 ` Nicholas Miell 2006-09-15 1:04 ` Martin J. Bligh 2006-09-15 7:00 ` Vara Prasad 1 sibling, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 23:19 UTC (permalink / raw) To: Martin Bligh Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche * Martin Bligh <mbligh@mbligh.org> wrote: > > note that such a limited, minimally invasive 'data extraction point' > > infrastructure is not actually what the LTT patches are doing. It's > > not even close, and i think you'll be surprised. Let me quote from > > the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same > > version submitted to lkml - although no specific tracepoints were > > submitted): > > OK, I grant you that's pretty scary ;-) However, it's not the only way > to do it. Most things we're using write a statically sized 64-bit > event into a relayfs buffer, with a timestamp, a minor and major event > type, and a byte of data payload. oh, no need to tell me. I wrote ktrace 10 years ago, iotrace 8 years ago and latency-trace 2 years ago. (The latter even does extensive mcount based tracing, which is as demanding on the ringbuffer as it gets - on my testbox i routinely get 10-20 million trace events per second, where each trace entry includes: type, cpu, flags, preempt_count, pid, timestamp and 4 words of arbitrary payload, all fit into 32 bytes. It has static tracepoints too, in addition to the 20,000-40,000 mcount tracepoints a typical kernel has.) So i think i know the advantages and disadvantages of static tracers, their maintenance and performance impact. 
but i think (and i think now you'll be surprised) the way to go is to do all this in SystemTap ;-) If we add any static points to the kernel then it should have a pure 'local data preparation for extraction' purpose - nothing more. Static tracing can be built around that too, but at that point it will be unnecessary because SystemTap will be able to do that too, with the same (or better, considering the LTT mess) performance. i.e. we should have macros to prepare local information, with macro arities of 2, 3, 4 and 5: _(name, data1); __(name, data1, data2); ___(name, data1, data2, data3); ____(name, data1, data2, data3, data4); that and nothing more. But no guarantees that these trace points will always be there and usable for static tracers: for example about 50% of all tracepoints can be eliminated via a function attribute. (which function attribute tells GCC to generate a 5-byte NOP as the first instruction of the function prologue.) That will be invariant to things like function renames, etc. > So perhaps it'll all work. Still need a little bit of data maintained > in tree though. ok. And i think SystemTap itself should be in tree too, with a couple of examples and helper scripts all around tracing and probing - and of course an LTT-compatible trace output so that all the nice LTT userspace code and visualization can live on. Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
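The fixed-arity "local data preparation" macros Ingo proposes in the message above could look something like the sketch below. It is an assumption-laden illustration, not his design: the names `TRACE_DATA1`/`TRACE_DATA2`, the `trace_hook` function pointer and the `record_hook()` consumer are all hypothetical, and a function pointer test stands in for the NOP patching he has in mind. The point it shows is that the call site only gathers a name and a few values, and everything about what is done with them lives in whichever tracer attaches to the hook.

```c
#include <stddef.h>

/* Signature of whatever tracer attaches to the extraction points. */
typedef void (*trace_hook_t)(const char *name, const unsigned long *args,
                             size_t nargs);

static trace_hook_t trace_hook;   /* NULL means no tracer is attached */

static inline void trace_emit(const char *name, const unsigned long *args,
                              size_t nargs)
{
    if (trace_hook)
        trace_hook(name, args, nargs);
}

/* Hypothetical fixed-arity data extraction points. */
#define TRACE_DATA1(name, a)                                         \
    do {                                                             \
        const unsigned long args_[] = { (unsigned long)(a) };        \
        trace_emit(#name, args_, 1);                                 \
    } while (0)

#define TRACE_DATA2(name, a, b)                                      \
    do {                                                             \
        const unsigned long args_[] = { (unsigned long)(a),          \
                                        (unsigned long)(b) };        \
        trace_emit(#name, args_, 2);                                 \
    } while (0)

/* Example consumer: remembers the most recent event it was handed. */
static const char *last_name;
static unsigned long last_args[4];
static size_t last_nargs;

static void record_hook(const char *name, const unsigned long *args,
                        size_t nargs)
{
    size_t i;

    last_name = name;
    last_nargs = nargs > 4 ? 4 : nargs;
    for (i = 0; i < last_nargs; i++)
        last_args[i] = args[i];
}
```

With `trace_hook = record_hook;` set, a call such as `TRACE_DATA2(wakeup, pid, state);` hands both values to the consumer, and with the hook left at NULL the call site costs one pointer test.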
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 23:19 ` Ingo Molnar @ 2006-09-15 0:19 ` Nicholas Miell 2006-09-15 1:04 ` Martin J. Bligh 1 sibling, 0 replies; 271+ messages in thread From: Nicholas Miell @ 2006-09-15 0:19 UTC (permalink / raw) To: Ingo Molnar Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche On Fri, 2006-09-15 at 01:19 +0200, Ingo Molnar wrote: > but i think (and i think now you'll be surprised) the way to go is to do > all this in SystemTap ;-) If we add any static points to the kernel then > it should have a pure 'local data preparation for extraction' purpose - > nothing more. Static tracing can be built around that too, but at that > point it will be unnecessary because SystemTap will be able to do that > too, with the same (or better, considering the LTT mess) performance. > > i.e. we should have macros to prepare local information, with macro > arities of 2, 3, 4 and 5: > > _(name, data1); > __(name, data1, data2); > ___(name, data1, data2, data3); > ____(name, data1, data2, data3, data4); > > that and nothing more. But no guarantees that these trace points will > always be there and usable for static tracers: for example about 50% of > all tracepoints can be eliminated via a function attribute. (which > function attribute tells GCC to generate a 5-byte NOP as the first > instruction of the function prologue.) That will be invariant to things > like function renames, etc. Another interesting idea would be the addition to gcc of a: __builtin_trace_point(char *name, ...) It would output a function call sized NOP at its call site, and store in another section the trace point name, location, and (this is the important part) a series of DWARF expressions to reconstruct the trace point's argument list from the stack frame and saved registers. 
This would completely eliminate the argument passing overhead of a
patched-out function call in the cases where the trace point takes
arguments.

It'd also make your __trace function attribute unnecessary, because gcc
could presumably figure out that the trace point is at the beginning of
the function.

It "only" requires compiler support on every architecture that the
kernel cares about and compiler upgrades for everyone who wants to use
static trace points, which is no mean feat.

(Roman Zippel was trimmed from the CC list because his server is
rejecting mail from me and/or Comcast. If the first attempts actually
make it through and this is yet another duplicate, sorry.)

--
Nicholas Miell <nmiell@comcast.net>

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 23:19 ` Ingo Molnar
2006-09-15 0:19 ` Nicholas Miell
@ 2006-09-15 1:04 ` Martin J. Bligh
2006-09-15 12:38 ` Ingo Molnar
1 sibling, 1 reply; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-15 1:04 UTC (permalink / raw)
To: Ingo Molnar
Cc: Roman Zippel, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais, fche

> i.e. we should have macros to prepare local information, with macro
> arities of 2, 3, 4 and 5:
>
>   _(name, data1);
>   __(name, data1, data2);
>   ___(name, data1, data2, data3);
>   ____(name, data1, data2, data3, data4);

Personally I think that's way more visually offensive than something
that looks like a function call, but still ;-) We do it as a caps
macro

  KTRACE(foo, bar)

internally, which I suppose makes it not look like a function call.
But at the end of the day, it's all just a matter of visual taste,
what's actually in there is way more important.

> that and nothing more. But no guarantees that these trace points will
> always be there and usable for static tracers: for example about 50% of
> all tracepoints can be eliminated via a function attribute. (which
> function attribute tells GCC to generate a 5-byte NOP as the first
> instruction of the function prologue.) That will be invariant to things
> like function renames, etc.

Yup, sometimes you just want to know when a function is called, and
there's no real need to add that. The hook for system calls should
be pretty generic too.

But things like instrumenting the reclaim code need more work - I ended
up incrementing some counters for each type of page recovery failure in
shrink_list() and then just logging one compound event on the stats
structure at the end. That's pretty specific, but does give you a lot
of useful data when the box is dying from mem pressure.

>> So perhaps it'll all work. Still need a little bit of data maintained
>> in tree though.
>
> ok. And i think SystemTap itself should be in tree too, with a couple of
> examples and helper scripts all around tracing and probing - and of
> course an LTT-compatible trace output so that all the nice LTT userspace
> code and visualization can live on.

I have to figure out how to graft the internal Google stuff onto
the same mechanism ... I definitely want to be able to combine the
static points with dynamic ones. And then add schedstats and blktrace
into the same thing so it interleaves properly ... seeing the blktrace
type data interact with memory reclaim debugging was very useful to
me, for instance. All these little fragmented tools are way more
difficult to deal with.

M.

^ permalink raw reply [flat|nested] 271+ messages in thread
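The shrink_list() approach Martin describes (bump a counter per failure type in the hot path, then log a single compound event at the end) might look roughly like the sketch below. This is not the internal code he refers to, which is not shown in the thread. The failure reasons, `struct reclaim_stats` and both function names are assumptions chosen for illustration, and the "event" is just a formatted string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical failure reasons; not the kernel's actual list. */
enum reclaim_fail {
    RECLAIM_FAIL_LOCKED,
    RECLAIM_FAIL_WRITEBACK,
    RECLAIM_FAIL_REFERENCED,
    RECLAIM_FAIL_MAX
};

/* Stats structure filled in during one scan. */
struct reclaim_stats {
    unsigned long scanned;
    unsigned long freed;
    unsigned long failed[RECLAIM_FAIL_MAX];
};

/* Hot path: only bump a counter, no logging per page. */
static void reclaim_note_failure(struct reclaim_stats *st,
                                 enum reclaim_fail why)
{
    assert((unsigned int)why < RECLAIM_FAIL_MAX);
    st->failed[why]++;
}

/* End of the scan: format everything as one compound event.
 * Returns what snprintf() returns. */
static int reclaim_format_event(const struct reclaim_stats *st,
                                char *buf, size_t len)
{
    return snprintf(buf, len,
                    "reclaim scanned=%lu freed=%lu locked=%lu "
                    "writeback=%lu referenced=%lu",
                    st->scanned, st->freed,
                    st->failed[RECLAIM_FAIL_LOCKED],
                    st->failed[RECLAIM_FAIL_WRITEBACK],
                    st->failed[RECLAIM_FAIL_REFERENCED]);
}
```

The point of the design is that the per-page cost is one increment, while the trace buffer sees one record per scan instead of one per page.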
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-15 1:04 ` Martin J. Bligh
@ 2006-09-15 12:38 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-15 12:38 UTC (permalink / raw)
To: Martin J. Bligh
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais, fche

* Martin J. Bligh <mbligh@mbligh.org> wrote:

> >i.e. we should have macros to prepare local information, with macro
> >arities of 2, 3, 4 and 5:
> >
> >  _(name, data1);
> >  __(name, data1, data2);
> >  ___(name, data1, data2, data3);
> >  ____(name, data1, data2, data3, data4);
>
> Personally I think that's way more visually offensive than something
> that looks like a function call, but still ;-) We do it as a caps
> macro
>
>   KTRACE(foo, bar)
>
> internally, which I suppose makes it not look like a function call.
> But at the end of the day, it's all just a matter of visual taste,
> what's actually in there is way more important.

i disagree with the naming, for the reasons stated before: if we add any
static info to the kernel, it's an "easier data extraction" thing (for
the purposes of speeding up dynamic tracing), not a tracepoint. That way
there's no dispute whether what i remove is a tracepoint (on which
static tracers might rely in a hard way), or just a speedup for
SystemTap.

So a better name would be what SystemTap has implemented today:

  STAP_MARK_NN(kernel_context_switch, prev, next);

or what makes this even more explicit:

  DEBUG_DATA(kernel_context_switch, prev, next);

(but i'm flexible about the naming - as long as it doesn't say 'trace'
and as long as there are no guarantees at all that those points remain,
when a better method of accessing the same data for dynamic tracers is
implemented.)

Ingo

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 22:59 ` Martin Bligh
2006-09-14 23:19 ` Ingo Molnar
@ 2006-09-15 7:00 ` Vara Prasad
1 sibling, 0 replies; 271+ messages in thread
From: Vara Prasad @ 2006-09-15 7:00 UTC (permalink / raw)
To: Martin Bligh
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche,
systemtap

Martin Bligh wrote:

> Ingo Molnar wrote:
>
>> * Martin Bligh <mbligh@mbligh.org> wrote:
>>
>>>> i very much agree that they should become as fast as possible. So
>>>> to rephrase the question: can we make dynamic tracepoints as fast
>>>> (or nearly as fast) as static tracepoints? If yes, should we care
>>>> about static tracers at all?
>>>
>>> Depends how many nops you're willing to add, I guess. Anything, even
>>> the static tracepoints really needs at least a branch to be useful,
>>> IMHO. At least for what I've been doing with it, you need to stop
>>> the data flow after a while (when the event you're interested in
>>> happens, I'm using it like a flight data recorder, so we can go back
>>> and do postmortem on what went wrong). I should imagine branch
>>> prediction makes it very cheap on most modern CPUs, but don't have
>>> hard data to hand.
>>
>> only 5 bytes of NOP are needed by default, so that a kprobe can
>> insert a call/callq instruction. The easiest way in practice is to
>> insert a _single_, unconditional function call that is patched out to
>> NOPs upon its first occurrence (doing this is not a performance issue
>> at all). That way the only cost is the NOP and the function parameter
>> preparation side-effects. (which might or might not be significant -
>> with register calling conventions and most parameters being readily
>> available it should be small.)
>>
>> note that such a limited, minimally invasive 'data extraction point'
>> infrastructure is not actually what the LTT patches are doing. It's
>> not even close, and i think you'll be surprised. Let me quote from
>> the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same
>> version submitted to lkml - although no specific tracepoints were
>> submitted):
>
> OK, I grant you that's pretty scary ;-) However, it's not the only way
> to do it. Most things we're using write a statically sized 64-bit event
> into a relayfs buffer, with a timestamp, a minor and major event type,
> and a byte of data payload.
>
>> believe it or not, this is inlined into: kernel/sched.c ...
>>
>> 'enuff said. LTT is so far from being even considerable that it's not
>> even funny.
>
> Particularly if we're doing more complex things like that, I'd agree
> that the overhead of doing the out of line jump is non-existent by
> comparison. Even with the relayfs logging alone, perhaps the jump is
> not that heavy ... hmmm.
>
> If we put the NOPs in (at least as an option on some architectures)
> from a macro, you don't really need the full kprobes implemented to
> do tracing, even ... just overwrite the nops with a jump, so presumably
> would be easier to port. However, not sure how local variable data
> is specified in that case ... perhaps the kprobes guys know better.
> Most of the complexity seemed to be with relocating existing code
> because you didn't have nops.

With kprobes one can place probes anywhere you want but the ones placed
in the middle of the function are not maintainable because they are tied
to a location in the code. Having a NOP leaves a maintainable address
that we can hook into when needed. AFAIK writing portable code for
using local variables is not easy without using DWARF information, hence
we don't handle that in kprobes. Jprobes is a special case where you can
have access to function arguments at the function entry point.
SystemTap can be used to specify probes anywhere in the function and
local variables can also be used in the probe handlers. The problem
still is maintainability as probes are specified using line numbers.

> To me, the main thing is to have hooks for at least some of the
> basic needs maintained in-kernel - from the dtrace paper Val pointed
> me to, that seems to be exactly what they do too, and it integrates
> with the newly added dynamic ones where necessary.

Once we have these static markers one can use both dynamic probes and
static probes intermixed, getting the best of both worlds as Frank
demonstrated at OLS. Here are a couple of proposals that were discussed
on the systemtap mailing list on how to specify static markers; we
could use these ideas with the rest in deciding on a marker proposal.

http://sources.redhat.com/ml/systemtap/2006-q3/msg00273.html
http://sourceware.org/ml/systemtap/2005-q4/msg00415.html

> Plus I hate the
> whole awk thing, and general complexity of systemtap, but we can
> probably avoid that easily enough - either the embedded C option
> you mentioned, or just a different definition for the same hook macros
> under a config option.
>
> So perhaps it'll all work. Still need a little bit of data maintained
> in tree though.

For placing probes at the beginning and end of a function we don't
really need markers, as the function boundary works as a marker. I think
we only need markers in a few places where an important decision is made
in the middle of a function.

>
> M.
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

^ permalink raw reply [flat|nested] 271+ messages in thread
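Martin's quoted description of a "statically sized 64-bit event" with a timestamp, major and minor type and one byte of payload, used "like a flight data recorder", can be sketched in userspace as follows. The bit split (40-bit timestamp, then 8 bits each for major, minor and data), the slot count and all `fr_*` names are assumptions; the thread does not give the real layout, and a plain array stands in for the relayfs buffer.

```c
#include <assert.h>
#include <stdint.h>

#define FR_SLOTS   8u                          /* must be a power of two */
#define FR_TS_MASK ((UINT64_C(1) << 40) - 1)   /* assumed 40-bit timestamp */

struct flight_recorder {
    uint64_t slot[FR_SLOTS];
    uint64_t written;   /* total events ever logged */
    int enabled;        /* cleared to freeze the buffer for postmortem */
};

/* Pack one event into 64 bits: ts | major | minor | data. */
static uint64_t fr_pack(uint64_t ts, uint8_t major, uint8_t minor,
                        uint8_t data)
{
    return ((ts & FR_TS_MASK) << 24) | ((uint64_t)major << 16) |
           ((uint64_t)minor << 8) | (uint64_t)data;
}

static uint64_t fr_timestamp(uint64_t ev) { return ev >> 24; }
static uint8_t fr_major(uint64_t ev) { return (uint8_t)(ev >> 16); }
static uint8_t fr_minor(uint64_t ev) { return (uint8_t)(ev >> 8); }
static uint8_t fr_data(uint64_t ev) { return (uint8_t)ev; }

/* The single branch on the fast path: log only while enabled,
 * overwriting the oldest slot once the buffer is full. */
static void fr_log(struct flight_recorder *fr, uint64_t ts,
                   uint8_t major, uint8_t minor, uint8_t data)
{
    if (!fr->enabled)
        return;
    fr->slot[fr->written++ & (FR_SLOTS - 1)] =
        fr_pack(ts, major, minor, data);
}

/* Number of events currently held. */
static uint64_t fr_count(const struct flight_recorder *fr)
{
    return fr->written < FR_SLOTS ? fr->written : FR_SLOTS;
}

/* Return the i-th oldest surviving event (0 is the oldest). */
static uint64_t fr_get(const struct flight_recorder *fr, uint64_t i)
{
    uint64_t n = fr_count(fr);

    assert(i < n);
    return fr->slot[(fr->written - n + i) & (FR_SLOTS - 1)];
}
```

Clearing `enabled` when the interesting event fires is the "stop the data flow" step; the buffer then holds the last `FR_SLOTS` events leading up to it.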
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 22:36 ` Ingo Molnar
2006-09-14 22:59 ` Martin Bligh
@ 2006-09-15 15:37 ` Michel Dagenais
1 sibling, 0 replies; 271+ messages in thread
From: Michel Dagenais @ 2006-09-15 15:37 UTC (permalink / raw)
To: Ingo Molnar
Cc: Martin Bligh, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, fche

> only 5 bytes of NOP are needed by default, so that a kprobe can insert a
> call/callq instruction. The easiest way in practice is to insert a
> _single_, unconditional function call that is patched out to NOPs upon
> its first occurrence (doing this is not a performance issue at all). That
> way the only cost is the NOP and the function parameter preparation
> side-effects. (which might or might not be significant - with register
> calling conventions and most parameters being readily available it
> should be small.)

Interestingly, while this whole thread is full of diverging views, there
is nevertheless considerable common ground.

- Getting a trace output is very useful, whether it is generated from
  dynamic or static tracepoints. You need some infrastructure (e.g.
  relayfs + a few things) to get the data out efficiently.

- Some sort of static markers make sense in key locations. Whether they
  are there "primarily" for dynamic or static tracepoints is mostly
  irrelevant. Interesting suggestions were made for a syntax clearly
  identifying their "probe point" status.

From there we can get onto a constructive debate about the technical
details of each of these components.

> note that such a limited, minimally invasive 'data extraction point'
> infrastructure is not actually what the LTT patches are doing. It's not
> even close, and i think you'll be surprised. Let me quote from the
> latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version
> submitted to lkml - although no specific tracepoints were submitted):

This is a case where it started with inline code but as you take into
account SMP and eventually multiple traces (e.g. the sysadmin is tracing
the system and a user is generating a trace for his processes) it
becomes larger and inlining may not be such a good idea any more, to say
the least. However, this is relatively easy to change.

It is also worth mentioning that code patching NOPs to minimize the cost
of inactive tracepoints was envisioned quite some time ago. Again you
might call these "static low overhead placeholders for optimized dynamic
tracepoints" or "optimized low overhead static tracepoints"... You need
however to be careful when code patching instructions on SMP as it may
not be trivial to atomically replace 5 NOPs by a call.

^ permalink raw reply [flat|nested] 271+ messages in thread
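For readers wondering why the replacement is exactly 5 bytes: on i386 and x86_64 a near relative call is the opcode byte 0xE8 followed by a 32-bit little-endian displacement measured from the end of the instruction. The sketch below only computes those bytes into a caller-supplied buffer. `encode_call` and `CALL_INSN_SIZE` are names invented here, and the code deliberately does nothing about the SMP problem Michel raises; writing 5 bytes over live code while other CPUs may be executing them needs a separate protocol that is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CALL_INSN_SIZE 5   /* 1 opcode byte + 4 displacement bytes */

/*
 * Encode "call target" as it would have to appear at address 'site'.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_call(uint64_t site, uint64_t target,
                       uint8_t insn[CALL_INSN_SIZE])
{
    /* Displacement is relative to the next instruction. */
    int64_t rel = (int64_t)(target - (site + CALL_INSN_SIZE));
    uint32_t u;

    assert(insn != NULL);
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    u = (uint32_t)(int32_t)rel;
    insn[0] = 0xE8;                    /* CALL rel32 opcode */
    insn[1] = (uint8_t)(u & 0xff);     /* displacement, little-endian */
    insn[2] = (uint8_t)((u >> 8) & 0xff);
    insn[3] = (uint8_t)((u >> 16) & 0xff);
    insn[4] = (uint8_t)((u >> 24) & 0xff);
    return 0;
}
```

Because the result spans five bytes at an arbitrary address, it cannot in general be stored with one naturally aligned word write, which is where the care on SMP comes from.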
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:55 ` Martin Bligh
2006-09-14 21:31 ` Ingo Molnar
@ 2006-09-19 12:08 ` Christoph Hellwig
1 sibling, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 12:08 UTC (permalink / raw)
To: Martin Bligh
Cc: Ingo Molnar, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche

On Thu, Sep 14, 2006 at 01:55:44PM -0700, Martin Bligh wrote:
> 1. They're harder to maintain out of tree.
> 2. they're written in some gibberish awk crap
> 3. They're slower. If you're doing thousands of tracepoints a second,
>    into a circular 8GB log buffer, that *does* matter. You want
>    to perturb what you're measuring as little as possible.

agreed to all these and I'd like to add:

4. If you merge proper dynamic tracing infrastructure you get static
   traces for free. It's just a bunch of macros directly calling the
   trace function also used by the dynamic tracing code, maybe keyed
   off an enable variable.

^ permalink raw reply [flat|nested] 271+ messages in thread
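Point 4 can be illustrated with a few lines of C. This is only a sketch of the shape such macros might take, assuming GCC or Clang for `__builtin_expect`. `STATIC_TRACE2`, `static_trace_attach()` and the handler type are hypothetical names, and a plain function pointer stands in for "the trace function also used by the dynamic tracing code".

```c
#include <assert.h>
#include <stddef.h>

/* Signature of the shared trace function (hypothetical). */
typedef void (*trace_fn_t)(const char *name, unsigned long a,
                           unsigned long b);

static int static_trace_enabled;     /* the "enable variable" */
static trace_fn_t static_trace_fn;   /* shared trace function */

/* Install a handler and switch the static points on. */
static void static_trace_attach(trace_fn_t fn)
{
    assert(fn != NULL);
    static_trace_fn = fn;
    static_trace_enabled = 1;
}

/* Switch the static points off again. */
static void static_trace_detach(void)
{
    static_trace_enabled = 0;
    static_trace_fn = NULL;
}

/* A two-argument static trace point: one predicted-not-taken branch
 * when disabled, a direct call to the trace function when enabled. */
#define STATIC_TRACE2(name, a, b)                                   \
    do {                                                            \
        if (__builtin_expect(static_trace_enabled, 0))              \
            static_trace_fn(#name, (unsigned long)(a),              \
                            (unsigned long)(b));                    \
    } while (0)

/* Example handler that just counts hits and sums the arguments. */
static unsigned long demo_hits;
static unsigned long demo_sum;

static void demo_handler(const char *name, unsigned long a,
                         unsigned long b)
{
    (void)name;
    demo_hits++;
    demo_sum += a + b;
}
```

A call site would then read `STATIC_TRACE2(context_switch, prev, next);` and costs one test of the enable variable while nothing is attached.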
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:34 ` Ingo Molnar
2006-09-14 20:55 ` Martin Bligh
@ 2006-09-14 21:07 ` Roman Zippel
2006-09-15 9:29 ` Jes Sorensen
2 siblings, 0 replies; 271+ messages in thread
From: Roman Zippel @ 2006-09-14 21:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais, fche

Hi,

On Thu, 14 Sep 2006, Ingo Molnar wrote:

> primarily because i fail to see any property of static tracers that are
> not met by dynamic tracers. So to me dynamic tracers like SystemTap are
> a superset of static tracers.

You keep ignoring that a dynamic tracer is nontrivial... :-(
A static tracer is easy to implement and sufficient for many uses and
most important it doesn't prevent anyone from using a dynamic tracer.
Having a choice is good!

bye, Roman

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 20:34 ` Ingo Molnar
2006-09-14 20:55 ` Martin Bligh
2006-09-14 21:07 ` Roman Zippel
@ 2006-09-15 9:29 ` Jes Sorensen
2 siblings, 0 replies; 271+ messages in thread
From: Jes Sorensen @ 2006-09-15 9:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: Martin Bligh, Roman Zippel, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche

>>>>> "Ingo" == Ingo Molnar <mingo@elte.hu> writes:

Ingo> * Martin Bligh <mbligh@mbligh.org> wrote:
>> I don't think anyone is saying that static tracepoints do not have
>> their limitations, or that dynamic tracepointing is useless. But
>> that's not the point ... why can't we have one infrastructure that
>> supports both? Preferably in a fairly simple, consistent way.

Ingo> primarily because i fail to see any property of static tracers
Ingo> that are not met by dynamic tracers. So to me dynamic tracers
Ingo> like SystemTap are a superset of static tracers.

Ingo> So my position is that what we should concentrate on is to make
Ingo> the life of dynamic tracers easier (be that a handful of
Ingo> generic, parametric hooks that gather debuginfo information and
Ingo> add NOPs for easy patching), while realizing that static tracers
Ingo> have no advantage over dynamic tracers.

The parallel that springs to mind here is C++ kernel components 'I
promise to only use the good parts', then next week someone else adds
another pile in a worse place. Once the points are in we will never get
rid of them, look at how long it took to get rid of devfs :( In addition
it is guaranteed that people will not be able to agree on which points
to put where, despite the claim that there will be only 30 points -
sorry, I am not buying that, we have plenty of evidence to show the
opposite.

I looked at the old LTT code a while ago and it was pretty appalling,
maybe LTTng is better, but I can't say the old code gave me a warm fuzzy
feeling.

Jes

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 14:33 ` Roman Zippel
2006-09-14 15:26 ` Michel Dagenais
2006-09-14 17:13 ` Ingo Molnar
@ 2006-09-14 17:51 ` Karim Yaghmour
2 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 17:51 UTC (permalink / raw)
To: Roman Zippel
Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, Michel Dagenais

Roman Zippel wrote:
> Even dynamic tracepoints have a maintenance overhead and I doubt there is
> much difference. The big problem is having to maintain them outside the
> mainline kernel, that's why it's so important to get them into the
> mainline kernel.

Thanks for pointing this out. This is indeed the nugget. We can try
slicing the pie in any direction we think is best, but the bottom line
is that there's somebody somewhere that is matching source code to
important events (regardless of whether the instrumentation is static
or dynamic.)

For a very long time the mantra on LKML was "instrumentation is evil:
it's a maintenance nightmare." Try as I may, every argument I put
forth was countered by this mantra. Unfortunately for me, but
fortunately for the current ltt maintainers, time is a powerful
argument. So, with that in mind, here are some excerpts of a discussion
I had with Andrew back in the summer of 2004:

Here's Andrew pulling the "instrumentation is evil" mantra:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108873232414895&w=2

Here's me demonstrating that the mantra is wrong by comparing a
patch against 2.2.13 dated 1999/11/18 and a patch against 2.6.3
dated 2004/03/15:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108874078111041&w=2

And here's Andrew, to his credit, saying "Fair enough."
http://marc.theaimsgroup.com/?l=linux-kernel&m=108874940728542&w=2

Now, this is 2 years ago and I haven't done the analysis recently,
but I'd bet the comparison would probably yield very similar results.
The 1st ltt patch was made in July 1999, that's more than **7** years
ago. How much longer can anybody continue saying with a straight face
that static instrumentation is a maintenance problem?

In my opinion the real problem is what impact the fact that this issue
has lingered on for so long has in encouraging people and/or companies
in investing any sort of effort in the kernel development process.
There's just no excuse for Linux not to have something that is clearly
as essential as this. I think now is a good time to put this issue to
rest and drop the misleading mantra.

Cheers,

Karim

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 13:55 ` Ingo Molnar
2006-09-14 14:33 ` Roman Zippel
@ 2006-09-14 15:19 ` Mathieu Desnoyers
2006-09-14 19:39 ` Frank Ch. Eigler
2006-09-15 17:13 ` Jose R. Santos
1 sibling, 2 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 15:19 UTC (permalink / raw)
To: Ingo Molnar, Karim Yaghmour
Cc: Roman Zippel, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais, Douglas Niehaus

* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Roman Zippel <zippel@linux-m68k.org> wrote:
>
> the key point is that we want _zero_ "static tracepoints". Firstly,
> static tracepoints are fundamentally limited:
>
> - they can only be added at the source code level
>
> - modifying them requires a reboot which is not practical in a
>   production environment

Not for kernel modules : unload/load is enough.

> - there can only be a limited set of them, while many problems need
>   finegrained tracepoints tailored to the problem at hand

Not true with the dynamic facility loading. LTTng can register new
events upon module load/unload.

>
> - conditional tracepoints are typically either nonexistent or very
>   limited.
>

Maybe, but it can be useful to have static instrumentation available for
those limited conditional tracepoints.

> But besides the usability problems, the most important problem is that
> static tracepoints add a _constant maintenance overhead_ to the kernel.
> I'm talking from first hand experience: i wrote 'iotrace' (a static
> tracer) in 1996 and have maintained it for many years, and even today
> i'm maintaining a handful of tracepoints in the -rt kernel. I _don't_
> want static tracepoints in the mainline kernel.
>

If the trace points are modified with the code by the ones who make the
original code changes, it lessens the maintenance overhead. Furthermore,
if there is a major change in a code path that requires rethinking the
trace points, the person introducing the change has the best knowledge
of what to do with the trace point. I think that trace point maintenance
should be left to subsystem maintainers, not a centralised task done by
distributions once in a while.

Talking about experience, Karim has maintained the original LTT trace
points, which targeted key kernel events, for years without major trace
point changes between kernel versions. I think he already proved that
maintenance of static trace points is not an issue.

However, I restate that my position is that both static and dynamic
instrumentation of the kernel are a necessity and that a tracer core
should be usable by both.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 15:19 ` Mathieu Desnoyers
@ 2006-09-14 19:39 ` Frank Ch. Eigler
2006-09-15 17:13 ` Jose R. Santos
1 sibling, 0 replies; 271+ messages in thread
From: Frank Ch. Eigler @ 2006-09-14 19:39 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Karim Yaghmour, Roman Zippel, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais,
Douglas Niehaus

Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> writes:

> [...] However, I restate that my position is that both static and
> dynamic instrumentation of the kernel are a necessity and that a
> tracer core should be usable by both.

On a complementary note, it would be nice if whatever static
instrumentation hooks are deemed worthwhile were themselves generic so
they could be coupled to either a fixed or dynamic "core" or back-end.

- FChE

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 15:19 ` Mathieu Desnoyers
2006-09-14 19:39 ` Frank Ch. Eigler
@ 2006-09-15 17:13 ` Jose R. Santos
1 sibling, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 17:13 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Karim Yaghmour, Roman Zippel, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais,
Douglas Niehaus

Mathieu Desnoyers wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> >
> > * Roman Zippel <zippel@linux-m68k.org> wrote:
> >
> > the key point is that we want _zero_ "static tracepoints". Firstly,
> > static tracepoints are fundamentally limited:
> >
> > - they can only be added at the source code level
> >
> > - modifying them requires a reboot which is not practical in a
> >   production environment
>
> Not for kernel modules : unload/load is enough.
>

This assumes that the module can be unloaded in the first place.
Inserting a new probe on the disk controller for your boot drive or in
the filesystem module would still require a reboot.

> If the trace points are modified with the code by the ones who make the
> original code changes, it lessens the maintenance overhead. Furthermore, if
> there is a major change in a code path that requires rethinking the trace
> points, the person introducing the change has the best knowledge of what to do
> with the trace point. I think that trace point maintenance should be left to
> subsystem maintainers, not a centralised task done by distributions once in a
> while.
>

I agree with you here, I think it is silly to claim dynamic
instrumentation as a fix for the "constant maintenance overhead" of
static trace points. Working on LKET, one of the biggest burdens that
we've had is maintaining the probe points when something in the kernel
changes enough to cause a breakage of the dynamic instrumentation. The
solution to this is having the SystemTap tapsets maintained by the
subsystem maintainers so that changes in the code can be applied to the
dynamic instrumentation as well. This of course means that the subsystem
maintainer would need to maintain two pieces of code instead of one.
There are a lot of advantages to dynamic vs static instrumentation, but
I don't think maintenance overhead is one of them.

-JRS

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 11:27 ` Ingo Molnar
2006-09-14 13:40 ` Roman Zippel
@ 2006-09-14 15:02 ` Mathieu Desnoyers
2006-09-14 15:14 ` Martin J. Bligh
2006-09-19 11:59 ` Christoph Hellwig
3 siblings, 0 replies; 271+ messages in thread
From: Mathieu Desnoyers @ 2006-09-14 15:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar,
Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev,
Michel Dagenais, Douglas Niehaus

* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Following an advice Christoph gave me this summer, submitting a
> > smaller, easier to review patch should make everybody happier. Here is
> > a stripped down version of LTTng : I removed everything that would
> > make the code review reluctant (especially kernel instrumentation and
> > kernel state dump module). I plan to release this "core" version every
> > few LTTng releases and post it to LKML.
> >
> > Comments and reviews are very welcome.
>
> i have one very fundamental question: why should we do this
> source-intrusive method of adding tracepoints instead of the dynamic,
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?
>

Hi Ingo,

First, I never said that this tracing infrastructure was tied to static
trace points in any way. My goal is to provide a robust data
serialisation mechanism that could be used both from static and dynamic
trace points.

Zero-overhead for static tracepoints can be achieved by compiling them
out.

One problem with the KProbes approach is that it limits what can be
instrumented because of its performance impact when active : traps are
very costly and can limit instrumentation of often triggered code paths :
scheduler change, traps, interrupts...

Also, a major issue with dynamic instrumentation is that it will never
be useful to kernel developers who keep current with the git HEAD.
Dynamic instrumentation has to be defined outside of the kernel tree and
cannot follow the code changes quickly enough to be useful for a
developer without himself maintaining his own dynamic instrumentation.

I do not advocate for a particular approach : I think that dynamic
instrumentation is very well suited for distributions which stick to a
particular kernel version for a long time. However, static probes can be
very useful for kernel developers as they can follow the kernel HEAD
because they are part of the code.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 11:27 ` Ingo Molnar
2006-09-14 13:40 ` Roman Zippel
2006-09-14 15:02 ` Mathieu Desnoyers
@ 2006-09-14 15:14 ` Martin J. Bligh
2006-09-14 17:43 ` Ingo Molnar
` (2 more replies)
2006-09-19 11:59 ` Christoph Hellwig
3 siblings, 3 replies; 271+ messages in thread
From: Martin J. Bligh @ 2006-09-14 15:14 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais

Ingo Molnar wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
>> Following an advice Christoph gave me this summer, submitting a
>> smaller, easier to review patch should make everybody happier. Here is
>> a stripped down version of LTTng : I removed everything that would
>> make the code review reluctant (especially kernel instrumentation and
>> kernel state dump module). I plan to release this "core" version every
>> few LTTng releases and post it to LKML.
>>
>> Comments and reviews are very welcome.
>
> i have one very fundamental question: why should we do this
> source-intrusive method of adding tracepoints instead of the dynamic,
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?

Because:

1. Kprobes are more overhead when they *are* being used.
2. You can get zero overhead by CONFIG'ing things out.
3. (most importantly) it's a bitch to maintain tracepoints out
   of-tree on a rapidly moving kernel
4. I believe kprobes still doesn't have full access to local variables.

Now (3) is possibly solvable by putting the points in as no-ops (either
insert a few nops or just a marker entry in the symbol table?), but full
dynamic just isn't sustainable.

What would be really nice is one trace infrastructure, that allowed both
static and dynamic tracepoints without all the awk-style language crap
that seems to come with systemtap.

M.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 15:14 ` Martin J. Bligh @ 2006-09-14 17:43 ` Ingo Molnar 2006-09-14 18:25 ` Karim Yaghmour 2006-09-14 20:03 ` Martin Bligh 2006-09-14 19:03 ` grundig 2006-09-14 19:48 ` Frank Ch. Eigler 2 siblings, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 17:43 UTC (permalink / raw) To: Martin J. Bligh Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais * Martin J. Bligh <mbligh@mbligh.org> wrote: > >>Comments and reviews are very welcome. > > > > i have one very fundamental question: why should we do this > > source-intrusive method of adding tracepoints instead of the > > dynamic, unintrusive (and thus zero-overhead) KProbes+SystemTap > > method? > > Because: > > 1. Kprobes are more overhead when they *are* being used. minimally so - at least on i386 and x86_64. In that sense tracing is a _slowpath_, and it _will_ slow things down if done excessively. I dont care about the tracepoint being slower by a few instructions as long as it has _zero effect_ on normal code, be that source code or binary code. > 2. You can get zero overhead by CONFIG'ing things out. but that's not how a fair chunk of people want to use tracing. People (enterprise customers trying to figure out performance problems, engineers trying to debug things on a live, production system) want to be able to insert a tracepoint anywhere and anytime - and also they want to have zero overhead from tracing if no tracepoints are used on a system. > 3. (most importantly) it's a bitch to maintain tracepoints out > of-tree on a rapidly moving kernel wrong: the original demo tracepoints that came with SystemTap still work on the current kernel, because the 'coupling' is loose: based on function names. 
Static tracepoints on the other hand, if added via an external patch, do depend on the target function not moving around and the context of the tracepoint not being changed. (and static tracepoints if in the source all the time are a constant hindrance to development and code readability.) and of course the big advantage of dynamic probing is its flexibility: you can add add-hoc tracepoints to thousands of functions, instead of having to maintain hundreds (or thousands) of static tracepoints all the time. (and if we wont end up with hundreds/thousands of static tracepoints then it wont be usable enough as a generic solution.) > 4. I believe kprobes still doesn't have full access to local > variables. wrong: with SystemTap you can probe local variables too (via jprobes/kretprobes, all in the upstream kernel already). > Now (3) is possibly solvable by putting the points in as no-ops > (either insert a few nops or just a marker entry in the symbol > table?), but full dynamic just isn't sustainable. [...] i'm not sure i follow. Could you explain where SystemTap has this difficulty? Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 17:43 ` Ingo Molnar @ 2006-09-14 18:25 ` Karim Yaghmour 2006-09-14 20:03 ` Martin Bligh 1 sibling, 0 replies; 271+ messages in thread From: Karim Yaghmour @ 2006-09-14 18:25 UTC (permalink / raw) To: Ingo Molnar Cc: Martin J. Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > but that's not how a fair chunk of people want to use tracing. People > (enterprise customers trying to figure out performance problems, > engineers trying to debug things on a live, production system) want to > be able to insert a tracepoint anywhere and anytime - and also they want > to have zero overhead from tracing if no tracepoints are used on a > system. This is an implementation issue. You can easily have it so that at the site of a marker you generate some code in a special "trace" section of the binary which does the actual tracing and insert noops at the marker site. Therefore the only penalty until the tracing is enabled is the execution of additional noops. [ note: this comes from a suggestion made by Hiramatsu-san at this year's OLS. ] > wrong: the original demo tracepoints that came with SystemTap still work > on the current kernel, because the 'coupling' is loose: based on > function names. > > Static tracepoints on the other hand, if added via an external patch, do > depend on the target function not moving around and the context of the > tracepoint not being changed. (and static tracepoints if in the source > all the time are a constant hindrance to development and code > readability.) Instrumentation of function boundaries is usually not much of an issue. Instrumentation of key events, though, is different. 
Here's the classic: @@ -1709,6 +1712,7 @@ switch_tasks: ++*switch_count; prepare_arch_switch(rq, next); + TRACE_SCHEDCHANGE(prev, next); prev = context_switch(rq, prev, next); barrier(); This is the kind of thing for which the instrumentation, be it static or dynamic, requires some kind of intelligent analysis of where to get the info. Now, answer honestly, wouldn't it be simpler to have such an event marker instead of having to figure out for every kernel binary you get where the darned probe needs to be inserted? Karim ^ permalink raw reply [flat|nested] 271+ messages in thread
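The marker idea described above (a named site that does nothing until a tracer arms it) can be sketched in userspace C. Note the swap: the message describes no-ops at the marker site with the tracing code placed in a separate section, whereas this sketch uses a runtime flag check as a portable stand-in. Every name here (sketch_marker, SKETCH_MARK, sketch_context_switch) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A named marker: inert until a probe is attached and enabled. */
struct sketch_marker {
	const char *name;
	int enabled;	/* 0 until a tracer arms the marker */
	void (*probe)(const void *prev, const void *next);
};

static struct sketch_marker sched_change_marker = { "sched_change", 0, NULL };

/* Flag test standing in for "no-ops until enabled". */
#define SKETCH_MARK(m, a, b)                           \
	do {                                           \
		if ((m)->enabled && (m)->probe)        \
			(m)->probe((a), (b));          \
	} while (0)

static int sketch_switch_count;

/* Example probe: just counts context switches. */
static void count_switch(const void *prev, const void *next)
{
	(void)prev;
	(void)next;
	sketch_switch_count++;
}

/* Attach a probe and arm the marker. */
static void sketch_marker_arm(struct sketch_marker *m,
			      void (*probe)(const void *, const void *))
{
	m->probe = probe;
	m->enabled = 1;
}

/* Hypothetical instrumented site, mirroring the TRACE_SCHEDCHANGE example. */
static void sketch_context_switch(const void *prev, const void *next)
{
	SKETCH_MARK(&sched_change_marker, prev, next);
	/* the actual switch would happen here */
}
```

The point of the design is that the marker names the event once, in the source, so a tracer does not have to rediscover the right probe location for every kernel binary.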
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 17:43 ` Ingo Molnar 2006-09-14 18:25 ` Karim Yaghmour @ 2006-09-14 20:03 ` Martin Bligh 2006-09-14 20:14 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Martin Bligh @ 2006-09-14 20:03 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais Ingo Molnar wrote: > * Martin J. Bligh <mbligh@mbligh.org> wrote: > > >>>>Comments and reviews are very welcome. >>> >>>i have one very fundamental question: why should we do this >>>source-intrusive method of adding tracepoints instead of the >>>dynamic, unintrusive (and thus zero-overhead) KProbes+SystemTap >>>method? >> >>Because: >> >>1. Kprobes are more overhead when they *are* being used. > > > minimally so - at least on i386 and x86_64. In that sense tracing is a > _slowpath_, and it _will_ slow things down if done excessively. I dont > care about the tracepoint being slower by a few instructions as long as > it has _zero effect_ on normal code, be that source code or binary code. Would be interesting to see some measurements. But jumping is slower than a simple branch (or noops to skip over that can be overwritten). >>2. You can get zero overhead by CONFIG'ing things out. > > but that's not how a fair chunk of people want to use tracing. People > (enterprise customers trying to figure out performance problems, > engineers trying to debug things on a live, production system) want to > be able to insert a tracepoint anywhere and anytime - and also they want > to have zero overhead from tracing if no tracepoints are used on a > system. I'm fine with that ... "a fair chunk of people" - but it's not everyone, by any means. We need both static and dynamic tracepoints, in one infrastructure. >>3. 
(most importantly) it's a bitch to maintain tracepoints out >> of-tree on a rapidly moving kernel > > wrong: the original demo tracepoints that came with SystemTap still work > on the current kernel, because the 'coupling' is loose: based on > function names. And what do those trace? I bet not half the stuff we want to do. I've been migrating Google's tracepoints around between different kernel versions, and it's not a mechanical port. Just stupid things like renaming of functions inside memory reclaim creates pain, for starters. (shrink_cache/shrink_list, refill_inactive_zone, etc). > Static tracepoints on the other hand, if added via an external patch, do > depend on the target function not moving around and the context of the > tracepoint not being changed. (and static tracepoints if in the source > all the time are a constant hindrance to development and code > readability.) an external patch is, indeed, pretty useless. Merging a few simple tracepoints should not be a problem - see blktrace and schedstats, for instance. > and of course the big advantage of dynamic probing is its flexibility: > you can add add-hoc tracepoints to thousands of functions, instead of > having to maintain hundreds (or thousands) of static tracepoints all the > time. (and if we wont end up with hundreds/thousands of static > tracepoints then it wont be usable enough as a generic solution.) I wasn't saying that dynamic tracepoints are useless - I agree it's valuable to add stuff on the fly. But some things are better done statically. >>4. I believe kprobes still doesn't have full access to local >>variables. > > wrong: with SystemTap you can probe local variables too (via > jprobes/kretprobes, all in the upstream kernel already). I'll look again, but last time I looked it didn't do this, and when I spoke to the kprobes/systemtap people at OLS, IIRC they said it still couldn't. 
>>Now (3) is possibly solvable by putting the points in as no-ops >>(either insert a few nops or just a marker entry in the symbol >>table?), but full dynamic just isn't sustainable. [...] > > i'm not sure i follow. Could you explain where SystemTap has this > difficulty? If you have an extremely limited set of probes, on a static area of the kernel, then yes, they may work for a long time. But try tracing something like the scheduler, which people seem to delight in rewriting every month or two ... It amuses me that we're so opposed to external patches to the tree (for perfectly understandable reasons), but we somehow think tracepoints are magically different and should be maintained out of tree somehow. You yourself made the argument that it's a maintainance burden to keep the trace points *in* the tree ... if that's true, how is it any easier to keep them outside of the tree? If we really want to, we can still keep the hooks inside the code, and have them do absolutely nothing at all - putting markers into the symbol table is pretty much free. It also reuses the well structured code-sharing mechanism we already have in place - the linux kernel tree. I really don't want to deal with all the systemtap crap - I just want something that works, and I don't particularly care if I have to recompile the kernel to get it. I know that doesn't suit everyone, but there are requirements on both sides, and we should not dismiss each other's requirements out of hand. Having one consistent consistent collection mechanism for all these different types of tracing data seems both logical and very important to me ... M. ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:03 ` Martin Bligh @ 2006-09-14 20:14 ` Ingo Molnar 2006-09-14 20:40 ` Martin Bligh 2006-09-14 21:05 ` Michel Dagenais 0 siblings, 2 replies; 271+ messages in thread From: Ingo Molnar @ 2006-09-14 20:14 UTC (permalink / raw) To: Martin Bligh Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche * Martin Bligh <mbligh@mbligh.org> wrote: > an external patch is, indeed, pretty useless. Merging a few simple > tracepoints should not be a problem [...] the problem is, LTT is not about a 'few' tracepoints: it adds a whopping 350 tracepoints, a fair portion of it is multi-line with tons of arguments. $ diffstat patch-2.6.17-lttng-0.5.108-instrumentation* 98 files changed, 1450 insertions(+), 64 deletions(-) saying "it's just a few lightweight tracepoints" misses two points: it's not just a few, and it's not lightweight. and the set of tracepoints never gets smaller. People who start to rely on a tracepoint will scream bloody murder if it goes away or breaks. Static tracepoints are a maintainance PITA that will rarely get smaller, and will easily grow ... > [...] - see blktrace and schedstats, for instance. yes, i do want to remove the 34 schedstats tracepoints too, once a feasible alternative is present. I already have to do two compilations when changing something substantial in the scheduler - once with and once without schedstats. same for blktrace: once SystemTap can provide a compatible replacement, it should. > It amuses me that we're so opposed to external patches to the tree > (for perfectly understandable reasons), but we somehow think > tracepoints are magically different and should be maintained out of > tree somehow. i think you misunderstood what i meant. 
SystemTap should very much be integrated into the kernel proper, but i dont think the _rules_ (and scripts) should become part of the _source code files themselves_. So yes, there's advantage to kernel integration, but there's disadvantage to littering the kernel source with countless static tracepoints, if dynamic tracepoints can offer the same benefits (or more). the question is: what is more maintainance, hundreds of static tracepoints (with long parameter lists) all around the (core) kernel, or hundreds of detached dynamic rules that need an update every now and then? [but of which most would still be usable even if some of them "broke"] To me the answer is clear: having hundreds of tracepoints _within_ the source code is higher cost. But please prove me wrong :-) Ingo ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:14 ` Ingo Molnar @ 2006-09-14 20:40 ` Martin Bligh 2006-09-14 21:05 ` Michel Dagenais 1 sibling, 0 replies; 271+ messages in thread From: Martin Bligh @ 2006-09-14 20:40 UTC (permalink / raw) To: Ingo Molnar Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais, fche Ingo Molnar wrote: > * Martin Bligh <mbligh@mbligh.org> wrote: > > >>an external patch is, indeed, pretty useless. Merging a few simple >>tracepoints should not be a problem [...] > > > the problem is, LTT is not about a 'few' tracepoints: it adds a whopping > 350 tracepoints, a fair portion of it is multi-line with tons of > arguments. "static tracepoints" does not equate directly to "all of LTT". I'm not saying we should accept LTT as-is. I'm saying we should not reject the concept of static tracepoints. > $ diffstat patch-2.6.17-lttng-0.5.108-instrumentation* > 98 files changed, 1450 insertions(+), 64 deletions(-) > > saying "it's just a few lightweight tracepoints" misses two points: it's > not just a few, and it's not lightweight. > > and the set of tracepoints never gets smaller. People who start to rely > on a tracepoint will scream bloody murder if it goes away or breaks. > Static tracepoints are a maintainance PITA that will rarely get smaller, > and will easily grow ... If people are *using* them, it's no easier to maintain them outside of tree, than in-tree. it's significantly harder. >>[...] - see blktrace and schedstats, for instance. > > yes, i do want to remove the 34 schedstats tracepoints too, once a > feasible alternative is present. I already have to do two compilations > when changing something substantial in the scheduler - once with and > once without schedstats. > > same for blktrace: once SystemTap can provide a compatible replacement, > it should. 
Your argument about schedstats only seems to illustrate the flaws in the arguments for dynamic tracepointing - you've put your finger on exactly what the problem is, when the code changes, the tracing HAS to change too. The best time to do this is when the code itself changes. It's the same arguement for putting documentation in the C file against the source itself. >>It amuses me that we're so opposed to external patches to the tree >>(for perfectly understandable reasons), but we somehow think >>tracepoints are magically different and should be maintained out of >>tree somehow. > > i think you misunderstood what i meant. SystemTap should very much be > integrated into the kernel proper, but i dont think the _rules_ (and > scripts) should become part of the _source code files themselves_. So > yes, there's advantage to kernel integration, but there's disadvantage > to littering the kernel source with countless static tracepoints, if > dynamic tracepoints can offer the same benefits (or more). If you're talking about the scriptable awk-like "stuff" that comes with Systemtap, yes I agree it should not be in the C code, it's foul. However, I don't think a simple macro hooks are a burden. > the question is: what is more maintainance, hundreds of static > tracepoints (with long parameter lists) all around the (core) kernel, or > hundreds of detached dynamic rules that need an update every now and > then? [but of which most would still be usable even if some of them > "broke"] To me the answer is clear: having hundreds of tracepoints > _within_ the source code is higher cost. But please prove me wrong :-) How can you possibly say that maintaining the same set of data in two dis-coupled trees is easier than doing it in the same place? You don't require any *less* information to do it with systemtap than you do with some form of static tracing. 
If you're talking about the effort of maintaining just what's in the kernel tree, then of course it's a little easier, but that's only half the equation. And I don't think it's much of a burden, frankly. Yes, if we have 2 billion tracepoints, it'll be a pain in the arse, but the taste of the subsystem maintainers is what would regulate this, along with everything else that we do. They'll accept a few important ones, and reject the rest. If it's not valuable in general, they won't take it. I don't see what the big problem is. What *is* a problem is having a two separate mechanisms for doing dynamic and static tracing. They should share the same logging facilities and readback mechanisms so we can read both types consistently from userspace, and the data is correctly interspersed. M. ^ permalink raw reply [flat|nested] 271+ messages in thread
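The closing request above, that static and dynamic tracing share one logging facility so the data comes back correctly interspersed, might look roughly like the following userspace sketch. It assumes a single append path that stamps every event with a sequence number; the names (sketch_event, sketch_log_append) are hypothetical and locking is omitted.

```c
#include <assert.h>

/* Where an event came from; both kinds share one log. */
enum sketch_source { SKETCH_STATIC, SKETCH_DYNAMIC };

struct sketch_event {
	unsigned long seq;		/* global order across both sources */
	enum sketch_source src;
	const char *name;
};

#define SKETCH_LOG_MAX 16

static struct sketch_event sketch_log[SKETCH_LOG_MAX];
static unsigned long sketch_log_len;

/* Single entry point used by static markers and dynamic probes alike. */
static int sketch_log_append(enum sketch_source src, const char *name)
{
	if (sketch_log_len >= SKETCH_LOG_MAX)
		return -1;	/* log full: caller sees the drop */
	sketch_log[sketch_log_len].seq = sketch_log_len;
	sketch_log[sketch_log_len].src = src;
	sketch_log[sketch_log_len].name = name;
	sketch_log_len++;
	return 0;
}
```

Because both sources go through the same function, a reader in userspace sees one ordered stream instead of having to merge two separately timestamped ones.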
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 20:14 ` Ingo Molnar 2006-09-14 20:40 ` Martin Bligh @ 2006-09-14 21:05 ` Michel Dagenais 2006-09-14 22:23 ` Ingo Molnar 1 sibling, 1 reply; 271+ messages in thread From: Michel Dagenais @ 2006-09-14 21:05 UTC (permalink / raw) To: Ingo Molnar Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, fche > the question is: what is more maintainance, hundreds of static > tracepoints (with long parameter lists) all around the (core) kernel, or > hundreds of detached dynamic rules that need an update every now and > then? [but of which most would still be usable even if some of them > "broke"] To me the answer is clear: having hundreds of tracepoints > _within_ the source code is higher cost. But please prove me wrong :-) Actually I rarely find that any of the 70 000 printk is such a huge nuisance to code readability. They may even help understand what is going on in a code area you are less familiar with. ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 21:05 ` Michel Dagenais
@ 2006-09-14 22:23 ` Ingo Molnar
2006-09-14 22:46 ` Martin Bligh
0 siblings, 1 reply; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:23 UTC (permalink / raw)
To: Michel Dagenais
Cc: Martin Bligh, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, fche

* Michel Dagenais <michel.dagenais@polymtl.ca> wrote:

> > the question is: what is more maintenance, hundreds of static
> > tracepoints (with long parameter lists) all around the (core) kernel, or
> > hundreds of detached dynamic rules that need an update every now and
> > then? [but of which most would still be usable even if some of them
> > "broke"] To me the answer is clear: having hundreds of tracepoints
> > _within_ the source code is higher cost. But please prove me wrong :-)
>
> Actually I rarely find that any of the 70 000 printk is such a huge
> nuisance to code readability. They may even help understand what is
> going on in a code area you are less familiar with.

i disagree. Consider the following example from LTT:

int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
	struct kiocb iocb;
	struct sock_iocb siocb;
	int ret;

	trace_socket_sendmsg(sock, sock->sk->sk_family,
		sock->sk->sk_type,
		sock->sk->sk_protocol,
		size);

	init_sync_kiocb(&iocb, NULL);
	iocb.private = &siocb;
	ret = __sock_sendmsg(&iocb, sock, msg, size);
	if (-EIOCBQUEUED == ret)
		ret = wait_on_sync_kiocb(&iocb);
	return ret;
}

what do the 5 extra lines introduced by trace_socket_sendmsg() tell us?
Nothing. They mostly just duplicate the information i already have from
the function declaration. They obscure the clear view of the function:

int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
	struct kiocb iocb;
	struct sock_iocb siocb;
	int ret;

	init_sync_kiocb(&iocb, NULL);
	iocb.private = &siocb;
	ret = __sock_sendmsg(&iocb, sock, msg, size);
	if (-EIOCBQUEUED == ret)
		ret = wait_on_sync_kiocb(&iocb);
	return ret;
}

the resulting visual and structural redundancy hurts.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 22:23 ` Ingo Molnar @ 2006-09-14 22:46 ` Martin Bligh 2006-09-14 22:56 ` Ingo Molnar 0 siblings, 1 reply; 271+ messages in thread From: Martin Bligh @ 2006-09-14 22:46 UTC (permalink / raw) To: Ingo Molnar Cc: Michel Dagenais, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, fche > i disagree. Consider the following example from LTT: ... > trace_socket_sendmsg(sock, sock->sk->sk_family, > sock->sk->sk_type, > sock->sk->sk_protocol, > size); ... > what do the 5 extra lines introduced by trace_socket_sendmsg() tell us? > Nothing. They mostly just duplicate the information i already have from > the function declaration. They obscure the clear view of the function: ... > the resulting visual and structural redundancy hurts. Couldn't that be easily fixed by just doing trace_socket_sendmsg(sock, size); and have it work out which esoteric parts of the sock we want to trace, and which we don't? Is much less visually invasive, and gives the same effect. M. ^ permalink raw reply [flat|nested] 271+ messages in thread
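The suggestion above, a two-argument trace call that works out for itself which fields of the socket to record, could be sketched as follows. This is a userspace illustration only: the structures are simplified stand-ins invented here (not the kernel's struct socket or struct sock), and sketch_trace_socket_sendmsg is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's socket structures. */
struct sketch_sk {
	int family;
	int type;
	int protocol;
};

struct sketch_socket {
	struct sketch_sk *sk;
};

/* What the trace helper decides to record for a sendmsg event. */
struct sketch_sendmsg_event {
	int family;
	int type;
	int protocol;
	size_t size;
};

static struct sketch_sendmsg_event sketch_last_event;

/*
 * The call site passes only (sock, size); the field selection lives in
 * the helper, so changing what gets traced does not touch the caller.
 */
static inline void sketch_trace_socket_sendmsg(const struct sketch_socket *sock,
					       size_t size)
{
	sketch_last_event.family = sock->sk->family;
	sketch_last_event.type = sock->sk->type;
	sketch_last_event.protocol = sock->sk->protocol;
	sketch_last_event.size = size;
}
```

The call site shrinks to one short line, which addresses the readability objection while recording the same data.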
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 22:46 ` Martin Bligh
@ 2006-09-14 22:56 ` Ingo Molnar
0 siblings, 0 replies; 271+ messages in thread
From: Ingo Molnar @ 2006-09-14 22:56 UTC (permalink / raw)
To: Martin Bligh
Cc: Michel Dagenais, Mathieu Desnoyers, linux-kernel, Christoph Hellwig,
Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner,
Tom Zanussi, ltt-dev, fche

* Martin Bligh <mbligh@mbligh.org> wrote:

> >i disagree. Consider the following example from LTT:
> ...
> > trace_socket_sendmsg(sock, sock->sk->sk_family,
> > sock->sk->sk_type,
> > sock->sk->sk_protocol,
> > size);
> ...
>
> >what do the 5 extra lines introduced by trace_socket_sendmsg() tell us?
> >Nothing. They mostly just duplicate the information i already have from
> >the function declaration. They obscure the clear view of the function:
> ...
> >the resulting visual and structural redundancy hurts.
>
> Couldn't that be easily fixed by just doing
>
> trace_socket_sendmsg(sock, size);
>
> and have it work out which esoteric parts of the sock we want to
> trace, and which we don't? Is much less visually invasive, and gives
> the same effect.

yeah, visual impact is everything. The best that Frank and me came up
with is:

	_(socket_sendmsg, sock, size);

we could quickly learn to visually skip over lines like that, they have
a pretty unique geometric form. While if it's called:

	trace_socket_sendmsg(sock, size);

it always looks like a function call in the corner of the eye and
attracts attention.

the '_()' macro is defined as:

	#define _(x,y,z) STAP_MARK(x,y,z)

(STAP_MARK is an existing SystemTap helper to insert static tracepoints
into the kernel.)

but the other property of dynamic tracing is still very important too:
we have the technological freedom to remove static tracepoints, if we
decide so. With static tracers, once they are in the tree, we are stuck
with these APIs.

Ingo
^ permalink raw reply [flat|nested] 271+ messages in thread
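The '_()' shorthand described above is a thin macro that forwards to a marker primitive. A minimal userspace sketch of the forwarding, assuming a stand-in primitive: SKETCH_MARK below is hypothetical and is not SystemTap's STAP_MARK, whose real definition is not shown in this thread.

```c
#include <assert.h>
#include <string.h>

/* Last marker that fired, recorded so the effect is observable. */
static const char *sketch_last_mark;
static long sketch_last_args[2];

/* Hypothetical stand-in for a real marker primitive. */
#define SKETCH_MARK(name, a, b)                          \
	do {                                             \
		sketch_last_mark = #name;                \
		sketch_last_args[0] = (long)(a);         \
		sketch_last_args[1] = (long)(b);         \
	} while (0)

/* The visually quiet shorthand; the marker name is passed as a bare token. */
#define _(name, a, b) SKETCH_MARK(name, a, b)

/* Hypothetical instrumented function. */
static long sketch_sendmsg(long sock_id, long size)
{
	_(socket_sendmsg, sock_id, size);
	return size;
}
```

The marker name is stringified inside the primitive, so the call site carries no quotes or prefixes, which is what gives the line its distinctive shape.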
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 15:14 ` Martin J. Bligh
2006-09-14 17:43 ` Ingo Molnar
@ 2006-09-14 19:03 ` grundig
2006-09-14 19:21 ` Karim Yaghmour
2006-09-14 19:48 ` Frank Ch. Eigler
2 siblings, 1 reply; 271+ messages in thread
From: grundig @ 2006-09-14 19:03 UTC (permalink / raw)
To: Martin J. Bligh
Cc: mingo, mathieu.desnoyers, linux-kernel, hch, akpm, mingo, gregkh,
tglx, zanussi, ltt-dev, michel.dagenais

On Thu, 14 Sep 2006 08:14:19 -0700,
"Martin J. Bligh" <mbligh@mbligh.org> wrote:

> 2. You can get zero overhead by CONFIG'ing things out.

IOW, no distro will enable it by default, to avoid the overhead, making
it useless for lots of real-world working systems where you need to
guess what's happening to software running real workloads that can't
just be stopped.

I guess there's no problem in having both LTT and Kprobes merged in the
main tree at the same time. But Kprobes + systemtap will get enabled
and used by distros massively just because users can start using it
immediately, without recompiling or installing extra kernels and
rebooting. There are cases where distros may want to enable automatic
tracing on every boot, and only during boot, but don't want to suffer
an extra performance hit after booting...

I don't mean that LTT sucks, has no advantages, or doesn't deserve to
be merged/used; it just looks like kprobes+systemtap will get way more
real-world users no matter how much you discuss it here.
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 19:03 ` grundig
@ 2006-09-14 19:21 ` Karim Yaghmour
0 siblings, 0 replies; 271+ messages in thread
From: Karim Yaghmour @ 2006-09-14 19:21 UTC (permalink / raw)
To: grundig
Cc: Martin J. Bligh, mingo, mathieu.desnoyers, linux-kernel, hch, akpm,
mingo, gregkh, tglx, zanussi, ltt-dev, michel.dagenais

grundig wrote:
> IOW, no distro will enable it by default to avoid the overhead,

Please bear in mind that this is an implementation issue. As I've
explained elsewhere, there are ways to implement this where even
compiled-in static tracepoints have practically no cost at all --
being noops until enabled. There is thereby no justification for not
actually shipping kernels built this way and, therefore, no reason why
tools such as ltt can't get real-world usage.

Karim
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 2006-09-14 15:14 ` Martin J. Bligh 2006-09-14 17:43 ` Ingo Molnar 2006-09-14 19:03 ` grundig @ 2006-09-14 19:48 ` Frank Ch. Eigler 2006-09-15 16:32 ` Jose R. Santos 2 siblings, 1 reply; 271+ messages in thread From: Frank Ch. Eigler @ 2006-09-14 19:48 UTC (permalink / raw) To: Martin J. Bligh Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais "Martin J. Bligh" <mbligh@mbligh.org> writes: > [...] What would be really nice is one trace infrastructure, that > allowed both static and dynamic tracepoints We in systemtap land hope to encounter *some* static tracepoint structure, perhaps like the one I presented at OLS, via which systemtap could become your unified static+dynamic "infrastructure". Even in that universe, using LTT-derived code for high-performance tracing is within the realm of reason. > without all the awk-style language crap that seems to come with > systemtap. I'm sorry to hear you dislike the scripting language. But that's okay, you Real Men can embed literal C code inside systemtap scripts to do the Real Work, and leave to systemtap only sundry duties such as probe placement and removal. - FChE ^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 19:48 ` Frank Ch. Eigler
@ 2006-09-15 16:32 ` Jose R. Santos
0 siblings, 0 replies; 271+ messages in thread
From: Jose R. Santos @ 2006-09-15 16:32 UTC (permalink / raw)
To: Martin J. Bligh
Cc: Frank Ch. Eigler, Ingo Molnar, Mathieu Desnoyers, linux-kernel,
Christoph Hellwig, Andrew Morton, Ingo Molnar, Greg Kroah-Hartman,
Thomas Gleixner, Tom Zanussi, ltt-dev, Michel Dagenais

Frank Ch. Eigler wrote:
> "Martin J. Bligh" <mbligh@mbligh.org> writes:
>
> > without all the awk-style language crap that seems to come with
> > systemtap.
>
> I'm sorry to hear you dislike the scripting language. But that's
> okay, you Real Men can embed literal C code inside systemtap scripts
> to do the Real Work, and leave to systemtap only sundry duties such as
> probe placement and removal.
>

There are also a couple of projects within SystemTap that provide
trace-like functionality without the need to use the SystemTap
language. In the case of LKET, we've tried to make this as simple as
possible by predefining probe points using the SystemTap language and
embedded C code, but from a user's perspective all he really needs to
do is invoke a simple script like:

	#! stap
	process_snapshot() {}
	addevent.tskdispatch.cpuidle {}
	addevent.process {}
	addevent.syscall.entry { printf ("%4b", $flags) }
	addevent.syscall.exit {}
	addevent.tskdispatch.cpuidle {}

The data can later be analyzed in user-space with whatever method you
like. The developer instrumenting the probe point needs to know the
SystemTap language, but the user of the trace just needs to know which
events are available to him.

We also plan to do static tracing once SystemTap supports static
markers. This may not be the perfect solution, but I'm interested in
knowing how we can get there.

-JRS
^ permalink raw reply [flat|nested] 271+ messages in thread
* Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
2006-09-14 11:27 ` Ingo Molnar
` (2 preceding siblings ...)
2006-09-14 15:14 ` Martin J. Bligh
@ 2006-09-19 11:59 ` Christoph Hellwig
3 siblings, 0 replies; 271+ messages in thread
From: Christoph Hellwig @ 2006-09-19 11:59 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, linux-kernel, Christoph Hellwig, Andrew Morton,
Ingo Molnar, Greg Kroah-Hartman, Thomas Gleixner, Tom Zanussi,
ltt-dev, Michel Dagenais
On Thu, Sep 14, 2006 at 01:27:18PM +0200, Ingo Molnar wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Following an advice Christoph gave me this summer, submitting a
> > smaller, easier to review patch should make everybody happier. Here is
> > a stripped down version of LTTng : I removed everything that would
> > make the code review reluctant (especially kernel instrumentation and
> > kernel state dump module). I plan to release this "core" version every
> > few LTTng releases and post it to LKML.
> >
> > Comments and reviews are very welcome.
>
> i have one very fundamental question: why should we do this
> source-intrusive method of adding tracepoints instead of the dynamic,
> unintrusive (and thus zero-overhead) KProbes+SystemTap method?
Coming a little late to this thread because I've been travelling the
last three weeks, I'll answer here before wading through hundreds of
mails.

I'll sort tracing methods into a few categories:

a) static and in-line

These are tracepoints directly in the kernel source, always
compiled in (or under a CONFIG option). We have various ad-hoc
tracers of this type already in the kernel, e.g. blktrace or
xfs's ktrace.

b) dynamic and in-line (markers)

These are in-line but normally don't do anything in the code
except maybe adding a nop. We currently don't support this at
all.

c) dynamic and out-of-line

These are maintained as external modules or things that need to
be translated to modules. We currently have various low-level
mechanisms to implement the hooking up of those (*probes) but no
other infrastructure in the kernel to help with them. There's an
external project, systemtap, which supports probes like those but
has a bunch of problems:

- it doesn't allow writing scripts in C but only in some odd
scripting language
- it doesn't actually put support code into the kernel tree but
keeps it separate, not allowing to keep probes with the
kernel either. In addition it also needs quite frequent
updates because it has to poke deep into kernel internals by
its nature.

So what's the right way of tracing for us? I'd say pretty clearly all
three, and most importantly we need to have a common infrastructure
for all of those. The most important bit we need right now is a
reliable framework to transfer trace data to userspace. Once we have
that we can support a) and a subset of b) above. LTT might be that
missing bit, but I'd need to look at the actual patches to see if it's
suitable. b) is something people have talked about a lot and we've
seen lots of prototypes; in my eyes it's the second priority. But even
after that the way we support c) is very rudimentary: we need helpers
to look at data, and to put probes at points other than function
entry/return we need things like a DWARF parser, and so on.

I think the systemtap approach of the external package is the very
last thing we need. Unlike you said elsewhere, having the tracepoints
externally does not eliminate maintenance overhead, it shifts it to
someone else. Shifting maintenance overhead to someone else is a valid
concept in Linux kernel development; we do this all the time for
things we don't care about. I think it's fundamentally wrong for
traces, though. Traces are very important for debugging complex
problems, and I've grown very tired of maintaining all my ad-hoc
scripts. Having them in the kernel tree, or having traces that are
static by nature in-line, would allow and force kernel developers to
always keep them up to date with the kernel's changes.
^ permalink raw reply [flat|nested] 271+ messages in thread
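[Editorial illustration, not part of the original message.] The
difference between categories a) and b) in the message above can be
sketched in plain user-space C. This is only a minimal sketch under
stated assumptions: every name in it (DEMO_STATIC_TRACE, static_trace,
demo_marker, demo_marker_attach, counting_probe, do_work) is
hypothetical and invented for this example, and a function-pointer
check merely stands in for the "nop until enabled" behaviour of a real
marker. It is not LTTng, kprobes, or any in-kernel API. Category c) is
not shown because it needs runtime code patching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* (a) static, in-line: present or absent depending on a build-time
 * switch, comparable to hiding a tracepoint behind a CONFIG option.
 * DEMO_STATIC_TRACE is a hypothetical macro for this sketch. */
#ifdef DEMO_STATIC_TRACE
#define static_trace(msg) printf("static trace: %s\n", (msg))
#else
#define static_trace(msg) do { } while (0)
#endif

/* (b) dynamic, in-line: the marker is always compiled in, but it only
 * does work once a probe has been attached at run time. */
typedef void (*probe_fn)(const char *name, int value);

static probe_fn demo_marker_probe = NULL;

static inline void demo_marker(const char *name, int value)
{
    if (demo_marker_probe)              /* cheap test while disabled */
        demo_marker_probe(name, value);
}

static void demo_marker_attach(probe_fn fn)
{
    assert(fn != NULL);
    demo_marker_probe = fn;
}

static void demo_marker_detach(void)
{
    demo_marker_probe = NULL;
}

/* Example probe: counts how often the marker fired. */
static int events_seen;

static void counting_probe(const char *name, int value)
{
    (void)name;
    (void)value;
    events_seen++;
}

/* Instrumented function carrying both kinds of trace point. */
static int do_work(int x)
{
    static_trace("do_work entry");      /* category (a) */
    demo_marker("do_work", x);          /* category (b) */
    return x * 2;
}
```

Calling do_work() before demo_marker_attach(counting_probe) leaves
events_seen untouched; after attaching, each call increments it, and
demo_marker_detach() returns the marker to its idle state. The static
trace point, by contrast, can only be switched by rebuilding with
DEMO_STATIC_TRACE defined.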
end of thread, other threads:[~2006-09-25 15:47 UTC | newest]
Thread overview: 271+ messages
2006-09-15 17:14 [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108 Chuck Ebbert
2006-09-15 18:32 ` Alan Cox
2006-09-16 10:46 ` Jes Sorensen
-- strict thread matches above, loose matches on Subject: below --
2006-09-25 15:20 Chuck Ebbert
2006-09-25 15:39 ` Ingo Molnar
2006-09-15 9:17 Richard J Moore
2006-09-15 3:10 James Dickens
2006-09-14 3:38 Mathieu Desnoyers
2006-09-14 11:27 ` Ingo Molnar
2006-09-14 13:40 ` Roman Zippel
2006-09-14 13:55 ` Ingo Molnar
2006-09-14 14:33 ` Roman Zippel
2006-09-14 15:26 ` Michel Dagenais
2006-09-14 17:48 ` Ingo Molnar
2006-09-15 15:04 ` Mathieu Desnoyers
2006-09-14 18:08 ` Nick Piggin
2006-09-14 18:38 ` Karim Yaghmour
2006-09-14 17:13 ` Ingo Molnar
2006-09-14 17:55 ` Roman Zippel
2006-09-14 18:15 ` Ingo Molnar
2006-09-14 18:35 ` Mathieu Desnoyers
2006-09-14 18:54 ` Karim Yaghmour
2006-09-15 9:20 ` Jes Sorensen
2006-09-15 12:38 ` Karim Yaghmour
2006-09-15 12:32 ` Jes Sorensen
2006-09-15 14:09 ` Karim Yaghmour
2006-09-15 14:30 ` Jes Sorensen
2006-09-15 15:12 ` Karim Yaghmour
2006-09-16 10:41 ` Jes Sorensen
2006-09-16 15:28 ` Karim Yaghmour
2006-09-18 8:57 ` Jes Sorensen
2006-09-18 14:48 ` Ingo Molnar
2006-09-18 15:37 ` Karim Yaghmour
2006-09-15 13:20 ` Paul Mundt
2006-09-15 13:41 ` Roman Zippel
2006-09-15 13:44 ` Jes Sorensen
2006-09-15 14:03 ` Roman Zippel
2006-09-15 14:37 ` Alan Cox
2006-09-15 14:34 ` Roman Zippel
2006-09-15 13:57 ` Paul Mundt
2006-09-15 14:17 ` Karim Yaghmour
2006-09-15 14:13 ` Jes Sorensen
2006-09-15 14:31 ` Karim Yaghmour
2006-09-15 14:28 ` Paul Mundt
2006-09-15 14:46 ` Martin J. Bligh
2006-09-15 15:22 ` Alan Cox
2006-09-15 15:47 ` Martin J. Bligh
2006-09-15 14:51 ` Karim Yaghmour
2006-09-15 15:00 ` Thomas Gleixner
2006-09-15 15:28 ` Karim Yaghmour
2006-09-15 18:16 ` Andrew Morton
2006-09-15 18:19 ` Ingo Molnar
2006-09-15 19:26 ` Karim Yaghmour
2006-09-15 19:43 ` Roman Zippel
2006-09-15 20:05 ` Ingo Molnar
2006-09-15 20:22 ` Mathieu Desnoyers
2006-09-15 21:08 ` Jose R. Santos
2006-09-15 21:25 ` Mathieu Desnoyers
2006-09-15 22:02 ` Jose R. Santos
2006-09-15 22:03 ` Ingo Molnar
2006-09-15 22:32 ` Karim Yaghmour
2006-09-15 22:43 ` Ingo Molnar
2006-09-15 23:33 ` Karim Yaghmour
2006-09-15 23:52 ` Ingo Molnar
2006-09-16 2:24 ` Karim Yaghmour
2006-09-15 23:53 ` Ingo Molnar
2006-09-16 2:51 ` Karim Yaghmour
2006-09-15 22:59 ` Frank Ch. Eigler
2006-09-15 23:40 ` Karim Yaghmour
2006-09-15 23:17 ` Jose R. Santos
2006-09-15 21:32 ` Ingo Molnar
2006-09-15 21:58 ` Mathieu Desnoyers
2006-09-15 22:19 ` Ingo Molnar
2006-09-15 22:45 ` Karim Yaghmour
2006-09-16 9:59 ` Jes Sorensen
2006-09-16 17:24 ` Mathieu Desnoyers
2006-09-16 17:35 ` Ingo Molnar
2006-09-16 17:56 ` Mathieu Desnoyers
2006-09-16 19:10 ` Ingo Molnar
2006-09-16 19:37 ` Ingo Molnar
2006-09-17 10:13 ` Frederik Deweerdt
2006-09-17 14:00 ` Ingo Molnar
2006-09-16 19:51 ` Karim Yaghmour
2006-09-16 23:40 ` Ingo Molnar
2006-09-17 5:33 ` Mathieu Desnoyers
2006-09-16 18:11 ` Karim Yaghmour
2006-09-16 17:44 ` Ingo Molnar
2006-09-16 18:15 ` Karim Yaghmour
2006-09-18 8:18 ` Jes Sorensen
2006-09-16 17:55 ` Karim Yaghmour
2006-09-18 8:21 ` Jes Sorensen
2006-09-18 8:33 ` Jes Sorensen
2006-09-18 15:01 ` Mathieu Desnoyers
2006-09-16 17:30 ` Mathieu Desnoyers
2006-09-18 8:15 ` Jes Sorensen
2006-09-18 14:53 ` Mathieu Desnoyers
2006-09-18 15:17 ` Ingo Molnar
2006-09-18 16:54 ` Mathieu Desnoyers
2006-09-15 21:12 ` Roman Zippel
2006-09-15 21:08 ` Ingo Molnar
2006-09-15 20:13 ` Andrew Morton
2006-09-15 21:49 ` Jose R. Santos
2006-09-16 10:19 ` Jes Sorensen
2006-09-16 16:05 ` Karim Yaghmour
2006-09-17 4:54 ` Ganesan Rajagopal
2006-09-18 8:13 ` Jes Sorensen
2006-09-18 14:46 ` Mathieu Desnoyers
2006-09-18 17:06 ` Martin Bligh
2006-09-20 14:17 ` Jes Sorensen
2006-09-15 19:35 ` Thomas Gleixner
2006-09-15 19:40 ` Ingo Molnar
2006-09-15 19:56 ` Karim Yaghmour
2006-09-15 20:23 ` Thomas Gleixner
2006-09-15 20:40 ` Roman Zippel
2006-09-15 20:48 ` Ingo Molnar
2006-09-15 21:17 ` Karim Yaghmour
2006-09-15 21:15 ` Ingo Molnar
2006-09-15 21:56 ` Karim Yaghmour
2006-09-15 21:27 ` Roman Zippel
2006-09-15 21:51 ` Ingo Molnar
2006-09-15 22:15 ` Karim Yaghmour
2006-09-15 22:53 ` Roman Zippel
2006-09-15 23:14 ` Ingo Molnar
2006-09-15 23:49 ` Nicholas Miell
2006-09-15 23:57 ` Ingo Molnar
2006-09-16 0:41 ` Nicholas Miell
2006-09-16 0:31 ` Roman Zippel
2006-09-16 8:20 ` Ingo Molnar
2006-09-16 8:21 ` Ingo Molnar
2006-09-16 8:21 ` Ingo Molnar
2006-09-16 8:22 ` Ingo Molnar
2006-09-16 19:58 ` Roman Zippel
2006-09-16 22:50 ` Ingo Molnar
2006-09-16 23:00 ` Ingo Molnar
2006-09-17 1:15 ` Roman Zippel
2006-09-17 8:42 ` Ingo Molnar
2006-09-17 15:16 ` Roman Zippel
2006-09-17 15:25 ` Ingo Molnar
2006-09-17 16:02 ` Roman Zippel
2006-09-17 16:45 ` Ingo Molnar
2006-09-17 16:59 ` Nick Piggin
2006-09-17 17:26 ` Roman Zippel
2006-09-17 17:56 ` Nick Piggin
2006-09-17 18:59 ` Roman Zippel
2006-09-17 21:23 ` Ingo Molnar
2006-09-17 21:52 ` Roman Zippel
2006-09-17 22:27 ` Ingo Molnar
2006-09-17 21:40 ` Ingo Molnar
2006-09-18 8:43 ` Jes Sorensen
2006-09-17 21:32 ` Ingo Molnar
2006-09-17 19:23 ` Ingo Molnar
2006-09-17 19:45 ` Roman Zippel
2006-09-17 20:56 ` Ingo Molnar
2006-09-17 21:36 ` Roman Zippel
2006-09-17 22:13 ` Ingo Molnar
2006-09-16 23:14 ` Ingo Molnar
2006-09-17 14:19 ` Frank Ch. Eigler
2006-09-17 15:31 ` Ingo Molnar
2006-09-17 17:15 ` Mathieu Desnoyers
[not found] ` <y0mu036eglz.fsf@ton.toronto.redhat.com>
2006-09-17 15:00 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
2006-09-16 8:23 ` Ingo Molnar
2006-09-15 21:05 ` Karim Yaghmour
2006-09-15 21:17 ` Thomas Gleixner
2006-09-15 21:31 ` Karim Yaghmour
2006-09-15 20:00 ` Mathieu Desnoyers
2006-09-15 20:27 ` Jose R. Santos
2006-09-15 20:37 ` Alan Cox
2006-09-15 20:26 ` Mathieu Desnoyers
2006-09-15 20:51 ` Karim Yaghmour
2006-09-17 17:53 ` Mathieu Desnoyers
2006-09-15 15:24 ` Alan Cox
2006-09-15 15:23 ` Karim Yaghmour
2006-09-15 14:39 ` Jes Sorensen
2006-09-15 15:04 ` Karim Yaghmour
2006-09-14 19:40 ` Tim Bird
2006-09-14 20:00 ` Ingo Molnar
2006-09-14 20:46 ` Karim Yaghmour
2006-09-19 12:05 ` Christoph Hellwig
2006-09-14 21:02 ` Roman Zippel
2006-09-15 11:40 ` Alan Cox
2006-09-15 11:46 ` Roman Zippel
2006-09-15 12:38 ` Alan Cox
2006-09-15 12:39 ` Roman Zippel
2006-09-15 13:41 ` Alan Cox
2006-09-15 13:34 ` Roman Zippel
2006-09-15 14:41 ` Alan Cox
2006-09-15 14:35 ` Karim Yaghmour
2006-09-15 14:58 ` Alan Cox
2006-09-15 14:57 ` Karim Yaghmour
2006-09-15 17:49 ` Andrew Morton
2006-09-15 18:20 ` Karim Yaghmour
2006-09-15 17:01 ` Tim Bird
2006-09-15 17:08 ` Frank Ch. Eigler
2006-09-15 17:57 ` Andrew Morton
2006-09-15 18:31 ` Alan Cox
2006-09-15 18:12 ` Ingo Molnar
2006-09-15 19:10 ` Roman Zippel
2006-09-15 19:10 ` Ingo Molnar
2006-09-15 20:05 ` Thomas Gleixner
2006-09-15 20:35 ` Roman Zippel
2006-09-15 21:44 ` Tim Bird
2006-09-19 12:29 ` Christoph Hellwig
2006-09-19 13:17 ` Roman Zippel
2006-09-15 18:24 ` Frank Ch. Eigler
2006-09-15 18:23 ` Ingo Molnar
2006-09-15 18:18 ` Martin Bligh
2006-09-15 18:10 ` Jose R. Santos
2006-09-15 19:49 ` Mathieu Desnoyers
2006-09-15 20:54 ` Jose R. Santos
2006-09-15 21:42 ` Karim Yaghmour
2006-09-15 21:46 ` Mathieu Desnoyers
2006-09-19 15:05 ` Jose R. Santos
2006-09-19 15:30 ` Mathieu Desnoyers
2006-09-19 16:39 ` Jose R. Santos
2006-09-19 18:03 ` Mathieu Desnoyers
2006-09-15 17:45 ` Andrew Morton
2006-09-15 18:16 ` Karim Yaghmour
2006-09-15 19:20 ` Jose R. Santos
2006-09-15 19:59 ` Andrew Morton
2006-09-15 20:24 ` Karim Yaghmour
2006-09-15 20:25 ` Thomas Gleixner
2006-09-14 19:47 ` Roman Zippel
2006-09-14 20:24 ` Ingo Molnar
2006-09-14 20:54 ` Roman Zippel
2006-09-14 21:08 ` Daniel Walker
2006-09-14 21:30 ` Roman Zippel
2006-09-14 22:15 ` Ingo Molnar
2006-09-14 23:39 ` Roman Zippel
2006-09-14 23:43 ` Ingo Molnar
2006-09-15 0:27 ` Roman Zippel
2006-09-15 1:47 ` Mathieu Desnoyers
2006-09-15 5:47 ` Vara Prasad
2006-09-14 18:12 ` Karim Yaghmour
2006-09-14 20:25 ` Martin Bligh
2006-09-14 20:34 ` Ingo Molnar
2006-09-14 20:55 ` Martin Bligh
2006-09-14 21:31 ` Ingo Molnar
2006-09-14 22:25 ` Martin Bligh
2006-09-14 22:36 ` Ingo Molnar
2006-09-14 22:59 ` Martin Bligh
2006-09-14 23:19 ` Ingo Molnar
2006-09-15 0:19 ` Nicholas Miell
2006-09-15 1:04 ` Martin J. Bligh
2006-09-15 12:38 ` Ingo Molnar
2006-09-15 7:00 ` Vara Prasad
2006-09-15 15:37 ` Michel Dagenais
2006-09-19 12:08 ` Christoph Hellwig
2006-09-14 21:07 ` Roman Zippel
2006-09-15 9:29 ` Jes Sorensen
2006-09-14 17:51 ` Karim Yaghmour
2006-09-14 15:19 ` Mathieu Desnoyers
2006-09-14 19:39 ` Frank Ch. Eigler
2006-09-15 17:13 ` Jose R. Santos
2006-09-14 15:02 ` Mathieu Desnoyers
2006-09-14 15:14 ` Martin J. Bligh
2006-09-14 17:43 ` Ingo Molnar
2006-09-14 18:25 ` Karim Yaghmour
2006-09-14 20:03 ` Martin Bligh
2006-09-14 20:14 ` Ingo Molnar
2006-09-14 20:40 ` Martin Bligh
2006-09-14 21:05 ` Michel Dagenais
2006-09-14 22:23 ` Ingo Molnar
2006-09-14 22:46 ` Martin Bligh
2006-09-14 22:56 ` Ingo Molnar
2006-09-14 19:03 ` grundig
2006-09-14 19:21 ` Karim Yaghmour
2006-09-14 19:48 ` Frank Ch. Eigler
2006-09-15 16:32 ` Jose R. Santos
2006-09-19 11:59 ` Christoph Hellwig