* [patch 01/16] RCU read sched notrace
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-16 4:00 ` Paul E. McKenney
2008-11-14 22:47 ` [patch 02/16] Marker fix unregister Mathieu Desnoyers
` (14 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra,
Paul E McKenney
[-- Attachment #1: rcu-read-sched-notrace.patch --]
[-- Type: text/plain, Size: 1284 bytes --]
Add _notrace version to rcu_read_*_sched().
Impact: new API, useful for tracepoints and markers.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E McKenney <paulmck@linux.vnet.ibm.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: akpm@linux-foundation.org
---
include/linux/rcupdate.h | 2 ++
1 file changed, 2 insertions(+)
Index: linux-2.6-lttng/include/linux/rcupdate.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/rcupdate.h 2008-10-08 10:27:57.000000000 -0400
+++ linux-2.6-lttng/include/linux/rcupdate.h 2008-10-08 10:40:48.000000000 -0400
@@ -142,6 +142,7 @@ struct rcu_head {
* on the write-side to insure proper synchronization.
*/
#define rcu_read_lock_sched() preempt_disable()
+#define rcu_read_lock_sched_notrace() preempt_disable_notrace()
/*
* rcu_read_unlock_sched - marks the end of a RCU-classic critical section
@@ -149,6 +150,7 @@ struct rcu_head {
* See rcu_read_lock_sched for more information.
*/
#define rcu_read_unlock_sched() preempt_enable()
+#define rcu_read_unlock_sched_notrace() preempt_enable_notrace()
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 01/16] RCU read sched notrace
2008-11-14 22:47 ` [patch 01/16] RCU read sched notrace Mathieu Desnoyers
@ 2008-11-16 4:00 ` Paul E. McKenney
0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2008-11-16 4:00 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, linux-kernel, akpm, Linus Torvalds, Peter Zijlstra
On Fri, Nov 14, 2008 at 05:47:34PM -0500, Mathieu Desnoyers wrote:
> Add _notrace version to rcu_read_*_sched().
>
> Impact: new API, useful for tracepoints and markers.
Looks good to me. Just don't try it with rcu_read_lock(), given
preemptable RCU. ;-)
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: Paul E McKenney <paulmck@linux.vnet.ibm.com>
> CC: 'Ingo Molnar' <mingo@elte.hu>
> CC: akpm@linux-foundation.org
> ---
> include/linux/rcupdate.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> Index: linux-2.6-lttng/include/linux/rcupdate.h
> ===================================================================
> --- linux-2.6-lttng.orig/include/linux/rcupdate.h 2008-10-08 10:27:57.000000000 -0400
> +++ linux-2.6-lttng/include/linux/rcupdate.h 2008-10-08 10:40:48.000000000 -0400
> @@ -142,6 +142,7 @@ struct rcu_head {
> * on the write-side to insure proper synchronization.
> */
> #define rcu_read_lock_sched() preempt_disable()
> +#define rcu_read_lock_sched_notrace() preempt_disable_notrace()
>
> /*
> * rcu_read_unlock_sched - marks the end of a RCU-classic critical section
> @@ -149,6 +150,7 @@ struct rcu_head {
> * See rcu_read_lock_sched for more information.
> */
> #define rcu_read_unlock_sched() preempt_enable()
> +#define rcu_read_unlock_sched_notrace() preempt_enable_notrace()
>
>
>
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread
* [patch 02/16] Marker fix unregister
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
2008-11-14 22:47 ` [patch 01/16] RCU read sched notrace Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 03/16] markers: Add missing stdargs.h include, needed due to va_list usage Mathieu Desnoyers
` (13 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: markers-fix-unregister-private-data.patch --]
[-- Type: text/plain, Size: 1896 bytes --]
get_marker() can return a NULL entry because the mutex is released in the middle
of those functions. Make sure we check to see if it has been concurrently
removed.
Impact: Bugfix.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/marker.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
Index: linux.trees.git/kernel/marker.c
===================================================================
--- linux.trees.git.orig/kernel/marker.c 2008-11-14 17:38:39.000000000 -0500
+++ linux.trees.git/kernel/marker.c 2008-11-14 17:38:43.000000000 -0500
@@ -653,10 +653,11 @@ int marker_probe_register(const char *na
goto end;
}
mutex_unlock(&markers_mutex);
- marker_update_probes(); /* may update entry */
+ marker_update_probes();
mutex_lock(&markers_mutex);
entry = get_marker(name);
- WARN_ON(!entry);
+ if (!entry)
+ goto end;
if (entry->rcu_pending)
rcu_barrier_sched();
entry->oldptr = old;
@@ -697,7 +698,7 @@ int marker_probe_unregister(const char *
rcu_barrier_sched();
old = marker_entry_remove_probe(entry, probe, probe_private);
mutex_unlock(&markers_mutex);
- marker_update_probes(); /* may update entry */
+ marker_update_probes();
mutex_lock(&markers_mutex);
entry = get_marker(name);
if (!entry)
@@ -778,10 +779,11 @@ int marker_probe_unregister_private_data
rcu_barrier_sched();
old = marker_entry_remove_probe(entry, NULL, probe_private);
mutex_unlock(&markers_mutex);
- marker_update_probes(); /* may update entry */
+ marker_update_probes();
mutex_lock(&markers_mutex);
entry = get_marker_from_private_data(probe, probe_private);
- WARN_ON(!entry);
+ if (!entry)
+ goto end;
if (entry->rcu_pending)
rcu_barrier_sched();
entry->oldptr = old;
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 03/16] markers: Add missing stdargs.h include, needed due to va_list usage
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
2008-11-14 22:47 ` [patch 01/16] RCU read sched notrace Mathieu Desnoyers
2008-11-14 22:47 ` [patch 02/16] Marker fix unregister Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 04/16] Markers use rcu_*_sched_notrace and notrace Mathieu Desnoyers
` (12 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: markers-add-missing-stdargs.patch --]
[-- Type: text/plain, Size: 898 bytes --]
That seemed to cause built issue when marker.h is included early, even though
stdargs.h is included in kernel.h.
Impact: bugfix.
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
include/linux/marker.h | 1 +
1 file changed, 1 insertion(+)
Index: linux-2.6-lttng/include/linux/marker.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/marker.h 2008-11-14 16:58:54.000000000 -0500
+++ linux-2.6-lttng/include/linux/marker.h 2008-11-14 16:59:00.000000000 -0500
@@ -12,6 +12,7 @@
* See the file COPYING for more details.
*/
+#include <stdarg.h>
#include <linux/types.h>
struct module;
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 04/16] Markers use rcu_*_sched_notrace and notrace
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (2 preceding siblings ...)
2008-11-14 22:47 ` [patch 03/16] markers: Add missing stdargs.h include, needed due to va_list usage Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 05/16] Markers use module notifier Mathieu Desnoyers
` (11 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Masami Hiramatsu,
Peter Zijlstra, Frank Ch. Eigler, Lai Jiangshan, Hideo AOKI,
Takashi Nishiie, Steven Rostedt
[-- Attachment #1: markers-use-rcu-sched-notrace.patch --]
[-- Type: text/plain, Size: 3068 bytes --]
Make marker critical code use notrace (__attribute__((no_instrument_function))
to make sure they can be used as an ftrace callback.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
---
kernel/marker.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
Index: linux-2.6-lttng/kernel/marker.c
===================================================================
--- linux-2.6-lttng.orig/kernel/marker.c 2008-10-15 09:43:15.000000000 -0400
+++ linux-2.6-lttng/kernel/marker.c 2008-10-15 09:44:08.000000000 -0400
@@ -81,7 +81,7 @@ struct marker_entry {
* though the function pointer change and the marker enabling are two distinct
* operations that modifies the execution flow of preemptible code.
*/
-void __mark_empty_function(void *probe_private, void *call_private,
+notrace void __mark_empty_function(void *probe_private, void *call_private,
const char *fmt, va_list *args)
{
}
@@ -97,7 +97,8 @@ EXPORT_SYMBOL_GPL(__mark_empty_function)
* need to put a full smp_rmb() in this branch. This is why we do not use
* rcu_dereference() for the pointer read.
*/
-void marker_probe_cb(const struct marker *mdata, void *call_private, ...)
+notrace void marker_probe_cb(const struct marker *mdata,
+ void *call_private, ...)
{
va_list args;
char ptype;
@@ -107,7 +108,7 @@ void marker_probe_cb(const struct marker
* sure the teardown of the callbacks can be done correctly when they
* are in modules and they insure RCU read coherency.
*/
- rcu_read_lock_sched();
+ rcu_read_lock_sched_notrace();
ptype = mdata->ptype;
if (likely(!ptype)) {
marker_probe_func *func;
@@ -145,7 +146,7 @@ void marker_probe_cb(const struct marker
va_end(args);
}
}
- rcu_read_unlock_sched();
+ rcu_read_unlock_sched_notrace();
}
EXPORT_SYMBOL_GPL(marker_probe_cb);
@@ -157,12 +158,13 @@ EXPORT_SYMBOL_GPL(marker_probe_cb);
*
* Should be connected to markers "MARK_NOARGS".
*/
-static void marker_probe_cb_noarg(const struct marker *mdata, void *call_private, ...)
+static notrace void marker_probe_cb_noarg(const struct marker *mdata,
+ void *call_private, ...)
{
va_list args; /* not initialized */
char ptype;
- rcu_read_lock_sched();
+ rcu_read_lock_sched_notrace();
ptype = mdata->ptype;
if (likely(!ptype)) {
marker_probe_func *func;
@@ -195,7 +197,7 @@ static void marker_probe_cb_noarg(const
multi[i].func(multi[i].probe_private, call_private,
mdata->format, &args);
}
- rcu_read_unlock_sched();
+ rcu_read_unlock_sched_notrace();
}
static void free_old_closure(struct rcu_head *head)
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 05/16] Markers use module notifier
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (3 preceding siblings ...)
2008-11-14 22:47 ` [patch 04/16] Markers use rcu_*_sched_notrace and notrace Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-16 8:52 ` [PATCH] markers/tracpoints: fix non-modular build Ingo Molnar
2008-11-14 22:47 ` [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp()) Mathieu Desnoyers
` (10 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: markers-use-modules-notifiers.patch --]
[-- Type: text/plain, Size: 2110 bytes --]
Use module notifiers instead of adding a hook in module.c.
Impact: cleanup.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/marker.c | 29 +++++++++++++++++++++++++++++
kernel/module.c | 4 ----
2 files changed, 29 insertions(+), 4 deletions(-)
Index: linux.trees.git/kernel/marker.c
===================================================================
--- linux.trees.git.orig/kernel/marker.c 2008-11-14 17:39:27.000000000 -0500
+++ linux.trees.git/kernel/marker.c 2008-11-14 17:39:28.000000000 -0500
@@ -846,3 +846,32 @@ void *marker_get_private_data(const char
return ERR_PTR(-ENOENT);
}
EXPORT_SYMBOL_GPL(marker_get_private_data);
+
+int marker_module_notify(struct notifier_block *self,
+ unsigned long val, void *data)
+{
+ struct module *mod = data;
+
+ switch (val) {
+ case MODULE_STATE_COMING:
+ marker_update_probe_range(mod->markers,
+ mod->markers + mod->num_markers);
+ break;
+ case MODULE_STATE_GOING:
+ marker_update_probe_range(mod->markers,
+ mod->markers + mod->num_markers);
+ break;
+ }
+ return 0;
+}
+
+struct notifier_block marker_module_nb = {
+ .notifier_call = marker_module_notify,
+ .priority = 0,
+};
+
+static int init_markers(void)
+{
+ return register_module_notifier(&marker_module_nb);
+}
+__initcall(init_markers);
Index: linux.trees.git/kernel/module.c
===================================================================
--- linux.trees.git.orig/kernel/module.c 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/kernel/module.c 2008-11-14 17:39:28.000000000 -0500
@@ -2185,10 +2185,6 @@ static noinline struct module *load_modu
struct mod_debug *debug;
unsigned int num_debug;
-#ifdef CONFIG_MARKERS
- marker_update_probe_range(mod->markers,
- mod->markers + mod->num_markers);
-#endif
debug = section_objs(hdr, sechdrs, secstrings, "__verbose",
sizeof(*debug), &num_debug);
dynamic_printk_setup(debug, num_debug);
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [PATCH] markers/tracpoints: fix non-modular build
2008-11-14 22:47 ` [patch 05/16] Markers use module notifier Mathieu Desnoyers
@ 2008-11-16 8:52 ` Ingo Molnar
2008-11-17 5:39 ` Mathieu Desnoyers
0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2008-11-16 8:52 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Use module notifiers instead of adding a hook in module.c.
>
> Impact: cleanup.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> CC: 'Ingo Molnar' <mingo@elte.hu>
> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> kernel/marker.c | 29 +++++++++++++++++++++++++++++
> kernel/module.c | 4 ----
> 2 files changed, 29 insertions(+), 4 deletions(-)
>From 227a837567e339c74d9d4243d03a29bd943a018c Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Sun, 16 Nov 2008 09:50:34 +0100
Subject: [PATCH] markers/tracpoints: fix non-modular build
fix:
kernel/marker.c: In function 'marker_module_notify':
kernel/marker.c:905: error: 'MODULE_STATE_COMING' undeclared (first use in this function)
[...]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/marker.c | 4 ++++
kernel/tracepoint.c | 4 ++++
2 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/kernel/marker.c b/kernel/marker.c
index c14ec26..ea54f26 100644
--- a/kernel/marker.c
+++ b/kernel/marker.c
@@ -896,6 +896,8 @@ void *marker_get_private_data(const char *name, marker_probe_func *probe,
}
EXPORT_SYMBOL_GPL(marker_get_private_data);
+#ifdef CONFIG_MODULES
+
int marker_module_notify(struct notifier_block *self,
unsigned long val, void *data)
{
@@ -924,3 +926,5 @@ static int init_markers(void)
return register_module_notifier(&marker_module_nb);
}
__initcall(init_markers);
+
+#endif /* CONFIG_MODULES */
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 94ac4e3..7960274 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -542,6 +542,8 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
}
EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
+#ifdef CONFIG_MODULES
+
int tracepoint_module_notify(struct notifier_block *self,
unsigned long val, void *data)
{
@@ -570,3 +572,5 @@ static int init_tracepoints(void)
return register_module_notifier(&tracepoint_module_nb);
}
__initcall(init_tracepoints);
+
+#endif /* CONFIG_MODULES */
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH] markers/tracpoints: fix non-modular build
2008-11-16 8:52 ` [PATCH] markers/tracpoints: fix non-modular build Ingo Molnar
@ 2008-11-17 5:39 ` Mathieu Desnoyers
0 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-17 5:39 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Use module notifiers instead of adding a hook in module.c.
> >
> > Impact: cleanup.
> >
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> > CC: 'Ingo Molnar' <mingo@elte.hu>
> > CC: Lai Jiangshan <laijs@cn.fujitsu.com>
> > ---
> > kernel/marker.c | 29 +++++++++++++++++++++++++++++
> > kernel/module.c | 4 ----
> > 2 files changed, 29 insertions(+), 4 deletions(-)
>
> From 227a837567e339c74d9d4243d03a29bd943a018c Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@elte.hu>
> Date: Sun, 16 Nov 2008 09:50:34 +0100
> Subject: [PATCH] markers/tracpoints: fix non-modular build
>
> fix:
>
> kernel/marker.c: In function 'marker_module_notify':
> kernel/marker.c:905: error: 'MODULE_STATE_COMING' undeclared (first use in this function)
> [...]
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yes, that's needed. Thanks for preparing this patch.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> ---
> kernel/marker.c | 4 ++++
> kernel/tracepoint.c | 4 ++++
> 2 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/marker.c b/kernel/marker.c
> index c14ec26..ea54f26 100644
> --- a/kernel/marker.c
> +++ b/kernel/marker.c
> @@ -896,6 +896,8 @@ void *marker_get_private_data(const char *name, marker_probe_func *probe,
> }
> EXPORT_SYMBOL_GPL(marker_get_private_data);
>
> +#ifdef CONFIG_MODULES
> +
> int marker_module_notify(struct notifier_block *self,
> unsigned long val, void *data)
> {
> @@ -924,3 +926,5 @@ static int init_markers(void)
> return register_module_notifier(&marker_module_nb);
> }
> __initcall(init_markers);
> +
> +#endif /* CONFIG_MODULES */
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 94ac4e3..7960274 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -542,6 +542,8 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
> }
> EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
>
> +#ifdef CONFIG_MODULES
> +
> int tracepoint_module_notify(struct notifier_block *self,
> unsigned long val, void *data)
> {
> @@ -570,3 +572,5 @@ static int init_tracepoints(void)
> return register_module_notifier(&tracepoint_module_nb);
> }
> __initcall(init_tracepoints);
> +
> +#endif /* CONFIG_MODULES */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread
* [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (4 preceding siblings ...)
2008-11-14 22:47 ` [patch 05/16] Markers use module notifier Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-16 7:59 ` Ingo Molnar
2008-11-14 22:47 ` [patch 07/16] Markers : create DEFINE_MARKER and GET_MARKER (new API) Mathieu Desnoyers
` (9 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: markers-auto-enable-tracepoints.patch --]
[-- Type: text/plain, Size: 6909 bytes --]
Add a new API trace_mark_tp(), which declares a marker within a tracepoint
probe. When the marker is activated, the tracepoint is automatically enabled.
No branch test is used at the marker site, because it would be a duplicate of
the branch already present in the tracepoint.
Impact: new API.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
include/linux/marker.h | 45 ++++++++++++++++++++++++++++++++++++++++-
init/Kconfig | 1
kernel/marker.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 96 insertions(+), 3 deletions(-)
Index: linux.trees.git/include/linux/marker.h
===================================================================
--- linux.trees.git.orig/include/linux/marker.h 2008-11-14 17:38:47.000000000 -0500
+++ linux.trees.git/include/linux/marker.h 2008-11-14 17:39:45.000000000 -0500
@@ -49,6 +49,8 @@ struct marker {
void (*call)(const struct marker *mdata, void *call_private, ...);
struct marker_probe_closure single;
struct marker_probe_closure *multi;
+ const char *tp_name; /* Optional tracepoint name */
+ void *tp_cb; /* Optional tracepoint callback */
} __attribute__((aligned(8)));
#ifdef CONFIG_MARKERS
@@ -73,7 +75,7 @@ struct marker {
__attribute__((section("__markers"), aligned(8))) = \
{ __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
0, 0, marker_probe_cb, \
- { __mark_empty_function, NULL}, NULL }; \
+ { __mark_empty_function, NULL}, NULL, NULL, NULL }; \
__mark_check_format(format, ## args); \
if (unlikely(__mark_##name.state)) { \
(*__mark_##name.call) \
@@ -81,11 +83,38 @@ struct marker {
} \
} while (0)
+#define __trace_mark_tp(name, call_private, tp_name, tp_cb, format, args...) \
+ do { \
+ void __check_tp_type(void) \
+ { \
+ register_trace_##tp_name(tp_cb); \
+ } \
+ static const char __mstrtab_##name[] \
+ __attribute__((section("__markers_strings"))) \
+ = #name "\0" format; \
+ static struct marker __mark_##name \
+ __attribute__((section("__markers"), aligned(8))) = \
+ { __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
+ 0, 0, marker_probe_cb, \
+ { __mark_empty_function, NULL}, NULL, #tp_name, tp_cb };\
+ __mark_check_format(format, ## args); \
+ (*__mark_##name.call)(&__mark_##name, call_private, \
+ ## args); \
+ } while (0)
+
extern void marker_update_probe_range(struct marker *begin,
struct marker *end);
#else /* !CONFIG_MARKERS */
#define __trace_mark(generic, name, call_private, format, args...) \
__mark_check_format(format, ## args)
+#define __trace_mark_tp(name, call_private, tp_name, tp_cb, format, args...) \
+ do { \
+ void __check_tp_type(void) \
+ { \
+ register_trace_##tp_name(tp_cb); \
+ } \
+ __mark_check_format(format, ## args); \
+ } while (0)
static inline void marker_update_probe_range(struct marker *begin,
struct marker *end)
{ }
@@ -118,6 +147,20 @@ static inline void marker_update_probe_r
__trace_mark(1, name, NULL, format, ## args)
/**
+ * trace_mark_tp - Marker in a tracepoint callback
+ * @name: marker name, not quoted.
+ * @tp_name: tracepoint name, not quoted.
+ * @tp_cb: tracepoint callback. Should have an associated global symbol so it
+ * is not optimized away by the compiler (should not be static).
+ * @format: format string
+ * @args...: variable argument list
+ *
+ * Places a marker in a tracepoint callback.
+ */
+#define trace_mark_tp(name, tp_name, tp_cb, format, args...) \
+ __trace_mark_tp(name, NULL, tp_name, tp_cb, format, ## args)
+
+/**
* MARK_NOARGS - Format string for a marker with no argument.
*/
#define MARK_NOARGS " "
Index: linux.trees.git/kernel/marker.c
===================================================================
--- linux.trees.git.orig/kernel/marker.c 2008-11-14 17:39:28.000000000 -0500
+++ linux.trees.git/kernel/marker.c 2008-11-14 17:39:45.000000000 -0500
@@ -479,7 +479,7 @@ static int marker_set_format(struct mark
static int set_marker(struct marker_entry *entry, struct marker *elem,
int active)
{
- int ret;
+ int ret = 0;
WARN_ON(strcmp(entry->name, elem->name) != 0);
if (entry->format) {
@@ -531,9 +531,40 @@ static int set_marker(struct marker_entr
*/
smp_wmb();
elem->ptype = entry->ptype;
+
+ if (elem->tp_name && (active ^ elem->state)) {
+ WARN_ON(!elem->tp_cb);
+ /*
+ * It is ok to directly call the probe registration because type
+ * checking has been done in the __trace_mark_tp() macro.
+ */
+
+ if (active) {
+ /*
+ * try_module_get should always succeed because we hold
+ * lock_module() to get the tp_cb address.
+ */
+ ret = try_module_get(__module_text_address(
+ (unsigned long)elem->tp_cb));
+ BUG_ON(!ret);
+ ret = tracepoint_probe_register_noupdate(
+ elem->tp_name,
+ elem->tp_cb);
+ } else {
+ ret = tracepoint_probe_unregister_noupdate(
+ elem->tp_name,
+ elem->tp_cb);
+ /*
+ * tracepoint_probe_update_all() must be called
+ * before the module containing tp_cb is unloaded.
+ */
+ module_put(__module_text_address(
+ (unsigned long)elem->tp_cb));
+ }
+ }
elem->state = active;
- return 0;
+ return ret;
}
/*
@@ -544,7 +575,24 @@ static int set_marker(struct marker_entr
*/
static void disable_marker(struct marker *elem)
{
+ int ret;
+
/* leave "call" as is. It is known statically. */
+ if (elem->tp_name && elem->state) {
+ WARN_ON(!elem->tp_cb);
+ /*
+ * It is ok to directly call the probe registration because type
+ * checking has been done in the __trace_mark_tp() macro.
+ */
+ ret = tracepoint_probe_unregister_noupdate(elem->tp_name,
+ elem->tp_cb);
+ WARN_ON(ret);
+ /*
+ * tracepoint_probe_update_all() must be called
+ * before the module containing tp_cb is unloaded.
+ */
+ module_put(__module_text_address((unsigned long)elem->tp_cb));
+ }
elem->state = 0;
elem->single.func = __mark_empty_function;
/* Update the function before setting the ptype */
@@ -608,6 +656,7 @@ static void marker_update_probes(void)
marker_update_probe_range(__start___markers, __stop___markers);
/* Markers in modules. */
module_update_markers();
+ tracepoint_probe_update_all();
}
/**
Index: linux.trees.git/init/Kconfig
===================================================================
--- linux.trees.git.orig/init/Kconfig 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/init/Kconfig 2008-11-14 17:39:45.000000000 -0500
@@ -808,6 +808,7 @@ config TRACEPOINTS
config MARKERS
bool "Activate markers"
+ depends on TRACEPOINTS
help
Place an empty function call at each marker site. Can be
dynamically changed for a probe function.
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-14 22:47 ` [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp()) Mathieu Desnoyers
@ 2008-11-16 7:59 ` Ingo Molnar
2008-11-18 4:44 ` Mathieu Desnoyers
2008-11-25 12:23 ` KOSAKI Motohiro
0 siblings, 2 replies; 30+ messages in thread
From: Ingo Molnar @ 2008-11-16 7:59 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Add a new API trace_mark_tp(), which declares a marker within a
> tracepoint probe. When the marker is activated, the tracepoint is
> automatically enabled.
>
> No branch test is used at the marker site, because it would be a
> duplicate of the branch already present in the tracepoint.
>
> Impact: new API.
i dont know.
I was actually hoping for markers (the in-kernel API) to go away
completely - and be replaced with tracepoints.
Markers are the wrong design on several levels. They couple the kernel
dynamically with unknown (to the kernel) entities - and that is
causing complexity all around the place, clearly expressed in these
patches of yours.
Tracepoints are much more specific - typed and enumerated function
call callback points in essence - with some politeness that allows
external instrumentation entities to attach to those callbacks.
Is there any usecase of markers that should not be converted to either
tracepoints or to ftrace_printk() ?
Ingo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-16 7:59 ` Ingo Molnar
@ 2008-11-18 4:44 ` Mathieu Desnoyers
2008-11-18 16:30 ` Ingo Molnar
2008-11-25 12:23 ` KOSAKI Motohiro
1 sibling, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-18 4:44 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Add a new API trace_mark_tp(), which declares a marker within a
> > tracepoint probe. When the marker is activated, the tracepoint is
> > automatically enabled.
> >
> > No branch test is used at the marker site, because it would be a
> > duplicate of the branch already present in the tracepoint.
> >
> > Impact: new API.
>
> i dont know.
>
> I was actually hoping for markers (the in-kernel API) to go away
> completely - and be replaced with tracepoints.
>
> Markers are the wrong design on several levels. They couple the kernel
> dynamically with unknown (to the kernel) entities - and that is
> causing complexity all around the place, clearly expressed in these
> patches of yours.
>
> Tracepoints are much more specific - typed and enumerated function
> call callback points in essence - with some politeness that allows
> external instrumentation entities to attach to those callbacks.
>
> Is there any usecase of markers that should not be converted to either
> tracepoints or to ftrace_printk() ?
>
> Ingo
I'll try to explain the common use-case I have in mind. I think starting
from that we'll be able to better understand what pieces of tracepoints
and markers are useful, and which purpose they fulfill. I have other
use-cases in mind too, but for sake of clarity, let's begin with this
subset.
Tracepoints instrument the kernel. They identify code locations and
extract the state of pre-identified interesting variables at these
locations. They are built into the kernel. They are closely tied to
kernel internals, and hence only provide an in-kernel API.
Markers identify the name (and therefore numeric ID) to attach to an
"event" and the data types to export into trace buffers for this
specific event type. These data types are fully expressed in a marker
format-string table recorded in a "metadata" channel. The size of the
various basic types and the endianness is recorded in the buffer header.
Therefore, the binary trace buffers are self-described.
Data is exported through binary trace buffers out of kernel-space,
either by writing directly to disk, sending data over the network, crash
dump extraction, etc.
User-space applications, which run on an architecture with potentially
different endianness and different type sizes, reads the binary buffers.
Therefore, those buffers must be fully self-described so the application
can read them portably (or just on 32-bits userspace running under a
64-bits kernel).
In debug-mode tracing (where the developer wants to add
ftrace_printk()-like statements in its kernel and recompile), the
extracted information should be made available through the same binary
buffers where the information extracted from the tracepoints is saved.
Adding printk-like statements to the kernel should not suffer from the
time and buffer space penality of expanding the raw data to text.
However, ftrace_printk() requires to pretty-print the data in text.
It would be too cumbersome to ask developer to deploy their own
tracepoints whenever they want to create ad-hoc debug-style tracing
statements. This is where trace_mark() statements do better than
ftrace_printk() by allowing to dump the information to trace buffers in
binary format through a simple to use format-string based macro. Note
that each marker format string is dumped in the metadata table to
identify the new event types.
So that was my use-case which I think cannot be converted to tracepoints
nor ftrace_print(). If things are not as clear as they should be, don't
hesitate to ask for clarifications.
I also think we might consider removing the "multiple callback support"
from markers if we tie them more closely to the data extraction code,
given this multiple-callback role is already fulfilled by tracepoints.
OTOH, given my use-case can use markers as information source, thus
bypassing tracepoints for debug-style tracing, we may still need to
connect more than a single probe to the markers. I becomes very useful
for use-cases such as : dumping the kernel or userspace stack, reading
the performance counters and activating/de-activating the function
tracer at specific instrumentation sites.
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-18 4:44 ` Mathieu Desnoyers
@ 2008-11-18 16:30 ` Ingo Molnar
2008-11-23 16:40 ` Mathieu Desnoyers
0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2008-11-18 16:30 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Markers identify the name (and therefore numeric ID) to attach to an
> "event" and the data types to export into trace buffers for this
> specific event type. These data types are fully expressed in a
> marker format-string table recorded in a "metadata" channel. The
> size of the various basic types and the endianness is recorded in
> the buffer header. Therefore, the binary trace buffers are
> self-described.
>
> Data is exported through binary trace buffers out of kernel-space,
> either by writing directly to disk, sending data over the network,
> crash dump extraction, etc.
Streaming gigabytes of data is really mostly only done when we know
_nothing_ useful about a failure mode and are _forced_ into logging
gobs and gobs of data at great expense.
And thus in reality this is a rather uninteresting usecase.
We do recognize and support it as it's a valid "last line of defense"
for system and application failure analysis, but we should also put it
all into proper perspective: it's the rare and abnormal exception, not
the design target.
Note that we support this mode of tracing today already: we can
already stream binary data via the ftrace channel - the ring buffer
gives the infrastructure for that. Just do:
# echo bin > /debug/tracing/trace_options
... and you'll get the trace data streamed to user-space in an
efficient, raw, binary data format!
This works here and today - and if you'd like it to become more
efficient within the ftrace framework, we are all for it. (It's
obviously not the default mode of output, because humans prefer ASCII
and scriptable output formats by a _wide_ margin.)
Almost by definition anything opaque and binary-only that goes from
the kernel to user-space has fundamental limitations: it just doesnt
actively interact with the kernel for us to be able to form a useful
and flexible filter of information around it.
The _real_ solution to tracing in 99% of the cases is to intelligently
limit information - it's not like the user will read and parse
gigabytes of data ...
Look at the myriads of rather useful ftrace plugins we have already
and that sprung out of nothing. Compare it to the _10 years_ of
inaction that more static tracing concepts created. Those plugins work
and spread because it all lives and breathes within the kernel, and
almost none of that could be achieved via the 'stream binary data to
user-space' model you are concentrating on.
So in the conceptual space i can see little use for markers in the
kernel that are not tracepoints (i.e. not actively used by a real
tracer). We had markers in the scheduler initially, then we moved to
tracepoints - and tracepoints are much nicer.
[ And you wrote both markers and tracepoints, so it's not like i risk
degenerating this discussion into a flamewar by advocating one of
your solutions over the other one ;-) ]
... and in that sense i'd love to see lttng become a "super ftrace
plugin", and be merged upstream ASAP.
We could even split it up into multiple bits as its merged: for
example syscall tracing would be a nice touch that a couple of other
plugins would adapt as well. But every tracepoint should have some
active role and active connection to a tracer.
And we'd keep all those tracepoints open for external kprobes use as
well - for the dynamic tracers, as a low-cost courtesy. (no long-term
API guarantees though.)
Hm?
Ingo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-18 16:30 ` Ingo Molnar
@ 2008-11-23 16:40 ` Mathieu Desnoyers
2008-11-23 16:49 ` Ingo Molnar
0 siblings, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-23 16:40 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Markers identify the name (and therefore numeric ID) to attach to an
> > "event" and the data types to export into trace buffers for this
> > specific event type. These data types are fully expressed in a
> > marker format-string table recorded in a "metadata" channel. The
> > size of the various basic types and the endianness is recorded in
> > the buffer header. Therefore, the binary trace buffers are
> > self-described.
> >
> > Data is exported through binary trace buffers out of kernel-space,
> > either by writing directly to disk, sending data over the network,
> > crash dump extraction, etc.
>
> Streaming gigabytes of data is really mostly only done when we know
> _nothing_ useful about a failure mode and are _forced_ into logging
> gobs and gobs of data at great expense.
>
> And thus in reality this is a rather uninteresting usecase.
>
> We do recognize and support it as it's a valid "last line of defense"
> for system and application failure analysis, but we should also put it
> all into proper perspective: it's the rare and abnormal exception, not
> the design target.
>
Hrm, the fact that you assume that large data-throughput recording is
seldom used shows you have not been in contact with the same user-base I
have been interacting with. A few examples of successful LTTng users :
- Google are deploying LTTng on their servers. They want to use it to
monitor their production servers (with flight recorder mode tracing)
and to help them solve hard to reproduce problems. They have had
success with such tracing approach to fix "rare disk delay" issues and
VM-related issues presented in this article :
* "Linux Kernel Debugging on Google-sized clusters at Ottawa Linux
Symposium 2007"
http://ltt.polymtl.ca/papers/bligh-Reprint.pdf
- IBM Research have had problems with Commercial Scale-out applications,
which are being an increasing trend to split large server workloads.
They used LTTng successfully to solve a distributed filesystem-related
issue. It's presented in the same paper above.
- Autodesk, in the development of their next-generation of Linux
audio/video edition applications, used LTTng extensively to solve
soft real-time issues they had. Also presented in the same paper.
- Wind River included LTTng in their Linux distribution so their
clients, already familiar to Wind River own tracing solution in
VxWorks, car have the same kind of feature they have relied on for a
long time.
- Montavista have integrated LTTng in their distribution for the same
reasons. It's used by Sony amongst others.
- SuSE are currently integrating LTTng in their next SLES distribution,
because their clients asking for solutions which supports a kernel
closer to real-time need such tools to debug their problems.
- A project between Ericsson, the Canadian Defense, NSERC and various
universities is just starting. It aims at monitoring and debugging
multi-core systems and provide automated and help user system behavior
analysis.
- Siemens have been using LTTng internally for quite some time now.
The wide user-base I have been interacting with, which range from expert
developers to lead OS researchers, all agree on the strong need for a
tool streaming gigabytes of data, as LTTng does, to help analysing the
problem offline. I think that Linux kernel developers might be a bit
biased in this aspect, because they happen to know what they are looking
for. But users, even experts, often have very few clue where the problem
might be in large applications they are developing. And as the number of
cores grows and applications are getting larger and more complex, this
problem is not likely to lessen.
> Note that we support this mode of tracing today already: we can
> already stream binary data via the ftrace channel - the ring buffer
> gives the infrastructure for that. Just do:
>
> # echo bin > /debug/tracing/trace_options
>
> ... and you'll get the trace data streamed to user-space in an
> efficient, raw, binary data format!
>
> This works here and today - and if you'd like it to become more
> efficient within the ftrace framework, we are all for it. (It's
> obviously not the default mode of output, because humans prefer ASCII
> and scriptable output formats by a _wide_ margin.)
>
> Almost by definition anything opaque and binary-only that goes from
> the kernel to user-space has fundamental limitations: it just doesnt
> actively interact with the kernel for us to be able to form a useful
> and flexible filter of information around it.
>
This is the exact reason why I have an elaborated scheme to export
binary data to userspace in LTTng. LTTng buffer format is binary-only,
but is *not* opaque. It is self-described and portable, and a simple
user-space tool can format it to text without much effort.
> The _real_ solution to tracing in 99% of the cases is to intelligently
> limit information - it's not like the user will read and parse
> gigabytes of data ...
>
Yes, limiting the information flow is sometimes required. e.g. we don't
want to export lockdep-rate information all the time. However, having
enough information within the trace to understand the surroundings of a
problematic behavior can greatly help identifying the root cause of the
problem. Ideally, having tools which automatically finds the interesting
spots in those gigabytes of data (which we have in lttv), and helps
representing the information in graphical form (which helps users find
execution patterns much more easily.. it's impressive to see how good
the human brain can be at pattern-recognition), and lets the user dig in
the detailed information located near the problematic execution scenario
is of inestimable value.
It has, in many of the cases explained above, led to fix the problematic
situation after a few hours with a tracer rather than a few weeks of
painful trial-and-error debugging (involving many developers).
> Look at the myriads of rather useful ftrace plugins we have already
> and that sprung out of nothing. Compare it to the _10 years_ of
> inaction that more static tracing concepts created. Those plugins work
> and spread because it all lives and breathes within the kernel, and
> almost none of that could be achieved via the 'stream binary data to
> user-space' model you are concentrating on.
It's great that you have such momentum for ftrace, and yes, there is a
need for in-kernel analysis, because some workloads might be better
suited for in-kernel analysis (potentially because they generate a
too-high tracing throughput for the available system resources).
However, the comparison you are doing here is simply unfair. Ftrace has
this momentum simply because it happens to be shipped with the Linux
kernel. LTTng has been an out-of-tree patch for about 4 years now and
has generated a great deal of interest in users which can afford to
deploy their own kernel. Therefore, the real reason why ftrace has such
popularity is just because, as you say, it "all lives and breathes
within the kernel". It has nothing to do with in-kernel vs
post-processing analysis or with ascii vs binary data streaming.
Also, doing the analysis part within the kernel has a downside : it adds
complexity to the kernel itself. It adds analysis which are sometimes
complex and require additionnal data structures within the kernel. The
advantage of streaming the data out of the kernel is that it makes the
kernel-side of tracing trivially simple : we get the data out to a
memory buffer. Period. This helps being less intrusive, minimizes the
risks of distupting the normal system behavior, etc.
>
> So in the conceptual space i can see little use for markers in the
> kernel that are not tracepoints (i.e. not actively used by a real
> tracer). We had markers in the scheduler initially, then we moved to
> tracepoints - and tracepoints are much nicer.
>
I am open to changes to the markers API, and we may need to do some, but
in LTTng scheme, they fulfill a very important requirement : they turn
what would otherwise be "opaque binary data" as you call it into fully
described, parseable binary data.
We can then think of LTTng binary data as a very efficient data
reprensentation wich can be automatically, and generically, transformed
into text with a simple binary-to-ascii parser (think of it as a
printk-like parser).
> [ And you wrote both markers and tracepoints, so it's not like i risk
> degenerating this discussion into a flamewar by advocating one of
> your solutions over the other one ;-) ]
>
That's ok.. I'm just trying to show which design space markers currently
fulfill.
> ... and in that sense i'd love to see lttng become a "super ftrace
> plugin", and be merged upstream ASAP.
>
Hrm, a big part of LTTng is its optimized buffering mechanism, which has
been more tested and is more flexible than the one currently found in
ftrace (it has supported lockless NMI-safe tracing for over 2 years). It
also separates the various tracing layers in separated modules, which
helps not mixing the buffer management layer with the timestamping layer
and with the memory backend layer.. I am opened to try to merge ftrace
and lttng together, but I think it will be more than a simple "make
lttng a ftrace plugin".
> We could even split it up into multiple bits as its merged: for
> example syscall tracing would be a nice touch that a couple of other
> plugins would adapt as well. But every tracepoint should have some
> active role and active connection to a tracer.
>
> And we'd keep all those tracepoints open for external kprobes use as
> well - for the dynamic tracers, as a low-cost courtesy. (no long-term
> API guarantees though.)
>
Ah, I see.. when you speak of LTTng as a "super ftrace plugin", you
refer to the LTTng tracepoints only (the instrumentation). In that
sense, yes, we could add the LTTng tracepoints into the Linux kernel and
make ftrace a user of those without any problem. And as you say, no
guaranteed API (this is in-kernel only).
> Hm?
Well, given that I currently have :
- trace_clock() infrastructure for timestamping (which I could submit
for -tip, I think it's ready)
- LTTng instrumentation, which could be used by many tracers.
- LTTng buffer management, trace control, etc, which we might want to
get through in a review phase and try to do a mix and match of the
best features between LTTng and ftrace.
I think we could end up with a tracer which would be faster, more solid
and would support both what ftrace is currently doing and what LTTng is
doing. But if we want to do that, we have to both recognise that the
use-cases filled by ftrace and by LTTng are complementary and are all
needed by the community overall.
Mathieu
>
> Ingo
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-23 16:40 ` Mathieu Desnoyers
@ 2008-11-23 16:49 ` Ingo Molnar
2008-11-24 8:05 ` Mathieu Desnoyers
0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2008-11-23 16:49 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> > ... and in that sense i'd love to see lttng become a "super ftrace
> > plugin", and be merged upstream ASAP.
>
> Hrm, a big part of LTTng is its optimized buffering mechanism, which
> has been more tested and is more flexible than the one currently
> found in ftrace (it has supported lockless NMI-safe tracing for over
> 2 years). It also separates the various tracing layers in separated
> modules, which helps not mixing the buffer management layer with the
> timestamping layer and with the memory backend layer.. I am opened
> to try to merge ftrace and lttng together, but I think it will be
> more than a simple "make lttng a ftrace plugin".
Sure, it will certainly be a fair bit of (interesting and useful!)
work.
But it's something we should do iteratively, instead of merging a
parallel framework. Steve's ring-buffer should be a fair step in the
right direction. Could we improve on that?
i see nothing in the feature list of LTTng that we wouldnt want to
have for ftrace, agreed? And we could do an ftrace tracer named
'LTTng' which would interact with lttng userspace in the customary
way.
Ingo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-23 16:49 ` Ingo Molnar
@ 2008-11-24 8:05 ` Mathieu Desnoyers
0 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-24 8:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, akpm, Linus Torvalds, Lai Jiangshan, Peter Zijlstra,
Thomas Gleixner
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > > ... and in that sense i'd love to see lttng become a "super ftrace
> > > plugin", and be merged upstream ASAP.
> >
> > Hrm, a big part of LTTng is its optimized buffering mechanism, which
> > has been more tested and is more flexible than the one currently
> > found in ftrace (it has supported lockless NMI-safe tracing for over
> > 2 years). It also separates the various tracing layers in separated
> > modules, which helps not mixing the buffer management layer with the
> > timestamping layer and with the memory backend layer.. I am opened
> > to try to merge ftrace and lttng together, but I think it will be
> > more than a simple "make lttng a ftrace plugin".
>
> Sure, it will certainly be a fair bit of (interesting and useful!)
> work.
>
> But it's something we should do iteratively, instead of merging a
> parallel framework. Steve's ring-buffer should be a fair step in the
> right direction. Could we improve on that?
>
I think it's a very good thing that both Steve and me went through the
effort of creating such buffering mechanism, because it created healthy
competition and we have been able to cross-check each other code (well,
I actually pointed out some bugs he had in his tracer code just because
I ran into the same issues).
The only thing is that ftrace needs are a subset of what LTTng needs and
provides. As Steve stated very clearly in the past weeks, he doesn't
care about exporting binary data in an organized self-described manner.
This is actually a key component of LTTng. Therefore, there seems to be
no will nor need to self-describe the information in the trace buffers
from Steve's point of view.
However, even if such self-description is not used by ftrace-type
tracers within the kernel, it could just be *there* so we can support
high-throughput streaming of organized (non-opaque) binary data when
needed. A ftrace tracer could then participate to this information flow
and make the data available for high-speed extraction. I think this
would be very beneficial to the LTTng use-cases and would not hurt
ftrace tracers at all.
> i see nothing in the feature list of LTTng that we wouldnt want to
> have for ftrace, agreed?
Yes, the only thing is that LTTng might be doing a little bit more than
what ftrace needs, because it needs to export the information to
userspace.
> And we could do an ftrace tracer named
> 'LTTng' which would interact with lttng userspace in the customary
> way.
Or we could think of making ftrace be a LTTng information source, so it
would be able to export all its information to userspace through the
high-speed self-described binary trace buffers (channels) and also
manage to keep all of its current features. The way I see it, because
ftrace has simpler requirements due to the fact it does not export
binary data across the kernel boundary, ftrace buffering can be seen as
a subset of the LTTng requirements.
Besides, looking at LWN reviews of Steve's ring buffer implementation
"A casual perusal of the patch might well leave a reader confused; 2000
lines of relatively complex code to implement what is, in the end, just
a circular buffer. This circular buffer is not even suitable for use by
tracing frameworks yet; a separate "tracing" layer is to be added for
that."
I achieve all the core work of atomic-ops based buffering,
self-description of traces and all I said above with a design easier to
review in a few more lines, but distributed in smaller files :
850 ltt-marker-control.c (marker userspace control, event ID
management)
179 ltt-relay.h
701 ltt-relay-alloc.c (buffer memory backend, vmap-less,
contiguous memory range emulated by software)
1648 ltt-relay.c (atomic buffer writer/reader
synchronization)
780 ltt-tracer.h (header definitions, prototypes, inlines)
Compared to :
2241 ring_buffer.c
130 ring_buffer.h
So in the end, my argumentation is not based on a simple "I prefer my
code to someone else's code", but simply that I think it could be easier
to integrate the tracer with the fewer constraints into the tracer with
the more constraints (principally exporting data to userspace) than to
do the opposite.
And all ftrace could need that LTTng currently lacks (like in-kernel
binary-to-text converter) is currently on the LTTng roadmap.
Mathieu
>
> Ingo
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-16 7:59 ` Ingo Molnar
2008-11-18 4:44 ` Mathieu Desnoyers
@ 2008-11-25 12:23 ` KOSAKI Motohiro
2008-11-25 17:24 ` Frank Ch. Eigler
1 sibling, 1 reply; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-11-25 12:23 UTC (permalink / raw)
To: Ingo Molnar, Frank Ch. Eigler
Cc: kosaki.motohiro, Mathieu Desnoyers, linux-kernel, akpm,
Linus Torvalds, Lai Jiangshan, Peter Zijlstra, Thomas Gleixner
Hi
Sorry for very late responce.
However, if you do marker removing disucussion, I hope CC to Frank Eigler.
because SystemTap also use marker and tracepoint.
and this patch also improvement SystemTap marker support, I think.
IIRC, Currently, Systemtap also have marker/tracepoint on/off
syncronization problem.
> > Add a new API trace_mark_tp(), which declares a marker within a
> > tracepoint probe. When the marker is activated, the tracepoint is
> > automatically enabled.
> >
> > No branch test is used at the marker site, because it would be a
> > duplicate of the branch already present in the tracepoint.
> >
> > Impact: new API.
>
> i dont know.
>
> I was actually hoping for markers (the in-kernel API) to go away
> completely - and be replaced with tracepoints.
>
> Markers are the wrong design on several levels. They couple the kernel
> dynamically with unknown (to the kernel) entities - and that is
> causing complexity all around the place, clearly expressed in these
> patches of yours.
>
> Tracepoints are much more specific - typed and enumerated function
> call callback points in essence - with some politeness that allows
> external instrumentation entities to attach to those callbacks.
>
> Is there any usecase of markers that should not be converted to either
> tracepoints or to ftrace_printk() ?
Frank, Could you please read this thread and give us your comment?
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp())
2008-11-25 12:23 ` KOSAKI Motohiro
@ 2008-11-25 17:24 ` Frank Ch. Eigler
0 siblings, 0 replies; 30+ messages in thread
From: Frank Ch. Eigler @ 2008-11-25 17:24 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Ingo Molnar, Mathieu Desnoyers, linux-kernel, akpm,
Linus Torvalds, Lai Jiangshan, Peter Zijlstra, Thomas Gleixner
Hi -
On Tue, Nov 25, 2008 at 09:23:57PM +0900, KOSAKI Motohiro wrote:
> Sorry for very late response.
> However, if you do marker removing discussion, I hope CC to Frank Eigler.
Thanks!
> because SystemTap also use marker and tracepoint. and this patch
> also improvement SystemTap marker support, I think.
I believe that this trace_mark_tp variant would operate at a level
below systemtap's. It would mainly improve the kernel performance and
simplify kernel code.
> IIRC, Currently, Systemtap also have marker/tracepoint on/off
> synchronization problem.
Right, to the extent that systemtap can currently connect to markers
and not yet tracepoints, so tracepoints need to be converted to
markers or another systemtap-visible hooking mechanism at some point.
This is a temporary state of affairs though, as we hope to talk to
tracepoints directly before too long.
mingo wrote:
> > I was actually hoping for markers (the in-kernel API) to go away
> > completely - and be replaced with tracepoints.
(If you're seriously contemplating outright removal of this API, then
the earlier worries about how markers would somehow tie its users'
hands to keep them around forever was surely exaggerated.)
> > Markers are the wrong design on several levels. They couple the kernel
> > dynamically with unknown (to the kernel) entities - [...]
> > Tracepoints are much more specific - typed and enumerated function
> > call callback points in essence [...]
You are confusing type safety and "unknown (to the kernel)"-ness.
Markers provide some type safety via the format string; tracepoints
via individual C declarations.
Markers can connect to "unknown (to the kernel)" clients since their
API is EXPORT_SYMBOL_GPL'd. Tracepoints do exactly the same thing.
There are certainly differences between them, but these two particular
ones are immaterial.
- FChE
^ permalink raw reply [flat|nested] 30+ messages in thread
* [patch 07/16] Markers : create DEFINE_MARKER and GET_MARKER (new API)
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (5 preceding siblings ...)
2008-11-14 22:47 ` [patch 06/16] Markers auto enable tracepoints (new API : trace_mark_tp()) Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 08/16] Tracepoints : Samples fix teardown Mathieu Desnoyers
` (8 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: markers-define-get.patch --]
[-- Type: text/plain, Size: 4925 bytes --]
Allow markers to be used only for declaration, without function call associated.
Useful to create specialized probes.
The problem we had is that two function calls were required when one wanted to
put a marker in a tracepoint probe. Now the marker can be used simply for trace
data type declaration, leaving the trace write work within the tracepoint probe
without any additional function call.
Impact: new API.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
Documentation/markers.txt | 14 ++++++++++++++
include/linux/marker.h | 39 +++++++++++++++++++++++----------------
2 files changed, 37 insertions(+), 16 deletions(-)
Index: linux-2.6-lttng/include/linux/marker.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/marker.h 2008-11-14 16:59:26.000000000 -0500
+++ linux-2.6-lttng/include/linux/marker.h 2008-11-14 16:59:28.000000000 -0500
@@ -55,6 +55,22 @@ struct marker {
#ifdef CONFIG_MARKERS
+#define _DEFINE_MARKER(name, tp_name_str, tp_cb, format) \
+ static const char __mstrtab_##name[] \
+ __attribute__((section("__markers_strings"))) \
+ = #name "\0" format; \
+ static struct marker __mark_##name \
+ __attribute__((section("__markers"), aligned(8))) = \
+ { __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
+ 0, 0, marker_probe_cb, { __mark_empty_function, NULL},\
+ NULL, tp_name_str, tp_cb }
+
+#define DEFINE_MARKER(name, format) \
+ _DEFINE_MARKER(name, NULL, NULL, format)
+
+#define DEFINE_MARKER_TP(name, tp_name, tp_cb, format) \
+ _DEFINE_MARKER(name, #tp_name, tp_cb, format)
+
/*
* Note : the empty asm volatile with read constraint is used here instead of a
* "used" attribute to fix a gcc 4.1.x bug.
@@ -68,14 +84,7 @@ struct marker {
*/
#define __trace_mark(generic, name, call_private, format, args...) \
do { \
- static const char __mstrtab_##name[] \
- __attribute__((section("__markers_strings"))) \
- = #name "\0" format; \
- static struct marker __mark_##name \
- __attribute__((section("__markers"), aligned(8))) = \
- { __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
- 0, 0, marker_probe_cb, \
- { __mark_empty_function, NULL}, NULL, NULL, NULL }; \
+ DEFINE_MARKER(name, format); \
__mark_check_format(format, ## args); \
if (unlikely(__mark_##name.state)) { \
(*__mark_##name.call) \
@@ -89,14 +98,7 @@ struct marker {
{ \
register_trace_##tp_name(tp_cb); \
} \
- static const char __mstrtab_##name[] \
- __attribute__((section("__markers_strings"))) \
- = #name "\0" format; \
- static struct marker __mark_##name \
- __attribute__((section("__markers"), aligned(8))) = \
- { __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
- 0, 0, marker_probe_cb, \
- { __mark_empty_function, NULL}, NULL, #tp_name, tp_cb };\
+ DEFINE_MARKER_TP(name, tp_name, tp_cb, format); \
__mark_check_format(format, ## args); \
(*__mark_##name.call)(&__mark_##name, call_private, \
## args); \
@@ -104,7 +106,11 @@ struct marker {
extern void marker_update_probe_range(struct marker *begin,
struct marker *end);
+
+#define GET_MARKER(name) (__mark_##name)
+
#else /* !CONFIG_MARKERS */
+#define DEFINE_MARKER(name, tp_name, tp_cb, format)
#define __trace_mark(generic, name, call_private, format, args...) \
__mark_check_format(format, ## args)
#define __trace_mark_tp(name, call_private, tp_name, tp_cb, format, args...) \
@@ -118,6 +124,7 @@ extern void marker_update_probe_range(st
static inline void marker_update_probe_range(struct marker *begin,
struct marker *end)
{ }
+#define GET_MARKER(name)
#endif /* CONFIG_MARKERS */
/**
Index: linux-2.6-lttng/Documentation/markers.txt
===================================================================
--- linux-2.6-lttng.orig/Documentation/markers.txt 2008-11-14 16:59:26.000000000 -0500
+++ linux-2.6-lttng/Documentation/markers.txt 2008-11-14 16:59:28.000000000 -0500
@@ -70,6 +70,20 @@ a printk warning which identifies the in
"Format mismatch for probe probe_name (format), marker (format)"
+Another way to use markers is to simply define the marker without generating any
+function call to actually call into the marker. This is useful in combination
+with tracepoint probes in a scheme like this :
+
+void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk);
+
+DEFINE_MARKER_TP(marker_eventname, tracepoint_name, probe_tracepoint_name,
+ "arg1 %u pid %d");
+
+notrace void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk)
+{
+ struct marker *marker = &GET_MARKER(kernel_irq_entry);
+ /* write data to trace buffers ... */
+}
* Probe / marker example
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 08/16] Tracepoints : Samples fix teardown
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (6 preceding siblings ...)
2008-11-14 22:47 ` [patch 07/16] Markers : create DEFINE_MARKER and GET_MARKER (new API) Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 09/16] Tracepoint fix disable Mathieu Desnoyers
` (7 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Rusty Russell,
Frank Ch. Eigler, Lai Jiangshan, Peter Zijlstra, Steven Rostedt
[-- Attachment #1: tracepoints-samples-fix-teardown.patch --]
[-- Type: text/plain, Size: 1939 bytes --]
Need a tracepoint_synchronize_unregister() before the end of exit() to make sure
every probe callers have exited the non preemptible section and thus are not
executing the probe code anymore.
Impact: bugfix.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Steven Rostedt <rostedt@goodmis.org>
---
samples/tracepoints/tracepoint-probe-sample.c | 1 +
samples/tracepoints/tracepoint-probe-sample2.c | 1 +
2 files changed, 2 insertions(+)
Index: linux-2.6-lttng/samples/tracepoints/tracepoint-probe-sample.c
===================================================================
--- linux-2.6-lttng.orig/samples/tracepoints/tracepoint-probe-sample.c 2008-07-31 09:26:51.000000000 -0400
+++ linux-2.6-lttng/samples/tracepoints/tracepoint-probe-sample.c 2008-07-31 09:26:52.000000000 -0400
@@ -46,6 +46,7 @@ void __exit tp_sample_trace_exit(void)
{
unregister_trace_subsys_eventb(probe_subsys_eventb);
unregister_trace_subsys_event(probe_subsys_event);
+ tracepoint_synchronize_unregister();
}
module_exit(tp_sample_trace_exit);
Index: linux-2.6-lttng/samples/tracepoints/tracepoint-probe-sample2.c
===================================================================
--- linux-2.6-lttng.orig/samples/tracepoints/tracepoint-probe-sample2.c 2008-07-31 09:26:51.000000000 -0400
+++ linux-2.6-lttng/samples/tracepoints/tracepoint-probe-sample2.c 2008-07-31 09:26:52.000000000 -0400
@@ -33,6 +33,7 @@ module_init(tp_sample_trace_init);
void __exit tp_sample_trace_exit(void)
{
unregister_trace_subsys_event(probe_subsys_event);
+ tracepoint_synchronize_unregister();
}
module_exit(tp_sample_trace_exit);
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 09/16] Tracepoint fix disable
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (7 preceding siblings ...)
2008-11-14 22:47 ` [patch 08/16] Tracepoints : Samples fix teardown Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 10/16] Tracepoints use rcu_*_sched_notrace Mathieu Desnoyers
` (6 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoint-fix-disable.patch --]
[-- Type: text/plain, Size: 972 bytes --]
Set the probe array pointer to NULL when the tracepoint is disabled.
The probe array point not being NULL could generate a race condition where the
reader would dereference a freed pointer.
Impact: bugfix.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/tracepoint.c | 1 +
1 file changed, 1 insertion(+)
Index: linux.trees.git/kernel/tracepoint.c
===================================================================
--- linux.trees.git.orig/kernel/tracepoint.c 2008-11-14 17:39:52.000000000 -0500
+++ linux.trees.git/kernel/tracepoint.c 2008-11-14 17:39:56.000000000 -0500
@@ -262,6 +262,7 @@ static void set_tracepoint(struct tracep
static void disable_tracepoint(struct tracepoint *elem)
{
elem->state = 0;
+ rcu_assign_pointer(elem->funcs, NULL);
}
/**
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 10/16] Tracepoints use rcu_*_sched_notrace
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (8 preceding siblings ...)
2008-11-14 22:47 ` [patch 09/16] Tracepoint fix disable Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 11/16] Tracepoint Use Unregister Return Value Mathieu Desnoyers
` (5 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Masami Hiramatsu,
Peter Zijlstra, Frank Ch. Eigler, Lai Jiangshan, Hideo AOKI,
Takashi Nishiie, Steven Rostedt
[-- Attachment #1: tracepoints-use-rcu-sched-notrace.patch --]
[-- Type: text/plain, Size: 1422 bytes --]
Make sure tracepoints can be called within ftrace callbacks.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
---
include/linux/tracepoint.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6-lttng/include/linux/tracepoint.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/tracepoint.h 2008-10-08 10:45:30.000000000 -0400
+++ linux-2.6-lttng/include/linux/tracepoint.h 2008-10-08 10:45:34.000000000 -0400
@@ -40,14 +40,14 @@ struct tracepoint {
do { \
void **it_func; \
\
- rcu_read_lock_sched(); \
+ rcu_read_lock_sched_notrace(); \
it_func = rcu_dereference((tp)->funcs); \
if (it_func) { \
do { \
((void(*)(proto))(*it_func))(args); \
} while (*(++it_func)); \
} \
- rcu_read_unlock_sched(); \
+ rcu_read_unlock_sched_notrace(); \
} while (0)
/*
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 11/16] Tracepoint Use Unregister Return Value
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (9 preceding siblings ...)
2008-11-14 22:47 ` [patch 10/16] Tracepoints use rcu_*_sched_notrace Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 12/16] Tracepoint do not put arguments in name Mathieu Desnoyers
` (4 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoint-use-unregister-return-value.patch --]
[-- Type: text/plain, Size: 1531 bytes --]
Unregistering a tracepoint can fail. Return the error value.
Impact: bugfix.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
include/linux/tracepoint.h | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
Index: linux-2.6-lttng/include/linux/tracepoint.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/tracepoint.h 2008-09-19 22:40:43.000000000 -0400
+++ linux-2.6-lttng/include/linux/tracepoint.h 2008-09-19 23:03:20.000000000 -0400
@@ -73,9 +73,9 @@ struct tracepoint {
return tracepoint_probe_register(#name ":" #proto, \
(void *)probe); \
} \
- static inline void unregister_trace_##name(void (*probe)(proto))\
+ static inline int unregister_trace_##name(void (*probe)(proto)) \
{ \
- tracepoint_probe_unregister(#name ":" #proto, \
+ return tracepoint_probe_unregister(#name ":" #proto, \
(void *)probe); \
}
@@ -92,8 +92,10 @@ extern void tracepoint_update_probe_rang
{ \
return -ENOSYS; \
} \
- static inline void unregister_trace_##name(void (*probe)(proto))\
- { }
+ static inline int unregister_trace_##name(void (*probe)(proto)) \
+ { \
+ return -ENOSYS; \
+ }
static inline void tracepoint_update_probe_range(struct tracepoint *begin,
struct tracepoint *end)
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 12/16] Tracepoint do not put arguments in name
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (10 preceding siblings ...)
2008-11-14 22:47 ` [patch 11/16] Tracepoint Use Unregister Return Value Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 13/16] Tracepoints : use modules notifiers Mathieu Desnoyers
` (3 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoint-do-not-put-argument-list-in-name.patch --]
[-- Type: text/plain, Size: 1677 bytes --]
That's overkill, takes space. We have a global tracepoint registery in header
files anyway.
Impact: cleanup.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
include/linux/tracepoint.h | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
Index: linux.trees.git/include/linux/tracepoint.h
===================================================================
--- linux.trees.git.orig/include/linux/tracepoint.h 2008-11-14 17:26:11.000000000 -0500
+++ linux.trees.git/include/linux/tracepoint.h 2008-11-14 17:26:11.000000000 -0500
@@ -60,7 +60,7 @@ struct tracepoint {
{ \
static const char __tpstrtab_##name[] \
__attribute__((section("__tracepoints_strings"))) \
- = #name ":" #proto; \
+ = #name; \
static struct tracepoint __tracepoint_##name \
__attribute__((section("__tracepoints"), aligned(8))) = \
{ __tpstrtab_##name, 0, NULL }; \
@@ -70,13 +70,11 @@ struct tracepoint {
} \
static inline int register_trace_##name(void (*probe)(proto)) \
{ \
- return tracepoint_probe_register(#name ":" #proto, \
- (void *)probe); \
+ return tracepoint_probe_register(#name, (void *)probe); \
} \
static inline int unregister_trace_##name(void (*probe)(proto)) \
{ \
- return tracepoint_probe_unregister(#name ":" #proto, \
- (void *)probe); \
+ return tracepoint_probe_unregister(#name, (void *)probe);\
}
extern void tracepoint_update_probe_range(struct tracepoint *begin,
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 13/16] Tracepoints : use modules notifiers
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (11 preceding siblings ...)
2008-11-14 22:47 ` [patch 12/16] Tracepoint do not put arguments in name Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 14/16] Tracepoints : add DECLARE_TRACE() and DEFINE_TRACE() Mathieu Desnoyers
` (2 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoints-use-modules-notifiers.patch --]
[-- Type: text/plain, Size: 2226 bytes --]
Use module notifiers for tracepoint updates rather than adding a hook in
module.c.
Impact: cleanup.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/module.c | 5 -----
kernel/tracepoint.c | 29 +++++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 5 deletions(-)
Index: linux.trees.git/kernel/tracepoint.c
===================================================================
--- linux.trees.git.orig/kernel/tracepoint.c 2008-11-14 17:39:56.000000000 -0500
+++ linux.trees.git/kernel/tracepoint.c 2008-11-14 17:40:39.000000000 -0500
@@ -541,3 +541,32 @@ void tracepoint_iter_reset(struct tracep
iter->tracepoint = NULL;
}
EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
+
+int tracepoint_module_notify(struct notifier_block *self,
+ unsigned long val, void *data)
+{
+ struct module *mod = data;
+
+ switch (val) {
+ case MODULE_STATE_COMING:
+ tracepoint_update_probe_range(mod->tracepoints,
+ mod->tracepoints + mod->num_tracepoints);
+ break;
+ case MODULE_STATE_GOING:
+ tracepoint_update_probe_range(mod->tracepoints,
+ mod->tracepoints + mod->num_tracepoints);
+ break;
+ }
+ return 0;
+}
+
+struct notifier_block tracepoint_module_nb = {
+ .notifier_call = tracepoint_module_notify,
+ .priority = 0,
+};
+
+static int init_tracepoints(void)
+{
+ return register_module_notifier(&tracepoint_module_nb);
+}
+__initcall(init_tracepoints);
Index: linux.trees.git/kernel/module.c
===================================================================
--- linux.trees.git.orig/kernel/module.c 2008-11-14 17:39:28.000000000 -0500
+++ linux.trees.git/kernel/module.c 2008-11-14 17:40:39.000000000 -0500
@@ -2188,11 +2188,6 @@ static noinline struct module *load_modu
debug = section_objs(hdr, sechdrs, secstrings, "__verbose",
sizeof(*debug), &num_debug);
dynamic_printk_setup(debug, num_debug);
-
-#ifdef CONFIG_TRACEPOINTS
- tracepoint_update_probe_range(mod->tracepoints,
- mod->tracepoints + mod->num_tracepoints);
-#endif
}
/* sechdrs[0].sh_size is always zero */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 14/16] Tracepoints : add DECLARE_TRACE() and DEFINE_TRACE()
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (12 preceding siblings ...)
2008-11-14 22:47 ` [patch 13/16] Tracepoints : use modules notifiers Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 15/16] Tracepoints : documentation fix teardown Mathieu Desnoyers
2008-11-14 22:47 ` [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister Mathieu Desnoyers
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoints-add-define.patch --]
[-- Type: text/plain, Size: 11635 bytes --]
Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint structure
in a single spot for all the kernel. It helps reducing memory consumption,
especially when declaring a lot of tracepoints, e.g. for kmalloc tracing.
*API CHANGE WARNING* : now, DECLARE_TRACE() must be used in headers for
tracepoint declarations rather than DEFINE_TRACE(). This is the sane way to do
it. The name previously used was misleading.
Updates scheduler instrumentation to follow this API change.
Impact: API *CHANGE*. Must update all tracepoint users.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
Documentation/tracepoints.txt | 7 +++++-
include/asm-generic/vmlinux.lds.h | 1
include/linux/tracepoint.h | 35 ++++++++++++++++++++++----------
include/trace/sched.h | 24 ++++++++++-----------
kernel/exit.c | 4 +++
kernel/fork.c | 2 +
kernel/kthread.c | 3 ++
kernel/sched.c | 6 +++++
kernel/signal.c | 2 +
samples/tracepoints/tp-samples-trace.h | 4 +--
samples/tracepoints/tracepoint-sample.c | 3 ++
11 files changed, 66 insertions(+), 25 deletions(-)
Index: linux.trees.git/include/linux/tracepoint.h
===================================================================
--- linux.trees.git.orig/include/linux/tracepoint.h 2008-11-14 17:40:39.000000000 -0500
+++ linux.trees.git/include/linux/tracepoint.h 2008-11-14 17:40:54.000000000 -0500
@@ -24,8 +24,12 @@ struct tracepoint {
const char *name; /* Tracepoint name */
int state; /* State. */
void **funcs;
-} __attribute__((aligned(8)));
-
+} __attribute__((aligned(32))); /*
+ * Aligned on 32 bytes because it is
+ * globally visible and gcc happily
+ * align these on the structure size.
+ * Keep in sync with vmlinux.lds.h.
+ */
#define TPPROTO(args...) args
#define TPARGS(args...) args
@@ -55,15 +59,10 @@ struct tracepoint {
* not add unwanted padding between the beginning of the section and the
* structure. Force alignment to the same alignment as the section start.
*/
-#define DEFINE_TRACE(name, proto, args) \
+#define DECLARE_TRACE(name, proto, args) \
+ extern struct tracepoint __tracepoint_##name; \
static inline void trace_##name(proto) \
{ \
- static const char __tpstrtab_##name[] \
- __attribute__((section("__tracepoints_strings"))) \
- = #name; \
- static struct tracepoint __tracepoint_##name \
- __attribute__((section("__tracepoints"), aligned(8))) = \
- { __tpstrtab_##name, 0, NULL }; \
if (unlikely(__tracepoint_##name.state)) \
__DO_TRACE(&__tracepoint_##name, \
TPPROTO(proto), TPARGS(args)); \
@@ -77,11 +76,23 @@ struct tracepoint {
return tracepoint_probe_unregister(#name, (void *)probe);\
}
+#define DEFINE_TRACE(name) \
+ static const char __tpstrtab_##name[] \
+ __attribute__((section("__tracepoints_strings"))) = #name; \
+ struct tracepoint __tracepoint_##name \
+ __attribute__((section("__tracepoints"), aligned(32))) = \
+ { __tpstrtab_##name, 0, NULL }
+
+#define EXPORT_TRACEPOINT_SYMBOL_GPL(name) \
+ EXPORT_SYMBOL_GPL(__tracepoint_##name)
+#define EXPORT_TRACEPOINT_SYMBOL(name) \
+ EXPORT_SYMBOL(__tracepoint_##name)
+
extern void tracepoint_update_probe_range(struct tracepoint *begin,
struct tracepoint *end);
#else /* !CONFIG_TRACEPOINTS */
-#define DEFINE_TRACE(name, proto, args) \
+#define DECLARE_TRACE(name, proto, args) \
static inline void _do_trace_##name(struct tracepoint *tp, proto) \
{ } \
static inline void trace_##name(proto) \
@@ -95,6 +106,10 @@ extern void tracepoint_update_probe_rang
return -ENOSYS; \
}
+#define DEFINE_TRACE(name)
+#define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
+#define EXPORT_TRACEPOINT_SYMBOL(name)
+
static inline void tracepoint_update_probe_range(struct tracepoint *begin,
struct tracepoint *end)
{ }
Index: linux.trees.git/Documentation/tracepoints.txt
===================================================================
--- linux.trees.git.orig/Documentation/tracepoints.txt 2008-11-14 17:37:14.000000000 -0500
+++ linux.trees.git/Documentation/tracepoints.txt 2008-11-14 17:40:54.000000000 -0500
@@ -42,7 +42,7 @@ In include/trace/subsys.h :
#include <linux/tracepoint.h>
-DEFINE_TRACE(subsys_eventname,
+DECLARE_TRACE(subsys_eventname,
TPPTOTO(int firstarg, struct task_struct *p),
TPARGS(firstarg, p));
@@ -50,6 +50,8 @@ In subsys/file.c (where the tracing stat
#include <trace/subsys.h>
+DEFINE_TRACE(subsys_eventname);
+
void somefct(void)
{
...
@@ -86,6 +88,9 @@ to limit collisions. Tracepoint names ar
considered as being the same whether they are in the core kernel image or in
modules.
+If the tracepoint has to be used in kernel modules, an
+EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be used to
+export the defined tracepoints.
* Probe / tracepoint example
Index: linux.trees.git/samples/tracepoints/tracepoint-sample.c
===================================================================
--- linux.trees.git.orig/samples/tracepoints/tracepoint-sample.c 2008-11-14 17:37:14.000000000 -0500
+++ linux.trees.git/samples/tracepoints/tracepoint-sample.c 2008-11-14 17:40:54.000000000 -0500
@@ -13,6 +13,9 @@
#include <linux/proc_fs.h>
#include "tp-samples-trace.h"
+DEFINE_TRACE(subsys_event);
+DEFINE_TRACE(subsys_eventb);
+
struct proc_dir_entry *pentry_example;
static int my_open(struct inode *inode, struct file *file)
Index: linux.trees.git/samples/tracepoints/tp-samples-trace.h
===================================================================
--- linux.trees.git.orig/samples/tracepoints/tp-samples-trace.h 2008-11-14 17:37:14.000000000 -0500
+++ linux.trees.git/samples/tracepoints/tp-samples-trace.h 2008-11-14 17:40:54.000000000 -0500
@@ -4,10 +4,10 @@
#include <linux/proc_fs.h> /* for struct inode and struct file */
#include <linux/tracepoint.h>
-DEFINE_TRACE(subsys_event,
+DECLARE_TRACE(subsys_event,
TPPROTO(struct inode *inode, struct file *file),
TPARGS(inode, file));
-DEFINE_TRACE(subsys_eventb,
+DECLARE_TRACE(subsys_eventb,
TPPROTO(void),
TPARGS());
#endif
Index: linux.trees.git/include/asm-generic/vmlinux.lds.h
===================================================================
--- linux.trees.git.orig/include/asm-generic/vmlinux.lds.h 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/include/asm-generic/vmlinux.lds.h 2008-11-14 17:40:54.000000000 -0500
@@ -71,6 +71,7 @@
VMLINUX_SYMBOL(__start___markers) = .; \
*(__markers) \
VMLINUX_SYMBOL(__stop___markers) = .; \
+ . = ALIGN(32); \
VMLINUX_SYMBOL(__start___tracepoints) = .; \
*(__tracepoints) \
VMLINUX_SYMBOL(__stop___tracepoints) = .; \
Index: linux.trees.git/include/trace/sched.h
===================================================================
--- linux.trees.git.orig/include/trace/sched.h 2008-11-14 17:37:14.000000000 -0500
+++ linux.trees.git/include/trace/sched.h 2008-11-14 17:40:54.000000000 -0500
@@ -4,52 +4,52 @@
#include <linux/sched.h>
#include <linux/tracepoint.h>
-DEFINE_TRACE(sched_kthread_stop,
+DECLARE_TRACE(sched_kthread_stop,
TPPROTO(struct task_struct *t),
TPARGS(t));
-DEFINE_TRACE(sched_kthread_stop_ret,
+DECLARE_TRACE(sched_kthread_stop_ret,
TPPROTO(int ret),
TPARGS(ret));
-DEFINE_TRACE(sched_wait_task,
+DECLARE_TRACE(sched_wait_task,
TPPROTO(struct rq *rq, struct task_struct *p),
TPARGS(rq, p));
-DEFINE_TRACE(sched_wakeup,
+DECLARE_TRACE(sched_wakeup,
TPPROTO(struct rq *rq, struct task_struct *p),
TPARGS(rq, p));
-DEFINE_TRACE(sched_wakeup_new,
+DECLARE_TRACE(sched_wakeup_new,
TPPROTO(struct rq *rq, struct task_struct *p),
TPARGS(rq, p));
-DEFINE_TRACE(sched_switch,
+DECLARE_TRACE(sched_switch,
TPPROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TPARGS(rq, prev, next));
-DEFINE_TRACE(sched_migrate_task,
+DECLARE_TRACE(sched_migrate_task,
TPPROTO(struct rq *rq, struct task_struct *p, int dest_cpu),
TPARGS(rq, p, dest_cpu));
-DEFINE_TRACE(sched_process_free,
+DECLARE_TRACE(sched_process_free,
TPPROTO(struct task_struct *p),
TPARGS(p));
-DEFINE_TRACE(sched_process_exit,
+DECLARE_TRACE(sched_process_exit,
TPPROTO(struct task_struct *p),
TPARGS(p));
-DEFINE_TRACE(sched_process_wait,
+DECLARE_TRACE(sched_process_wait,
TPPROTO(struct pid *pid),
TPARGS(pid));
-DEFINE_TRACE(sched_process_fork,
+DECLARE_TRACE(sched_process_fork,
TPPROTO(struct task_struct *parent, struct task_struct *child),
TPARGS(parent, child));
-DEFINE_TRACE(sched_signal_send,
+DECLARE_TRACE(sched_signal_send,
TPPROTO(int sig, struct task_struct *p),
TPARGS(sig, p));
Index: linux.trees.git/kernel/sched.c
===================================================================
--- linux.trees.git.orig/kernel/sched.c 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/kernel/sched.c 2008-11-14 17:40:54.000000000 -0500
@@ -118,6 +118,12 @@
*/
#define RUNTIME_INF ((u64)~0ULL)
+DEFINE_TRACE(sched_wait_task);
+DEFINE_TRACE(sched_wakeup);
+DEFINE_TRACE(sched_wakeup_new);
+DEFINE_TRACE(sched_switch);
+DEFINE_TRACE(sched_migrate_task);
+
#ifdef CONFIG_SMP
/*
* Divide a load by a sched group cpu_power : (load / sg->__cpu_power)
Index: linux.trees.git/kernel/exit.c
===================================================================
--- linux.trees.git.orig/kernel/exit.c 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/kernel/exit.c 2008-11-14 17:40:54.000000000 -0500
@@ -54,6 +54,10 @@
#include <asm/pgtable.h>
#include <asm/mmu_context.h>
+DEFINE_TRACE(sched_process_free);
+DEFINE_TRACE(sched_process_exit);
+DEFINE_TRACE(sched_process_wait);
+
static void exit_mm(struct task_struct * tsk);
static inline int task_detached(struct task_struct *p)
Index: linux.trees.git/kernel/fork.c
===================================================================
--- linux.trees.git.orig/kernel/fork.c 2008-11-14 17:38:29.000000000 -0500
+++ linux.trees.git/kernel/fork.c 2008-11-14 17:40:54.000000000 -0500
@@ -80,6 +80,8 @@ DEFINE_PER_CPU(unsigned long, process_co
__cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */
+DEFINE_TRACE(sched_process_fork);
+
int nr_processes(void)
{
int cpu;
Index: linux.trees.git/kernel/kthread.c
===================================================================
--- linux.trees.git.orig/kernel/kthread.c 2008-11-14 17:37:14.000000000 -0500
+++ linux.trees.git/kernel/kthread.c 2008-11-14 17:40:54.000000000 -0500
@@ -21,6 +21,9 @@ static DEFINE_SPINLOCK(kthread_create_lo
static LIST_HEAD(kthread_create_list);
struct task_struct *kthreadd_task;
+DEFINE_TRACE(sched_kthread_stop);
+DEFINE_TRACE(sched_kthread_stop_ret);
+
struct kthread_create_info
{
/* Information passed to kthread() from kthreadd. */
Index: linux.trees.git/kernel/signal.c
===================================================================
--- linux.trees.git.orig/kernel/signal.c 2008-11-14 17:37:13.000000000 -0500
+++ linux.trees.git/kernel/signal.c 2008-11-14 17:40:54.000000000 -0500
@@ -41,6 +41,8 @@
static struct kmem_cache *sigqueue_cachep;
+DEFINE_TRACE(sched_signal_send);
+
static void __user *sig_handler(struct task_struct *t, int sig)
{
return t->sighand->action[sig - 1].sa.sa_handler;
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 15/16] Tracepoints : documentation fix teardown
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (13 preceding siblings ...)
2008-11-14 22:47 ` [patch 14/16] Tracepoints : add DECLARE_TRACE() and DEFINE_TRACE() Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-14 22:47 ` [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister Mathieu Desnoyers
15 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Mathieu Desnoyers, Rusty Russell,
Frank Ch. Eigler, Lai Jiangshan, Peter Zijlstra, Steven Rostedt
[-- Attachment #1: tracepoints-documentation-fix-teardown.patch --]
[-- Type: text/plain, Size: 2099 bytes --]
Need a tracepoint_synchronize_unregister() before the end of exit() to make sure
every probe callers have exited the non preemptible section and thus are not
executing the probe code anymore.
Impact: documentation update.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Steven Rostedt <rostedt@goodmis.org>
---
Documentation/tracepoints.txt | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
Index: linux-2.6-lttng/Documentation/tracepoints.txt
===================================================================
--- linux-2.6-lttng.orig/Documentation/tracepoints.txt 2008-11-14 17:18:59.000000000 -0500
+++ linux-2.6-lttng/Documentation/tracepoints.txt 2008-11-14 17:19:05.000000000 -0500
@@ -70,10 +70,12 @@ Where :
Connecting a function (probe) to a tracepoint is done by providing a probe
(function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
-unregister_trace_subsys_eventname(); it will remove the probe sure there is no
-caller left using the probe when it returns. Probe removal is preempt-safe
-because preemption is disabled around the probe call. See the "Probe example"
-section below for a sample probe module.
+unregister_trace_subsys_eventname(); it will remove the probe.
+marker_synchronize_unregister() must be called before the end of the module exit
+function to make sure there is no caller left using the probe. This, and the
+fact that preemption is disabled around the probe call, make sure that probe
+removal and module unload are safe. See the "Probe example" section below for a
+sample probe module.
The tracepoint mechanism supports inserting multiple instances of the same
tracepoint, but a single definition must be made of a given tracepoint name over
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister
2008-11-14 22:47 [patch 00/16] Markers and Tracepoints Updates for -tip Mathieu Desnoyers
` (14 preceding siblings ...)
2008-11-14 22:47 ` [patch 15/16] Tracepoints : documentation fix teardown Mathieu Desnoyers
@ 2008-11-14 22:47 ` Mathieu Desnoyers
2008-11-16 7:48 ` Ingo Molnar
15 siblings, 1 reply; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-14 22:47 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: akpm, Linus Torvalds, Zhaolei, Mathieu Desnoyers, Lai Jiangshan
[-- Attachment #1: tracepoints-fix-typo-in-documentation.patch --]
[-- Type: text/plain, Size: 1360 bytes --]
Impact: documentation update.
Signed-off-by: Zhaolei <zhaolei@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
---
Documentation/tracepoints.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6-lttng/Documentation/tracepoints.txt
===================================================================
--- linux-2.6-lttng.orig/Documentation/tracepoints.txt 2008-11-14 17:19:21.000000000 -0500
+++ linux-2.6-lttng/Documentation/tracepoints.txt 2008-11-14 17:19:25.000000000 -0500
@@ -71,7 +71,7 @@ Connecting a function (probe) to a trace
(function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
unregister_trace_subsys_eventname(); it will remove the probe.
-marker_synchronize_unregister() must be called before the end of the module exit
+tracepoint_synchronize_unregister() must be called before the end of the module exit
function to make sure there is no caller left using the probe. This, and the
fact that preemption is disabled around the probe call, make sure that probe
removal and module unload are safe. See the "Probe example" section below for a
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister
2008-11-14 22:47 ` [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister Mathieu Desnoyers
@ 2008-11-16 7:48 ` Ingo Molnar
2008-11-17 5:36 ` Mathieu Desnoyers
0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2008-11-16 7:48 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, akpm, Linus Torvalds, Zhaolei, Lai Jiangshan
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Impact: documentation update.
>
> Signed-off-by: Zhaolei <zhaolei@cn.fujitsu.com>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
This should have been:
From: Zhaolei <zhaolei@cn.fujitsu.com>
correct?
I updated the patch accordingly.
Ingo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 16/16] marker_synchronize_unregister->tracepoint_synchronize_unregister
2008-11-16 7:48 ` Ingo Molnar
@ 2008-11-17 5:36 ` Mathieu Desnoyers
0 siblings, 0 replies; 30+ messages in thread
From: Mathieu Desnoyers @ 2008-11-17 5:36 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, akpm, Linus Torvalds, Zhaolei, Lai Jiangshan
* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > Impact: documentation update.
> >
> > Signed-off-by: Zhaolei <zhaolei@cn.fujitsu.com>
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> > CC: Ingo Molnar <mingo@elte.hu>
> > CC: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> This should have been:
>
> From: Zhaolei <zhaolei@cn.fujitsu.com>
>
> correct?
>
> I updated the patch accordingly.
>
Yes, that's correct. I forgot to add this line. Thanks !
Mathieu
> Ingo
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 30+ messages in thread