public inbox for linux-kernel@vger.kernel.org
* [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs
@ 2010-04-26 19:50 Steven Rostedt
  2010-04-26 19:50 ` [PATCH 01/10][RFC] tracing: Create class struct for events Steven Rostedt
                   ` (11 more replies)
  0 siblings, 12 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

This is an RFC patch set that also affects kprobes and perf.

At the Linux Collaboration Summit, I talked with Mathieu and others about
lowering the footprint of trace events. I spent all of last week
trying to get the size as small as I could.

Currently, each TRACE_EVENT() macro adds 1 - 5K per tracepoint. Adding a
single TRACE_EVENT() and rebuilding gave me varying results, depending
on config options that did not seem related. The new tracepoint I added
would add between 1 and 5K, but I did not investigate enough to
see what the true size was.

What was consistent was the DEFINE_EVENT(). Currently, it adds
a little over 700 bytes per DEFINE_EVENT().

This patch series does not seem to affect TRACE_EVENT() much (it showed
the same varying sizes), but consistently brings DEFINE_EVENT()s
down from 700 bytes to 250 bytes per DEFINE_EVENT(). Since syscalls
use one "class" and are equivalent to DEFINE_EVENT(), this can
be a significant savings.

With events and syscalls (82 events and 616 syscalls), before this
patch series, the size of vmlinux was: 16161794, and afterward: 16058182.

That is 103,612 bytes in savings! (over 100K)


Without tracing syscalls (82 events), it brought the size of vmlinux
down from 1591046 to 15999394.

22,071 bytes in savings.

This is just an RFC (for now), to get people's opinions on the changes.
It does a bit of rewriting of the CPP macros, just to warn you ;-)

-- Steve

The code can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/rfc-1


Steven Rostedt (10):
      tracing: Create class struct for events
      tracing: Let tracepoints have data passed to tracepoint callbacks
      tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
      tracing: Remove per event trace registering
      tracing: Move fields from event to class structure
      tracing: Move raw_init from events to class
      tracing: Allow events to share their print functions
      tracing: Move print functions into event class
      tracing: Remove duplicate id information in event structure
      tracing: Combine event filter_active and enable into single flags field

----
 include/linux/ftrace_event.h         |   71 +++++++++---
 include/linux/syscalls.h             |   55 +++-------
 include/linux/tracepoint.h           |  119 ++++++++++++++++---
 include/trace/ftrace.h               |  215 ++++++++++------------------------
 include/trace/syscall.h              |    9 +-
 kernel/trace/blktrace.c              |   13 ++-
 kernel/trace/kmemtrace.c             |   28 +++--
 kernel/trace/trace.c                 |    9 +-
 kernel/trace/trace.h                 |    5 +-
 kernel/trace/trace_event_perf.c      |   17 ++-
 kernel/trace/trace_events.c          |  126 +++++++++++++-------
 kernel/trace/trace_events_filter.c   |   28 +++--
 kernel/trace/trace_export.c          |   16 ++-
 kernel/trace/trace_functions_graph.c |    2 +-
 kernel/trace/trace_kprobe.c          |  104 ++++++++++-------
 kernel/trace/trace_output.c          |  137 +++++++++++++++-------
 kernel/trace/trace_output.h          |    2 +-
 kernel/trace/trace_syscalls.c        |  105 +++++++++++++++--
 kernel/tracepoint.c                  |   91 ++++++++-------
 19 files changed, 700 insertions(+), 452 deletions(-)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 01/10][RFC] tracing: Create class struct for events
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 20:22   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks Steven Rostedt
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0001-tracing-Create-class-struct-for-events.patch --]
[-- Type: text/plain, Size: 12663 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch creates a ftrace_event_class struct that event structs point to.
This class struct will be made to hold information to modify the
events. Currently the class struct only holds the event's system name.

This patch slightly increases the size of the text and slightly decreases
the data size. The overall change is still a slight increase, but
this change lays the groundwork for other changes that make the footprint
of tracepoints smaller.
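
As a rough illustration of the class/event split described above, here is a
minimal user-space sketch. The struct names event_class and event_call, the
two sample events, and the helper count_in_system() are hypothetical stand-ins
chosen for this example; they are not the kernel definitions from the patch.
The point is only that the system name lives once in the class and every
event reaches it through a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the class struct: holds what events share. */
struct event_class {
	const char *system;
};

/* Hypothetical stand-in for an event: points at its class instead of
 * carrying its own copy of the system name. */
struct event_call {
	const struct event_class *class;
	const char *name;
};

static const struct event_class class_sched = { .system = "sched" };

/* Two illustrative events that share one class. */
static const struct event_call events[] = {
	{ .class = &class_sched, .name = "sched_wakeup" },
	{ .class = &class_sched, .name = "sched_wakeup_new" },
};

/* Count the events in a system, reading the name through the class. */
static size_t count_in_system(const char *system)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		assert(events[i].class != NULL);
		if (strcmp(events[i].class->system, system) == 0)
			n++;
	}
	return n;
}
```

In this model each additional event costs one pointer rather than one more
copy of every shared field, which is the direction the later patches in the
series take.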

With 82 standard tracepoints, and 616 system call tracepoints:

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h       |    6 +++++-
 include/linux/syscalls.h           |    6 ++++--
 include/trace/ftrace.h             |   36 +++++++++++++++---------------------
 kernel/trace/trace_events.c        |   20 ++++++++++----------
 kernel/trace/trace_events_filter.c |    6 +++---
 kernel/trace/trace_export.c        |    6 +++++-
 kernel/trace/trace_kprobe.c        |   12 ++++++------
 kernel/trace/trace_syscalls.c      |    4 ++++
 8 files changed, 52 insertions(+), 44 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 39e71b0..496eea8 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -113,10 +113,14 @@ void tracing_record_cmdline(struct task_struct *tsk);
 
 struct event_filter;
 
+struct ftrace_event_class {
+	char			*system;
+};
+
 struct ftrace_event_call {
 	struct list_head	list;
+	struct ftrace_event_class *class;
 	char			*name;
-	char			*system;
 	struct dentry		*dir;
 	struct trace_event	*event;
 	int			enabled;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 057929b..ac5791d 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -134,6 +134,8 @@ struct perf_event_attr;
 #define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
 #define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
 
+extern struct ftrace_event_class event_class_syscalls;
+
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
 	static const struct syscall_metadata __syscall_meta_##sname;	\
 	static struct ftrace_event_call					\
@@ -146,7 +148,7 @@ struct perf_event_attr;
 	  __attribute__((section("_ftrace_events")))			\
 	  event_enter_##sname = {					\
 		.name                   = "sys_enter"#sname,		\
-		.system                 = "syscalls",			\
+		.class			= &event_class_syscalls,	\
 		.event                  = &enter_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
 		.define_fields		= syscall_enter_define_fields,	\
@@ -168,7 +170,7 @@ struct perf_event_attr;
 	  __attribute__((section("_ftrace_events")))			\
 	  event_exit_##sname = {					\
 		.name                   = "sys_exit"#sname,		\
-		.system                 = "syscalls",			\
+		.class			= &event_class_syscalls,	\
 		.event                  = &exit_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
 		.define_fields		= syscall_exit_define_fields,	\
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 75dd778..0921a8f 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -62,7 +62,10 @@
 		struct trace_entry	ent;				\
 		tstruct							\
 		char			__data[0];			\
-	};
+	};								\
+									\
+	static struct ftrace_event_class event_class_##name;
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args)	\
 	static struct ftrace_event_call			\
@@ -430,22 +433,6 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
  *
  * Override the macros in <trace/trace_events.h> to include the following:
  *
- * static void ftrace_event_<call>(proto)
- * {
- *	event_trace_printk(_RET_IP_, "<call>: " <fmt>);
- * }
- *
- * static int ftrace_reg_event_<call>(struct ftrace_event_call *unused)
- * {
- *	return register_trace_<call>(ftrace_event_<call>);
- * }
- *
- * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
- * {
- *	unregister_trace_<call>(ftrace_event_<call>);
- * }
- *
- *
  * For those macros defined with TRACE_EVENT:
  *
  * static struct ftrace_event_call event_<call>;
@@ -497,11 +484,15 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
  *
  * static const char print_fmt_<call>[] = <TP_printk>;
  *
+ * static struct ftrace_event_class __used event_class_<template> = {
+ *	.system			= "<system>",
+ * }
+ *
  * static struct ftrace_event_call __used
  * __attribute__((__aligned__(4)))
  * __attribute__((section("_ftrace_events"))) event_<call> = {
  *	.name			= "<call>",
- *	.system			= "<system>",
+ *	.class			= event_class_<template>,
  *	.raw_init		= trace_event_raw_init,
  *	.regfunc		= ftrace_reg_event_<call>,
  *	.unregfunc		= ftrace_unreg_event_<call>,
@@ -627,7 +618,10 @@ static struct trace_event ftrace_event_type_##call = {			\
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
-static const char print_fmt_##call[] = print;
+static const char print_fmt_##call[] = print;				\
+static struct ftrace_event_class __used event_class_##call = {		\
+	.system			= __stringify(TRACE_SYSTEM)		\
+}
 
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args)			\
@@ -636,7 +630,7 @@ static struct ftrace_event_call __used					\
 __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
-	.system			= __stringify(TRACE_SYSTEM),		\
+	.class			= &event_class_##template,		\
 	.event			= &ftrace_event_type_##call,		\
 	.raw_init		= trace_event_raw_init,			\
 	.regfunc		= ftrace_raw_reg_event_##call,		\
@@ -655,7 +649,7 @@ static struct ftrace_event_call __used					\
 __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
-	.system			= __stringify(TRACE_SYSTEM),		\
+	.class			= &event_class_##template,		\
 	.event			= &ftrace_event_type_##call,		\
 	.raw_init		= trace_event_raw_init,			\
 	.regfunc		= ftrace_raw_reg_event_##call,		\
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index beab8bf..f6893cc 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -175,10 +175,10 @@ static int __ftrace_set_clr_event(const char *match, const char *sub,
 
 		if (match &&
 		    strcmp(match, call->name) != 0 &&
-		    strcmp(match, call->system) != 0)
+		    strcmp(match, call->class->system) != 0)
 			continue;
 
-		if (sub && strcmp(sub, call->system) != 0)
+		if (sub && strcmp(sub, call->class->system) != 0)
 			continue;
 
 		if (event && strcmp(event, call->name) != 0)
@@ -354,8 +354,8 @@ static int t_show(struct seq_file *m, void *v)
 {
 	struct ftrace_event_call *call = v;
 
-	if (strcmp(call->system, TRACE_SYSTEM) != 0)
-		seq_printf(m, "%s:", call->system);
+	if (strcmp(call->class->system, TRACE_SYSTEM) != 0)
+		seq_printf(m, "%s:", call->class->system);
 	seq_printf(m, "%s\n", call->name);
 
 	return 0;
@@ -452,7 +452,7 @@ system_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
 		if (!call->name || !call->regfunc)
 			continue;
 
-		if (system && strcmp(call->system, system) != 0)
+		if (system && strcmp(call->class->system, system) != 0)
 			continue;
 
 		/*
@@ -924,8 +924,8 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 	 * If the trace point header did not define TRACE_SYSTEM
 	 * then the system would be called "TRACE_SYSTEM".
 	 */
-	if (strcmp(call->system, TRACE_SYSTEM) != 0)
-		d_events = event_subsystem_dir(call->system, d_events);
+	if (strcmp(call->class->system, TRACE_SYSTEM) != 0)
+		d_events = event_subsystem_dir(call->class->system, d_events);
 
 	call->dir = debugfs_create_dir(call->name, d_events);
 	if (!call->dir) {
@@ -1040,7 +1040,7 @@ static void __trace_remove_event_call(struct ftrace_event_call *call)
 	list_del(&call->list);
 	trace_destroy_fields(call);
 	destroy_preds(call);
-	remove_subsystem_dir(call->system);
+	remove_subsystem_dir(call->class->system);
 }
 
 /* Remove an event_call */
@@ -1398,8 +1398,8 @@ static __init void event_trace_self_tests(void)
  * syscalls as we test.
  */
 #ifndef CONFIG_EVENT_TRACE_TEST_SYSCALLS
-		if (call->system &&
-		    strcmp(call->system, "syscalls") == 0)
+		if (call->class->system &&
+		    strcmp(call->class->system, "syscalls") == 0)
 			continue;
 #endif
 
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 4615f62..22fa89f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -627,7 +627,7 @@ static int init_subsystem_preds(struct event_subsystem *system)
 		if (!call->define_fields)
 			continue;
 
-		if (strcmp(call->system, system->name) != 0)
+		if (strcmp(call->class->system, system->name) != 0)
 			continue;
 
 		err = init_preds(call);
@@ -646,7 +646,7 @@ static void filter_free_subsystem_preds(struct event_subsystem *system)
 		if (!call->define_fields)
 			continue;
 
-		if (strcmp(call->system, system->name) != 0)
+		if (strcmp(call->class->system, system->name) != 0)
 			continue;
 
 		filter_disable_preds(call);
@@ -1251,7 +1251,7 @@ static int replace_system_preds(struct event_subsystem *system,
 		if (!call->define_fields)
 			continue;
 
-		if (strcmp(call->system, system->name) != 0)
+		if (strcmp(call->class->system, system->name) != 0)
 			continue;
 
 		/* try to see if the filter can be applied */
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index e091f64..7f16e21 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -18,6 +18,10 @@
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM	ftrace
 
+struct ftrace_event_class event_class_ftrace = {
+	.system			= __stringify(TRACE_SYSTEM),
+};
+
 /* not needed for this file */
 #undef __field_struct
 #define __field_struct(type, item)
@@ -160,7 +164,7 @@ __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.id			= type,					\
-	.system			= __stringify(TRACE_SYSTEM),		\
+	.class			= &event_class_ftrace,			\
 	.raw_init		= ftrace_raw_init_event,		\
 	.print_fmt		= print,				\
 	.define_fields		= ftrace_define_fields_##call,		\
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1251e36..eda220b 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -332,8 +332,8 @@ static struct trace_probe *alloc_trace_probe(const char *group,
 		goto error;
 	}
 
-	tp->call.system = kstrdup(group, GFP_KERNEL);
-	if (!tp->call.system)
+	tp->call.class->system = kstrdup(group, GFP_KERNEL);
+	if (!tp->call.class->system)
 		goto error;
 
 	INIT_LIST_HEAD(&tp->list);
@@ -361,7 +361,7 @@ static void free_trace_probe(struct trace_probe *tp)
 	for (i = 0; i < tp->nr_args; i++)
 		free_probe_arg(&tp->args[i]);
 
-	kfree(tp->call.system);
+	kfree(tp->call.class->system);
 	kfree(tp->call.name);
 	kfree(tp->symbol);
 	kfree(tp);
@@ -374,7 +374,7 @@ static struct trace_probe *find_probe_event(const char *event,
 
 	list_for_each_entry(tp, &probe_list, list)
 		if (strcmp(tp->call.name, event) == 0 &&
-		    strcmp(tp->call.system, group) == 0)
+		    strcmp(tp->call.class->system, group) == 0)
 			return tp;
 	return NULL;
 }
@@ -399,7 +399,7 @@ static int register_trace_probe(struct trace_probe *tp)
 	mutex_lock(&probe_lock);
 
 	/* register as an event */
-	old_tp = find_probe_event(tp->call.name, tp->call.system);
+	old_tp = find_probe_event(tp->call.name, tp->call.class->system);
 	if (old_tp) {
 		/* delete old event */
 		unregister_trace_probe(old_tp);
@@ -798,7 +798,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
 	char buf[MAX_ARGSTR_LEN + 1];
 
 	seq_printf(m, "%c", probe_is_return(tp) ? 'r' : 'p');
-	seq_printf(m, ":%s/%s", tp->call.system, tp->call.name);
+	seq_printf(m, ":%s/%s", tp->call.class->system, tp->call.name);
 
 	if (!tp->symbol)
 		seq_printf(m, " 0x%p", tp->rp.kp.addr);
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 33c2a5b..31fc95a 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -14,6 +14,10 @@ static int sys_refcount_exit;
 static DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
 static DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
 
+struct ftrace_event_class event_class_syscalls = {
+	.system			= "syscalls"
+};
+
 extern unsigned long __start_syscalls_metadata[];
 extern unsigned long __stop_syscalls_metadata[];
 
-- 
1.7.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
  2010-04-26 19:50 ` [PATCH 01/10][RFC] tracing: Create class struct for events Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-27  9:08   ` Li Zefan
  2010-04-28 20:37   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA() Steven Rostedt
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers

[-- Attachment #1: 0002-tracing-Let-tracepoints-have-data-passed-to-tracepoi.patch --]
[-- Type: text/plain, Size: 15699 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch allows data to be passed to the tracepoint callbacks
if the tracepoint was created to do so.

If a tracepoint is defined with:

DECLARE_TRACE_DATA(name, proto, args)

Then a registered function can also register data to be passed
to the tracepoint as such:

  DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(int status), TP_ARGS(status));

  /* In the C file */

  DEFINE_TRACE(mytracepoint);

  [...]

       trace_mytracepoint(status);

  /* In a file registering this tracepoint */

  void my_callback(int status, void *data)
  {
	struct my_struct *my_data = data;
	[...]
  }

  [...]
	my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
	init_my_data(my_data);
	register_trace_mytracepoint_data(my_callback, my_data);

The same callback can also be registered to the same tracepoint more than
once, as long as the data registered is different. Note, the data must
also be used to unregister the callback:

	unregister_trace_mytracepoint_data(my_callback, my_data);
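
The matching rule for registration and unregistration can be sketched in
user space as follows. This is a simplified model under stated assumptions,
not the kernel code: the fixed-size table, MAX_PROBES, probe_register(),
probe_unregister() and count_cb() are all hypothetical names, and unlike the
kernel this sketch reports -ENOENT when no entry matches. What it does share
with the patch is that a probe is identified by the (function, data) pair.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PROBES 8	/* arbitrary limit for this sketch */

struct probe_slot {
	void (*func)(int status, void *data);
	void *data;
};

static struct probe_slot slots[MAX_PROBES];
static size_t nr_slots;

/* Register (func, data); the pair, not the function alone, must be unique. */
static int probe_register(void (*func)(int, void *), void *data)
{
	size_t i;

	assert(func != NULL);
	for (i = 0; i < nr_slots; i++)
		if (slots[i].func == func && slots[i].data == data)
			return -EEXIST;
	if (nr_slots == MAX_PROBES)
		return -ENOMEM;
	slots[nr_slots].func = func;
	slots[nr_slots].data = data;
	nr_slots++;
	return 0;
}

/* Remove only the entry whose function and data both match. */
static int probe_unregister(void (*func)(int, void *), void *data)
{
	size_t i;

	for (i = 0; i < nr_slots; i++)
		if (slots[i].func == func && slots[i].data == data)
			break;
	if (i == nr_slots)
		return -ENOENT;
	/* Close the gap while keeping the remaining entries in order. */
	for (; i + 1 < nr_slots; i++)
		slots[i] = slots[i + 1];
	nr_slots--;
	return 0;
}

/* Sample callback: accumulates the status into the int that data points to. */
static void count_cb(int status, void *data)
{
	*(int *)data += status;
}
```

Registering count_cb twice with two different data pointers succeeds in this
model, while repeating an identical pair is rejected, and unregistering with
the wrong data pointer leaves the table untouched.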

Because of the data parameter, tracepoints declared this way cannot have
an empty argument list. That is:

  DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(void), TP_ARGS());

will cause an error, but the original DECLARE_TRACE still allows for this.

The DECLARE_TRACE_DATA() will be used by TRACE_EVENT() so that it
can reuse code and bring the size of the tracepoint footprint down.
This means that TRACE_EVENT()s must have at least one argument defined.
This should not be a problem since we should never have a static
tracepoint in the kernel that simply says "Look I'm here!".
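
To show why passing data allows code reuse, here is a minimal user-space
sketch in which one probe function serves two events and tells them apart
only by its data argument. All names here (generic_fn, tp_func, status_probe,
do_trace(), struct event, class_probe()) are hypothetical and chosen for this
example; the sketch also stores the probe as a generic function pointer
rather than the void * used in the patch, to stay within portable C. The loop
walks an array terminated by a NULL func, as the patched __DO_TRACE does.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*generic_fn)(void);

struct tp_func {
	generic_fn func;	/* a NULL func terminates the array */
	void *data;
};

typedef void (*status_probe)(int status, void *data);

/* Call every registered probe, handing each one its own data pointer. */
static void do_trace(const struct tp_func *it, int status)
{
	if (!it)
		return;
	assert(it->func != NULL);	/* a non-NULL array has one entry */
	do {
		((status_probe)it->func)(status, it->data);
	} while ((++it)->func);
}

/* Per-event state; the probe itself carries no per-event code. */
struct event {
	const char *name;
	int hits;
	int last_status;
};

/* One probe shared by every event of the same shape. */
static void class_probe(int status, void *data)
{
	struct event *ev = data;

	ev->hits++;
	ev->last_status = status;
}
```

With this shape, adding another event means adding another struct event and
one more table entry, not another copy of the probe function.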

This is part of a series to make the tracepoint footprint smaller:

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint

Again, this patch also increases the size of the kernel, but
lays the groundwork for decreasing it.

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/tracepoint.h |  103 +++++++++++++++++++++++++++++++++++++------
 kernel/tracepoint.c        |   91 ++++++++++++++++++++++-----------------
 2 files changed, 139 insertions(+), 55 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 78b4bd3..4649bdb 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -20,12 +20,17 @@
 struct module;
 struct tracepoint;
 
+struct tracepoint_func {
+	void *func;
+	void *data;
+};
+
 struct tracepoint {
 	const char *name;		/* Tracepoint name */
 	int state;			/* State. */
 	void (*regfunc)(void);
 	void (*unregfunc)(void);
-	void **funcs;
+	struct tracepoint_func *funcs;
 } __attribute__((aligned(32)));		/*
 					 * Aligned on 32 bytes because it is
 					 * globally visible and gcc happily
@@ -40,20 +45,31 @@ struct tracepoint {
 
 #ifdef CONFIG_TRACEPOINTS
 
+#define _CALL_TRACE(proto, args)					\
+	(void)(it_data);						\
+	((void(*)(proto))(it_func))(args)
+
+#define _CALL_TRACE_DATA(proto, args)					\
+	it_data = (it_func_ptr)->data;					\
+	((void(*)(proto, void *))(it_func))(args, (it_data))
+
 /*
  * it_func[0] is never NULL because there is at least one element in the array
  * when the array itself is non NULL.
  */
-#define __DO_TRACE(tp, proto, args)					\
+#define __DO_TRACE(tp, proto, args, call)				\
 	do {								\
-		void **it_func;						\
+		struct tracepoint_func *it_func_ptr;			\
+		void *it_func;						\
+		void *it_data;						\
 									\
 		rcu_read_lock_sched_notrace();				\
-		it_func = rcu_dereference_sched((tp)->funcs);		\
-		if (it_func) {						\
+		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
+		if (it_func_ptr) {					\
 			do {						\
-				((void(*)(proto))(*it_func))(args);	\
-			} while (*(++it_func));				\
+				it_func = (it_func_ptr)->func;		\
+				call;					\
+			} while ((++it_func_ptr)->func);		\
 		}							\
 		rcu_read_unlock_sched_notrace();			\
 	} while (0)
@@ -69,17 +85,55 @@ struct tracepoint {
 	{								\
 		if (unlikely(__tracepoint_##name.state))		\
 			__DO_TRACE(&__tracepoint_##name,		\
-				TP_PROTO(proto), TP_ARGS(args));	\
+				TP_PROTO(proto), TP_ARGS(args),		\
+				_CALL_TRACE(PARAMS(proto),		\
+					    PARAMS(args)));		\
 	}								\
 	static inline int register_trace_##name(void (*probe)(proto))	\
 	{								\
-		return tracepoint_probe_register(#name, (void *)probe);	\
+		return tracepoint_probe_register(#name, (void *)probe,	\
+						 NULL);			\
 	}								\
-	static inline int unregister_trace_##name(void (*probe)(proto))	\
+	static inline int unregister_trace_##name(void (*probe)(proto)) \
 	{								\
-		return tracepoint_probe_unregister(#name, (void *)probe);\
+		return tracepoint_probe_unregister(#name, (void *)probe,\
+						   NULL);		\
 	}
 
+#define DECLARE_TRACE_DATA(name, proto, args)				\
+	extern struct tracepoint __tracepoint_##name;			\
+	static inline void trace_##name(proto)				\
+	{								\
+		if (unlikely(__tracepoint_##name.state))		\
+			__DO_TRACE(&__tracepoint_##name,		\
+				TP_PROTO(proto), TP_ARGS(args),		\
+				_CALL_TRACE_DATA(PARAMS(proto),		\
+						 PARAMS(args)));	\
+	}								\
+	static inline int register_trace_##name(void (*probe)(proto))	\
+	{								\
+		return tracepoint_probe_register(#name, (void *)probe,	\
+						 NULL);			\
+	}								\
+	static inline int unregister_trace_##name(void (*probe)(proto)) \
+	{								\
+		return tracepoint_probe_unregister(#name, (void *)probe,\
+						   NULL);		\
+	}								\
+	static inline int						\
+	register_trace_##name##_data(void (*probe)(proto, void *data),	\
+				     void *data)			\
+	{								\
+		return tracepoint_probe_register(#name, (void *)probe,	\
+						 data);			\
+	}								\
+	static inline int						\
+	unregister_trace_##name##_data(void (*probe)(proto, void *data),\
+				       void *data)			\
+	{								\
+		return tracepoint_probe_unregister(#name, (void *)probe,\
+						   data);		\
+	}
 
 #define DEFINE_TRACE_FN(name, reg, unreg)				\
 	static const char __tpstrtab_##name[]				\
@@ -114,6 +168,22 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
 		return -ENOSYS;						\
 	}
 
+#define DECLARE_TRACE_DATA(name, proto, args)				\
+	static inline void _do_trace_##name(struct tracepoint *tp, proto) \
+	{ }								\
+	static inline void trace_##name(proto)				\
+	{ }								\
+	static inline int						\
+	register_trace_##name(void (*probe)(proto), void *data)		\
+	{								\
+		return -ENOSYS;						\
+	}								\
+	static inline int						\
+	unregister_trace_##name(void (*probe)(proto), void *data)	\
+	{								\
+		return -ENOSYS;						\
+	}
+
 #define DEFINE_TRACE_FN(name, reg, unreg)
 #define DEFINE_TRACE(name)
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
@@ -129,16 +199,19 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
  * Connect a probe to a tracepoint.
  * Internal API, should not be used directly.
  */
-extern int tracepoint_probe_register(const char *name, void *probe);
+extern int tracepoint_probe_register(const char *name, void *probe, void *data);
 
 /*
  * Disconnect a probe from a tracepoint.
  * Internal API, should not be used directly.
  */
-extern int tracepoint_probe_unregister(const char *name, void *probe);
+extern int
+tracepoint_probe_unregister(const char *name, void *probe, void *data);
 
-extern int tracepoint_probe_register_noupdate(const char *name, void *probe);
-extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe);
+extern int tracepoint_probe_register_noupdate(const char *name, void *probe,
+					      void *data);
+extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe,
+						void *data);
 extern void tracepoint_probe_update_all(void);
 
 struct tracepoint_iter {
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index cc89be5..c77f3ec 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -54,7 +54,7 @@ static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
  */
 struct tracepoint_entry {
 	struct hlist_node hlist;
-	void **funcs;
+	struct tracepoint_func *funcs;
 	int refcount;	/* Number of times armed. 0 if disarmed. */
 	char name[0];
 };
@@ -64,12 +64,12 @@ struct tp_probes {
 		struct rcu_head rcu;
 		struct list_head list;
 	} u;
-	void *probes[0];
+	struct tracepoint_func probes[0];
 };
 
 static inline void *allocate_probes(int count)
 {
-	struct tp_probes *p  = kmalloc(count * sizeof(void *)
+	struct tp_probes *p  = kmalloc(count * sizeof(struct tracepoint_func)
 			+ sizeof(struct tp_probes), GFP_KERNEL);
 	return p == NULL ? NULL : p->probes;
 }
@@ -79,7 +79,7 @@ static void rcu_free_old_probes(struct rcu_head *head)
 	kfree(container_of(head, struct tp_probes, u.rcu));
 }
 
-static inline void release_probes(void *old)
+static inline void release_probes(struct tracepoint_func *old)
 {
 	if (old) {
 		struct tp_probes *tp_probes = container_of(old,
@@ -95,15 +95,16 @@ static void debug_print_probes(struct tracepoint_entry *entry)
 	if (!tracepoint_debug || !entry->funcs)
 		return;
 
-	for (i = 0; entry->funcs[i]; i++)
-		printk(KERN_DEBUG "Probe %d : %p\n", i, entry->funcs[i]);
+	for (i = 0; entry->funcs[i].func; i++)
+		printk(KERN_DEBUG "Probe %d : %p\n", i, entry->funcs[i].func);
 }
 
-static void *
-tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
+static struct tracepoint_func *
+tracepoint_entry_add_probe(struct tracepoint_entry *entry,
+			   void *probe, void *data)
 {
 	int nr_probes = 0;
-	void **old, **new;
+	struct tracepoint_func *old, *new;
 
 	WARN_ON(!probe);
 
@@ -111,8 +112,9 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
 	old = entry->funcs;
 	if (old) {
 		/* (N -> N+1), (N != 0, 1) probes */
-		for (nr_probes = 0; old[nr_probes]; nr_probes++)
-			if (old[nr_probes] == probe)
+		for (nr_probes = 0; old[nr_probes].func; nr_probes++)
+			if (old[nr_probes].func == probe &&
+			    old[nr_probes].data == data)
 				return ERR_PTR(-EEXIST);
 	}
 	/* + 2 : one for new probe, one for NULL func */
@@ -120,9 +122,10 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
 	if (new == NULL)
 		return ERR_PTR(-ENOMEM);
 	if (old)
-		memcpy(new, old, nr_probes * sizeof(void *));
-	new[nr_probes] = probe;
-	new[nr_probes + 1] = NULL;
+		memcpy(new, old, nr_probes * sizeof(struct tracepoint_func));
+	new[nr_probes].func = probe;
+	new[nr_probes].data = data;
+	new[nr_probes + 1].func = NULL;
 	entry->refcount = nr_probes + 1;
 	entry->funcs = new;
 	debug_print_probes(entry);
@@ -130,10 +133,11 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
 }
 
 static void *
-tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
+tracepoint_entry_remove_probe(struct tracepoint_entry *entry,
+			      void *probe, void *data)
 {
 	int nr_probes = 0, nr_del = 0, i;
-	void **old, **new;
+	struct tracepoint_func *old, *new;
 
 	old = entry->funcs;
 
@@ -142,8 +146,10 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
 
 	debug_print_probes(entry);
 	/* (N -> M), (N > 1, M >= 0) probes */
-	for (nr_probes = 0; old[nr_probes]; nr_probes++) {
-		if ((!probe || old[nr_probes] == probe))
+	for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
+		if (!probe ||
+		    (old[nr_probes].func == probe &&
+		     old[nr_probes].data == data))
 			nr_del++;
 	}
 
@@ -160,10 +166,11 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
 		new = allocate_probes(nr_probes - nr_del + 1);
 		if (new == NULL)
 			return ERR_PTR(-ENOMEM);
-		for (i = 0; old[i]; i++)
-			if ((probe && old[i] != probe))
+		for (i = 0; old[i].func; i++)
+			if (probe &&
+			    (old[i].func != probe || old[i].data != data))
 				new[j++] = old[i];
-		new[nr_probes - nr_del] = NULL;
+		new[nr_probes - nr_del].func = NULL;
 		entry->refcount = nr_probes - nr_del;
 		entry->funcs = new;
 	}
@@ -315,18 +322,19 @@ static void tracepoint_update_probes(void)
 	module_update_tracepoints();
 }
 
-static void *tracepoint_add_probe(const char *name, void *probe)
+static struct tracepoint_func *
+tracepoint_add_probe(const char *name, void *probe, void *data)
 {
 	struct tracepoint_entry *entry;
-	void *old;
+	struct tracepoint_func *old;
 
 	entry = get_tracepoint(name);
 	if (!entry) {
 		entry = add_tracepoint(name);
 		if (IS_ERR(entry))
-			return entry;
+			return (struct tracepoint_func *)entry;
 	}
-	old = tracepoint_entry_add_probe(entry, probe);
+	old = tracepoint_entry_add_probe(entry, probe, data);
 	if (IS_ERR(old) && !entry->refcount)
 		remove_tracepoint(entry);
 	return old;
@@ -340,12 +348,12 @@ static void *tracepoint_add_probe(const char *name, void *probe)
  * Returns 0 if ok, error value on error.
  * The probe address must at least be aligned on the architecture pointer size.
  */
-int tracepoint_probe_register(const char *name, void *probe)
+int tracepoint_probe_register(const char *name, void *probe, void *data)
 {
-	void *old;
+	struct tracepoint_func *old;
 
 	mutex_lock(&tracepoints_mutex);
-	old = tracepoint_add_probe(name, probe);
+	old = tracepoint_add_probe(name, probe, data);
 	mutex_unlock(&tracepoints_mutex);
 	if (IS_ERR(old))
 		return PTR_ERR(old);
@@ -356,15 +364,16 @@ int tracepoint_probe_register(const char *name, void *probe)
 }
 EXPORT_SYMBOL_GPL(tracepoint_probe_register);
 
-static void *tracepoint_remove_probe(const char *name, void *probe)
+static struct tracepoint_func *
+tracepoint_remove_probe(const char *name, void *probe, void *data)
 {
 	struct tracepoint_entry *entry;
-	void *old;
+	struct tracepoint_func *old;
 
 	entry = get_tracepoint(name);
 	if (!entry)
 		return ERR_PTR(-ENOENT);
-	old = tracepoint_entry_remove_probe(entry, probe);
+	old = tracepoint_entry_remove_probe(entry, probe, data);
 	if (IS_ERR(old))
 		return old;
 	if (!entry->refcount)
@@ -382,12 +391,12 @@ static void *tracepoint_remove_probe(const char *name, void *probe)
  * itself uses stop_machine(), which insures that every preempt disabled section
  * have finished.
  */
-int tracepoint_probe_unregister(const char *name, void *probe)
+int tracepoint_probe_unregister(const char *name, void *probe, void *data)
 {
-	void *old;
+	struct tracepoint_func *old;
 
 	mutex_lock(&tracepoints_mutex);
-	old = tracepoint_remove_probe(name, probe);
+	old = tracepoint_remove_probe(name, probe, data);
 	mutex_unlock(&tracepoints_mutex);
 	if (IS_ERR(old))
 		return PTR_ERR(old);
@@ -418,12 +427,13 @@ static void tracepoint_add_old_probes(void *old)
  *
  * caller must call tracepoint_probe_update_all()
  */
-int tracepoint_probe_register_noupdate(const char *name, void *probe)
+int tracepoint_probe_register_noupdate(const char *name, void *probe,
+				       void *data)
 {
-	void *old;
+	struct tracepoint_func *old;
 
 	mutex_lock(&tracepoints_mutex);
-	old = tracepoint_add_probe(name, probe);
+	old = tracepoint_add_probe(name, probe, data);
 	if (IS_ERR(old)) {
 		mutex_unlock(&tracepoints_mutex);
 		return PTR_ERR(old);
@@ -441,12 +451,13 @@ EXPORT_SYMBOL_GPL(tracepoint_probe_register_noupdate);
  *
  * caller must call tracepoint_probe_update_all()
  */
-int tracepoint_probe_unregister_noupdate(const char *name, void *probe)
+int tracepoint_probe_unregister_noupdate(const char *name, void *probe,
+					 void *data)
 {
-	void *old;
+	struct tracepoint_func *old;
 
 	mutex_lock(&tracepoints_mutex);
-	old = tracepoint_remove_probe(name, probe);
+	old = tracepoint_remove_probe(name, probe, data);
 	if (IS_ERR(old)) {
 		mutex_unlock(&tracepoints_mutex);
 		return PTR_ERR(old);
-- 
1.7.0




* [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
  2010-04-26 19:50 ` [PATCH 01/10][RFC] tracing: Create class struct for events Steven Rostedt
  2010-04-26 19:50 ` [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 20:39   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 04/10][RFC] tracing: Remove per event trace registering Steven Rostedt
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0003-tracing-Convert-TRACE_EVENT-to-use-the-DECLARE_TRACE.patch --]
[-- Type: text/plain, Size: 1826 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Switch the TRACE_EVENT() macros to use DECLARE_TRACE_DATA(). This
patch is done to prove that the DATA macros work. If any regressions
were to surface, then this patch would help a git bisect to localize
the area.
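
As a rough illustration of what the DATA variants make possible, here is a
userspace sketch (not kernel code). It assumes each registered probe is
stored as a function/data pair in an array terminated by a NULL function.
The names probe_slot, add_to_counter and call_probes are invented for this
example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only. "probe_slot" is a made-up stand-in for the
 * func/data pair stored per registered probe.
 */
struct probe_slot {
	void (*func)(int value, void *data);
	void *data;
};

/* One probe body shared by every slot; "data" tells it who it runs for. */
static void add_to_counter(int value, void *data)
{
	int *counter = data;

	*counter += value;
}

/* Call every probe until the slot whose func is NULL; return the count. */
static int call_probes(const struct probe_slot *slots, int value)
{
	int called = 0;

	assert(slots != NULL);
	for (; slots->func; slots++) {
		slots->func(value, slots->data);
		called++;
	}
	return called;
}
```

The same function can be registered more than once with different data,
which is what lets several events share one probe body.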

Once again this patch increases the size of the kernel.

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/tracepoint.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 4649bdb..c04988a 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -355,14 +355,14 @@ static inline void tracepoint_synchronize_unregister(void)
 
 #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print)
 #define DEFINE_EVENT(template, name, proto, args)		\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
 
 #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
 #define TRACE_EVENT_FN(name, proto, args, struct,		\
 		assign, print, reg, unreg)			\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
 
 #endif /* ifdef TRACE_EVENT (see note above) */
-- 
1.7.0




* [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (2 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA() Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 20:44   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 05/10][RFC] tracing: Move fields from event to class structure Steven Rostedt
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0004-tracing-Remove-per-event-trace-registering.patch --]
[-- Type: text/plain, Size: 21498 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch removes the register functions of TRACE_EVENT() to enable
and disable tracepoints. The registering of an event is now done
directly in the trace_events.c file. The tracepoint_probe_register()
is now called directly.

The prototypes are no longer type checked, but this should not be
an issue since the tracepoints are created automatically by the
macros. If a prototype is incorrect in the TRACE_EVENT() macro, then
other macros will catch it.

The ftrace_event_class structure now holds the probes to be called
by the callbacks. This removes the need for each event to have
a separate pointer for the probe.

To handle kprobes and syscalls, since they register probes in a
different manner, a "reg" field is added to the ftrace_event_class
structure. If the "reg" field is assigned, then it will be called for
enabling and disabling of the probe for either ftrace or perf. To let
the reg function know what is happening, a new enum (trace_reg) is
created that has the type of control that is needed.
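
The enable path described above can be modeled in userspace roughly as
follows. This is a sketch under assumptions, not the kernel code: the
model_* names, the counters and the simplified probe type are invented
here. Only the enum values and the "use reg if set, else register the
class probe directly" rule come from the patch.

```c
#include <assert.h>
#include <stddef.h>

enum trace_reg {
	TRACE_REG_REGISTER,
	TRACE_REG_UNREGISTER,
	TRACE_REG_PERF_REGISTER,
	TRACE_REG_PERF_UNREGISTER,
};

struct model_event;

/* Simplified class: a shared probe, or a custom "reg" hook. */
struct model_class {
	void (*probe)(void);
	int (*reg)(struct model_event *event, enum trace_reg type);
};

struct model_event {
	const char *name;
	struct model_class *cls;
	int enabled;
};

static int direct_registrations;	/* calls to the direct path */
static int custom_reg_calls;		/* calls through the reg hook */

static void model_noop_probe(void)
{
}

/* Hypothetical stand-in for tracepoint_probe_register(). */
static int model_probe_register(const char *name, void (*probe)(void),
				void *data)
{
	(void)name;
	(void)data;
	if (!probe)
		return -1;
	direct_registrations++;
	return 0;
}

/* Example reg hook, in the style of the kprobe and syscall ones. */
static int model_custom_reg(struct model_event *event, enum trace_reg type)
{
	(void)event;
	switch (type) {
	case TRACE_REG_REGISTER:
	case TRACE_REG_PERF_REGISTER:
		custom_reg_calls++;
		return 0;
	case TRACE_REG_UNREGISTER:
	case TRACE_REG_PERF_UNREGISTER:
		return 0;
	}
	return 0;
}

/* Enable an event: prefer the class reg hook, else register directly. */
static int model_event_enable(struct model_event *event)
{
	int ret;

	assert(event != NULL && event->cls != NULL);
	if (event->enabled)
		return 0;
	if (event->cls->reg)
		ret = event->cls->reg(event, TRACE_REG_REGISTER);
	else
		ret = model_probe_register(event->name, event->cls->probe,
					   event);
	if (!ret)
		event->enabled = 1;
	return ret;
}
```

Passing the event itself as the data pointer is what allows one probe per
class to serve every event in that class.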

With this new rework, the 82 kernel events and 616 syscall events
have their footprint dramatically lowered:

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data
5774316	1306580	9351592	16432488	 fabd68	vmlinux.regs

The size went from 16477030 to 16432488, a total of about 44K
in savings. With tracepoints being continuously added, it is
critical that the footprint be kept minimal.
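
The savings can be rechecked from the text, data and bss columns of the
table. The helper below is only an illustrative sketch; size_row,
size_total and size_saved are names made up for it.

```c
#include <assert.h>

/* One row of "size" output: text, data and bss in bytes. */
struct size_row {
	long text;
	long data;
	long bss;
};

/* The "dec" column is the sum of the three sections. */
static long size_total(struct size_row row)
{
	assert(row.text >= 0 && row.data >= 0 && row.bss >= 0);
	return row.text + row.data + row.bss;
}

/* Bytes saved going from "before" to "after" (negative means growth). */
static long size_saved(struct size_row before, struct size_row after)
{
	return size_total(before) - size_total(after);
}
```

With the vmlinux.orig and vmlinux.regs rows this gives 44542 bytes, of
which 13870 come from text and 30672 from data.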

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h    |   17 +++++--
 include/linux/syscalls.h        |   29 ++---------
 include/linux/tracepoint.h      |   12 ++++-
 include/trace/ftrace.h          |  110 +++++----------------------------------
 kernel/trace/trace_event_perf.c |   15 ++++-
 kernel/trace/trace_events.c     |   26 +++++++---
 kernel/trace/trace_kprobe.c     |   34 +++++++++---
 kernel/trace/trace_syscalls.c   |   56 +++++++++++++++++++-
 8 files changed, 151 insertions(+), 148 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 496eea8..dd0051e 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -113,8 +113,21 @@ void tracing_record_cmdline(struct task_struct *tsk);
 
 struct event_filter;
 
+enum trace_reg {
+	TRACE_REG_REGISTER,
+	TRACE_REG_UNREGISTER,
+	TRACE_REG_PERF_REGISTER,
+	TRACE_REG_PERF_UNREGISTER,
+};
+
+struct ftrace_event_call;
+
 struct ftrace_event_class {
 	char			*system;
+	void			*probe;
+	void			*perf_probe;
+	int			(*reg)(struct ftrace_event_call *event,
+				       enum trace_reg type);
 };
 
 struct ftrace_event_call {
@@ -124,8 +137,6 @@ struct ftrace_event_call {
 	struct dentry		*dir;
 	struct trace_event	*event;
 	int			enabled;
-	int			(*regfunc)(struct ftrace_event_call *);
-	void			(*unregfunc)(struct ftrace_event_call *);
 	int			id;
 	const char		*print_fmt;
 	int			(*raw_init)(struct ftrace_event_call *);
@@ -137,8 +148,6 @@ struct ftrace_event_call {
 	void			*data;
 
 	int			perf_refcount;
-	int			(*perf_event_enable)(struct ftrace_event_call *);
-	void			(*perf_event_disable)(struct ftrace_event_call *);
 };
 
 #define PERF_MAX_TRACE_SIZE	2048
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ac5791d..e3348c4 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -103,22 +103,6 @@ struct perf_event_attr;
 #define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
 #define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
 
-#ifdef CONFIG_PERF_EVENTS
-
-#define TRACE_SYS_ENTER_PERF_INIT(sname)				       \
-	.perf_event_enable = perf_sysenter_enable,			       \
-	.perf_event_disable = perf_sysenter_disable,
-
-#define TRACE_SYS_EXIT_PERF_INIT(sname)					       \
-	.perf_event_enable = perf_sysexit_enable,			       \
-	.perf_event_disable = perf_sysexit_disable,
-#else
-#define TRACE_SYS_ENTER_PERF(sname)
-#define TRACE_SYS_ENTER_PERF_INIT(sname)
-#define TRACE_SYS_EXIT_PERF(sname)
-#define TRACE_SYS_EXIT_PERF_INIT(sname)
-#endif /* CONFIG_PERF_EVENTS */
-
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define __SC_STR_ADECL1(t, a)		#a
 #define __SC_STR_ADECL2(t, a, ...)	#a, __SC_STR_ADECL1(__VA_ARGS__)
@@ -134,7 +118,8 @@ struct perf_event_attr;
 #define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
 #define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
 
-extern struct ftrace_event_class event_class_syscalls;
+extern struct ftrace_event_class event_class_syscall_enter;
+extern struct ftrace_event_class event_class_syscall_exit;
 
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
 	static const struct syscall_metadata __syscall_meta_##sname;	\
@@ -148,14 +133,11 @@ extern struct ftrace_event_class event_class_syscalls;
 	  __attribute__((section("_ftrace_events")))			\
 	  event_enter_##sname = {					\
 		.name                   = "sys_enter"#sname,		\
-		.class			= &event_class_syscalls,	\
+		.class			= &event_class_syscall_enter,	\
 		.event                  = &enter_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
 		.define_fields		= syscall_enter_define_fields,	\
-		.regfunc		= reg_event_syscall_enter,	\
-		.unregfunc		= unreg_event_syscall_enter,	\
 		.data			= (void *)&__syscall_meta_##sname,\
-		TRACE_SYS_ENTER_PERF_INIT(sname)			\
 	}
 
 #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
@@ -170,14 +152,11 @@ extern struct ftrace_event_class event_class_syscalls;
 	  __attribute__((section("_ftrace_events")))			\
 	  event_exit_##sname = {					\
 		.name                   = "sys_exit"#sname,		\
-		.class			= &event_class_syscalls,	\
+		.class			= &event_class_syscall_exit,	\
 		.event                  = &exit_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
 		.define_fields		= syscall_exit_define_fields,	\
-		.regfunc		= reg_event_syscall_exit,	\
-		.unregfunc		= unreg_event_syscall_exit,	\
 		.data			= (void *)&__syscall_meta_##sname,\
-		TRACE_SYS_EXIT_PERF_INIT(sname)			\
 	}
 
 #define SYSCALL_METADATA(sname, nb)				\
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index c04988a..5876b77 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -173,13 +173,21 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
 	{ }								\
 	static inline void trace_##name(proto)				\
 	{ }								\
+	static inline int register_trace_##name(void (*probe)(proto))	\
+	{								\
+		return -ENOSYS;						\
+	}								\
+	static inline int unregister_trace_##name(void (*probe)(proto))	\
+	{								\
+		return -ENOSYS;						\
+	}								\
 	static inline int						\
-	register_trace_##name(void (*probe)(proto), void *data)		\
+	register_trace_##name##_data(void (*probe)(proto), void *data)	\
 	{								\
 		return -ENOSYS;						\
 	}								\
 	static inline int						\
-	unregister_trace_##name(void (*probe)(proto), void *data)	\
+	unregister_trace_##name##_data(void (*probe)(proto), void *data) \
 	{								\
 		return -ENOSYS;						\
 	}
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 0921a8f..62fe622 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -381,53 +381,6 @@ static inline notrace int ftrace_get_offsets_##call(			\
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
-#ifdef CONFIG_PERF_EVENTS
-
-/*
- * Generate the functions needed for tracepoint perf_event support.
- *
- * NOTE: The insertion profile callback (ftrace_profile_<call>) is defined later
- *
- * static int ftrace_profile_enable_<call>(void)
- * {
- * 	return register_trace_<call>(ftrace_profile_<call>);
- * }
- *
- * static void ftrace_profile_disable_<call>(void)
- * {
- * 	unregister_trace_<call>(ftrace_profile_<call>);
- * }
- *
- */
-
-#undef DECLARE_EVENT_CLASS
-#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)
-
-#undef DEFINE_EVENT
-#define DEFINE_EVENT(template, name, proto, args)			\
-									\
-static void perf_trace_##name(proto);					\
-									\
-static notrace int							\
-perf_trace_enable_##name(struct ftrace_event_call *unused)		\
-{									\
-	return register_trace_##name(perf_trace_##name);		\
-}									\
-									\
-static notrace void							\
-perf_trace_disable_##name(struct ftrace_event_call *unused)		\
-{									\
-	unregister_trace_##name(perf_trace_##name);			\
-}
-
-#undef DEFINE_EVENT_PRINT
-#define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
-	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
-
-#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
-
-#endif /* CONFIG_PERF_EVENTS */
-
 /*
  * Stage 4 of the trace events.
  *
@@ -468,16 +421,6 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
  *						   event, irq_flags, pc);
  * }
  *
- * static int ftrace_raw_reg_event_<call>(struct ftrace_event_call *unused)
- * {
- *	return register_trace_<call>(ftrace_raw_event_<call>);
- * }
- *
- * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
- * {
- *	unregister_trace_<call>(ftrace_raw_event_<call>);
- * }
- *
  * static struct trace_event ftrace_event_type_<call> = {
  *	.trace			= ftrace_raw_output_<call>, <-- stage 2
  * };
@@ -504,11 +447,15 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
 
 #ifdef CONFIG_PERF_EVENTS
 
+#define _TRACE_PERF_PROTO(call, proto)					\
+	static notrace void						\
+	perf_trace_##call(proto, struct ftrace_event_call *event);
+
 #define _TRACE_PERF_INIT(call)						\
-	.perf_event_enable = perf_trace_enable_##call,			\
-	.perf_event_disable = perf_trace_disable_##call,
+	.perf_probe		= perf_trace_##call,
 
 #else
+#define _TRACE_PERF_PROTO(call, proto)
 #define _TRACE_PERF_INIT(call)
 #endif /* CONFIG_PERF_EVENTS */
 
@@ -542,8 +489,8 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 									\
 static notrace void							\
-ftrace_raw_event_id_##call(struct ftrace_event_call *event_call,	\
-				       proto)				\
+ftrace_raw_event_##call(proto,						\
+			struct ftrace_event_call *event_call)		\
 {									\
 	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
 	struct ring_buffer_event *event;				\
@@ -578,23 +525,6 @@ ftrace_raw_event_id_##call(struct ftrace_event_call *event_call,	\
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args)			\
 									\
-static notrace void ftrace_raw_event_##call(proto)			\
-{									\
-	ftrace_raw_event_id_##template(&event_##call, args);		\
-}									\
-									\
-static notrace int							\
-ftrace_raw_reg_event_##call(struct ftrace_event_call *unused)		\
-{									\
-	return register_trace_##call(ftrace_raw_event_##call);		\
-}									\
-									\
-static notrace void							\
-ftrace_raw_unreg_event_##call(struct ftrace_event_call *unused)		\
-{									\
-	unregister_trace_##call(ftrace_raw_event_##call);		\
-}									\
-									\
 static struct trace_event ftrace_event_type_##call = {			\
 	.trace			= ftrace_raw_output_##call,		\
 };
@@ -618,9 +548,12 @@ static struct trace_event ftrace_event_type_##call = {			\
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
+_TRACE_PERF_PROTO(call, PARAMS(proto));					\
 static const char print_fmt_##call[] = print;				\
 static struct ftrace_event_class __used event_class_##call = {		\
-	.system			= __stringify(TRACE_SYSTEM)		\
+	.system			= __stringify(TRACE_SYSTEM),		\
+	.probe			= ftrace_raw_event_##call,		\
+	_TRACE_PERF_INIT(call)						\
 }
 
 #undef DEFINE_EVENT
@@ -633,11 +566,8 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.class			= &event_class_##template,		\
 	.event			= &ftrace_event_type_##call,		\
 	.raw_init		= trace_event_raw_init,			\
-	.regfunc		= ftrace_raw_reg_event_##call,		\
-	.unregfunc		= ftrace_raw_unreg_event_##call,	\
 	.print_fmt		= print_fmt_##template,			\
 	.define_fields		= ftrace_define_fields_##template,	\
-	_TRACE_PERF_INIT(call)					\
 }
 
 #undef DEFINE_EVENT_PRINT
@@ -651,12 +581,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.class			= &event_class_##template,		\
 	.event			= &ftrace_event_type_##call,		\
-	.raw_init		= trace_event_raw_init,			\
-	.regfunc		= ftrace_raw_reg_event_##call,		\
-	.unregfunc		= ftrace_raw_unreg_event_##call,	\
 	.print_fmt		= print_fmt_##call,			\
-	.define_fields		= ftrace_define_fields_##template,	\
-	_TRACE_PERF_INIT(call)					\
 }
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
@@ -756,8 +681,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace void							\
-perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
-			    proto)					\
+perf_trace_##call(proto, struct ftrace_event_call *event_call)		\
 {									\
 	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
 	struct ftrace_raw_##call *entry;				\
@@ -792,13 +716,7 @@ perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
 }
 
 #undef DEFINE_EVENT
-#define DEFINE_EVENT(template, call, proto, args)		\
-static notrace void perf_trace_##call(proto)			\
-{								\
-	struct ftrace_event_call *event_call = &event_##call;	\
-								\
-	perf_trace_templ_##template(event_call, args);		\
-}
+#define DEFINE_EVENT(template, call, proto, args)
 
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 81f691e..95df5a7 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -44,7 +44,12 @@ static int perf_trace_event_enable(struct ftrace_event_call *event)
 		rcu_assign_pointer(perf_trace_buf_nmi, buf);
 	}
 
-	ret = event->perf_event_enable(event);
+	if (event->class->reg)
+		ret = event->class->reg(event, TRACE_REG_PERF_REGISTER);
+	else
+		ret = tracepoint_probe_register(event->name,
+						event->class->perf_probe,
+						event);
 	if (!ret) {
 		total_ref_count++;
 		return 0;
@@ -70,7 +75,8 @@ int perf_trace_enable(int event_id)
 
 	mutex_lock(&event_mutex);
 	list_for_each_entry(event, &ftrace_events, list) {
-		if (event->id == event_id && event->perf_event_enable &&
+		if (event->id == event_id &&
+		    event->class && event->class->perf_probe &&
 		    try_module_get(event->mod)) {
 			ret = perf_trace_event_enable(event);
 			break;
@@ -88,7 +94,10 @@ static void perf_trace_event_disable(struct ftrace_event_call *event)
 	if (--event->perf_refcount > 0)
 		return;
 
-	event->perf_event_disable(event);
+	if (event->class->reg)
+		event->class->reg(event, TRACE_REG_PERF_UNREGISTER);
+	else
+		tracepoint_probe_unregister(event->name, event->class->perf_probe, event);
 
 	if (!--total_ref_count) {
 		buf = perf_trace_buf;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index f6893cc..f84cfcb 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -126,13 +126,23 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 		if (call->enabled) {
 			call->enabled = 0;
 			tracing_stop_cmdline_record();
-			call->unregfunc(call);
+			if (call->class->reg)
+				call->class->reg(call, TRACE_REG_UNREGISTER);
+			else
+				tracepoint_probe_unregister(call->name,
+							    call->class->probe,
+							    call);
 		}
 		break;
 	case 1:
 		if (!call->enabled) {
 			tracing_start_cmdline_record();
-			ret = call->regfunc(call);
+			if (call->class->reg)
+				ret = call->class->reg(call, TRACE_REG_REGISTER);
+			else
+				ret = tracepoint_probe_register(call->name,
+								call->class->probe,
+								call);
 			if (ret) {
 				tracing_stop_cmdline_record();
 				pr_info("event trace: Could not enable event "
@@ -170,7 +180,8 @@ static int __ftrace_set_clr_event(const char *match, const char *sub,
 	mutex_lock(&event_mutex);
 	list_for_each_entry(call, &ftrace_events, list) {
 
-		if (!call->name || !call->regfunc)
+		if (!call->name || !call->class ||
+		    (!call->class->probe && !call->class->reg))
 			continue;
 
 		if (match &&
@@ -296,7 +307,7 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
 		 * The ftrace subsystem is for showing formats only.
 		 * They can not be enabled or disabled via the event files.
 		 */
-		if (call->regfunc)
+		if (call->class && (call->class->probe || call->class->reg))
 			return call;
 	}
 
@@ -449,7 +460,8 @@ system_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
 
 	mutex_lock(&event_mutex);
 	list_for_each_entry(call, &ftrace_events, list) {
-		if (!call->name || !call->regfunc)
+		if (!call->name || !call->class ||
+		    (!call->class->probe && !call->class->reg))
 			continue;
 
 		if (system && strcmp(call->class->system, system) != 0)
@@ -934,11 +946,11 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 		return -1;
 	}
 
-	if (call->regfunc)
+	if (call->class->probe || call->class->reg)
 		trace_create_file("enable", 0644, call->dir, call,
 				  enable);
 
-	if (call->id && call->perf_event_enable)
+	if (call->id && (call->class->perf_probe || call->class->reg))
 		trace_create_file("id", 0444, call->dir, call,
 		 		  id);
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index eda220b..f8af21a 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -202,6 +202,7 @@ struct trace_probe {
 	unsigned long 		nhit;
 	unsigned int		flags;	/* For TP_FLAG_* */
 	const char		*symbol;	/* symbol name */
+	struct ftrace_event_class	class;
 	struct ftrace_event_call	call;
 	struct trace_event		event;
 	unsigned int		nr_args;
@@ -323,6 +324,7 @@ static struct trace_probe *alloc_trace_probe(const char *group,
 		goto error;
 	}
 
+	tp->call.class = &tp->class;
 	tp->call.name = kstrdup(event, GFP_KERNEL);
 	if (!tp->call.name)
 		goto error;
@@ -332,8 +334,8 @@ static struct trace_probe *alloc_trace_probe(const char *group,
 		goto error;
 	}
 
-	tp->call.class->system = kstrdup(group, GFP_KERNEL);
-	if (!tp->call.class->system)
+	tp->class.system = kstrdup(group, GFP_KERNEL);
+	if (!tp->class.system)
 		goto error;
 
 	INIT_LIST_HEAD(&tp->list);
@@ -1302,6 +1304,26 @@ static void probe_perf_disable(struct ftrace_event_call *call)
 }
 #endif	/* CONFIG_PERF_EVENTS */
 
+static __kprobes
+int kprobe_register(struct ftrace_event_call *event, enum trace_reg type)
+{
+	switch (type) {
+	case TRACE_REG_REGISTER:
+		return probe_event_enable(event);
+	case TRACE_REG_UNREGISTER:
+		probe_event_disable(event);
+		return 0;
+
+#ifdef CONFIG_PERF_EVENTS
+	case TRACE_REG_PERF_REGISTER:
+		return probe_perf_enable(event);
+	case TRACE_REG_PERF_UNREGISTER:
+		probe_perf_disable(event);
+		return 0;
+#endif
+	}
+	return 0;
+}
 
 static __kprobes
 int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
@@ -1355,13 +1377,7 @@ static int register_probe_event(struct trace_probe *tp)
 		return -ENODEV;
 	}
 	call->enabled = 0;
-	call->regfunc = probe_event_enable;
-	call->unregfunc = probe_event_disable;
-
-#ifdef CONFIG_PERF_EVENTS
-	call->perf_event_enable = probe_perf_enable;
-	call->perf_event_disable = probe_perf_disable;
-#endif
+	call->class->reg = kprobe_register;
 	call->data = tp;
 	ret = trace_add_event_call(call);
 	if (ret) {
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 31fc95a..c92934d 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -14,8 +14,19 @@ static int sys_refcount_exit;
 static DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
 static DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
 
-struct ftrace_event_class event_class_syscalls = {
-	.system			= "syscalls"
+static int syscall_enter_register(struct ftrace_event_call *event,
+				 enum trace_reg type);
+static int syscall_exit_register(struct ftrace_event_call *event,
+				 enum trace_reg type);
+
+struct ftrace_event_class event_class_syscall_enter = {
+	.system			= "syscalls",
+	.reg			= syscall_enter_register
+};
+
+struct ftrace_event_class event_class_syscall_exit = {
+	.system			= "syscalls",
+	.reg			= syscall_exit_register
 };
 
 extern unsigned long __start_syscalls_metadata[];
@@ -586,3 +597,44 @@ void perf_sysexit_disable(struct ftrace_event_call *call)
 
 #endif /* CONFIG_PERF_EVENTS */
 
+static int syscall_enter_register(struct ftrace_event_call *event,
+				 enum trace_reg type)
+{
+	switch (type) {
+	case TRACE_REG_REGISTER:
+		return reg_event_syscall_enter(event);
+	case TRACE_REG_UNREGISTER:
+		unreg_event_syscall_enter(event);
+		return 0;
+
+#ifdef CONFIG_PERF_EVENTS
+	case TRACE_REG_PERF_REGISTER:
+		return perf_sysenter_enable(event);
+	case TRACE_REG_PERF_UNREGISTER:
+		perf_sysenter_disable(event);
+		return 0;
+#endif
+	}
+	return 0;
+}
+
+static int syscall_exit_register(struct ftrace_event_call *event,
+				 enum trace_reg type)
+{
+	switch (type) {
+	case TRACE_REG_REGISTER:
+		return reg_event_syscall_exit(event);
+	case TRACE_REG_UNREGISTER:
+		unreg_event_syscall_exit(event);
+		return 0;
+
+#ifdef CONFIG_PERF_EVENTS
+	case TRACE_REG_PERF_REGISTER:
+		return perf_sysexit_enable(event);
+	case TRACE_REG_PERF_UNREGISTER:
+		perf_sysexit_disable(event);
+		return 0;
+#endif
+	}
+	return 0;
+}
-- 
1.7.0




* [PATCH 05/10][RFC] tracing: Move fields from event to class structure
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (3 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 04/10][RFC] tracing: Remove per event trace registering Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 20:58   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 06/10][RFC] tracing: Move raw_init from events to class Steven Rostedt
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers, Tom Zanussi

[-- Attachment #1: 0005-tracing-Move-fields-from-event-to-class-structure.patch --]
[-- Type: text/plain, Size: 18909 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Move the defined fields from the event to the class structure.
Since the fields of the event are defined by the class they belong
to, it makes sense to have the class hold the information instead
of the individual events. The events of the same class would just
hold duplicate information.

After this change the size of the kernel dropped another 8K:

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5774316	1306580	9351592	16432488	 fabd68	vmlinux.reg
5774503	1297492	9351592	16423587	 fa9aa3	vmlinux.fields

Although the text increased, this was mainly due to the C files
having to adapt to the change. This is a constant increase, where
new tracepoints will not increase the text. But the big drop is
in the data size (as well as needed allocations to hold the fields).
This will give even more savings as more tracepoints are created.

Note, if just TRACE_EVENT()s are used and not DECLARE_EVENT_CLASS()
with several DEFINE_EVENT()s, then the savings will be lost. But
we are pushing developers to consolidate events with DEFINE_EVENT()
so this should not be an issue.

The kprobes define a unique class to every new event, but are dynamic
so it should not be an issue.

The syscalls however have a single class but the fields for the individual
events are different. The syscalls use metadata to define the
fields. I moved the fields list from the event to the metadata and
added a "get_fields()" function to the class. This function is used
to find the fields. For normal events and kprobes, get_fields() just
returns a pointer to the fields list_head in the class. For syscall
events, it returns the fields list_head in the metadata for the event.
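
The two lookup styles can be sketched in userspace like this. It is a
simplified model, not the kernel code: field_list stands in for a
list_head, and the model_* and *_get_fields names are invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

struct field_list {
	int nr_fields;		/* stand-in for a struct list_head */
};

struct model_event;

struct model_class {
	struct field_list *(*get_fields)(struct model_event *event);
	struct field_list fields;	/* shared by events of the class */
};

struct model_meta {
	struct field_list fields;	/* per-syscall field list */
};

struct model_event {
	struct model_class *cls;
	void *data;			/* syscall events: their metadata */
};

/* Normal events and kprobes: the fields live in the class. */
static struct field_list *class_get_fields(struct model_event *event)
{
	return &event->cls->fields;
}

/* Syscall events: the fields live in the metadata behind event->data. */
static struct field_list *meta_get_fields(struct model_event *event)
{
	struct model_meta *meta = event->data;

	return &meta->fields;
}

/* Callers ask the class where the fields for this event are kept. */
static struct field_list *event_fields(struct model_event *event)
{
	assert(event != NULL && event->cls != NULL);
	assert(event->cls->get_fields != NULL);
	return event->cls->get_fields(event);
}
```

Two events of a normal class resolve to the same list, while two syscall
events sharing one class still resolve to different lists.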

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h       |    5 ++-
 include/linux/syscalls.h           |   12 +++++-----
 include/trace/ftrace.h             |   10 +++++---
 include/trace/syscall.h            |    3 +-
 kernel/trace/trace.h               |    3 ++
 kernel/trace/trace_events.c        |   43 +++++++++++++++++++++++++++++++-----
 kernel/trace/trace_events_filter.c |   10 +++++---
 kernel/trace/trace_export.c        |   14 ++++++------
 kernel/trace/trace_kprobe.c        |    8 +++---
 kernel/trace/trace_syscalls.c      |   23 ++++++++++++++++---
 10 files changed, 92 insertions(+), 39 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index dd0051e..1e2c8f5 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -128,6 +128,9 @@ struct ftrace_event_class {
 	void			*perf_probe;
 	int			(*reg)(struct ftrace_event_call *event,
 				       enum trace_reg type);
+	int			(*define_fields)(struct ftrace_event_call *);
+	struct list_head	*(*get_fields)(struct ftrace_event_call *);
+	struct list_head	fields;
 };
 
 struct ftrace_event_call {
@@ -140,8 +143,6 @@ struct ftrace_event_call {
 	int			id;
 	const char		*print_fmt;
 	int			(*raw_init)(struct ftrace_event_call *);
-	int			(*define_fields)(struct ftrace_event_call *);
-	struct list_head	fields;
 	int			filter_active;
 	struct event_filter	*filter;
 	void			*mod;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e3348c4..ef4f81c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -122,7 +122,7 @@ extern struct ftrace_event_class event_class_syscall_enter;
 extern struct ftrace_event_class event_class_syscall_exit;
 
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
-	static const struct syscall_metadata __syscall_meta_##sname;	\
+	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_enter_##sname;		\
 	static struct trace_event enter_syscall_print_##sname = {	\
@@ -136,12 +136,11 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.class			= &event_class_syscall_enter,	\
 		.event                  = &enter_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
-		.define_fields		= syscall_enter_define_fields,	\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
 #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
-	static const struct syscall_metadata __syscall_meta_##sname;	\
+	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_exit_##sname;		\
 	static struct trace_event exit_syscall_print_##sname = {	\
@@ -155,14 +154,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.class			= &event_class_syscall_exit,	\
 		.event                  = &exit_syscall_print_##sname,	\
 		.raw_init		= init_syscall_trace,		\
-		.define_fields		= syscall_exit_define_fields,	\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
 #define SYSCALL_METADATA(sname, nb)				\
 	SYSCALL_TRACE_ENTER_EVENT(sname);			\
 	SYSCALL_TRACE_EXIT_EVENT(sname);			\
-	static const struct syscall_metadata __used		\
+	static struct syscall_metadata __used			\
 	  __attribute__((__aligned__(4)))			\
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta_##sname = {				\
@@ -172,12 +170,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.args		= args_##sname,			\
 		.enter_event	= &event_enter_##sname,		\
 		.exit_event	= &event_exit_##sname,		\
+		.fields		= LIST_HEAD_INIT(__syscall_meta_##sname.fields), \
 	};
 
 #define SYSCALL_DEFINE0(sname)					\
 	SYSCALL_TRACE_ENTER_EVENT(_##sname);			\
 	SYSCALL_TRACE_EXIT_EVENT(_##sname);			\
-	static const struct syscall_metadata __used		\
+	static struct syscall_metadata __used			\
 	  __attribute__((__aligned__(4)))			\
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta__##sname = {				\
@@ -185,6 +184,7 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.nb_args 	= 0,				\
 		.enter_event	= &event_enter__##sname,	\
 		.exit_event	= &event_exit__##sname,		\
+		.fields		= LIST_HEAD_INIT(__syscall_meta__##sname.fields), \
 	};							\
 	asmlinkage long sys_##sname(void)
 #else
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 62fe622..e6ec392 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -429,6 +429,9 @@ static inline notrace int ftrace_get_offsets_##call(			\
  *
  * static struct ftrace_event_class __used event_class_<template> = {
  *	.system			= "<system>",
+ *	.define_fields		= ftrace_define_fields_<call>,
+	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
+	.probe			= ftrace_raw_event_##call,		\
  * }
  *
  * static struct ftrace_event_call __used
@@ -437,10 +440,8 @@ static inline notrace int ftrace_get_offsets_##call(			\
  *	.name			= "<call>",
  *	.class			= event_class_<template>,
  *	.raw_init		= trace_event_raw_init,
- *	.regfunc		= ftrace_reg_event_<call>,
- *	.unregfunc		= ftrace_unreg_event_<call>,
+ *	.event			= &ftrace_event_type_<call>,
  *	.print_fmt		= print_fmt_<call>,
- *	.define_fields		= ftrace_define_fields_<call>,
  * }
  *
  */
@@ -552,6 +553,8 @@ _TRACE_PERF_PROTO(call, PARAMS(proto));					\
 static const char print_fmt_##call[] = print;				\
 static struct ftrace_event_class __used event_class_##call = {		\
 	.system			= __stringify(TRACE_SYSTEM),		\
+	.define_fields		= ftrace_define_fields_##call,		\
+	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
 	.probe			= ftrace_raw_event_##call,		\
 	_TRACE_PERF_INIT(call)						\
 }
@@ -567,7 +570,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.event			= &ftrace_event_type_##call,		\
 	.raw_init		= trace_event_raw_init,			\
 	.print_fmt		= print_fmt_##template,			\
-	.define_fields		= ftrace_define_fields_##template,	\
 }
 
 #undef DEFINE_EVENT_PRINT
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index e5e5f48..25087c3 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -25,6 +25,7 @@ struct syscall_metadata {
 	int		nb_args;
 	const char	**types;
 	const char	**args;
+	struct list_head fields;
 
 	struct ftrace_event_call *enter_event;
 	struct ftrace_event_call *exit_event;
@@ -34,8 +35,6 @@ struct syscall_metadata {
 extern unsigned long arch_syscall_addr(int nr);
 extern int init_syscall_trace(struct ftrace_event_call *call);
 
-extern int syscall_enter_define_fields(struct ftrace_event_call *call);
-extern int syscall_exit_define_fields(struct ftrace_event_call *call);
 extern int reg_event_syscall_enter(struct ftrace_event_call *call);
 extern void unreg_event_syscall_enter(struct ftrace_event_call *call);
 extern int reg_event_syscall_exit(struct ftrace_event_call *call);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 2825ef2..ff63bee 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -771,6 +771,9 @@ extern void print_subsystem_event_filter(struct event_subsystem *system,
 					 struct trace_seq *s);
 extern int filter_assign_type(const char *type);
 
+struct list_head *
+trace_get_fields(struct ftrace_event_call *event_call);
+
 static inline int
 filter_check_discard(struct ftrace_event_call *call, void *rec,
 		     struct ring_buffer *buffer,
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index f84cfcb..c31632e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -28,11 +28,28 @@ DEFINE_MUTEX(event_mutex);
 
 LIST_HEAD(ftrace_events);
 
+static int fields_done(struct ftrace_event_call *event_call)
+{
+	return 0;
+}
+
+struct list_head *
+trace_get_fields(struct ftrace_event_call *event_call)
+{
+	if (!event_call->class->get_fields)
+		return &event_call->class->fields;
+	return event_call->class->get_fields(event_call);
+}
+
 int trace_define_field(struct ftrace_event_call *call, const char *type,
 		       const char *name, int offset, int size, int is_signed,
 		       int filter_type)
 {
 	struct ftrace_event_field *field;
+	struct list_head *head;
+
+	if (WARN_ON(!call->class) || call->class->define_fields == fields_done)
+		return 0;
 
 	field = kzalloc(sizeof(*field), GFP_KERNEL);
 	if (!field)
@@ -55,7 +72,8 @@ int trace_define_field(struct ftrace_event_call *call, const char *type,
 	field->size = size;
 	field->is_signed = is_signed;
 
-	list_add(&field->link, &call->fields);
+	head = trace_get_fields(call);
+	list_add(&field->link, head);
 
 	return 0;
 
@@ -81,6 +99,9 @@ static int trace_define_common_fields(struct ftrace_event_call *call)
 	int ret;
 	struct trace_entry ent;
 
+	if (call->class->define_fields == fields_done)
+		return 0;
+
 	__common_field(unsigned short, type);
 	__common_field(unsigned char, flags);
 	__common_field(unsigned char, preempt_count);
@@ -93,8 +114,10 @@ static int trace_define_common_fields(struct ftrace_event_call *call)
 void trace_destroy_fields(struct ftrace_event_call *call)
 {
 	struct ftrace_event_field *field, *next;
+	struct list_head *head;
 
-	list_for_each_entry_safe(field, next, &call->fields, link) {
+	head = trace_get_fields(call);
+	list_for_each_entry_safe(field, next, head, link) {
 		list_del(&field->link);
 		kfree(field->type);
 		kfree(field->name);
@@ -110,7 +133,6 @@ int trace_event_raw_init(struct ftrace_event_call *call)
 	if (!id)
 		return -ENODEV;
 	call->id = id;
-	INIT_LIST_HEAD(&call->fields);
 
 	return 0;
 }
@@ -536,6 +558,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 {
 	struct ftrace_event_call *call = filp->private_data;
 	struct ftrace_event_field *field;
+	struct list_head *head;
 	struct trace_seq *s;
 	int common_field_count = 5;
 	char *buf;
@@ -554,7 +577,8 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 	trace_seq_printf(s, "ID: %d\n", call->id);
 	trace_seq_printf(s, "format:\n");
 
-	list_for_each_entry_reverse(field, &call->fields, link) {
+	head = trace_get_fields(call);
+	list_for_each_entry_reverse(field, head, link) {
 		/*
 		 * Smartly shows the array type(except dynamic array).
 		 * Normal:
@@ -954,10 +978,10 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 		trace_create_file("id", 0444, call->dir, call,
 		 		  id);
 
-	if (call->define_fields) {
+	if (call->class->define_fields) {
 		ret = trace_define_common_fields(call);
 		if (!ret)
-			ret = call->define_fields(call);
+			ret = call->class->define_fields(call);
 		if (ret < 0) {
 			pr_warning("Could not initialize trace point"
 				   " events/%s\n", call->name);
@@ -965,6 +989,13 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 		}
 		trace_create_file("filter", 0644, call->dir, call,
 				  filter);
+
+		/*
+		 * Other events with the same class will call
+		 * define fields again, Set the define_fields
+		 * to a stub, and it will be skipped.
+		 */
+		call->class->define_fields = fields_done;
 	}
 
 	trace_create_file("format", 0444, call->dir, call,
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 22fa89f..560683d 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -499,8 +499,10 @@ static struct ftrace_event_field *
 find_event_field(struct ftrace_event_call *call, char *name)
 {
 	struct ftrace_event_field *field;
+	struct list_head *head;
 
-	list_for_each_entry(field, &call->fields, link) {
+	head = trace_get_fields(call);
+	list_for_each_entry(field, head, link) {
 		if (!strcmp(field->name, name))
 			return field;
 	}
@@ -624,7 +626,7 @@ static int init_subsystem_preds(struct event_subsystem *system)
 	int err;
 
 	list_for_each_entry(call, &ftrace_events, list) {
-		if (!call->define_fields)
+		if (!call->class || !call->class->define_fields)
 			continue;
 
 		if (strcmp(call->class->system, system->name) != 0)
@@ -643,7 +645,7 @@ static void filter_free_subsystem_preds(struct event_subsystem *system)
 	struct ftrace_event_call *call;
 
 	list_for_each_entry(call, &ftrace_events, list) {
-		if (!call->define_fields)
+		if (!call->class || !call->class->define_fields)
 			continue;
 
 		if (strcmp(call->class->system, system->name) != 0)
@@ -1248,7 +1250,7 @@ static int replace_system_preds(struct event_subsystem *system,
 	list_for_each_entry(call, &ftrace_events, list) {
 		struct event_filter *filter = call->filter;
 
-		if (!call->define_fields)
+		if (!call->class || !call->class->define_fields)
 			continue;
 
 		if (strcmp(call->class->system, system->name) != 0)
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 7f16e21..e700a0c 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -18,10 +18,6 @@
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM	ftrace
 
-struct ftrace_event_class event_class_ftrace = {
-	.system			= __stringify(TRACE_SYSTEM),
-};
-
 /* not needed for this file */
 #undef __field_struct
 #define __field_struct(type, item)
@@ -131,7 +127,7 @@ ftrace_define_fields_##name(struct ftrace_event_call *event_call)	\
 
 static int ftrace_raw_init_event(struct ftrace_event_call *call)
 {
-	INIT_LIST_HEAD(&call->fields);
+	INIT_LIST_HEAD(&call->class->fields);
 	return 0;
 }
 
@@ -159,15 +155,19 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
 #undef FTRACE_ENTRY
 #define FTRACE_ENTRY(call, struct_name, type, tstruct, print)		\
 									\
+struct ftrace_event_class event_class_ftrace_##call = {			\
+	.system			= __stringify(TRACE_SYSTEM),		\
+	.define_fields		= ftrace_define_fields_##call,		\
+};									\
+									\
 struct ftrace_event_call __used						\
 __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.id			= type,					\
-	.class			= &event_class_ftrace,			\
+	.class			= &event_class_ftrace_##call,		\
 	.raw_init		= ftrace_raw_init_event,		\
 	.print_fmt		= print,				\
-	.define_fields		= ftrace_define_fields_##call,		\
 };									\
 
 #include "trace_entries.h"
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f8af21a..b14bf74 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1112,8 +1112,6 @@ static void probe_event_disable(struct ftrace_event_call *call)
 
 static int probe_event_raw_init(struct ftrace_event_call *event_call)
 {
-	INIT_LIST_HEAD(&event_call->fields);
-
 	return 0;
 }
 
@@ -1362,11 +1360,13 @@ static int register_probe_event(struct trace_probe *tp)
 	if (probe_is_return(tp)) {
 		tp->event.trace = print_kretprobe_event;
 		call->raw_init = probe_event_raw_init;
-		call->define_fields = kretprobe_event_define_fields;
+		INIT_LIST_HEAD(&call->class->fields);
+		call->class->define_fields = kretprobe_event_define_fields;
 	} else {
 		tp->event.trace = print_kprobe_event;
 		call->raw_init = probe_event_raw_init;
-		call->define_fields = kprobe_event_define_fields;
+		INIT_LIST_HEAD(&call->class->fields);
+		call->class->define_fields = kprobe_event_define_fields;
 	}
 	if (set_print_fmt(tp) < 0)
 		return -ENOMEM;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index c92934d..eb535ba 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -19,14 +19,29 @@ static int syscall_enter_register(struct ftrace_event_call *event,
 static int syscall_exit_register(struct ftrace_event_call *event,
 				 enum trace_reg type);
 
+static int syscall_enter_define_fields(struct ftrace_event_call *call);
+static int syscall_exit_define_fields(struct ftrace_event_call *call);
+
+static struct list_head *
+syscall_get_fields(struct ftrace_event_call *call)
+{
+	struct syscall_metadata *entry = call->data;
+
+	return &entry->fields;
+}
+
 struct ftrace_event_class event_class_syscall_enter = {
 	.system			= "syscalls",
-	.reg			= syscall_enter_register
+	.reg			= syscall_enter_register,
+	.define_fields		= syscall_enter_define_fields,
+	.get_fields		= syscall_get_fields,
 };
 
 struct ftrace_event_class event_class_syscall_exit = {
 	.system			= "syscalls",
-	.reg			= syscall_exit_register
+	.reg			= syscall_exit_register,
+	.define_fields		= syscall_exit_define_fields,
+	.get_fields		= syscall_get_fields,
 };
 
 extern unsigned long __start_syscalls_metadata[];
@@ -219,7 +234,7 @@ static void free_syscall_print_fmt(struct ftrace_event_call *call)
 		kfree(call->print_fmt);
 }
 
-int syscall_enter_define_fields(struct ftrace_event_call *call)
+static int syscall_enter_define_fields(struct ftrace_event_call *call)
 {
 	struct syscall_trace_enter trace;
 	struct syscall_metadata *meta = call->data;
@@ -242,7 +257,7 @@ int syscall_enter_define_fields(struct ftrace_event_call *call)
 	return ret;
 }
 
-int syscall_exit_define_fields(struct ftrace_event_call *call)
+static int syscall_exit_define_fields(struct ftrace_event_call *call)
 {
 	struct syscall_trace_exit trace;
 	int ret;
-- 
1.7.0




* [PATCH 06/10][RFC] tracing: Move raw_init from events to class
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (4 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 05/10][RFC] tracing: Move fields from event to class structure Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 21:00   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 07/10][RFC] tracing: Allow events to share their print functions Steven Rostedt
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0006-tracing-Move-raw_init-from-events-to-class.patch --]
[-- Type: text/plain, Size: 8156 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The raw_init function pointer in the event is used to initialize
various kinds of events. The type of initialization needed usually
depends on the kind of event.

Two events with the same class will always have the same initialization
function, so it makes sense to move this to the class structure.

Perhaps even making a special system structure would work since
the initialization is the same for all events within a system.
But since there's no system structure (yet), this will just move it
to the class.
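
A small userspace sketch of the idea, under simplifying assumptions:
the structs below are reduced stand-ins rather than the kernel
definitions, and next_id, generic_raw_init() and add_event() are
hypothetical names used only for illustration. The point is that the
init pointer is stored once per class and reached through the event's
class pointer.

```c
#include <assert.h>
#include <stddef.h>

struct event_call;

/* The init hook lives in the class: one pointer per class, not per event. */
struct event_class {
	int (*raw_init)(struct event_call *call);
};

struct event_call {
	const char *name;
	struct event_class *class;
	int id;				/* filled in by the init hook */
};

static int next_id = 1;			/* hypothetical id allocator */

/* Every event of the class is initialized by the same function. */
static int generic_raw_init(struct event_call *call)
{
	call->id = next_id++;
	return 0;
}

/* Registration reaches the init hook through the class. */
static int add_event(struct event_call *call)
{
	assert(call->class != NULL);
	if (!call->name)
		return -1;
	if (call->class->raw_init)
		return call->class->raw_init(call);
	return 0;
}
```

Each event that shares the class drops one function pointer from its
own structure, which is where the data savings would come from.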

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5774567	1297492	9351592	16423651	 fa9ae3	vmlinux.fields
5774510	1293204	9351592	16419306	 fa89ea	vmlinux.init

The text size changed very slightly, by a constant amount that comes
from changing the C files that call the init code.
The bigger savings is in the data, and it grows the more events share
a class.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h  |    2 +-
 include/linux/syscalls.h      |    2 --
 include/trace/ftrace.h        |    8 ++++----
 kernel/trace/trace_events.c   |   12 ++++++------
 kernel/trace/trace_export.c   |    2 +-
 kernel/trace/trace_kprobe.c   |    6 +++---
 kernel/trace/trace_syscalls.c |    2 ++
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 1e2c8f5..655de69 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -131,6 +131,7 @@ struct ftrace_event_class {
 	int			(*define_fields)(struct ftrace_event_call *);
 	struct list_head	*(*get_fields)(struct ftrace_event_call *);
 	struct list_head	fields;
+	int			(*raw_init)(struct ftrace_event_call *);
 };
 
 struct ftrace_event_call {
@@ -142,7 +143,6 @@ struct ftrace_event_call {
 	int			enabled;
 	int			id;
 	const char		*print_fmt;
-	int			(*raw_init)(struct ftrace_event_call *);
 	int			filter_active;
 	struct event_filter	*filter;
 	void			*mod;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ef4f81c..a0db1e8 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -135,7 +135,6 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.name                   = "sys_enter"#sname,		\
 		.class			= &event_class_syscall_enter,	\
 		.event                  = &enter_syscall_print_##sname,	\
-		.raw_init		= init_syscall_trace,		\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
@@ -153,7 +152,6 @@ extern struct ftrace_event_class event_class_syscall_exit;
 		.name                   = "sys_exit"#sname,		\
 		.class			= &event_class_syscall_exit,	\
 		.event                  = &exit_syscall_print_##sname,	\
-		.raw_init		= init_syscall_trace,		\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index e6ec392..de0d96c 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -430,8 +430,9 @@ static inline notrace int ftrace_get_offsets_##call(			\
  * static struct ftrace_event_class __used event_class_<template> = {
  *	.system			= "<system>",
  *	.define_fields		= ftrace_define_fields_<call>,
-	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
-	.probe			= ftrace_raw_event_##call,		\
+ *	.fields			= LIST_HEAD_INIT(event_class_##call.fields),
+ *	.raw_init		= trace_event_raw_init,
+ *	.probe			= ftrace_raw_event_##call,
  * }
  *
  * static struct ftrace_event_call __used
@@ -439,7 +440,6 @@ static inline notrace int ftrace_get_offsets_##call(			\
  * __attribute__((section("_ftrace_events"))) event_<call> = {
  *	.name			= "<call>",
  *	.class			= event_class_<template>,
- *	.raw_init		= trace_event_raw_init,
  *	.event			= &ftrace_event_type_<call>,
  *	.print_fmt		= print_fmt_<call>,
  * }
@@ -555,6 +555,7 @@ static struct ftrace_event_class __used event_class_##call = {		\
 	.system			= __stringify(TRACE_SYSTEM),		\
 	.define_fields		= ftrace_define_fields_##call,		\
 	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
+	.raw_init		= trace_event_raw_init,			\
 	.probe			= ftrace_raw_event_##call,		\
 	_TRACE_PERF_INIT(call)						\
 }
@@ -568,7 +569,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.class			= &event_class_##template,		\
 	.event			= &ftrace_event_type_##call,		\
-	.raw_init		= trace_event_raw_init,			\
 	.print_fmt		= print_fmt_##template,			\
 }
 
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index c31632e..c34a9bd 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1012,8 +1012,8 @@ static int __trace_add_event_call(struct ftrace_event_call *call)
 	if (!call->name)
 		return -EINVAL;
 
-	if (call->raw_init) {
-		ret = call->raw_init(call);
+	if (call->class->raw_init) {
+		ret = call->class->raw_init(call);
 		if (ret < 0) {
 			if (ret != -ENOSYS)
 				pr_warning("Could not initialize trace "
@@ -1174,8 +1174,8 @@ static void trace_module_add_events(struct module *mod)
 		/* The linker may leave blanks */
 		if (!call->name)
 			continue;
-		if (call->raw_init) {
-			ret = call->raw_init(call);
+		if (call->class->raw_init) {
+			ret = call->class->raw_init(call);
 			if (ret < 0) {
 				if (ret != -ENOSYS)
 					pr_warning("Could not initialize trace "
@@ -1328,8 +1328,8 @@ static __init int event_trace_init(void)
 		/* The linker may leave blanks */
 		if (!call->name)
 			continue;
-		if (call->raw_init) {
-			ret = call->raw_init(call);
+		if (call->class->raw_init) {
+			ret = call->class->raw_init(call);
 			if (ret < 0) {
 				if (ret != -ENOSYS)
 					pr_warning("Could not initialize trace "
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index e700a0c..e878d06 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -158,6 +158,7 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
 struct ftrace_event_class event_class_ftrace_##call = {			\
 	.system			= __stringify(TRACE_SYSTEM),		\
 	.define_fields		= ftrace_define_fields_##call,		\
+	.raw_init		= ftrace_raw_init_event,		\
 };									\
 									\
 struct ftrace_event_call __used						\
@@ -166,7 +167,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.id			= type,					\
 	.class			= &event_class_ftrace_##call,		\
-	.raw_init		= ftrace_raw_init_event,		\
 	.print_fmt		= print,				\
 };									\
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index b14bf74..428f4a5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1359,13 +1359,13 @@ static int register_probe_event(struct trace_probe *tp)
 	/* Initialize ftrace_event_call */
 	if (probe_is_return(tp)) {
 		tp->event.trace = print_kretprobe_event;
-		call->raw_init = probe_event_raw_init;
 		INIT_LIST_HEAD(&call->class->fields);
+		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kretprobe_event_define_fields;
 	} else {
-		tp->event.trace = print_kprobe_event;
-		call->raw_init = probe_event_raw_init;
 		INIT_LIST_HEAD(&call->class->fields);
+		tp->event.trace = print_kprobe_event;
+		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kprobe_event_define_fields;
 	}
 	if (set_print_fmt(tp) < 0)
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index eb535ba..7ee6086 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -35,6 +35,7 @@ struct ftrace_event_class event_class_syscall_enter = {
 	.reg			= syscall_enter_register,
 	.define_fields		= syscall_enter_define_fields,
 	.get_fields		= syscall_get_fields,
+	.raw_init		= init_syscall_trace,
 };
 
 struct ftrace_event_class event_class_syscall_exit = {
@@ -42,6 +43,7 @@ struct ftrace_event_class event_class_syscall_exit = {
 	.reg			= syscall_exit_register,
 	.define_fields		= syscall_exit_define_fields,
 	.get_fields		= syscall_get_fields,
+	.raw_init		= init_syscall_trace,
 };
 
 extern unsigned long __start_syscalls_metadata[];
-- 
1.7.0




* [PATCH 07/10][RFC] tracing: Allow events to share their print functions
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (5 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 06/10][RFC] tracing: Move raw_init from events to class Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 21:03   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 08/10][RFC] tracing: Move print functions into event class Steven Rostedt
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0007-tracing-Allow-events-to-share-their-print-functions.patch --]
[-- Type: text/plain, Size: 26832 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Multiple events may use the same method to print their data.
Instead of every event holding pointers to its own print functions,
the trace_event structure now points to a trace_event_functions structure
that holds the functions used to print out the event.

The event itself is now passed to the print function to let the print
function know what kind of event it should print.

This opens the door to consolidating the way several events print
their output.
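
The sharing can be sketched in userspace roughly as follows. This is
an illustration under assumptions, not the kernel code: the callback
signature is simplified (the real one takes a trace_iterator and flags
and returns enum print_line_t), and the name member, print_generic()
and print_event() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct trace_event;

/* Simplified print callback; the event itself is passed in. */
typedef int (*print_func)(char *buf, size_t len, struct trace_event *event);

/* The function table that several events can share. */
struct trace_event_functions {
	print_func trace;
};

struct trace_event {
	int type;
	const char *name;		/* hypothetical, for illustration only */
	struct trace_event_functions *funcs;
};

/* One print function that uses the passed event to decide what to print. */
static int print_generic(char *buf, size_t len, struct trace_event *event)
{
	return snprintf(buf, len, "%s (type %d)", event->name, event->type);
}

/* A single table referenced by both events below. */
static struct trace_event_functions shared_funcs = {
	.trace = print_generic,
};

static struct trace_event event_a = {
	.type = 1, .name = "event_a", .funcs = &shared_funcs,
};

static struct trace_event event_b = {
	.type = 2, .name = "event_b", .funcs = &shared_funcs,
};

/* Dispatch through the table; returns -1 if no print function is set. */
static int print_event(char *buf, size_t len, struct trace_event *event)
{
	assert(event != NULL);
	if (!event->funcs || !event->funcs->trace)
		return -1;
	return event->funcs->trace(buf, len, event);
}
```

Both events point at the same table, so adding another event of the
same kind costs one pointer instead of a full set of print callbacks.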

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h         |   17 +++-
 include/linux/syscalls.h             |   10 ++-
 include/trace/ftrace.h               |   12 ++-
 include/trace/syscall.h              |    6 +-
 kernel/trace/blktrace.c              |   13 ++-
 kernel/trace/kmemtrace.c             |   28 +++++--
 kernel/trace/trace.c                 |    9 +-
 kernel/trace/trace_functions_graph.c |    2 +-
 kernel/trace/trace_kprobe.c          |   22 ++++--
 kernel/trace/trace_output.c          |  137 +++++++++++++++++++++++-----------
 kernel/trace/trace_output.h          |    2 +-
 kernel/trace/trace_syscalls.c        |    6 +-
 12 files changed, 178 insertions(+), 86 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 655de69..09c2ad7 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -70,18 +70,25 @@ struct trace_iterator {
 };
 
 
+struct trace_event;
+
 typedef enum print_line_t (*trace_print_func)(struct trace_iterator *iter,
-					      int flags);
-struct trace_event {
-	struct hlist_node	node;
-	struct list_head	list;
-	int			type;
+				      int flags, struct trace_event *event);
+
+struct trace_event_functions {
 	trace_print_func	trace;
 	trace_print_func	raw;
 	trace_print_func	hex;
 	trace_print_func	binary;
 };
 
+struct trace_event {
+	struct hlist_node		node;
+	struct list_head		list;
+	int				type;
+	struct trace_event_functions	*funcs;
+};
+
 extern int register_ftrace_event(struct trace_event *event);
 extern int unregister_ftrace_event(struct trace_event *event);
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a0db1e8..f3892e9 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -125,9 +125,12 @@ extern struct ftrace_event_class event_class_syscall_exit;
 	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_enter_##sname;		\
-	static struct trace_event enter_syscall_print_##sname = {	\
+	static struct trace_event_functions enter_syscall_print_funcs_##sname = { \
 		.trace                  = print_syscall_enter,		\
 	};								\
+	static struct trace_event enter_syscall_print_##sname = {	\
+		.funcs                  = &enter_syscall_print_funcs_##sname, \
+	};								\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
@@ -142,9 +145,12 @@ extern struct ftrace_event_class event_class_syscall_exit;
 	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_exit_##sname;		\
-	static struct trace_event exit_syscall_print_##sname = {	\
+	static struct trace_event_functions exit_syscall_print_funcs_##sname = { \
 		.trace                  = print_syscall_exit,		\
 	};								\
+	static struct trace_event exit_syscall_print_##sname = {	\
+		.funcs                  = &exit_syscall_print_funcs_##sname, \
+	};								\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index de0d96c..2efb301 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -239,7 +239,8 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args)			\
 static notrace enum print_line_t					\
-ftrace_raw_output_##name(struct trace_iterator *iter, int flags)	\
+ftrace_raw_output_##name(struct trace_iterator *iter, int flags,	\
+			 struct trace_event *event)			\
 {									\
 	return ftrace_raw_output_id_##template(event_##name.id,		\
 					       #name, iter, flags);	\
@@ -248,7 +249,8 @@ ftrace_raw_output_##name(struct trace_iterator *iter, int flags)	\
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, call, proto, args, print)		\
 static notrace enum print_line_t					\
-ftrace_raw_output_##call(struct trace_iterator *iter, int flags)	\
+ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
+			 struct trace_event *event)			\
 {									\
 	struct trace_seq *s = &iter->seq;				\
 	struct ftrace_raw_##template *field;				\
@@ -525,9 +527,11 @@ ftrace_raw_event_##call(proto,						\
 
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args)			\
-									\
-static struct trace_event ftrace_event_type_##call = {			\
+static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
 	.trace			= ftrace_raw_output_##call,		\
+};									\
+static struct trace_event ftrace_event_type_##call = {			\
+	.funcs			= &ftrace_event_type_funcs_##call,	\
 };
 
 #undef DEFINE_EVENT_PRINT
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 25087c3..f0eaa45 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -41,8 +41,10 @@ extern int reg_event_syscall_exit(struct ftrace_event_call *call);
 extern void unreg_event_syscall_exit(struct ftrace_event_call *call);
 extern int
 ftrace_format_syscall(struct ftrace_event_call *call, struct trace_seq *s);
-enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
-enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags);
+enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags,
+				      struct trace_event *event);
+enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags,
+				     struct trace_event *event);
 #endif
 
 #ifdef CONFIG_PERF_EVENTS
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 07f945a..2737c70 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1320,7 +1320,7 @@ out:
 }
 
 static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
-					       int flags)
+					       int flags, struct trace_event *event)
 {
 	return print_one_line(iter, false);
 }
@@ -1342,7 +1342,8 @@ static int blk_trace_synthesize_old_trace(struct trace_iterator *iter)
 }
 
 static enum print_line_t
-blk_trace_event_print_binary(struct trace_iterator *iter, int flags)
+blk_trace_event_print_binary(struct trace_iterator *iter, int flags,
+			     struct trace_event *event)
 {
 	return blk_trace_synthesize_old_trace(iter) ?
 			TRACE_TYPE_HANDLED : TRACE_TYPE_PARTIAL_LINE;
@@ -1380,12 +1381,16 @@ static struct tracer blk_tracer __read_mostly = {
 	.set_flag	= blk_tracer_set_flag,
 };
 
-static struct trace_event trace_blk_event = {
-	.type		= TRACE_BLK,
+static struct trace_event_functions trace_blk_event_funcs = {
 	.trace		= blk_trace_event_print,
 	.binary		= blk_trace_event_print_binary,
 };
 
+static struct trace_event trace_blk_event = {
+	.type		= TRACE_BLK,
+	.funcs		= &trace_blk_event_funcs,
+};
+
 static int __init init_blk_tracer(void)
 {
 	if (!register_ftrace_event(&trace_blk_event)) {
diff --git a/kernel/trace/kmemtrace.c b/kernel/trace/kmemtrace.c
index a91da69..6a24fe0 100644
--- a/kernel/trace/kmemtrace.c
+++ b/kernel/trace/kmemtrace.c
@@ -237,7 +237,8 @@ struct kmemtrace_user_event_alloc {
 };
 
 static enum print_line_t
-kmemtrace_print_alloc(struct trace_iterator *iter, int flags)
+kmemtrace_print_alloc(struct trace_iterator *iter, int flags,
+		      struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct kmemtrace_alloc_entry *entry;
@@ -257,7 +258,8 @@ kmemtrace_print_alloc(struct trace_iterator *iter, int flags)
 }
 
 static enum print_line_t
-kmemtrace_print_free(struct trace_iterator *iter, int flags)
+kmemtrace_print_free(struct trace_iterator *iter, int flags,
+		     struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct kmemtrace_free_entry *entry;
@@ -275,7 +277,8 @@ kmemtrace_print_free(struct trace_iterator *iter, int flags)
 }
 
 static enum print_line_t
-kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags)
+kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags,
+			   struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct kmemtrace_alloc_entry *entry;
@@ -309,7 +312,8 @@ kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags)
 }
 
 static enum print_line_t
-kmemtrace_print_free_user(struct trace_iterator *iter, int flags)
+kmemtrace_print_free_user(struct trace_iterator *iter, int flags,
+			  struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct kmemtrace_free_entry *entry;
@@ -463,18 +467,26 @@ static enum print_line_t kmemtrace_print_line(struct trace_iterator *iter)
 	}
 }
 
-static struct trace_event kmem_trace_alloc = {
-	.type			= TRACE_KMEM_ALLOC,
+static struct trace_event_functions kmem_trace_alloc_funcs = {
 	.trace			= kmemtrace_print_alloc,
 	.binary			= kmemtrace_print_alloc_user,
 };
 
-static struct trace_event kmem_trace_free = {
-	.type			= TRACE_KMEM_FREE,
+static struct trace_event kmem_trace_alloc = {
+	.type			= TRACE_KMEM_ALLOC,
+	.funcs			= &kmem_trace_alloc_funcs,
+};
+
+static struct trace_event_functions kmem_trace_free_funcs = {
 	.trace			= kmemtrace_print_free,
 	.binary			= kmemtrace_print_free_user,
 };
 
+static struct trace_event kmem_trace_free = {
+	.type			= TRACE_KMEM_FREE,
+	.funcs			= &kmem_trace_free_funcs,
+};
+
 static struct tracer kmem_tracer __read_mostly = {
 	.name			= "kmemtrace",
 	.init			= kmem_trace_init,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b9be232..427e074 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1924,7 +1924,7 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
 	}
 
 	if (event)
-		return event->trace(iter, sym_flags);
+		return event->funcs->trace(iter, sym_flags, event);
 
 	if (!trace_seq_printf(s, "Unknown type %d\n", entry->type))
 		goto partial;
@@ -1950,7 +1950,7 @@ static enum print_line_t print_raw_fmt(struct trace_iterator *iter)
 
 	event = ftrace_find_event(entry->type);
 	if (event)
-		return event->raw(iter, 0);
+		return event->funcs->raw(iter, 0, event);
 
 	if (!trace_seq_printf(s, "%d ?\n", entry->type))
 		goto partial;
@@ -1977,7 +1977,7 @@ static enum print_line_t print_hex_fmt(struct trace_iterator *iter)
 
 	event = ftrace_find_event(entry->type);
 	if (event) {
-		enum print_line_t ret = event->hex(iter, 0);
+		enum print_line_t ret = event->funcs->hex(iter, 0, event);
 		if (ret != TRACE_TYPE_HANDLED)
 			return ret;
 	}
@@ -2002,7 +2002,8 @@ static enum print_line_t print_bin_fmt(struct trace_iterator *iter)
 	}
 
 	event = ftrace_find_event(entry->type);
-	return event ? event->binary(iter, 0) : TRACE_TYPE_HANDLED;
+	return event ? event->funcs->binary(iter, 0, event) :
+		TRACE_TYPE_HANDLED;
 }
 
 static int trace_empty(struct trace_iterator *iter)
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index a7f75fb..c620763 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -1020,7 +1020,7 @@ print_graph_comment(struct trace_seq *s,  struct trace_entry *ent,
 		if (!event)
 			return TRACE_TYPE_UNHANDLED;
 
-		ret = event->trace(iter, sym_flags);
+		ret = event->funcs->trace(iter, sym_flags, event);
 		if (ret != TRACE_TYPE_HANDLED)
 			return ret;
 	}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 428f4a5..b989ae2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1011,16 +1011,15 @@ static __kprobes void kretprobe_trace_func(struct kretprobe_instance *ri,
 
 /* Event entry printers */
 enum print_line_t
-print_kprobe_event(struct trace_iterator *iter, int flags)
+print_kprobe_event(struct trace_iterator *iter, int flags,
+		   struct trace_event *event)
 {
 	struct kprobe_trace_entry *field;
 	struct trace_seq *s = &iter->seq;
-	struct trace_event *event;
 	struct trace_probe *tp;
 	int i;
 
 	field = (struct kprobe_trace_entry *)iter->ent;
-	event = ftrace_find_event(field->ent.type);
 	tp = container_of(event, struct trace_probe, event);
 
 	if (!trace_seq_printf(s, "%s: (", tp->call.name))
@@ -1046,16 +1045,15 @@ partial:
 }
 
 enum print_line_t
-print_kretprobe_event(struct trace_iterator *iter, int flags)
+print_kretprobe_event(struct trace_iterator *iter, int flags,
+		      struct trace_event *event)
 {
 	struct kretprobe_trace_entry *field;
 	struct trace_seq *s = &iter->seq;
-	struct trace_event *event;
 	struct trace_probe *tp;
 	int i;
 
 	field = (struct kretprobe_trace_entry *)iter->ent;
-	event = ftrace_find_event(field->ent.type);
 	tp = container_of(event, struct trace_probe, event);
 
 	if (!trace_seq_printf(s, "%s: (", tp->call.name))
@@ -1351,6 +1349,14 @@ int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
 	return 0;	/* We don't tweek kernel, so just return 0 */
 }
 
+static struct trace_event_functions kretprobe_funcs = {
+	.trace		= print_kretprobe_event
+};
+
+static struct trace_event_functions kprobe_funcs = {
+	.trace		= print_kprobe_event
+};
+
 static int register_probe_event(struct trace_probe *tp)
 {
 	struct ftrace_event_call *call = &tp->call;
@@ -1358,13 +1364,13 @@ static int register_probe_event(struct trace_probe *tp)
 
 	/* Initialize ftrace_event_call */
 	if (probe_is_return(tp)) {
-		tp->event.trace = print_kretprobe_event;
+		tp->event.funcs = &kretprobe_funcs;
 		INIT_LIST_HEAD(&call->class->fields);
 		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kretprobe_event_define_fields;
 	} else {
 		INIT_LIST_HEAD(&call->class->fields);
-		tp->event.trace = print_kprobe_event;
+		tp->event.funcs = &kprobe_funcs;
 		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kprobe_event_define_fields;
 	}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 8e46b33..9c00283 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -726,6 +726,9 @@ int register_ftrace_event(struct trace_event *event)
 	if (WARN_ON(!event))
 		goto out;
 
+	if (WARN_ON(!event->funcs))
+		goto out;
+
 	INIT_LIST_HEAD(&event->list);
 
 	if (!event->type) {
@@ -758,14 +761,14 @@ int register_ftrace_event(struct trace_event *event)
 			goto out;
 	}
 
-	if (event->trace == NULL)
-		event->trace = trace_nop_print;
-	if (event->raw == NULL)
-		event->raw = trace_nop_print;
-	if (event->hex == NULL)
-		event->hex = trace_nop_print;
-	if (event->binary == NULL)
-		event->binary = trace_nop_print;
+	if (event->funcs->trace == NULL)
+		event->funcs->trace = trace_nop_print;
+	if (event->funcs->raw == NULL)
+		event->funcs->raw = trace_nop_print;
+	if (event->funcs->hex == NULL)
+		event->funcs->hex = trace_nop_print;
+	if (event->funcs->binary == NULL)
+		event->funcs->binary = trace_nop_print;
 
 	key = event->type & (EVENT_HASHSIZE - 1);
 
@@ -807,13 +810,15 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_event);
  * Standard events
  */
 
-enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags)
+enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
+				  struct trace_event *event)
 {
 	return TRACE_TYPE_HANDLED;
 }
 
 /* TRACE_FN */
-static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags,
+					struct trace_event *event)
 {
 	struct ftrace_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -840,7 +845,8 @@ static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
-static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags,
+				      struct trace_event *event)
 {
 	struct ftrace_entry *field;
 
@@ -854,7 +860,8 @@ static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_HANDLED;
 }
 
-static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags,
+				      struct trace_event *event)
 {
 	struct ftrace_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -867,7 +874,8 @@ static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_HANDLED;
 }
 
-static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags,
+				      struct trace_event *event)
 {
 	struct ftrace_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -880,14 +888,18 @@ static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_HANDLED;
 }
 
-static struct trace_event trace_fn_event = {
-	.type		= TRACE_FN,
+static struct trace_event_functions trace_fn_funcs = {
 	.trace		= trace_fn_trace,
 	.raw		= trace_fn_raw,
 	.hex		= trace_fn_hex,
 	.binary		= trace_fn_bin,
 };
 
+static struct trace_event trace_fn_event = {
+	.type		= TRACE_FN,
+	.funcs		= &trace_fn_funcs,
+};
+
 /* TRACE_CTX an TRACE_WAKE */
 static enum print_line_t trace_ctxwake_print(struct trace_iterator *iter,
 					     char *delim)
@@ -916,13 +928,14 @@ static enum print_line_t trace_ctxwake_print(struct trace_iterator *iter,
 	return TRACE_TYPE_HANDLED;
 }
 
-static enum print_line_t trace_ctx_print(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_ctx_print(struct trace_iterator *iter, int flags,
+					 struct trace_event *event)
 {
 	return trace_ctxwake_print(iter, "==>");
 }
 
 static enum print_line_t trace_wake_print(struct trace_iterator *iter,
-					  int flags)
+					  int flags, struct trace_event *event)
 {
 	return trace_ctxwake_print(iter, "  +");
 }
@@ -950,12 +963,14 @@ static int trace_ctxwake_raw(struct trace_iterator *iter, char S)
 	return TRACE_TYPE_HANDLED;
 }
 
-static enum print_line_t trace_ctx_raw(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_ctx_raw(struct trace_iterator *iter, int flags,
+				       struct trace_event *event)
 {
 	return trace_ctxwake_raw(iter, 0);
 }
 
-static enum print_line_t trace_wake_raw(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_wake_raw(struct trace_iterator *iter, int flags,
+					struct trace_event *event)
 {
 	return trace_ctxwake_raw(iter, '+');
 }
@@ -984,18 +999,20 @@ static int trace_ctxwake_hex(struct trace_iterator *iter, char S)
 	return TRACE_TYPE_HANDLED;
 }
 
-static enum print_line_t trace_ctx_hex(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_ctx_hex(struct trace_iterator *iter, int flags,
+				       struct trace_event *event)
 {
 	return trace_ctxwake_hex(iter, 0);
 }
 
-static enum print_line_t trace_wake_hex(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_wake_hex(struct trace_iterator *iter, int flags,
+					struct trace_event *event)
 {
 	return trace_ctxwake_hex(iter, '+');
 }
 
 static enum print_line_t trace_ctxwake_bin(struct trace_iterator *iter,
-					   int flags)
+					   int flags, struct trace_event *event)
 {
 	struct ctx_switch_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1012,25 +1029,33 @@ static enum print_line_t trace_ctxwake_bin(struct trace_iterator *iter,
 	return TRACE_TYPE_HANDLED;
 }
 
-static struct trace_event trace_ctx_event = {
-	.type		= TRACE_CTX,
+static struct trace_event_functions trace_ctx_funcs = {
 	.trace		= trace_ctx_print,
 	.raw		= trace_ctx_raw,
 	.hex		= trace_ctx_hex,
 	.binary		= trace_ctxwake_bin,
 };
 
-static struct trace_event trace_wake_event = {
-	.type		= TRACE_WAKE,
+static struct trace_event trace_ctx_event = {
+	.type		= TRACE_CTX,
+	.funcs		= &trace_ctx_funcs,
+};
+
+static struct trace_event_functions trace_wake_funcs = {
 	.trace		= trace_wake_print,
 	.raw		= trace_wake_raw,
 	.hex		= trace_wake_hex,
 	.binary		= trace_ctxwake_bin,
 };
 
+static struct trace_event trace_wake_event = {
+	.type		= TRACE_WAKE,
+	.funcs		= &trace_wake_funcs,
+};
+
 /* TRACE_SPECIAL */
 static enum print_line_t trace_special_print(struct trace_iterator *iter,
-					     int flags)
+					     int flags, struct trace_event *event)
 {
 	struct special_entry *field;
 
@@ -1046,7 +1071,7 @@ static enum print_line_t trace_special_print(struct trace_iterator *iter,
 }
 
 static enum print_line_t trace_special_hex(struct trace_iterator *iter,
-					   int flags)
+					   int flags, struct trace_event *event)
 {
 	struct special_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1061,7 +1086,7 @@ static enum print_line_t trace_special_hex(struct trace_iterator *iter,
 }
 
 static enum print_line_t trace_special_bin(struct trace_iterator *iter,
-					   int flags)
+					   int flags, struct trace_event *event)
 {
 	struct special_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1075,18 +1100,22 @@ static enum print_line_t trace_special_bin(struct trace_iterator *iter,
 	return TRACE_TYPE_HANDLED;
 }
 
-static struct trace_event trace_special_event = {
-	.type		= TRACE_SPECIAL,
+static struct trace_event_functions trace_special_funcs = {
 	.trace		= trace_special_print,
 	.raw		= trace_special_print,
 	.hex		= trace_special_hex,
 	.binary		= trace_special_bin,
 };
 
+static struct trace_event trace_special_event = {
+	.type		= TRACE_SPECIAL,
+	.funcs		= &trace_special_funcs,
+};
+
 /* TRACE_STACK */
 
 static enum print_line_t trace_stack_print(struct trace_iterator *iter,
-					   int flags)
+					   int flags, struct trace_event *event)
 {
 	struct stack_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1114,17 +1143,21 @@ static enum print_line_t trace_stack_print(struct trace_iterator *iter,
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
-static struct trace_event trace_stack_event = {
-	.type		= TRACE_STACK,
+static struct trace_event_functions trace_stack_funcs = {
 	.trace		= trace_stack_print,
 	.raw		= trace_special_print,
 	.hex		= trace_special_hex,
 	.binary		= trace_special_bin,
 };
 
+static struct trace_event trace_stack_event = {
+	.type		= TRACE_STACK,
+	.funcs		= &trace_stack_funcs,
+};
+
 /* TRACE_USER_STACK */
 static enum print_line_t trace_user_stack_print(struct trace_iterator *iter,
-						int flags)
+						int flags, struct trace_event *event)
 {
 	struct userstack_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1143,17 +1176,22 @@ static enum print_line_t trace_user_stack_print(struct trace_iterator *iter,
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
-static struct trace_event trace_user_stack_event = {
-	.type		= TRACE_USER_STACK,
+static struct trace_event_functions trace_user_stack_funcs = {
 	.trace		= trace_user_stack_print,
 	.raw		= trace_special_print,
 	.hex		= trace_special_hex,
 	.binary		= trace_special_bin,
 };
 
+static struct trace_event trace_user_stack_event = {
+	.type		= TRACE_USER_STACK,
+	.funcs		= &trace_user_stack_funcs,
+};
+
 /* TRACE_BPRINT */
 static enum print_line_t
-trace_bprint_print(struct trace_iterator *iter, int flags)
+trace_bprint_print(struct trace_iterator *iter, int flags,
+		   struct trace_event *event)
 {
 	struct trace_entry *entry = iter->ent;
 	struct trace_seq *s = &iter->seq;
@@ -1178,7 +1216,8 @@ trace_bprint_print(struct trace_iterator *iter, int flags)
 
 
 static enum print_line_t
-trace_bprint_raw(struct trace_iterator *iter, int flags)
+trace_bprint_raw(struct trace_iterator *iter, int flags,
+		 struct trace_event *event)
 {
 	struct bprint_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1197,16 +1236,19 @@ trace_bprint_raw(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
+static struct trace_event_functions trace_bprint_funcs = {
+	.trace		= trace_bprint_print,
+	.raw		= trace_bprint_raw,
+};
 
 static struct trace_event trace_bprint_event = {
 	.type		= TRACE_BPRINT,
-	.trace		= trace_bprint_print,
-	.raw		= trace_bprint_raw,
+	.funcs		= &trace_bprint_funcs,
 };
 
 /* TRACE_PRINT */
 static enum print_line_t trace_print_print(struct trace_iterator *iter,
-					   int flags)
+					   int flags, struct trace_event *event)
 {
 	struct print_entry *field;
 	struct trace_seq *s = &iter->seq;
@@ -1225,7 +1267,8 @@ static enum print_line_t trace_print_print(struct trace_iterator *iter,
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
-static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags)
+static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags,
+					 struct trace_event *event)
 {
 	struct print_entry *field;
 
@@ -1240,12 +1283,16 @@ static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_PARTIAL_LINE;
 }
 
-static struct trace_event trace_print_event = {
-	.type	 	= TRACE_PRINT,
+static struct trace_event_functions trace_print_funcs = {
 	.trace		= trace_print_print,
 	.raw		= trace_print_raw,
 };
 
+static struct trace_event trace_print_event = {
+	.type	 	= TRACE_PRINT,
+	.funcs		= &trace_print_funcs,
+};
+
 
 static struct trace_event *events[] __initdata = {
 	&trace_fn_event,
diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h
index 9d91c72..c038eba 100644
--- a/kernel/trace/trace_output.h
+++ b/kernel/trace/trace_output.h
@@ -25,7 +25,7 @@ extern void trace_event_read_unlock(void);
 extern struct trace_event *ftrace_find_event(int type);
 
 extern enum print_line_t trace_nop_print(struct trace_iterator *iter,
-					 int flags);
+					 int flags, struct trace_event *event);
 extern int
 trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry);
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 7ee6086..0bcca08 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -84,7 +84,8 @@ static struct syscall_metadata *syscall_nr_to_meta(int nr)
 }
 
 enum print_line_t
-print_syscall_enter(struct trace_iterator *iter, int flags)
+print_syscall_enter(struct trace_iterator *iter, int flags,
+		    struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *ent = iter->ent;
@@ -136,7 +137,8 @@ end:
 }
 
 enum print_line_t
-print_syscall_exit(struct trace_iterator *iter, int flags)
+print_syscall_exit(struct trace_iterator *iter, int flags,
+		   struct trace_event *event)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *ent = iter->ent;
-- 
1.7.0




* [PATCH 08/10][RFC] tracing: Move print functions into event class
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (6 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 07/10][RFC] tracing: Allow events to share their print functions Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 21:03   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure Steven Rostedt
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0008-tracing-Move-print-functions-into-event-class.patch --]
[-- Type: text/plain, Size: 11707 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Currently, every event has its own trace_event structure. This is
fine since the structure is needed anyway. But the print function
structure (trace_event_functions) is now separate. Since the output
of the trace event is done by the class (with the exception of events
defined by DEFINE_EVENT_PRINT), it makes sense to have the class
define the print functions that all events in the class can use.
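
As a rough user-space illustration of the idea (not the kernel code
itself), the sketch below has two events of one class pointing at a
single function table. The structure layouts are simplified stand-ins,
and class_output(), event_a, event_b and table_bytes_saved() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures have more members. */
struct trace_event_functions {
	int (*trace)(int flags);
	int (*raw)(int flags);
	int (*hex)(int flags);
	int (*binary)(int flags);
};

struct trace_event {
	struct trace_event_functions *funcs;
};

/* One output routine for the whole (hypothetical) class. */
static int class_output(int flags)
{
	return flags;
}

/* One table per class instead of one table per event. */
static struct trace_event_functions class_funcs = {
	.trace = class_output,
};

/* Every event in the class points at the same table. */
static struct trace_event event_a = { .funcs = &class_funcs };
static struct trace_event event_b = { .funcs = &class_funcs };

/* Bytes of function tables avoided when nr_events share one table. */
static size_t table_bytes_saved(size_t nr_events)
{
	if (nr_events == 0)
		return 0;
	return (nr_events - 1) * sizeof(struct trace_event_functions);
}

/* Sanity checks for the sketch above. */
static void self_check(void)
{
	assert(event_a.funcs == event_b.funcs);
	assert(event_a.funcs->trace(3) == 3);
	assert(table_bytes_saved(1) == 0);
}
```

The table itself is only part of what the series saves, so the numbers
from this sketch are not expected to match the vmlinux sizes below.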

This matters most for the syscall events, since all syscall events
use the same class. The savings here are another 37K.

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5774574	1293204	9351592	16419370	 fa8a2a	vmlinux.init
5761154	1268356	9351592	16381102	 f9f4ae	vmlinux.print

To accomplish this, and to let the class know what event is being
printed, the event structure is embedded in the ftrace_event_call
structure. This should not be an issue since the event structure
was created for each event anyway.
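
A minimal user-space sketch of why the embedding helps: with the
trace_event stored inside the call structure, a shared print function
can get back to the owning call from the trace_event pointer it is
handed. The layouts are simplified assumptions, the container_of()
here is a plain offsetof() version without the kernel's type checking,
and class_output(), event_foo and event_bar are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct trace_event;

struct trace_event_functions {
	int (*trace)(char *buf, size_t len, struct trace_event *event);
};

struct trace_event {
	struct trace_event_functions *funcs;
};

struct ftrace_event_call {
	const char *name;
	struct trace_event event;	/* embedded, not a pointer */
	int id;
};

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Shared by all events of the class; finds the call from the event. */
static int class_output(char *buf, size_t len, struct trace_event *trace_event)
{
	struct ftrace_event_call *call;

	call = container_of(trace_event, struct ftrace_event_call, event);
	return snprintf(buf, len, "%s: id=%d", call->name, call->id);
}

static struct trace_event_functions class_funcs = {
	.trace = class_output,
};

static struct ftrace_event_call event_foo = {
	.name = "foo", .event.funcs = &class_funcs, .id = 1,
};

static struct ftrace_event_call event_bar = {
	.name = "bar", .event.funcs = &class_funcs, .id = 2,
};

/* Both events use one function, yet each prints its own name and id. */
static void self_check(void)
{
	char buf[32];

	event_foo.event.funcs->trace(buf, sizeof(buf), &event_foo.event);
	assert(strcmp(buf, "foo: id=1") == 0);
	event_bar.event.funcs->trace(buf, sizeof(buf), &event_bar.event);
	assert(strcmp(buf, "bar: id=2") == 0);
}
```

With a pointer member instead of an embedded structure, the same
lookup would need some other way back from the trace_event to its
call, which is what the per-event wrapper functions provided before.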

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h  |    2 +-
 include/linux/syscalls.h      |   18 +++------------
 include/trace/ftrace.h        |   47 +++++++++++++++++-----------------------
 kernel/trace/trace_events.c   |    6 ++--
 kernel/trace/trace_kprobe.c   |   14 +++++-------
 kernel/trace/trace_syscalls.c |    8 +++++++
 6 files changed, 42 insertions(+), 53 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 09c2ad7..aa3695a 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -146,7 +146,7 @@ struct ftrace_event_call {
 	struct ftrace_event_class *class;
 	char			*name;
 	struct dentry		*dir;
-	struct trace_event	*event;
+	struct trace_event	event;
 	int			enabled;
 	int			id;
 	const char		*print_fmt;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f3892e9..5d060b7 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -120,24 +120,20 @@ struct perf_event_attr;
 
 extern struct ftrace_event_class event_class_syscall_enter;
 extern struct ftrace_event_class event_class_syscall_exit;
+extern struct trace_event_functions enter_syscall_print_funcs;
+extern struct trace_event_functions exit_syscall_print_funcs;
 
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
 	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_enter_##sname;		\
-	static struct trace_event_functions enter_syscall_print_funcs_##sname = { \
-		.trace                  = print_syscall_enter,		\
-	};								\
-	static struct trace_event enter_syscall_print_##sname = {	\
-		.funcs                  = &enter_syscall_print_funcs_##sname, \
-	};								\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
 	  event_enter_##sname = {					\
 		.name                   = "sys_enter"#sname,		\
 		.class			= &event_class_syscall_enter,	\
-		.event                  = &enter_syscall_print_##sname,	\
+		.event.funcs            = &enter_syscall_print_funcs,	\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
@@ -145,19 +141,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
 	static struct syscall_metadata __syscall_meta_##sname;		\
 	static struct ftrace_event_call					\
 	__attribute__((__aligned__(4))) event_exit_##sname;		\
-	static struct trace_event_functions exit_syscall_print_funcs_##sname = { \
-		.trace                  = print_syscall_exit,		\
-	};								\
-	static struct trace_event exit_syscall_print_##sname = {	\
-		.funcs                  = &exit_syscall_print_funcs_##sname, \
-	};								\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
 	  event_exit_##sname = {					\
 		.name                   = "sys_exit"#sname,		\
 		.class			= &event_class_syscall_exit,	\
-		.event                  = &exit_syscall_print_##sname,	\
+		.event.funcs		= &exit_syscall_print_funcs,	\
 		.data			= (void *)&__syscall_meta_##sname,\
 	}
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 2efb301..d7b3b56 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -206,18 +206,22 @@
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace enum print_line_t					\
-ftrace_raw_output_id_##call(int event_id, const char *name,		\
-			    struct trace_iterator *iter, int flags)	\
+ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
+			 struct trace_event *trace_event)		\
 {									\
+	struct ftrace_event_call *event;				\
 	struct trace_seq *s = &iter->seq;				\
 	struct ftrace_raw_##call *field;				\
 	struct trace_entry *entry;					\
 	struct trace_seq *p;						\
 	int ret;							\
 									\
+	event = container_of(trace_event, struct ftrace_event_call,	\
+			     event);					\
+									\
 	entry = iter->ent;						\
 									\
-	if (entry->type != event_id) {					\
+	if (entry->type != event->id) {					\
 		WARN_ON_ONCE(1);					\
 		return TRACE_TYPE_UNHANDLED;				\
 	}								\
@@ -226,7 +230,7 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
 									\
 	p = &get_cpu_var(ftrace_event_seq);				\
 	trace_seq_init(p);						\
-	ret = trace_seq_printf(s, "%s: ", name);			\
+	ret = trace_seq_printf(s, "%s: ", event->name);			\
 	if (ret)							\
 		ret = trace_seq_printf(s, print);			\
 	put_cpu();							\
@@ -234,17 +238,10 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
 		return TRACE_TYPE_PARTIAL_LINE;				\
 									\
 	return TRACE_TYPE_HANDLED;					\
-}
-
-#undef DEFINE_EVENT
-#define DEFINE_EVENT(template, name, proto, args)			\
-static notrace enum print_line_t					\
-ftrace_raw_output_##name(struct trace_iterator *iter, int flags,	\
-			 struct trace_event *event)			\
-{									\
-	return ftrace_raw_output_id_##template(event_##name.id,		\
-					       #name, iter, flags);	\
-}
+}									\
+static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
+	.trace			= ftrace_raw_output_##call,		\
+};
 
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, call, proto, args, print)		\
@@ -277,7 +274,10 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
 		return TRACE_TYPE_PARTIAL_LINE;				\
 									\
 	return TRACE_TYPE_HANDLED;					\
-}
+}									\
+static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
+	.trace			= ftrace_raw_output_##call,		\
+};
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
@@ -526,17 +526,10 @@ ftrace_raw_event_##call(proto,						\
 }
 
 #undef DEFINE_EVENT
-#define DEFINE_EVENT(template, call, proto, args)			\
-static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
-	.trace			= ftrace_raw_output_##call,		\
-};									\
-static struct trace_event ftrace_event_type_##call = {			\
-	.funcs			= &ftrace_event_type_funcs_##call,	\
-};
+#define DEFINE_EVENT(template, call, proto, args)
 
 #undef DEFINE_EVENT_PRINT
-#define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
-	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
+#define DEFINE_EVENT_PRINT(template, name, proto, args, print)
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
@@ -572,7 +565,7 @@ __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.class			= &event_class_##template,		\
-	.event			= &ftrace_event_type_##call,		\
+	.event.funcs		= &ftrace_event_type_funcs_##template,	\
 	.print_fmt		= print_fmt_##template,			\
 }
 
@@ -586,7 +579,7 @@ __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
 	.class			= &event_class_##template,		\
-	.event			= &ftrace_event_type_##call,		\
+	.event.funcs		= &ftrace_event_type_funcs_##call,	\
 	.print_fmt		= print_fmt_##call,			\
 }
 
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index c34a9bd..9aa298e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -129,7 +129,7 @@ int trace_event_raw_init(struct ftrace_event_call *call)
 {
 	int id;
 
-	id = register_ftrace_event(call->event);
+	id = register_ftrace_event(&call->event);
 	if (!id)
 		return -ENODEV;
 	call->id = id;
@@ -1077,8 +1077,8 @@ static void remove_subsystem_dir(const char *name)
 static void __trace_remove_event_call(struct ftrace_event_call *call)
 {
 	ftrace_event_enable_disable(call, 0);
-	if (call->event)
-		__unregister_ftrace_event(call->event);
+	if (call->event.funcs)
+		__unregister_ftrace_event(&call->event);
 	debugfs_remove_recursive(call->dir);
 	list_del(&call->list);
 	trace_destroy_fields(call);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index b989ae2..d8061c3 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -204,7 +204,6 @@ struct trace_probe {
 	const char		*symbol;	/* symbol name */
 	struct ftrace_event_class	class;
 	struct ftrace_event_call	call;
-	struct trace_event		event;
 	unsigned int		nr_args;
 	struct probe_arg	args[];
 };
@@ -1020,7 +1019,7 @@ print_kprobe_event(struct trace_iterator *iter, int flags,
 	int i;
 
 	field = (struct kprobe_trace_entry *)iter->ent;
-	tp = container_of(event, struct trace_probe, event);
+	tp = container_of(event, struct trace_probe, call.event);
 
 	if (!trace_seq_printf(s, "%s: (", tp->call.name))
 		goto partial;
@@ -1054,7 +1053,7 @@ print_kretprobe_event(struct trace_iterator *iter, int flags,
 	int i;
 
 	field = (struct kretprobe_trace_entry *)iter->ent;
-	tp = container_of(event, struct trace_probe, event);
+	tp = container_of(event, struct trace_probe, call.event);
 
 	if (!trace_seq_printf(s, "%s: (", tp->call.name))
 		goto partial;
@@ -1364,20 +1363,19 @@ static int register_probe_event(struct trace_probe *tp)
 
 	/* Initialize ftrace_event_call */
 	if (probe_is_return(tp)) {
-		tp->event.funcs = &kretprobe_funcs;
 		INIT_LIST_HEAD(&call->class->fields);
+		call->event.funcs = &kretprobe_funcs;
 		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kretprobe_event_define_fields;
 	} else {
 		INIT_LIST_HEAD(&call->class->fields);
-		tp->event.funcs = &kprobe_funcs;
+		call->event.funcs = &kprobe_funcs;
 		call->class->raw_init = probe_event_raw_init;
 		call->class->define_fields = kprobe_event_define_fields;
 	}
 	if (set_print_fmt(tp) < 0)
 		return -ENOMEM;
-	call->event = &tp->event;
-	call->id = register_ftrace_event(&tp->event);
+	call->id = register_ftrace_event(&call->event);
 	if (!call->id) {
 		kfree(call->print_fmt);
 		return -ENODEV;
@@ -1389,7 +1387,7 @@ static int register_probe_event(struct trace_probe *tp)
 	if (ret) {
 		pr_info("Failed to register kprobe event: %s\n", call->name);
 		kfree(call->print_fmt);
-		unregister_ftrace_event(&tp->event);
+		unregister_ftrace_event(&call->event);
 	}
 	return ret;
 }
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 0bcca08..a4bed39 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -30,6 +30,14 @@ syscall_get_fields(struct ftrace_event_call *call)
 	return &entry->fields;
 }
 
+struct trace_event_functions enter_syscall_print_funcs = {
+	.trace                  = print_syscall_enter,
+};
+
+struct trace_event_functions exit_syscall_print_funcs = {
+	.trace                  = print_syscall_exit,
+};
+
 struct ftrace_event_class event_class_syscall_enter = {
 	.system			= "syscalls",
 	.reg			= syscall_enter_register,
-- 
1.7.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (7 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 08/10][RFC] tracing: Move print functions into event class Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 21:06   ` Mathieu Desnoyers
  2010-04-26 19:50 ` [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field Steven Rostedt
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0009-tracing-Remove-duplicate-id-information-in-event-str.patch --]
[-- Type: text/plain, Size: 10946 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Now that the trace_event structure is embedded in the ftrace_event_call
structure, there is no need for the ftrace_event_call id field.
The id field is the same as the trace_event type field.

Removing the id and re-arranging the structure brings down the tracepoint
footprint by another 5K.

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5761154	1268356	9351592	16381102	 f9f4ae	vmlinux.print
5761074	1262596	9351592	16375262	 f9ddde	vmlinux.id

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
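Not part of the patch: a minimal userspace sketch of why a separate id
field becomes redundant once struct trace_event is embedded in
struct ftrace_event_call. The structs below are cut-down stand-ins, and
register_event(), call_from_event() and call_id() are hypothetical
helpers for illustration only, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins; the real structs carry many more fields. */
struct trace_event {
	int type;		/* assigned when the event is registered */
};

struct ftrace_event_call {
	const char *name;
	struct trace_event event;	/* embedded, not a pointer */
	/* no separate "id" field: event.type already holds it */
};

/* Hypothetical registration helper: hands out increasing type numbers. */
static int next_type = 1;

static int register_event(struct trace_event *event)
{
	assert(event != NULL);
	event->type = next_type++;
	return event->type;
}

/*
 * Recover the enclosing call from the embedded event, the way a print
 * callback can when it is handed only a struct trace_event pointer.
 */
static struct ftrace_event_call *call_from_event(struct trace_event *event)
{
	return container_of(event, struct ftrace_event_call, event);
}

/* What used to be call->id is now read through the embedded event. */
static int call_id(const struct ftrace_event_call *call)
{
	return call->event.type;
}
```

The same pointer arithmetic is what lets the kprobe print functions use
container_of(event, struct trace_probe, call.event) after this series.
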
 include/linux/ftrace_event.h       |    5 ++---
 include/trace/ftrace.h             |   12 ++++++------
 kernel/trace/trace_event_perf.c    |    4 ++--
 kernel/trace/trace_events.c        |    7 +++----
 kernel/trace/trace_events_filter.c |    2 +-
 kernel/trace/trace_export.c        |    4 ++--
 kernel/trace/trace_kprobe.c        |   18 ++++++++++--------
 kernel/trace/trace_syscalls.c      |   14 ++++++++------
 8 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index aa3695a..b26507f 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -147,14 +147,13 @@ struct ftrace_event_call {
 	char			*name;
 	struct dentry		*dir;
 	struct trace_event	event;
-	int			enabled;
-	int			id;
 	const char		*print_fmt;
-	int			filter_active;
 	struct event_filter	*filter;
 	void			*mod;
 	void			*data;
 
+	int			enabled;
+	int			filter_active;
 	int			perf_refcount;
 };
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index d7b3b56..246b05e 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -150,7 +150,7 @@
  *
  *	entry = iter->ent;
  *
- *	if (entry->type != event_<call>.id) {
+ *	if (entry->type != event_<call>->event.type) {
  *		WARN_ON_ONCE(1);
  *		return TRACE_TYPE_UNHANDLED;
  *	}
@@ -221,7 +221,7 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
 									\
 	entry = iter->ent;						\
 									\
-	if (entry->type != event->id) {					\
+	if (entry->type != event->event.type) {				\
 		WARN_ON_ONCE(1);					\
 		return TRACE_TYPE_UNHANDLED;				\
 	}								\
@@ -257,7 +257,7 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
 									\
 	entry = iter->ent;						\
 									\
-	if (entry->type != event_##call.id) {				\
+	if (entry->type != event_##call.event.type) {			\
 		WARN_ON_ONCE(1);					\
 		return TRACE_TYPE_UNHANDLED;				\
 	}								\
@@ -408,7 +408,7 @@ static inline notrace int ftrace_get_offsets_##call(			\
  *	__data_size = ftrace_get_offsets_<call>(&__data_offsets, args);
  *
  *	event = trace_current_buffer_lock_reserve(&buffer,
- *				  event_<call>.id,
+ *				  event_<call>->event.type,
  *				  sizeof(*entry) + __data_size,
  *				  irq_flags, pc);
  *	if (!event)
@@ -509,7 +509,7 @@ ftrace_raw_event_##call(proto,						\
 	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
 									\
 	event = trace_current_buffer_lock_reserve(&buffer,		\
-				 event_call->id,			\
+				 event_call->event.type,		\
 				 sizeof(*entry) + __data_size,		\
 				 irq_flags, pc);			\
 	if (!event)							\
@@ -700,7 +700,7 @@ perf_trace_##call(proto, struct ftrace_event_call *event_call)		\
 		      "profile buffer not large enough"))		\
 		return;							\
 	entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare(	\
-		__entry_size, event_call->id, &rctx, &irq_flags);	\
+		__entry_size, event_call->event.type, &rctx, &irq_flags); \
 	if (!entry)							\
 		return;							\
 	tstruct								\
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 95df5a7..b8febf0 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -75,7 +75,7 @@ int perf_trace_enable(int event_id)
 
 	mutex_lock(&event_mutex);
 	list_for_each_entry(event, &ftrace_events, list) {
-		if (event->id == event_id &&
+		if (event->event.type == event_id &&
 		    event->class && event->class->perf_probe &&
 		    try_module_get(event->mod)) {
 			ret = perf_trace_event_enable(event);
@@ -123,7 +123,7 @@ void perf_trace_disable(int event_id)
 
 	mutex_lock(&event_mutex);
 	list_for_each_entry(event, &ftrace_events, list) {
-		if (event->id == event_id) {
+		if (event->event.type == event_id) {
 			perf_trace_event_disable(event);
 			module_put(event->mod);
 			break;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 9aa298e..8d2e28e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -132,7 +132,6 @@ int trace_event_raw_init(struct ftrace_event_call *call)
 	id = register_ftrace_event(&call->event);
 	if (!id)
 		return -ENODEV;
-	call->id = id;
 
 	return 0;
 }
@@ -574,7 +573,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 	trace_seq_init(s);
 
 	trace_seq_printf(s, "name: %s\n", call->name);
-	trace_seq_printf(s, "ID: %d\n", call->id);
+	trace_seq_printf(s, "ID: %d\n", call->event.type);
 	trace_seq_printf(s, "format:\n");
 
 	head = trace_get_fields(call);
@@ -648,7 +647,7 @@ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
 		return -ENOMEM;
 
 	trace_seq_init(s);
-	trace_seq_printf(s, "%d\n", call->id);
+	trace_seq_printf(s, "%d\n", call->event.type);
 
 	r = simple_read_from_buffer(ubuf, cnt, ppos,
 				    s->buffer, s->len);
@@ -974,7 +973,7 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 		trace_create_file("enable", 0644, call->dir, call,
 				  enable);
 
-	if (call->id && (call->class->perf_probe || call->class->reg))
+	if (call->event.type && (call->class->perf_probe || call->class->reg))
 		trace_create_file("id", 0444, call->dir, call,
 		 		  id);
 
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 560683d..b8e3bf3 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1394,7 +1394,7 @@ int ftrace_profile_set_filter(struct perf_event *event, int event_id,
 	mutex_lock(&event_mutex);
 
 	list_for_each_entry(call, &ftrace_events, list) {
-		if (call->id == event_id)
+		if (call->event.type == event_id)
 			break;
 	}
 
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index e878d06..8536e2a 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -153,7 +153,7 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
 #define F_printk(fmt, args...) #fmt ", "  __stringify(args)
 
 #undef FTRACE_ENTRY
-#define FTRACE_ENTRY(call, struct_name, type, tstruct, print)		\
+#define FTRACE_ENTRY(call, struct_name, etype, tstruct, print)		\
 									\
 struct ftrace_event_class event_class_ftrace_##call = {			\
 	.system			= __stringify(TRACE_SYSTEM),		\
@@ -165,7 +165,7 @@ struct ftrace_event_call __used						\
 __attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name			= #call,				\
-	.id			= type,					\
+	.event.type		= etype,				\
 	.class			= &event_class_ftrace_##call,		\
 	.print_fmt		= print,				\
 };									\
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d8061c3..934078b 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -960,8 +960,8 @@ static __kprobes void kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
 
 	size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);
 
-	event = trace_current_buffer_lock_reserve(&buffer, call->id, size,
-						  irq_flags, pc);
+	event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
+						  size, irq_flags, pc);
 	if (!event)
 		return;
 
@@ -992,8 +992,8 @@ static __kprobes void kretprobe_trace_func(struct kretprobe_instance *ri,
 
 	size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);
 
-	event = trace_current_buffer_lock_reserve(&buffer, call->id, size,
-						  irq_flags, pc);
+	event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
+						  size, irq_flags, pc);
 	if (!event)
 		return;
 
@@ -1228,7 +1228,8 @@ static __kprobes void kprobe_perf_func(struct kprobe *kp,
 		     "profile buffer not large enough"))
 		return;
 
-	entry = perf_trace_buf_prepare(size, call->id, &rctx, &irq_flags);
+	entry = perf_trace_buf_prepare(size, call->event.type,
+				       &rctx, &irq_flags);
 	if (!entry)
 		return;
 
@@ -1258,7 +1259,8 @@ static __kprobes void kretprobe_perf_func(struct kretprobe_instance *ri,
 		     "profile buffer not large enough"))
 		return;
 
-	entry = perf_trace_buf_prepare(size, call->id, &rctx, &irq_flags);
+	entry = perf_trace_buf_prepare(size, call->event.type,
+				       &rctx, &irq_flags);
 	if (!entry)
 		return;
 
@@ -1375,8 +1377,8 @@ static int register_probe_event(struct trace_probe *tp)
 	}
 	if (set_print_fmt(tp) < 0)
 		return -ENOMEM;
-	call->id = register_ftrace_event(&call->event);
-	if (!call->id) {
+	ret = register_ftrace_event(&call->event);
+	if (!ret) {
 		kfree(call->print_fmt);
 		return -ENODEV;
 	}
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a4bed39..23fad22 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -108,7 +108,7 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
 	if (!entry)
 		goto end;
 
-	if (entry->enter_event->id != ent->type) {
+	if (entry->enter_event->event.type != ent->type) {
 		WARN_ON_ONCE(1);
 		goto end;
 	}
@@ -164,7 +164,7 @@ print_syscall_exit(struct trace_iterator *iter, int flags,
 		return TRACE_TYPE_HANDLED;
 	}
 
-	if (entry->exit_event->id != ent->type) {
+	if (entry->exit_event->event.type != ent->type) {
 		WARN_ON_ONCE(1);
 		return TRACE_TYPE_UNHANDLED;
 	}
@@ -306,7 +306,7 @@ void ftrace_syscall_enter(struct pt_regs *regs, long id)
 	size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
 
 	event = trace_current_buffer_lock_reserve(&buffer,
-			sys_data->enter_event->id, size, 0, 0);
+			sys_data->enter_event->event.type, size, 0, 0);
 	if (!event)
 		return;
 
@@ -338,7 +338,7 @@ void ftrace_syscall_exit(struct pt_regs *regs, long ret)
 		return;
 
 	event = trace_current_buffer_lock_reserve(&buffer,
-			sys_data->exit_event->id, sizeof(*entry), 0, 0);
+			sys_data->exit_event->event.type, sizeof(*entry), 0, 0);
 	if (!event)
 		return;
 
@@ -502,7 +502,8 @@ static void perf_syscall_enter(struct pt_regs *regs, long id)
 		return;
 
 	rec = (struct syscall_trace_enter *)perf_trace_buf_prepare(size,
-				sys_data->enter_event->id, &rctx, &flags);
+				sys_data->enter_event->event.type,
+				&rctx, &flags);
 	if (!rec)
 		return;
 
@@ -577,7 +578,8 @@ static void perf_syscall_exit(struct pt_regs *regs, long ret)
 		return;
 
 	rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size,
-				sys_data->exit_event->id, &rctx, &flags);
+				sys_data->exit_event->event.type,
+				&rctx, &flags);
 	if (!rec)
 		return;
 
-- 
1.7.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (8 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure Steven Rostedt
@ 2010-04-26 19:50 ` Steven Rostedt
  2010-04-28 21:13   ` Mathieu Desnoyers
  2010-04-28 14:45 ` [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Masami Hiramatsu
  2010-04-28 20:18 ` Mathieu Desnoyers
  11 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-26 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Arnaldo Carvalho de Melo, Mathieu Desnoyers,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

[-- Attachment #1: 0010-tracing-Combine-event-filter_active-and-enable-into-.patch --]
[-- Type: text/plain, Size: 6236 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The filter_active and enabled fields each use an int (4 bytes) to
hold a single flag. We can save 4 bytes per event by combining the
two into a single integer.

   text	   data	    bss	    dec	    hex	filename
5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
5761074	1262596	9351592	16375262	 f9ddde	vmlinux.id
5761007	1256916	9351592	16369515	 f9c76b	vmlinux.flags

This gives us another 5K in savings.

The modification of both the enable and filter fields is done
under the event_mutex, so it is still safe to combine the two.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
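Not part of the patch: a small userspace sketch of the combined flags
field, assuming the same two enum definitions the patch adds. The
struct event_call type and the event_*() helpers are hypothetical names
used only for illustration; the kernel code open-codes these operations
while holding event_mutex.

```c
#include <assert.h>
#include <stddef.h>

enum {
	TRACE_EVENT_FL_ENABLED_BIT,
	TRACE_EVENT_FL_FILTERED_BIT,
};

enum {
	TRACE_EVENT_FL_ENABLED	= (1 << TRACE_EVENT_FL_ENABLED_BIT),
	TRACE_EVENT_FL_FILTERED	= (1 << TRACE_EVENT_FL_FILTERED_BIT),
};

/* Hypothetical stand-in for the part of the event that holds flags. */
struct event_call {
	unsigned int flags;	/* replaces int enabled + int filter_active */
};

/* Set or clear one flag without disturbing the other bits. */
static void event_set_flag(struct event_call *call, unsigned int flag, int on)
{
	assert(call != NULL);
	if (on)
		call->flags |= flag;
	else
		call->flags &= ~flag;
}

static int event_is_enabled(const struct event_call *call)
{
	return !!(call->flags & TRACE_EVENT_FL_ENABLED);
}

static int event_is_filtered(const struct event_call *call)
{
	return !!(call->flags & TRACE_EVENT_FL_FILTERED);
}
```

Because each update is a read-modify-write of the shared word, the two
flags are only safe to combine when all writers take the same lock,
which is the event_mutex argument made in the changelog above.
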
 include/linux/ftrace_event.h       |   21 +++++++++++++++++++--
 kernel/trace/trace.h               |    2 +-
 kernel/trace/trace_events.c        |   14 +++++++-------
 kernel/trace/trace_events_filter.c |   10 +++++-----
 kernel/trace/trace_kprobe.c        |    2 +-
 5 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index b26507f..2e28c94 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -141,6 +141,16 @@ struct ftrace_event_class {
 	int			(*raw_init)(struct ftrace_event_call *);
 };
 
+enum {
+	TRACE_EVENT_FL_ENABLED_BIT,
+	TRACE_EVENT_FL_FILTERED_BIT,
+};
+
+enum {
+	TRACE_EVENT_FL_ENABLED	= (1 << TRACE_EVENT_FL_ENABLED_BIT),
+	TRACE_EVENT_FL_FILTERED	= (1 << TRACE_EVENT_FL_FILTERED_BIT),
+};
+
 struct ftrace_event_call {
 	struct list_head	list;
 	struct ftrace_event_class *class;
@@ -152,8 +162,15 @@ struct ftrace_event_call {
 	void			*mod;
 	void			*data;
 
-	int			enabled;
-	int			filter_active;
+	/*
+	 * 32 bit flags:
+	 *   bit 0:		enabled
+	 *   bit 1:		filter_active
+	 *
+	 *  Must hold event_mutex to change.
+	 */
+	unsigned int		flags;
+
 	int			perf_refcount;
 };
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index ff63bee..51ee319 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -779,7 +779,7 @@ filter_check_discard(struct ftrace_event_call *call, void *rec,
 		     struct ring_buffer *buffer,
 		     struct ring_buffer_event *event)
 {
-	if (unlikely(call->filter_active) &&
+	if (unlikely(call->flags & TRACE_EVENT_FL_FILTERED) &&
 	    !filter_match_preds(call->filter, rec)) {
 		ring_buffer_discard_commit(buffer, event);
 		return 1;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 8d2e28e..176b8be 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -144,8 +144,8 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 
 	switch (enable) {
 	case 0:
-		if (call->enabled) {
-			call->enabled = 0;
+		if (call->flags & TRACE_EVENT_FL_ENABLED) {
+			call->flags &= ~TRACE_EVENT_FL_ENABLED;
 			tracing_stop_cmdline_record();
 			if (call->class->reg)
 				call->class->reg(call, TRACE_REG_UNREGISTER);
@@ -156,7 +156,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 		}
 		break;
 	case 1:
-		if (!call->enabled) {
+		if (!(call->flags & TRACE_EVENT_FL_ENABLED)) {
 			tracing_start_cmdline_record();
 			if (call->class->reg)
 				ret = call->class->reg(call, TRACE_REG_REGISTER);
@@ -170,7 +170,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 					"%s\n", call->name);
 				break;
 			}
-			call->enabled = 1;
+			call->flags |= TRACE_EVENT_FL_ENABLED;
 		}
 		break;
 	}
@@ -359,7 +359,7 @@ s_next(struct seq_file *m, void *v, loff_t *pos)
 	(*pos)++;
 
 	list_for_each_entry_continue(call, &ftrace_events, list) {
-		if (call->enabled)
+		if (call->flags & TRACE_EVENT_FL_ENABLED)
 			return call;
 	}
 
@@ -418,7 +418,7 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
 	struct ftrace_event_call *call = filp->private_data;
 	char *buf;
 
-	if (call->enabled)
+	if (call->flags & TRACE_EVENT_FL_ENABLED)
 		buf = "1\n";
 	else
 		buf = "0\n";
@@ -493,7 +493,7 @@ system_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
 		 * or if all events or cleared, or if we have
 		 * a mixture.
 		 */
-		set |= (1 << !!call->enabled);
+		set |= (1 << !!(call->flags & TRACE_EVENT_FL_ENABLED));
 
 		/*
 		 * If we have a mixture, no need to look further.
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index b8e3bf3..fbc72ee 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -546,7 +546,7 @@ static void filter_disable_preds(struct ftrace_event_call *call)
 	struct event_filter *filter = call->filter;
 	int i;
 
-	call->filter_active = 0;
+	call->flags &= ~TRACE_EVENT_FL_FILTERED;
 	filter->n_preds = 0;
 
 	for (i = 0; i < MAX_FILTER_PRED; i++)
@@ -573,7 +573,7 @@ void destroy_preds(struct ftrace_event_call *call)
 {
 	__free_preds(call->filter);
 	call->filter = NULL;
-	call->filter_active = 0;
+	call->flags &= ~TRACE_EVENT_FL_FILTERED;
 }
 
 static struct event_filter *__alloc_preds(void)
@@ -612,7 +612,7 @@ static int init_preds(struct ftrace_event_call *call)
 	if (call->filter)
 		return 0;
 
-	call->filter_active = 0;
+	call->flags &= ~TRACE_EVENT_FL_FILTERED;
 	call->filter = __alloc_preds();
 	if (IS_ERR(call->filter))
 		return PTR_ERR(call->filter);
@@ -1267,7 +1267,7 @@ static int replace_system_preds(struct event_subsystem *system,
 		if (err)
 			filter_disable_preds(call);
 		else {
-			call->filter_active = 1;
+			call->flags |= TRACE_EVENT_FL_FILTERED;
 			replace_filter_string(filter, filter_string);
 		}
 		fail = false;
@@ -1316,7 +1316,7 @@ int apply_event_filter(struct ftrace_event_call *call, char *filter_string)
 	if (err)
 		append_filter_err(ps, call->filter);
 	else
-		call->filter_active = 1;
+		call->flags |= TRACE_EVENT_FL_FILTERED;
 out:
 	filter_opstack_clear(ps);
 	postfix_clear(ps);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 934078b..0e3ded6 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1382,7 +1382,7 @@ static int register_probe_event(struct trace_probe *tp)
 		kfree(call->print_fmt);
 		return -ENODEV;
 	}
-	call->enabled = 0;
+	call->flags = 0;
 	call->class->reg = kprobe_register;
 	call->data = tp;
 	ret = trace_add_event_call(call);
-- 
1.7.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks
  2010-04-26 19:50 ` [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks Steven Rostedt
@ 2010-04-27  9:08   ` Li Zefan
  2010-04-27 15:28     ` Steven Rostedt
  2010-04-28 20:37   ` Mathieu Desnoyers
  1 sibling, 1 reply; 45+ messages in thread
From: Li Zefan @ 2010-04-27  9:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Mathieu Desnoyers, Lai Jiangshan, Masami Hiramatsu,
	Christoph Hellwig, Mathieu Desnoyers

Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch allows data to be passed to the tracepoint callbacks
> if the tracepoint was created to do so.
> 
> If a tracepoint is defined with:
> 
> DECLARE_TRACE_DATA(name, proto, args)
> 
> Then a registered function can also register data to be passed
> to the tracepoint as such:
> 
>   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> 
>   /* In the C file */
> 
>   DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> 
>   [...]
> 
>        trace_mytacepoint(status);
> 
>   /* In a file registering this tracepoint */
> 
>   int my_callback(int status, void *data)
>   {
> 	struct my_struct my_data = data;
> 	[...]
>   }
> 
>   [...]
> 	my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
> 	init_my_data(my_data);
> 	register_trace_mytracepoint_data(my_callback, my_data);
> 
> The same callback can also be registered to the same tracepoint as long
> as the data registered is the same. Note, the data must also be used
> to unregister the callback:
> 
> 	unregister_trace_mytracepoint_data(my_callback, my_data);
> 
> Because of the data parameter, tracepoints declared this way can not have
> no args. That is:
> 
>   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(void), TP_ARGS());
> 
> will cause an error, but the original DECLARE_TRACE still allows for this.
> 
> The DECLARE_TRACE_DATA() will be used by TRACE_EVENT() so that it
> can reuse code and bring the size of the tracepoint footprint down.
> This means that TRACE_EVENT()s must have at least one argument defined.

We have to define at least one argument in TRACE_EVENT() even without
this patch; otherwise it will cause a compile error while expanding the
macros.

> This should not be a problem since we should never have a static
> tracepoint in the kernel that simply says "Look I'm here!".
> 

We do have such a tracepoint. ;)

That is trace_power_end, and it uses a dummy argument merely to
get it to compile.
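
For reference, the callback-plus-data registration described in the
quoted changelog can be sketched in plain userspace C. This is only an
illustration under assumptions: register_probe(), unregister_probe()
and fire() are hypothetical stand-ins for the generated
register_trace_<name>_data() style helpers, and a fixed-size array
replaces the real tracepoint probe list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROBES 8

typedef void (*probe_fn)(int status, void *data);

struct probe {
	probe_fn func;
	void *data;
};

static struct probe probes[MAX_PROBES];
static size_t nr_probes;

/* Remember both the callback and the data it should be called with. */
static int register_probe(probe_fn func, void *data)
{
	if (nr_probes == MAX_PROBES)
		return -1;
	probes[nr_probes].func = func;
	probes[nr_probes].data = data;
	nr_probes++;
	return 0;
}

/* Unregistering needs the same (callback, data) pair that was registered. */
static int unregister_probe(probe_fn func, void *data)
{
	size_t i;

	for (i = 0; i < nr_probes; i++) {
		if (probes[i].func == func && probes[i].data == data) {
			probes[i] = probes[--nr_probes];
			return 0;
		}
	}
	return -1;
}

/* Stand-in for the tracepoint site: call every probe with its own data. */
static void fire(int status)
{
	size_t i;

	for (i = 0; i < nr_probes; i++)
		probes[i].func(status, probes[i].data);
}

struct my_struct {
	int last_status;
	int hits;
};

static void my_callback(int status, void *data)
{
	struct my_struct *my_data = data;	/* note: a pointer */

	assert(my_data != NULL);
	my_data->last_status = status;
	my_data->hits++;
}
```
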


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks
  2010-04-27  9:08   ` Li Zefan
@ 2010-04-27 15:28     ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-27 15:28 UTC (permalink / raw)
  To: Li Zefan
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Mathieu Desnoyers, Lai Jiangshan, Masami Hiramatsu,
	Christoph Hellwig, Mathieu Desnoyers

On Tue, 2010-04-27 at 17:08 +0800, Li Zefan wrote:
> Steven Rostedt wrote:

> > Because of the data parameter, tracepoints declared this way can not have
> > no args. That is:
> > 
> >   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(void), TP_ARGS());
> > 
> > will cause an error, but the original DECLARE_TRACE still allows for this.
> > 
> > The DECLARE_TRACE_DATA() will be used by TRACE_EVENT() so that it
> > can reuse code and bring the size of the tracepoint footprint down.
> > This means that TRACE_EVENT()s must have at least one argument defined.
> 
> We have to define at least one argument in TRACE_EVENT() even without
> this patch; otherwise it will cause a compile error while expanding the
> macros.

OK, good to know that this is not a regression. The DECLARE_TRACE()
still allows no arguments; I spent a bit of time (more than I wanted
to) to make that work. Since I added a new DECLARE_TRACE_DATA() that
must have at least one argument, it is not a regression, because it is
new :-)

Thanks,

-- Steve

P.S.

I'll let these patches sit out for a week waiting for comments, and if
there are none, I'll repackage them (rebase as well) and send them out
for real.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (9 preceding siblings ...)
  2010-04-26 19:50 ` [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field Steven Rostedt
@ 2010-04-28 14:45 ` Masami Hiramatsu
  2010-04-28 20:18 ` Mathieu Desnoyers
  11 siblings, 0 replies; 45+ messages in thread
From: Masami Hiramatsu @ 2010-04-28 14:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Mathieu Desnoyers, Lai Jiangshan, Li Zefan, Christoph Hellwig

Steven Rostedt wrote:
> This is an RFC patch set that also affects kprobes and perf.
> 
> At the Linux Collaboration Summit, I talked with Mathieu and others about
> lowering the footprint of trace events. I spent all of last week
> trying to get the size as small as I could.
> 
> Currently, each TRACE_EVENT() macro adds 1 - 5K per tracepoint. I got various
> results by adding a TRACE_EVENT() with the compiler, depending on
> config options that did not seem related. The new tracepoint I added
> would add between 1 and 5K, but I did not investigate enough to
> see what the true size was.
> 
> What was consistent, was the DEFINE_EVENT(). Currently, it adds
> a little over 700 bytes per DEFINE_EVENT().
> 
> This patch series does not seem to affect TRACE_EVENT() much (had
> the same various sizes), but consistently brings DEFINE_EVENT()s
> down from 700 bytes to 250 bytes per DEFINE_EVENT(). Since syscalls
> use one "class" and are equivalent to DEFINE_EVENT() this can
> be a significant savings.
> 
> With events and syscalls (82 events and 616 syscalls), before this
> patch series, the size of vmlinux was: 16161794, and afterward: 16058182.
> 
> That is 103,612 bytes in savings! (over 100K)
> 
> 
> Without tracing syscalls (82 events), it brought the size of vmlinux
> down from 1591046 to 15999394.
> 
> 22,071 bytes in savings.
> 
> This is just an RFC (for now), to get peoples opinions on the changes.
> It does a bit of rewriting of the CPP macros, just to warning you ;-)

Hm, at least for the kprobe tracer, this change is OK,
even though it is not worth as much there as it is for tracepoints.
I think, if we have ftrace_event_ops, it will help reduce
the size of dynamic events a bit (though not by much).


Thank you,


-- 
Masami Hiramatsu
e-mail: mhiramat@redhat.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs
  2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
                   ` (10 preceding siblings ...)
  2010-04-28 14:45 ` [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Masami Hiramatsu
@ 2010-04-28 20:18 ` Mathieu Desnoyers
  11 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> This is an RFC patch set that also affects kprobes and perf.

Hi Steven,

> 
> At the Linux Collaboration Summit, I talked with Mathieu and others about
> lowering the footprint of trace events. I spent all of last week
> trying to get the size as small as I could.
> 
> Currently, each TRACE_EVENT() macro adds 1 - 5K per tracepoint. I got various
> results by adding a TRACE_EVENT() with the compiler, depending on
> config options that did not seem related. The new tracepoint I added
> would add between 1 and 5K, but I did not investigate enough to
> see what the true size was.

Adding only one might not give an accurate picture, as some sections can
be aligned on 4k boundaries. So if the added TRACE_EVENT() brings you to
the next 4k page, your added size is bumped by an extra 4k.
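
A rough sketch of that effect, assuming a single section padded to a
4096-byte boundary (align_up() and image_growth() are hypothetical
helper names, and the sizes in the note below are made up for the
example):

```c
#include <assert.h>
#include <stddef.h>

#define SECTION_ALIGN 4096u

/* Round size up to the next multiple of align (a power of two). */
static size_t align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * How much the padded section grows when "added" bytes are appended
 * to a section that currently holds section_size bytes.
 */
static size_t image_growth(size_t section_size, size_t added, size_t align)
{
	return align_up(section_size + added, align) -
	       align_up(section_size, align);
}
```

With these helpers, image_growth(1000, 700, SECTION_ALIGN) is 0 while
image_growth(4000, 700, SECTION_ALIGN) is 4096, so the same 700 bytes
can look free or cost a full page depending on where the section
already ends.
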

> 
> What was consistent, was the DEFINE_EVENT(). Currently, it adds
> a little over 700 bytes per DEFINE_EVENT().
> 
> This patch series does not seem to affect TRACE_EVENT() much (had
> the same various sizes), but consistently brings DEFINE_EVENT()s
> down from 700 bytes to 250 bytes per DEFINE_EVENT(). Since syscalls
> use one "class" and are equivalent to DEFINE_EVENT() this can
> be a significant savings.
> 
> With events and syscalls (82 events and 616 syscalls), before this
> patch series, the size of vmlinux was: 16161794, and afterward: 16058182.
> 
> That is 103,612 bytes in savings! (over 100K)
> 
> 
> Without tracing syscalls (82 events), it brought the size of vmlinux
> down from 1591046 to 15999394.

Probably a cut n paste error on the line above. Should read:

down from 16021465 to 15999394.

if you want the "22071 bytes in savings" to hold.
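
The arithmetic is easy to check with a throwaway helper
(vmlinux_savings() is a hypothetical name; the inputs are the vmlinux
sizes quoted in this thread):

```c
#include <assert.h>

/* Bytes saved going from one vmlinux size to another. */
static long vmlinux_savings(long before, long after)
{
	assert(before > 0 && after > 0);
	return before - after;
}
```

vmlinux_savings(16161794, 16058182) gives 103612 and
vmlinux_savings(16021465, 15999394) gives 22071, whereas the figure as
posted (1591046) yields a negative result, which is how the typo shows.
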

Will look over your patchset.

Thanks,

Mathieu

> 
> 22,071 bytes in savings.
> 
> This is just an RFC (for now), to get peoples opinions on the changes.
> It does a bit of rewriting of the CPP macros, just to warning you ;-)
> 
> -- Steve
> 
> The code can be found at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> tip/tracing/rfc-1
> 
> 
> Steven Rostedt (10):
>       tracing: Create class struct for events
>       tracing: Let tracepoints have data passed to tracepoint callbacks
>       tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
>       tracing: Remove per event trace registering
>       tracing: Move fields from event to class structure
>       tracing: Move raw_init from events to class
>       tracing: Allow events to share their print functions
>       tracing: Move print functions into event class
>       tracing: Remove duplicate id information in event structure
>       tracing: Combine event filter_active and enable into single flags field
> 
> ----
>  include/linux/ftrace_event.h         |   71 +++++++++---
>  include/linux/syscalls.h             |   55 +++-------
>  include/linux/tracepoint.h           |  119 ++++++++++++++++---
>  include/trace/ftrace.h               |  215 ++++++++++------------------------
>  include/trace/syscall.h              |    9 +-
>  kernel/trace/blktrace.c              |   13 ++-
>  kernel/trace/kmemtrace.c             |   28 +++--
>  kernel/trace/trace.c                 |    9 +-
>  kernel/trace/trace.h                 |    5 +-
>  kernel/trace/trace_event_perf.c      |   17 ++-
>  kernel/trace/trace_events.c          |  126 +++++++++++++-------
>  kernel/trace/trace_events_filter.c   |   28 +++--
>  kernel/trace/trace_export.c          |   16 ++-
>  kernel/trace/trace_functions_graph.c |    2 +-
>  kernel/trace/trace_kprobe.c          |  104 ++++++++++-------
>  kernel/trace/trace_output.c          |  137 +++++++++++++++-------
>  kernel/trace/trace_output.h          |    2 +-
>  kernel/trace/trace_syscalls.c        |  105 +++++++++++++++--
>  kernel/tracepoint.c                  |   91 ++++++++-------
>  19 files changed, 700 insertions(+), 452 deletions(-)
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 01/10][RFC] tracing: Create class struct for events
  2010-04-26 19:50 ` [PATCH 01/10][RFC] tracing: Create class struct for events Steven Rostedt
@ 2010-04-28 20:22   ` Mathieu Desnoyers
  2010-04-28 20:38     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:22 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch creates a ftrace_event_class struct that event structs point to.
> This class struct will be made to hold information to modify the
> events. Currently the class struct only holds the events system name.
> 
> This patch slightly increases the size of the text as well as decreases
> the data size. The overall change is still a slight increase, but
> this change lays the ground work of other changes to make the footprint
> of tracepoints smaller.
> 
> With 82 standard tracepoints, and 616 system call tracepoints:
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/ftrace_event.h       |    6 +++++-
>  include/linux/syscalls.h           |    6 ++++--
>  include/trace/ftrace.h             |   36 +++++++++++++++---------------------
>  kernel/trace/trace_events.c        |   20 ++++++++++----------
>  kernel/trace/trace_events_filter.c |    6 +++---
>  kernel/trace/trace_export.c        |    6 +++++-
>  kernel/trace/trace_kprobe.c        |   12 ++++++------
>  kernel/trace/trace_syscalls.c      |    4 ++++
>  8 files changed, 52 insertions(+), 44 deletions(-)
> 
[...]
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 75dd778..0921a8f 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -62,7 +62,10 @@
>  		struct trace_entry	ent;				\
>  		tstruct							\
>  		char			__data[0];			\
> -	};
> +	};								\
> +									\
> +	static struct ftrace_event_class event_class_##name;
> +
>  #undef DEFINE_EVENT
>  #define DEFINE_EVENT(template, name, proto, args)	\
>  	static struct ftrace_event_call			\
> @@ -430,22 +433,6 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
>   *
>   * Override the macros in <trace/trace_events.h> to include the following:
>   *
> - * static void ftrace_event_<call>(proto)
> - * {
> - *	event_trace_printk(_RET_IP_, "<call>: " <fmt>);
> - * }
> - *
> - * static int ftrace_reg_event_<call>(struct ftrace_event_call *unused)
> - * {
> - *	return register_trace_<call>(ftrace_event_<call>);
> - * }
> - *
> - * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
> - * {
> - *	unregister_trace_<call>(ftrace_event_<call>);
> - * }
> - *
> - *
>   * For those macros defined with TRACE_EVENT:
>   *
>   * static struct ftrace_event_call event_<call>;

So, just wondering: why are you removing these comments? What's
replacing them?

Maybe you meant to remove this in a following patch?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks
  2010-04-26 19:50 ` [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks Steven Rostedt
  2010-04-27  9:08   ` Li Zefan
@ 2010-04-28 20:37   ` Mathieu Desnoyers
  2010-04-28 23:56     ` Steven Rostedt
  1 sibling, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch allows data to be passed to the tracepoint callbacks
> if the tracepoint was created to do so.
> 
> If a tracepoint is defined with:
> 
> DECLARE_TRACE_DATA(name, proto, args)
> 
> Then a registered function can also register data to be passed
> to the tracepoint as such:
> 
>   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> 
>   /* In the C file */
> 
>   DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> 
>   [...]
> 
>        trace_mytracepoint(status);
> 
>   /* In a file registering this tracepoint */
> 
>   int my_callback(int status, void *data)
>   {
> 	struct my_struct *my_data = data;
> 	[...]
>   }
> 
>   [...]
> 	my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
> 	init_my_data(my_data);
> 	register_trace_mytracepoint_data(my_callback, my_data);
> 
> The same callback can also be registered to the same tracepoint more than
> once, as long as the data registered is different. Note, the data must also
> be used to unregister the callback:
> 
> 	unregister_trace_mytracepoint_data(my_callback, my_data);
> 
> Because of the data parameter, tracepoints declared this way must have
> at least one argument. That is:
> 
>   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(void), TP_ARGS());
> 
> will cause an error, but the original DECLARE_TRACE still allows for this.
> 
> The DECLARE_TRACE_DATA() will be used by TRACE_EVENT() so that it
> can reuse code and bring the size of the tracepoint footprint down.
> This means that TRACE_EVENT()s must have at least one argument defined.
> This should not be a problem since we should never have a static
> tracepoint in the kernel that simply says "Look I'm here!".
> 
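For reference, going by tracepoint_entry_add_probe() in the patch below, a
registration is rejected with -EEXIST only when both the function and the
data pointer match an existing entry. A minimal userspace sketch of that
matching rule follows. The names probe_is_registered() and my_callback() are
made up for illustration, and the assert() only stands in for the patch's
WARN_ON(!probe):

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct the patch adds: a probe plus its private data. */
struct tracepoint_func {
	void *func;
	void *data;
};

/*
 * Hypothetical helper: return 1 if the (probe, data) pair is already in
 * the array, 0 otherwise. The array ends at the first entry whose func
 * is NULL, as in the patch.
 */
int probe_is_registered(const struct tracepoint_func *funcs,
			void *probe, void *data)
{
	int i;

	assert(probe != NULL);	/* stands in for WARN_ON(!probe) */
	if (!funcs)
		return 0;
	for (i = 0; funcs[i].func; i++)
		if (funcs[i].func == probe && funcs[i].data == data)
			return 1;
	return 0;
}

/* Hypothetical probe, present only so there is a function address to store. */
void my_callback(int status, void *data)
{
	(void)status;
	(void)data;
}
```

With one entry holding my_callback and some data pointer, the helper reports
a match for that exact pair and no match for my_callback paired with any
other data pointer, which is why the same callback can be attached more than
once.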

I'm not convinced DECLARE_TRACE_DATA() is an appropriate name. It sounds
confusing: what kind of data is this? It is not obvious that this
refers to callback private data.

Why can't we just extend the existing DECLARE_TRACE() instead and add a
"callback_data" argument (or something slightly less verbose)? We can
update all users anyway.

We can also create a variant when there are no arguments passed:

DECLARE_TRACE_NOARG()

We had to do the same for the Linux kernel markers in the past. Then we
can create a TRACE_EVENT_NOARG() macro if necessary.

I don't think it makes sense to require users to pass arguments. It
should be possible to just say "I'm here". This could make sense, for
instance, when we are only interested in global variables at a specific
tracepoint.
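
To make the suggestion concrete, here is a rough userspace sketch of what a
single DECLARE_TRACE() with callback data plus a DECLARE_TRACE_NOARG()
variant could look like. This is only an illustration under assumptions: the
tp_funcs_<name> pointer, the im_here tracepoint and the two probes are made
up, the data pointer is passed last as in the patch, and the RCU handling
and registration code of the real macros are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct the patch adds: a probe plus its private data. */
struct tracepoint_func {
	void *func;
	void *data;
};

#define PARAMS(...) __VA_ARGS__

/*
 * Hypothetical unified DECLARE_TRACE(): every probe receives the
 * tracepoint arguments followed by the data registered with it.
 */
#define DECLARE_TRACE(name, proto, args)                                  \
	struct tracepoint_func *tp_funcs_##name;                          \
	void trace_##name(proto)                                          \
	{                                                                 \
		struct tracepoint_func *it = tp_funcs_##name;             \
		for (; it && it->func; it++)                              \
			((void (*)(proto, void *))it->func)(args, it->data); \
	}

/* Hypothetical no-argument variant: probes receive only their data. */
#define DECLARE_TRACE_NOARG(name)                                         \
	struct tracepoint_func *tp_funcs_##name;                          \
	void trace_##name(void)                                           \
	{                                                                 \
		struct tracepoint_func *it = tp_funcs_##name;             \
		for (; it && it->func; it++)                              \
			((void (*)(void *))it->func)(it->data);           \
	}

DECLARE_TRACE(mytracepoint, PARAMS(int status), PARAMS(status))
DECLARE_TRACE_NOARG(im_here)

/* Hypothetical probe: accumulates the traced status into its data. */
void status_probe(int status, void *data)
{
	assert(data != NULL);
	*(int *)data += status;
}

/* Hypothetical "I'm here" probe: counts hits in its data. */
void here_probe(void *data)
{
	assert(data != NULL);
	*(int *)data += 1;
}
```

Pointing tp_funcs_im_here at an array holding here_probe and a counter, then
calling trace_im_here(), bumps the counter once per call. The point is only
that a no-argument tracepoint still has something useful to hand to its
probe once callback data exists.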

Thanks,

Mathieu


> This is part of a series to make the tracepoint footprint smaller:
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> 5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
> 
> Again, this patch also increases the size of the kernel, but
> lays the ground work for decreasing it.
> 
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/tracepoint.h |  103 +++++++++++++++++++++++++++++++++++++------
>  kernel/tracepoint.c        |   91 ++++++++++++++++++++++-----------------
>  2 files changed, 139 insertions(+), 55 deletions(-)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 78b4bd3..4649bdb 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -20,12 +20,17 @@
>  struct module;
>  struct tracepoint;
>  
> +struct tracepoint_func {
> +	void *func;
> +	void *data;
> +};
> +
>  struct tracepoint {
>  	const char *name;		/* Tracepoint name */
>  	int state;			/* State. */
>  	void (*regfunc)(void);
>  	void (*unregfunc)(void);
> -	void **funcs;
> +	struct tracepoint_func *funcs;
>  } __attribute__((aligned(32)));		/*
>  					 * Aligned on 32 bytes because it is
>  					 * globally visible and gcc happily
> @@ -40,20 +45,31 @@ struct tracepoint {
>  
>  #ifdef CONFIG_TRACEPOINTS
>  
> +#define _CALL_TRACE(proto, args)					\
> +	(void)(it_data);						\
> +	((void(*)(proto))(it_func))(args)
> +
> +#define _CALL_TRACE_DATA(proto, args)					\
> +	it_data = (it_func_ptr)->data;					\
> +	((void(*)(proto, void *))(it_func))(args, (it_data))
> +
>  /*
>   * it_func[0] is never NULL because there is at least one element in the array
>   * when the array itself is non NULL.
>   */
> -#define __DO_TRACE(tp, proto, args)					\
> +#define __DO_TRACE(tp, proto, args, call)				\
>  	do {								\
> -		void **it_func;						\
> +		struct tracepoint_func *it_func_ptr;			\
> +		void *it_func;						\
> +		void *it_data;						\
>  									\
>  		rcu_read_lock_sched_notrace();				\
> -		it_func = rcu_dereference_sched((tp)->funcs);		\
> -		if (it_func) {						\
> +		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
> +		if (it_func_ptr) {					\
>  			do {						\
> -				((void(*)(proto))(*it_func))(args);	\
> -			} while (*(++it_func));				\
> +				it_func = (it_func_ptr)->func;		\
> +				call;					\
> +			} while ((++it_func_ptr)->func);		\
>  		}							\
>  		rcu_read_unlock_sched_notrace();			\
>  	} while (0)
> @@ -69,17 +85,55 @@ struct tracepoint {
>  	{								\
>  		if (unlikely(__tracepoint_##name.state))		\
>  			__DO_TRACE(&__tracepoint_##name,		\
> -				TP_PROTO(proto), TP_ARGS(args));	\
> +				TP_PROTO(proto), TP_ARGS(args),		\
> +				_CALL_TRACE(PARAMS(proto),		\
> +					    PARAMS(args)));		\
>  	}								\
>  	static inline int register_trace_##name(void (*probe)(proto))	\
>  	{								\
> -		return tracepoint_probe_register(#name, (void *)probe);	\
> +		return tracepoint_probe_register(#name, (void *)probe,	\
> +						 NULL);			\
>  	}								\
> -	static inline int unregister_trace_##name(void (*probe)(proto))	\
> +	static inline int unregister_trace_##name(void (*probe)(proto)) \
>  	{								\
> -		return tracepoint_probe_unregister(#name, (void *)probe);\
> +		return tracepoint_probe_unregister(#name, (void *)probe,\
> +						   NULL);		\
>  	}
>  
> +#define DECLARE_TRACE_DATA(name, proto, args)				\
> +	extern struct tracepoint __tracepoint_##name;			\
> +	static inline void trace_##name(proto)				\
> +	{								\
> +		if (unlikely(__tracepoint_##name.state))		\
> +			__DO_TRACE(&__tracepoint_##name,		\
> +				TP_PROTO(proto), TP_ARGS(args),		\
> +				_CALL_TRACE_DATA(PARAMS(proto),		\
> +						 PARAMS(args)));	\
> +	}								\
> +	static inline int register_trace_##name(void (*probe)(proto))	\
> +	{								\
> +		return tracepoint_probe_register(#name, (void *)probe,	\
> +						 NULL);			\
> +	}								\
> +	static inline int unregister_trace_##name(void (*probe)(proto)) \
> +	{								\
> +		return tracepoint_probe_unregister(#name, (void *)probe,\
> +						   NULL);		\
> +	}								\
> +	static inline int						\
> +	register_trace_##name##_data(void (*probe)(proto, void *data),	\
> +				     void *data)			\
> +	{								\
> +		return tracepoint_probe_register(#name, (void *)probe,	\
> +						 data);			\
> +	}								\
> +	static inline int						\
> +	unregister_trace_##name##_data(void (*probe)(proto, void *data),\
> +				       void *data)			\
> +	{								\
> +		return tracepoint_probe_unregister(#name, (void *)probe,\
> +						   data);		\
> +	}
>  
>  #define DEFINE_TRACE_FN(name, reg, unreg)				\
>  	static const char __tpstrtab_##name[]				\
> @@ -114,6 +168,22 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
>  		return -ENOSYS;						\
>  	}
>  
> +#define DECLARE_TRACE_DATA(name, proto, args)				\
> +	static inline void _do_trace_##name(struct tracepoint *tp, proto) \
> +	{ }								\
> +	static inline void trace_##name(proto)				\
> +	{ }								\
> +	static inline int						\
> +	register_trace_##name(void (*probe)(proto), void *data)		\
> +	{								\
> +		return -ENOSYS;						\
> +	}								\
> +	static inline int						\
> +	unregister_trace_##name(void (*probe)(proto), void *data)	\
> +	{								\
> +		return -ENOSYS;						\
> +	}
> +
>  #define DEFINE_TRACE_FN(name, reg, unreg)
>  #define DEFINE_TRACE(name)
>  #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
> @@ -129,16 +199,19 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
>   * Connect a probe to a tracepoint.
>   * Internal API, should not be used directly.
>   */
> -extern int tracepoint_probe_register(const char *name, void *probe);
> +extern int tracepoint_probe_register(const char *name, void *probe, void *data);
>  
>  /*
>   * Disconnect a probe from a tracepoint.
>   * Internal API, should not be used directly.
>   */
> -extern int tracepoint_probe_unregister(const char *name, void *probe);
> +extern int
> +tracepoint_probe_unregister(const char *name, void *probe, void *data);
>  
> -extern int tracepoint_probe_register_noupdate(const char *name, void *probe);
> -extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe);
> +extern int tracepoint_probe_register_noupdate(const char *name, void *probe,
> +					      void *data);
> +extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe,
> +						void *data);
>  extern void tracepoint_probe_update_all(void);
>  
>  struct tracepoint_iter {
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index cc89be5..c77f3ec 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -54,7 +54,7 @@ static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
>   */
>  struct tracepoint_entry {
>  	struct hlist_node hlist;
> -	void **funcs;
> +	struct tracepoint_func *funcs;
>  	int refcount;	/* Number of times armed. 0 if disarmed. */
>  	char name[0];
>  };
> @@ -64,12 +64,12 @@ struct tp_probes {
>  		struct rcu_head rcu;
>  		struct list_head list;
>  	} u;
> -	void *probes[0];
> +	struct tracepoint_func probes[0];
>  };
>  
>  static inline void *allocate_probes(int count)
>  {
> -	struct tp_probes *p  = kmalloc(count * sizeof(void *)
> +	struct tp_probes *p  = kmalloc(count * sizeof(struct tracepoint_func)
>  			+ sizeof(struct tp_probes), GFP_KERNEL);
>  	return p == NULL ? NULL : p->probes;
>  }
> @@ -79,7 +79,7 @@ static void rcu_free_old_probes(struct rcu_head *head)
>  	kfree(container_of(head, struct tp_probes, u.rcu));
>  }
>  
> -static inline void release_probes(void *old)
> +static inline void release_probes(struct tracepoint_func *old)
>  {
>  	if (old) {
>  		struct tp_probes *tp_probes = container_of(old,
> @@ -95,15 +95,16 @@ static void debug_print_probes(struct tracepoint_entry *entry)
>  	if (!tracepoint_debug || !entry->funcs)
>  		return;
>  
> -	for (i = 0; entry->funcs[i]; i++)
> -		printk(KERN_DEBUG "Probe %d : %p\n", i, entry->funcs[i]);
> +	for (i = 0; entry->funcs[i].func; i++)
> +		printk(KERN_DEBUG "Probe %d : %p\n", i, entry->funcs[i].func);
>  }
>  
> -static void *
> -tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
> +static struct tracepoint_func *
> +tracepoint_entry_add_probe(struct tracepoint_entry *entry,
> +			   void *probe, void *data)
>  {
>  	int nr_probes = 0;
> -	void **old, **new;
> +	struct tracepoint_func *old, *new;
>  
>  	WARN_ON(!probe);
>  
> @@ -111,8 +112,9 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
>  	old = entry->funcs;
>  	if (old) {
>  		/* (N -> N+1), (N != 0, 1) probes */
> -		for (nr_probes = 0; old[nr_probes]; nr_probes++)
> -			if (old[nr_probes] == probe)
> +		for (nr_probes = 0; old[nr_probes].func; nr_probes++)
> +			if (old[nr_probes].func == probe &&
> +			    old[nr_probes].data == data)
>  				return ERR_PTR(-EEXIST);
>  	}
>  	/* + 2 : one for new probe, one for NULL func */
> @@ -120,9 +122,10 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
>  	if (new == NULL)
>  		return ERR_PTR(-ENOMEM);
>  	if (old)
> -		memcpy(new, old, nr_probes * sizeof(void *));
> -	new[nr_probes] = probe;
> -	new[nr_probes + 1] = NULL;
> +		memcpy(new, old, nr_probes * sizeof(struct tracepoint_func));
> +	new[nr_probes].func = probe;
> +	new[nr_probes].data = data;
> +	new[nr_probes + 1].func = NULL;
>  	entry->refcount = nr_probes + 1;
>  	entry->funcs = new;
>  	debug_print_probes(entry);
> @@ -130,10 +133,11 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry, void *probe)
>  }
>  
>  static void *
> -tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
> +tracepoint_entry_remove_probe(struct tracepoint_entry *entry,
> +			      void *probe, void *data)
>  {
>  	int nr_probes = 0, nr_del = 0, i;
> -	void **old, **new;
> +	struct tracepoint_func *old, *new;
>  
>  	old = entry->funcs;
>  
> @@ -142,8 +146,10 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
>  
>  	debug_print_probes(entry);
>  	/* (N -> M), (N > 1, M >= 0) probes */
> -	for (nr_probes = 0; old[nr_probes]; nr_probes++) {
> -		if ((!probe || old[nr_probes] == probe))
> +	for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> +		if (!probe ||
> +		    (old[nr_probes].func == probe &&
> +		     old[nr_probes].data == data))
>  			nr_del++;
>  	}
>  
> @@ -160,10 +166,11 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry, void *probe)
>  		new = allocate_probes(nr_probes - nr_del + 1);
>  		if (new == NULL)
>  			return ERR_PTR(-ENOMEM);
> -		for (i = 0; old[i]; i++)
> -			if ((probe && old[i] != probe))
> +		for (i = 0; old[i].func; i++)
> +			if (probe &&
> +			    (old[i].func != probe || old[i].data != data))
>  				new[j++] = old[i];
> -		new[nr_probes - nr_del] = NULL;
> +		new[nr_probes - nr_del].func = NULL;
>  		entry->refcount = nr_probes - nr_del;
>  		entry->funcs = new;
>  	}
> @@ -315,18 +322,19 @@ static void tracepoint_update_probes(void)
>  	module_update_tracepoints();
>  }
>  
> -static void *tracepoint_add_probe(const char *name, void *probe)
> +static struct tracepoint_func *
> +tracepoint_add_probe(const char *name, void *probe, void *data)
>  {
>  	struct tracepoint_entry *entry;
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	entry = get_tracepoint(name);
>  	if (!entry) {
>  		entry = add_tracepoint(name);
>  		if (IS_ERR(entry))
> -			return entry;
> +			return (struct tracepoint_func *)entry;
>  	}
> -	old = tracepoint_entry_add_probe(entry, probe);
> +	old = tracepoint_entry_add_probe(entry, probe, data);
>  	if (IS_ERR(old) && !entry->refcount)
>  		remove_tracepoint(entry);
>  	return old;
> @@ -340,12 +348,12 @@ static void *tracepoint_add_probe(const char *name, void *probe)
>   * Returns 0 if ok, error value on error.
>   * The probe address must at least be aligned on the architecture pointer size.
>   */
> -int tracepoint_probe_register(const char *name, void *probe)
> +int tracepoint_probe_register(const char *name, void *probe, void *data)
>  {
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	mutex_lock(&tracepoints_mutex);
> -	old = tracepoint_add_probe(name, probe);
> +	old = tracepoint_add_probe(name, probe, data);
>  	mutex_unlock(&tracepoints_mutex);
>  	if (IS_ERR(old))
>  		return PTR_ERR(old);
> @@ -356,15 +364,16 @@ int tracepoint_probe_register(const char *name, void *probe)
>  }
>  EXPORT_SYMBOL_GPL(tracepoint_probe_register);
>  
> -static void *tracepoint_remove_probe(const char *name, void *probe)
> +static struct tracepoint_func *
> +tracepoint_remove_probe(const char *name, void *probe, void *data)
>  {
>  	struct tracepoint_entry *entry;
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	entry = get_tracepoint(name);
>  	if (!entry)
>  		return ERR_PTR(-ENOENT);
> -	old = tracepoint_entry_remove_probe(entry, probe);
> +	old = tracepoint_entry_remove_probe(entry, probe, data);
>  	if (IS_ERR(old))
>  		return old;
>  	if (!entry->refcount)
> @@ -382,12 +391,12 @@ static void *tracepoint_remove_probe(const char *name, void *probe)
>   * itself uses stop_machine(), which insures that every preempt disabled section
>   * have finished.
>   */
> -int tracepoint_probe_unregister(const char *name, void *probe)
> +int tracepoint_probe_unregister(const char *name, void *probe, void *data)
>  {
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	mutex_lock(&tracepoints_mutex);
> -	old = tracepoint_remove_probe(name, probe);
> +	old = tracepoint_remove_probe(name, probe, data);
>  	mutex_unlock(&tracepoints_mutex);
>  	if (IS_ERR(old))
>  		return PTR_ERR(old);
> @@ -418,12 +427,13 @@ static void tracepoint_add_old_probes(void *old)
>   *
>   * caller must call tracepoint_probe_update_all()
>   */
> -int tracepoint_probe_register_noupdate(const char *name, void *probe)
> +int tracepoint_probe_register_noupdate(const char *name, void *probe,
> +				       void *data)
>  {
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	mutex_lock(&tracepoints_mutex);
> -	old = tracepoint_add_probe(name, probe);
> +	old = tracepoint_add_probe(name, probe, data);
>  	if (IS_ERR(old)) {
>  		mutex_unlock(&tracepoints_mutex);
>  		return PTR_ERR(old);
> @@ -441,12 +451,13 @@ EXPORT_SYMBOL_GPL(tracepoint_probe_register_noupdate);
>   *
>   * caller must call tracepoint_probe_update_all()
>   */
> -int tracepoint_probe_unregister_noupdate(const char *name, void *probe)
> +int tracepoint_probe_unregister_noupdate(const char *name, void *probe,
> +					 void *data)
>  {
> -	void *old;
> +	struct tracepoint_func *old;
>  
>  	mutex_lock(&tracepoints_mutex);
> -	old = tracepoint_remove_probe(name, probe);
> +	old = tracepoint_remove_probe(name, probe, data);
>  	if (IS_ERR(old)) {
>  		mutex_unlock(&tracepoints_mutex);
>  		return PTR_ERR(old);
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 01/10][RFC] tracing: Create class struct for events
  2010-04-28 20:22   ` Mathieu Desnoyers
@ 2010-04-28 20:38     ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-28 20:38 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Wed, 2010-04-28 at 16:22 -0400, Mathieu Desnoyers wrote:

> >  #undef DEFINE_EVENT
> >  #define DEFINE_EVENT(template, name, proto, args)	\
> >  	static struct ftrace_event_call			\
> > @@ -430,22 +433,6 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
> >   *
> >   * Override the macros in <trace/trace_events.h> to include the following:
> >   *
> > - * static void ftrace_event_<call>(proto)
> > - * {
> > - *	event_trace_printk(_RET_IP_, "<call>: " <fmt>);
> > - * }
> > - *
> > - * static int ftrace_reg_event_<call>(struct ftrace_event_call *unused)
> > - * {
> > - *	return register_trace_<call>(ftrace_event_<call>);
> > - * }
> > - *
> > - * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
> > - * {
> > - *	unregister_trace_<call>(ftrace_event_<call>);
> > - * }
> > - *
> > - *
> >   * For those macros defined with TRACE_EVENT:
> >   *
> >   * static struct ftrace_event_call event_<call>;
> 
> So, just wondering: why are you removing these comments? What's
> replacing them?
> 
> Maybe you meant to remove this in a following patch?

I found a lot of stale comments. These were added by cut and paste
earlier, and I just removed them here.

-- Steve




* Re: [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
  2010-04-26 19:50 ` [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA() Steven Rostedt
@ 2010-04-28 20:39   ` Mathieu Desnoyers
  2010-04-28 23:57     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:39 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Switch the TRACE_EVENT() macros to use DECLARE_TRACE_DATA(). This
> patch is done to prove that the DATA macros work. If any regressions
> were to surface, then this patch would help a git bisect to localize
> the area.
> 
> Once again this patch increases the size of the kernel.
> 

As recommended in the earlier email:

It would make sense to just add the extra "callback_data" argument
directly to DECLARE_TRACE() and modify the user (TRACE_EVENT) accordingly,
and possibly create a TRACE_EVENT_NOARG() variant.

Thanks,

Mathieu

>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> 5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
> 5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/tracepoint.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 4649bdb..c04988a 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -355,14 +355,14 @@ static inline void tracepoint_synchronize_unregister(void)
>  
>  #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print)
>  #define DEFINE_EVENT(template, name, proto, args)		\
> -	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> +	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
>  #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
> -	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> +	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
>  
>  #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
> -	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> +	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
>  #define TRACE_EVENT_FN(name, proto, args, struct,		\
>  		assign, print, reg, unreg)			\
> -	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> +	DECLARE_TRACE_DATA(name, PARAMS(proto), PARAMS(args))
>  
>  #endif /* ifdef TRACE_EVENT (see note above) */
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-26 19:50 ` [PATCH 04/10][RFC] tracing: Remove per event trace registering Steven Rostedt
@ 2010-04-28 20:44   ` Mathieu Desnoyers
  2010-04-29  0:00     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch removes the register functions of TRACE_EVENT() to enable
> and disable tracepoints. The registering of an event is now done
> directly in the trace_events.c file. The tracepoint_probe_register()
> is now called directly.
> 
> The prototypes are no longer type checked, but this should not be
> an issue since the tracepoints are created automatically by the
> macros. If a prototype is incorrect in the TRACE_EVENT() macro, then
> other macros will catch it.
> 
> The trace_event_class structure now holds the probes to be called
> by the callbacks. This removes needing to have each event have
> a separate pointer for the probe.
> 
> To handle kprobes and syscalls, since they register probes in a
> different manner, a "reg" field is added to the ftrace_event_class
> structure. If the "reg" field is assigned, then it will be called for
> enabling and disabling of the probe for either ftrace or perf. To let
> the reg function know what is happening, a new enum (trace_reg) is
> created that has the type of control that is needed.
> 
> With this new rework, the 82 kernel events and 616 syscall events
> have their footprint dramatically lowered:
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> 5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
> 5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data
> 5774316	1306580	9351592	16432488	 fabd68	vmlinux.regs
> 
> The size went from 16477030 to 16432488, that's a total of 44K
> in savings. With tracepoints being continuously added, it is
> critical that the footprint becomes minimal.

Have you tried doing a BUILD_BUG_ON() on a __typeof__() mismatch between
the type of the callback generated by TRACE_EVENT() and the expected
type? This might help catch tricky preprocessor macro errors early.
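
Something along these lines is what I have in mind, shown here as a
userspace sketch rather than kernel code. It assumes GCC or Clang (for
__typeof__ and __builtin_types_compatible_p) and uses C11 static_assert in
place of BUILD_BUG_ON(). The names expected_probe_t, ftrace_probe_example,
CHECK_PROBE_TYPE and PROBE_TYPE_MATCHES are made up for illustration.

```c
#include <assert.h>

/* Hypothetical expected probe signature: one int argument plus callback data. */
typedef void (*expected_probe_t)(int status, void *data);

/* Hypothetical stand-in for a callback generated by TRACE_EVENT(). */
void ftrace_probe_example(int status, void *data)
{
	(void)status;
	(void)data;
}

/* Evaluates to 1 when a pointer to fn has exactly the given type, else 0. */
#define PROBE_TYPE_MATCHES(fn, type) \
	__builtin_types_compatible_p(__typeof__(&(fn)), type)

/* Fails the build on a mismatch; stands in for BUILD_BUG_ON(). */
#define CHECK_PROBE_TYPE(fn, type) \
	static_assert(PROBE_TYPE_MATCHES(fn, type), \
		      "probe prototype does not match tracepoint")

CHECK_PROBE_TYPE(ftrace_probe_example, expected_probe_t);
```

If the generated callback took a long instead of an int, the static_assert
would stop the build at the point of the check, rather than leaving the
mismatch hidden behind the (void *) cast at registration time.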

Thanks,

Mathieu

> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/ftrace_event.h    |   17 +++++--
>  include/linux/syscalls.h        |   29 ++---------
>  include/linux/tracepoint.h      |   12 ++++-
>  include/trace/ftrace.h          |  110 +++++----------------------------------
>  kernel/trace/trace_event_perf.c |   15 ++++-
>  kernel/trace/trace_events.c     |   26 +++++++---
>  kernel/trace/trace_kprobe.c     |   34 +++++++++---
>  kernel/trace/trace_syscalls.c   |   56 +++++++++++++++++++-
>  8 files changed, 151 insertions(+), 148 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 496eea8..dd0051e 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -113,8 +113,21 @@ void tracing_record_cmdline(struct task_struct *tsk);
>  
>  struct event_filter;
>  
> +enum trace_reg {
> +	TRACE_REG_REGISTER,
> +	TRACE_REG_UNREGISTER,
> +	TRACE_REG_PERF_REGISTER,
> +	TRACE_REG_PERF_UNREGISTER,
> +};
> +
> +struct ftrace_event_call;
> +
>  struct ftrace_event_class {
>  	char			*system;
> +	void			*probe;
> +	void			*perf_probe;
> +	int			(*reg)(struct ftrace_event_call *event,
> +				       enum trace_reg type);
>  };
>  
>  struct ftrace_event_call {
> @@ -124,8 +137,6 @@ struct ftrace_event_call {
>  	struct dentry		*dir;
>  	struct trace_event	*event;
>  	int			enabled;
> -	int			(*regfunc)(struct ftrace_event_call *);
> -	void			(*unregfunc)(struct ftrace_event_call *);
>  	int			id;
>  	const char		*print_fmt;
>  	int			(*raw_init)(struct ftrace_event_call *);
> @@ -137,8 +148,6 @@ struct ftrace_event_call {
>  	void			*data;
>  
>  	int			perf_refcount;
> -	int			(*perf_event_enable)(struct ftrace_event_call *);
> -	void			(*perf_event_disable)(struct ftrace_event_call *);
>  };
>  
>  #define PERF_MAX_TRACE_SIZE	2048
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index ac5791d..e3348c4 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -103,22 +103,6 @@ struct perf_event_attr;
>  #define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
>  #define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
>  
> -#ifdef CONFIG_PERF_EVENTS
> -
> -#define TRACE_SYS_ENTER_PERF_INIT(sname)				       \
> -	.perf_event_enable = perf_sysenter_enable,			       \
> -	.perf_event_disable = perf_sysenter_disable,
> -
> -#define TRACE_SYS_EXIT_PERF_INIT(sname)					       \
> -	.perf_event_enable = perf_sysexit_enable,			       \
> -	.perf_event_disable = perf_sysexit_disable,
> -#else
> -#define TRACE_SYS_ENTER_PERF(sname)
> -#define TRACE_SYS_ENTER_PERF_INIT(sname)
> -#define TRACE_SYS_EXIT_PERF(sname)
> -#define TRACE_SYS_EXIT_PERF_INIT(sname)
> -#endif /* CONFIG_PERF_EVENTS */
> -
>  #ifdef CONFIG_FTRACE_SYSCALLS
>  #define __SC_STR_ADECL1(t, a)		#a
>  #define __SC_STR_ADECL2(t, a, ...)	#a, __SC_STR_ADECL1(__VA_ARGS__)
> @@ -134,7 +118,8 @@ struct perf_event_attr;
>  #define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
>  #define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
>  
> -extern struct ftrace_event_class event_class_syscalls;
> +extern struct ftrace_event_class event_class_syscall_enter;
> +extern struct ftrace_event_class event_class_syscall_exit;
>  
>  #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
>  	static const struct syscall_metadata __syscall_meta_##sname;	\
> @@ -148,14 +133,11 @@ extern struct ftrace_event_class event_class_syscalls;
>  	  __attribute__((section("_ftrace_events")))			\
>  	  event_enter_##sname = {					\
>  		.name                   = "sys_enter"#sname,		\
> -		.class			= &event_class_syscalls,	\
> +		.class			= &event_class_syscall_enter,	\
>  		.event                  = &enter_syscall_print_##sname,	\
>  		.raw_init		= init_syscall_trace,		\
>  		.define_fields		= syscall_enter_define_fields,	\
> -		.regfunc		= reg_event_syscall_enter,	\
> -		.unregfunc		= unreg_event_syscall_enter,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
> -		TRACE_SYS_ENTER_PERF_INIT(sname)			\
>  	}
>  
>  #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
> @@ -170,14 +152,11 @@ extern struct ftrace_event_class event_class_syscalls;
>  	  __attribute__((section("_ftrace_events")))			\
>  	  event_exit_##sname = {					\
>  		.name                   = "sys_exit"#sname,		\
> -		.class			= &event_class_syscalls,	\
> +		.class			= &event_class_syscall_exit,	\
>  		.event                  = &exit_syscall_print_##sname,	\
>  		.raw_init		= init_syscall_trace,		\
>  		.define_fields		= syscall_exit_define_fields,	\
> -		.regfunc		= reg_event_syscall_exit,	\
> -		.unregfunc		= unreg_event_syscall_exit,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
> -		TRACE_SYS_EXIT_PERF_INIT(sname)			\
>  	}
>  
>  #define SYSCALL_METADATA(sname, nb)				\
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index c04988a..5876b77 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -173,13 +173,21 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
>  	{ }								\
>  	static inline void trace_##name(proto)				\
>  	{ }								\
> +	static inline int register_trace_##name(void (*probe)(proto))	\
> +	{								\
> +		return -ENOSYS;						\
> +	}								\
> +	static inline int unregister_trace_##name(void (*probe)(proto))	\
> +	{								\
> +		return -ENOSYS;						\
> +	}								\
>  	static inline int						\
> -	register_trace_##name(void (*probe)(proto), void *data)		\
> +	register_trace_##name##_data(void (*probe)(proto), void *data)	\
>  	{								\
>  		return -ENOSYS;						\
>  	}								\
>  	static inline int						\
> -	unregister_trace_##name(void (*probe)(proto), void *data)	\
> +	unregister_trace_##name##_data(void (*probe)(proto), void *data) \
>  	{								\
>  		return -ENOSYS;						\
>  	}
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 0921a8f..62fe622 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -381,53 +381,6 @@ static inline notrace int ftrace_get_offsets_##call(			\
>  
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>  
> -#ifdef CONFIG_PERF_EVENTS
> -
> -/*
> - * Generate the functions needed for tracepoint perf_event support.
> - *
> - * NOTE: The insertion profile callback (ftrace_profile_<call>) is defined later
> - *
> - * static int ftrace_profile_enable_<call>(void)
> - * {
> - * 	return register_trace_<call>(ftrace_profile_<call>);
> - * }
> - *
> - * static void ftrace_profile_disable_<call>(void)
> - * {
> - * 	unregister_trace_<call>(ftrace_profile_<call>);
> - * }
> - *
> - */
> -
> -#undef DECLARE_EVENT_CLASS
> -#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)
> -
> -#undef DEFINE_EVENT
> -#define DEFINE_EVENT(template, name, proto, args)			\
> -									\
> -static void perf_trace_##name(proto);					\
> -									\
> -static notrace int							\
> -perf_trace_enable_##name(struct ftrace_event_call *unused)		\
> -{									\
> -	return register_trace_##name(perf_trace_##name);		\
> -}									\
> -									\
> -static notrace void							\
> -perf_trace_disable_##name(struct ftrace_event_call *unused)		\
> -{									\
> -	unregister_trace_##name(perf_trace_##name);			\
> -}
> -
> -#undef DEFINE_EVENT_PRINT
> -#define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
> -	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
> -
> -#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
> -
> -#endif /* CONFIG_PERF_EVENTS */
> -
>  /*
>   * Stage 4 of the trace events.
>   *
> @@ -468,16 +421,6 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
>   *						   event, irq_flags, pc);
>   * }
>   *
> - * static int ftrace_raw_reg_event_<call>(struct ftrace_event_call *unused)
> - * {
> - *	return register_trace_<call>(ftrace_raw_event_<call>);
> - * }
> - *
> - * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
> - * {
> - *	unregister_trace_<call>(ftrace_raw_event_<call>);
> - * }
> - *
>   * static struct trace_event ftrace_event_type_<call> = {
>   *	.trace			= ftrace_raw_output_<call>, <-- stage 2
>   * };
> @@ -504,11 +447,15 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
>  
>  #ifdef CONFIG_PERF_EVENTS
>  
> +#define _TRACE_PERF_PROTO(call, proto)					\
> +	static notrace void						\
> +	perf_trace_##call(proto, struct ftrace_event_call *event);
> +
>  #define _TRACE_PERF_INIT(call)						\
> -	.perf_event_enable = perf_trace_enable_##call,			\
> -	.perf_event_disable = perf_trace_disable_##call,
> +	.perf_probe		= perf_trace_##call,
>  
>  #else
> +#define _TRACE_PERF_PROTO(call, proto)
>  #define _TRACE_PERF_INIT(call)
>  #endif /* CONFIG_PERF_EVENTS */
>  
> @@ -542,8 +489,8 @@ perf_trace_disable_##name(struct ftrace_event_call *unused)		\
>  #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
>  									\
>  static notrace void							\
> -ftrace_raw_event_id_##call(struct ftrace_event_call *event_call,	\
> -				       proto)				\
> +ftrace_raw_event_##call(proto,						\
> +			struct ftrace_event_call *event_call)		\
>  {									\
>  	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
>  	struct ring_buffer_event *event;				\
> @@ -578,23 +525,6 @@ ftrace_raw_event_id_##call(struct ftrace_event_call *event_call,	\
>  #undef DEFINE_EVENT
>  #define DEFINE_EVENT(template, call, proto, args)			\
>  									\
> -static notrace void ftrace_raw_event_##call(proto)			\
> -{									\
> -	ftrace_raw_event_id_##template(&event_##call, args);		\
> -}									\
> -									\
> -static notrace int							\
> -ftrace_raw_reg_event_##call(struct ftrace_event_call *unused)		\
> -{									\
> -	return register_trace_##call(ftrace_raw_event_##call);		\
> -}									\
> -									\
> -static notrace void							\
> -ftrace_raw_unreg_event_##call(struct ftrace_event_call *unused)		\
> -{									\
> -	unregister_trace_##call(ftrace_raw_event_##call);		\
> -}									\
> -									\
>  static struct trace_event ftrace_event_type_##call = {			\
>  	.trace			= ftrace_raw_output_##call,		\
>  };
> @@ -618,9 +548,12 @@ static struct trace_event ftrace_event_type_##call = {			\
>  
>  #undef DECLARE_EVENT_CLASS
>  #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> +_TRACE_PERF_PROTO(call, PARAMS(proto));					\
>  static const char print_fmt_##call[] = print;				\
>  static struct ftrace_event_class __used event_class_##call = {		\
> -	.system			= __stringify(TRACE_SYSTEM)		\
> +	.system			= __stringify(TRACE_SYSTEM),		\
> +	.probe			= ftrace_raw_event_##call,		\
> +	_TRACE_PERF_INIT(call)						\
>  }
>  
>  #undef DEFINE_EVENT
> @@ -633,11 +566,8 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.class			= &event_class_##template,		\
>  	.event			= &ftrace_event_type_##call,		\
>  	.raw_init		= trace_event_raw_init,			\
> -	.regfunc		= ftrace_raw_reg_event_##call,		\
> -	.unregfunc		= ftrace_raw_unreg_event_##call,	\
>  	.print_fmt		= print_fmt_##template,			\
>  	.define_fields		= ftrace_define_fields_##template,	\
> -	_TRACE_PERF_INIT(call)					\
>  }
>  
>  #undef DEFINE_EVENT_PRINT
> @@ -651,12 +581,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.class			= &event_class_##template,		\
>  	.event			= &ftrace_event_type_##call,		\
> -	.raw_init		= trace_event_raw_init,			\
> -	.regfunc		= ftrace_raw_reg_event_##call,		\
> -	.unregfunc		= ftrace_raw_unreg_event_##call,	\
>  	.print_fmt		= print_fmt_##call,			\
> -	.define_fields		= ftrace_define_fields_##template,	\
> -	_TRACE_PERF_INIT(call)					\
>  }
>  
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
> @@ -756,8 +681,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  #undef DECLARE_EVENT_CLASS
>  #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
>  static notrace void							\
> -perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
> -			    proto)					\
> +perf_trace_##call(proto, struct ftrace_event_call *event_call)		\
>  {									\
>  	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
>  	struct ftrace_raw_##call *entry;				\
> @@ -792,13 +716,7 @@ perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
>  }
>  
>  #undef DEFINE_EVENT
> -#define DEFINE_EVENT(template, call, proto, args)		\
> -static notrace void perf_trace_##call(proto)			\
> -{								\
> -	struct ftrace_event_call *event_call = &event_##call;	\
> -								\
> -	perf_trace_templ_##template(event_call, args);		\
> -}
> +#define DEFINE_EVENT(template, call, proto, args)
>  
>  #undef DEFINE_EVENT_PRINT
>  #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index 81f691e..95df5a7 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -44,7 +44,12 @@ static int perf_trace_event_enable(struct ftrace_event_call *event)
>  		rcu_assign_pointer(perf_trace_buf_nmi, buf);
>  	}
>  
> -	ret = event->perf_event_enable(event);
> +	if (event->class->reg)
> +		ret = event->class->reg(event, TRACE_REG_PERF_REGISTER);
> +	else
> +		ret = tracepoint_probe_register(event->name,
> +						event->class->perf_probe,
> +						event);
>  	if (!ret) {
>  		total_ref_count++;
>  		return 0;
> @@ -70,7 +75,8 @@ int perf_trace_enable(int event_id)
>  
>  	mutex_lock(&event_mutex);
>  	list_for_each_entry(event, &ftrace_events, list) {
> -		if (event->id == event_id && event->perf_event_enable &&
> +		if (event->id == event_id &&
> +		    event->class && event->class->perf_probe &&
>  		    try_module_get(event->mod)) {
>  			ret = perf_trace_event_enable(event);
>  			break;
> @@ -88,7 +94,10 @@ static void perf_trace_event_disable(struct ftrace_event_call *event)
>  	if (--event->perf_refcount > 0)
>  		return;
>  
> -	event->perf_event_disable(event);
> +	if (event->class->reg)
> +		event->class->reg(event, TRACE_REG_PERF_UNREGISTER);
> +	else
> +		tracepoint_probe_unregister(event->name, event->class->perf_probe, event);
>  
>  	if (!--total_ref_count) {
>  		buf = perf_trace_buf;
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index f6893cc..f84cfcb 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -126,13 +126,23 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
>  		if (call->enabled) {
>  			call->enabled = 0;
>  			tracing_stop_cmdline_record();
> -			call->unregfunc(call);
> +			if (call->class->reg)
> +				call->class->reg(call, TRACE_REG_UNREGISTER);
> +			else
> +				tracepoint_probe_unregister(call->name,
> +							    call->class->probe,
> +							    call);
>  		}
>  		break;
>  	case 1:
>  		if (!call->enabled) {
>  			tracing_start_cmdline_record();
> -			ret = call->regfunc(call);
> +			if (call->class->reg)
> +				ret = call->class->reg(call, TRACE_REG_REGISTER);
> +			else
> +				ret = tracepoint_probe_register(call->name,
> +								call->class->probe,
> +								call);
>  			if (ret) {
>  				tracing_stop_cmdline_record();
>  				pr_info("event trace: Could not enable event "
> @@ -170,7 +180,8 @@ static int __ftrace_set_clr_event(const char *match, const char *sub,
>  	mutex_lock(&event_mutex);
>  	list_for_each_entry(call, &ftrace_events, list) {
>  
> -		if (!call->name || !call->regfunc)
> +		if (!call->name || !call->class ||
> +		    (!call->class->probe && !call->class->reg))
>  			continue;
>  
>  		if (match &&
> @@ -296,7 +307,7 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
>  		 * The ftrace subsystem is for showing formats only.
>  		 * They can not be enabled or disabled via the event files.
>  		 */
> -		if (call->regfunc)
> +		if (call->class && (call->class->probe || call->class->reg))
>  			return call;
>  	}
>  
> @@ -449,7 +460,8 @@ system_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
>  
>  	mutex_lock(&event_mutex);
>  	list_for_each_entry(call, &ftrace_events, list) {
> -		if (!call->name || !call->regfunc)
> +		if (!call->name || !call->class ||
> +		    (!call->class->probe && !call->class->reg))
>  			continue;
>  
>  		if (system && strcmp(call->class->system, system) != 0)
> @@ -934,11 +946,11 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
>  		return -1;
>  	}
>  
> -	if (call->regfunc)
> +	if (call->class->probe || call->class->reg)
>  		trace_create_file("enable", 0644, call->dir, call,
>  				  enable);
>  
> -	if (call->id && call->perf_event_enable)
> +	if (call->id && (call->class->perf_probe || call->class->reg))
>  		trace_create_file("id", 0444, call->dir, call,
>  		 		  id);
>  
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index eda220b..f8af21a 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -202,6 +202,7 @@ struct trace_probe {
>  	unsigned long 		nhit;
>  	unsigned int		flags;	/* For TP_FLAG_* */
>  	const char		*symbol;	/* symbol name */
> +	struct ftrace_event_class	class;
>  	struct ftrace_event_call	call;
>  	struct trace_event		event;
>  	unsigned int		nr_args;
> @@ -323,6 +324,7 @@ static struct trace_probe *alloc_trace_probe(const char *group,
>  		goto error;
>  	}
>  
> +	tp->call.class = &tp->class;
>  	tp->call.name = kstrdup(event, GFP_KERNEL);
>  	if (!tp->call.name)
>  		goto error;
> @@ -332,8 +334,8 @@ static struct trace_probe *alloc_trace_probe(const char *group,
>  		goto error;
>  	}
>  
> -	tp->call.class->system = kstrdup(group, GFP_KERNEL);
> -	if (!tp->call.class->system)
> +	tp->class.system = kstrdup(group, GFP_KERNEL);
> +	if (!tp->class.system)
>  		goto error;
>  
>  	INIT_LIST_HEAD(&tp->list);
> @@ -1302,6 +1304,26 @@ static void probe_perf_disable(struct ftrace_event_call *call)
>  }
>  #endif	/* CONFIG_PERF_EVENTS */
>  
> +static __kprobes
> +int kprobe_register(struct ftrace_event_call *event, enum trace_reg type)
> +{
> +	switch (type) {
> +	case TRACE_REG_REGISTER:
> +		return probe_event_enable(event);
> +	case TRACE_REG_UNREGISTER:
> +		probe_event_disable(event);
> +		return 0;
> +
> +#ifdef CONFIG_PERF_EVENTS
> +	case TRACE_REG_PERF_REGISTER:
> +		return probe_perf_enable(event);
> +	case TRACE_REG_PERF_UNREGISTER:
> +		probe_perf_disable(event);
> +		return 0;
> +#endif
> +	}
> +	return 0;
> +}
>  
>  static __kprobes
>  int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
> @@ -1355,13 +1377,7 @@ static int register_probe_event(struct trace_probe *tp)
>  		return -ENODEV;
>  	}
>  	call->enabled = 0;
> -	call->regfunc = probe_event_enable;
> -	call->unregfunc = probe_event_disable;
> -
> -#ifdef CONFIG_PERF_EVENTS
> -	call->perf_event_enable = probe_perf_enable;
> -	call->perf_event_disable = probe_perf_disable;
> -#endif
> +	call->class->reg = kprobe_register;
>  	call->data = tp;
>  	ret = trace_add_event_call(call);
>  	if (ret) {
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 31fc95a..c92934d 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -14,8 +14,19 @@ static int sys_refcount_exit;
>  static DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
>  static DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
>  
> -struct ftrace_event_class event_class_syscalls = {
> -	.system			= "syscalls"
> +static int syscall_enter_register(struct ftrace_event_call *event,
> +				 enum trace_reg type);
> +static int syscall_exit_register(struct ftrace_event_call *event,
> +				 enum trace_reg type);
> +
> +struct ftrace_event_class event_class_syscall_enter = {
> +	.system			= "syscalls",
> +	.reg			= syscall_enter_register
> +};
> +
> +struct ftrace_event_class event_class_syscall_exit = {
> +	.system			= "syscalls",
> +	.reg			= syscall_exit_register
>  };
>  
>  extern unsigned long __start_syscalls_metadata[];
> @@ -586,3 +597,44 @@ void perf_sysexit_disable(struct ftrace_event_call *call)
>  
>  #endif /* CONFIG_PERF_EVENTS */
>  
> +static int syscall_enter_register(struct ftrace_event_call *event,
> +				 enum trace_reg type)
> +{
> +	switch (type) {
> +	case TRACE_REG_REGISTER:
> +		return reg_event_syscall_enter(event);
> +	case TRACE_REG_UNREGISTER:
> +		unreg_event_syscall_enter(event);
> +		return 0;
> +
> +#ifdef CONFIG_PERF_EVENTS
> +	case TRACE_REG_PERF_REGISTER:
> +		return perf_sysenter_enable(event);
> +	case TRACE_REG_PERF_UNREGISTER:
> +		perf_sysenter_disable(event);
> +		return 0;
> +#endif
> +	}
> +	return 0;
> +}
> +
> +static int syscall_exit_register(struct ftrace_event_call *event,
> +				 enum trace_reg type)
> +{
> +	switch (type) {
> +	case TRACE_REG_REGISTER:
> +		return reg_event_syscall_exit(event);
> +	case TRACE_REG_UNREGISTER:
> +		unreg_event_syscall_exit(event);
> +		return 0;
> +
> +#ifdef CONFIG_PERF_EVENTS
> +	case TRACE_REG_PERF_REGISTER:
> +		return perf_sysexit_enable(event);
> +	case TRACE_REG_PERF_UNREGISTER:
> +		perf_sysexit_disable(event);
> +		return 0;
> +#endif
> +	}
> +	return 0;
> +}
> -- 
> 1.7.0
> 
> 
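
For readers following along, the enable path after this patch can be
modeled roughly as below. This is only a userspace sketch under stated
assumptions: the structures are cut down to the members the dispatch
touches, and stub_probe_register(), custom_reg() and event_enable() are
hypothetical names for illustration, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as enum trace_reg in the patch. */
enum trace_reg {
	TRACE_REG_REGISTER,
	TRACE_REG_UNREGISTER,
	TRACE_REG_PERF_REGISTER,
	TRACE_REG_PERF_UNREGISTER,
};

struct event_call;

/* Cut-down model of struct ftrace_event_class. */
struct event_class {
	void *probe;
	int (*reg)(struct event_call *call, enum trace_reg type);
};

/* Cut-down model of struct ftrace_event_call. */
struct event_call {
	const char *name;
	struct event_class *class;
};

/* Hypothetical stub standing in for tracepoint_probe_register(). */
static int generic_registrations;

static int stub_probe_register(const char *name, void *probe, void *data)
{
	(void)name;
	(void)probe;
	(void)data;
	generic_registrations++;
	return 0;
}

/* Hypothetical class-specific handler, shaped like kprobe_register(). */
static int custom_registrations;

static int custom_reg(struct event_call *call, enum trace_reg type)
{
	(void)call;
	switch (type) {
	case TRACE_REG_REGISTER:
		custom_registrations++;
		return 0;
	default:
		return 0;
	}
}

/* Enable path: use class->reg() if set, else the generic registration. */
static int event_enable(struct event_call *call)
{
	assert(call && call->class);
	if (call->class->reg)
		return call->class->reg(call, TRACE_REG_REGISTER);
	return stub_probe_register(call->name, call->class->probe, call);
}
```

The point of the shape is that ordinary TRACE_EVENT classes only need to
fill in .probe, while kprobes and syscalls keep their special handling
behind a single .reg callback per class instead of four function
pointers per event.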

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 05/10][RFC] tracing: Move fields from event to class structure
  2010-04-26 19:50 ` [PATCH 05/10][RFC] tracing: Move fields from event to class structure Steven Rostedt
@ 2010-04-28 20:58   ` Mathieu Desnoyers
  2010-04-29  0:02     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 20:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers, Tom Zanussi

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Move the defined fields from the event to the class structure.
> Since the fields of the event are defined by the class they belong
> to, it makes sense to have the class hold the information instead
> of the individual events. The events of the same class would just
> hold duplicate information.
> 
> After this change the size of the kernel dropped another 8K:
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5774316	1306580	9351592	16432488	 fabd68	vmlinux.reg
> 5774503	1297492	9351592	16423587	 fa9aa3	vmlinux.fields
> 
> Although the text increased, this was mainly due to the C files
> having to adapt to the change. This is a constant increase, where
> new tracepoints will not increase the Text. But the big drop is
> in the data size (as well as needed allocations to hold the fields).
> This will give even more savings as more tracepoints are created.
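
As a sanity check on the table above, the per-section deltas can be
recomputed from the quoted numbers. A minimal sketch in plain userspace
C; the struct and helper names are made up for illustration, and only
the numbers come from the size output:

```c
#include <assert.h>
#include <stdio.h>

/* One row of the quoted size output. Names are illustrative only. */
struct size_row {
	const char *file;
	long text, data, bss;
};

static const struct size_row rows[] = {
	{ "vmlinux.orig",   5788186, 1337252, 9351592 },
	{ "vmlinux.reg",    5774316, 1306580, 9351592 },
	{ "vmlinux.fields", 5774503, 1297492, 9351592 },
};

/* Total image size, i.e. the "dec" column. */
static long row_total(const struct size_row *r)
{
	return r->text + r->data + r->bss;
}

/* Print the change from one build to the next, section by section. */
static void print_delta(const struct size_row *from, const struct size_row *to)
{
	assert(from && to);
	printf("%s -> %s: text %+ld, data %+ld, total %+ld\n",
	       from->file, to->file,
	       to->text - from->text,
	       to->data - from->data,
	       row_total(to) - row_total(from));
}
```

Calling print_delta(&rows[1], &rows[2]) reports text +187, data -9088
and total -8901, which matches the "another 8K" figure: the small text
growth is outweighed by the drop in data.
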
> 
> Note, if just TRACE_EVENT()s are used and not DECLARE_EVENT_CLASS()
> with several DEFINE_EVENT()s, then the savings will be lost. But
> we are pushing developers to consolidate events with DEFINE_EVENT()
> so this should not be an issue.
> 
> The kprobes define a unique class for every new event, but are dynamic
> so it should not be an issue.
> 
> The syscalls however have a single class but the fields for the individual
> events are different. The syscalls use metadata to define the
> fields. I moved the fields list from the event to the metadata and
> added a "get_fields()" function to the class. This function is used
> to find the fields. For normal events and kprobes, get_fields() just
> returns a pointer to the fields list_head in the class. For syscall
> events, it returns the fields list_head in the metadata for the event.
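
The lookup described above can be sketched in userspace C as follows.
The structures are simplified stand-ins for ftrace_event_class,
ftrace_event_call and syscall_metadata, and syscall_get_fields() is an
assumed shape for the syscall override (it is not taken from the quoted
hunks); get_fields() mirrors the trace_get_fields() helper in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

struct event_call;

/* Simplified model of struct ftrace_event_class. */
struct event_class {
	struct list_head *(*get_fields)(struct event_call *call);
	struct list_head fields;
};

/* Simplified model of struct ftrace_event_call. */
struct event_call {
	struct event_class *class;
	void *data;	/* syscall events point this at their metadata */
};

/* Simplified model of struct syscall_metadata. */
struct syscall_meta {
	struct list_head fields;
};

/* Assumed syscall override: the list lives in the per-event metadata. */
static struct list_head *syscall_get_fields(struct event_call *call)
{
	struct syscall_meta *meta = call->data;

	return &meta->fields;
}

/* Same logic as trace_get_fields() in the patch. */
static struct list_head *get_fields(struct event_call *call)
{
	assert(call && call->class);
	if (!call->class->get_fields)
		return &call->class->fields;
	return call->class->get_fields(call);
}
```

With this layout every event sharing a normal class sees one shared
list, while two syscall events in the same class still get distinct
lists because each resolves through its own metadata.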

So, playing catch-up here: why don't we simply put each syscall event in
its own class? We could possibly share a class where it makes
sense (e.g. events with exactly the same fields).

With the per-class sub-metadata, what limitations should we
expect with these system call events? Can we map to a field size
directly from the event ID, or does the event size somehow have to be
encoded in the header to make sense of the payload?

Thanks,

Mathieu

> 
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Tom Zanussi <tzanussi@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/ftrace_event.h       |    5 ++-
>  include/linux/syscalls.h           |   12 +++++-----
>  include/trace/ftrace.h             |   10 +++++---
>  include/trace/syscall.h            |    3 +-
>  kernel/trace/trace.h               |    3 ++
>  kernel/trace/trace_events.c        |   43 +++++++++++++++++++++++++++++++-----
>  kernel/trace/trace_events_filter.c |   10 +++++---
>  kernel/trace/trace_export.c        |   14 ++++++------
>  kernel/trace/trace_kprobe.c        |    8 +++---
>  kernel/trace/trace_syscalls.c      |   23 ++++++++++++++++---
>  10 files changed, 92 insertions(+), 39 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index dd0051e..1e2c8f5 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -128,6 +128,9 @@ struct ftrace_event_class {
>  	void			*perf_probe;
>  	int			(*reg)(struct ftrace_event_call *event,
>  				       enum trace_reg type);
> +	int			(*define_fields)(struct ftrace_event_call *);
> +	struct list_head	*(*get_fields)(struct ftrace_event_call *);
> +	struct list_head	fields;
>  };
>  
>  struct ftrace_event_call {
> @@ -140,8 +143,6 @@ struct ftrace_event_call {
>  	int			id;
>  	const char		*print_fmt;
>  	int			(*raw_init)(struct ftrace_event_call *);
> -	int			(*define_fields)(struct ftrace_event_call *);
> -	struct list_head	fields;
>  	int			filter_active;
>  	struct event_filter	*filter;
>  	void			*mod;
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index e3348c4..ef4f81c 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -122,7 +122,7 @@ extern struct ftrace_event_class event_class_syscall_enter;
>  extern struct ftrace_event_class event_class_syscall_exit;
>  
>  #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
> -	static const struct syscall_metadata __syscall_meta_##sname;	\
> +	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_enter_##sname;		\
>  	static struct trace_event enter_syscall_print_##sname = {	\
> @@ -136,12 +136,11 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.class			= &event_class_syscall_enter,	\
>  		.event                  = &enter_syscall_print_##sname,	\
>  		.raw_init		= init_syscall_trace,		\
> -		.define_fields		= syscall_enter_define_fields,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
>  #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
> -	static const struct syscall_metadata __syscall_meta_##sname;	\
> +	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_exit_##sname;		\
>  	static struct trace_event exit_syscall_print_##sname = {	\
> @@ -155,14 +154,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.class			= &event_class_syscall_exit,	\
>  		.event                  = &exit_syscall_print_##sname,	\
>  		.raw_init		= init_syscall_trace,		\
> -		.define_fields		= syscall_exit_define_fields,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
>  #define SYSCALL_METADATA(sname, nb)				\
>  	SYSCALL_TRACE_ENTER_EVENT(sname);			\
>  	SYSCALL_TRACE_EXIT_EVENT(sname);			\
> -	static const struct syscall_metadata __used		\
> +	static struct syscall_metadata __used			\
>  	  __attribute__((__aligned__(4)))			\
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta_##sname = {				\
> @@ -172,12 +170,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.args		= args_##sname,			\
>  		.enter_event	= &event_enter_##sname,		\
>  		.exit_event	= &event_exit_##sname,		\
> +		.fields		= LIST_HEAD_INIT(__syscall_meta_##sname.fields), \
>  	};
>  
>  #define SYSCALL_DEFINE0(sname)					\
>  	SYSCALL_TRACE_ENTER_EVENT(_##sname);			\
>  	SYSCALL_TRACE_EXIT_EVENT(_##sname);			\
> -	static const struct syscall_metadata __used		\
> +	static struct syscall_metadata __used			\
>  	  __attribute__((__aligned__(4)))			\
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta__##sname = {				\
> @@ -185,6 +184,7 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.nb_args 	= 0,				\
>  		.enter_event	= &event_enter__##sname,	\
>  		.exit_event	= &event_exit__##sname,		\
> +		.fields		= LIST_HEAD_INIT(__syscall_meta__##sname.fields), \
>  	};							\
>  	asmlinkage long sys_##sname(void)
>  #else
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 62fe622..e6ec392 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -429,6 +429,9 @@ static inline notrace int ftrace_get_offsets_##call(			\
>   *
>   * static struct ftrace_event_class __used event_class_<template> = {
>   *	.system			= "<system>",
> + *	.define_fields		= ftrace_define_fields_<call>,
> +	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
> +	.probe			= ftrace_raw_event_##call,		\

The two added lines above are missing the leading " *" comment prefix.


>   * }
>   *
>   * static struct ftrace_event_call __used
> @@ -437,10 +440,8 @@ static inline notrace int ftrace_get_offsets_##call(			\
>   *	.name			= "<call>",
>   *	.class			= event_class_<template>,
>   *	.raw_init		= trace_event_raw_init,
> - *	.regfunc		= ftrace_reg_event_<call>,
> - *	.unregfunc		= ftrace_unreg_event_<call>,
> + *	.event			= &ftrace_event_type_<call>,
>   *	.print_fmt		= print_fmt_<call>,
> - *	.define_fields		= ftrace_define_fields_<call>,
>   * }
>   *
>   */
> @@ -552,6 +553,8 @@ _TRACE_PERF_PROTO(call, PARAMS(proto));					\
>  static const char print_fmt_##call[] = print;				\
>  static struct ftrace_event_class __used event_class_##call = {		\
>  	.system			= __stringify(TRACE_SYSTEM),		\
> +	.define_fields		= ftrace_define_fields_##call,		\
> +	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
>  	.probe			= ftrace_raw_event_##call,		\
>  	_TRACE_PERF_INIT(call)						\
>  }
> @@ -567,7 +570,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.event			= &ftrace_event_type_##call,		\
>  	.raw_init		= trace_event_raw_init,			\
>  	.print_fmt		= print_fmt_##template,			\
> -	.define_fields		= ftrace_define_fields_##template,	\
>  }
>  
>  #undef DEFINE_EVENT_PRINT
> diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> index e5e5f48..25087c3 100644
> --- a/include/trace/syscall.h
> +++ b/include/trace/syscall.h
> @@ -25,6 +25,7 @@ struct syscall_metadata {
>  	int		nb_args;
>  	const char	**types;
>  	const char	**args;
> +	struct list_head fields;
>  
>  	struct ftrace_event_call *enter_event;
>  	struct ftrace_event_call *exit_event;
> @@ -34,8 +35,6 @@ struct syscall_metadata {
>  extern unsigned long arch_syscall_addr(int nr);
>  extern int init_syscall_trace(struct ftrace_event_call *call);
>  
> -extern int syscall_enter_define_fields(struct ftrace_event_call *call);
> -extern int syscall_exit_define_fields(struct ftrace_event_call *call);
>  extern int reg_event_syscall_enter(struct ftrace_event_call *call);
>  extern void unreg_event_syscall_enter(struct ftrace_event_call *call);
>  extern int reg_event_syscall_exit(struct ftrace_event_call *call);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 2825ef2..ff63bee 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -771,6 +771,9 @@ extern void print_subsystem_event_filter(struct event_subsystem *system,
>  					 struct trace_seq *s);
>  extern int filter_assign_type(const char *type);
>  
> +struct list_head *
> +trace_get_fields(struct ftrace_event_call *event_call);
> +
>  static inline int
>  filter_check_discard(struct ftrace_event_call *call, void *rec,
>  		     struct ring_buffer *buffer,
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index f84cfcb..c31632e 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -28,11 +28,28 @@ DEFINE_MUTEX(event_mutex);
>  
>  LIST_HEAD(ftrace_events);
>  
> +static int fields_done(struct ftrace_event_call *event_call)
> +{
> +	return 0;
> +}
> +
> +struct list_head *
> +trace_get_fields(struct ftrace_event_call *event_call)
> +{
> +	if (!event_call->class->get_fields)
> +		return &event_call->class->fields;
> +	return event_call->class->get_fields(event_call);
> +}
> +
>  int trace_define_field(struct ftrace_event_call *call, const char *type,
>  		       const char *name, int offset, int size, int is_signed,
>  		       int filter_type)
>  {
>  	struct ftrace_event_field *field;
> +	struct list_head *head;
> +
> +	if (WARN_ON(!call->class) || call->class->define_fields == fields_done)
> +		return 0;
>  
>  	field = kzalloc(sizeof(*field), GFP_KERNEL);
>  	if (!field)
> @@ -55,7 +72,8 @@ int trace_define_field(struct ftrace_event_call *call, const char *type,
>  	field->size = size;
>  	field->is_signed = is_signed;
>  
> -	list_add(&field->link, &call->fields);
> +	head = trace_get_fields(call);
> +	list_add(&field->link, head);
>  
>  	return 0;
>  
> @@ -81,6 +99,9 @@ static int trace_define_common_fields(struct ftrace_event_call *call)
>  	int ret;
>  	struct trace_entry ent;
>  
> +	if (call->class->define_fields == fields_done)
> +		return 0;
> +
>  	__common_field(unsigned short, type);
>  	__common_field(unsigned char, flags);
>  	__common_field(unsigned char, preempt_count);
> @@ -93,8 +114,10 @@ static int trace_define_common_fields(struct ftrace_event_call *call)
>  void trace_destroy_fields(struct ftrace_event_call *call)
>  {
>  	struct ftrace_event_field *field, *next;
> +	struct list_head *head;
>  
> -	list_for_each_entry_safe(field, next, &call->fields, link) {
> +	head = trace_get_fields(call);
> +	list_for_each_entry_safe(field, next, head, link) {
>  		list_del(&field->link);
>  		kfree(field->type);
>  		kfree(field->name);
> @@ -110,7 +133,6 @@ int trace_event_raw_init(struct ftrace_event_call *call)
>  	if (!id)
>  		return -ENODEV;
>  	call->id = id;
> -	INIT_LIST_HEAD(&call->fields);
>  
>  	return 0;
>  }
> @@ -536,6 +558,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
>  {
>  	struct ftrace_event_call *call = filp->private_data;
>  	struct ftrace_event_field *field;
> +	struct list_head *head;
>  	struct trace_seq *s;
>  	int common_field_count = 5;
>  	char *buf;
> @@ -554,7 +577,8 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
>  	trace_seq_printf(s, "ID: %d\n", call->id);
>  	trace_seq_printf(s, "format:\n");
>  
> -	list_for_each_entry_reverse(field, &call->fields, link) {
> +	head = trace_get_fields(call);
> +	list_for_each_entry_reverse(field, head, link) {
>  		/*
>  		 * Smartly shows the array type(except dynamic array).
>  		 * Normal:
> @@ -954,10 +978,10 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
>  		trace_create_file("id", 0444, call->dir, call,
>  		 		  id);
>  
> -	if (call->define_fields) {
> +	if (call->class->define_fields) {
>  		ret = trace_define_common_fields(call);
>  		if (!ret)
> -			ret = call->define_fields(call);
> +			ret = call->class->define_fields(call);
>  		if (ret < 0) {
>  			pr_warning("Could not initialize trace point"
>  				   " events/%s\n", call->name);
> @@ -965,6 +989,13 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
>  		}
>  		trace_create_file("filter", 0644, call->dir, call,
>  				  filter);
> +
> +		/*
> +		 * Other events with the same class will call
> +		 * define fields again, Set the define_fields
> +		 * to a stub, and it will be skipped.
> +		 */
> +		call->class->define_fields = fields_done;
>  	}
>  
>  	trace_create_file("format", 0444, call->dir, call,
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index 22fa89f..560683d 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -499,8 +499,10 @@ static struct ftrace_event_field *
>  find_event_field(struct ftrace_event_call *call, char *name)
>  {
>  	struct ftrace_event_field *field;
> +	struct list_head *head;
>  
> -	list_for_each_entry(field, &call->fields, link) {
> +	head = trace_get_fields(call);
> +	list_for_each_entry(field, head, link) {
>  		if (!strcmp(field->name, name))
>  			return field;
>  	}
> @@ -624,7 +626,7 @@ static int init_subsystem_preds(struct event_subsystem *system)
>  	int err;
>  
>  	list_for_each_entry(call, &ftrace_events, list) {
> -		if (!call->define_fields)
> +		if (!call->class || !call->class->define_fields)
>  			continue;
>  
>  		if (strcmp(call->class->system, system->name) != 0)
> @@ -643,7 +645,7 @@ static void filter_free_subsystem_preds(struct event_subsystem *system)
>  	struct ftrace_event_call *call;
>  
>  	list_for_each_entry(call, &ftrace_events, list) {
> -		if (!call->define_fields)
> +		if (!call->class || !call->class->define_fields)
>  			continue;
>  
>  		if (strcmp(call->class->system, system->name) != 0)
> @@ -1248,7 +1250,7 @@ static int replace_system_preds(struct event_subsystem *system,
>  	list_for_each_entry(call, &ftrace_events, list) {
>  		struct event_filter *filter = call->filter;
>  
> -		if (!call->define_fields)
> +		if (!call->class || !call->class->define_fields)
>  			continue;
>  
>  		if (strcmp(call->class->system, system->name) != 0)
> diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
> index 7f16e21..e700a0c 100644
> --- a/kernel/trace/trace_export.c
> +++ b/kernel/trace/trace_export.c
> @@ -18,10 +18,6 @@
>  #undef TRACE_SYSTEM
>  #define TRACE_SYSTEM	ftrace
>  
> -struct ftrace_event_class event_class_ftrace = {
> -	.system			= __stringify(TRACE_SYSTEM),
> -};
> -
>  /* not needed for this file */
>  #undef __field_struct
>  #define __field_struct(type, item)
> @@ -131,7 +127,7 @@ ftrace_define_fields_##name(struct ftrace_event_call *event_call)	\
>  
>  static int ftrace_raw_init_event(struct ftrace_event_call *call)
>  {
> -	INIT_LIST_HEAD(&call->fields);
> +	INIT_LIST_HEAD(&call->class->fields);
>  	return 0;
>  }
>  
> @@ -159,15 +155,19 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
>  #undef FTRACE_ENTRY
>  #define FTRACE_ENTRY(call, struct_name, type, tstruct, print)		\
>  									\
> +struct ftrace_event_class event_class_ftrace_##call = {			\
> +	.system			= __stringify(TRACE_SYSTEM),		\
> +	.define_fields		= ftrace_define_fields_##call,		\
> +};									\
> +									\
>  struct ftrace_event_call __used						\
>  __attribute__((__aligned__(4)))						\
>  __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.id			= type,					\
> -	.class			= &event_class_ftrace,			\
> +	.class			= &event_class_ftrace_##call,		\
>  	.raw_init		= ftrace_raw_init_event,		\
>  	.print_fmt		= print,				\
> -	.define_fields		= ftrace_define_fields_##call,		\
>  };									\
>  
>  #include "trace_entries.h"
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index f8af21a..b14bf74 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1112,8 +1112,6 @@ static void probe_event_disable(struct ftrace_event_call *call)
>  
>  static int probe_event_raw_init(struct ftrace_event_call *event_call)
>  {
> -	INIT_LIST_HEAD(&event_call->fields);
> -
>  	return 0;
>  }
>  
> @@ -1362,11 +1360,13 @@ static int register_probe_event(struct trace_probe *tp)
>  	if (probe_is_return(tp)) {
>  		tp->event.trace = print_kretprobe_event;
>  		call->raw_init = probe_event_raw_init;
> -		call->define_fields = kretprobe_event_define_fields;
> +		INIT_LIST_HEAD(&call->class->fields);
> +		call->class->define_fields = kretprobe_event_define_fields;
>  	} else {
>  		tp->event.trace = print_kprobe_event;
>  		call->raw_init = probe_event_raw_init;
> -		call->define_fields = kprobe_event_define_fields;
> +		INIT_LIST_HEAD(&call->class->fields);
> +		call->class->define_fields = kprobe_event_define_fields;
>  	}
>  	if (set_print_fmt(tp) < 0)
>  		return -ENOMEM;
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index c92934d..eb535ba 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -19,14 +19,29 @@ static int syscall_enter_register(struct ftrace_event_call *event,
>  static int syscall_exit_register(struct ftrace_event_call *event,
>  				 enum trace_reg type);
>  
> +static int syscall_enter_define_fields(struct ftrace_event_call *call);
> +static int syscall_exit_define_fields(struct ftrace_event_call *call);
> +
> +static struct list_head *
> +syscall_get_fields(struct ftrace_event_call *call)
> +{
> +	struct syscall_metadata *entry = call->data;
> +
> +	return &entry->fields;
> +}
> +
>  struct ftrace_event_class event_class_syscall_enter = {
>  	.system			= "syscalls",
> -	.reg			= syscall_enter_register
> +	.reg			= syscall_enter_register,
> +	.define_fields		= syscall_enter_define_fields,
> +	.get_fields		= syscall_get_fields,
>  };
>  
>  struct ftrace_event_class event_class_syscall_exit = {
>  	.system			= "syscalls",
> -	.reg			= syscall_exit_register
> +	.reg			= syscall_exit_register,
> +	.define_fields		= syscall_exit_define_fields,
> +	.get_fields		= syscall_get_fields,
>  };
>  
>  extern unsigned long __start_syscalls_metadata[];
> @@ -219,7 +234,7 @@ static void free_syscall_print_fmt(struct ftrace_event_call *call)
>  		kfree(call->print_fmt);
>  }
>  
> -int syscall_enter_define_fields(struct ftrace_event_call *call)
> +static int syscall_enter_define_fields(struct ftrace_event_call *call)
>  {
>  	struct syscall_trace_enter trace;
>  	struct syscall_metadata *meta = call->data;
> @@ -242,7 +257,7 @@ int syscall_enter_define_fields(struct ftrace_event_call *call)
>  	return ret;
>  }
>  
> -int syscall_exit_define_fields(struct ftrace_event_call *call)
> +static int syscall_exit_define_fields(struct ftrace_event_call *call)
>  {
>  	struct syscall_trace_exit trace;
>  	int ret;
> -- 
> 1.7.0
> 
> 
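
For anyone following along, the "define once per class" flow can be sketched
in plain user-space C. This is only an illustration under my own assumptions,
not the kernel code: the demo_* names are made up, and an integer counter
stands in for the list of fields. It mirrors how trace_get_fields() falls back
to the class-wide list and how event_create_dir() swaps define_fields for the
fields_done stub after the first event of a class has run it.

```c
#include <assert.h>
#include <stddef.h>

struct demo_call;

/* Hypothetical, simplified stand-in for struct ftrace_event_class. */
struct demo_class {
	int (*define_fields)(struct demo_call *call);
	int *(*get_fields)(struct demo_call *call);	/* optional override */
	int fields;		/* counter standing in for the field list */
};

/* Hypothetical, simplified stand-in for struct ftrace_event_call. */
struct demo_call {
	const char *name;
	struct demo_class *class;
};

/* Stub installed once the class has defined its fields. */
static int demo_fields_done(struct demo_call *call)
{
	(void)call;
	return 0;
}

/* Use the class-wide list unless the class supplies its own lookup. */
static int *demo_get_fields(struct demo_call *call)
{
	if (!call->class->get_fields)
		return &call->class->fields;
	return call->class->get_fields(call);
}

/* One real definition pass: record one field in whichever list applies. */
static int demo_define_fields(struct demo_call *call)
{
	(*demo_get_fields(call))++;
	return 0;
}

/* Define the fields, then swap in the stub so later events skip the work. */
static int demo_create_event(struct demo_call *call)
{
	int ret;

	assert(call->class != NULL);
	if (!call->class->define_fields)
		return 0;
	ret = call->class->define_fields(call);
	if (ret < 0)
		return ret;
	call->class->define_fields = demo_fields_done;
	return 0;
}
```

With two demo_call instances pointing at the same demo_class, the second
demo_create_event() call only reaches the stub, so the shared list is filled
once.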

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 06/10][RFC] tracing: Move raw_init from events to class
  2010-04-26 19:50 ` [PATCH 06/10][RFC] tracing: Move raw_init from events to class Steven Rostedt
@ 2010-04-28 21:00   ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 21:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> The raw_init function pointer in the event is used to initialize
> various kinds of events. The type of initialization needed usually
> depends on the kind of event it is.
> 
> Two events with the same class will always have the same initialization
> function, so it makes sense to move this to the class structure.
> 
> Perhaps even making a special system structure would work since
> the initialization is the same for all events within a system.
> But since there's no system structure (yet), this will just move it
> to the class.
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5774567	1297492	9351592	16423651	 fa9ae3	vmlinux.fields
> 5774510	1293204	9351592	16419306	 fa89ea	vmlinux.init
> 
> The text grew very slightly, but this is a constant growth that happened
> with the changing of the C files that call the init code.
> The bigger savings is in the data, and it grows the more events share
> a class.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
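
As a rough user-space sketch of why this saves data (my own simplification,
not the kernel structures; every demo_* name and the event names are
illustrative assumptions): the init pointer is stored once in the class, and
every event that shares the class reaches it through call->class.

```c
#include <assert.h>
#include <stddef.h>

struct demo_call;

/* Hypothetical class: holds what is common to all events of one kind. */
struct demo_class {
	const char *system;
	int (*raw_init)(struct demo_call *call);
};

/* Hypothetical event: no raw_init pointer of its own any more. */
struct demo_call {
	const char *name;
	struct demo_class *class;
	int initialized;
};

static int demo_raw_init(struct demo_call *call)
{
	call->initialized = 1;
	return 0;
}

/* One class, one copy of the init pointer. */
static struct demo_class demo_class_sched = {
	.system		= "sched",
	.raw_init	= demo_raw_init,
};

/* Two events sharing that class. */
static struct demo_call demo_events[] = {
	{ .name = "sched_wakeup",     .class = &demo_class_sched },
	{ .name = "sched_wakeup_new", .class = &demo_class_sched },
};

/* Same shape as the updated callers: go through the class. */
static int demo_add_event(struct demo_call *call)
{
	assert(call->class != NULL);
	if (call->class->raw_init)
		return call->class->raw_init(call);
	return 0;
}
```

Each additional event in the class then costs one pointer less than before,
which is where the data savings in the size table would come from.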

> ---
>  include/linux/ftrace_event.h  |    2 +-
>  include/linux/syscalls.h      |    2 --
>  include/trace/ftrace.h        |    8 ++++----
>  kernel/trace/trace_events.c   |   12 ++++++------
>  kernel/trace/trace_export.c   |    2 +-
>  kernel/trace/trace_kprobe.c   |    6 +++---
>  kernel/trace/trace_syscalls.c |    2 ++
>  7 files changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 1e2c8f5..655de69 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -131,6 +131,7 @@ struct ftrace_event_class {
>  	int			(*define_fields)(struct ftrace_event_call *);
>  	struct list_head	*(*get_fields)(struct ftrace_event_call *);
>  	struct list_head	fields;
> +	int			(*raw_init)(struct ftrace_event_call *);
>  };
>  
>  struct ftrace_event_call {
> @@ -142,7 +143,6 @@ struct ftrace_event_call {
>  	int			enabled;
>  	int			id;
>  	const char		*print_fmt;
> -	int			(*raw_init)(struct ftrace_event_call *);
>  	int			filter_active;
>  	struct event_filter	*filter;
>  	void			*mod;
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index ef4f81c..a0db1e8 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -135,7 +135,6 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.name                   = "sys_enter"#sname,		\
>  		.class			= &event_class_syscall_enter,	\
>  		.event                  = &enter_syscall_print_##sname,	\
> -		.raw_init		= init_syscall_trace,		\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
> @@ -153,7 +152,6 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  		.name                   = "sys_exit"#sname,		\
>  		.class			= &event_class_syscall_exit,	\
>  		.event                  = &exit_syscall_print_##sname,	\
> -		.raw_init		= init_syscall_trace,		\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index e6ec392..de0d96c 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -430,8 +430,9 @@ static inline notrace int ftrace_get_offsets_##call(			\
>   * static struct ftrace_event_class __used event_class_<template> = {
>   *	.system			= "<system>",
>   *	.define_fields		= ftrace_define_fields_<call>,
> -	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
> -	.probe			= ftrace_raw_event_##call,		\
> + *	.fields			= LIST_HEAD_INIT(event_class_##call.fields),
> + *	.raw_init		= trace_event_raw_init,
> + *	.probe			= ftrace_raw_event_##call,
>   * }
>   *
>   * static struct ftrace_event_call __used
> @@ -439,7 +440,6 @@ static inline notrace int ftrace_get_offsets_##call(			\
>   * __attribute__((section("_ftrace_events"))) event_<call> = {
>   *	.name			= "<call>",
>   *	.class			= event_class_<template>,
> - *	.raw_init		= trace_event_raw_init,
>   *	.event			= &ftrace_event_type_<call>,
>   *	.print_fmt		= print_fmt_<call>,
>   * }
> @@ -555,6 +555,7 @@ static struct ftrace_event_class __used event_class_##call = {		\
>  	.system			= __stringify(TRACE_SYSTEM),		\
>  	.define_fields		= ftrace_define_fields_##call,		\
>  	.fields			= LIST_HEAD_INIT(event_class_##call.fields),\
> +	.raw_init		= trace_event_raw_init,			\
>  	.probe			= ftrace_raw_event_##call,		\
>  	_TRACE_PERF_INIT(call)						\
>  }
> @@ -568,7 +569,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.class			= &event_class_##template,		\
>  	.event			= &ftrace_event_type_##call,		\
> -	.raw_init		= trace_event_raw_init,			\
>  	.print_fmt		= print_fmt_##template,			\
>  }
>  
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index c31632e..c34a9bd 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1012,8 +1012,8 @@ static int __trace_add_event_call(struct ftrace_event_call *call)
>  	if (!call->name)
>  		return -EINVAL;
>  
> -	if (call->raw_init) {
> -		ret = call->raw_init(call);
> +	if (call->class->raw_init) {
> +		ret = call->class->raw_init(call);
>  		if (ret < 0) {
>  			if (ret != -ENOSYS)
>  				pr_warning("Could not initialize trace "
> @@ -1174,8 +1174,8 @@ static void trace_module_add_events(struct module *mod)
>  		/* The linker may leave blanks */
>  		if (!call->name)
>  			continue;
> -		if (call->raw_init) {
> -			ret = call->raw_init(call);
> +		if (call->class->raw_init) {
> +			ret = call->class->raw_init(call);
>  			if (ret < 0) {
>  				if (ret != -ENOSYS)
>  					pr_warning("Could not initialize trace "
> @@ -1328,8 +1328,8 @@ static __init int event_trace_init(void)
>  		/* The linker may leave blanks */
>  		if (!call->name)
>  			continue;
> -		if (call->raw_init) {
> -			ret = call->raw_init(call);
> +		if (call->class->raw_init) {
> +			ret = call->class->raw_init(call);
>  			if (ret < 0) {
>  				if (ret != -ENOSYS)
>  					pr_warning("Could not initialize trace "
> diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
> index e700a0c..e878d06 100644
> --- a/kernel/trace/trace_export.c
> +++ b/kernel/trace/trace_export.c
> @@ -158,6 +158,7 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
>  struct ftrace_event_class event_class_ftrace_##call = {			\
>  	.system			= __stringify(TRACE_SYSTEM),		\
>  	.define_fields		= ftrace_define_fields_##call,		\
> +	.raw_init		= ftrace_raw_init_event,		\
>  };									\
>  									\
>  struct ftrace_event_call __used						\
> @@ -166,7 +167,6 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.id			= type,					\
>  	.class			= &event_class_ftrace_##call,		\
> -	.raw_init		= ftrace_raw_init_event,		\
>  	.print_fmt		= print,				\
>  };									\
>  
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index b14bf74..428f4a5 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1359,13 +1359,13 @@ static int register_probe_event(struct trace_probe *tp)
>  	/* Initialize ftrace_event_call */
>  	if (probe_is_return(tp)) {
>  		tp->event.trace = print_kretprobe_event;
> -		call->raw_init = probe_event_raw_init;
>  		INIT_LIST_HEAD(&call->class->fields);
> +		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kretprobe_event_define_fields;
>  	} else {
> -		tp->event.trace = print_kprobe_event;
> -		call->raw_init = probe_event_raw_init;
>  		INIT_LIST_HEAD(&call->class->fields);
> +		tp->event.trace = print_kprobe_event;
> +		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kprobe_event_define_fields;
>  	}
>  	if (set_print_fmt(tp) < 0)
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index eb535ba..7ee6086 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -35,6 +35,7 @@ struct ftrace_event_class event_class_syscall_enter = {
>  	.reg			= syscall_enter_register,
>  	.define_fields		= syscall_enter_define_fields,
>  	.get_fields		= syscall_get_fields,
> +	.raw_init		= init_syscall_trace,
>  };
>  
>  struct ftrace_event_class event_class_syscall_exit = {
> @@ -42,6 +43,7 @@ struct ftrace_event_class event_class_syscall_exit = {
>  	.reg			= syscall_exit_register,
>  	.define_fields		= syscall_exit_define_fields,
>  	.get_fields		= syscall_get_fields,
> +	.raw_init		= init_syscall_trace,
>  };
>  
>  extern unsigned long __start_syscalls_metadata[];
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 07/10][RFC] tracing: Allow events to share their print functions
  2010-04-26 19:50 ` [PATCH 07/10][RFC] tracing: Allow events to share their print functions Steven Rostedt
@ 2010-04-28 21:03   ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 21:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Multiple events may use the same method to print their data.
> Instead of having all events have a pointer to their print functions,
> the trace_event structure now points to a trace_event_functions structure
> that will hold the way to print out the event.
> 
> The event itself is now passed to the print function to let the print
> function know what kind of event it should print.
> 
> This opens the door to consolidating the way several events print
> their output.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Makes sense,

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
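
To make the consolidation concrete, here is a minimal user-space sketch under
my own assumptions (the demo_* names are hypothetical, and a char buffer
replaces the trace_iterator): two events point at one shared function table,
and the print callback uses the event pointer it is handed to tell them apart.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

struct demo_event;

/* The callback now receives the event it is printing. */
typedef int (*demo_print_func)(char *buf, size_t len,
			       struct demo_event *event);

/* Hypothetical stand-in for struct trace_event_functions. */
struct demo_event_functions {
	demo_print_func trace;
};

/* Hypothetical stand-in for struct trace_event. */
struct demo_event {
	int type;
	const char *name;
	struct demo_event_functions *funcs;
};

/* One printer serves every event that shares the table. */
static int demo_print(char *buf, size_t len, struct demo_event *event)
{
	return snprintf(buf, len, "%s: type=%d", event->name, event->type);
}

static struct demo_event_functions demo_shared_funcs = {
	.trace = demo_print,
};

static struct demo_event demo_open = {
	.type = 1, .name = "demo_open", .funcs = &demo_shared_funcs,
};

static struct demo_event demo_close = {
	.type = 2, .name = "demo_close", .funcs = &demo_shared_funcs,
};

/* Dispatch through the table, passing the event along. */
static int demo_output(char *buf, size_t len, struct demo_event *event)
{
	assert(event->funcs != NULL);
	return event->funcs->trace(buf, len, event);
}
```

Passing the event in is also what lets a printer like print_kprobe_event()
drop its own ftrace_find_event() lookup and go straight to container_of().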


> ---
>  include/linux/ftrace_event.h         |   17 +++-
>  include/linux/syscalls.h             |   10 ++-
>  include/trace/ftrace.h               |   12 ++-
>  include/trace/syscall.h              |    6 +-
>  kernel/trace/blktrace.c              |   13 ++-
>  kernel/trace/kmemtrace.c             |   28 +++++--
>  kernel/trace/trace.c                 |    9 +-
>  kernel/trace/trace_functions_graph.c |    2 +-
>  kernel/trace/trace_kprobe.c          |   22 ++++--
>  kernel/trace/trace_output.c          |  137 +++++++++++++++++++++++-----------
>  kernel/trace/trace_output.h          |    2 +-
>  kernel/trace/trace_syscalls.c        |    6 +-
>  12 files changed, 178 insertions(+), 86 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 655de69..09c2ad7 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -70,18 +70,25 @@ struct trace_iterator {
>  };
>  
>  
> +struct trace_event;
> +
>  typedef enum print_line_t (*trace_print_func)(struct trace_iterator *iter,
> -					      int flags);
> -struct trace_event {
> -	struct hlist_node	node;
> -	struct list_head	list;
> -	int			type;
> +				      int flags, struct trace_event *event);
> +
> +struct trace_event_functions {
>  	trace_print_func	trace;
>  	trace_print_func	raw;
>  	trace_print_func	hex;
>  	trace_print_func	binary;
>  };
>  
> +struct trace_event {
> +	struct hlist_node		node;
> +	struct list_head		list;
> +	int				type;
> +	struct trace_event_functions	*funcs;
> +};
> +
>  extern int register_ftrace_event(struct trace_event *event);
>  extern int unregister_ftrace_event(struct trace_event *event);
>  
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index a0db1e8..f3892e9 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -125,9 +125,12 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_enter_##sname;		\
> -	static struct trace_event enter_syscall_print_##sname = {	\
> +	static struct trace_event_functions enter_syscall_print_funcs_##sname = { \
>  		.trace                  = print_syscall_enter,		\
>  	};								\
> +	static struct trace_event enter_syscall_print_##sname = {	\
> +		.funcs                  = &enter_syscall_print_funcs_##sname, \
> +	};								\
>  	static struct ftrace_event_call __used				\
>  	  __attribute__((__aligned__(4)))				\
>  	  __attribute__((section("_ftrace_events")))			\
> @@ -142,9 +145,12 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_exit_##sname;		\
> -	static struct trace_event exit_syscall_print_##sname = {	\
> +	static struct trace_event_functions exit_syscall_print_funcs_##sname = { \
>  		.trace                  = print_syscall_exit,		\
>  	};								\
> +	static struct trace_event exit_syscall_print_##sname = {	\
> +		.funcs                  = &exit_syscall_print_funcs_##sname, \
> +	};								\
>  	static struct ftrace_event_call __used				\
>  	  __attribute__((__aligned__(4)))				\
>  	  __attribute__((section("_ftrace_events")))			\
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index de0d96c..2efb301 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -239,7 +239,8 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
>  #undef DEFINE_EVENT
>  #define DEFINE_EVENT(template, name, proto, args)			\
>  static notrace enum print_line_t					\
> -ftrace_raw_output_##name(struct trace_iterator *iter, int flags)	\
> +ftrace_raw_output_##name(struct trace_iterator *iter, int flags,	\
> +			 struct trace_event *event)			\
>  {									\
>  	return ftrace_raw_output_id_##template(event_##name.id,		\
>  					       #name, iter, flags);	\
> @@ -248,7 +249,8 @@ ftrace_raw_output_##name(struct trace_iterator *iter, int flags)	\
>  #undef DEFINE_EVENT_PRINT
>  #define DEFINE_EVENT_PRINT(template, call, proto, args, print)		\
>  static notrace enum print_line_t					\
> -ftrace_raw_output_##call(struct trace_iterator *iter, int flags)	\
> +ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
> +			 struct trace_event *event)			\
>  {									\
>  	struct trace_seq *s = &iter->seq;				\
>  	struct ftrace_raw_##template *field;				\
> @@ -525,9 +527,11 @@ ftrace_raw_event_##call(proto,						\
>  
>  #undef DEFINE_EVENT
>  #define DEFINE_EVENT(template, call, proto, args)			\
> -									\
> -static struct trace_event ftrace_event_type_##call = {			\
> +static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
>  	.trace			= ftrace_raw_output_##call,		\
> +};									\
> +static struct trace_event ftrace_event_type_##call = {			\
> +	.funcs			= &ftrace_event_type_funcs_##call,	\
>  };
>  
>  #undef DEFINE_EVENT_PRINT
> diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> index 25087c3..f0eaa45 100644
> --- a/include/trace/syscall.h
> +++ b/include/trace/syscall.h
> @@ -41,8 +41,10 @@ extern int reg_event_syscall_exit(struct ftrace_event_call *call);
>  extern void unreg_event_syscall_exit(struct ftrace_event_call *call);
>  extern int
>  ftrace_format_syscall(struct ftrace_event_call *call, struct trace_seq *s);
> -enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
> -enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags);
> +enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags,
> +				      struct trace_event *event);
> +enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags,
> +				     struct trace_event *event);
>  #endif
>  
>  #ifdef CONFIG_PERF_EVENTS
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 07f945a..2737c70 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1320,7 +1320,7 @@ out:
>  }
>  
>  static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
> -					       int flags)
> +					       int flags, struct trace_event *event)
>  {
>  	return print_one_line(iter, false);
>  }
> @@ -1342,7 +1342,8 @@ static int blk_trace_synthesize_old_trace(struct trace_iterator *iter)
>  }
>  
>  static enum print_line_t
> -blk_trace_event_print_binary(struct trace_iterator *iter, int flags)
> +blk_trace_event_print_binary(struct trace_iterator *iter, int flags,
> +			     struct trace_event *event)
>  {
>  	return blk_trace_synthesize_old_trace(iter) ?
>  			TRACE_TYPE_HANDLED : TRACE_TYPE_PARTIAL_LINE;
> @@ -1380,12 +1381,16 @@ static struct tracer blk_tracer __read_mostly = {
>  	.set_flag	= blk_tracer_set_flag,
>  };
>  
> -static struct trace_event trace_blk_event = {
> -	.type		= TRACE_BLK,
> +static struct trace_event_functions trace_blk_event_funcs = {
>  	.trace		= blk_trace_event_print,
>  	.binary		= blk_trace_event_print_binary,
>  };
>  
> +static struct trace_event trace_blk_event = {
> +	.type		= TRACE_BLK,
> +	.funcs		= &trace_blk_event_funcs,
> +};
> +
>  static int __init init_blk_tracer(void)
>  {
>  	if (!register_ftrace_event(&trace_blk_event)) {
> diff --git a/kernel/trace/kmemtrace.c b/kernel/trace/kmemtrace.c
> index a91da69..6a24fe0 100644
> --- a/kernel/trace/kmemtrace.c
> +++ b/kernel/trace/kmemtrace.c
> @@ -237,7 +237,8 @@ struct kmemtrace_user_event_alloc {
>  };
>  
>  static enum print_line_t
> -kmemtrace_print_alloc(struct trace_iterator *iter, int flags)
> +kmemtrace_print_alloc(struct trace_iterator *iter, int flags,
> +		      struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct kmemtrace_alloc_entry *entry;
> @@ -257,7 +258,8 @@ kmemtrace_print_alloc(struct trace_iterator *iter, int flags)
>  }
>  
>  static enum print_line_t
> -kmemtrace_print_free(struct trace_iterator *iter, int flags)
> +kmemtrace_print_free(struct trace_iterator *iter, int flags,
> +		     struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct kmemtrace_free_entry *entry;
> @@ -275,7 +277,8 @@ kmemtrace_print_free(struct trace_iterator *iter, int flags)
>  }
>  
>  static enum print_line_t
> -kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags)
> +kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags,
> +			   struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct kmemtrace_alloc_entry *entry;
> @@ -309,7 +312,8 @@ kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags)
>  }
>  
>  static enum print_line_t
> -kmemtrace_print_free_user(struct trace_iterator *iter, int flags)
> +kmemtrace_print_free_user(struct trace_iterator *iter, int flags,
> +			  struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct kmemtrace_free_entry *entry;
> @@ -463,18 +467,26 @@ static enum print_line_t kmemtrace_print_line(struct trace_iterator *iter)
>  	}
>  }
>  
> -static struct trace_event kmem_trace_alloc = {
> -	.type			= TRACE_KMEM_ALLOC,
> +static struct trace_event_functions kmem_trace_alloc_funcs = {
>  	.trace			= kmemtrace_print_alloc,
>  	.binary			= kmemtrace_print_alloc_user,
>  };
>  
> -static struct trace_event kmem_trace_free = {
> -	.type			= TRACE_KMEM_FREE,
> +static struct trace_event kmem_trace_alloc = {
> +	.type			= TRACE_KMEM_ALLOC,
> +	.funcs			= &kmem_trace_alloc_funcs,
> +};
> +
> +static struct trace_event_functions kmem_trace_free_funcs = {
>  	.trace			= kmemtrace_print_free,
>  	.binary			= kmemtrace_print_free_user,
>  };
>  
> +static struct trace_event kmem_trace_free = {
> +	.type			= TRACE_KMEM_FREE,
> +	.funcs			= &kmem_trace_free_funcs,
> +};
> +
>  static struct tracer kmem_tracer __read_mostly = {
>  	.name			= "kmemtrace",
>  	.init			= kmem_trace_init,
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index b9be232..427e074 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1924,7 +1924,7 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
>  	}
>  
>  	if (event)
> -		return event->trace(iter, sym_flags);
> +		return event->funcs->trace(iter, sym_flags, event);
>  
>  	if (!trace_seq_printf(s, "Unknown type %d\n", entry->type))
>  		goto partial;
> @@ -1950,7 +1950,7 @@ static enum print_line_t print_raw_fmt(struct trace_iterator *iter)
>  
>  	event = ftrace_find_event(entry->type);
>  	if (event)
> -		return event->raw(iter, 0);
> +		return event->funcs->raw(iter, 0, event);
>  
>  	if (!trace_seq_printf(s, "%d ?\n", entry->type))
>  		goto partial;
> @@ -1977,7 +1977,7 @@ static enum print_line_t print_hex_fmt(struct trace_iterator *iter)
>  
>  	event = ftrace_find_event(entry->type);
>  	if (event) {
> -		enum print_line_t ret = event->hex(iter, 0);
> +		enum print_line_t ret = event->funcs->hex(iter, 0, event);
>  		if (ret != TRACE_TYPE_HANDLED)
>  			return ret;
>  	}
> @@ -2002,7 +2002,8 @@ static enum print_line_t print_bin_fmt(struct trace_iterator *iter)
>  	}
>  
>  	event = ftrace_find_event(entry->type);
> -	return event ? event->binary(iter, 0) : TRACE_TYPE_HANDLED;
> +	return event ? event->funcs->binary(iter, 0, event) :
> +		TRACE_TYPE_HANDLED;
>  }
>  
>  static int trace_empty(struct trace_iterator *iter)
> diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> index a7f75fb..c620763 100644
> --- a/kernel/trace/trace_functions_graph.c
> +++ b/kernel/trace/trace_functions_graph.c
> @@ -1020,7 +1020,7 @@ print_graph_comment(struct trace_seq *s,  struct trace_entry *ent,
>  		if (!event)
>  			return TRACE_TYPE_UNHANDLED;
>  
> -		ret = event->trace(iter, sym_flags);
> +		ret = event->funcs->trace(iter, sym_flags, event);
>  		if (ret != TRACE_TYPE_HANDLED)
>  			return ret;
>  	}
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 428f4a5..b989ae2 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1011,16 +1011,15 @@ static __kprobes void kretprobe_trace_func(struct kretprobe_instance *ri,
>  
>  /* Event entry printers */
>  enum print_line_t
> -print_kprobe_event(struct trace_iterator *iter, int flags)
> +print_kprobe_event(struct trace_iterator *iter, int flags,
> +		   struct trace_event *event)
>  {
>  	struct kprobe_trace_entry *field;
>  	struct trace_seq *s = &iter->seq;
> -	struct trace_event *event;
>  	struct trace_probe *tp;
>  	int i;
>  
>  	field = (struct kprobe_trace_entry *)iter->ent;
> -	event = ftrace_find_event(field->ent.type);
>  	tp = container_of(event, struct trace_probe, event);
>  
>  	if (!trace_seq_printf(s, "%s: (", tp->call.name))
> @@ -1046,16 +1045,15 @@ partial:
>  }
>  
>  enum print_line_t
> -print_kretprobe_event(struct trace_iterator *iter, int flags)
> +print_kretprobe_event(struct trace_iterator *iter, int flags,
> +		      struct trace_event *event)
>  {
>  	struct kretprobe_trace_entry *field;
>  	struct trace_seq *s = &iter->seq;
> -	struct trace_event *event;
>  	struct trace_probe *tp;
>  	int i;
>  
>  	field = (struct kretprobe_trace_entry *)iter->ent;
> -	event = ftrace_find_event(field->ent.type);
>  	tp = container_of(event, struct trace_probe, event);
>  
>  	if (!trace_seq_printf(s, "%s: (", tp->call.name))
> @@ -1351,6 +1349,14 @@ int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
>  	return 0;	/* We don't tweek kernel, so just return 0 */
>  }
>  
> +static struct trace_event_functions kretprobe_funcs = {
> +	.trace		= print_kretprobe_event
> +};
> +
> +static struct trace_event_functions kprobe_funcs = {
> +	.trace		= print_kprobe_event
> +};
> +
>  static int register_probe_event(struct trace_probe *tp)
>  {
>  	struct ftrace_event_call *call = &tp->call;
> @@ -1358,13 +1364,13 @@ static int register_probe_event(struct trace_probe *tp)
>  
>  	/* Initialize ftrace_event_call */
>  	if (probe_is_return(tp)) {
> -		tp->event.trace = print_kretprobe_event;
> +		tp->event.funcs = &kretprobe_funcs;
>  		INIT_LIST_HEAD(&call->class->fields);
>  		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kretprobe_event_define_fields;
>  	} else {
>  		INIT_LIST_HEAD(&call->class->fields);
> -		tp->event.trace = print_kprobe_event;
> +		tp->event.funcs = &kprobe_funcs;
>  		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kprobe_event_define_fields;
>  	}
> diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
> index 8e46b33..9c00283 100644
> --- a/kernel/trace/trace_output.c
> +++ b/kernel/trace/trace_output.c
> @@ -726,6 +726,9 @@ int register_ftrace_event(struct trace_event *event)
>  	if (WARN_ON(!event))
>  		goto out;
>  
> +	if (WARN_ON(!event->funcs))
> +		goto out;
> +
>  	INIT_LIST_HEAD(&event->list);
>  
>  	if (!event->type) {
> @@ -758,14 +761,14 @@ int register_ftrace_event(struct trace_event *event)
>  			goto out;
>  	}
>  
> -	if (event->trace == NULL)
> -		event->trace = trace_nop_print;
> -	if (event->raw == NULL)
> -		event->raw = trace_nop_print;
> -	if (event->hex == NULL)
> -		event->hex = trace_nop_print;
> -	if (event->binary == NULL)
> -		event->binary = trace_nop_print;
> +	if (event->funcs->trace == NULL)
> +		event->funcs->trace = trace_nop_print;
> +	if (event->funcs->raw == NULL)
> +		event->funcs->raw = trace_nop_print;
> +	if (event->funcs->hex == NULL)
> +		event->funcs->hex = trace_nop_print;
> +	if (event->funcs->binary == NULL)
> +		event->funcs->binary = trace_nop_print;
>  
>  	key = event->type & (EVENT_HASHSIZE - 1);
>  
> @@ -807,13 +810,15 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_event);
>   * Standard events
>   */
>  
> -enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags)
> +enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
> +				  struct trace_event *event)
>  {
>  	return TRACE_TYPE_HANDLED;
>  }
>  
>  /* TRACE_FN */
> -static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags,
> +					struct trace_event *event)
>  {
>  	struct ftrace_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -840,7 +845,8 @@ static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> -static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags,
> +				      struct trace_event *event)
>  {
>  	struct ftrace_entry *field;
>  
> @@ -854,7 +860,8 @@ static enum print_line_t trace_fn_raw(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags,
> +				      struct trace_event *event)
>  {
>  	struct ftrace_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -867,7 +874,8 @@ static enum print_line_t trace_fn_hex(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags,
> +				      struct trace_event *event)
>  {
>  	struct ftrace_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -880,14 +888,18 @@ static enum print_line_t trace_fn_bin(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static struct trace_event trace_fn_event = {
> -	.type		= TRACE_FN,
> +static struct trace_event_functions trace_fn_funcs = {
>  	.trace		= trace_fn_trace,
>  	.raw		= trace_fn_raw,
>  	.hex		= trace_fn_hex,
>  	.binary		= trace_fn_bin,
>  };
>  
> +static struct trace_event trace_fn_event = {
> +	.type		= TRACE_FN,
> +	.funcs		= &trace_fn_funcs,
> +};
> +
>  /* TRACE_CTX an TRACE_WAKE */
>  static enum print_line_t trace_ctxwake_print(struct trace_iterator *iter,
>  					     char *delim)
> @@ -916,13 +928,14 @@ static enum print_line_t trace_ctxwake_print(struct trace_iterator *iter,
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static enum print_line_t trace_ctx_print(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_ctx_print(struct trace_iterator *iter, int flags,
> +					 struct trace_event *event)
>  {
>  	return trace_ctxwake_print(iter, "==>");
>  }
>  
>  static enum print_line_t trace_wake_print(struct trace_iterator *iter,
> -					  int flags)
> +					  int flags, struct trace_event *event)
>  {
>  	return trace_ctxwake_print(iter, "  +");
>  }
> @@ -950,12 +963,14 @@ static int trace_ctxwake_raw(struct trace_iterator *iter, char S)
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static enum print_line_t trace_ctx_raw(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_ctx_raw(struct trace_iterator *iter, int flags,
> +				       struct trace_event *event)
>  {
>  	return trace_ctxwake_raw(iter, 0);
>  }
>  
> -static enum print_line_t trace_wake_raw(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_wake_raw(struct trace_iterator *iter, int flags,
> +					struct trace_event *event)
>  {
>  	return trace_ctxwake_raw(iter, '+');
>  }
> @@ -984,18 +999,20 @@ static int trace_ctxwake_hex(struct trace_iterator *iter, char S)
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static enum print_line_t trace_ctx_hex(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_ctx_hex(struct trace_iterator *iter, int flags,
> +				       struct trace_event *event)
>  {
>  	return trace_ctxwake_hex(iter, 0);
>  }
>  
> -static enum print_line_t trace_wake_hex(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_wake_hex(struct trace_iterator *iter, int flags,
> +					struct trace_event *event)
>  {
>  	return trace_ctxwake_hex(iter, '+');
>  }
>  
>  static enum print_line_t trace_ctxwake_bin(struct trace_iterator *iter,
> -					   int flags)
> +					   int flags, struct trace_event *event)
>  {
>  	struct ctx_switch_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1012,25 +1029,33 @@ static enum print_line_t trace_ctxwake_bin(struct trace_iterator *iter,
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static struct trace_event trace_ctx_event = {
> -	.type		= TRACE_CTX,
> +static struct trace_event_functions trace_ctx_funcs = {
>  	.trace		= trace_ctx_print,
>  	.raw		= trace_ctx_raw,
>  	.hex		= trace_ctx_hex,
>  	.binary		= trace_ctxwake_bin,
>  };
>  
> -static struct trace_event trace_wake_event = {
> -	.type		= TRACE_WAKE,
> +static struct trace_event trace_ctx_event = {
> +	.type		= TRACE_CTX,
> +	.funcs		= &trace_ctx_funcs,
> +};
> +
> +static struct trace_event_functions trace_wake_funcs = {
>  	.trace		= trace_wake_print,
>  	.raw		= trace_wake_raw,
>  	.hex		= trace_wake_hex,
>  	.binary		= trace_ctxwake_bin,
>  };
>  
> +static struct trace_event trace_wake_event = {
> +	.type		= TRACE_WAKE,
> +	.funcs		= &trace_wake_funcs,
> +};
> +
>  /* TRACE_SPECIAL */
>  static enum print_line_t trace_special_print(struct trace_iterator *iter,
> -					     int flags)
> +					     int flags, struct trace_event *event)
>  {
>  	struct special_entry *field;
>  
> @@ -1046,7 +1071,7 @@ static enum print_line_t trace_special_print(struct trace_iterator *iter,
>  }
>  
>  static enum print_line_t trace_special_hex(struct trace_iterator *iter,
> -					   int flags)
> +					   int flags, struct trace_event *event)
>  {
>  	struct special_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1061,7 +1086,7 @@ static enum print_line_t trace_special_hex(struct trace_iterator *iter,
>  }
>  
>  static enum print_line_t trace_special_bin(struct trace_iterator *iter,
> -					   int flags)
> +					   int flags, struct trace_event *event)
>  {
>  	struct special_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1075,18 +1100,22 @@ static enum print_line_t trace_special_bin(struct trace_iterator *iter,
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -static struct trace_event trace_special_event = {
> -	.type		= TRACE_SPECIAL,
> +static struct trace_event_functions trace_special_funcs = {
>  	.trace		= trace_special_print,
>  	.raw		= trace_special_print,
>  	.hex		= trace_special_hex,
>  	.binary		= trace_special_bin,
>  };
>  
> +static struct trace_event trace_special_event = {
> +	.type		= TRACE_SPECIAL,
> +	.funcs		= &trace_special_funcs,
> +};
> +
>  /* TRACE_STACK */
>  
>  static enum print_line_t trace_stack_print(struct trace_iterator *iter,
> -					   int flags)
> +					   int flags, struct trace_event *event)
>  {
>  	struct stack_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1114,17 +1143,21 @@ static enum print_line_t trace_stack_print(struct trace_iterator *iter,
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> -static struct trace_event trace_stack_event = {
> -	.type		= TRACE_STACK,
> +static struct trace_event_functions trace_stack_funcs = {
>  	.trace		= trace_stack_print,
>  	.raw		= trace_special_print,
>  	.hex		= trace_special_hex,
>  	.binary		= trace_special_bin,
>  };
>  
> +static struct trace_event trace_stack_event = {
> +	.type		= TRACE_STACK,
> +	.funcs		= &trace_stack_funcs,
> +};
> +
>  /* TRACE_USER_STACK */
>  static enum print_line_t trace_user_stack_print(struct trace_iterator *iter,
> -						int flags)
> +						int flags, struct trace_event *event)
>  {
>  	struct userstack_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1143,17 +1176,22 @@ static enum print_line_t trace_user_stack_print(struct trace_iterator *iter,
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> -static struct trace_event trace_user_stack_event = {
> -	.type		= TRACE_USER_STACK,
> +static struct trace_event_functions trace_user_stack_funcs = {
>  	.trace		= trace_user_stack_print,
>  	.raw		= trace_special_print,
>  	.hex		= trace_special_hex,
>  	.binary		= trace_special_bin,
>  };
>  
> +static struct trace_event trace_user_stack_event = {
> +	.type		= TRACE_USER_STACK,
> +	.funcs		= &trace_user_stack_funcs,
> +};
> +
>  /* TRACE_BPRINT */
>  static enum print_line_t
> -trace_bprint_print(struct trace_iterator *iter, int flags)
> +trace_bprint_print(struct trace_iterator *iter, int flags,
> +		   struct trace_event *event)
>  {
>  	struct trace_entry *entry = iter->ent;
>  	struct trace_seq *s = &iter->seq;
> @@ -1178,7 +1216,8 @@ trace_bprint_print(struct trace_iterator *iter, int flags)
>  
>  
>  static enum print_line_t
> -trace_bprint_raw(struct trace_iterator *iter, int flags)
> +trace_bprint_raw(struct trace_iterator *iter, int flags,
> +		 struct trace_event *event)
>  {
>  	struct bprint_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1197,16 +1236,19 @@ trace_bprint_raw(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> +static struct trace_event_functions trace_bprint_funcs = {
> +	.trace		= trace_bprint_print,
> +	.raw		= trace_bprint_raw,
> +};
>  
>  static struct trace_event trace_bprint_event = {
>  	.type		= TRACE_BPRINT,
> -	.trace		= trace_bprint_print,
> -	.raw		= trace_bprint_raw,
> +	.funcs		= &trace_bprint_funcs,
>  };
>  
>  /* TRACE_PRINT */
>  static enum print_line_t trace_print_print(struct trace_iterator *iter,
> -					   int flags)
> +					   int flags, struct trace_event *event)
>  {
>  	struct print_entry *field;
>  	struct trace_seq *s = &iter->seq;
> @@ -1225,7 +1267,8 @@ static enum print_line_t trace_print_print(struct trace_iterator *iter,
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> -static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags)
> +static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags,
> +					 struct trace_event *event)
>  {
>  	struct print_entry *field;
>  
> @@ -1240,12 +1283,16 @@ static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags)
>  	return TRACE_TYPE_PARTIAL_LINE;
>  }
>  
> -static struct trace_event trace_print_event = {
> -	.type	 	= TRACE_PRINT,
> +static struct trace_event_functions trace_print_funcs = {
>  	.trace		= trace_print_print,
>  	.raw		= trace_print_raw,
>  };
>  
> +static struct trace_event trace_print_event = {
> +	.type	 	= TRACE_PRINT,
> +	.funcs		= &trace_print_funcs,
> +};
> +
>  
>  static struct trace_event *events[] __initdata = {
>  	&trace_fn_event,
> diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h
> index 9d91c72..c038eba 100644
> --- a/kernel/trace/trace_output.h
> +++ b/kernel/trace/trace_output.h
> @@ -25,7 +25,7 @@ extern void trace_event_read_unlock(void);
>  extern struct trace_event *ftrace_find_event(int type);
>  
>  extern enum print_line_t trace_nop_print(struct trace_iterator *iter,
> -					 int flags);
> +					 int flags, struct trace_event *event);
>  extern int
>  trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry);
>  
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 7ee6086..0bcca08 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -84,7 +84,8 @@ static struct syscall_metadata *syscall_nr_to_meta(int nr)
>  }
>  
>  enum print_line_t
> -print_syscall_enter(struct trace_iterator *iter, int flags)
> +print_syscall_enter(struct trace_iterator *iter, int flags,
> +		    struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct trace_entry *ent = iter->ent;
> @@ -136,7 +137,8 @@ end:
>  }
>  
>  enum print_line_t
> -print_syscall_exit(struct trace_iterator *iter, int flags)
> +print_syscall_exit(struct trace_iterator *iter, int flags,
> +		   struct trace_event *event)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct trace_entry *ent = iter->ent;
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 08/10][RFC] tracing: Move print functions into event class
  2010-04-26 19:50 ` [PATCH 08/10][RFC] tracing: Move print functions into event class Steven Rostedt
@ 2010-04-28 21:03   ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 21:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Currently, every event has its own trace_event structure. This is
> fine since the structure is needed anyway. But the print function
> structure (trace_event_functions) is now separate. Since the output
> of the trace event is done by the class (with the exception of events
> defined by DEFINE_EVENT_PRINT), it makes sense to have the class
> define the print functions that all events in the class can use.
> 
> This makes a bigger difference for the syscall events, since all syscall
> events use the same class. The savings here are another 37K.
> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5774574	1293204	9351592	16419370	 fa8a2a	vmlinux.init
> 5761154	1268356	9351592	16381102	 f9f4ae	vmlinux.print
> 
> To accomplish this, and to let the class know what event is being
> printed, the event structure is embedded in the ftrace_event_call
> structure. This should not be an issue since the event structure
> was created for each event anyway.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
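
For readers following along, here is a minimal userspace sketch of the idea
in the commit message: the trace_event is embedded in the call structure, so
one print function shared by a whole class can recover the owning call with
container_of(). The structures are heavily simplified, and the callback
signature (writing into a plain buffer), class_print(), class_funcs and the
two sample events are assumptions made for illustration, not the kernel's
actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct trace_event;

/* Hypothetical, simplified print callback: formats into buf. */
struct trace_event_functions {
	int (*trace)(char *buf, size_t len, struct trace_event *event);
};

struct trace_event {
	int type;
	struct trace_event_functions *funcs;
};

/* Simplified: the trace_event is embedded rather than pointed to. */
struct ftrace_event_call {
	const char *name;
	struct trace_event event;
};

/* One print function serves every event that shares the class. */
static int class_print(char *buf, size_t len, struct trace_event *trace_event)
{
	struct ftrace_event_call *call;

	assert(trace_event != NULL);
	/* Walk back from the embedded member to the enclosing call. */
	call = container_of(trace_event, struct ftrace_event_call, event);

	return snprintf(buf, len, "%s: type=%d", call->name, trace_event->type);
}

/* A single function table, instead of one table per event. */
static struct trace_event_functions class_funcs = {
	.trace = class_print,
};

static struct ftrace_event_call event_a = {
	.name = "sys_enter_open",
	.event = { .type = 1, .funcs = &class_funcs },
};

static struct ftrace_event_call event_b = {
	.name = "sys_enter_close",
	.event = { .type = 2, .funcs = &class_funcs },
};
```

Calling event_a.event.funcs->trace(buf, sizeof(buf), &event_a.event) goes
through the shared table but still prints the name of event_a. In this
sketch, that is why each event no longer needs its own print function or
its own function table.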

> ---
>  include/linux/ftrace_event.h  |    2 +-
>  include/linux/syscalls.h      |   18 +++------------
>  include/trace/ftrace.h        |   47 +++++++++++++++++-----------------------
>  kernel/trace/trace_events.c   |    6 ++--
>  kernel/trace/trace_kprobe.c   |   14 +++++-------
>  kernel/trace/trace_syscalls.c |    8 +++++++
>  6 files changed, 42 insertions(+), 53 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 09c2ad7..aa3695a 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -146,7 +146,7 @@ struct ftrace_event_call {
>  	struct ftrace_event_class *class;
>  	char			*name;
>  	struct dentry		*dir;
> -	struct trace_event	*event;
> +	struct trace_event	event;
>  	int			enabled;
>  	int			id;
>  	const char		*print_fmt;
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index f3892e9..5d060b7 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -120,24 +120,20 @@ struct perf_event_attr;
>  
>  extern struct ftrace_event_class event_class_syscall_enter;
>  extern struct ftrace_event_class event_class_syscall_exit;
> +extern struct trace_event_functions enter_syscall_print_funcs;
> +extern struct trace_event_functions exit_syscall_print_funcs;
>  
>  #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
>  	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_enter_##sname;		\
> -	static struct trace_event_functions enter_syscall_print_funcs_##sname = { \
> -		.trace                  = print_syscall_enter,		\
> -	};								\
> -	static struct trace_event enter_syscall_print_##sname = {	\
> -		.funcs                  = &enter_syscall_print_funcs_##sname, \
> -	};								\
>  	static struct ftrace_event_call __used				\
>  	  __attribute__((__aligned__(4)))				\
>  	  __attribute__((section("_ftrace_events")))			\
>  	  event_enter_##sname = {					\
>  		.name                   = "sys_enter"#sname,		\
>  		.class			= &event_class_syscall_enter,	\
> -		.event                  = &enter_syscall_print_##sname,	\
> +		.event.funcs            = &enter_syscall_print_funcs,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
> @@ -145,19 +141,13 @@ extern struct ftrace_event_class event_class_syscall_exit;
>  	static struct syscall_metadata __syscall_meta_##sname;		\
>  	static struct ftrace_event_call					\
>  	__attribute__((__aligned__(4))) event_exit_##sname;		\
> -	static struct trace_event_functions exit_syscall_print_funcs_##sname = { \
> -		.trace                  = print_syscall_exit,		\
> -	};								\
> -	static struct trace_event exit_syscall_print_##sname = {	\
> -		.funcs                  = &exit_syscall_print_funcs_##sname, \
> -	};								\
>  	static struct ftrace_event_call __used				\
>  	  __attribute__((__aligned__(4)))				\
>  	  __attribute__((section("_ftrace_events")))			\
>  	  event_exit_##sname = {					\
>  		.name                   = "sys_exit"#sname,		\
>  		.class			= &event_class_syscall_exit,	\
> -		.event                  = &exit_syscall_print_##sname,	\
> +		.event.funcs		= &exit_syscall_print_funcs,	\
>  		.data			= (void *)&__syscall_meta_##sname,\
>  	}
>  
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 2efb301..d7b3b56 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -206,18 +206,22 @@
>  #undef DECLARE_EVENT_CLASS
>  #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
>  static notrace enum print_line_t					\
> -ftrace_raw_output_id_##call(int event_id, const char *name,		\
> -			    struct trace_iterator *iter, int flags)	\
> +ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
> +			 struct trace_event *trace_event)		\
>  {									\
> +	struct ftrace_event_call *event;				\
>  	struct trace_seq *s = &iter->seq;				\
>  	struct ftrace_raw_##call *field;				\
>  	struct trace_entry *entry;					\
>  	struct trace_seq *p;						\
>  	int ret;							\
>  									\
> +	event = container_of(trace_event, struct ftrace_event_call,	\
> +			     event);					\
> +									\
>  	entry = iter->ent;						\
>  									\
> -	if (entry->type != event_id) {					\
> +	if (entry->type != event->id) {					\
>  		WARN_ON_ONCE(1);					\
>  		return TRACE_TYPE_UNHANDLED;				\
>  	}								\
> @@ -226,7 +230,7 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
>  									\
>  	p = &get_cpu_var(ftrace_event_seq);				\
>  	trace_seq_init(p);						\
> -	ret = trace_seq_printf(s, "%s: ", name);			\
> +	ret = trace_seq_printf(s, "%s: ", event->name);			\
>  	if (ret)							\
>  		ret = trace_seq_printf(s, print);			\
>  	put_cpu();							\
> @@ -234,17 +238,10 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
>  		return TRACE_TYPE_PARTIAL_LINE;				\
>  									\
>  	return TRACE_TYPE_HANDLED;					\
> -}
> -
> -#undef DEFINE_EVENT
> -#define DEFINE_EVENT(template, name, proto, args)			\
> -static notrace enum print_line_t					\
> -ftrace_raw_output_##name(struct trace_iterator *iter, int flags,	\
> -			 struct trace_event *event)			\
> -{									\
> -	return ftrace_raw_output_id_##template(event_##name.id,		\
> -					       #name, iter, flags);	\
> -}
> +}									\
> +static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
> +	.trace			= ftrace_raw_output_##call,		\
> +};
>  
>  #undef DEFINE_EVENT_PRINT
>  #define DEFINE_EVENT_PRINT(template, call, proto, args, print)		\
> @@ -277,7 +274,10 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
>  		return TRACE_TYPE_PARTIAL_LINE;				\
>  									\
>  	return TRACE_TYPE_HANDLED;					\
> -}
> +}									\
> +static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
> +	.trace			= ftrace_raw_output_##call,		\
> +};
>  
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>  
> @@ -526,17 +526,10 @@ ftrace_raw_event_##call(proto,						\
>  }
>  
>  #undef DEFINE_EVENT
> -#define DEFINE_EVENT(template, call, proto, args)			\
> -static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
> -	.trace			= ftrace_raw_output_##call,		\
> -};									\
> -static struct trace_event ftrace_event_type_##call = {			\
> -	.funcs			= &ftrace_event_type_funcs_##call,	\
> -};
> +#define DEFINE_EVENT(template, call, proto, args)
>  
>  #undef DEFINE_EVENT_PRINT
> -#define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
> -	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
> +#define DEFINE_EVENT_PRINT(template, name, proto, args, print)
>  
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>  
> @@ -572,7 +565,7 @@ __attribute__((__aligned__(4)))						\
>  __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.class			= &event_class_##template,		\
> -	.event			= &ftrace_event_type_##call,		\
> +	.event.funcs		= &ftrace_event_type_funcs_##template,	\
>  	.print_fmt		= print_fmt_##template,			\
>  }
>  
> @@ -586,7 +579,7 @@ __attribute__((__aligned__(4)))						\
>  __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
>  	.class			= &event_class_##template,		\
> -	.event			= &ftrace_event_type_##call,		\
> +	.event.funcs		= &ftrace_event_type_funcs_##call,	\
>  	.print_fmt		= print_fmt_##call,			\
>  }
>  
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index c34a9bd..9aa298e 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -129,7 +129,7 @@ int trace_event_raw_init(struct ftrace_event_call *call)
>  {
>  	int id;
>  
> -	id = register_ftrace_event(call->event);
> +	id = register_ftrace_event(&call->event);
>  	if (!id)
>  		return -ENODEV;
>  	call->id = id;
> @@ -1077,8 +1077,8 @@ static void remove_subsystem_dir(const char *name)
>  static void __trace_remove_event_call(struct ftrace_event_call *call)
>  {
>  	ftrace_event_enable_disable(call, 0);
> -	if (call->event)
> -		__unregister_ftrace_event(call->event);
> +	if (call->event.funcs)
> +		__unregister_ftrace_event(&call->event);
>  	debugfs_remove_recursive(call->dir);
>  	list_del(&call->list);
>  	trace_destroy_fields(call);
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index b989ae2..d8061c3 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -204,7 +204,6 @@ struct trace_probe {
>  	const char		*symbol;	/* symbol name */
>  	struct ftrace_event_class	class;
>  	struct ftrace_event_call	call;
> -	struct trace_event		event;
>  	unsigned int		nr_args;
>  	struct probe_arg	args[];
>  };
> @@ -1020,7 +1019,7 @@ print_kprobe_event(struct trace_iterator *iter, int flags,
>  	int i;
>  
>  	field = (struct kprobe_trace_entry *)iter->ent;
> -	tp = container_of(event, struct trace_probe, event);
> +	tp = container_of(event, struct trace_probe, call.event);
>  
>  	if (!trace_seq_printf(s, "%s: (", tp->call.name))
>  		goto partial;
> @@ -1054,7 +1053,7 @@ print_kretprobe_event(struct trace_iterator *iter, int flags,
>  	int i;
>  
>  	field = (struct kretprobe_trace_entry *)iter->ent;
> -	tp = container_of(event, struct trace_probe, event);
> +	tp = container_of(event, struct trace_probe, call.event);
>  
>  	if (!trace_seq_printf(s, "%s: (", tp->call.name))
>  		goto partial;
> @@ -1364,20 +1363,19 @@ static int register_probe_event(struct trace_probe *tp)
>  
>  	/* Initialize ftrace_event_call */
>  	if (probe_is_return(tp)) {
> -		tp->event.funcs = &kretprobe_funcs;
>  		INIT_LIST_HEAD(&call->class->fields);
> +		call->event.funcs = &kretprobe_funcs;
>  		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kretprobe_event_define_fields;
>  	} else {
>  		INIT_LIST_HEAD(&call->class->fields);
> -		tp->event.funcs = &kprobe_funcs;
> +		call->event.funcs = &kprobe_funcs;
>  		call->class->raw_init = probe_event_raw_init;
>  		call->class->define_fields = kprobe_event_define_fields;
>  	}
>  	if (set_print_fmt(tp) < 0)
>  		return -ENOMEM;
> -	call->event = &tp->event;
> -	call->id = register_ftrace_event(&tp->event);
> +	call->id = register_ftrace_event(&call->event);
>  	if (!call->id) {
>  		kfree(call->print_fmt);
>  		return -ENODEV;
> @@ -1389,7 +1387,7 @@ static int register_probe_event(struct trace_probe *tp)
>  	if (ret) {
>  		pr_info("Failed to register kprobe event: %s\n", call->name);
>  		kfree(call->print_fmt);
> -		unregister_ftrace_event(&tp->event);
> +		unregister_ftrace_event(&call->event);
>  	}
>  	return ret;
>  }
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 0bcca08..a4bed39 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -30,6 +30,14 @@ syscall_get_fields(struct ftrace_event_call *call)
>  	return &entry->fields;
>  }
>  
> +struct trace_event_functions enter_syscall_print_funcs = {
> +	.trace                  = print_syscall_enter,
> +};
> +
> +struct trace_event_functions exit_syscall_print_funcs = {
> +	.trace                  = print_syscall_exit,
> +};
> +
>  struct ftrace_event_class event_class_syscall_enter = {
>  	.system			= "syscalls",
>  	.reg			= syscall_enter_register,
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure
  2010-04-26 19:50 ` [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure Steven Rostedt
@ 2010-04-28 21:06   ` Mathieu Desnoyers
  2010-04-29  0:04     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 21:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Now that the trace_event structure is embedded in the ftrace_event_call
> structure, there is no need for the ftrace_event_call id field.
> The id field is the same as the trace_event type field.
> 
> Removing the id and re-arranging the structure brings down the tracepoint
> footprint by another 5K.

I might have missed it, but how exactly is the event type allocated
uniquely? Is it merely a duplicate of the call "id" field?

Thanks,

Mathieu
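
To make the question concrete, here is a small userspace sketch of the
relationship as it appears from the call sites quoted in this thread, where
call->id is set from the return value of register_ftrace_event(). It assumes
that registration assigns a type when none is set and returns that type. The
counter next_event_type and the function names register_event() and
event_raw_init() are hypothetical, and the real allocation logic may differ.

```c
#include <assert.h>
#include <stddef.h>

struct trace_event {
	int type;	/* 0 means "not yet assigned" */
};

struct ftrace_event_call {
	struct trace_event event;	/* embedded, as in patch 08/10 */
	int id;				/* the field patch 09/10 removes */
};

/* Hypothetical allocator state: the next unused type number. */
static int next_event_type = 1;

/* Returns the event's type, or 0 on failure (matching the !id checks). */
static int register_event(struct trace_event *event)
{
	if (event == NULL)
		return 0;
	if (!event->type)
		event->type = next_event_type++;
	return event->type;
}

/* Old-style init: stores the returned type a second time as call->id. */
static int event_raw_init(struct ftrace_event_call *call)
{
	int id = register_event(&call->event);

	if (!id)
		return -1;
	call->id = id;
	/* The two fields now always hold the same value. */
	assert(call->id == call->event.type);
	return 0;
}
```

Under these assumptions the id is only a second copy of event.type, so
reading event.type directly loses nothing.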

> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5761154	1268356	9351592	16381102	 f9f4ae	vmlinux.print
> 5761074	1262596	9351592	16375262	 f9ddde	vmlinux.id
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/ftrace_event.h       |    5 ++---
>  include/trace/ftrace.h             |   12 ++++++------
>  kernel/trace/trace_event_perf.c    |    4 ++--
>  kernel/trace/trace_events.c        |    7 +++----
>  kernel/trace/trace_events_filter.c |    2 +-
>  kernel/trace/trace_export.c        |    4 ++--
>  kernel/trace/trace_kprobe.c        |   18 ++++++++++--------
>  kernel/trace/trace_syscalls.c      |   14 ++++++++------
>  8 files changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index aa3695a..b26507f 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -147,14 +147,13 @@ struct ftrace_event_call {
>  	char			*name;
>  	struct dentry		*dir;
>  	struct trace_event	event;
> -	int			enabled;
> -	int			id;
>  	const char		*print_fmt;
> -	int			filter_active;
>  	struct event_filter	*filter;
>  	void			*mod;
>  	void			*data;
>  
> +	int			enabled;
> +	int			filter_active;
>  	int			perf_refcount;
>  };
>  
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index d7b3b56..246b05e 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -150,7 +150,7 @@
>   *
>   *	entry = iter->ent;
>   *
> - *	if (entry->type != event_<call>.id) {
> + *	if (entry->type != event_<call>->event.type) {
>   *		WARN_ON_ONCE(1);
>   *		return TRACE_TYPE_UNHANDLED;
>   *	}
> @@ -221,7 +221,7 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
>  									\
>  	entry = iter->ent;						\
>  									\
> -	if (entry->type != event->id) {					\
> +	if (entry->type != event->event.type) {				\
>  		WARN_ON_ONCE(1);					\
>  		return TRACE_TYPE_UNHANDLED;				\
>  	}								\
> @@ -257,7 +257,7 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags,	\
>  									\
>  	entry = iter->ent;						\
>  									\
> -	if (entry->type != event_##call.id) {				\
> +	if (entry->type != event_##call.event.type) {			\
>  		WARN_ON_ONCE(1);					\
>  		return TRACE_TYPE_UNHANDLED;				\
>  	}								\
> @@ -408,7 +408,7 @@ static inline notrace int ftrace_get_offsets_##call(			\
>   *	__data_size = ftrace_get_offsets_<call>(&__data_offsets, args);
>   *
>   *	event = trace_current_buffer_lock_reserve(&buffer,
> - *				  event_<call>.id,
> + *				  event_<call>->event.type,
>   *				  sizeof(*entry) + __data_size,
>   *				  irq_flags, pc);
>   *	if (!event)
> @@ -509,7 +509,7 @@ ftrace_raw_event_##call(proto,						\
>  	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
>  									\
>  	event = trace_current_buffer_lock_reserve(&buffer,		\
> -				 event_call->id,			\
> +				 event_call->event.type,		\
>  				 sizeof(*entry) + __data_size,		\
>  				 irq_flags, pc);			\
>  	if (!event)							\
> @@ -700,7 +700,7 @@ perf_trace_##call(proto, struct ftrace_event_call *event_call)		\
>  		      "profile buffer not large enough"))		\
>  		return;							\
>  	entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare(	\
> -		__entry_size, event_call->id, &rctx, &irq_flags);	\
> +		__entry_size, event_call->event.type, &rctx, &irq_flags); \
>  	if (!entry)							\
>  		return;							\
>  	tstruct								\
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index 95df5a7..b8febf0 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -75,7 +75,7 @@ int perf_trace_enable(int event_id)
>  
>  	mutex_lock(&event_mutex);
>  	list_for_each_entry(event, &ftrace_events, list) {
> -		if (event->id == event_id &&
> +		if (event->event.type == event_id &&
>  		    event->class && event->class->perf_probe &&
>  		    try_module_get(event->mod)) {
>  			ret = perf_trace_event_enable(event);
> @@ -123,7 +123,7 @@ void perf_trace_disable(int event_id)
>  
>  	mutex_lock(&event_mutex);
>  	list_for_each_entry(event, &ftrace_events, list) {
> -		if (event->id == event_id) {
> +		if (event->event.type == event_id) {
>  			perf_trace_event_disable(event);
>  			module_put(event->mod);
>  			break;
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 9aa298e..8d2e28e 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -132,7 +132,6 @@ int trace_event_raw_init(struct ftrace_event_call *call)
>  	id = register_ftrace_event(&call->event);
>  	if (!id)
>  		return -ENODEV;
> -	call->id = id;
>  
>  	return 0;
>  }
> @@ -574,7 +573,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
>  	trace_seq_init(s);
>  
>  	trace_seq_printf(s, "name: %s\n", call->name);
> -	trace_seq_printf(s, "ID: %d\n", call->id);
> +	trace_seq_printf(s, "ID: %d\n", call->event.type);
>  	trace_seq_printf(s, "format:\n");
>  
>  	head = trace_get_fields(call);
> @@ -648,7 +647,7 @@ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
>  		return -ENOMEM;
>  
>  	trace_seq_init(s);
> -	trace_seq_printf(s, "%d\n", call->id);
> +	trace_seq_printf(s, "%d\n", call->event.type);
>  
>  	r = simple_read_from_buffer(ubuf, cnt, ppos,
>  				    s->buffer, s->len);
> @@ -974,7 +973,7 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
>  		trace_create_file("enable", 0644, call->dir, call,
>  				  enable);
>  
> -	if (call->id && (call->class->perf_probe || call->class->reg))
> +	if (call->event.type && (call->class->perf_probe || call->class->reg))
>  		trace_create_file("id", 0444, call->dir, call,
>  		 		  id);
>  
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index 560683d..b8e3bf3 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -1394,7 +1394,7 @@ int ftrace_profile_set_filter(struct perf_event *event, int event_id,
>  	mutex_lock(&event_mutex);
>  
>  	list_for_each_entry(call, &ftrace_events, list) {
> -		if (call->id == event_id)
> +		if (call->event.type == event_id)
>  			break;
>  	}
>  
> diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
> index e878d06..8536e2a 100644
> --- a/kernel/trace/trace_export.c
> +++ b/kernel/trace/trace_export.c
> @@ -153,7 +153,7 @@ static int ftrace_raw_init_event(struct ftrace_event_call *call)
>  #define F_printk(fmt, args...) #fmt ", "  __stringify(args)
>  
>  #undef FTRACE_ENTRY
> -#define FTRACE_ENTRY(call, struct_name, type, tstruct, print)		\
> +#define FTRACE_ENTRY(call, struct_name, etype, tstruct, print)		\
>  									\
>  struct ftrace_event_class event_class_ftrace_##call = {			\
>  	.system			= __stringify(TRACE_SYSTEM),		\
> @@ -165,7 +165,7 @@ struct ftrace_event_call __used						\
>  __attribute__((__aligned__(4)))						\
>  __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name			= #call,				\
> -	.id			= type,					\
> +	.event.type		= etype,				\
>  	.class			= &event_class_ftrace_##call,		\
>  	.print_fmt		= print,				\
>  };									\
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index d8061c3..934078b 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -960,8 +960,8 @@ static __kprobes void kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
>  
>  	size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);
>  
> -	event = trace_current_buffer_lock_reserve(&buffer, call->id, size,
> -						  irq_flags, pc);
> +	event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
> +						  size, irq_flags, pc);
>  	if (!event)
>  		return;
>  
> @@ -992,8 +992,8 @@ static __kprobes void kretprobe_trace_func(struct kretprobe_instance *ri,
>  
>  	size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);
>  
> -	event = trace_current_buffer_lock_reserve(&buffer, call->id, size,
> -						  irq_flags, pc);
> +	event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
> +						  size, irq_flags, pc);
>  	if (!event)
>  		return;
>  
> @@ -1228,7 +1228,8 @@ static __kprobes void kprobe_perf_func(struct kprobe *kp,
>  		     "profile buffer not large enough"))
>  		return;
>  
> -	entry = perf_trace_buf_prepare(size, call->id, &rctx, &irq_flags);
> +	entry = perf_trace_buf_prepare(size, call->event.type,
> +				       &rctx, &irq_flags);
>  	if (!entry)
>  		return;
>  
> @@ -1258,7 +1259,8 @@ static __kprobes void kretprobe_perf_func(struct kretprobe_instance *ri,
>  		     "profile buffer not large enough"))
>  		return;
>  
> -	entry = perf_trace_buf_prepare(size, call->id, &rctx, &irq_flags);
> +	entry = perf_trace_buf_prepare(size, call->event.type,
> +				       &rctx, &irq_flags);
>  	if (!entry)
>  		return;
>  
> @@ -1375,8 +1377,8 @@ static int register_probe_event(struct trace_probe *tp)
>  	}
>  	if (set_print_fmt(tp) < 0)
>  		return -ENOMEM;
> -	call->id = register_ftrace_event(&call->event);
> -	if (!call->id) {
> +	ret = register_ftrace_event(&call->event);
> +	if (!ret) {
>  		kfree(call->print_fmt);
>  		return -ENODEV;
>  	}
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index a4bed39..23fad22 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -108,7 +108,7 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
>  	if (!entry)
>  		goto end;
>  
> -	if (entry->enter_event->id != ent->type) {
> +	if (entry->enter_event->event.type != ent->type) {
>  		WARN_ON_ONCE(1);
>  		goto end;
>  	}
> @@ -164,7 +164,7 @@ print_syscall_exit(struct trace_iterator *iter, int flags,
>  		return TRACE_TYPE_HANDLED;
>  	}
>  
> -	if (entry->exit_event->id != ent->type) {
> +	if (entry->exit_event->event.type != ent->type) {
>  		WARN_ON_ONCE(1);
>  		return TRACE_TYPE_UNHANDLED;
>  	}
> @@ -306,7 +306,7 @@ void ftrace_syscall_enter(struct pt_regs *regs, long id)
>  	size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
>  
>  	event = trace_current_buffer_lock_reserve(&buffer,
> -			sys_data->enter_event->id, size, 0, 0);
> +			sys_data->enter_event->event.type, size, 0, 0);
>  	if (!event)
>  		return;
>  
> @@ -338,7 +338,7 @@ void ftrace_syscall_exit(struct pt_regs *regs, long ret)
>  		return;
>  
>  	event = trace_current_buffer_lock_reserve(&buffer,
> -			sys_data->exit_event->id, sizeof(*entry), 0, 0);
> +			sys_data->exit_event->event.type, sizeof(*entry), 0, 0);
>  	if (!event)
>  		return;
>  
> @@ -502,7 +502,8 @@ static void perf_syscall_enter(struct pt_regs *regs, long id)
>  		return;
>  
>  	rec = (struct syscall_trace_enter *)perf_trace_buf_prepare(size,
> -				sys_data->enter_event->id, &rctx, &flags);
> +				sys_data->enter_event->event.type,
> +				&rctx, &flags);
>  	if (!rec)
>  		return;
>  
> @@ -577,7 +578,8 @@ static void perf_syscall_exit(struct pt_regs *regs, long ret)
>  		return;
>  
>  	rec = (struct syscall_trace_exit *)perf_trace_buf_prepare(size,
> -				sys_data->exit_event->id, &rctx, &flags);
> +				sys_data->exit_event->event.type,
> +				&rctx, &flags);
>  	if (!rec)
>  		return;
>  
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field
  2010-04-26 19:50 ` [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field Steven Rostedt
@ 2010-04-28 21:13   ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-28 21:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> The filter_active and enable both use an int (4 bytes each) to
> set a single flag. We can save 4 bytes per event by combining the
> two into a single integer.

So you're adding extra masks on the tracing fast path to save 4 bytes
per event. That sounds like an acceptable tradeoff, since a mask is
incredibly cheap compared to a cache miss.

This patch could use a slightly more verbose explanation of the locking
rules for the flags. The updates are protected by a mutex, but can reads
happen concurrently with updates ? (Can we update the filter while
tracing is active ?)
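
To make the tradeoff concrete, here is a minimal userspace sketch of the
combined field. Only the TRACE_EVENT_FL_* names come from the patch quoted
below; the struct and the sketch_* helpers are hypothetical. It does not
model event_mutex or concurrent readers, which is exactly the open
question above.

```c
#include <assert.h>

/* Bit positions and masks, named as in the patch. */
enum {
	TRACE_EVENT_FL_ENABLED_BIT,
	TRACE_EVENT_FL_FILTERED_BIT,
};

enum {
	TRACE_EVENT_FL_ENABLED	= (1 << TRACE_EVENT_FL_ENABLED_BIT),
	TRACE_EVENT_FL_FILTERED	= (1 << TRACE_EVENT_FL_FILTERED_BIT),
};

/* Hypothetical stand-in for struct ftrace_event_call. */
struct event_call_sketch {
	unsigned int flags;
};

/* Slow path: set or clear one flag without touching the other. */
static void sketch_set_enabled(struct event_call_sketch *call, int on)
{
	if (on)
		call->flags |= TRACE_EVENT_FL_ENABLED;
	else
		call->flags &= ~TRACE_EVENT_FL_ENABLED;
}

static void sketch_set_filtered(struct event_call_sketch *call, int on)
{
	if (on)
		call->flags |= TRACE_EVENT_FL_FILTERED;
	else
		call->flags &= ~TRACE_EVENT_FL_FILTERED;
}

/* Fast path: a load plus a mask, where there used to be only a load. */
static int sketch_is_enabled(const struct event_call_sketch *call)
{
	return !!(call->flags & TRACE_EVENT_FL_ENABLED);
}

static int sketch_is_filtered(const struct event_call_sketch *call)
{
	return !!(call->flags & TRACE_EVENT_FL_FILTERED);
}

/* Self-check: toggling one flag leaves the other untouched. */
static int sketch_selfcheck(void)
{
	struct event_call_sketch call = { .flags = 0 };

	sketch_set_enabled(&call, 1);
	sketch_set_filtered(&call, 1);
	sketch_set_enabled(&call, 0);
	assert(!sketch_is_enabled(&call));
	assert(sketch_is_filtered(&call));
	return 0;
}
```

The read-modify-write in the setters is why the two flags now share a
locking rule: two unsynchronized writers could lose each other's update.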

Thanks,

Mathieu

> 
>    text	   data	    bss	    dec	    hex	filename
> 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> 5761074	1262596	9351592	16375262	 f9ddde	vmlinux.id
> 5761007	1256916	9351592	16369515	 f9c76b	vmlinux.flags
> 
> This gives us another 5K in savings.
> 
> Modifications of both the enabled and filter_active flags are done
> under the event_mutex, so it is still safe to combine the two.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/linux/ftrace_event.h       |   21 +++++++++++++++++++--
>  kernel/trace/trace.h               |    2 +-
>  kernel/trace/trace_events.c        |   14 +++++++-------
>  kernel/trace/trace_events_filter.c |   10 +++++-----
>  kernel/trace/trace_kprobe.c        |    2 +-
>  5 files changed, 33 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index b26507f..2e28c94 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -141,6 +141,16 @@ struct ftrace_event_class {
>  	int			(*raw_init)(struct ftrace_event_call *);
>  };
>  
> +enum {
> +	TRACE_EVENT_FL_ENABLED_BIT,
> +	TRACE_EVENT_FL_FILTERED_BIT,
> +};
> +
> +enum {
> +	TRACE_EVENT_FL_ENABLED	= (1 << TRACE_EVENT_FL_ENABLED_BIT),
> +	TRACE_EVENT_FL_FILTERED	= (1 << TRACE_EVENT_FL_FILTERED_BIT),
> +};
> +
>  struct ftrace_event_call {
>  	struct list_head	list;
>  	struct ftrace_event_class *class;
> @@ -152,8 +162,15 @@ struct ftrace_event_call {
>  	void			*mod;
>  	void			*data;
>  
> -	int			enabled;
> -	int			filter_active;
> +	/*
> +	 * 32 bit flags:
> +	 *   bit 0:		enabled
> +	 *   bit 1:		filter_active
> +	 *
> +	 *  Must hold event_mutex to change.
> +	 */
> +	unsigned int		flags;
> +
>  	int			perf_refcount;
>  };
>  
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index ff63bee..51ee319 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -779,7 +779,7 @@ filter_check_discard(struct ftrace_event_call *call, void *rec,
>  		     struct ring_buffer *buffer,
>  		     struct ring_buffer_event *event)
>  {
> -	if (unlikely(call->filter_active) &&
> +	if (unlikely(call->flags & TRACE_EVENT_FL_FILTERED) &&
>  	    !filter_match_preds(call->filter, rec)) {
>  		ring_buffer_discard_commit(buffer, event);
>  		return 1;
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 8d2e28e..176b8be 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -144,8 +144,8 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
>  
>  	switch (enable) {
>  	case 0:
> -		if (call->enabled) {
> -			call->enabled = 0;
> +		if (call->flags & TRACE_EVENT_FL_ENABLED) {
> +			call->flags &= ~TRACE_EVENT_FL_ENABLED;
>  			tracing_stop_cmdline_record();
>  			if (call->class->reg)
>  				call->class->reg(call, TRACE_REG_UNREGISTER);
> @@ -156,7 +156,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
>  		}
>  		break;
>  	case 1:
> -		if (!call->enabled) {
> +		if (!(call->flags & TRACE_EVENT_FL_ENABLED)) {
>  			tracing_start_cmdline_record();
>  			if (call->class->reg)
>  				ret = call->class->reg(call, TRACE_REG_REGISTER);
> @@ -170,7 +170,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
>  					"%s\n", call->name);
>  				break;
>  			}
> -			call->enabled = 1;
> +			call->flags |= TRACE_EVENT_FL_ENABLED;
>  		}
>  		break;
>  	}
> @@ -359,7 +359,7 @@ s_next(struct seq_file *m, void *v, loff_t *pos)
>  	(*pos)++;
>  
>  	list_for_each_entry_continue(call, &ftrace_events, list) {
> -		if (call->enabled)
> +		if (call->flags & TRACE_EVENT_FL_ENABLED)
>  			return call;
>  	}
>  
> @@ -418,7 +418,7 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
>  	struct ftrace_event_call *call = filp->private_data;
>  	char *buf;
>  
> -	if (call->enabled)
> +	if (call->flags & TRACE_EVENT_FL_ENABLED)
>  		buf = "1\n";
>  	else
>  		buf = "0\n";
> @@ -493,7 +493,7 @@ system_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
>  		 * or if all events or cleared, or if we have
>  		 * a mixture.
>  		 */
> -		set |= (1 << !!call->enabled);
> +		set |= (1 << !!(call->flags & TRACE_EVENT_FL_ENABLED));
>  
>  		/*
>  		 * If we have a mixture, no need to look further.
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index b8e3bf3..fbc72ee 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -546,7 +546,7 @@ static void filter_disable_preds(struct ftrace_event_call *call)
>  	struct event_filter *filter = call->filter;
>  	int i;
>  
> -	call->filter_active = 0;
> +	call->flags &= ~TRACE_EVENT_FL_FILTERED;
>  	filter->n_preds = 0;
>  
>  	for (i = 0; i < MAX_FILTER_PRED; i++)
> @@ -573,7 +573,7 @@ void destroy_preds(struct ftrace_event_call *call)
>  {
>  	__free_preds(call->filter);
>  	call->filter = NULL;
> -	call->filter_active = 0;
> +	call->flags &= ~TRACE_EVENT_FL_FILTERED;
>  }
>  
>  static struct event_filter *__alloc_preds(void)
> @@ -612,7 +612,7 @@ static int init_preds(struct ftrace_event_call *call)
>  	if (call->filter)
>  		return 0;
>  
> -	call->filter_active = 0;
> +	call->flags &= ~TRACE_EVENT_FL_FILTERED;
>  	call->filter = __alloc_preds();
>  	if (IS_ERR(call->filter))
>  		return PTR_ERR(call->filter);
> @@ -1267,7 +1267,7 @@ static int replace_system_preds(struct event_subsystem *system,
>  		if (err)
>  			filter_disable_preds(call);
>  		else {
> -			call->filter_active = 1;
> +			call->flags |= TRACE_EVENT_FL_FILTERED;
>  			replace_filter_string(filter, filter_string);
>  		}
>  		fail = false;
> @@ -1316,7 +1316,7 @@ int apply_event_filter(struct ftrace_event_call *call, char *filter_string)
>  	if (err)
>  		append_filter_err(ps, call->filter);
>  	else
> -		call->filter_active = 1;
> +		call->flags |= TRACE_EVENT_FL_FILTERED;
>  out:
>  	filter_opstack_clear(ps);
>  	postfix_clear(ps);
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 934078b..0e3ded6 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1382,7 +1382,7 @@ static int register_probe_event(struct trace_probe *tp)
>  		kfree(call->print_fmt);
>  		return -ENODEV;
>  	}
> -	call->enabled = 0;
> +	call->flags = 0;
>  	call->class->reg = kprobe_register;
>  	call->data = tp;
>  	ret = trace_add_event_call(call);
> -- 
> 1.7.0
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks
  2010-04-28 20:37   ` Mathieu Desnoyers
@ 2010-04-28 23:56     ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-28 23:56 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers

On Wed, 2010-04-28 at 16:37 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > This patch allows data to be passed to the tracepoint callbacks
> > if the tracepoint was created to do so.
> > 
> > If a tracepoint is defined with:
> > 
> > DECLARE_TRACE_DATA(name, proto, args)
> > 
> > Then a registered function can also register data to be passed
> > to the tracepoint as such:
> > 
> >   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> > 
> >   /* In the C file */
> > 
> >   DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
> > 
> >   [...]
> > 
> > >        trace_mytracepoint(status);
> > 
> >   /* In a file registering this tracepoint */
> > 
> >   int my_callback(int status, void *data)
> >   {
> > 	struct my_struct my_data = data;
> > 	[...]
> >   }
> > 
> >   [...]
> > 	my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
> > 	init_my_data(my_data);
> > 	register_trace_mytracepoint_data(my_callback, my_data);
> > 
> > > The same callback can also be registered to the same tracepoint as long
> > > as the data registered is different. Note, the data must also be used
> > > to unregister the callback:
> > 
> > 	unregister_trace_mytracepoint_data(my_callback, my_data);
> > 
> > > Because of the data parameter, tracepoints declared this way must have
> > > at least one argument. That is:
> > 
> >   DECLARE_TRACE_DATA(mytracepoint, TP_PROTO(void), TP_ARGS());
> > 
> > will cause an error, but the original DECLARE_TRACE still allows for this.
> > 
> > The DECLARE_TRACE_DATA() will be used by TRACE_EVENT() so that it
> > can reuse code and bring the size of the tracepoint footprint down.
> > This means that TRACE_EVENT()s must have at least one argument defined.
> > This should not be a problem since we should never have a static
> > tracepoint in the kernel that simply says "Look I'm here!".
> > 
> 
> I'm not convinced DECLARE_TRACE_DATA() is an appropriate name. Sounds
> confusing. What kind of data is this ? It is not obvious that this
> refers to callback private data.

Well, looking at the examples, it's pretty obvious what data is ;-)

> 
> Why can't we just extend the existing DECLARE_TRACE() instead and add a
> "callback_data" argument (or something slightly less verbose) ? We can
> update all users anyway.
> 
> We can also create a variant when there are no arguments passed:
> 
> DECLARE_TRACE_NOARG()

I have no problem with modifying DECLARE_TRACE() this way. In fact that
was the original way I did it. I was just concerned that DECLARE_TRACE()
would then no longer allow for (void), which breaks your example in the
samples dir.

We can make DECLARE_TRACE() add the callback data, and add a NOARG()
version for those that do not have any args.
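
A rough userspace sketch of what registering a callback together with its
private data amounts to. Every sketch_* name is hypothetical and the probe
array is a simplification, not the real tracepoint code. The assumption
here is that a registration is identified by the (callback, data) pair, so
unregistering needs the same data pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PROBES 8

/* Hypothetical callback type: tracepoint args first, private data last. */
typedef void (*sketch_probe_fn)(int status, void *data);

struct sketch_probe {
	sketch_probe_fn func;
	void *data;
};

/* Hypothetical stand-in for the probes attached to one tracepoint. */
struct sketch_tracepoint {
	struct sketch_probe probes[SKETCH_MAX_PROBES];
	size_t nr;
};

/* Register a (func, data) pair; an identical pair is rejected. */
static int sketch_register(struct sketch_tracepoint *tp,
			   sketch_probe_fn func, void *data)
{
	size_t i;

	for (i = 0; i < tp->nr; i++)
		if (tp->probes[i].func == func && tp->probes[i].data == data)
			return -1;
	if (tp->nr == SKETCH_MAX_PROBES)
		return -1;
	tp->probes[tp->nr].func = func;
	tp->probes[tp->nr].data = data;
	tp->nr++;
	return 0;
}

/* Unregister needs the same data pointer to find the right entry. */
static int sketch_unregister(struct sketch_tracepoint *tp,
			     sketch_probe_fn func, void *data)
{
	size_t i;

	for (i = 0; i < tp->nr; i++) {
		if (tp->probes[i].func == func && tp->probes[i].data == data) {
			tp->probes[i] = tp->probes[tp->nr - 1];
			tp->nr--;
			return 0;
		}
	}
	return -1;
}

/* What trace_<name>(status) boils down to: call each probe with its data. */
static void sketch_trace(const struct sketch_tracepoint *tp, int status)
{
	size_t i;

	for (i = 0; i < tp->nr; i++)
		tp->probes[i].func(status, tp->probes[i].data);
}

/* Example callback: add status to the counter passed as private data. */
static void sketch_count_cb(int status, void *data)
{
	int *total = data;

	*total += status;
}

/* One callback, two data pointers, two independent registrations. */
static void sketch_demo(void)
{
	struct sketch_tracepoint tp = { .nr = 0 };
	int a = 0, b = 0;

	assert(sketch_register(&tp, sketch_count_cb, &a) == 0);
	assert(sketch_register(&tp, sketch_count_cb, &b) == 0);
	sketch_trace(&tp, 3);
	assert(a == 3 && b == 3);
	assert(sketch_unregister(&tp, sketch_count_cb, &a) == 0);
	sketch_trace(&tp, 2);
	assert(a == 3 && b == 5);
}
```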


> 
> We had to do the same for the Linux kernel markers in the past. Then we
> can create a TRACE_EVENT_NOARG() macro if necessary.

Hmm, this may be difficult, since TRACE_EVENT() requires an arg to be
passed. I guess we can make NOARG just ignore the "arg" value.

> 
> I don't think it makes sense to require users to pass arguments. It
> should be possible to just say "I'm here". Cases where this could make
> sense includes cases where we'd only be interested in global variables
> at a specific tracepoint.

Well, as Li just pointed out, we already require it ;-)

Not a big deal, we can add a noarg version in the future, but this is
the cost of doing advanced work with CPP.

-- Steve



* Re: [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
  2010-04-28 20:39   ` Mathieu Desnoyers
@ 2010-04-28 23:57     ` Steven Rostedt
  2010-04-29  0:03       ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-28 23:57 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Wed, 2010-04-28 at 16:39 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Switch the TRACE_EVENT() macros to use DECLARE_TRACE_DATA(). This
> > patch is done to prove that the DATA macros work. If any regressions
> > were to surface, then this patch would help a git bisect to localize
> > the area.
> > 
> > Once again this patch increases the size of the kernel.
> > 
> 
> As recommended in the earlier email:
> 
> It would make sense to just add the extra "callback_data" argument
> directly to DECLARE_TRACE(), modify the user (TRACE_EVENT) accordingly.
> And possibly create a TRACE_EVENT_NOARG() variant.

Are you suggesting to make DECLARE_TRACE() be...

#define DECLARE_TRACE(name, proto, args, data)

?

-- Steve



* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-28 20:44   ` Mathieu Desnoyers
@ 2010-04-29  0:00     ` Steven Rostedt
  2010-04-29  0:05       ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29  0:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Wed, 2010-04-28 at 16:44 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > This patch removes the register functions of TRACE_EVENT() to enable
> > and disable tracepoints. The registering of an event is now done
> > directly in the trace_events.c file. The tracepoint_probe_register()
> > is now called directly.
> > 
> > The prototypes are no longer type checked, but this should not be
> > an issue since the tracepoints are created automatically by the
> > macros. If a prototype is incorrect in the TRACE_EVENT() macro, then
> > other macros will catch it.
> > 
> > The ftrace_event_class structure now holds the probes to be called
> > by the callbacks. This removes the need for each event to have
> > a separate pointer for the probe.
> > 
> > To handle kprobes and syscalls, since they register probes in a
> > different manner, a "reg" field is added to the ftrace_event_class
> > structure. If the "reg" field is assigned, then it will be called for
> > enabling and disabling of the probe for either ftrace or perf. To let
> > the reg function know what is happening, a new enum (trace_reg) is
> > created that has the type of control that is needed.
> > 
> > With this new rework, the 82 kernel events and 616 syscall events
> > have their footprint dramatically lowered:
> > 
> >    text	   data	    bss	    dec	    hex	filename
> > 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> > 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> > 5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
> > 5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data
> > 5774316	1306580	9351592	16432488	 fabd68	vmlinux.regs
> > 
> > The size went from 16477030 to 16432488, that's a total of 44K
> > in savings. With tracepoints being continuously added, it is
> > critical that the footprint becomes minimal.
> 
> Have you tried doing a BUILD_BUG_ON() on __typeof__() mismatch between
> the type of the callback generated by TRACE_EVENT() and the expected
> type ?  This might help catching tricky preprocessor macro errors early.

Well, we could, but if it is broken once, it is broken everywhere.
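
For reference, a compile-time check along those lines could look roughly
like this in userspace C. This is a sketch under assumptions: it relies on
the GCC/Clang builtin __builtin_types_compatible_p and on C11
_Static_assert as a stand-in for the kernel's BUILD_BUG_ON(), and all
sketch_* names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's BUILD_BUG_ON(). */
#define SKETCH_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "probe type mismatch")

/* The callback type the tracepoint expects (hypothetical example). */
typedef void (*sketch_expected_fn)(int status, void *data);

/* A deliberately different type, used only to show a mismatch. */
typedef void (*sketch_wrong_fn)(long status, void *data);

/* A probe as the event macros might generate it. */
static void sketch_generated_probe(int status, void *data)
{
	(void)status;
	(void)data;
}

/* Fails the build if the probe's type differs from the expected one. */
#define SKETCH_CHECK_PROBE(probe, type)					\
	SKETCH_BUILD_BUG_ON(!__builtin_types_compatible_p(		\
				__typeof__(&(probe)), type))

/* Compiles cleanly because the two types match. */
SKETCH_CHECK_PROBE(sketch_generated_probe, sketch_expected_fn);

/* Runtime view of the same comparison, handy for experimenting. */
static int sketch_probe_matches(void)
{
	return __builtin_types_compatible_p(
			__typeof__(&sketch_generated_probe),
			sketch_expected_fn);
}

static void sketch_runtime_check(void)
{
	assert(sketch_probe_matches());
}
```

Swapping sketch_expected_fn for sketch_wrong_fn in the SKETCH_CHECK_PROBE()
line should turn the mismatch into a build error instead of a runtime
surprise.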

-- Steve



* Re: [PATCH 05/10][RFC] tracing: Move fields from event to class structure
  2010-04-28 20:58   ` Mathieu Desnoyers
@ 2010-04-29  0:02     ` Steven Rostedt
       [not found]       ` <20100429133213.GA14617@Krystal>
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29  0:02 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers, Tom Zanussi

On Wed, 2010-04-28 at 16:58 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Move the defined fields from the event to the class structure.
> > Since the fields of the event are defined by the class they belong
> > to, it makes sense to have the class hold the information instead
> > of the individual events. The events of the same class would just
> > hold duplicate information.
> > 
> > After this change the size of the kernel dropped another 8K:
> > 
> >    text	   data	    bss	    dec	    hex	filename
> > 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> > 5774316	1306580	9351592	16432488	 fabd68	vmlinux.reg
> > 5774503	1297492	9351592	16423587	 fa9aa3	vmlinux.fields
> > 
> > Although the text increased, this was mainly due to the C files
> > having to adapt to the change. This is a constant increase, where
> > new tracepoints will not increase the Text. But the big drop is
> > in the data size (as well as needed allocations to hold the fields).
> > This will give even more savings as more tracepoints are created.
> > 
> > Note, if just TRACE_EVENT()s are used and not DECLARE_EVENT_CLASS()
> > with several DEFINE_EVENT()s, then the savings will be lost. But
> > we are pushing developers to consolidate events with DEFINE_EVENT()
> > so this should not be an issue.
> > 
> > The kprobes define a unique class to every new event, but are dynamic
> > so it should not be an issue.
> > 
> > The syscalls however have a single class but the fields for the individual
> > events are different. The syscalls use metadata to define the
> > fields. I moved the fields list from the event to the metadata and
> > added a "get_fields()" function to the class. This function is used
> > to find the fields. For normal events and kprobes, get_fields() just
> > returns a pointer to the fields list_head in the class. For syscall
> > events, it returns the fields list_head in the metadata for the event.
> 
> So, playing catch-up here, why don't we simply put each syscall event in
> its own class ? We could possibly share the class where it makes
> sense (e.g. exact same fields).

Well, they have their own class. But I guess you are talking about a
"meta-data class".

> 
> With the per-class sub-metadata, what are the limitations we have to
> expect with these system call events ? Can we map to a field size
> directly from the event ID, or do we have to somehow have the event size
> encoded in the header to make sense of the payload ?

That will be a lot of work. This is all generated automatically from the
SYSCALL() macros. To group them, we need a way to know what syscalls
have the same parameters, and manually add that. It may end up being a
maintenance nightmare.
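
To illustrate the get_fields() indirection from the patch description, a
simplified userspace sketch follows. The sketch_* structures are
hypothetical and a plain counter stands in for the fields list_head. The
point is only that one shared class can still hand back per-event fields
by looking at the event's metadata.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the list_head of fields; hypothetical. */
struct sketch_fields {
	int nr_fields;
};

struct sketch_call;

/* Simplified class: its own fields plus a lookup hook. */
struct sketch_class {
	struct sketch_fields fields;
	struct sketch_fields *(*get_fields)(struct sketch_call *call);
};

/* Per-syscall metadata carries that syscall's fields. */
struct sketch_syscall_meta {
	struct sketch_fields fields;
};

struct sketch_call {
	struct sketch_class *class;
	void *data;	/* syscall events: points at the metadata */
};

/* Normal events and kprobes: the fields live in the class. */
static struct sketch_fields *sketch_class_fields(struct sketch_call *call)
{
	return &call->class->fields;
}

/* Syscall events: the fields live in the event's metadata. */
static struct sketch_fields *sketch_syscall_fields(struct sketch_call *call)
{
	struct sketch_syscall_meta *meta = call->data;

	return &meta->fields;
}

/* Callers never care which case applies. */
static struct sketch_fields *sketch_get_fields(struct sketch_call *call)
{
	return call->class->get_fields(call);
}

/* Two syscall events share one class yet report different fields. */
static void sketch_demo(void)
{
	struct sketch_class sys = { .get_fields = sketch_syscall_fields };
	struct sketch_syscall_meta m1 = { { 3 } }, m2 = { { 1 } };
	struct sketch_call c1 = { &sys, &m1 }, c2 = { &sys, &m2 };

	assert(sketch_get_fields(&c1)->nr_fields == 3);
	assert(sketch_get_fields(&c2)->nr_fields == 1);
}
```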

-- Steve



* Re: [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA()
  2010-04-28 23:57     ` Steven Rostedt
@ 2010-04-29  0:03       ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-29  0:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Wed, 2010-04-28 at 16:39 -0400, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > From: Steven Rostedt <srostedt@redhat.com>
> > > 
> > > Switch the TRACE_EVENT() macros to use DECLARE_TRACE_DATA(). This
> > > patch is done to prove that the DATA macros work. If any regressions
> > > were to surface, then this patch would help a git bisect to localize
> > > the area.
> > > 
> > > Once again this patch increases the size of the kernel.
> > > 
> > 
> > As recommended in the earlier email:
> > 
> > It would make sense to just add the extra "callback_data" argument
> > directly to DECLARE_TRACE(), modify the user (TRACE_EVENT) accordingly.
> > And possibly create a TRACE_EVENT_NOARG() variant.
> 
> Are you suggesting to make DECLARE_TRACE() be...
> 
> #define DECLARE_TRACE(name, proto, args, data)
> 
> ?

err.. forget about that. We only need to modify the callback to take the
extra argument into account, not DECLARE_TRACE().

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure
  2010-04-28 21:06   ` Mathieu Desnoyers
@ 2010-04-29  0:04     ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29  0:04 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Wed, 2010-04-28 at 17:06 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Now that the trace_event structure is embedded in the ftrace_event_call
> > structure, there is no need for the ftrace_event_call id field.
> > The id field is the same as the trace_event type field.
> > 
> > Removing the id and re-arranging the structure brings down the tracepoint
> > footprint by another 5K.
> 
> I might have missed it, but how exactly is the event type allocated
> uniquely ? Is it merely a duplicate of the call "id" field ?

It is allocated in kernel/trace/trace_events.c.

The code there scans the "_ftrace_events" section to find all the events
that are created, and it assigns the events a unique id. Currently the
id is just copied from the trace_event field to the ftrace_event_call
data. The two are the same.
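
A stripped-down sketch of why the separate id is redundant once the
trace_event is embedded. The sketch_* types and the simple counter are
assumptions for illustration; the one thing taken from the patch is that
registration returns the assigned type, with 0 meaning failure.

```c
#include <assert.h>

/* Simplified, hypothetical versions of the two structures. */
struct sketch_trace_event {
	int type;
};

struct sketch_event_call {
	const char *name;
	struct sketch_trace_event event;	/* embedded, so no separate id */
};

/* Assumption: a plain counter hands out types; 0 is never a valid type. */
static int sketch_next_type = 1;

/* Stand-in for register_ftrace_event(): assign and return a unique type. */
static int sketch_register_event(struct sketch_trace_event *event)
{
	event->type = sketch_next_type++;
	return event->type;
}

/* Walk the calls the way the section scan walks the events. */
static int sketch_init_events(struct sketch_event_call *calls, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (!sketch_register_event(&calls[i].event))
			return -1;
	return 0;
}

/* Anything that used call->id can read call->event.type instead. */
static int sketch_event_id(const struct sketch_event_call *call)
{
	return call->event.type;
}

static void sketch_demo(void)
{
	struct sketch_event_call calls[2] = { { .name = "a" }, { .name = "b" } };

	assert(sketch_init_events(calls, 2) == 0);
	assert(sketch_event_id(&calls[0]) != sketch_event_id(&calls[1]));
}
```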

-- Steve



* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-29  0:00     ` Steven Rostedt
@ 2010-04-29  0:05       ` Mathieu Desnoyers
  2010-04-29  0:20         ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-29  0:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Wed, 2010-04-28 at 16:44 -0400, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > From: Steven Rostedt <srostedt@redhat.com>
> > > 
> > > This patch removes the register functions of TRACE_EVENT() to enable
> > > and disable tracepoints. The registering of a event is now down
> > > directly in the trace_events.c file. The tracepoint_probe_register()
> > > is now called directly.
> > > 
> > > The prototypes are no longer type checked, but this should not be
> > > an issue since the tracepoints are created automatically by the
> > > macros. If a prototype is incorrect in the TRACE_EVENT() macro, then
> > > other macros will catch it.
> > > 
> > > The trace_event_class structure now holds the probes to be called
> > > by the callbacks. This removes needing to have each event have
> > > a separate pointer for the probe.
> > > 
> > > To handle kprobes and syscalls, since they register probes in a
> > > different manner, a "reg" field is added to the ftrace_event_class
> > > structure. If the "reg" field is assigned, then it will be called for
> > > enabling and disabling of the probe for either ftrace or perf. To let
> > > the reg function know what is happening, a new enum (trace_reg) is
> > > created that has the type of control that is needed.
> > > 
> > > With this new rework, the 82 kernel events and 616 syscall events
> > > has their footprint dramatically lowered:
> > > 
> > >    text	   data	    bss	    dec	    hex	filename
> > > 5788186	1337252	9351592	16477030	 fb6b66	vmlinux.orig
> > > 5792282	1333796	9351592	16477670	 fb6de6	vmlinux.class
> > > 5793448	1333780	9351592	16478820	 fb7264	vmlinux.tracepoint
> > > 5796926	1337748	9351592	16486266	 fb8f7a	vmlinux.data
> > > 5774316	1306580	9351592	16432488	 fabd68	vmlinux.regs
> > > 
> > > The size went from 16477030 to 16432488, that's a total of 44K
> > > in savings. With tracepoints being continuously added, this is
> > > critical that the footprint becomes minimal.
> > 
> > Have you tried doing a BUILD_BUG_ON() on __typeof__() mismatch between
> > the type of the callback generated by TRACE_EVENT() and the expected
> > type ?  This might help catching tricky preprocessor macro errors early.
> 
> Well, we could, but if it is broken once, it is broken everywhere.

I worry about "subtly" broken things, where trace data could end up being
incorrectly typed and/or corrupted. I think this BUILD_BUG_ON() would
be very useful.

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-29  0:05       ` Mathieu Desnoyers
@ 2010-04-29  0:20         ` Steven Rostedt
       [not found]           ` <20100429133649.GC14617@Krystal>
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29  0:20 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Wed, 2010-04-28 at 20:05 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:

> > > Have you tried doing a BUILD_BUG_ON() on __typeof__() mismatch between
> > > the type of the callback generated by TRACE_EVENT() and the expected
> > > type ?  This might help catching tricky preprocessor macro errors early.
> > 
> > Well, we could, but if it is broken once, it is broken everywhere.
> 
> I fear about "subtly" broken things, where trace data could end up being
> incorrectly typed and/or corrupted. I think this BUILD_BUG_ON() will
> become very useful.

Actually, I'm not sure what you want to check. What is no longer checked
is the prototype that is created against the prototype that is passed to
tracepoint_probe_register(). Other parts are still checked. If you
mismatch the args with the parameters, there are still places where the
compiler will flag it. There really is not much less protection here
than there was before.

Instead of calling register_trace_##name that is created for each
tracepoint, we now call the tracepoint_probe_register() directly in the
C file with the generated probe.

Both the probe and the tracepoint are created from the same data. I'm
not seeing where you want to add this check.

-- Steve



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 05/10][RFC] tracing: Move fields from event to class structure
       [not found]       ` <20100429133213.GA14617@Krystal>
@ 2010-04-29 13:50         ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29 13:50 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig,
	Mathieu Desnoyers, Tom Zanussi

On Thu, 2010-04-29 at 09:32 -0400, Mathieu Desnoyers wrote:

> OK, just to make sure I understand what we currently have and how I
> could deal with it:
> 
> Let's say syscall_entry has ID 17 in the event header. Let's suppose its
> first field is the system call number (an unsigned short). This system
> call number will determine the following binary format (how many fields
> recorded in the event) and the metadata telling how to print these
> fields as well.
> 
> So for a syscall_entry event, we could have metadata describing it
> (I'm doing an ad-hoc metadata format here, don't mind about the exact
> formulation for now). The first field value would determine the type
> cast of the following fields:
> 
> event {
>   name: syscall_entry;
>   id: 17;
>   field {
>     name: syscall_id;
>     type: unsigned short;
>     typecast: syscall_entry_params;
>   }
>   typecast {
>     name: syscall_entry_params;
>     map {
>       value: 0;  /* sys_read on x86_64 */
>       field {
>         type: unsigned int;
>         name: fd;
>       }
>       field {
>         type: char __user *;
>         name: buf;
>       }
>       field {
>         type: size_t;
>         name: count;
>       }
>     }
>    map {
>      value: 1;	/* sys_write on x86_64 */
>      ....
>   }
>   and so on for all other ......
> }
> 
> Does that look correct ? Maybe I'm just re-doing something already
> existing, so I prefer to ask first.

I'm a little confused by your example, but perhaps you are describing
what we already have.

All syscall_entry events share one class, and all syscall_exit events
share another.

What each syscall has separately is its metadata, which is unique to
syscalls; normal TRACE_EVENT()s do not have it. It is put in the
call->private field.

So what we have is:

struct ftrace_event_class event_class_syscall_enter;


This handles the printing of most of the data. What it does not cover is
how to print the parameters of the syscall itself. The metadata is
created per syscall and specifies that syscall's parameters.

The same metadata is used by both the syscall enter and exit events.
The metadata is described in include/trace/syscalls.h:

struct syscall_metadata {
	const char	*name;
	int		syscall_nr;
	int		nb_args;
	const char	**types;
	const char	**args;

	struct ftrace_event_call *enter_event;
	struct ftrace_event_call *exit_event;
};


Here the types and args arrays describe how to print the syscall
parameters.
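
To make that concrete, here is a small user-space sketch of how such
metadata could drive the printing. It is only an illustration under
assumptions: format_syscall_enter(), the output layout, and the
hand-written sys_read tables are hypothetical, and the two event
pointers are left out of the struct.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Trimmed copy of the structure above; the event pointers are left out. */
struct syscall_metadata {
	const char	*name;
	int		syscall_nr;
	int		nb_args;
	const char	**types;
	const char	**args;
};

/* Format "name(type arg: value, ...)" into buf; returns the length written. */
static int format_syscall_enter(char *buf, size_t len,
				const struct syscall_metadata *meta,
				const unsigned long *vals)
{
	int i, pos;

	assert(buf != NULL && len > 0);
	pos = snprintf(buf, len, "%s(", meta->name);
	for (i = 0; i < meta->nb_args && (size_t)pos < len; i++)
		pos += snprintf(buf + pos, len - pos, "%s%s %s: %lx",
				i ? ", " : "", meta->types[i],
				meta->args[i], vals[i]);
	if ((size_t)pos < len)
		pos += snprintf(buf + pos, len - pos, ")");
	return pos;
}

/* Hypothetical metadata for sys_read, written by hand for this example. */
static const char *read_types[] = { "unsigned int", "char *", "size_t" };
static const char *read_args[]  = { "fd", "buf", "count" };
static const struct syscall_metadata meta_read = {
	.name = "sys_read", .syscall_nr = 0, .nb_args = 3,
	.types = read_types, .args = read_args,
};
```

One formatting routine serves every syscall; only the small per-syscall
tables differ, which is the point of keeping them out of the class.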

I don't see how we can group it any better, without manually doing it.

Hope this explains things better.

-- Steve



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
       [not found]           ` <20100429133649.GC14617@Krystal>
@ 2010-04-29 14:06             ` Steven Rostedt
  2010-04-29 14:55               ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29 14:06 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Thu, 2010-04-29 at 09:36 -0400, Mathieu Desnoyers wrote:

> > Instead of calling register_trace_##name that is created for each
> > tracepoint, we now call the tracepoint_probe_register() directly in the
> > C file with the generated probe.
> > 
> > Both the probe and the tracepoint are created from the same data. I'm
> > not seeing where you want to add this check.
> 
> So if they are created from the same data, we can expect this test to
> always pass, which is good (and expected).
> 
> I'd add this extra check before casting the callback to (void *) before
> it is passed to tracepoint_probe_register(). Let's just call this
> internal preprocessing macro integrity check. As long as it does not add
> a runtime cost, there is no reason not to put this extra check.

The problem is, the cast is now performed in a C file for all events.
There's no way to know what to cast it to there. This is out of the
automation of the macro.

We used to have the cast check by generating the
"register_trace_##call" function, and the typecheck was done by passing
the callback to this function. But we removed this code from each event,
since it was adding lots of text footprint, and moved it to one single
function that handles all events. It is just expected that the callback
created matches the tracepoint it was created for.

If you are overly paranoid, we could create a special function that
tests that the callback format that is created matches the tracepoint
that is created, and make it so GCC sees that nothing calls it and
removes it at final link. But I still see this as a waste.


The tracepoint is created in include/linux/tracepoint.h:

#define TRACE_EVENT(name, proto, args, struct, assign, print)	\
	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))

The callback is created in include/trace/ftrace.h:

#undef TRACE_EVENT
#define TRACE_EVENT(name, proto, args, tstruct, assign, print)	\
	DECLARE_EVENT_CLASS(name,				\
				PARAMS(proto),			\
				PARAMS(args),			\
				PARAMS(tstruct),		\
				PARAMS(assign),			\
				PARAMS(print));			\
	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));

#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
									\
static notrace void							\
ftrace_raw_event_##call(proto,						\
			struct ftrace_event_call *event_call)		\
[...]


Thus the "proto" field of the TRACE_EVENT() is used to make both the
tracepoint and the callback. We add the struct ftrace_event_call
*event_call which is the data we pass to the callback.

Now, where this gets called is in kernel/trace/trace_events.c:

	tracepoint_probe_register(call->name,
				  call->class->probe,
				  call);

This is where we lose the typecheck. So my question is... where do you
want to put in a check?
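
A stripped-down user-space sketch of the same shape, for illustration
only. The macro bodies, struct event_call, and the "sample" event are
invented here; the real macros are far larger. The generic function
pointer stands in for the (void *) the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

#define PARAMS(...) __VA_ARGS__

/* Hypothetical per-event record; the probe is stored without its type. */
struct event_call {
	const char *name;
	void (*probe)(void);
	int hits;
};

/* "tracepoint.h" side: the typed call site, built from proto and args. */
#define DECLARE_TRACE(name, proto, args)				\
	typedef void (*name##_probe_t)(proto, struct event_call *);	\
	static inline void trace_##name(struct event_call *call, proto)	\
	{								\
		assert(call->probe != NULL);				\
		((name##_probe_t)call->probe)(args, call);		\
	}

/* "ftrace.h" side: the callback, built from the same proto. */
#define DEFINE_PROBE(name, proto)					\
	static void raw_event_##name(proto, struct event_call *call)	\
	{								\
		call->hits++;						\
	}

DECLARE_TRACE(sample, PARAMS(int a, long b), PARAMS(a, b))
DEFINE_PROBE(sample, PARAMS(int a, long b))

/* The cast below is the one spot where the compiler stops checking. */
static struct event_call sample_call = {
	.name = "sample",
	.probe = (void (*)(void))raw_event_sample,
};
```

Both expansions see the identical proto text, so they can only disagree
if one of the macros itself mangles it.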

-- Steve

	



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-29 14:06             ` Steven Rostedt
@ 2010-04-29 14:55               ` Mathieu Desnoyers
  2010-04-29 16:06                 ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-29 14:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Thu, 2010-04-29 at 09:36 -0400, Mathieu Desnoyers wrote:
> 
> > > Instead of calling register_trace_##name that is created for each
> > > tracepoint, we now call the tracepoint_probe_register() directly in the
> > > C file with the generated probe.
> > > 
> > > Both the probe and the tracepoint are created from the same data. I'm
> > > not seeing where you want to add this check.
> > 
> > So if they are created from the same data, we can expect this test to
> > always pass, which is good (and expected).
> > 
> > I'd add this extra check before casting the callback to (void *) before
> > it is passed to tracepoint_probe_register(). Let's just call this
> > internal preprocessing macro integrity check. As long as it does not add
> > a runtime cost, there is no reason not to put this extra check.
> 
> The problem is, the cast is now performed in a C file for all events.
> There's no way to know what to cast it to there. This is out of the
> automation of the macro.
> 
> We use to have the cast check by creating code that would create the
> "register_trace_##call", and the typecheck was doing by passing the data
> to this function. But we removed this code out of the per event, it was
> adding lots of text footprint, and moved it to one single function that
> handles all events. It is just expected that the callback created
> matches the function it was done.
> 
> If you are overly paranoid, we could create a special function that
> tests that the callback format that is created matches the tracepoint
> that is created, and make it so GCC sees that nothing calls it and
> removes it at final link. But I still see this as a waste.
> 
> 
> The tracepoint is created in include/linux/tracepoint.h:
> 
> #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
> 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))

Can we add something like this to DECLARE_TRACE ? (not convinced it is
valid though)

static inline void check_trace_##name(cb_type)
{
	BUILD_BUG_ON(!__same_type(void (*probe)(TP_PROTO(proto), void *data),
				  cb_type));
}

> 
> The callback is created in include/trace/ftrace.h:
> 
> #undef TRACE_EVENT
> #define TRACE_EVENT(name, proto, args, tstuct, assign, print)	\
> 	DECLARE_EVENT_CLASS(name,				\
> 				PARAMS(proto),			\
> 				PARAMS(args),			\
> 				PARAMS(tstruct),		\
> 				PARAMS(assign),			\
> 				PARAMS(print));			\
> 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
> 
> #undef DECLARE_EVENT_CLASS
> #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> 									\
> static notrace void							\
> ftrace_raw_event_##call(proto,						\
> 			struct ftrace_event_call *event_call)		\
> [...]
> 

Either within this callback, or in a dummy static function after, we
could add:

check_trace_##call(ftrace_raw_event_##call);

So.. you are the preprocessor expert, do you think this could fly ? ;)
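
For reference, a self-contained sketch of this kind of compile-time
comparison outside the kernel. It assumes GCC or Clang and C11; the
same_type() macro mirrors how the kernel's __same_type() is defined,
_Static_assert stands in for BUILD_BUG_ON(), and the probe names are
made up for the example.

```c
#include <assert.h>

/* Same builtin the kernel's __same_type() is built on (GCC/Clang only). */
#define same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Prototype the tracepoint expects: proto followed by the data pointer. */
typedef void (*sample_probe_t)(int value, void *data);

static int good_hits;

static void good_probe(int value, void *data)
{
	(void)data;
	good_hits += value;
}

/* Deliberately wrong first parameter type. */
static void bad_probe(long value, void *data)
{
	(void)value;
	(void)data;
}

/* Evaluated entirely at compile time, so no object code is generated. */
_Static_assert(same_type(&good_probe, (sample_probe_t)0),
	       "probe does not match the tracepoint prototype");

/* The mismatch is visible to the compiler as a constant 0. */
enum { BAD_PROBE_MATCHES = same_type(&bad_probe, (sample_probe_t)0) };
```

Swapping bad_probe into the _Static_assert turns the mismatch into a
build failure instead of a runtime surprise.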

Thanks,

Mathieu

> 
> Thus the "proto" field of the TRACE_EVENT() is used to make both the
> tracepoint and the callback. We add the struct ftrace_event_call
> *event_call which is the data we pass to the callback.
> 
> Now, where this gets called is in kernel/trace/trace_events.c:
> 
> 	tracepoint_probe_register(call->name,
> 				  call->class->probe,
> 				  call);
> 
> This is where we lose the typecheck. So my question is... where do you
> want to put in a check?
> 
> -- Steve
> 
> 	
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-29 14:55               ` Mathieu Desnoyers
@ 2010-04-29 16:06                 ` Steven Rostedt
  2010-04-30 17:09                   ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-29 16:06 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Thu, 2010-04-29 at 10:55 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:

> > 
> > The tracepoint is created in include/linux/tracepoint.h:
> > 
> > #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
> > 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> 
> Can we add something like this to DECLARE_TRACE ? (not convinced it is
> valid though)
> 
> static inline void check_trace_##name(cb_type)
> {
> 	BUILD_BUG_ON(!__same_type(void (*probe)(TP_PROTO(proto), void *data),
> 				  cb_type));
> }
> 

We could add it, but I'm not sure it would add any more protection. If
for some strange reason the prototype got out of sync, what would
prevent the cb_type from getting out of sync with it too, so that the
check passes but we still have the same bug?

Honestly, I find this a bit too paranoid. Again, the callback and the
tracepoint are made with the same data. I find it hard to think that it
would break somehow. Yes, perhaps it will break if you modify ftrace.h,
but then if you are doing that, you should know better than to break
things :-)


> > 
> > The callback is created in include/trace/ftrace.h:
> > 
> > #undef TRACE_EVENT
> > #define TRACE_EVENT(name, proto, args, tstuct, assign, print)	\
> > 	DECLARE_EVENT_CLASS(name,				\
> > 				PARAMS(proto),			\
> > 				PARAMS(args),			\
> > 				PARAMS(tstruct),		\
> > 				PARAMS(assign),			\
> > 				PARAMS(print));			\
> > 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
> > 
> > #undef DECLARE_EVENT_CLASS
> > #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> > 									\
> > static notrace void							\
> > ftrace_raw_event_##call(proto,						\
> > 			struct ftrace_event_call *event_call)		\
> > [...]
> > 
> 
> Either within this callback, or in a dummy static function after, we
> could add:
> 
> check_trace_##call(ftrace_raw_event_##call);
> 
> So.. you are the preprocessor expert, do you think this could fly ? ;)



Sure, the static function you wrote could be added, and we can hope that
gcc is smart enough to get rid of it (add __unused to it). But what are
we really checking here? If CPP works?

-- Steve



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-29 16:06                 ` Steven Rostedt
@ 2010-04-30 17:09                   ` Mathieu Desnoyers
  2010-04-30 18:16                     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-30 17:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Thu, 2010-04-29 at 10:55 -0400, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> > > 
> > > The tracepoint is created in include/linux/tracepoint.h:
> > > 
> > > #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
> > > 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> > 
> > Can we add something like this to DECLARE_TRACE ? (not convinced it is
> > valid though)
> > 
> > static inline void check_trace_##name(cb_type)
> > {
> > 	BUILD_BUG_ON(!__same_type(void (*probe)(TP_PROTO(proto), void *data),
> > 				  cb_type));
> > }
> > 
> 
> We could add it, but I'm not sure it would add any more protection. If
> for some strange reason the prototype got out of sync, would would
> prevent the cb_type from getting out of sync with it too, and not cause
> this to fail, but still have the same bug.
> 
> Honestly, I find this a bit too paranoid. Again, the callback and the
> tracepoint are made with the same data. I find it hard to think that it
> would break somehow. Yes, perhaps it will break if you modify ftrace.h,
> but then if you are doing that, you should know better than to break
> things :-)

How can you be sure that the "void *data" type will match the type at
the same position in the generated callback ?

Honestly, I don't think kernel programmers write bug-free code. And I
include myself when I say that. So the best we can do, on top of code
review, is to use all the verification and debugging tools available to
minimize the amount of undetected bugs. Rather than try to find out the
cause of subtly broken tracepoint callbacks with their runtime
side-effects, I strongly prefer to let the compiler find this out as
early as possible.

I also don't trust that these complex TRACE_EVENT() preprocessor macros
will never ever have bugs. That's just doomed to happen one day or
another. Again, call me paranoid if you like, but I think adding this
type checking is justified.

I am providing the type check implementation in a separate email. It
will need to be extended to support the extra data parameter you plan to
add.

Thanks,

Mathieu

> 
> 
> > > 
> > > The callback is created in include/trace/ftrace.h:
> > > 
> > > #undef TRACE_EVENT
> > > #define TRACE_EVENT(name, proto, args, tstuct, assign, print)	\
> > > 	DECLARE_EVENT_CLASS(name,				\
> > > 				PARAMS(proto),			\
> > > 				PARAMS(args),			\
> > > 				PARAMS(tstruct),		\
> > > 				PARAMS(assign),			\
> > > 				PARAMS(print));			\
> > > 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
> > > 
> > > #undef DECLARE_EVENT_CLASS
> > > #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> > > 									\
> > > static notrace void							\
> > > ftrace_raw_event_##call(proto,						\
> > > 			struct ftrace_event_call *event_call)		\
> > > [...]
> > > 
> > 
> > Either within this callback, or in a dummy static function after, we
> > could add:
> > 
> > check_trace_##call(ftrace_raw_event_##call);
> > 
> > So.. you are the preprocessor expert, do you think this could fly ? ;)
> 
> 
> 
> Sure, the static function you did could be added, and hope that gcc is
> smart enough to get rid of it (add __unused to it). But what are we
> really checking here? If CPP works?
> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 17:09                   ` Mathieu Desnoyers
@ 2010-04-30 18:16                     ` Steven Rostedt
  2010-04-30 19:06                       ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-30 18:16 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Fri, 2010-04-30 at 13:09 -0400, Mathieu Desnoyers wrote:

> How can you be sure that the "void *data" type will match the type at
> the same position in the generated callback ?


We do it all the time in the kernel with no type checking. Just look at
all the users of file->private_data.


> 
> Honestly, I don't think kernel programmers write bug-free code. And I
> include myself when I say that. So the best we can do, on top of code
> review, is to use all the verification and debugging tools available to
> minimize the amount of undetected bugs. Rather than try to find out the
> cause of subtly broken tracepoint callbacks with their runtime
> side-effects, I strongly prefer to let the compiler find this out as
> early as possible.

If it is possible sure, but that's the point. Where do you add the
check? The typecast is in the C code that is constant for all trace
events.

> 
> I also don't trust that these complex TRACE_EVENT() preprocessor macros

Thanks for your vote of confidence.

> will never ever have bugs. That's just doomed to happen one day or
> another. Again, call me paranoid if you like, but I think adding this
> type checking is justified.

Where do you add the typecheck?? As I said before, if the TRACE_EVENT()
macros are broken, then so will the typecheck, and it will not catch the
errors.

Sure, the event macros can have bugs, but if they do, then all events
will have them, because it is automated. If there is a bug, it won't be
because of a mismatched type being passed in; it would be because one of
the extra macros we have processes the same type incorrectly.

> 
> I am providing the type check implementation in a separate email. It
> will need to be extended to support the extra data parameter you plan to
> add.

I saw the patch, but how does it help?

I use "proto" to make the tracepoint and the callback, so I can add
this "check_trace_callback_type_##name(proto)" somewhere. But if the
macros break somehow, that means proto changed between two references to
it, so what keeps proto from breaking at both the callback creation and
the typecheck?

Basically, you are saying that somehow the argument "proto" can change
between two uses of it. I don't really see that happening, and I'm not
paranoid enough to think that's an issue. Honestly, I find adding checks
that don't really check anything a waste, and just more confusion in the
macros.

-- Steve


> > 
> > 
> > > > 
> > > > The callback is created in include/trace/ftrace.h:
> > > > 
> > > > #undef TRACE_EVENT
> > > > #define TRACE_EVENT(name, proto, args, tstuct, assign, print)	\
> > > > 	DECLARE_EVENT_CLASS(name,				\
> > > > 				PARAMS(proto),			\
> > > > 				PARAMS(args),			\
> > > > 				PARAMS(tstruct),		\
> > > > 				PARAMS(assign),			\
> > > > 				PARAMS(print));			\
> > > > 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
> > > > 
> > > > #undef DECLARE_EVENT_CLASS
> > > > #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> > > > 									\
> > > > static notrace void							\
> > > > ftrace_raw_event_##call(proto,						\
> > > > 			struct ftrace_event_call *event_call)		\
> > > > [...]
> > > > 
> > > 
> > > Either within this callback, or in a dummy static function after, we
> > > could add:
> > > 
> > > check_trace_##call(ftrace_raw_event_##call);
> > > 
> > > So.. you are the preprocessor expert, do you think this could fly ? ;)
> > 
> > 
> > 
> > Sure, the static function you did could be added, and hope that gcc is
> > smart enough to get rid of it (add __unused to it). But what are we
> > really checking here? If CPP works?
> > 
> > -- Steve
> > 
> > 
> 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 18:16                     ` Steven Rostedt
@ 2010-04-30 19:06                       ` Mathieu Desnoyers
  2010-04-30 19:48                         ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-30 19:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Fri, 2010-04-30 at 13:09 -0400, Mathieu Desnoyers wrote:
> 
> > How can you be sure that the "void *data" type will match the type at
> > the same position in the generated callback ?
> 
> 
> We do it all the time in the kernel with no type checking. Just look at
> all the users of file->private.
> 
> 
> > 
> > Honestly, I don't think kernel programmers write bug-free code. And I
> > include myself when I say that. So the best we can do, on top of code
> > review, is to use all the verification and debugging tools available to
> > minimize the amount of undetected bugs. Rather than try to find out the
> > cause of subtly broken tracepoint callbacks with their runtime
> > side-effects, I strongly prefer to let the compiler find this out as
> > early as possible.
> 
> If it is possible sure, but that's the point. Where do you add the
> check? The typecast is in the C code that is constant for all trace
> events.

You can add the call to the static inline type check directly within the
generated probe function, right after the local variable declarations.
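
A minimal sketch of what that could look like, under assumptions: the
DECLARE_TRACE_CHECK() macro and the "sample" event are hypothetical, and
only the check_trace_callback_type_##name naming follows the proposal.

```c
#include <assert.h>

/* Hypothetical: what the tracepoint side could emit next to the tracepoint. */
#define DECLARE_TRACE_CHECK(name, proto)				\
	static inline void						\
	check_trace_callback_type_##name(void (*cb)(proto))		\
	{								\
		(void)cb;						\
	}

DECLARE_TRACE_CHECK(sample, int value)

static int sample_total;

/* Stand-in for the generated probe function. */
static void raw_event_sample(int value)
{
	/* A mismatched prototype draws a compiler diagnostic on this line. */
	check_trace_callback_type_sample(raw_event_sample);
	sample_total += value;
}
```

The check function is empty, so an optimizing compiler is expected to
drop the call entirely; only the argument type matters.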

> 
> > 
> > I also don't trust that these complex TRACE_EVENT() preprocessor macros
> 
> Thanks for your vote of confidence.

Please don't take this personally. As I said above, I include myself in
the list of people I don't trust to write entirely bug-free code. I'm
just saying that we should not overlook a possibility to detect more
bugs automatically when we have one, especially if this results in no
object code change.

> 
> > will never ever have bugs. That's just doomed to happen one day or
> > another. Again, call me paranoid if you like, but I think adding this
> > type checking is justified.
> 
> Where do you add the typecheck?? As I said before, if the TRACE_EVENT()
> macros are broken, then so will the typecheck, and it will not catch the
> errors.
> 
> Sure the event macros can have bugs, but if it does then it will have
> bugs for all. Because it is automated. If there is a bug, it wont be
> because of a missed type being passed in, it would be because of one of
> the extra macros we have that processes the same type incorrectly.
> 
> > 
> > I am providing the type check implementation in a separate email. It
> > will need to be extended to support the extra data parameter you plan to
> > add.
> 
> I saw the patch, but how does it help?
> 
> I use "proto" to make the tracepoint and the callback, so I can add
> somewhere this "check_trace_callback_type_##name(proto)", but if the
> macros break somehow, that means proto changed between two references of
> it, but what keeps proto from breaking at both callback creation and the
> typecheck.
> 
> Basically, you are saying that somehow the argument "proto" can change
> between two uses of it. I don't really see that happening, and I'm not
> paranoid enough to think that's an issue. Adding checks that don't
> really check anything, honestly I find a waste, and just more confusion
> in the macros.

In the TRACE_EVENT() case, without the extra "void *data" argument,
it is indeed checking that the "proto" of the callback you create is
the same as the "proto" expected by the tracepoint call. However, given
that you plan on adding other parameters besides "proto", the added
type-checking makes more and more sense.

Thanks,

Mathieu

> 
> -- Steve
> 
> 
> > > 
> > > 
> > > > > 
> > > > > The callback is created in include/trace/ftrace.h:
> > > > > 
> > > > > #undef TRACE_EVENT
> > > > > #define TRACE_EVENT(name, proto, args, tstuct, assign, print)	\
> > > > > 	DECLARE_EVENT_CLASS(name,				\
> > > > > 				PARAMS(proto),			\
> > > > > 				PARAMS(args),			\
> > > > > 				PARAMS(tstruct),		\
> > > > > 				PARAMS(assign),			\
> > > > > 				PARAMS(print));			\
> > > > > 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
> > > > > 
> > > > > #undef DECLARE_EVENT_CLASS
> > > > > #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
> > > > > 									\
> > > > > static notrace void							\
> > > > > ftrace_raw_event_##call(proto,						\
> > > > > 			struct ftrace_event_call *event_call)		\
> > > > > [...]
> > > > > 
> > > > 
> > > > Either within this callback, or in a dummy static function after, we
> > > > could add:
> > > > 
> > > > check_trace_##call(ftrace_raw_event_##call);
> > > > 
> > > > So.. you are the preprocessor expert, do you think this could fly ? ;)
> > > 
> > > 
> > > 
> > > Sure, the static function you did could be added, and hope that gcc is
> > > smart enough to get rid of it (add __unused to it). But what are we
> > > really checking here? If CPP works?
> > > 
> > > -- Steve
> > > 
> > > 
> > 
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 19:06                       ` Mathieu Desnoyers
@ 2010-04-30 19:48                         ` Steven Rostedt
  2010-04-30 20:07                           ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-30 19:48 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Fri, 2010-04-30 at 15:06 -0400, Mathieu Desnoyers wrote:

> > If it is possible sure, but that's the point. Where do you add the
> > check? The typecast is in the C code that is constant for all trace
> > events.
> 
> You can add the call to the static inline type check directly within the
> generated probe function, right after the local variable declarations.

Well, one thing, the callback is not going to be the same as the
DECLARE_TRACE() because the prototype ends with "void *data", and the
function being called actually uses the type of that data.

We now will have:

	DEFINE_TRACE(mytracepoint, int myarg, myarg);

	void mycallback(int myarg, struct mystruct *mydata);

	register_trace_mytracepoint_data(mycallback, mydata);

There's no place in DEFINE_TRACE to be able to test the type of data
that is being passed back. I could make the calling function be:

	void mycallback(int myarg, void *data)
	{
		struct mystruct *mydata = data;
	[...]

Because the data is defined uniquely by the caller that registers a
callback. Each function can register its own data type.
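
To make the second form concrete, here is a rough userspace sketch. It is
not kernel code: mytracepoint_probe_t and fire_mytracepoint() are names
made up for illustration, standing in for the probe type and the
tracepoint call site. The only point is that the call site sees void *,
and the callback does the cast itself:

```c
#include <stdio.h>

/* Hypothetical private data owned by whoever registers the callback. */
struct mystruct {
	int hits;
	int last_arg;
};

/* Generic probe signature: the tracepoint arguments plus an opaque pointer. */
typedef void (*mytracepoint_probe_t)(int myarg, void *data);

/* The cast back to the real type is explicit, at the top of the callback. */
static void mycallback(int myarg, void *data)
{
	struct mystruct *mydata = data;

	mydata->hits++;
	mydata->last_arg = myarg;
}

/* Stand-in for the tracepoint call site: it only knows about void *. */
static void fire_mytracepoint(mytracepoint_probe_t probe, void *data, int myarg)
{
	probe(myarg, data);
}

/* Fire the probe twice and report what ended up in the private data. */
static int demo_explicit_cast(void)
{
	struct mystruct mydata = { 0, 0 };

	fire_mytracepoint(mycallback, &mydata, 7);
	fire_mytracepoint(mycallback, &mydata, 9);
	printf("hits=%d last_arg=%d\n", mydata.hits, mydata.last_arg);
	return mydata.hits * 100 + mydata.last_arg;
}
```

Nothing in fire_mytracepoint() can know that data is a struct mystruct;
only the code that registered the callback knows that.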

> 
> > 
> > > 
> > > I also don't trust that these complex TRACE_EVENT() preprocessor macros
> > 
> > Thanks for your vote of confidence.
> 
> Please don't take this personally. As I said above, I include myself in
> the list of people I don't trust to write entirely bug-free code. I'm
> just saying that we should not overlook a possibility to detect more
> bugs automatically when we have one, especially if this results in no
> object code change.

The point is that this is not about buggy code. The same data is being
used in two places, and you want to test to make sure it is the same in
both. I don't see how this helps.



> 
> > 
> > > will never ever have bugs. That's just doomed to happen one day or
> > > another. Again, call me paranoid if you like, but I think adding this
> > > type checking is justified.
> > 
> > Where do you add the typecheck?? As I said before, if the TRACE_EVENT()
> > macros are broken, then so will the typecheck, and it will not catch the
> > errors.
> > 
> > Sure the event macros can have bugs, but if it does then it will have
> > bugs for all. Because it is automated. If there is a bug, it won't be
> > because of a missed type being passed in, it would be because of one of
> > the extra macros we have that processes the same type incorrectly.
> > 
> > > 
> > > I am providing the type check implementation in a separate email. It
> > > will need to be extended to support the extra data parameter you plan to
> > > add.
> > 
> > I saw the patch, but how does it help?
> > 
> > I use "proto" to make the tracepoint and the callback, so I can add
> > somewhere this "check_trace_callback_type_##name(proto)", but if the
> > macros break somehow, that means proto changed between two references of
> > it, but what keeps proto from breaking at both callback creation and the
> > typecheck.
> > 
> > Basically, you are saying that somehow the argument "proto" can change
> > between two uses of it. I don't really see that happening, and I'm not
> > paranoid enough to think that's an issue. Adding checks that don't
> > really check anything, honestly I find a waste, and just more confusion
> > in the macros.
> 
> In the TRACE_EVENT() case, without the extra "void *data" argument,
> it is indeed checking that the "proto" of the callback you create is
> the same as the "proto" expected by the tracepoint call. However, given
> that you plan on adding other parameters besides "proto", then the added
> type-checking makes more and more sense.

But you can not test it! That's my point.

The first part of proto will be the same, and that's all we can test.
But the data parameter that the DECLARE_TRACE() is going to create will
be void *. Which means we can't test it. This is something that C lacks,
and we could test it in C++ if we did this with templates. The only way
to test it is at runtime with a magic number in the data field.

This is the same as file->private_data. You can't test it at build
time.
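
For what it's worth, the runtime check with a magic number mentioned
above could look roughly like the sketch below. This is only an
illustration: the magic values, struct otherstruct, and the convention
that every data type starts with its magic number are all assumptions of
mine, nothing that exists in the tracepoint code.

```c
#include <stddef.h>

/* Arbitrary tag values, made up for this sketch. */
#define MYSTRUCT_MAGIC    0x4d595354u
#define OTHERSTRUCT_MAGIC 0x4f544852u

/* Convention assumed here: every data type starts with its magic number. */
struct mystruct {
	unsigned int magic;
	int total;
};

struct otherstruct {
	unsigned int magic;
	const char *label;
};

/* Returns 0 if data really is a struct mystruct, -1 otherwise. */
static int mycallback_checked(int myarg, void *data)
{
	struct mystruct *mydata = data;

	/* A pointer to a struct also points at its first member. */
	if (!data || *(const unsigned int *)data != MYSTRUCT_MAGIC)
		return -1;
	mydata->total += myarg;
	return 0;
}

/* One matching call, then two calls that the runtime check rejects. */
static int demo_magic_check(void)
{
	struct mystruct good = { MYSTRUCT_MAGIC, 0 };
	struct otherstruct bad = { OTHERSTRUCT_MAGIC, "other" };
	int rejected = 0;

	if (mycallback_checked(5, &good) != 0)
		return -1;
	if (mycallback_checked(5, &bad) != 0)
		rejected++;
	if (mycallback_checked(5, NULL) != 0)
		rejected++;
	return good.total * 10 + rejected;
}
```

The compiler accepts all three calls, because each argument converts to
void * without complaint. The mismatch only shows up when the code runs.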

Let me explain this again:

	DECLARE_TRACE(name, proto, args);

Will call the function like:

	callback(args, data);

The callback will be at best:

	int callback(proto, void *data);


because the data being passed in is not defined yet. It is defined at
the point of the registering of the callback. You can have two callbacks
registered to the same tracepoint with two different types as the data
field.
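
A small userspace sketch of that last point, under the assumption of a
toy registry (register_probe() and call_probes() are hypothetical
helpers, not the tracepoint API). Two callbacks hang off the same
"tracepoint", each with its own data type, and the call site treats both
the same way:

```c
#include <stddef.h>

/* Every probe has the same signature as far as the call site can tell. */
typedef void (*probe_fn)(int myarg, void *data);

struct probe_entry {
	probe_fn func;
	void *data;
};

#define MAX_PROBES 4
static struct probe_entry probes[MAX_PROBES];
static size_t nr_probes;

/* Hypothetical registration helper: stores the callback and its data. */
static int register_probe(probe_fn func, void *data)
{
	if (nr_probes == MAX_PROBES)
		return -1;
	probes[nr_probes].func = func;
	probes[nr_probes].data = data;
	nr_probes++;
	return 0;
}

/* Stand-in for the tracepoint call: callback(args, data) for each probe. */
static void call_probes(int myarg)
{
	size_t i;

	for (i = 0; i < nr_probes; i++)
		probes[i].func(myarg, probes[i].data);
}

/* Two unrelated data types, each known only to its own callback. */
struct counter { int calls; };
struct summer { long total; };

static void count_probe(int myarg, void *data)
{
	struct counter *c = data;

	(void)myarg;
	c->calls++;
}

static void sum_probe(int myarg, void *data)
{
	struct summer *s = data;

	s->total += myarg;
}

static long demo_two_probes(void)
{
	struct counter c = { 0 };
	struct summer s = { 0 };
	long result;

	nr_probes = 0;
	register_probe(count_probe, &c);
	register_probe(sum_probe, &s);
	call_probes(3);
	call_probes(4);
	result = c.calls * 10 + s.total;
	nr_probes = 0;	/* drop the pointers to our locals before returning */
	return result;
}
```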

So what is it that this check is testing?

-- Steve




* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 19:48                         ` Steven Rostedt
@ 2010-04-30 20:07                           ` Mathieu Desnoyers
  2010-04-30 20:14                             ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-30 20:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Fri, 2010-04-30 at 15:06 -0400, Mathieu Desnoyers wrote:
> 
> > > If it is possible sure, but that's the point. Where do you add the
> > > check? The typecast is in the C code that is constant for all trace
> > > events.
> > 
> > You can add the call to the static inline type check directly within the
> > generated probe function, right after the local variable declarations.
> 
> Well, one thing, the callback is not going to be the same as the
> DECLARE_TRACE() because the prototype ends with "void *data", and the
> function being called actually uses the type of that data.
> 
> We now will have:
> 
> 	DEFINE_TRACE(mytracepoint, int myarg, myarg);
> 
> 	void mycallback(int myarg, struct mystruct *mydata);
> 
> 	register_trace_mytracepoint_data(mycallback, mydata)
> 
> There's no place in DEFINE_TRACE to be able to test the type of data
> that is being passed back. I could make the calling function be:
> 
> 	void mycallback(int myarg, void *data)
> 	{
> 		struct mystruct *mydata = data;
> 	[...]
> 
> Because the data is defined uniquely by the caller that registers a
> callback. Each function can register its own data type.

Yep. There would need to be a cast from void * to struct mystruct *
at the beginning of the callback as you propose here. I prefer this cast
to be explicit (as proposed here) rather than hidden within the entire
function call (void *) cast.

> 
> > 
> > > 
> > > > 
> > > > I also don't trust that these complex TRACE_EVENT() preprocessor macros
> > > 
> > > Thanks for your vote of confidence.
> > 
> > Please don't take this personally. As I said above, I include myself in
> > the list of people I don't trust to write entirely bug-free code. I'm
> > just saying that we should not overlook a possibility to detect more
> > bugs automatically when we have one, especially if this results in no
> > object code change.
> 
> The point being is that this is not about buggy code, but the fact that
> the same data is being used in two places, you want to test to make sure
> it is the same. I don't see how this helps.

See my comment above about specifically casting the void *data parameter
rather than relying on casting of the whole callback function pointer
type to void *.

> 
> 
> 
> > 
> > > 
> > > > will never ever have bugs. That's just doomed to happen one day or
> > > > another. Again, call me paranoid if you like, but I think adding this
> > > > type checking is justified.
> > > 
> > > Where do you add the typecheck?? As I said before, if the TRACE_EVENT()
> > > macros are broken, then so will the typecheck, and it will not catch the
> > > errors.
> > > 
> > > Sure the event macros can have bugs, but if it does then it will have
> > > bugs for all. Because it is automated. If there is a bug, it won't be
> > > because of a missed type being passed in, it would be because of one of
> > > the extra macros we have that processes the same type incorrectly.
> > > 
> > > > 
> > > > I am providing the type check implementation in a separate email. It
> > > > will need to be extended to support the extra data parameter you plan to
> > > > add.
> > > 
> > > I saw the patch, but how does it help?
> > > 
> > > I use "proto" to make the tracepoint and the callback, so I can add
> > > somewhere this "check_trace_callback_type_##name(proto)", but if the
> > > macros break somehow, that means proto changed between two references of
> > > it, but what keeps proto from breaking at both callback creation and the
> > > typecheck.
> > > 
> > > Basically, you are saying that somehow the argument "proto" can change
> > > between two uses of it. I don't really see that happening, and I'm not
> > > paranoid enough to think that's an issue. Adding checks that don't
> > > really check anything, honestly I find a waste, and just more confusion
> > > in the macros.
> > 
> > In the TRACE_EVENT() case, without the extra "void *data" argument,
> > it is indeed checking that the "proto" of the callback you create is
> > the same as the "proto" expected by the tracepoint call. However, given
> > that you plan on adding other parameters besides "proto", then the added
> > type-checking makes more and more sense.
> 
> But you can not test it! That's my point.
> 
> The first part of proto will be the same, and that's all we can test.
> But the data parameter that the DECLARE_TRACE() is going to create will
> be void *. Which means we can't test it. This is something that C lacks,
> and we could test it in C++ if we did this with templates. The only way
> to test it is at runtime with a magic number in the data field.
> 
> This is the same as the file->private data. You can't test it at build
> time.
> 
> Let me explain this again:
> 
> 	DECLARE_TRACE(name, proto, args);
> 
> Will call the function like:
> 
> 	callback(args, data);
> 
> The callback will be at best:
> 
> 	int callback(proto, void *data);
> 
> 
> because the data being passed in is not defined yet. It is defined at
> the point of the registering of the callback. You can have two callbacks
> registered to the same tracepoint with two different types as the data
> field.
> 
> So what is it that this check is testing?

It's making sure that TRACE_EVENT() creates callbacks with the following
signature:

  void callback(proto, void *data)

rather than

  void callback(proto, struct somestruct *data)

and forces the cast to be done within the callback rather than casting
the whole function pointer type to void *, assuming types to match. I
prefer to leave the cast outside of the tracepoint infrastructure, so we
do not obfuscate the fact that an explicit type cast is needed there.
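
As a rough standalone illustration of what I mean (a sketch only, not the
patch I sent: DECLARE_TRACE_CHECK() and the simplified PARAMS() below are
assumptions made for this example), the check is an empty static inline
whose only parameter is a function pointer of the expected type:

```c
/* Simplified stand-in for the kernel's PARAMS() helper. */
#define PARAMS(...) __VA_ARGS__

/*
 * Generates an empty checker whose argument must be a pointer to
 * void (*)(proto, void *data). It has no runtime effect.
 */
#define DECLARE_TRACE_CHECK(name, proto)				\
	static inline void						\
	check_trace_callback_type_##name(void (*cb)(proto, void *data))	\
	{								\
		(void)cb;						\
	}

DECLARE_TRACE_CHECK(mytracepoint, PARAMS(int myarg))

struct mystruct {
	int seen;
};

/* Matches the expected signature; the cast is explicit in the body. */
static void good_callback(int myarg, void *data)
{
	struct mystruct *mydata = data;

	mydata->seen = myarg;
}

/*
 * A callback declared as
 *	void bad_callback(int myarg, struct mystruct *data);
 * has an incompatible pointer type, so passing it to the checker is
 * diagnosed at build time (warning or error, depending on the compiler
 * and its flags).
 */

static int demo_type_check(void)
{
	struct mystruct mydata = { 0 };

	check_trace_callback_type_mytracepoint(good_callback);
	good_callback(42, &mydata);
	return mydata.seen;
}
```

The checker does nothing at runtime, so I would expect the compiler to
drop it entirely once it is inlined.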

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 20:07                           ` Mathieu Desnoyers
@ 2010-04-30 20:14                             ` Steven Rostedt
  2010-04-30 21:02                               ` Mathieu Desnoyers
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2010-04-30 20:14 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

On Fri, 2010-04-30 at 16:07 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > On Fri, 2010-04-30 at 15:06 -0400, Mathieu Desnoyers wrote:
> > 
> > > > If it is possible sure, but that's the point. Where do you add the
> > > > check? The typecast is in the C code that is constant for all trace
> > > > events.
> > > 
> > > You can add the call to the static inline type check directly within the
> > > generated probe function, right after the local variable declarations.
> > 
> > Well, one thing, the callback is not going to be the same as the
> > DECLARE_TRACE() because the prototype ends with "void *data", and the
> > function being called actually uses the type of that data.
> > 
> > We now will have:
> > 
> > 	DEFINE_TRACE(mytracepoint, int myarg, myarg);
> > 
> > 	void mycallback(int myarg, struct mystruct *mydata);
> > 
> > 	register_trace_mytracepoint_data(mycallback, mydata)
> > 
> > There's no place in DEFINE_TRACE to be able to test the type of data
> > that is being passed back. I could make the calling function be:
> > 
> > 	void mycallback(int myarg, void *data)
> > 	{
> > 		struct mystruct *mydata = data;
> > 	[...]
> > 
> > Because the data is defined uniquely by the caller that registers a
> > callback. Each function can register its own data type.
> 
> Yep. There would need to be a cast from void * to struct mystruct *
> at the beginning of the callback as you propose here. I prefer this cast
> to be explicit (as proposed here) rather than hidden within the entire
> function call (void *) cast.
> 

OK, so you prefer that, I don't, but I also don't care, so I could
easily change it.


> > Let me explain this again:
> > 
> > 	DECLARE_TRACE(name, proto, args);
> > 
> > Will call the function like:
> > 
> > 	callback(args, data);
> > 
> > The callback will be at best:
> > 
> > 	int callback(proto, void *data);
> > 
> > 
> > because the data being passed in is not defined yet. It is defined at
> > the point of the registering of the callback. You can have two callbacks
> > registered to the same tracepoint with two different types as the data
> > field.
> > 
> > So what is it that this check is testing?
> 
> It's making sure that TRACE_EVENT() creates callbacks with the following
> signature:
> 
>   void callback(proto, void *data)
> 
> rather than
> 
>   void callback(proto, struct somestruct *data)
> 
> and forces the cast to be done within the callback rather than casting
> the whole function pointer type to void *, assuming types to match. I
> prefer to leave the cast outside of the tracepoint infrastructure, so we
> do not obfuscate the fact that an explicit type cast is needed there.

Fine, but I hardly see it as obfuscation. And my question stands even if
we do change this: what is this test testing? To me, it is checking that
CPP works.

-- Steve




* Re: [PATCH 04/10][RFC] tracing: Remove per event trace registering
  2010-04-30 20:14                             ` Steven Rostedt
@ 2010-04-30 21:02                               ` Mathieu Desnoyers
  0 siblings, 0 replies; 45+ messages in thread
From: Mathieu Desnoyers @ 2010-04-30 21:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arnaldo Carvalho de Melo,
	Lai Jiangshan, Li Zefan, Masami Hiramatsu, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Fri, 2010-04-30 at 16:07 -0400, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > On Fri, 2010-04-30 at 15:06 -0400, Mathieu Desnoyers wrote:
> > > 
> > > > > If it is possible sure, but that's the point. Where do you add the
> > > > > check? The typecast is in the C code that is constant for all trace
> > > > > events.
> > > > 
> > > > You can add the call to the static inline type check directly within the
> > > > generated probe function, right after the local variable declarations.
> > > 
> > > Well, one thing, the callback is not going to be the same as the
> > > DECLARE_TRACE() because the prototype ends with "void *data", and the
> > > function being called actually uses the type of that data.
> > > 
> > > We now will have:
> > > 
> > > 	DEFINE_TRACE(mytracepoint, int myarg, myarg);
> > > 
> > > 	void mycallback(int myarg, struct mystruct *mydata);
> > > 
> > > 	register_trace_mytracepoint_data(mycallback, mydata)
> > > 
> > > There's no place in DEFINE_TRACE to be able to test the type of data
> > > that is being passed back. I could make the calling function be:
> > > 
> > > 	void mycallback(int myarg, void *data)
> > > 	{
> > > 		struct mystruct *mydata = data;
> > > 	[...]
> > > 
> > > Because the data is defined uniquely by the caller that registers a
> > > callback. Each function can register its own data type.
> > 
> > Yep. There would need to be a cast from void * to struct mystruct *
> > at the beginning of the callback as you propose here. I prefer this cast
> > to be explicit (as proposed here) rather than hidden within the entire
> > function call (void *) cast.
> > 
> 
> OK, so you prefer that, I don't, but I also don't care, so I could
> easily change it.
> 
> 
> > > Let me explain this again:
> > > 
> > > 	DECLARE_TRACE(name, proto, args);
> > > 
> > > Will call the function like:
> > > 
> > > 	callback(args, data);
> > > 
> > > The callback will be at best:
> > > 
> > > 	int callback(proto, void *data);
> > > 
> > > 
> > > because the data being passed in is not defined yet. It is defined at
> > > the point of the registering of the callback. You can have two callbacks
> > > registered to the same tracepoint with two different types as the data
> > > field.
> > > 
> > > So what is it that this check is testing?
> > 
> > It's making sure that TRACE_EVENT() creates callbacks with the following
> > signature:
> > 
> >   void callback(proto, void *data)
> > 
> > rather than
> > 
> >   void callback(proto, struct somestruct *data)
> > 
> > and forces the cast to be done within the callback rather than casting
> > the whole function pointer type to void *, assuming types to match. I
> > prefer to leave the cast outside of the tracepoint infrastructure, so we
> > do not obfuscate the fact that an explicit type cast is needed there.
> 
> Fine, but I hardly see it as obfuscation. But my question again, even if
> we do change this. What is this test testing? To me, it is checking that
> CPP works.

It's checking that the macros generated compatible call/callback
prototypes, yes. It comes down to using the compiler type-checking to
double-check that the macros are fine.
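
Roughly, and only as a userspace sketch under my own assumptions (the
*_SKETCH macros and sketch_* names below are invented here and are much
simpler than the real TRACE_EVENT() machinery), one macro states the
expected prototype, and the other generates the callback together with a
dummy function that hands it to the check:

```c
/* Simplified stand-in for the kernel's PARAMS() helper. */
#define PARAMS(...) __VA_ARGS__

/* Side 1: the tracepoint declaration states the expected callback type. */
#define DECLARE_TRACE_SKETCH(name, proto)				\
	static inline void						\
	check_trace_callback_type_##name(void (*cb)(proto, void *data))	\
	{								\
		(void)cb;						\
	}

/* Side 2: the generated callback, plus a dummy function doing the check. */
#define DEFINE_EVENT_SKETCH(name, proto, body)				\
	static void sketch_raw_event_##name(proto, void *data)		\
	{								\
		body							\
	}								\
	static inline void sketch_check_##name(void)			\
	{								\
		check_trace_callback_type_##name(sketch_raw_event_##name); \
	}

DECLARE_TRACE_SKETCH(mytracepoint, PARAMS(int myarg))
DEFINE_EVENT_SKETCH(mytracepoint, PARAMS(int myarg), *(int *)data += myarg;)

static int demo_generated_check(void)
{
	int total = 0;

	sketch_check_mytracepoint();	/* exists only for its build-time check */
	sketch_raw_event_mytracepoint(5, &total);
	sketch_raw_event_mytracepoint(6, &total);
	return total;
}
```

If the two macros ever drift apart, for instance if one of them starts
appending a different trailing parameter, the compiler diagnoses the call
inside sketch_check_mytracepoint() at build time.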

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

Thread overview: 45+ messages
2010-04-26 19:50 [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Steven Rostedt
2010-04-26 19:50 ` [PATCH 01/10][RFC] tracing: Create class struct for events Steven Rostedt
2010-04-28 20:22   ` Mathieu Desnoyers
2010-04-28 20:38     ` Steven Rostedt
2010-04-26 19:50 ` [PATCH 02/10][RFC] tracing: Let tracepoints have data passed to tracepoint callbacks Steven Rostedt
2010-04-27  9:08   ` Li Zefan
2010-04-27 15:28     ` Steven Rostedt
2010-04-28 20:37   ` Mathieu Desnoyers
2010-04-28 23:56     ` Steven Rostedt
2010-04-26 19:50 ` [PATCH 03/10][RFC] tracing: Convert TRACE_EVENT() to use the DECLARE_TRACE_DATA() Steven Rostedt
2010-04-28 20:39   ` Mathieu Desnoyers
2010-04-28 23:57     ` Steven Rostedt
2010-04-29  0:03       ` Mathieu Desnoyers
2010-04-26 19:50 ` [PATCH 04/10][RFC] tracing: Remove per event trace registering Steven Rostedt
2010-04-28 20:44   ` Mathieu Desnoyers
2010-04-29  0:00     ` Steven Rostedt
2010-04-29  0:05       ` Mathieu Desnoyers
2010-04-29  0:20         ` Steven Rostedt
     [not found]           ` <20100429133649.GC14617@Krystal>
2010-04-29 14:06             ` Steven Rostedt
2010-04-29 14:55               ` Mathieu Desnoyers
2010-04-29 16:06                 ` Steven Rostedt
2010-04-30 17:09                   ` Mathieu Desnoyers
2010-04-30 18:16                     ` Steven Rostedt
2010-04-30 19:06                       ` Mathieu Desnoyers
2010-04-30 19:48                         ` Steven Rostedt
2010-04-30 20:07                           ` Mathieu Desnoyers
2010-04-30 20:14                             ` Steven Rostedt
2010-04-30 21:02                               ` Mathieu Desnoyers
2010-04-26 19:50 ` [PATCH 05/10][RFC] tracing: Move fields from event to class structure Steven Rostedt
2010-04-28 20:58   ` Mathieu Desnoyers
2010-04-29  0:02     ` Steven Rostedt
     [not found]       ` <20100429133213.GA14617@Krystal>
2010-04-29 13:50         ` Steven Rostedt
2010-04-26 19:50 ` [PATCH 06/10][RFC] tracing: Move raw_init from events to class Steven Rostedt
2010-04-28 21:00   ` Mathieu Desnoyers
2010-04-26 19:50 ` [PATCH 07/10][RFC] tracing: Allow events to share their print functions Steven Rostedt
2010-04-28 21:03   ` Mathieu Desnoyers
2010-04-26 19:50 ` [PATCH 08/10][RFC] tracing: Move print functions into event class Steven Rostedt
2010-04-28 21:03   ` Mathieu Desnoyers
2010-04-26 19:50 ` [PATCH 09/10][RFC] tracing: Remove duplicate id information in event structure Steven Rostedt
2010-04-28 21:06   ` Mathieu Desnoyers
2010-04-29  0:04     ` Steven Rostedt
2010-04-26 19:50 ` [PATCH 10/10][RFC] tracing: Combine event filter_active and enable into single flags field Steven Rostedt
2010-04-28 21:13   ` Mathieu Desnoyers
2010-04-28 14:45 ` [PATCH 00/10][RFC] tracing: Lowering the footprint of TRACE_EVENTs Masami Hiramatsu
2010-04-28 20:18 ` Mathieu Desnoyers