* [PATCH 01/10] tracing: move trace point formats to files in include/trace directory
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
@ 2009-02-28 9:06 ` Steven Rostedt
2009-02-28 9:06 ` [PATCH 02/10] tracing: add subsystem level to trace events Steven Rostedt
` (9 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0001-tracing-move-trace-point-formats-to-files-in-includ.patch --]
[-- Type: text/plain, Size: 2355 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
Impact: clean up
To make it easier for developers to add trace points, this
patch creates include/trace/trace_events.h and
include/trace/trace_event_types.h.
The former file will include the trace/<type>.h files and the latter will
include the trace/<type>_event_types.h files.
To create new tracepoints and to have them automatically
appear in the event tracer, a developer makes the trace/<type>.h file
which includes <linux/tracepoint.h> and the trace/<type>_event_types.h file.
The trace/<type>_event_types.h file will hold the TRACE_FORMAT
macros.
Then add the trace/<type>.h file to trace/trace_events.h,
and add the trace/<type>_event_types.h to the trace_event_types.h file.
No need to modify files elsewhere.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
include/trace/trace_event_types.h | 4 ++++
include/trace/trace_events.h | 4 ++++
kernel/trace/events.c | 10 ++--------
3 files changed, 10 insertions(+), 8 deletions(-)
create mode 100644 include/trace/trace_event_types.h
create mode 100644 include/trace/trace_events.h
diff --git a/include/trace/trace_event_types.h b/include/trace/trace_event_types.h
new file mode 100644
index 0000000..33c8ed5
--- /dev/null
+++ b/include/trace/trace_event_types.h
@@ -0,0 +1,4 @@
+/* trace/<type>_event_types.h here */
+
+#include <trace/sched_event_types.h>
+#include <trace/irq_event_types.h>
diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
new file mode 100644
index 0000000..ea2ef20
--- /dev/null
+++ b/include/trace/trace_events.h
@@ -0,0 +1,4 @@
+/* trace/<type>.h here */
+
+#include <trace/sched.h>
+#include <trace/irq.h>
diff --git a/kernel/trace/events.c b/kernel/trace/events.c
index 3c75623..46e27ad 100644
--- a/kernel/trace/events.c
+++ b/kernel/trace/events.c
@@ -1,15 +1,9 @@
/*
* This is the place to register all trace points as events.
- * Include the trace/<type>.h at the top.
- * Include the trace/<type>_event_types.h at the bottom.
*/
-/* trace/<type>.h here */
-#include <trace/sched.h>
-#include <trace/irq.h>
+#include <trace/trace_events.h>
#include "trace_events.h"
-/* trace/<type>_event_types.h here */
-#include <trace/sched_event_types.h>
-#include <trace/irq_event_types.h>
+#include <trace/trace_event_types.h>
--
1.5.6.5
* [PATCH 02/10] tracing: add subsystem level to trace events
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0002-tracing-add-subsystem-level-to-trace-events.patch --]
[-- Type: text/plain, Size: 4344 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
If a trace point header defines TRACE_SYSTEM, then the trace points
that follow it will be added to that event system.
If include/trace/irq_event_types.h has:
#define TRACE_SYSTEM irq
at the top and
#undef TRACE_SYSTEM
at the bottom, then a directory "irq" will be created in the
/debug/tracing/events directory. That directory will contain the
two trace points that are defined in include/trace/irq_event_types.h.
Only adding the above to irq and not to sched, we get:
# ls /debug/tracing/events/
irq sched_process_exit sched_signal_send sched_wakeup_new
sched_kthread_stop sched_process_fork sched_switch
sched_kthread_stop_ret sched_process_free sched_wait_task
sched_migrate_task sched_process_wait sched_wakeup
# ls /debug/tracing/events/irq
irq_handler_entry irq_handler_exit
If we add #define TRACE_SYSTEM sched to the trace/sched_event_types.h
then the rest of the trace events will be put in a sched directory
within the events directory.
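The check for a missing TRACE_SYSTEM relies on two-level stringification:
if the macro is not defined, the token itself is turned into the string
"TRACE_SYSTEM". The following is only a user-space sketch of that behaviour,
not kernel code. DEMO_STR_INNER/DEMO_STR stand in for the patch's
__STR/STR pair, and unnamed_system, irq_system and has_subsystem() are
hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification, standing in for __STR()/STR() in the patch. */
#define DEMO_STR_INNER(x) #x
#define DEMO_STR(x) DEMO_STR_INNER(x)

/* TRACE_SYSTEM is not defined yet, so the bare token is stringified. */
const char *const unnamed_system = DEMO_STR(TRACE_SYSTEM);

#define TRACE_SYSTEM irq
/* Now the macro is expanded to "irq" before being stringified. */
const char *const irq_system = DEMO_STR(TRACE_SYSTEM);
#undef TRACE_SYSTEM

/*
 * Hypothetical helper: nonzero when the event belongs to a named
 * subsystem, using the same comparison as event_create_dir().
 */
int has_subsystem(const char *system)
{
	assert(system != NULL);
	return strcmp(system, "TRACE_SYSTEM") != 0;
}
```

With this, has_subsystem(irq_system) is true and has_subsystem(unnamed_system)
is false, which is why only headers that define TRACE_SYSTEM get their own
directory.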
I've been playing with this idea of the subsystem for a while, but
recently Tom Zanussi posted some patches to lkml that included this
method. Tom's approach was clean and got me to finally put some effort
into cleaning up the event trace points.
Thanks to Tom Zanussi for demonstrating how nice the subsystem
method is.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/trace/events.c | 4 +++
kernel/trace/trace_events.c | 48 +++++++++++++++++++++++++++++++++++++++++++
kernel/trace/trace_events.h | 2 +
3 files changed, 54 insertions(+), 0 deletions(-)
diff --git a/kernel/trace/events.c b/kernel/trace/events.c
index 46e27ad..4e4e458 100644
--- a/kernel/trace/events.c
+++ b/kernel/trace/events.c
@@ -2,6 +2,10 @@
* This is the place to register all trace points as events.
*/
+/* someday this needs to go in a generic header */
+#define __STR(x) #x
+#define STR(x) __STR(x)
+
#include <trace/trace_events.h>
#include "trace_events.h"
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 3bcb9df..1933220 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -345,11 +345,59 @@ static struct dentry *event_trace_events_dir(void)
return d_events;
}
+struct event_subsystem {
+ struct list_head list;
+ const char *name;
+ struct dentry *entry;
+};
+
+static LIST_HEAD(event_subsystems);
+
+static struct dentry *
+event_subsystem_dir(const char *name, struct dentry *d_events)
+{
+ struct event_subsystem *system;
+
+ /* First see if we did not already create this dir */
+ list_for_each_entry(system, &event_subsystems, list) {
+ if (strcmp(system->name, name) == 0)
+ return system->entry;
+ }
+
+ /* need to create new entry */
+ system = kmalloc(sizeof(*system), GFP_KERNEL);
+ if (!system) {
+ pr_warning("No memory to create event subsystem %s\n",
+ name);
+ return d_events;
+ }
+
+ system->entry = debugfs_create_dir(name, d_events);
+ if (!system->entry) {
+ pr_warning("Could not create event subsystem %s\n",
+ name);
+ kfree(system);
+ return d_events;
+ }
+
+ system->name = name;
+ list_add(&system->list, &event_subsystems);
+
+ return system->entry;
+}
+
static int
event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
{
struct dentry *entry;
+ /*
+ * If the trace point header did not define TRACE_SYSTEM
+ * then the system would be called "TRACE_SYSTEM".
+ */
+ if (strcmp(call->system, "TRACE_SYSTEM") != 0)
+ d_events = event_subsystem_dir(call->system, d_events);
+
call->dir = debugfs_create_dir(call->name, d_events);
if (!call->dir) {
pr_warning("Could not create debugfs "
diff --git a/kernel/trace/trace_events.h b/kernel/trace/trace_events.h
index deb95e5..b015d7b 100644
--- a/kernel/trace/trace_events.h
+++ b/kernel/trace/trace_events.h
@@ -7,6 +7,7 @@
struct ftrace_event_call {
char *name;
+ char *system;
struct dentry *dir;
int enabled;
int (*regfunc)(void);
@@ -44,6 +45,7 @@ static struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
+ .system = STR(TRACE_SYSTEM), \
.regfunc = ftrace_reg_event_##call, \
.unregfunc = ftrace_unreg_event_##call, \
}
--
1.5.6.5
* [PATCH 03/10] tracing: make the set_event and available_events subsystem aware
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0003-tracing-make-the-set_event-and-available_events-sub.patch --]
[-- Type: text/plain, Size: 2513 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
This patch makes the event files set_event and available_events
aware of the subsystem.
Now you can enable an entire subsystem with:
echo 'irq:*' > set_event
Note: the '*' is not needed.
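The matching rules can be tried out in user space. The following is a
minimal sketch, assuming the rules described in the comment added to
ftrace_set_clr_event(); it is not the kernel function itself.
event_spec_matches() is a hypothetical name, it uses strchr() instead of
strsep(), and it checks one (system, name) pair instead of walking the
event section.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the set_event string "spec"
 * selects the event "name" in subsystem "system", 0 otherwise.
 */
int event_spec_matches(const char *spec, const char *system, const char *name)
{
	char buf[128];
	char *sub = NULL, *event = NULL, *match = buf, *colon;

	assert(spec && system && name);
	if (strlen(spec) >= sizeof(buf))
		return 0;
	strcpy(buf, spec);

	/* Split "<subsystem>:<event-name>" at the first ':' if present. */
	colon = strchr(buf, ':');
	if (colon) {
		*colon = '\0';
		sub = buf;
		event = colon + 1;
		match = NULL;
		/* An empty part or "*" means "anything". */
		if (!*sub || strcmp(sub, "*") == 0)
			sub = NULL;
		if (!*event || strcmp(event, "*") == 0)
			event = NULL;
	}

	/* No ':' given: the string may name an event or a subsystem. */
	if (match && strcmp(match, name) != 0 && strcmp(match, system) != 0)
		return 0;
	if (sub && strcmp(sub, system) != 0)
		return 0;
	if (event && strcmp(event, name) != 0)
		return 0;
	return 1;
}
```

For example, "irq:*", "irq:" and plain "irq" all select irq_handler_entry
in the irq subsystem, while "irq:*" does not select sched_switch.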
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/trace/trace_events.c | 45 +++++++++++++++++++++++++++++++++++++++---
1 files changed, 41 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 1933220..b811eb3 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -12,6 +12,8 @@
#include "trace_events.h"
+#define TRACE_SYSTEM "TRACE_SYSTEM"
+
#define events_for_each(event) \
for (event = __start_ftrace_events; \
(unsigned long)event < (unsigned long)__stop_ftrace_events; \
@@ -45,14 +47,47 @@ static void ftrace_clear_events(void)
static int ftrace_set_clr_event(char *buf, int set)
{
struct ftrace_event_call *call = __start_ftrace_events;
+ char *event = NULL, *sub = NULL, *match;
+ int ret = -EINVAL;
+
+ /*
+ * The buf format can be <subsystem>:<event-name>
+ * *:<event-name> means any event by that name.
+ * :<event-name> is the same.
+ *
+ * <subsystem>:* means all events in that subsystem
+ * <subsystem>: means the same.
+ *
+ * <name> (no ':') means all events in a subsystem with
+ * the name <name> or any event that matches <name>
+ */
+
+ match = strsep(&buf, ":");
+ if (buf) {
+ sub = match;
+ event = buf;
+ match = NULL;
+ if (!strlen(sub) || strcmp(sub, "*") == 0)
+ sub = NULL;
+ if (!strlen(event) || strcmp(event, "*") == 0)
+ event = NULL;
+ }
events_for_each(call) {
if (!call->name)
continue;
- if (strcmp(buf, call->name) != 0)
+ if (match &&
+ strcmp(match, call->name) != 0 &&
+ strcmp(match, call->system) != 0)
+ continue;
+
+ if (sub && strcmp(sub, call->system) != 0)
+ continue;
+
+ if (event && strcmp(event, call->name) != 0)
continue;
if (set) {
@@ -68,9 +103,9 @@ static int ftrace_set_clr_event(char *buf, int set)
call->enabled = 0;
call->unregfunc();
}
- return 0;
+ ret = 0;
}
- return -EINVAL;
+ return ret;
}
/* 128 should be much more than enough */
@@ -200,6 +235,8 @@ static int t_show(struct seq_file *m, void *v)
{
struct ftrace_event_call *call = v;
+ if (strcmp(call->system, TRACE_SYSTEM) != 0)
+ seq_printf(m, "%s:", call->system);
seq_printf(m, "%s\n", call->name);
return 0;
--
1.5.6.5
* [PATCH 04/10] tracing: add subsystem irq for irq events
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0004-tracing-add-subsystem-irq-for-irq-events.patch --]
[-- Type: text/plain, Size: 861 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
Add the TRACE_SYSTEM irq for the irq events.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
include/trace/irq_event_types.h | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/include/trace/irq_event_types.h b/include/trace/irq_event_types.h
index 5d0919f..47a2be1 100644
--- a/include/trace/irq_event_types.h
+++ b/include/trace/irq_event_types.h
@@ -5,6 +5,9 @@
# error Unless you know what you are doing.
#endif
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM irq
+
TRACE_FORMAT(irq_handler_entry,
TPPROTO(int irq, struct irqaction *action),
TPARGS(irq, action),
@@ -15,3 +18,5 @@ TRACE_FORMAT(irq_handler_exit,
TPARGS(irq, action, ret),
TPFMT("irq=%d handler=%s return=%s",
irq, action->name, ret ? "handled" : "unhandled"));
+
+#undef TRACE_SYSTEM
--
1.5.6.5
* [PATCH 05/10] tracing: add subsystem sched for sched events
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0005-tracing-add-subsystem-sched-for-sched-events.patch --]
[-- Type: text/plain, Size: 852 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
Add the TRACE_SYSTEM sched for the sched events.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
include/trace/sched_event_types.h | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/include/trace/sched_event_types.h b/include/trace/sched_event_types.h
index a3d3d66..2ada206 100644
--- a/include/trace/sched_event_types.h
+++ b/include/trace/sched_event_types.h
@@ -5,6 +5,9 @@
# error Unless you know what you are doing.
#endif
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM sched
+
TRACE_FORMAT(sched_kthread_stop,
TPPROTO(struct task_struct *t),
TPARGS(t),
@@ -70,3 +73,5 @@ TRACE_FORMAT(sched_signal_send,
TPPROTO(int sig, struct task_struct *p),
TPARGS(sig, p),
TPFMT("sig: %d task %s:%d", sig, p->comm, p->pid));
+
+#undef TRACE_SYSTEM
--
1.5.6.5
* [PATCH 06/10] tracing: add interface to write into current tracer buffer
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0006-tracing-add-interface-to-write-into-current-tracer.patch --]
[-- Type: text/plain, Size: 2139 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
Right now all tracers must manage their own trace buffers. This was
to force tracers to be independent in case we finally decide to
allow each tracer to have its own trace buffer.
But now we are adding event tracing that writes to the current tracer's
buffer. This adds an interface to allow events to write to the current
tracer buffer without having to manage their own, since event tracing
has no "tracer" and is just a way to hook into any other tracer.
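The shape of the new interface is a thin wrapper that supplies the global
buffer so callers do not have to pass one. The following is only a toy
user-space sketch of that pattern. struct toy_buffer, global_toy_buffer,
toy_buffer_reserve() and toy_current_buffer_reserve() are all hypothetical
stand-ins; the real functions operate on global_trace and the ring buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a trace buffer: a fixed array plus a fill level. */
struct toy_buffer {
	unsigned char data[64];
	size_t used;
};

/* Plays the role of global_trace in this sketch. */
static struct toy_buffer global_toy_buffer;

/* Explicit-buffer variant: the caller says which buffer to write to. */
void *toy_buffer_reserve(struct toy_buffer *buf, size_t len)
{
	void *slot;

	assert(buf != NULL);
	if (len > sizeof(buf->data) - buf->used)
		return NULL;	/* not enough room left */
	slot = buf->data + buf->used;
	buf->used += len;
	return slot;
}

/* "Current buffer" variant: forwards to the global buffer. */
void *toy_current_buffer_reserve(size_t len)
{
	return toy_buffer_reserve(&global_toy_buffer, len);
}
```

An event callback would only ever call the "current" variant and never
needs to know which buffer is behind it.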
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/trace/trace.c | 14 ++++++++++++++
kernel/trace/trace.h | 6 ++++++
2 files changed, 20 insertions(+), 0 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9c5987a..c5e39cd 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -846,6 +846,20 @@ void trace_buffer_unlock_commit(struct trace_array *tr,
trace_wake_up();
}
+struct ring_buffer_event *
+trace_current_buffer_lock_reserve(unsigned char type, unsigned long len,
+ unsigned long flags, int pc)
+{
+ return trace_buffer_lock_reserve(&global_trace,
+ type, len, flags, pc);
+}
+
+void trace_current_buffer_unlock_commit(struct ring_buffer_event *event,
+ unsigned long flags, int pc)
+{
+ return trace_buffer_unlock_commit(&global_trace, event, flags, pc);
+}
+
void
trace_function(struct trace_array *tr,
unsigned long ip, unsigned long parent_ip, unsigned long flags,
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 6321917..adf161f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -442,6 +442,12 @@ void trace_buffer_unlock_commit(struct trace_array *tr,
struct ring_buffer_event *event,
unsigned long flags, int pc);
+struct ring_buffer_event *
+trace_current_buffer_lock_reserve(unsigned char type, unsigned long len,
+ unsigned long flags, int pc);
+void trace_current_buffer_unlock_commit(struct ring_buffer_event *event,
+ unsigned long flags, int pc);
+
struct trace_entry *tracing_get_trace_entry(struct trace_array *tr,
struct trace_array_cpu *data);
--
1.5.6.5
* [PATCH 07/10] tracing: add raw trace point recording infrastructure
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
` (5 preceding siblings ...)
2009-02-28 9:06 ` [PATCH 06/10] tracing: add interface to write into current tracer buffer Steven Rostedt
@ 2009-02-28 9:06 ` Steven Rostedt
2009-03-02 7:51 ` Tom Zanussi
2009-02-28 9:06 ` [PATCH 08/10] tracing: add raw fast tracing interface for trace events Steven Rostedt
` (3 subsequent siblings)
10 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2009-02-28 9:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0007-tracing-add-raw-trace-point-recording-infrastructur.patch --]
[-- Type: text/plain, Size: 17372 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
Impact: lower overhead tracing
The current event tracer can automatically pick up trace points
that are registered with the TRACE_FORMAT macro, but it requires
a printf format string and parsing. Although this adds the ability
to get guaranteed information like task names and such, it takes
a hit in processing overhead. This processing can add about 500-1000
nanoseconds of overhead, and in some cases even that is considered
too much, so we want to shave off as much of this overhead as
possible.
Tom Zanussi recently posted tracing patches to lkml that are based
on a nice idea about capturing the data via C structs using
STRUCT_ENTER, STRUCT_EXIT type of macros.
I liked that method very much, but did not like the implementation
that required a developer to add data/code in several disjoint
locations.
This patch extends the event_tracer macros to take a "raw C"
approach similar to Tom Zanussi's. But instead of requiring developers
to tweak a bunch of code all over the place, it lets them do it
all in one macro - preferably placed near the code that it is
tracing. That makes it much more likely that tracepoints will be
maintained on an ongoing basis alongside the code they trace.
The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
a developer may still utilize the more low level DECLARE_TRACE macros
if they don't care about getting their traces automatically in the event
tracer.)
They can also use the existing TRACE_FORMAT if they don't need to code
the tracepoint in C, but just want to use the convenience of printf.
So if the developer wants to "hardwire" a tracepoint in the fastest
possible way, and wants to acquire their data via a user space utility
in a raw binary format, or wants to see it in the trace output but not
sacrifice any performance, then they can implement the faster but
more complex TRACE_EVENT_FORMAT macro.
Here's what usage looks like:
TRACE_EVENT_FORMAT(name,
TPPROTO(proto),
TPARGS(args),
TPFMT(fmt, fmt_args),
TRACE_STRUCT(
TRACE_FIELD(type1, item1, assign1)
TRACE_FIELD(type2, item2, assign2)
[...]
),
TPRAWFMT(raw_fmt)
);
Note that name, proto, args, and fmt are all identical to what TRACE_FORMAT
uses.
name: the unique identifier of the trace point
proto: the prototype that the trace point uses
args: the args in the prototype
fmt: printf format to use with the event printf tracer
fmt_args: the printf arguments to match fmt
TRACE_STRUCT starts the definition of a structure.
Each item in the structure is defined with a TRACE_FIELD:
TRACE_FIELD(type, item, assign)
type: the C type of item
item: the name of the item in the structure
assign: what to assign the item in the trace point callback
raw_fmt is a way to pretty print the struct. It must match
the order in which the items are added in TRACE_STRUCT.
An example of this would be:
TRACE_EVENT_FORMAT(sched_wakeup,
TPPROTO(struct rq *rq, struct task_struct *p, int success),
TPARGS(rq, p, success),
TPFMT("task %s:%d %s",
p->comm, p->pid, success?"succeeded":"failed"),
TRACE_STRUCT(
TRACE_FIELD(pid_t, pid, p->pid)
TRACE_FIELD(int, success, success)
),
TPRAWFMT("task %d success=%d")
);
This creates a unique struct of:
struct {
pid_t pid;
int success;
};
And the way the callback would assign these values would be:
entry->pid = p->pid;
entry->success = success;
The nice part about this is that the creation of the assignment is done
via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is
created, the developer will then have a faster method to record
into the ring buffer. They do not need to worry about the tracer itself.
The developer would only need to touch the files in include/trace/*.h.
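The "macro magic" is that one field list gets expanded several times with
different definitions of TRACE_FIELD. The following is a minimal user-space
sketch of that idea, not the kernel macros. It uses a single-file X-macro
list (WAKEUP_FIELDS) instead of re-including a header, int instead of
pid_t, and the hypothetical names toy_task, raw_wakeup, record_wakeup()
and format_wakeup().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the traced object (task_struct in the kernel). */
struct toy_task {
	int pid;
};

/* One field list, standing in for the TRACE_STRUCT(...) contents. */
#define WAKEUP_FIELDS(FIELD)		\
	FIELD(int, pid, p->pid)		\
	FIELD(int, success, success)

/* Stage 1 style: each field becomes "type item;" in a struct. */
#define AS_MEMBER(type, item, assign) type item;
struct raw_wakeup {
	WAKEUP_FIELDS(AS_MEMBER)
};

/* Stage 3 style: each field becomes "entry->item = assign;". */
#define AS_ASSIGN(type, item, assign) entry->item = assign;
void record_wakeup(struct raw_wakeup *entry, struct toy_task *p, int success)
{
	assert(entry != NULL && p != NULL);
	WAKEUP_FIELDS(AS_ASSIGN)
}

/*
 * Stage 2 style: each field becomes "field->item," in the argument list.
 * The trailing "%s" / "\n" pair absorbs the final comma.
 */
#define AS_ARG(type, item, assign) field->item,
int format_wakeup(char *out, size_t len, const struct raw_wakeup *field)
{
	return snprintf(out, len, "task %d success=%d" "%s",
			WAKEUP_FIELDS(AS_ARG) "\n");
}
```

Adding a field to WAKEUP_FIELDS updates the struct, the assignment and the
print arguments together; only the raw format string has to be kept in
sync by hand.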
Again, I would like to give special thanks to Tom Zanussi for this
nice idea.
Idea-from: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/trace/events.c | 6 +-
kernel/trace/trace.h | 19 +++
kernel/trace/trace_events.c | 2 +-
kernel/trace/trace_events.h | 57 ---------
kernel/trace/trace_events_stage_1.h | 34 ++++++
kernel/trace/trace_events_stage_2.h | 72 ++++++++++++
kernel/trace/trace_events_stage_3.h | 219 +++++++++++++++++++++++++++++++++++
7 files changed, 350 insertions(+), 59 deletions(-)
delete mode 100644 kernel/trace/trace_events.h
create mode 100644 kernel/trace/trace_events_stage_1.h
create mode 100644 kernel/trace/trace_events_stage_2.h
create mode 100644 kernel/trace/trace_events_stage_3.h
diff --git a/kernel/trace/events.c b/kernel/trace/events.c
index 4e4e458..f2509cb 100644
--- a/kernel/trace/events.c
+++ b/kernel/trace/events.c
@@ -8,6 +8,10 @@
#include <trace/trace_events.h>
-#include "trace_events.h"
+#include "trace_output.h"
+
+#include "trace_events_stage_1.h"
+#include "trace_events_stage_2.h"
+#include "trace_events_stage_3.h"
#include <trace/trace_event_types.h>
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index adf161f..aa1ab0c 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -726,4 +726,23 @@ static inline void trace_branch_disable(void)
}
#endif /* CONFIG_BRANCH_TRACER */
+struct ftrace_event_call {
+ char *name;
+ char *system;
+ struct dentry *dir;
+ int enabled;
+ int (*regfunc)(void);
+ void (*unregfunc)(void);
+ int id;
+ struct dentry *raw_dir;
+ int raw_enabled;
+ int (*raw_init)(void);
+ int (*raw_reg)(void);
+ void (*raw_unreg)(void);
+};
+
+void event_trace_printk(unsigned long ip, const char *fmt, ...);
+extern struct ftrace_event_call __start_ftrace_events[];
+extern struct ftrace_event_call __stop_ftrace_events[];
+
#endif /* _LINUX_KERNEL_TRACE_H */
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index b811eb3..77a5c02 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -10,7 +10,7 @@
#include <linux/module.h>
#include <linux/ctype.h>
-#include "trace_events.h"
+#include "trace.h"
#define TRACE_SYSTEM "TRACE_SYSTEM"
diff --git a/kernel/trace/trace_events.h b/kernel/trace/trace_events.h
deleted file mode 100644
index b015d7b..0000000
--- a/kernel/trace/trace_events.h
+++ /dev/null
@@ -1,57 +0,0 @@
-#ifndef _LINUX_KERNEL_TRACE_EVENTS_H
-#define _LINUX_KERNEL_TRACE_EVENTS_H
-
-#include <linux/debugfs.h>
-#include <linux/ftrace.h>
-#include "trace.h"
-
-struct ftrace_event_call {
- char *name;
- char *system;
- struct dentry *dir;
- int enabled;
- int (*regfunc)(void);
- void (*unregfunc)(void);
-};
-
-
-#undef TPFMT
-#define TPFMT(fmt, args...) fmt "\n", ##args
-
-#undef TRACE_FORMAT
-#define TRACE_FORMAT(call, proto, args, fmt) \
-static void ftrace_event_##call(proto) \
-{ \
- event_trace_printk(_RET_IP_, "(" #call ") " fmt); \
-} \
- \
-static int ftrace_reg_event_##call(void) \
-{ \
- int ret; \
- \
- ret = register_trace_##call(ftrace_event_##call); \
- if (!ret) \
- pr_info("event trace: Could not activate trace point " \
- "probe to " #call); \
- return ret; \
-} \
- \
-static void ftrace_unreg_event_##call(void) \
-{ \
- unregister_trace_##call(ftrace_event_##call); \
-} \
- \
-static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
-__attribute__((section("_ftrace_events"))) event_##call = { \
- .name = #call, \
- .system = STR(TRACE_SYSTEM), \
- .regfunc = ftrace_reg_event_##call, \
- .unregfunc = ftrace_unreg_event_##call, \
-}
-
-void event_trace_printk(unsigned long ip, const char *fmt, ...);
-extern struct ftrace_event_call __start_ftrace_events[];
-extern struct ftrace_event_call __stop_ftrace_events[];
-
-#endif /* _LINUX_KERNEL_TRACE_EVENTS_H */
diff --git a/kernel/trace/trace_events_stage_1.h b/kernel/trace/trace_events_stage_1.h
new file mode 100644
index 0000000..fd3bf93
--- /dev/null
+++ b/kernel/trace/trace_events_stage_1.h
@@ -0,0 +1,34 @@
+/*
+ * Stage 1 of the trace events.
+ *
+ * Override the macros in <trace/trace_event_types.h> to include the following:
+ *
+ * struct ftrace_raw_<call> {
+ * struct trace_entry ent;
+ * <type> <item>;
+ * [...]
+ * };
+ *
+ * The <type> <item> is created by the TRACE_FIELD(type, item, assign)
+ * macro. We simply do "type item;", and that will create the fields
+ * in the structure.
+ */
+
+#undef TRACE_FORMAT
+#define TRACE_FORMAT(call, proto, args, fmt)
+
+#undef TRACE_EVENT_FORMAT
+#define TRACE_EVENT_FORMAT(name, proto, args, fmt, tstruct, tpfmt) \
+ struct ftrace_raw_##name { \
+ struct trace_entry ent; \
+ tstruct \
+ }; \
+ static struct ftrace_event_call event_##name
+
+#undef TRACE_STRUCT
+#define TRACE_STRUCT(args...) args
+
+#define TRACE_FIELD(type, item, assign) \
+ type item;
+
+#include <trace/trace_event_types.h>
diff --git a/kernel/trace/trace_events_stage_2.h b/kernel/trace/trace_events_stage_2.h
new file mode 100644
index 0000000..3eaaef5
--- /dev/null
+++ b/kernel/trace/trace_events_stage_2.h
@@ -0,0 +1,72 @@
+/*
+ * Stage 2 of the trace events.
+ *
+ * Override the macros in <trace/trace_event_types.h> to include the following:
+ *
+ * enum print_line_t
+ * ftrace_raw_output_<call>(struct trace_iterator *iter, int flags)
+ * {
+ * struct trace_seq *s = &iter->seq;
+ * struct ftrace_raw_<call> *field; <-- defined in stage 1
+ * struct trace_entry *entry;
+ * int ret;
+ *
+ * entry = iter->ent;
+ *
+ * if (entry->type != event_<call>.id) {
+ * WARN_ON_ONCE(1);
+ * return TRACE_TYPE_UNHANDLED;
+ * }
+ *
+ * field = (typeof(field))entry;
+ *
+ * ret = trace_seq_printf(s, <TPRAWFMT> "%s", <ARGS> "\n");
+ * if (!ret)
+ * return TRACE_TYPE_PARTIAL_LINE;
+ *
+ * return TRACE_TYPE_HANDLED;
+ * }
+ *
+ * This is the method used to print the raw event to the trace
+ * output format. Note, this is not needed if the data is read
+ * in binary.
+ */
+
+#undef TRACE_STRUCT
+#define TRACE_STRUCT(args...) args
+
+#undef TRACE_FIELD
+#define TRACE_FIELD(type, item, assign) \
+ field->item,
+
+
+#undef TPRAWFMT
+#define TPRAWFMT(args...) args
+
+#undef TRACE_EVENT_FORMAT
+#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \
+enum print_line_t \
+ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
+{ \
+ struct trace_seq *s = &iter->seq; \
+ struct ftrace_raw_##call *field; \
+ struct trace_entry *entry; \
+ int ret; \
+ \
+ entry = iter->ent; \
+ \
+ if (entry->type != event_##call.id) { \
+ WARN_ON_ONCE(1); \
+ return TRACE_TYPE_UNHANDLED; \
+ } \
+ \
+ field = (typeof(field))entry; \
+ \
+ ret = trace_seq_printf(s, tpfmt "%s", tstruct "\n"); \
+ if (!ret) \
+ return TRACE_TYPE_PARTIAL_LINE; \
+ \
+ return TRACE_TYPE_HANDLED; \
+}
+
+#include <trace/trace_event_types.h>
diff --git a/kernel/trace/trace_events_stage_3.h b/kernel/trace/trace_events_stage_3.h
new file mode 100644
index 0000000..7a161c4
--- /dev/null
+++ b/kernel/trace/trace_events_stage_3.h
@@ -0,0 +1,219 @@
+/*
+ * Stage 3 of the trace events.
+ *
+ * Override the macros in <trace/trace_event_types.h> to include the following:
+ *
+ * static void ftrace_event_<call>(proto)
+ * {
+ * event_trace_printk(_RET_IP_, "(<call>) " <fmt>);
+ * }
+ *
+ * static int ftrace_reg_event_<call>(void)
+ * {
+ * int ret;
+ *
+ * ret = register_trace_<call>(ftrace_event_<call>);
+ * if (!ret)
+ * pr_info("event trace: Could not activate trace point "
+ * "probe to <call>");
+ * return ret;
+ * }
+ *
+ * static void ftrace_unreg_event_<call>(void)
+ * {
+ * unregister_trace_<call>(ftrace_event_<call>);
+ * }
+ *
+ * For those macros defined with TRACE_FORMAT:
+ *
+ * static struct ftrace_event_call __used
+ * __attribute__((__aligned__(4)))
+ * __attribute__((section("_ftrace_events"))) event_<call> = {
+ * .name = "<call>",
+ * .regfunc = ftrace_reg_event_<call>,
+ * .unregfunc = ftrace_unreg_event_<call>,
+ * }
+ *
+ *
+ * For those macros defined with TRACE_EVENT_FORMAT:
+ *
+ * static struct ftrace_event_call event_<call>;
+ *
+ * static void ftrace_raw_event_<call>(proto)
+ * {
+ * struct ring_buffer_event *event;
+ * struct ftrace_raw_<call> *entry; <-- defined in stage 1
+ * unsigned long irq_flags;
+ * int pc;
+ *
+ * local_save_flags(irq_flags);
+ * pc = preempt_count();
+ *
+ * event = trace_current_buffer_lock_reserve(event_<call>.id,
+ * sizeof(struct ftrace_raw_<call>),
+ * irq_flags, pc);
+ * if (!event)
+ * return;
+ * entry = ring_buffer_event_data(event);
+ *
+ * <tstruct>; <-- Here we assign the entries by the TRACE_FIELD.
+ *
+ * trace_current_buffer_unlock_commit(event, irq_flags, pc);
+ * }
+ *
+ * static int ftrace_raw_reg_event_<call>(void)
+ * {
+ * int ret;
+ *
+ * ret = register_trace_<call>(ftrace_raw_event_<call>);
+ * if (!ret)
+ * pr_info("event trace: Could not activate trace point "
+ * "probe to <call>");
+ * return ret;
+ * }
+ *
+ * static void ftrace_unreg_event_<call>(void)
+ * {
+ * unregister_trace_<call>(ftrace_raw_event_<call>);
+ * }
+ *
+ * static struct trace_event ftrace_event_type_<call> = {
+ * .trace = ftrace_raw_output_<call>, <-- stage 2
+ * };
+ *
+ * static int ftrace_raw_init_event_<call>(void)
+ * {
+ * int id;
+ *
+ * id = register_ftrace_event(&ftrace_event_type_<call>);
+ * if (!id)
+ * return -ENODEV;
+ * event_<call>.id = id;
+ * return 0;
+ * }
+ *
+ * static struct ftrace_event_call __used
+ * __attribute__((__aligned__(4)))
+ * __attribute__((section("_ftrace_events"))) event_<call> = {
+ * .name = "<call>",
+ * .regfunc = ftrace_reg_event_<call>,
+ * .unregfunc = ftrace_unreg_event_<call>,
+ * .raw_init = ftrace_raw_init_event_<call>,
+ * .raw_reg = ftrace_raw_reg_event_<call>,
+ * .raw_unreg = ftrace_raw_unreg_event_<call>,
+ * }
+ *
+ */
+
+#undef TPFMT
+#define TPFMT(fmt, args...) fmt "\n", ##args
+
+#define _TRACE_FORMAT(call, proto, args, fmt) \
+static void ftrace_event_##call(proto) \
+{ \
+ event_trace_printk(_RET_IP_, "(" #call ") " fmt); \
+} \
+ \
+static int ftrace_reg_event_##call(void) \
+{ \
+ int ret; \
+ \
+ ret = register_trace_##call(ftrace_event_##call); \
+ if (ret) \
+ pr_info("event trace: Could not activate trace point " \
+ "probe to " #call); \
+ return ret; \
+} \
+ \
+static void ftrace_unreg_event_##call(void) \
+{ \
+ unregister_trace_##call(ftrace_event_##call); \
+} \
+
+
+#undef TRACE_FORMAT
+#define TRACE_FORMAT(call, proto, args, fmt) \
+_TRACE_FORMAT(call, PARAMS(proto), PARAMS(args), PARAMS(fmt)) \
+static struct ftrace_event_call __used \
+__attribute__((__aligned__(4))) \
+__attribute__((section("_ftrace_events"))) event_##call = { \
+ .name = #call, \
+ .system = STR(TRACE_SYSTEM), \
+ .regfunc = ftrace_reg_event_##call, \
+ .unregfunc = ftrace_unreg_event_##call, \
+}
+
+#undef TRACE_FIELD
+#define TRACE_FIELD(type, item, assign)\
+ entry->item = assign;
+
+#undef TRACE_EVENT_FORMAT
+#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \
+_TRACE_FORMAT(call, PARAMS(proto), PARAMS(args), PARAMS(fmt)) \
+ \
+static struct ftrace_event_call event_##call; \
+ \
+static void ftrace_raw_event_##call(proto) \
+{ \
+ struct ring_buffer_event *event; \
+ struct ftrace_raw_##call *entry; \
+ unsigned long irq_flags; \
+ int pc; \
+ \
+ local_save_flags(irq_flags); \
+ pc = preempt_count(); \
+ \
+ event = trace_current_buffer_lock_reserve(event_##call.id, \
+ sizeof(struct ftrace_raw_##call), \
+ irq_flags, pc); \
+ if (!event) \
+ return; \
+ entry = ring_buffer_event_data(event); \
+ \
+ tstruct; \
+ \
+ trace_current_buffer_unlock_commit(event, irq_flags, pc); \
+} \
+ \
+static int ftrace_raw_reg_event_##call(void) \
+{ \
+ int ret; \
+ \
+ ret = register_trace_##call(ftrace_raw_event_##call); \
+ if (ret) \
+ pr_info("event trace: Could not activate trace point " \
+ "probe to " #call); \
+ return ret; \
+} \
+ \
+static void ftrace_raw_unreg_event_##call(void) \
+{ \
+ unregister_trace_##call(ftrace_raw_event_##call); \
+} \
+ \
+static struct trace_event ftrace_event_type_##call = { \
+ .trace = ftrace_raw_output_##call, \
+}; \
+ \
+static int ftrace_raw_init_event_##call(void) \
+{ \
+ int id; \
+ \
+ id = register_ftrace_event(&ftrace_event_type_##call); \
+ if (!id) \
+ return -ENODEV; \
+ event_##call.id = id; \
+ return 0; \
+} \
+ \
+static struct ftrace_event_call __used \
+__attribute__((__aligned__(4))) \
+__attribute__((section("_ftrace_events"))) event_##call = { \
+ .name = #call, \
+ .system = STR(TRACE_SYSTEM), \
+ .regfunc = ftrace_reg_event_##call, \
+ .unregfunc = ftrace_unreg_event_##call, \
+ .raw_init = ftrace_raw_init_event_##call, \
+ .raw_reg = ftrace_raw_reg_event_##call, \
+ .raw_unreg = ftrace_raw_unreg_event_##call, \
+}
--
1.5.6.5
--
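
The comment block in the patch above describes one TRACE_FIELD list being expanded more than once: first into the members of struct ftrace_raw_<call>, then into the assignments inside ftrace_raw_event_<call>(). A minimal user-space sketch of that idea follows. It is not the kernel code: WAKEUP_FIELDS, struct raw_wakeup and raw_wakeup_record() are names assumed here for illustration, and a list macro stands in for including the same event header again with TRACE_FIELD redefined, which is what the series appears to do.

```c
/*
 * Hypothetical stand-in for one event's field list. In the series the
 * list would live inside TRACE_STRUCT() in a <type>_event_types.h file.
 */
#define WAKEUP_FIELDS \
	TRACE_FIELD(int, pid, task_pid) \
	TRACE_FIELD(int, success, success)

/* Stage 1: every TRACE_FIELD becomes a struct member. */
#define TRACE_FIELD(type, item, assign) type item;
struct raw_wakeup {
	WAKEUP_FIELDS
};
#undef TRACE_FIELD

/* Stage 2: every TRACE_FIELD becomes an assignment into "entry". */
#define TRACE_FIELD(type, item, assign) entry->item = assign;
void raw_wakeup_record(struct raw_wakeup *entry, int task_pid, int success)
{
	WAKEUP_FIELDS
}
#undef TRACE_FIELD
```

Because the field list is written once, the struct layout and the copy code cannot drift apart, which seems to be the property the commit message is after.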
* Re: [PATCH 07/10] tracing: add raw trace point recording infrastructure
2009-02-28 9:06 ` [PATCH 07/10] tracing: add raw trace point recording infrastructure Steven Rostedt
@ 2009-03-02 7:51 ` Tom Zanussi
2009-03-02 12:22 ` Frédéric Weisbecker
2009-03-02 13:23 ` Steven Rostedt
0 siblings, 2 replies; 15+ messages in thread
From: Tom Zanussi @ 2009-03-02 7:51 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Peter Zijlstra,
Frederic Weisbecker, Mathieu Desnoyers, Masami Hiramatsu,
KOSAKI Motohiro, Jason Baron, Frank Ch. Eigler, acme,
Steven Rostedt
Hi,
On Sat, 2009-02-28 at 04:06 -0500, Steven Rostedt wrote:
> plain text document attachment
> (0007-tracing-add-raw-trace-point-recording-infrastructur.patch)
> From: Steven Rostedt <srostedt@redhat.com>
>
> Impact: lower overhead tracing
>
> The current event tracer can automatically pick up trace points
> that are registered with the TRACE_FORMAT macro. But it required
> a printf format string and parsing. Although, this adds the ability
> to get guaranteed information like task names and such, it took
> a hit in overhead processing. This processing can add about 500-1000
> nanoseconds overhead, but in some cases that too is considered
> too much and we want to shave off as much from this overhead as
> possible.
>
> Tom Zanussi recently posted tracing patches to lkml that are based
> on a nice idea about capturing the data via C structs using
> STRUCT_ENTER, STRUCT_EXIT type of macros.
>
> I liked that method very much, but did not like the implementation
> that required a developer to add data/code in several disjoint
> locations.
>
> This patch extends the event_tracer macros to do a similar "raw C"
> approach that Tom Zanussi did. But instead of having the developers
> needing to tweak a bunch of code all over the place, they can do it
> all in one macro - preferably placed near the code that it is
> tracing. That makes it much more likely that tracepoints will be
> maintained on an ongoing basis by the code they modify.
>
> The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
> a developer may still utilize the more low level DECLARE_TRACE macros
> if they don't care about getting their traces automatically in the event
> tracer.)
>
> They can also use the existing TRACE_FORMAT if they don't need to code
> the tracepoint in C, but just want to use the convenience of printf.
>
> So if the developer wants to "hardwire" a tracepoint in the fastest
> possible way, and wants to acquire their data via a user space utility
> in a raw binary format, or wants to see it in the trace output but not
> sacrifice any performance, then they can implement the faster but
> more complex TRACE_EVENT_FORMAT macro.
>
> Here's what usage looks like:
>
> TRACE_EVENT_FORMAT(name,
> TPPROTO(proto),
> TPARGS(args),
> TPFMT(fmt, fmt_args),
> TRACE_STRUCT(
> TRACE_FIELD(type1, item1, assign1)
> TRACE_FIELD(type2, item2, assign2)
> [...]
> ),
> TPRAWFMT(raw_fmt)
> );
>
> Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT
> uses.
>
> name: is the unique identifier of the trace point
> proto: The proto type that the trace point uses
> args: the args in the proto type
> fmt: printf format to use with the event printf tracer
> fmt_args: the printf arguments to match fmt
>
> TRACE_STRUCT starts the ability to create a structure.
> Each item in the structure is defined with a TRACE_FIELD
>
> TRACE_FIELD(type, item, assign)
>
> type: the C type of item.
> item: the name of the item in the structure
> assign: what to assign the item in the trace point callback
>
> raw_fmt is a way to pretty print the struct. It must match
> the order in which the items are added in TRACE_STRUCT
>
> An example of this would be:
>
> TRACE_EVENT_FORMAT(sched_wakeup,
> TPPROTO(struct rq *rq, struct task_struct *p, int success),
> TPARGS(rq, p, success),
> TPFMT("task %s:%d %s",
> p->comm, p->pid, success?"succeeded":"failed"),
> TRACE_STRUCT(
> TRACE_FIELD(pid_t, pid, p->pid)
> TRACE_FIELD(int, success, success)
> ),
> TPRAWFMT("task %d success=%d")
> );
>
> This creates us a unique struct of:
>
> struct {
> pid_t pid;
> int success;
> };
>
> And the way the call back would assign these values would be:
>
> entry->pid = p->pid;
> entry->success = success;
>
> The nice part about this is that the creation of the assignment is done
> via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is
> created, the developer will then have a faster method to record
> into the ring buffer. They do not need to worry about the tracer itself.
>
Nice improvements - I definitely was unhappy about having things spread
around in different files unnecessarily. And I like the fact that your
macros generate assignments too but am curious about what to do if you
need to do something more complicated than an assignment e.g. in the
block tracepoints I had to assign fields differently based on the value
of blk_pc_request():

	if (blk_pc_request(rq)) {
		zed_event->sector = 0;
		zed_event->bytes = rq->data_len;
		zed_event->pdu_len = pdu_len;
		memcpy(zed_event->pdu, rq->cmd, pdu_len);
	} else {
		zed_event->sector = rq->hard_sector;
		zed_event->bytes = rq->hard_nr_sectors << 9;
		zed_event->pdu_len = 0;
	}

Is there a way to define some fields but without the assignments, and do
them manually somewhere else? I guess it would be nice to be able to
define all events using TRACE_EVENT_FORMAT but have a way to special
case certain events/fields.

Anyway, sorry if it's already handled in the code - haven't had a chance
to really peruse it.

Tom
* Re: [PATCH 07/10] tracing: add raw trace point recording infrastructure
2009-03-02 7:51 ` Tom Zanussi
@ 2009-03-02 12:22 ` Frédéric Weisbecker
2009-03-02 13:23 ` Steven Rostedt
1 sibling, 0 replies; 15+ messages in thread
From: Frédéric Weisbecker @ 2009-03-02 12:22 UTC (permalink / raw)
To: Tom Zanussi
Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
Peter Zijlstra, Mathieu Desnoyers, Masami Hiramatsu,
KOSAKI Motohiro, Jason Baron, Frank Ch. Eigler, acme,
Steven Rostedt
2009/3/2 Tom Zanussi <tzanussi@gmail.com>:
> Hi,
>
> On Sat, 2009-02-28 at 04:06 -0500, Steven Rostedt wrote:
>> plain text document attachment
>> (0007-tracing-add-raw-trace-point-recording-infrastructur.patch)
>> From: Steven Rostedt <srostedt@redhat.com>
>>
>> Impact: lower overhead tracing
>>
>> The current event tracer can automatically pick up trace points
>> that are registered with the TRACE_FORMAT macro. But it required
>> a printf format string and parsing. Although, this adds the ability
>> to get guaranteed information like task names and such, it took
>> a hit in overhead processing. This processing can add about 500-1000
>> nanoseconds overhead, but in some cases that too is considered
>> too much and we want to shave off as much from this overhead as
>> possible.
>>
>> Tom Zanussi recently posted tracing patches to lkml that are based
>> on a nice idea about capturing the data via C structs using
>> STRUCT_ENTER, STRUCT_EXIT type of macros.
>>
>> I liked that method very much, but did not like the implementation
>> that required a developer to add data/code in several disjoint
>> locations.
>>
>> This patch extends the event_tracer macros to do a similar "raw C"
>> approach that Tom Zanussi did. But instead of having the developers
>> needing to tweak a bunch of code all over the place, they can do it
>> all in one macro - preferably placed near the code that it is
>> tracing. That makes it much more likely that tracepoints will be
>> maintained on an ongoing basis by the code they modify.
>>
>> The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
>> a developer may still utilize the more low level DECLARE_TRACE macros
>> if they don't care about getting their traces automatically in the event
>> tracer.)
>>
>> They can also use the existing TRACE_FORMAT if they don't need to code
>> the tracepoint in C, but just want to use the convenience of printf.
>>
>> So if the developer wants to "hardwire" a tracepoint in the fastest
>> possible way, and wants to acquire their data via a user space utility
>> in a raw binary format, or wants to see it in the trace output but not
>> sacrifice any performance, then they can implement the faster but
>> more complex TRACE_EVENT_FORMAT macro.
>>
>> Here's what usage looks like:
>>
>> TRACE_EVENT_FORMAT(name,
>> TPPROTO(proto),
>> TPARGS(args),
>> TPFMT(fmt, fmt_args),
>> TRACE_STRUCT(
>> TRACE_FIELD(type1, item1, assign1)
>> TRACE_FIELD(type2, item2, assign2)
>> [...]
>> ),
>> TPRAWFMT(raw_fmt)
>> );
>>
>> Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT
>> uses.
>>
>> name: is the unique identifier of the trace point
>> proto: The proto type that the trace point uses
>> args: the args in the proto type
>> fmt: printf format to use with the event printf tracer
>> fmt_args: the printf arguments to match fmt
>>
>> TRACE_STRUCT starts the ability to create a structure.
>> Each item in the structure is defined with a TRACE_FIELD
>>
>> TRACE_FIELD(type, item, assign)
>>
>> type: the C type of item.
>> item: the name of the item in the structure
>> assign: what to assign the item in the trace point callback
>>
>> raw_fmt is a way to pretty print the struct. It must match
>> the order in which the items are added in TRACE_STRUCT
>>
>> An example of this would be:
>>
>> TRACE_EVENT_FORMAT(sched_wakeup,
>> TPPROTO(struct rq *rq, struct task_struct *p, int success),
>> TPARGS(rq, p, success),
>> TPFMT("task %s:%d %s",
>> p->comm, p->pid, success?"succeeded":"failed"),
>> TRACE_STRUCT(
>> TRACE_FIELD(pid_t, pid, p->pid)
>> TRACE_FIELD(int, success, success)
>> ),
>> TPRAWFMT("task %d success=%d")
>> );
>>
>> This creates us a unique struct of:
>>
>> struct {
>> pid_t pid;
>> int success;
>> };
>>
>> And the way the call back would assign these values would be:
>>
>> entry->pid = p->pid;
>> entry->success = success;
>>
>> The nice part about this is that the creation of the assignment is done
>> via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is
>> created, the developer will then have a faster method to record
>> into the ring buffer. They do not need to worry about the tracer itself.
>>
>
> Nice improvements - I definitely was unhappy about having things spread
> around in different files unnecessarily. And I like the fact that your
> macros generate assignments too but am curious about what to do if you
> need to do something more complicated than an assignment e.g. in the
> block tracepoints I had to assign fields differently based on the value
> of blk_pc_request():
>
> if (blk_pc_request(rq)) {
> zed_event->sector = 0;
> zed_event->bytes = rq->data_len;
> zed_event->pdu_len = pdu_len;
> memcpy(zed_event->pdu, rq->cmd, pdu_len);
> } else {
> zed_event->sector = rq->hard_sector;
> zed_event->bytes = rq->hard_nr_sectors << 9;
> zed_event->pdu_len = 0;
> }
>
> Is there a way to define some fields but without the assignments, and do
> them manually somewhere else? I guess it would be nice to be able to
> define all events using TRACE_EVENT_FORMAT but have a way to special
> case certain events/fields.
Note that in such a case you can do a conditional assignment:

	TRACE_FIELD(int, bytes,
		    blk_pc_request(rq) ? rq->data_len : rq->hard_nr_sectors << 9);

The drawback here is that you'll have to repeat this conditional for
each field, which is annoying and a loss of performance.

Perhaps it would be interesting to allow something more low level if
the user wishes.

> Anyway, sorry if it's already handled in the code - haven't had a chance
> to really peruse it.
>
> Tom
>
>
>
>
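
The trade-off raised in this reply can be sketched in user space. This is only an illustration under assumptions: struct mock_request and struct mock_entry are simplified stand-ins invented here, the is_pc member plays the role of blk_pc_request(rq), and TRACE_FIELD is reduced to the plain assignment form shown in the patch.

```c
/* Hypothetical, simplified stand-ins for the block layer types. */
struct mock_request {
	int is_pc;			/* what blk_pc_request(rq) would report */
	unsigned long hard_sector;
	unsigned long hard_nr_sectors;
	unsigned int data_len;
};

struct mock_entry {
	unsigned long sector;
	unsigned long bytes;
};

/* Same shape as the stage 2 definition in the patch. */
#define TRACE_FIELD(type, item, assign) entry->item = assign;

/* One conditional per field, as plain TRACE_FIELD forces on the author. */
void record_per_field(struct mock_entry *entry, const struct mock_request *rq)
{
	TRACE_FIELD(unsigned long, sector, rq->is_pc ? 0 : rq->hard_sector)
	TRACE_FIELD(unsigned long, bytes,
		    rq->is_pc ? rq->data_len : rq->hard_nr_sectors << 9)
}

/* Hand-written equivalent that tests the condition only once. */
void record_branch_once(struct mock_entry *entry, const struct mock_request *rq)
{
	if (rq->is_pc) {
		entry->sector = 0;
		entry->bytes = rq->data_len;
	} else {
		entry->sector = rq->hard_sector;
		entry->bytes = rq->hard_nr_sectors << 9;
	}
}
```

Both functions store the same values; the difference is only how often the condition is written and evaluated, and a compiler may well merge the repeated tests.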
* Re: [PATCH 07/10] tracing: add raw trace point recording infrastructure
2009-03-02 7:51 ` Tom Zanussi
2009-03-02 12:22 ` Frédéric Weisbecker
@ 2009-03-02 13:23 ` Steven Rostedt
1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2009-03-02 13:23 UTC (permalink / raw)
To: Tom Zanussi
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Peter Zijlstra,
Frederic Weisbecker, Mathieu Desnoyers, Masami Hiramatsu,
KOSAKI Motohiro, Jason Baron, Frank Ch. Eigler, acme,
Steven Rostedt
On Mon, 2 Mar 2009, Tom Zanussi wrote:
> > An example of this would be:
> >
> > TRACE_EVENT_FORMAT(sched_wakeup,
> > TPPROTO(struct rq *rq, struct task_struct *p, int success),
> > TPARGS(rq, p, success),
> > TPFMT("task %s:%d %s",
> > p->comm, p->pid, success?"succeeded":"failed"),
> > TRACE_STRUCT(
> > TRACE_FIELD(pid_t, pid, p->pid)
> > TRACE_FIELD(int, success, success)
> > ),
> > TPRAWFMT("task %d success=%d")
> > );
> >
> > This creates us a unique struct of:
> >
> > struct {
> > pid_t pid;
> > int success;
> > };
> >
> > And the way the call back would assign these values would be:
> >
> > entry->pid = p->pid;
> > entry->success = success;
> >
> > The nice part about this is that the creation of the assignment is done
> > via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is
> > created, the developer will then have a faster method to record
> > into the ring buffer. They do not need to worry about the tracer itself.
> >
>
> Nice improvements - I definitely was unhappy about having things spread
> around in different files unnecessarily. And I like the fact that your
> macros generate assignments too but am curious about what to do if you
> need to do something more complicated than an assignment e.g. in the
> block tracepoints I had to assign fields differently based on the value
> of blk_pc_request():
>
> if (blk_pc_request(rq)) {
> zed_event->sector = 0;
> zed_event->bytes = rq->data_len;
> zed_event->pdu_len = pdu_len;
> memcpy(zed_event->pdu, rq->cmd, pdu_len);
> } else {
> zed_event->sector = rq->hard_sector;
> zed_event->bytes = rq->hard_nr_sectors << 9;
> zed_event->pdu_len = 0;
> }
>
> Is there a way to define some fields but without the assignments, and do
> them manually somewhere else? I guess it would be nice to be able to
> define all events using TRACE_EVENT_FORMAT but have a way to special
> case certain events/fields.
>
> Anyway, sorry if it's already handled in the code - haven't had a chance
> to really peruse it.
Nope, you are right, it is not handled... yet ;-)

I was thinking about adding TRACE_FIELD_SPECIAL() that would allow for a
different means to copy the field.

	TRACE_FIELD_SPECIAL(char*, pdu, rec,
		TPCMD(memcpy((rec)->pdu, rq->cmd, pdu_len)));

Where rec would have your "zed_event", and the fourth argument would have
the way to handle that field.

I have not tried that yet, but I think this, or something similar, could
work.

Thanks,

-- Steve
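
To make the proposal above concrete, here is one way such a macro could behave, written as a user-space sketch. Everything in it is an assumption: TRACE_FIELD_SPECIAL and TPCMD do not exist in the posted series, the expansion chosen here is only one possible reading of the idea, and BLK_FIELDS, pdu_buf_t, struct raw_blk and raw_blk_record() are invented for the example. It also relies on the GCC/Clang __typeof__ extension, and it uses a fixed-size array type rather than char * so that the struct owns the copied bytes.

```c
#include <string.h>

#define PDU_MAX 16
typedef unsigned char pdu_buf_t[PDU_MAX];

/* Hypothetical: TPCMD only passes its statement through. */
#define TPCMD(...) __VA_ARGS__

/* Hypothetical field list mixing a plain and a "special" field. */
#define BLK_FIELDS \
	TRACE_FIELD(unsigned int, pdu_len, len) \
	TRACE_FIELD_SPECIAL(pdu_buf_t, pdu, rec, \
		TPCMD(memcpy((rec)->pdu, src, len)))

/* Stage 1: both kinds of field become struct members. */
#define TRACE_FIELD(type, item, assign) type item;
#define TRACE_FIELD_SPECIAL(type, item, rec, stmt) type item;
struct raw_blk {
	BLK_FIELDS
};
#undef TRACE_FIELD
#undef TRACE_FIELD_SPECIAL

/*
 * Stage 2: plain fields are assigned; special fields run the caller's
 * statement with "rec" bound to the entry being filled in.
 */
#define TRACE_FIELD(type, item, assign) entry->item = assign;
#define TRACE_FIELD_SPECIAL(type, item, rec, stmt) \
	{ __typeof__(entry) rec = entry; stmt; }
void raw_blk_record(struct raw_blk *entry, const unsigned char *src,
		    unsigned int len)
{
	if (len > PDU_MAX)	/* never copy past the fixed-size field */
		len = PDU_MAX;
	BLK_FIELDS
}
#undef TRACE_FIELD
#undef TRACE_FIELD_SPECIAL
```

The fourth argument stays an ordinary C statement, so an if/else such as the blk_pc_request() case could be passed the same way.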
* [PATCH 08/10] tracing: add raw fast tracing interface for trace events
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
` (6 preceding siblings ...)
2009-02-28 9:06 ` [PATCH 07/10] tracing: add raw trace point recording infrastructure Steven Rostedt
@ 2009-02-28 9:06 ` Steven Rostedt
2009-02-28 9:06 ` [PATCH 09/10] tracing: create the C style tracing for the sched subsystem Steven Rostedt
` (2 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2009-02-28 9:06 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0008-tracing-add-raw-fast-tracing-interface-for-trace-ev.patch --]
[-- Type: text/plain, Size: 8130 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
This patch adds the interface to enable the C style trace points.

In the directory /debugfs/tracing/events/subsystem/event
We now have three files:

 enable : values 0 or 1 to enable or disable the trace event.

 available_types: values 'raw' and 'printf' which indicate the tracing
       types available for the trace point. If a developer does not
       use the TRACE_EVENT_FORMAT macro and just uses the TRACE_FORMAT
       macro, then only 'printf' will be available. This file is
       read only.

 type: values 'raw' or 'printf'. This indicates which type of tracing
       is active for that trace point. 'printf' is the default and
       if 'raw' is not available, this file is read only.

 # echo raw > /debug/tracing/events/sched/sched_wakeup/type
 # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable

Will enable the C style tracing for the sched_wakeup trace point.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/trace/trace.h | 7 ++
kernel/trace/trace_events.c | 199 +++++++++++++++++++++++++++++++++++++------
2 files changed, 181 insertions(+), 25 deletions(-)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index aa1ab0c..f6fa0b9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -726,6 +726,12 @@ static inline void trace_branch_disable(void)
}
#endif /* CONFIG_BRANCH_TRACER */
+/* trace event type bit fields, not numeric */
+enum {
+ TRACE_EVENT_TYPE_PRINTF = 1,
+ TRACE_EVENT_TYPE_RAW = 2,
+};
+
struct ftrace_event_call {
char *name;
char *system;
@@ -736,6 +742,7 @@ struct ftrace_event_call {
int id;
struct dentry *raw_dir;
int raw_enabled;
+ int type;
int (*raw_init)(void);
int (*raw_reg)(void);
void (*raw_unreg)(void);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 77a5c02..1d07f80 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -44,6 +44,36 @@ static void ftrace_clear_events(void)
}
}
+static void ftrace_event_enable_disable(struct ftrace_event_call *call,
+ int enable)
+{
+
+ switch (enable) {
+ case 0:
+ if (call->enabled) {
+ call->enabled = 0;
+ call->unregfunc();
+ }
+ if (call->raw_enabled) {
+ call->raw_enabled = 0;
+ call->raw_unreg();
+ }
+ break;
+ case 1:
+ if (!call->enabled &&
+ (call->type & TRACE_EVENT_TYPE_PRINTF)) {
+ call->enabled = 1;
+ call->regfunc();
+ }
+ if (!call->raw_enabled &&
+ (call->type & TRACE_EVENT_TYPE_RAW)) {
+ call->raw_enabled = 1;
+ call->raw_reg();
+ }
+ break;
+ }
+}
+
static int ftrace_set_clr_event(char *buf, int set)
{
struct ftrace_event_call *call = __start_ftrace_events;
@@ -90,19 +120,8 @@ static int ftrace_set_clr_event(char *buf, int set)
if (event && strcmp(event, call->name) != 0)
continue;
- if (set) {
- /* Already set? */
- if (call->enabled)
- return 0;
- call->enabled = 1;
- call->regfunc();
- } else {
- /* Already cleared? */
- if (!call->enabled)
- return 0;
- call->enabled = 0;
- call->unregfunc();
- }
+ ftrace_event_enable_disable(call, set);
+
ret = 0;
}
return ret;
@@ -273,7 +292,7 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
struct ftrace_event_call *call = filp->private_data;
char *buf;
- if (call->enabled)
+ if (call->enabled || call->raw_enabled)
buf = "1\n";
else
buf = "0\n";
@@ -304,18 +323,8 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
switch (val) {
case 0:
- if (!call->enabled)
- break;
-
- call->enabled = 0;
- call->unregfunc();
- break;
case 1:
- if (call->enabled)
- break;
-
- call->enabled = 1;
- call->regfunc();
+ ftrace_event_enable_disable(call, val);
break;
default:
@@ -327,6 +336,107 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
return cnt;
}
+static ssize_t
+event_type_read(struct file *filp, char __user *ubuf, size_t cnt,
+ loff_t *ppos)
+{
+ struct ftrace_event_call *call = filp->private_data;
+ char buf[16];
+ int r = 0;
+
+ if (call->type & TRACE_EVENT_TYPE_PRINTF)
+ r += sprintf(buf, "printf\n");
+
+ if (call->type & TRACE_EVENT_TYPE_RAW)
+ r += sprintf(buf+r, "raw\n");
+
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static ssize_t
+event_type_write(struct file *filp, const char __user *ubuf, size_t cnt,
+ loff_t *ppos)
+{
+ struct ftrace_event_call *call = filp->private_data;
+ char buf[64];
+
+ /*
+ * If there's only one type, we can't change it.
+ * And currently we always have printf type, and we
+ * may or may not have raw type.
+ *
+ * This is a redundant check, the file should be read
+ * only if this is the case anyway.
+ */
+
+ if (!call->raw_init)
+ return -EPERM;
+
+ if (cnt >= sizeof(buf))
+ return -EINVAL;
+
+ if (copy_from_user(&buf, ubuf, cnt))
+ return -EFAULT;
+
+ buf[cnt] = 0;
+
+ if (!strncmp(buf, "printf", 6) &&
+ (!buf[6] || isspace(buf[6]))) {
+
+ call->type = TRACE_EVENT_TYPE_PRINTF;
+
+ /*
+ * If raw is enabled, then disable it and enable
+ * printf type.
+ */
+ if (call->raw_enabled) {
+ call->raw_enabled = 0;
+ call->raw_unreg();
+
+ call->enabled = 1;
+ call->regfunc();
+ }
+
+ } else if (!strncmp(buf, "raw", 3) &&
+ (!buf[3] || isspace(buf[3]))) {
+
+ call->type = TRACE_EVENT_TYPE_RAW;
+
+ /*
+ * If printf is enabled, then disable it and enable
+ * raw type.
+ */
+ if (call->enabled) {
+ call->enabled = 0;
+ call->unregfunc();
+
+ call->raw_enabled = 1;
+ call->raw_reg();
+ }
+ } else
+ return -EINVAL;
+
+ *ppos += cnt;
+
+ return cnt;
+}
+
+static ssize_t
+event_available_types_read(struct file *filp, char __user *ubuf, size_t cnt,
+ loff_t *ppos)
+{
+ struct ftrace_event_call *call = filp->private_data;
+ char buf[16];
+ int r = 0;
+
+ r += sprintf(buf, "printf\n");
+
+ if (call->raw_init)
+ r += sprintf(buf+r, "raw\n");
+
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
static const struct seq_operations show_event_seq_ops = {
.start = t_start,
.next = t_next,
@@ -362,6 +472,17 @@ static const struct file_operations ftrace_enable_fops = {
.write = event_enable_write,
};
+static const struct file_operations ftrace_type_fops = {
+ .open = tracing_open_generic,
+ .read = event_type_read,
+ .write = event_type_write,
+};
+
+static const struct file_operations ftrace_available_types_fops = {
+ .open = tracing_open_generic,
+ .read = event_available_types_read,
+};
+
static struct dentry *event_trace_events_dir(void)
{
static struct dentry *d_tracer;
@@ -427,6 +548,7 @@ static int
event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
{
struct dentry *entry;
+ int ret;
/*
* If the trace point header did not define TRACE_SYSTEM
@@ -435,6 +557,18 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
if (strcmp(call->system, "TRACE_SYSTEM") != 0)
d_events = event_subsystem_dir(call->system, d_events);
+ if (call->raw_init) {
+ ret = call->raw_init();
+ if (ret < 0) {
+ pr_warning("Could not initialize trace point"
+ " events/%s\n", call->name);
+ return ret;
+ }
+ }
+
+ /* default the output to printf */
+ call->type = TRACE_EVENT_TYPE_PRINTF;
+
call->dir = debugfs_create_dir(call->name, d_events);
if (!call->dir) {
pr_warning("Could not create debugfs "
@@ -448,6 +582,21 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
pr_warning("Could not create debugfs "
"'%s/enable' entry\n", call->name);
+ /* Only let type be writable, if we can change it */
+ entry = debugfs_create_file("type",
+ call->raw_init ? 0644 : 0444,
+ call->dir, call,
+ &ftrace_type_fops);
+ if (!entry)
+ pr_warning("Could not create debugfs "
+ "'%s/type' entry\n", call->name);
+
+ entry = debugfs_create_file("available_types", 0444, call->dir, call,
+ &ftrace_available_types_fops);
+ if (!entry)
+ pr_warning("Could not create debugfs "
+ "'%s/available_types' entry\n", call->name);
+
return 0;
}
--
1.5.6.5
--
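
The interplay between the 'type' and 'enable' files in the patch above can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel code: struct mock_call and both functions are invented here, the register/unregister callbacks are reduced to flag updates, and mock_set_type() collapses the two branches of event_type_write() into one path that should reach the same end state.

```c
/* Mirrors the bit values of the enum added to trace.h. */
enum {
	TYPE_PRINTF = 1,
	TYPE_RAW    = 2,
};

/* Hypothetical reduction of struct ftrace_event_call to its state flags. */
struct mock_call {
	int type;		/* which probe style is selected */
	int enabled;		/* printf probe registered */
	int raw_enabled;	/* raw probe registered */
};

/* Models ftrace_event_enable_disable(): only the selected type turns on. */
void mock_enable_disable(struct mock_call *call, int enable)
{
	if (!enable) {
		call->enabled = 0;
		call->raw_enabled = 0;
		return;
	}
	if (call->type & TYPE_PRINTF)
		call->enabled = 1;
	if (call->type & TYPE_RAW)
		call->raw_enabled = 1;
}

/* Models a write to 'type': an active event is switched to the new probe. */
void mock_set_type(struct mock_call *call, int type)
{
	int was_on = call->enabled || call->raw_enabled;

	mock_enable_disable(call, 0);
	call->type = type;
	if (was_on)
		mock_enable_disable(call, 1);
}
```

Changing the type of a disabled event only records the choice; nothing is registered until 'enable' is written.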
* [PATCH 09/10] tracing: create the C style tracing for the sched subsystem
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
` (7 preceding siblings ...)
2009-02-28 9:06 ` [PATCH 08/10] tracing: add raw fast tracing interface for trace events Steven Rostedt
@ 2009-02-28 9:06 ` Steven Rostedt
2009-02-28 9:06 ` [PATCH 10/10] tracing: create the C style tracing for the irq subsystem Steven Rostedt
2009-02-28 9:17 ` [PATCH 00/10] [git pull] for tip/tracing/ftrace Ingo Molnar
10 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2009-02-28 9:06 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0009-tracing-create-the-C-style-tracing-for-the-sched-su.patch --]
[-- Type: text/plain, Size: 5792 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style
faster tracing for the sched subsystem trace points.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
include/linux/tracepoint.h | 3 +
include/trace/sched_event_types.h | 119 +++++++++++++++++++++++++++++--------
2 files changed, 97 insertions(+), 25 deletions(-)
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 62d1339..152b2f0 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -157,4 +157,7 @@ static inline void tracepoint_synchronize_unregister(void)
#define TRACE_FORMAT(name, proto, args, fmt) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+#define TRACE_EVENT_FORMAT(name, proto, args, fmt, struct, tpfmt) \
+ TRACE_FORMAT(name, PARAMS(proto), PARAMS(args), PARAMS(fmt))
+
#endif
diff --git a/include/trace/sched_event_types.h b/include/trace/sched_event_types.h
index 2ada206..ba059c1 100644
--- a/include/trace/sched_event_types.h
+++ b/include/trace/sched_event_types.h
@@ -1,6 +1,6 @@
/* use <trace/sched.h> instead */
-#ifndef TRACE_FORMAT
+#ifndef TRACE_EVENT_FORMAT
# error Do not include this file directly.
# error Unless you know what you are doing.
#endif
@@ -8,70 +8,139 @@
#undef TRACE_SYSTEM
#define TRACE_SYSTEM sched
-TRACE_FORMAT(sched_kthread_stop,
+TRACE_EVENT_FORMAT(sched_kthread_stop,
TPPROTO(struct task_struct *t),
TPARGS(t),
- TPFMT("task %s:%d", t->comm, t->pid));
+ TPFMT("task %s:%d", t->comm, t->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, t->pid)
+ ),
+ TPRAWFMT("task %d")
+ );
-TRACE_FORMAT(sched_kthread_stop_ret,
+TRACE_EVENT_FORMAT(sched_kthread_stop_ret,
TPPROTO(int ret),
TPARGS(ret),
- TPFMT("ret=%d", ret));
+ TPFMT("ret=%d", ret),
+ TRACE_STRUCT(
+ TRACE_FIELD(int, ret, ret)
+ ),
+ TPRAWFMT("ret=%d")
+ );
-TRACE_FORMAT(sched_wait_task,
+TRACE_EVENT_FORMAT(sched_wait_task,
TPPROTO(struct rq *rq, struct task_struct *p),
TPARGS(rq, p),
- TPFMT("task %s:%d", p->comm, p->pid));
+ TPFMT("task %s:%d", p->comm, p->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ ),
+ TPRAWFMT("task %d")
+ );
-TRACE_FORMAT(sched_wakeup,
+TRACE_EVENT_FORMAT(sched_wakeup,
TPPROTO(struct rq *rq, struct task_struct *p, int success),
TPARGS(rq, p, success),
TPFMT("task %s:%d %s",
- p->comm, p->pid, success?"succeeded":"failed"));
+ p->comm, p->pid, success ? "succeeded" : "failed"),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ TRACE_FIELD(int, success, success)
+ ),
+ TPRAWFMT("task %d success=%d")
+ );
-TRACE_FORMAT(sched_wakeup_new,
+TRACE_EVENT_FORMAT(sched_wakeup_new,
TPPROTO(struct rq *rq, struct task_struct *p, int success),
TPARGS(rq, p, success),
TPFMT("task %s:%d",
- p->comm, p->pid, success?"succeeded":"failed"));
+ p->comm, p->pid, success ? "succeeded" : "failed"),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ TRACE_FIELD(int, success, success)
+ ),
+ TPRAWFMT("task %d success=%d")
+ );
-TRACE_FORMAT(sched_switch,
+TRACE_EVENT_FORMAT(sched_switch,
TPPROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TPARGS(rq, prev, next),
TPFMT("task %s:%d ==> %s:%d",
- prev->comm, prev->pid, next->comm, next->pid));
+ prev->comm, prev->pid, next->comm, next->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, prev_pid, prev->pid)
+ TRACE_FIELD(int, prev_prio, prev->prio)
+ TRACE_FIELD(pid_t, next_pid, next->pid)
+ TRACE_FIELD(int, next_prio, next->prio)
+ ),
+ TPRAWFMT("prev %d:%d ==> next %d:%d")
+ );
-TRACE_FORMAT(sched_migrate_task,
+TRACE_EVENT_FORMAT(sched_migrate_task,
TPPROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
TPARGS(p, orig_cpu, dest_cpu),
TPFMT("task %s:%d from: %d to: %d",
- p->comm, p->pid, orig_cpu, dest_cpu));
+ p->comm, p->pid, orig_cpu, dest_cpu),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ TRACE_FIELD(int, orig_cpu, orig_cpu)
+ TRACE_FIELD(int, dest_cpu, dest_cpu)
+ ),
+ TPRAWFMT("task %d from: %d to: %d")
+ );
-TRACE_FORMAT(sched_process_free,
+TRACE_EVENT_FORMAT(sched_process_free,
TPPROTO(struct task_struct *p),
TPARGS(p),
- TPFMT("task %s:%d", p->comm, p->pid));
+ TPFMT("task %s:%d", p->comm, p->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ ),
+ TPRAWFMT("task %d")
+ );
-TRACE_FORMAT(sched_process_exit,
+TRACE_EVENT_FORMAT(sched_process_exit,
TPPROTO(struct task_struct *p),
TPARGS(p),
- TPFMT("task %s:%d", p->comm, p->pid));
+ TPFMT("task %s:%d", p->comm, p->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, p->pid)
+ ),
+ TPRAWFMT("task %d")
+ );
-TRACE_FORMAT(sched_process_wait,
+TRACE_EVENT_FORMAT(sched_process_wait,
TPPROTO(struct pid *pid),
TPARGS(pid),
- TPFMT("pid %d", pid));
+ TPFMT("pid %d", pid_nr(pid)),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, pid, pid_nr(pid))
+ ),
+ TPRAWFMT("task %d")
+ );
-TRACE_FORMAT(sched_process_fork,
+TRACE_EVENT_FORMAT(sched_process_fork,
TPPROTO(struct task_struct *parent, struct task_struct *child),
TPARGS(parent, child),
TPFMT("parent %s:%d child %s:%d",
- parent->comm, parent->pid, child->comm, child->pid));
+ parent->comm, parent->pid, child->comm, child->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(pid_t, parent, parent->pid)
+ TRACE_FIELD(pid_t, child, child->pid)
+ ),
+ TPRAWFMT("parent %d child %d")
+ );
-TRACE_FORMAT(sched_signal_send,
+TRACE_EVENT_FORMAT(sched_signal_send,
TPPROTO(int sig, struct task_struct *p),
TPARGS(sig, p),
- TPFMT("sig: %d task %s:%d", sig, p->comm, p->pid));
+ TPFMT("sig: %d task %s:%d", sig, p->comm, p->pid),
+ TRACE_STRUCT(
+ TRACE_FIELD(int, sig, sig)
+ TRACE_FIELD(pid_t, pid, p->pid)
+ ),
+ TPRAWFMT("sig: %d task %d")
+ );
#undef TRACE_SYSTEM
--
1.5.6.5
--
* [PATCH 10/10] tracing: create the C style tracing for the irq subsystem
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
` (8 preceding siblings ...)
2009-02-28 9:06 ` [PATCH 09/10] tracing: create the C style tracing for the sched subsystem Steven Rostedt
@ 2009-02-28 9:06 ` Steven Rostedt
2009-02-28 9:17 ` [PATCH 00/10] [git pull] for tip/tracing/ftrace Ingo Molnar
10 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2009-02-28 9:06 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme, Steven Rostedt
[-- Attachment #1: 0010-tracing-create-the-C-style-tracing-for-the-irq-subs.patch --]
[-- Type: text/plain, Size: 1361 bytes --]
From: Steven Rostedt <srostedt@redhat.com>
This patch uses the TRACE_EVENT_FORMAT macro to enable the faster,
C style tracing for the irq subsystem trace points.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
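For readers new to the idea, here is a minimal userspace sketch of what
"C style" recording means: the field list is written once, then expanded
into a struct and into plain assignments, so the hot path stores integers
and the string formatting happens only when the record is read back. This
is not the kernel's actual macro expansion. IRQ_HANDLER_EXIT_FIELDS,
AS_MEMBER, AS_ASSIGN, struct irq_handler_exit_entry and both functions are
hypothetical names made up for illustration, and struct irqaction is
reduced to a stand-in.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct irqaction. */
struct irqaction {
	const char *name;
};

/* The field list for one event, written once (X-macro style). */
#define IRQ_HANDLER_EXIT_FIELDS(FIELD)	\
	FIELD(int, irq, irq)		\
	FIELD(int, ret, ret)

/* Pass 1: every field becomes a member of the binary record. */
#define AS_MEMBER(type, item, assign)	type item;
struct irq_handler_exit_entry {
	IRQ_HANDLER_EXIT_FIELDS(AS_MEMBER)
};

/* Pass 2: every field becomes a plain assignment into the record. */
#define AS_ASSIGN(type, item, assign)	entry->item = (assign);
static void record_irq_handler_exit(struct irq_handler_exit_entry *entry,
				    int irq, struct irqaction *action, int ret)
{
	assert(entry != NULL);
	(void)action;	/* the handler name is not stored in the raw record */
	IRQ_HANDLER_EXIT_FIELDS(AS_ASSIGN)
}

/* Formatting is deferred until the record is read back. */
static int print_irq_handler_exit(char *buf, size_t len,
				  const struct irq_handler_exit_entry *entry)
{
	return snprintf(buf, len, "irq %d ret %d", entry->irq, entry->ret);
}
```

A caller would fill an entry with record_irq_handler_exit() at the trace
point and call print_irq_handler_exit() later, when the buffer is read.
The trade-off in this sketch is the same one visible in the diff below:
the raw record keeps only the integer fields, so the handler name printed
by TPFMT is not available in the raw output.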
include/trace/irq_event_types.h | 19 +++++++++++++++----
1 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/include/trace/irq_event_types.h b/include/trace/irq_event_types.h
index 47a2be1..65850bc 100644
--- a/include/trace/irq_event_types.h
+++ b/include/trace/irq_event_types.h
@@ -8,15 +8,26 @@
#undef TRACE_SYSTEM
#define TRACE_SYSTEM irq
-TRACE_FORMAT(irq_handler_entry,
+TRACE_EVENT_FORMAT(irq_handler_entry,
TPPROTO(int irq, struct irqaction *action),
TPARGS(irq, action),
- TPFMT("irq=%d handler=%s", irq, action->name));
+ TPFMT("irq=%d handler=%s", irq, action->name),
+ TRACE_STRUCT(
+ TRACE_FIELD(int, irq, irq)
+ ),
+ TPRAWFMT("irq %d")
+ );
-TRACE_FORMAT(irq_handler_exit,
+TRACE_EVENT_FORMAT(irq_handler_exit,
TPPROTO(int irq, struct irqaction *action, int ret),
TPARGS(irq, action, ret),
TPFMT("irq=%d handler=%s return=%s",
- irq, action->name, ret ? "handled" : "unhandled"));
+ irq, action->name, ret ? "handled" : "unhandled"),
+ TRACE_STRUCT(
+ TRACE_FIELD(int, irq, irq)
+ TRACE_FIELD(int, ret, ret)
+ ),
+ TPRAWFMT("irq %d ret %d")
+ );
#undef TRACE_SYSTEM
--
1.5.6.5
--
* Re: [PATCH 00/10] [git pull] for tip/tracing/ftrace
2009-02-28 9:06 [PATCH 00/10] [git pull] for tip/tracing/ftrace Steven Rostedt
` (9 preceding siblings ...)
2009-02-28 9:06 ` [PATCH 10/10] tracing: create the C style tracing for the irq subsystem Steven Rostedt
@ 2009-02-28 9:17 ` Ingo Molnar
10 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2009-02-28 9:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
Mathieu Desnoyers, Tom Zanussi, Masami Hiramatsu, KOSAKI Motohiro,
Jason Baron, Frank Ch. Eigler, acme
* Steven Rostedt <rostedt@goodmis.org> wrote:
> Ingo,
>
> Please pull the latest tip/tracing/ftrace tree, which can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> tip/tracing/ftrace
>
>
> Steven Rostedt (10):
> tracing: move trace point formats to files in include/trace directory
> tracing: add subsystem level to trace events
> tracing: make the set_event and available_events subsystem aware
> tracing: add subsystem irq for irq events
> tracing: add subsystem sched for sched events
> tracing: add interface to write into current tracer buffer
> tracing: add raw trace point recording infrastructure
> tracing: add raw fast tracing interface for trace events
> tracing: create the C style tracing for the sched subsystem
> tracing: create the C style tracing for the irq subsystem
Great stuff!
Pulled into tip:tracing/ftrace, thanks Steve!
Ingo