* [PATCH 2/3] tracing/perf: reimplement TP_perf_assign() logic
2013-06-18 19:21 [PATCH 0/3] tracing/perf: perf_trace_buf/perf_xxx hacks Oleg Nesterov
@ 2013-06-18 19:22 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-06-18 19:22 UTC (permalink / raw)
To: Steven Rostedt, Peter Zijlstra
Cc: Frederic Weisbecker, Ingo Molnar, Masami Hiramatsu,
Srikar Dronamraju, zhangwei(Jovi), linux-kernel
TP_perf_assign()/__perf_xxx() are used to change the default values
of the __addr/__count/__task variables for perf_trace_buf_submit().
Unfortunately, TP_perf_assign() is called "too late"; we want to have
a fast-path "__task != NULL" check at the start of perf_trace_##call().
So this patch simply embeds __perf_xxx() into TP_ARGS(); this way
DECLARE_EVENT_CLASS() can use the result of the assignments hidden in
"args" right after ftrace_get_offsets_##call(), which is mostly
trivial.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/trace/events/sched.h | 16 +++-------------
include/trace/ftrace.h | 19 +++++++++++--------
2 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 249c024..2e7d994 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
TP_PROTO(struct task_struct *p, int success),
- TP_ARGS(p, success),
+ TP_ARGS(__perf_task(p), success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__entry->prio = p->prio;
__entry->success = success;
__entry->target_cpu = task_cpu(p);
- )
- TP_perf_assign(
- __perf_task(p);
),
TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
TP_PROTO(struct task_struct *tsk, u64 delay),
- TP_ARGS(tsk, delay),
+ TP_ARGS(__perf_task(tsk), __perf_count(delay)),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
__entry->pid = tsk->pid;
__entry->delay = delay;
- )
- TP_perf_assign(
- __perf_count(delay);
- __perf_task(tsk);
),
TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@@ -376,7 +369,7 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
- TP_ARGS(tsk, runtime, vruntime),
+ TP_ARGS(tsk, __perf_count(runtime), vruntime),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -390,9 +383,6 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
__entry->pid = tsk->pid;
__entry->runtime = runtime;
__entry->vruntime = vruntime;
- )
- TP_perf_assign(
- __perf_count(runtime);
),
TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index aed594a..8886877 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -503,8 +503,14 @@ static inline notrace int ftrace_get_offsets_##call( \
#undef TP_fast_assign
#define TP_fast_assign(args...) args
-#undef TP_perf_assign
-#define TP_perf_assign(args...)
+#undef __perf_addr
+#define __perf_addr(a) (a)
+
+#undef __perf_count
+#define __perf_count(c) (c)
+
+#undef __perf_task
+#define __perf_task(t) (t)
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@@ -632,16 +638,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
#define __get_str(field) (char *)__get_dynamic_array(field)
#undef __perf_addr
-#define __perf_addr(a) __addr = (a)
+#define __perf_addr(a) (__addr = (a))
#undef __perf_count
-#define __perf_count(c) __count = (c)
+#define __perf_count(c) (__count = (c))
#undef __perf_task
-#define __perf_task(t) __task = (t)
-
-#undef TP_perf_assign
-#define TP_perf_assign(args...) args
+#define __perf_task(t) (__task = (t))
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
--
1.5.5.1
* [PATCH 2/3] tracing/perf: reimplement TP_perf_assign() logic
2013-07-18 18:30 [PATCH RESEND 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-07-18 18:30 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-07-18 18:30 UTC (permalink / raw)
To: Frederic Weisbecker, Peter Zijlstra, Steven Rostedt
Cc: David Ahern, Ingo Molnar, Masami Hiramatsu, zhangwei(Jovi),
linux-kernel
The next patch tries to avoid the costly perf_trace_buf_* calls when
possible, but there is a problem: we can only do this if __task ==
NULL, because perf_tp_event(task != NULL) has additional code for
that case.
Unfortunately, TP_perf_assign()/__perf_xxx(), which change the
default values of the __count/__task variables for
perf_trace_buf_submit(), are called "too late", after we have already
done perf_trace_buf_prepare(), so the optimization above can't work.
So this patch simply embeds __perf_xxx() into TP_ARGS(); this way
DECLARE_EVENT_CLASS() can use the result of the assignments hidden in
"args" right after ftrace_get_offsets_##call(), which is mostly
trivial. This allows us to have the fast-path "__task != NULL" check
at the start; see the next patch.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/trace/events/sched.h | 16 +++-------------
include/trace/ftrace.h | 19 +++++++++++--------
2 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 249c024..2e7d994 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
TP_PROTO(struct task_struct *p, int success),
- TP_ARGS(p, success),
+ TP_ARGS(__perf_task(p), success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__entry->prio = p->prio;
__entry->success = success;
__entry->target_cpu = task_cpu(p);
- )
- TP_perf_assign(
- __perf_task(p);
),
TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
TP_PROTO(struct task_struct *tsk, u64 delay),
- TP_ARGS(tsk, delay),
+ TP_ARGS(__perf_task(tsk), __perf_count(delay)),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
__entry->pid = tsk->pid;
__entry->delay = delay;
- )
- TP_perf_assign(
- __perf_count(delay);
- __perf_task(tsk);
),
TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@@ -376,7 +369,7 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
- TP_ARGS(tsk, runtime, vruntime),
+ TP_ARGS(tsk, __perf_count(runtime), vruntime),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -390,9 +383,6 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
__entry->pid = tsk->pid;
__entry->runtime = runtime;
__entry->vruntime = vruntime;
- )
- TP_perf_assign(
- __perf_count(runtime);
),
TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index aed594a..8886877 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -503,8 +503,14 @@ static inline notrace int ftrace_get_offsets_##call( \
#undef TP_fast_assign
#define TP_fast_assign(args...) args
-#undef TP_perf_assign
-#define TP_perf_assign(args...)
+#undef __perf_addr
+#define __perf_addr(a) (a)
+
+#undef __perf_count
+#define __perf_count(c) (c)
+
+#undef __perf_task
+#define __perf_task(t) (t)
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@@ -632,16 +638,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
#define __get_str(field) (char *)__get_dynamic_array(field)
#undef __perf_addr
-#define __perf_addr(a) __addr = (a)
+#define __perf_addr(a) (__addr = (a))
#undef __perf_count
-#define __perf_count(c) __count = (c)
+#define __perf_count(c) (__count = (c))
#undef __perf_task
-#define __perf_task(t) __task = (t)
-
-#undef TP_perf_assign
-#define TP_perf_assign(args...) args
+#define __perf_task(t) (__task = (t))
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
--
1.5.5.1
* [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events)
@ 2013-08-05 16:50 Oleg Nesterov
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
Sorry for double post, forgot to cc lkml...
On 07/19, Ingo Molnar wrote:
>
> * Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Hello.
> >
> > The patches are the same, I only tried to update the changelogs a bit.
> > I am also quoting my old email below, to explain what this hack tries
> > to do.
> >
> > Say, "perf record -e sched:sched_switch -p1".
> >
> > Every task except /sbin/init will do perf_trace_sched_switch() and
> > perf_trace_buf_prepare() + perf_trace_buf_submit() for no reason;
> > it doesn't have a counter.
> >
> > So it makes sense to add the fast-path check at the start of
> > perf_trace_##call(),
> >
> > if (hlist_empty(event_call->perf_events))
> > return;
> >
> > The problem is, we should not do this if __task != NULL (iow, if
> > DECLARE_EVENT_CLASS() uses __perf_task()), because perf_tp_event()
> > has additional code for this case.
> >
> > So we should do
> >
> > if (!__task && hlist_empty(event_call->perf_events))
> > return;
> >
> > But __task is changed by the "{ assign; }" block right before
> > perf_trace_buf_submit(). That is too late for the fast-path check;
> > we have already called perf_trace_buf_prepare/fetch_regs.
> >
> > So. After 2/3 __perf_task() (and __perf_count/addr) is called
> > when ftrace_get_offsets_##call(args) evaluates the arguments,
> > and we can check !__task && hlist_empty() right after that.
> >
> > Oleg.
>
> Nice improvement.
>
> Peter, Steve, any objections?
Ingo,
It seems that everybody agrees with this hack, but it was forgotten;
let me resend it again.
The only change is that I added the following tags:
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
Oleg.
include/trace/events/sched.h | 22 ++++++++--------------
include/trace/ftrace.h | 33 ++++++++++++++++++++-------------
2 files changed, 28 insertions(+), 27 deletions(-)
* [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-08-05 16:50 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
To simplify the review of the next patches:
1. We are going to reimplement __perf_task/counter and embed them
into TP_ARGS(). Expand TRACE_EVENT(sched_stat_runtime) into
DECLARE_EVENT_CLASS() + DEFINE_EVENT(); this way they can use
different TP_ARGS's.
2. Change the perf_trace_##call() macro to do perf_fetch_caller_regs()
right before perf_trace_buf_prepare().
This way it evaluates TP_ARGS() as soon as possible; the next patch
exploits this fact.
Note: after 87f44bbc perf_trace_buf_prepare() doesn't need
"struct pt_regs *regs", so perhaps it makes sense to remove this
argument. And perhaps we can teach perf_trace_buf_submit()
to accept regs == NULL and do fetch_caller_regs(CALLER_ADDR1)
in this case.
3. Cosmetic: the typecast from "void *" buys nothing, it just
adds noise; remove it.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
include/trace/events/sched.h | 6 +++++-
include/trace/ftrace.h | 7 +++----
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index e5586ca..249c024 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -372,7 +372,7 @@ DEFINE_EVENT(sched_stat_template, sched_stat_blocked,
* Tracepoint for accounting runtime (time the task is executing
* on a CPU).
*/
-TRACE_EVENT(sched_stat_runtime,
+DECLARE_EVENT_CLASS(sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
@@ -401,6 +401,10 @@ TRACE_EVENT(sched_stat_runtime,
(unsigned long long)__entry->vruntime)
);
+DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
+ TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
+ TP_ARGS(tsk, runtime, vruntime));
+
/*
* Tracepoint for showing priority inheritance modifying a tasks
* priority.
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 41a6643..618af05 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -663,15 +663,14 @@ perf_trace_##call(void *__data, proto) \
int __data_size; \
int rctx; \
\
- perf_fetch_caller_regs(&__regs); \
- \
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
sizeof(u64)); \
__entry_size -= sizeof(u32); \
\
- entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare( \
- __entry_size, event_call->event.type, &__regs, &rctx); \
+ perf_fetch_caller_regs(&__regs); \
+ entry = perf_trace_buf_prepare(__entry_size, \
+ event_call->event.type, &__regs, &rctx); \
if (!entry) \
return; \
\
--
1.5.5.1
* [PATCH 2/3] tracing/perf: Reimplement TP_perf_assign() logic
2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-08-05 16:50 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
The next patch tries to avoid the costly perf_trace_buf_* calls when
possible, but there is a problem: we can only do this if __task ==
NULL, because perf_tp_event(task != NULL) has additional code for
that case.
Unfortunately, TP_perf_assign()/__perf_xxx(), which change the
default values of the __count/__task variables for
perf_trace_buf_submit(), are called "too late", after we have already
done perf_trace_buf_prepare(), so the optimization above can't work.
So this patch simply embeds __perf_xxx() into TP_ARGS(); this way
DECLARE_EVENT_CLASS() can use the result of the assignments hidden in
"args" right after ftrace_get_offsets_##call(), which is mostly
trivial. This allows us to have the fast-path "__task != NULL" check
at the start; see the next patch.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
include/trace/events/sched.h | 16 +++-------------
include/trace/ftrace.h | 19 +++++++++++--------
2 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 249c024..2e7d994 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
TP_PROTO(struct task_struct *p, int success),
- TP_ARGS(p, success),
+ TP_ARGS(__perf_task(p), success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__entry->prio = p->prio;
__entry->success = success;
__entry->target_cpu = task_cpu(p);
- )
- TP_perf_assign(
- __perf_task(p);
),
TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
TP_PROTO(struct task_struct *tsk, u64 delay),
- TP_ARGS(tsk, delay),
+ TP_ARGS(__perf_task(tsk), __perf_count(delay)),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
__entry->pid = tsk->pid;
__entry->delay = delay;
- )
- TP_perf_assign(
- __perf_count(delay);
- __perf_task(tsk);
),
TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@@ -376,7 +369,7 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
- TP_ARGS(tsk, runtime, vruntime),
+ TP_ARGS(tsk, __perf_count(runtime), vruntime),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@@ -390,9 +383,6 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
__entry->pid = tsk->pid;
__entry->runtime = runtime;
__entry->vruntime = vruntime;
- )
- TP_perf_assign(
- __perf_count(runtime);
),
TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 618af05..4163d93 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -507,8 +507,14 @@ static inline notrace int ftrace_get_offsets_##call( \
#undef TP_fast_assign
#define TP_fast_assign(args...) args
-#undef TP_perf_assign
-#define TP_perf_assign(args...)
+#undef __perf_addr
+#define __perf_addr(a) (a)
+
+#undef __perf_count
+#define __perf_count(c) (c)
+
+#undef __perf_task
+#define __perf_task(t) (t)
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@@ -636,16 +642,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
#define __get_str(field) (char *)__get_dynamic_array(field)
#undef __perf_addr
-#define __perf_addr(a) __addr = (a)
+#define __perf_addr(a) (__addr = (a))
#undef __perf_count
-#define __perf_count(c) __count = (c)
+#define __perf_count(c) (__count = (c))
#undef __perf_task
-#define __perf_task(t) __task = (t)
-
-#undef TP_perf_assign
-#define TP_perf_assign(args...) args
+#define __perf_task(t) (__task = (t))
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
--
1.5.5.1
* [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-08-05 16:51 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-08-05 16:51 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
make no sense if hlist_empty(head). Change perf_trace_##call()
to check ->perf_events beforehand and do nothing if it is empty.
This removes the overhead for tasks without events associated with
them. For example, "perf record -e sched:sched_switch -p1" attaches
the counter(s) to a single task, but every task in the system will
do perf_trace_buf_prepare/submit() just to realize that it was not
attached to this event.
However, we can only do this if __task == NULL, so we also add
the __builtin_constant_p(__task) check.
With this patch "perf bench sched pipe" shows approximately 4%
improvement when "perf record -p1" runs in parallel; many thanks
to Steven for the testing.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
include/trace/ftrace.h | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 4163d93..5c7ab17 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -667,6 +667,12 @@ perf_trace_##call(void *__data, proto) \
int rctx; \
\
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
+ \
+ head = this_cpu_ptr(event_call->perf_events); \
+ if (__builtin_constant_p(!__task) && !__task && \
+ hlist_empty(head)) \
+ return; \
+ \
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
sizeof(u64)); \
__entry_size -= sizeof(u32); \
@@ -681,7 +687,6 @@ perf_trace_##call(void *__data, proto) \
\
{ assign; } \
\
- head = this_cpu_ptr(event_call->perf_events); \
perf_trace_buf_submit(entry, __entry_size, rctx, __addr, \
__count, &__regs, head, __task); \
}
--
1.5.5.1
* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
2013-08-05 16:50 ` [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime) Oleg Nesterov
@ 2013-08-05 18:08 ` Steven Rostedt
From: Steven Rostedt @ 2013-08-05 18:08 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Tested-by: David Ahern <dsahern@gmail.com>
> Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
Just so you know. The standard that we now want to live by is only one
tag per line. I know I gave you that tag in my email, but when adding it
to a patch it needs to be:
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
both, it needs to be like that.
-- Steve
* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
2013-08-05 18:08 ` Steven Rostedt
@ 2013-08-05 18:51 ` Oleg Nesterov
From: Oleg Nesterov @ 2013-08-05 18:51 UTC (permalink / raw)
To: Steven Rostedt
Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
On 08/05, Steven Rostedt wrote:
>
> On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
>
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > Tested-by: David Ahern <dsahern@gmail.com>
> > Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
>
> Just so you know. The standard that we now want to live by is only one
> tag per line. I know I gave you that tag in my email, but when adding it
> to a patch it needs to be:
>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
> Acked-by: Steven Rostedt <rostedt@goodmis.org>
>
> I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
> both, it needs to be like that.
Sorry... should I resend once again ?
Oleg.
* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
2013-08-05 18:51 ` Oleg Nesterov
@ 2013-08-05 19:07 ` Steven Rostedt
From: Steven Rostedt @ 2013-08-05 19:07 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
Masami Hiramatsu, zhangwei(Jovi), linux-kernel
On Mon, 2013-08-05 at 20:51 +0200, Oleg Nesterov wrote:
> On 08/05, Steven Rostedt wrote:
> >
> > On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
> >
> > > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > > Tested-by: David Ahern <dsahern@gmail.com>
> > > Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
> >
> > Just so you know. The standard that we now want to live by is only one
> > tag per line. I know I gave you that tag in my email, but when adding it
> > to a patch it needs to be:
> >
> > Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
> > Acked-by: Steven Rostedt <rostedt@goodmis.org>
> >
> > I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
> > both, it needs to be like that.
>
> Sorry... should I resend once again ?
>
Sure, why not. It's only 3 patches :-)
-- Steve