public inbox for linux-kernel@vger.kernel.org
* [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-06-18 19:21 [PATCH 0/3] tracing/perf: perf_trace_buf/perf_xxx hacks Oleg Nesterov
@ 2013-06-18 19:22 ` Oleg Nesterov
  2013-06-18 20:02   ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-06-18 19:22 UTC (permalink / raw)
  To: Steven Rostedt, Peter Zijlstra
  Cc: Frederic Weisbecker, Ingo Molnar, Masami Hiramatsu,
	Srikar Dronamraju, zhangwei(Jovi), linux-kernel

perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
make no sense if hlist_empty(head). Change perf_trace_##call()
to check ->perf_events beforehand and do nothing if it is empty.

However, we can only do this if __task == NULL, so we also add
the __builtin_constant_p(__task) check.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/trace/ftrace.h |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 8886877..04455b8 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -663,6 +663,12 @@ perf_trace_##call(void *__data, proto)					\
 	int rctx;							\
 									\
 	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
+									\
+	head = this_cpu_ptr(event_call->perf_events);			\
+	if (__builtin_constant_p(!__task) && !__task &&			\
+				hlist_empty(head))			\
+		return;							\
+									\
 	__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 			     sizeof(u64));				\
 	__entry_size -= sizeof(u32);					\
@@ -677,7 +683,6 @@ perf_trace_##call(void *__data, proto)					\
 									\
 	{ assign; }							\
 									\
-	head = this_cpu_ptr(event_call->perf_events);			\
 	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
 		__count, &__regs, head, __task);			\
 }
-- 
1.5.5.1



* Re: [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-06-18 19:22 ` [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible Oleg Nesterov
@ 2013-06-18 20:02   ` Steven Rostedt
  2013-06-19 18:12     ` Oleg Nesterov
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2013-06-18 20:02 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu, Srikar Dronamraju, zhangwei(Jovi), linux-kernel

On Tue, 2013-06-18 at 21:22 +0200, Oleg Nesterov wrote:
> perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
> make no sense if hlist_empty(head). Change perf_trace_##call()
> to check ->perf_events beforehand and do nothing if it is empty.
> 
> However, we can only do this if __task == NULL, so we also add
> the __builtin_constant_p(__task) check.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  include/trace/ftrace.h |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 8886877..04455b8 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -663,6 +663,12 @@ perf_trace_##call(void *__data, proto)					\
>  	int rctx;							\
>  									\
>  	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
> +									\
> +	head = this_cpu_ptr(event_call->perf_events);			\
> +	if (__builtin_constant_p(!__task) && !__task &&			\


I'm trying to wrap my head around this:

  __builtin_constant_p(!task)

is this the same as:

  !__builtin_constant_p(task)

Or is it the same as:

  __builtin_constant_p(task)

?

Because that '!' is confusing the heck out of me.

If !task is a constant, wouldn't task be a constant too, and if task is
not a constant then I would also assume !task is not a constant as well.

If this is the case, can we nuke the '!' from the __builtin_constant_p()?
The code is confusing enough as is. Or is it that the code is very
confusing and in keeping with the coding style, you are trying to come
up with new ways of adding to the confusion.

Or is this your way to confuse me as much as my code has confused
you? ;-)

-- Steve

> +				hlist_empty(head))			\
> +		return;							\
> +									\
>  	__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
>  			     sizeof(u64));				\
>  	__entry_size -= sizeof(u32);					\
> @@ -677,7 +683,6 @@ perf_trace_##call(void *__data, proto)					\
>  									\
>  	{ assign; }							\
>  									\
> -	head = this_cpu_ptr(event_call->perf_events);			\
>  	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
>  		__count, &__regs, head, __task);			\
>  }




* Re: [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-06-18 20:02   ` Steven Rostedt
@ 2013-06-19 18:12     ` Oleg Nesterov
  2013-06-19 18:24       ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-06-19 18:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu, Srikar Dronamraju, zhangwei(Jovi), linux-kernel

On 06/18, Steven Rostedt wrote:
>
> On Tue, 2013-06-18 at 21:22 +0200, Oleg Nesterov wrote:
> > @@ -663,6 +663,12 @@ perf_trace_##call(void *__data, proto)					\
> >  	int rctx;							\
> >  									\
> >  	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
> > +									\
> > +	head = this_cpu_ptr(event_call->perf_events);			\
> > +	if (__builtin_constant_p(!__task) && !__task &&			\
>
>
> I'm trying to wrap my head around this:
>
>   __builtin_constant_p(!task)
>
> is this the same as:
>
>   !__builtin_constant_p(task)
>
> Or is it the same as:
>
>   __builtin_constant_p(task)
>
> ?
>
> Because that '!' is confusing the heck out of me.
>
> If !task is a constant, wouldn't task be a constant too, and if task is
> not a constant then I would also assume !task is not a constant as well.

!__task looks more explicit/symmetrical to me. We need

	if (is_compile_time_true(!__task) && hlist_empty(head))
		return;

is_compile_time_true(cond) could be defined as

	__builtin_constant_p(cond) && (cond)
or
	__builtin_constant_p(!cond) && (cond)

but the first one looks cleaner.

However,

> If this is the case, can we nuke the '!' from the __builtin_constant_p()?

OK, I do not really mind, will do.

And,

> Or is this your way to confuse me as much as my code has confused
> you? ;-)

Of course! That was the main reason.


Steven, I convinced myself the patch should be correct. If you agree with
this hack:

	- anything else I should do apart from the change above?

	- should I resend the previous "[PATCH 0/3] tracing: more
	  list_empty(perf_events) checks" series?

	  This series depends on "[PATCH 3/3] tracing/perf: Move the
	  PERF_MAX_TRACE_SIZE check into perf_trace_buf_prepare()".

	  Or I can drop this patch if you do not like it and rediff.

	  Just in case, there are other pending patches in trace_kprobe.c
	  which I am going to resend, but they are orthogonal.

Oleg.



* Re: [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-06-19 18:12     ` Oleg Nesterov
@ 2013-06-19 18:24       ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2013-06-19 18:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu, Srikar Dronamraju, zhangwei(Jovi), linux-kernel

On Wed, 2013-06-19 at 20:12 +0200, Oleg Nesterov wrote:

> > Or is this your way to confuse me as much as my code has confused
> > you? ;-)
> 
> Of course! this was the main reason.

I knew it!

> 
> 
> Steven, I convinced myself the patch should be correct. If you agree with
> this hack:
> 
> 	- anything else I should do apart from the change above?
> 
> 	- should I resend the previous "[PATCH 0/3] tracing: more
> 	  list_empty(perf_events) checks" series?
> 
> 	  This series depends on "[PATCH 3/3] tracing/perf: Move the
> 	  PERF_MAX_TRACE_SIZE check into perf_trace_buf_prepare()".
> 
> 	  Or I can drop this patch if you do not like it and rediff.
> 
> 	  Just in case, there are other pending patches in trace_kprobe.c
> 	  which I am going to resend, but they are orthogonal.

I'll pull in the patches and play with them. I'll let you know what I
find.

Thanks,


-- Steve




* [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-07-18 18:30 [PATCH RESEND 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-07-18 18:30 ` Oleg Nesterov
  0 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2013-07-18 18:30 UTC (permalink / raw)
  To: Frederic Weisbecker, Peter Zijlstra, Steven Rostedt
  Cc: David Ahern, Ingo Molnar, Masami Hiramatsu, zhangwei(Jovi),
	linux-kernel

perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
make no sense if hlist_empty(head). Change perf_trace_##call()
to check ->perf_events beforehand and do nothing if it is empty.

This removes the overhead for tasks without events associated
with them. For example, "perf record -e sched:sched_switch -p1"
attaches the counter(s) to a single task, but every task in the
system will do perf_trace_buf_prepare/submit() just to realize
that it was not attached to this event.

However, we can only do this if __task == NULL, so we also add
the __builtin_constant_p(__task) check.

With this patch "perf bench sched pipe" shows approximately 4%
improvement when "perf record -p1" runs in parallel, many thanks
to Steven for the testing.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/trace/ftrace.h |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 8886877..04455b8 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -663,6 +663,12 @@ perf_trace_##call(void *__data, proto)					\
 	int rctx;							\
 									\
 	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
+									\
+	head = this_cpu_ptr(event_call->perf_events);			\
+	if (__builtin_constant_p(!__task) && !__task &&			\
+				hlist_empty(head))			\
+		return;							\
+									\
 	__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 			     sizeof(u64));				\
 	__entry_size -= sizeof(u32);					\
@@ -677,7 +683,6 @@ perf_trace_##call(void *__data, proto)					\
 									\
 	{ assign; }							\
 									\
-	head = this_cpu_ptr(event_call->perf_events);			\
 	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
 		__count, &__regs, head, __task);			\
 }
-- 
1.5.5.1



* [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events)
@ 2013-08-05 16:50 Oleg Nesterov
  2013-08-05 16:50 ` [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime) Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

Sorry for double post, forgot to cc lkml...

On 07/19, Ingo Molnar wrote:
>
> * Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Hello.
> >
> > The patches are the same, I only tried to update the changelogs a bit.
> > I am also quoting my old email below, to explain what this hack tries
> > to do.
> >
> > 	Say, "perf record -e sched:sched_switch -p1".
> >
> > 	Every task except /sbin/init will do perf_trace_sched_switch() and
> > 	perf_trace_buf_prepare() + perf_trace_buf_submit() for no reason;
> > 	it doesn't have a counter.
> >
> > 	So it makes sense to add the fast-path check at the start of
> > 	perf_trace_##call(),
> >
> > 		if (hlist_empty(event_call->perf_events))
> > 			return;
> >
> > 	The problem is, we should not do this if __task != NULL (iow, if
> > 	DECLARE_EVENT_CLASS() uses __perf_task()), perf_tp_event() has the
> > 	additional code for this case.
> >
> > 	So we should do
> >
> > 		if (!__task && hlist_empty(event_call->perf_events))
> > 			return;
> >
> > 	But __task is changed by "{ assign; }" block right before
> > 	perf_trace_buf_submit(). Too late for the fast-path check,
> > 	we already called perf_trace_buf_prepare/fetch_regs.
> >
> > 	So. After 2/3 __perf_task() (and __perf_count/addr) is called
> > 	when ftrace_get_offsets_##call(args) evaluates the arguments,
> > 	and we can check !__task && hlist_empty() right after that.
> >
> > Oleg.
>
> Nice improvement.
>
> Peter, Steve, any objections?

Ingo,

It seems that everybody agrees with this hack, but it was forgotten,
let me resend it again.

The only change is that I added the following tags:

	Tested-by: David Ahern <dsahern@gmail.com>
	Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>

Oleg.

 include/trace/events/sched.h |   22 ++++++++--------------
 include/trace/ftrace.h       |   33 ++++++++++++++++++++-------------
 2 files changed, 28 insertions(+), 27 deletions(-)



* [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
  2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
@ 2013-08-05 16:50 ` Oleg Nesterov
  2013-08-05 18:08   ` Steven Rostedt
  2013-08-05 16:50 ` [PATCH 2/3] tracing/perf: Reimplement TP_perf_assign() logic Oleg Nesterov
  2013-08-05 16:51 ` [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible Oleg Nesterov
  2 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

To simplify the review of the next patches:

1. We are going to reimplement __perf_task/counter and embed them
   into TP_ARGS(). Expand TRACE_EVENT(sched_stat_runtime) into
   DECLARE_EVENT_CLASS() + DEFINE_EVENT(), this way they can use
   different TP_ARGS's.

2. Change perf_trace_##call() macro to do perf_fetch_caller_regs()
   right before perf_trace_buf_prepare().

   This way it evaluates TP_ARGS() asap, the next patch explores
   this fact.

   Note: after 87f44bbc perf_trace_buf_prepare() doesn't need
   "struct pt_regs *regs", perhaps it makes sense to remove this
   argument. And perhaps we can teach perf_trace_buf_submit()
   to accept regs == NULL and do fetch_caller_regs(CALLER_ADDR1)
   in this case.

3. Cosmetic, but the typecast from "void*" buys nothing. It just
   adds noise; remove it.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/sched.h |    6 +++++-
 include/trace/ftrace.h       |    7 +++----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index e5586ca..249c024 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -372,7 +372,7 @@ DEFINE_EVENT(sched_stat_template, sched_stat_blocked,
  * Tracepoint for accounting runtime (time the task is executing
  * on a CPU).
  */
-TRACE_EVENT(sched_stat_runtime,
+DECLARE_EVENT_CLASS(sched_stat_runtime,
 
 	TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
 
@@ -401,6 +401,10 @@ TRACE_EVENT(sched_stat_runtime,
 			(unsigned long long)__entry->vruntime)
 );
 
+DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
+	     TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
+	     TP_ARGS(tsk, runtime, vruntime));
+
 /*
  * Tracepoint for showing priority inheritance modifying a tasks
  * priority.
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 41a6643..618af05 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -663,15 +663,14 @@ perf_trace_##call(void *__data, proto)					\
 	int __data_size;						\
 	int rctx;							\
 									\
-	perf_fetch_caller_regs(&__regs);				\
-									\
 	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
 	__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 			     sizeof(u64));				\
 	__entry_size -= sizeof(u32);					\
 									\
-	entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare(	\
-		__entry_size, event_call->event.type, &__regs, &rctx);	\
+	perf_fetch_caller_regs(&__regs);				\
+	entry = perf_trace_buf_prepare(__entry_size,			\
+			event_call->event.type, &__regs, &rctx);	\
 	if (!entry)							\
 		return;							\
 									\
-- 
1.5.5.1



* [PATCH 2/3] tracing/perf: Reimplement TP_perf_assign() logic
  2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
  2013-08-05 16:50 ` [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime) Oleg Nesterov
@ 2013-08-05 16:50 ` Oleg Nesterov
  2013-08-05 16:51 ` [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible Oleg Nesterov
  2 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2013-08-05 16:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

The next patch tries to avoid the costly perf_trace_buf_* calls
when possible but there is a problem. We can only do this if
__task == NULL, perf_tp_event(task != NULL) has the additional
code for this case.

Unfortunately, TP_perf_assign/__perf_xxx which changes the default
values of __count/__task variables for perf_trace_buf_submit() is
called "too late", after we already did perf_trace_buf_prepare(),
and the optimization above can't work.

So this patch simply embeds __perf_xxx() into TP_ARGS(), this way
DECLARE_EVENT_CLASS() can use the result of assignments hidden in
"args" right after ftrace_get_offsets_##call() which is mostly
trivial. This allows us to have the fast-path "__task == NULL"
check at the start; see the next patch.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/sched.h |   16 +++-------------
 include/trace/ftrace.h       |   19 +++++++++++--------
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 249c024..2e7d994 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 
 	TP_PROTO(struct task_struct *p, int success),
 
-	TP_ARGS(p, success),
+	TP_ARGS(__perf_task(p), success),
 
 	TP_STRUCT__entry(
 		__array(	char,	comm,	TASK_COMM_LEN	)
@@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 		__entry->prio		= p->prio;
 		__entry->success	= success;
 		__entry->target_cpu	= task_cpu(p);
-	)
-	TP_perf_assign(
-		__perf_task(p);
 	),
 
 	TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
 
 	TP_PROTO(struct task_struct *tsk, u64 delay),
 
-	TP_ARGS(tsk, delay),
+	TP_ARGS(__perf_task(tsk), __perf_count(delay)),
 
 	TP_STRUCT__entry(
 		__array( char,	comm,	TASK_COMM_LEN	)
@@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
 		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
 		__entry->pid	= tsk->pid;
 		__entry->delay	= delay;
-	)
-	TP_perf_assign(
-		__perf_count(delay);
-		__perf_task(tsk);
 	),
 
 	TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@@ -376,7 +369,7 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
 
 	TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
 
-	TP_ARGS(tsk, runtime, vruntime),
+	TP_ARGS(tsk, __perf_count(runtime), vruntime),
 
 	TP_STRUCT__entry(
 		__array( char,	comm,	TASK_COMM_LEN	)
@@ -390,9 +383,6 @@ DECLARE_EVENT_CLASS(sched_stat_runtime,
 		__entry->pid		= tsk->pid;
 		__entry->runtime	= runtime;
 		__entry->vruntime	= vruntime;
-	)
-	TP_perf_assign(
-		__perf_count(runtime);
 	),
 
 	TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 618af05..4163d93 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -507,8 +507,14 @@ static inline notrace int ftrace_get_offsets_##call(			\
 #undef TP_fast_assign
 #define TP_fast_assign(args...) args
 
-#undef TP_perf_assign
-#define TP_perf_assign(args...)
+#undef __perf_addr
+#define __perf_addr(a)	(a)
+
+#undef __perf_count
+#define __perf_count(c)	(c)
+
+#undef __perf_task
+#define __perf_task(t)	(t)
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
@@ -636,16 +642,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
 #define __get_str(field) (char *)__get_dynamic_array(field)
 
 #undef __perf_addr
-#define __perf_addr(a) __addr = (a)
+#define __perf_addr(a)	(__addr = (a))
 
 #undef __perf_count
-#define __perf_count(c) __count = (c)
+#define __perf_count(c)	(__count = (c))
 
 #undef __perf_task
-#define __perf_task(t) __task = (t)
-
-#undef TP_perf_assign
-#define TP_perf_assign(args...) args
+#define __perf_task(t)	(__task = (t))
 
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
-- 
1.5.5.1



* [PATCH 3/3] tracing/perf: Avoid perf_trace_buf_*() in perf_trace_##call() when possible
  2013-08-05 16:50 [PATCH 0/3] Teach perf_trace_##call() to check hlist_empty(perf_events) Oleg Nesterov
  2013-08-05 16:50 ` [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime) Oleg Nesterov
  2013-08-05 16:50 ` [PATCH 2/3] tracing/perf: Reimplement TP_perf_assign() logic Oleg Nesterov
@ 2013-08-05 16:51 ` Oleg Nesterov
  2 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2013-08-05 16:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL)
make no sense if hlist_empty(head). Change perf_trace_##call()
to check ->perf_events beforehand and do nothing if it is empty.

This removes the overhead for tasks without events associated
with them. For example, "perf record -e sched:sched_switch -p1"
attaches the counter(s) to a single task, but every task in the
system will do perf_trace_buf_prepare/submit() just to realize
that it was not attached to this event.

However, we can only do this if __task == NULL, so we also add
the __builtin_constant_p(__task) check.

With this patch "perf bench sched pipe" shows approximately 4%
improvement when "perf record -p1" runs in parallel, many thanks
to Steven for the testing.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/ftrace.h |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 4163d93..5c7ab17 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -667,6 +667,12 @@ perf_trace_##call(void *__data, proto)					\
 	int rctx;							\
 									\
 	__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
+									\
+	head = this_cpu_ptr(event_call->perf_events);			\
+	if (__builtin_constant_p(!__task) && !__task &&			\
+				hlist_empty(head))			\
+		return;							\
+									\
 	__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
 			     sizeof(u64));				\
 	__entry_size -= sizeof(u32);					\
@@ -681,7 +687,6 @@ perf_trace_##call(void *__data, proto)					\
 									\
 	{ assign; }							\
 									\
-	head = this_cpu_ptr(event_call->perf_events);			\
 	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
 		__count, &__regs, head, __task);			\
 }
-- 
1.5.5.1



* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
  2013-08-05 16:50 ` [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime) Oleg Nesterov
@ 2013-08-05 18:08   ` Steven Rostedt
  2013-08-05 18:51     ` Oleg Nesterov
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2013-08-05 18:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:

> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Tested-by: David Ahern <dsahern@gmail.com>
> Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>

Just so you know. The standard that we now want to live by is only one
tag per line. I know I gave you that tag in my email, but when adding it
to a patch it needs to be:

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>

I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
both, it needs to be like that.

-- Steve




* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
  2013-08-05 18:08   ` Steven Rostedt
@ 2013-08-05 18:51     ` Oleg Nesterov
  2013-08-05 19:07       ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2013-08-05 18:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

On 08/05, Steven Rostedt wrote:
>
> On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
>
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > Tested-by: David Ahern <dsahern@gmail.com>
> > Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
>
> Just so you know. The standard that we now want to live by is only one
> tag per line. I know I gave you that tag in my email, but when adding it
> to a patch it needs to be:
>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
> Acked-by: Steven Rostedt <rostedt@goodmis.org>
>
> I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
> both, it needs to be like that.

Sorry... should I resend once again ?

Oleg.



* Re: [PATCH 1/3] tracing/perf: Expand TRACE_EVENT(sched_stat_runtime)
  2013-08-05 18:51     ` Oleg Nesterov
@ 2013-08-05 19:07       ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2013-08-05 19:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Frederic Weisbecker, Peter Zijlstra, David Ahern,
	Masami Hiramatsu, zhangwei(Jovi), linux-kernel

On Mon, 2013-08-05 at 20:51 +0200, Oleg Nesterov wrote:
> On 08/05, Steven Rostedt wrote:
> >
> > On Mon, 2013-08-05 at 18:50 +0200, Oleg Nesterov wrote:
> >
> > > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > > Tested-by: David Ahern <dsahern@gmail.com>
> > > Reviewed-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
> >
> > Just so you know. The standard that we now want to live by is only one
> > tag per line. I know I gave you that tag in my email, but when adding it
> > to a patch it needs to be:
> >
> > Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
> > Acked-by: Steven Rostedt <rostedt@goodmis.org>
> >
> > I wonder if the Reviewed-by assumes the Acked-by? Anyway, if you add
> > both, it needs to be like that.
> 
> Sorry... should I resend once again ?
> 

Sure, why not. It's only 3 patches :-)

-- Steve



