* [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
@ 2015-11-25 15:45 Jiri Olsa
2015-11-25 15:45 ` [PATCH 2/2] ftrace: Check sample types only for sampling events Jiri Olsa
2015-11-25 16:19 ` [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Steven Rostedt
0 siblings, 2 replies; 7+ messages in thread
From: Jiri Olsa @ 2015-11-25 15:45 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
Having following commands running concurently:
# perf record -e ftrace:function -a -o krava.data sleep 10
# perf record -e ftrace:function --filter 'ip == SyS_read' ls
will endup in the latter one failing on the filter
rules and store all functions (in perf.data) as
instructed by the first record instead of just
SyS_read records.
The reason is that we don't check the ftrace_ops that
triggered the event with event's ftrace_ops. Hence
once running together the event from latter perf will
get all the data of the event from the first one.
Fixing this by having separate handler for ftrace:function
event that actualy checks ftrace_ops against event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 26 ++++++++++++++++++++++++++
kernel/trace/trace_event_perf.c | 3 +--
3 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f9828a48f16a..be1ad1ef47f3 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1010,6 +1010,8 @@ extern void perf_tp_event(u64 addr, u64 count, void *record,
struct hlist_head *head, int rctx,
struct task_struct *task);
extern void perf_bp_event(struct perf_event *event, void *data);
+void perf_ftrace_ops_event(struct ftrace_ops *ops, void *record, int entry_size,
+ struct pt_regs *regs, struct hlist_head *head, int rctx);
#ifndef perf_misc_flags
# define perf_misc_flags(regs) \
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5854fcf7f05a..1ec33bebd39a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -44,6 +44,7 @@
#include <linux/compat.h>
#include <linux/bpf.h>
#include <linux/filter.h>
+#include <linux/ftrace.h>
#include "internal.h"
@@ -6950,6 +6951,31 @@ static int perf_tp_event_match(struct perf_event *event,
return 1;
}
+#ifdef CONFIG_FUNCTION_TRACER
+void perf_ftrace_ops_event(struct ftrace_ops *ops, void *record, int entry_size,
+ struct pt_regs *regs, struct hlist_head *head, int rctx)
+{
+ struct perf_sample_data data;
+ struct perf_event *event;
+
+ struct perf_raw_record raw = {
+ .size = entry_size,
+ .data = record,
+ };
+
+ perf_sample_data_init(&data, 0, 0);
+ data.raw = &raw;
+
+ hlist_for_each_entry_rcu(event, head, hlist_entry) {
+ if ((&event->ftrace_ops == ops) &&
+ (perf_tp_event_match(event, &data, regs)))
+ perf_swevent_event(event, 1, &data, regs);
+ }
+
+ perf_swevent_put_recursion_context(rctx);
+}
+#endif
+
void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
struct pt_regs *regs, struct hlist_head *head, int rctx,
struct task_struct *task)
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index cc9f7a9319be..fa1f12ef3ba3 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -324,8 +324,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
entry->ip = ip;
entry->parent_ip = parent_ip;
- perf_trace_buf_submit(entry, ENTRY_SIZE, rctx, 0,
- 1, ®s, head, NULL);
+ perf_ftrace_ops_event(ops, entry, ENTRY_SIZE, ®s, head, rctx);
#undef ENTRY_SIZE
}
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] ftrace: Check sample types only for sampling events
2015-11-25 15:45 [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Jiri Olsa
@ 2015-11-25 15:45 ` Jiri Olsa
2015-11-25 16:19 ` [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Steven Rostedt
1 sibling, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2015-11-25 15:45 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
Currently we check sample type for ftrace:function event
even if it's not created as sampling event. That prevents
creating ftrace_function event in counting mode.
Making sure we check sample types only for sampling events.
Before:
$ sudo perf stat -e ftrace:function ls
...
Performance counter stats for 'ls':
<not supported> ftrace:function
0.001983662 seconds time elapsed
After:
$ sudo perf stat -e ftrace:function ls
...
Performance counter stats for 'ls':
44,498 ftrace:function
0.037534722 seconds time elapsed
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
kernel/trace/trace_event_perf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index fa1f12ef3ba3..8801b677e668 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -52,14 +52,14 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
* event, due to issues with page faults while tracing page
* fault handler and its overall trickiness nature.
*/
- if (!p_event->attr.exclude_callchain_user)
+ if (is_sampling_event(p_event) && !p_event->attr.exclude_callchain_user)
return -EINVAL;
/*
* Same reason to disable user stack dump as for user space
* callchains above.
*/
- if (p_event->attr.sample_type & PERF_SAMPLE_STACK_USER)
+ if (is_sampling_event(p_event) && p_event->attr.sample_type & PERF_SAMPLE_STACK_USER)
return -EINVAL;
}
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
2015-11-25 15:45 [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Jiri Olsa
2015-11-25 15:45 ` [PATCH 2/2] ftrace: Check sample types only for sampling events Jiri Olsa
@ 2015-11-25 16:19 ` Steven Rostedt
2015-11-25 16:50 ` Jiri Olsa
1 sibling, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2015-11-25 16:19 UTC (permalink / raw)
To: Jiri Olsa
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
On Wed, 25 Nov 2015 16:45:32 +0100
Jiri Olsa <jolsa@kernel.org> wrote:
> Having following commands running concurently:
>
> # perf record -e ftrace:function -a -o krava.data sleep 10
> # perf record -e ftrace:function --filter 'ip == SyS_read' ls
>
> will endup in the latter one failing on the filter
> rules and store all functions (in perf.data) as
> instructed by the first record instead of just
> SyS_read records.
>
> The reason is that we don't check the ftrace_ops that
> triggered the event with event's ftrace_ops. Hence
> once running together the event from latter perf will
> get all the data of the event from the first one.
>
> Fixing this by having separate handler for ftrace:function
> event that actualy checks ftrace_ops against event.
This seems redundant. I never understood the control_ops that perf uses
in the function tracing infrastructure. Why can't you just register the
event->ops and have that ops set the filtering? Then the ftrace
infrastructure will only call that event handler for the functions its
filtered on. Then you don't need to do it again. Right now ftrace
already does that with the generic "control_ops" that perf uses, but
now you are doing it again. Seems rather pointless.
I may need to spend some time looking at how perf uses the function
tracing, and perhaps we can remove a lot of this redundancy.
-- Steve
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
2015-11-25 16:19 ` [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Steven Rostedt
@ 2015-11-25 16:50 ` Jiri Olsa
2015-11-25 17:02 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2015-11-25 16:50 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
Peter Zijlstra, Arnaldo Carvalho de Melo
On Wed, Nov 25, 2015 at 11:19:56AM -0500, Steven Rostedt wrote:
> On Wed, 25 Nov 2015 16:45:32 +0100
> Jiri Olsa <jolsa@kernel.org> wrote:
>
> > Having following commands running concurently:
> >
> > # perf record -e ftrace:function -a -o krava.data sleep 10
> > # perf record -e ftrace:function --filter 'ip == SyS_read' ls
> >
> > will endup in the latter one failing on the filter
> > rules and store all functions (in perf.data) as
> > instructed by the first record instead of just
> > SyS_read records.
> >
> > The reason is that we don't check the ftrace_ops that
> > triggered the event with event's ftrace_ops. Hence
> > once running together the event from latter perf will
> > get all the data of the event from the first one.
> >
> > Fixing this by having separate handler for ftrace:function
> > event that actualy checks ftrace_ops against event.
>
> This seems redundant. I never understood the control_ops that perf uses
> in the function tracing infrastructure. Why can't you just register the
> event->ops and have that ops set the filtering? Then the ftrace
> infrastructure will only call that event handler for the functions its
> filtered on. Then you don't need to do it again. Right now ftrace
> already does that with the generic "control_ops" that perf uses, but
> now you are doing it again. Seems rather pointless.
well thats exactly what we are doing.. but as all ops
share single callback we need to find the proper event
this callback was triggered for
currently we use tracepoint callback (perf_tp_event)
where the proper event is found based on the event->filter
however this is not the case for ftrace:function because
filter will not change the event->filter, but the ops filter
> I may need to spend some time looking at how perf uses the function
> tracing, and perhaps we can remove a lot of this redundancy.
that'd be cool
thanks,
jirka
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
2015-11-25 16:50 ` Jiri Olsa
@ 2015-11-25 17:02 ` Steven Rostedt
2015-11-25 17:12 ` Jiri Olsa
0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2015-11-25 17:02 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
Peter Zijlstra, Arnaldo Carvalho de Melo
On Wed, 25 Nov 2015 17:50:57 +0100
Jiri Olsa <jolsa@redhat.com> wrote:
> > This seems redundant. I never understood the control_ops that perf uses
> > in the function tracing infrastructure. Why can't you just register the
> > event->ops and have that ops set the filtering? Then the ftrace
> > infrastructure will only call that event handler for the functions its
> > filtered on. Then you don't need to do it again. Right now ftrace
> > already does that with the generic "control_ops" that perf uses, but
> > now you are doing it again. Seems rather pointless.
>
> well thats exactly what we are doing.. but as all ops
> share single callback we need to find the proper event
> this callback was triggered for
The ftrace_ops has a "private" field for the user to set. Could you
make that point back to the event that allocated the ftrace_ops?
Then the callback function could easily get the event that matches the
ftrace_ops.
>
> currently we use tracepoint callback (perf_tp_event)
> where the proper event is found based on the event->filter
>
> however this is not the case for ftrace:function because
> filter will not change the event->filter, but the ops filter
Yeah, I'm trying to figure out the paths here. I would love to remove
the control_ops as that complicates the function tracing code a bit
more than I would like it to be.
I just crashed function tracing by function tracing perf doing function
tracing :-) I'm currently debugging that (and adding more code to help
debug things like this).
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
2015-11-25 17:02 ` Steven Rostedt
@ 2015-11-25 17:12 ` Jiri Olsa
2015-11-25 17:42 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2015-11-25 17:12 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
Peter Zijlstra, Arnaldo Carvalho de Melo
On Wed, Nov 25, 2015 at 12:02:57PM -0500, Steven Rostedt wrote:
> On Wed, 25 Nov 2015 17:50:57 +0100
> Jiri Olsa <jolsa@redhat.com> wrote:
>
> > > This seems redundant. I never understood the control_ops that perf uses
> > > in the function tracing infrastructure. Why can't you just register the
> > > event->ops and have that ops set the filtering? Then the ftrace
> > > infrastructure will only call that event handler for the functions its
> > > filtered on. Then you don't need to do it again. Right now ftrace
> > > already does that with the generic "control_ops" that perf uses, but
> > > now you are doing it again. Seems rather pointless.
> >
> > well thats exactly what we are doing.. but as all ops
> > share single callback we need to find the proper event
> > this callback was triggered for
>
> The ftrace_ops has a "private" field for the user to set. Could you
> make that point back to the event that allocated the ftrace_ops?
> Then the callback function could easily get the event that matches the
> ftrace_ops.
aaah did not see this one ;-) that will do.. nice
>
> >
> > currently we use tracepoint callback (perf_tp_event)
> > where the proper event is found based on the event->filter
> >
> > however this is not the case for ftrace:function because
> > filter will not change the event->filter, but the ops filter
>
> Yeah, I'm trying to figure out the paths here. I would love to remove
> the control_ops as that complicates the function tracing code a bit
> more than I would like it to be.
perf is using the control interface to enable and disable function
tracing any time the process (and its events) is scheduled in/out
sched_in(event) -> ftrace_function_local_enable(&event->ftrace_ops)
sched_out(event) -> ftrace_function_local_disable(&event->ftrace_ops)
>
> I just crashed function tracing by function tracing perf doing function
> tracing :-) I'm currently debugging that (and adding more code to help
> debug things like this).
;-)
jirka
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] ftrace: Add separate handler for ftrace:function event
2015-11-25 17:12 ` Jiri Olsa
@ 2015-11-25 17:42 ` Steven Rostedt
0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2015-11-25 17:42 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
Peter Zijlstra, Arnaldo Carvalho de Melo
On Wed, 25 Nov 2015 18:12:43 +0100
Jiri Olsa <jolsa@redhat.com> wrote:
> > Yeah, I'm trying to figure out the paths here. I would love to remove
> > the control_ops as that complicates the function tracing code a bit
> > more than I would like it to be.
>
> perf is using the control interface to enable and disable function
> tracing any time the process (and its events) is scheduled in/out
>
> sched_in(event) -> ftrace_function_local_enable(&event->ftrace_ops)
> sched_out(event) -> ftrace_function_local_disable(&event->ftrace_ops)
OK, I think we can merge the control list to the normal list and have
something like:
(!op->disabled || !ftrace_function_local_disabled(op))
We can also keep the control flag for the ops, and have:
!(op->flags & FTRACE_OPS_FL_CONTROL) || rcu_is_watching()
test too, and use the ftrace ops list code.
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-11-25 17:42 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-25 15:45 [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Jiri Olsa
2015-11-25 15:45 ` [PATCH 2/2] ftrace: Check sample types only for sampling events Jiri Olsa
2015-11-25 16:19 ` [PATCH 1/2] ftrace: Add separate handler for ftrace:function event Steven Rostedt
2015-11-25 16:50 ` Jiri Olsa
2015-11-25 17:02 ` Steven Rostedt
2015-11-25 17:12 ` Jiri Olsa
2015-11-25 17:42 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox