* [PATCHv2 0/3] perf ftrace: Several perf ftrace:function event fixes
@ 2015-11-26 19:34 Jiri Olsa
2015-11-26 19:34 ` [PATCH 1/3] ftrace: Check sample types only for sampling events Jiri Olsa
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jiri Olsa @ 2015-11-26 19:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
hi,
sending few fixes for perf ftrace:function event handling.
It's v2 for following original post:
http://marc.info/?l=linux-kernel&m=144846634815636&w=2
v2 changes:
- using ftrace_ops::private to store event pointer
- moving attr::exclude_kernel check into event init processing
thanks,
jirka
---
Jiri Olsa (3):
ftrace: Check sample types only for sampling events
perf: Move exclude_kernel tracepoint check to init event
perf ftrace: Use ftrace_ops::private to store event pointer
include/linux/perf_event.h | 3 +++
kernel/events/core.c | 30 +++++++++++++++++++++++++-----
kernel/trace/trace_event_perf.c | 28 +++++++++++++++++++---------
3 files changed, 47 insertions(+), 14 deletions(-)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] ftrace: Check sample types only for sampling events
2015-11-26 19:34 [PATCHv2 0/3] perf ftrace: Several perf ftrace:function event fixes Jiri Olsa
@ 2015-11-26 19:34 ` Jiri Olsa
2015-11-26 19:34 ` [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event Jiri Olsa
2015-11-26 19:34 ` [PATCH 3/3] perf ftrace: Use ftrace_ops::private to store event pointer Jiri Olsa
2 siblings, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2015-11-26 19:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
Currently we check sample type for ftrace:function event
even if it's not created as sampling event. That prevents
creating ftrace_function event in counting mode.
Making sure we check sample types only for sampling events.
Before:
$ sudo perf stat -e ftrace:function ls
...
Performance counter stats for 'ls':
<not supported> ftrace:function
0.001983662 seconds time elapsed
After:
$ sudo perf stat -e ftrace:function ls
...
Performance counter stats for 'ls':
44,498 ftrace:function
0.037534722 seconds time elapsed
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
kernel/trace/trace_event_perf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index cc9f7a9319be..c41463af3298 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -52,14 +52,14 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
* event, due to issues with page faults while tracing page
* fault handler and its overall trickiness nature.
*/
- if (!p_event->attr.exclude_callchain_user)
+ if (is_sampling_event(p_event) && !p_event->attr.exclude_callchain_user)
return -EINVAL;
/*
* Same reason to disable user stack dump as for user space
* callchains above.
*/
- if (p_event->attr.sample_type & PERF_SAMPLE_STACK_USER)
+ if (is_sampling_event(p_event) && p_event->attr.sample_type & PERF_SAMPLE_STACK_USER)
return -EINVAL;
}
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event
2015-11-26 19:34 [PATCHv2 0/3] perf ftrace: Several perf ftrace:function event fixes Jiri Olsa
2015-11-26 19:34 ` [PATCH 1/3] ftrace: Check sample types only for sampling events Jiri Olsa
@ 2015-11-26 19:34 ` Jiri Olsa
2015-11-26 19:41 ` Arnaldo Carvalho de Melo
2015-11-26 19:34 ` [PATCH 3/3] perf ftrace: Use ftrace_ops::private to store event pointer Jiri Olsa
2 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2015-11-26 19:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
We suppress events with attr::exclude_kernel set when
the event is generated, so following capture will
give no warning and produce no data:
$ sudo perf record -e sched:sched_switch:u ls
$ sudo /perf script | wc -l
0
Checking the attr::exclude_kernel at the event init
time and failing right away:
$ sudo perf record -e sched:sched_switch:u ls
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (sched:sched_switch).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
kernel/events/core.c | 5 -----
kernel/trace/trace_event_perf.c | 15 +++++++++++++++
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5854fcf7f05a..9fb9d5ef6825 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6938,11 +6938,6 @@ static int perf_tp_event_match(struct perf_event *event,
{
if (event->hw.state & PERF_HES_STOPPED)
return 0;
- /*
- * All tracepoints are from kernel-space.
- */
- if (event->attr.exclude_kernel)
- return 0;
if (!perf_tp_filter_match(event, data))
return 0;
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index c41463af3298..637268855296 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -182,11 +182,26 @@ static void perf_trace_event_close(struct perf_event *p_event)
tp_event->class->reg(tp_event, TRACE_REG_PERF_CLOSE, p_event);
}
+static int perf_trace_event_attr(struct perf_event *event)
+{
+ /*
+ * All tracepoints are from kernel-space.
+ */
+ if (event->attr.exclude_kernel)
+ return -EINVAL;
+
+ return 0;
+}
+
static int perf_trace_event_init(struct trace_event_call *tp_event,
struct perf_event *p_event)
{
int ret;
+ ret = perf_trace_event_attr(p_event);
+ if (ret)
+ return ret;
+
ret = perf_trace_event_perm(tp_event, p_event);
if (ret)
return ret;
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] perf ftrace: Use ftrace_ops::private to store event pointer
2015-11-26 19:34 [PATCHv2 0/3] perf ftrace: Several perf ftrace:function event fixes Jiri Olsa
2015-11-26 19:34 ` [PATCH 1/3] ftrace: Check sample types only for sampling events Jiri Olsa
2015-11-26 19:34 ` [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event Jiri Olsa
@ 2015-11-26 19:34 ` Jiri Olsa
2 siblings, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2015-11-26 19:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
Having following commands running concurently:
# perf record -e ftrace:function -a -o krava.data sleep 10
# perf record -e ftrace:function --filter 'ip == SyS_read' ls
will endup in the latter one failing on the filter
rules and store all functions (in perf.data) as
instructed by the first record instead of just
SyS_read records.
The reason is that we don't check the ftrace_ops that
triggered the event with event's ftrace_ops. Hence
once running together the event from latter perf will
get all the data of the event from the first one.
Fixing this by using ftrace_ops::private value to keep
the perf_event pointer. This way we don't need to search
for triggered event (as tracepoint handler does) and
directly store sample.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
include/linux/perf_event.h | 3 +++
kernel/events/core.c | 25 +++++++++++++++++++++++++
kernel/trace/trace_event_perf.c | 9 ++-------
3 files changed, 30 insertions(+), 7 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f9828a48f16a..e438f909df31 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1009,6 +1009,9 @@ extern void perf_tp_event(u64 addr, u64 count, void *record,
int entry_size, struct pt_regs *regs,
struct hlist_head *head, int rctx,
struct task_struct *task);
+void perf_function_event(struct perf_event *event,
+ void *record, int entry_size,
+ struct pt_regs *regs, int rctx);
extern void perf_bp_event(struct perf_event *event, void *data);
#ifndef perf_misc_flags
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9fb9d5ef6825..dc98f12c55e2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7103,6 +7103,31 @@ static void perf_event_free_bpf_prog(struct perf_event *event)
}
}
+#ifdef CONFIG_FUNCTION_TRACER
+void perf_function_event(struct perf_event *event,
+ void *record, int entry_size,
+ struct pt_regs *regs, int rctx)
+
+{
+ struct perf_sample_data data;
+ struct perf_raw_record raw = {
+ .size = entry_size,
+ .data = record,
+ };
+
+ if (event->hw.state & PERF_HES_STOPPED)
+ goto out;
+
+ perf_sample_data_init(&data, 0, 0);
+ data.raw = &raw;
+
+ perf_swevent_event(event, 1, &data, regs);
+
+out:
+ perf_swevent_put_recursion_context(rctx);
+}
+#endif /* CONFIG_FUNCTION_TRACER */
+
#else
static inline void perf_tp_register(void)
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 637268855296..396a39988bde 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -318,14 +318,9 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *ops, struct pt_regs *pt_regs)
{
struct ftrace_entry *entry;
- struct hlist_head *head;
struct pt_regs regs;
int rctx;
- head = this_cpu_ptr(event_function.perf_events);
- if (hlist_empty(head))
- return;
-
#define ENTRY_SIZE (ALIGN(sizeof(struct ftrace_entry) + sizeof(u32), \
sizeof(u64)) - sizeof(u32))
@@ -339,8 +334,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
entry->ip = ip;
entry->parent_ip = parent_ip;
- perf_trace_buf_submit(entry, ENTRY_SIZE, rctx, 0,
- 1, ®s, head, NULL);
+ perf_function_event(ops->private, entry, ENTRY_SIZE, ®s, rctx);
#undef ENTRY_SIZE
}
@@ -349,6 +343,7 @@ static int perf_ftrace_function_register(struct perf_event *event)
{
struct ftrace_ops *ops = &event->ftrace_ops;
+ ops->private = event;
ops->flags |= FTRACE_OPS_FL_CONTROL;
ops->func = perf_ftrace_function_call;
return register_ftrace_function(ops);
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event
2015-11-26 19:34 ` [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event Jiri Olsa
@ 2015-11-26 19:41 ` Arnaldo Carvalho de Melo
2015-11-26 19:49 ` Jiri Olsa
0 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-11-26 19:41 UTC (permalink / raw)
To: Jiri Olsa
Cc: Steven Rostedt, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
Peter Zijlstra
Em Thu, Nov 26, 2015 at 08:34:48PM +0100, Jiri Olsa escreveu:
> We suppress events with attr::exclude_kernel set when
> the event is generated, so following capture will
> give no warning and produce no data:
>
> $ sudo perf record -e sched:sched_switch:u ls
> $ sudo /perf script | wc -l
> 0
>
> Checking the attr::exclude_kernel at the event init
> time and failing right away:
We can as well provide a better warning in the tooling side and don't
even ask this nonsensical combo to the kernel, right?
Arguably the kernel was doing what was asked for, its just that no
sched:sched_switch took place while in user space... ;-)
- Arnaldo
> $ sudo perf record -e sched:sched_switch:u ls
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (sched:sched_switch).
> /bin/dmesg may provide additional information.
> No CONFIG_PERF_EVENTS=y kernel support configured?
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> kernel/events/core.c | 5 -----
> kernel/trace/trace_event_perf.c | 15 +++++++++++++++
> 2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5854fcf7f05a..9fb9d5ef6825 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6938,11 +6938,6 @@ static int perf_tp_event_match(struct perf_event *event,
> {
> if (event->hw.state & PERF_HES_STOPPED)
> return 0;
> - /*
> - * All tracepoints are from kernel-space.
> - */
> - if (event->attr.exclude_kernel)
> - return 0;
>
> if (!perf_tp_filter_match(event, data))
> return 0;
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index c41463af3298..637268855296 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -182,11 +182,26 @@ static void perf_trace_event_close(struct perf_event *p_event)
> tp_event->class->reg(tp_event, TRACE_REG_PERF_CLOSE, p_event);
> }
>
> +static int perf_trace_event_attr(struct perf_event *event)
> +{
> + /*
> + * All tracepoints are from kernel-space.
> + */
> + if (event->attr.exclude_kernel)
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> static int perf_trace_event_init(struct trace_event_call *tp_event,
> struct perf_event *p_event)
> {
> int ret;
>
> + ret = perf_trace_event_attr(p_event);
> + if (ret)
> + return ret;
> +
> ret = perf_trace_event_perm(tp_event, p_event);
> if (ret)
> return ret;
> --
> 2.4.3
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event
2015-11-26 19:41 ` Arnaldo Carvalho de Melo
@ 2015-11-26 19:49 ` Jiri Olsa
2015-11-26 20:44 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 7+ messages in thread
From: Jiri Olsa @ 2015-11-26 19:49 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Steven Rostedt, lkml, David Ahern, Ingo Molnar,
Namhyung Kim, Peter Zijlstra
On Thu, Nov 26, 2015 at 04:41:06PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 26, 2015 at 08:34:48PM +0100, Jiri Olsa escreveu:
> > We suppress events with attr::exclude_kernel set when
> > the event is generated, so following capture will
> > give no warning and produce no data:
> >
> > $ sudo perf record -e sched:sched_switch:u ls
> > $ sudo /perf script | wc -l
> > 0
> >
> > Checking the attr::exclude_kernel at the event init
> > time and failing right away:
>
> We can as well provide a better warning in the tooling side and don't
> even ask this nonsensical combo to the kernel, right?
yep.. I'll update my todo ;-)
>
> Arguably the kernel was doing what was asked for, its just that no
> sched:sched_switch took place while in user space... ;-)
I think it's better to fail like that.. rather than spending
time on figuring why my test did not give me any data ;-)
jirka
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event
2015-11-26 19:49 ` Jiri Olsa
@ 2015-11-26 20:44 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-11-26 20:44 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jiri Olsa, Steven Rostedt, lkml, David Ahern, Ingo Molnar,
Namhyung Kim, Peter Zijlstra
Em Thu, Nov 26, 2015 at 08:49:10PM +0100, Jiri Olsa escreveu:
> On Thu, Nov 26, 2015 at 04:41:06PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Nov 26, 2015 at 08:34:48PM +0100, Jiri Olsa escreveu:
> > > We suppress events with attr::exclude_kernel set when
> > > the event is generated, so following capture will
> > > give no warning and produce no data:
> > >
> > > $ sudo perf record -e sched:sched_switch:u ls
> > > $ sudo /perf script | wc -l
> > > 0
> > >
> > > Checking the attr::exclude_kernel at the event init
> > > time and failing right away:
> >
> > We can as well provide a better warning in the tooling side and don't
> > even ask this nonsensical combo to the kernel, right?
>
> yep.. I'll update my todo ;-)
Thanks!
> > Arguably the kernel was doing what was asked for, its just that no
> > sched:sched_switch took place while in user space... ;-)
>
> I think it's better to fail like that.. rather than spending
> time on figuring why my test did not give me any data ;-)
Up to peterz ;-)
- Arnaldo
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-11-26 20:44 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-26 19:34 [PATCHv2 0/3] perf ftrace: Several perf ftrace:function event fixes Jiri Olsa
2015-11-26 19:34 ` [PATCH 1/3] ftrace: Check sample types only for sampling events Jiri Olsa
2015-11-26 19:34 ` [PATCH 2/3] perf: Move exclude_kernel tracepoint check to init event Jiri Olsa
2015-11-26 19:41 ` Arnaldo Carvalho de Melo
2015-11-26 19:49 ` Jiri Olsa
2015-11-26 20:44 ` Arnaldo Carvalho de Melo
2015-11-26 19:34 ` [PATCH 3/3] perf ftrace: Use ftrace_ops::private to store event pointer Jiri Olsa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox