* [PATCH 0/8] More perf patches
@ 2009-11-23 10:37 Peter Zijlstra
2009-11-23 10:37 ` [PATCH 1/8] perf: undo copy/paste damage Peter Zijlstra
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/8] perf: undo copy/paste damage
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:53 ` [tip:perf/core] perf_events: Undo " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 2/8] perf: style nits Peter Zijlstra
` (7 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-1.patch --]
[-- Type: text/plain, Size: 1894 bytes --]
We had two almost identical functions, avoid the duplication.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 31 +++++++++----------------------
1 file changed, 9 insertions(+), 22 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1704,16 +1704,10 @@ static void free_event(struct perf_event
call_rcu(&event->rcu_head, free_event_rcu);
}
-/*
- * Called when the last reference to the file is gone.
- */
-static int perf_release(struct inode *inode, struct file *file)
+int perf_event_release_kernel(struct perf_event *event)
{
- struct perf_event *event = file->private_data;
struct perf_event_context *ctx = event->ctx;
- file->private_data = NULL;
-
WARN_ON_ONCE(ctx->parent_ctx);
mutex_lock(&ctx->mutex);
perf_event_remove_from_context(event);
@@ -1728,26 +1722,19 @@ static int perf_release(struct inode *in
return 0;
}
+EXPORT_SYMBOL_GPL(perf_event_release_kernel);
-int perf_event_release_kernel(struct perf_event *event)
+/*
+ * Called when the last reference to the file is gone.
+ */
+static int perf_release(struct inode *inode, struct file *file)
{
- struct perf_event_context *ctx = event->ctx;
-
- WARN_ON_ONCE(ctx->parent_ctx);
- mutex_lock(&ctx->mutex);
- perf_event_remove_from_context(event);
- mutex_unlock(&ctx->mutex);
-
- mutex_lock(&event->owner->perf_event_mutex);
- list_del_init(&event->owner_entry);
- mutex_unlock(&event->owner->perf_event_mutex);
- put_task_struct(event->owner);
+ struct perf_event *event = file->private_data;
- free_event(event);
+ file->private_data = NULL;
- return 0;
+ return perf_event_release_kernel(event);
}
-EXPORT_SYMBOL_GPL(perf_event_release_kernel);
static int perf_event_read_size(struct perf_event *event)
{
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/8] perf: style nits
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
2009-11-23 10:37 ` [PATCH 1/8] perf: undo copy/paste damage Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Fix " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 3/8] perf: disable events when we detach them Peter Zijlstra
` (6 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-2.patch --]
[-- Type: text/plain, Size: 905 bytes --]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -449,9 +450,8 @@ retry:
* can remove the event safely, if the call above did not
* succeed.
*/
- if (!list_empty(&event->group_entry)) {
+ if (!list_empty(&event->group_entry))
list_del_event(event, ctx);
- }
spin_unlock_irq(&ctx->lock);
}
@@ -1033,10 +1033,10 @@ void __perf_event_sched_out(struct perf_
update_context_time(ctx);
perf_disable();
- if (ctx->nr_active)
+ if (ctx->nr_active) {
list_for_each_entry(event, &ctx->group_list, group_entry)
group_sched_out(event, cpuctx, ctx);
-
+ }
perf_enable();
out:
spin_unlock(&ctx->lock);
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/8] perf: disable events when we detach them
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
2009-11-23 10:37 ` [PATCH 1/8] perf: undo copy/paste damage Peter Zijlstra
2009-11-23 10:37 ` [PATCH 2/8] perf: style nits Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Disable " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 4/8] perf: update the context time on exit Peter Zijlstra
` (5 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-4.patch --]
[-- Type: text/plain, Size: 783 bytes --]
If we leave the event in STATE_INACTIVE, any read of the event after
the detach will increase the running count but not the enabled count
and cause funny scaling artefacts.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 2 ++
1 file changed, 2 insertions(+)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -294,6 +294,8 @@ list_del_event(struct perf_event *event,
if (event->group_leader != event)
event->group_leader->nr_siblings--;
+ event->state = PERF_EVENT_STATE_OFF;
+
/*
* If this was a group event with sibling events then
* upgrade the siblings to singleton events by adding them
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/8] perf: update the context time on exit
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (2 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 3/8] perf: disable events when we detach them Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Update " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 5/8] perf: fix __perf_event_exit_task update_event_times locking Peter Zijlstra
` (4 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-3.patch --]
[-- Type: text/plain, Size: 790 bytes --]
It appeared we did call update_event_times() on exit, but we failed to
update the context time, which renders the former moot.
XXX: locking is a bit iffy, we call update_event_times under
ctx->mutex instead of ctx->lock, should probably be fixed in
some way.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 1 +
1 file changed, 1 insertion(+)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -4982,6 +4982,7 @@ void perf_event_exit_task(struct task_st
* the events from it.
*/
unclone_ctx(child_ctx);
+ update_context_time(child_ctx);
spin_unlock_irqrestore(&child_ctx->lock, flags);
/*
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 5/8] perf: fix __perf_event_exit_task update_event_times locking
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (3 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 4/8] perf: update the context time on exit Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 6/8] perf: optimize __perf_sw_event() Peter Zijlstra
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-8.patch --]
[-- Type: text/plain, Size: 3149 bytes --]
Move the update_event_times() call in __perf_event_exit_task() into
list_del_event() because that holds the proper lock (ctx->lock) and
seems a more natural place to do the last time update.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 78 ++++++++++++++++++++++++++--------------------------
1 file changed, 39 insertions(+), 39 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -246,6 +246,44 @@ static void perf_unpin_context(struct pe
put_ctx(ctx);
}
+static inline u64 perf_clock(void)
+{
+ return cpu_clock(smp_processor_id());
+}
+
+/*
+ * Update the record of the current time in a context.
+ */
+static void update_context_time(struct perf_event_context *ctx)
+{
+ u64 now = perf_clock();
+
+ ctx->time += now - ctx->timestamp;
+ ctx->timestamp = now;
+}
+
+/*
+ * Update the total_time_enabled and total_time_running fields for a event.
+ */
+static void update_event_times(struct perf_event *event)
+{
+ struct perf_event_context *ctx = event->ctx;
+ u64 run_end;
+
+ if (event->state < PERF_EVENT_STATE_INACTIVE ||
+ event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
+ return;
+
+ event->total_time_enabled = ctx->time - event->tstamp_enabled;
+
+ if (event->state == PERF_EVENT_STATE_INACTIVE)
+ run_end = event->tstamp_stopped;
+ else
+ run_end = ctx->time;
+
+ event->total_time_running = run_end - event->tstamp_running;
+}
+
/*
* Add a event from the lists for its context.
* Must be called with ctx->mutex and ctx->lock held.
@@ -294,6 +332,7 @@ list_del_event(struct perf_event *event,
if (event->group_leader != event)
event->group_leader->nr_siblings--;
+ update_event_times(event);
event->state = PERF_EVENT_STATE_OFF;
/*
@@ -454,44 +493,6 @@ retry:
spin_unlock_irq(&ctx->lock);
}
-static inline u64 perf_clock(void)
-{
- return cpu_clock(smp_processor_id());
-}
-
-/*
- * Update the record of the current time in a context.
- */
-static void update_context_time(struct perf_event_context *ctx)
-{
- u64 now = perf_clock();
-
- ctx->time += now - ctx->timestamp;
- ctx->timestamp = now;
-}
-
-/*
- * Update the total_time_enabled and total_time_running fields for a event.
- */
-static void update_event_times(struct perf_event *event)
-{
- struct perf_event_context *ctx = event->ctx;
- u64 run_end;
-
- if (event->state < PERF_EVENT_STATE_INACTIVE ||
- event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
- return;
-
- event->total_time_enabled = ctx->time - event->tstamp_enabled;
-
- if (event->state == PERF_EVENT_STATE_INACTIVE)
- run_end = event->tstamp_stopped;
- else
- run_end = ctx->time;
-
- event->total_time_running = run_end - event->tstamp_running;
-}
-
/*
* Update total_time_enabled and total_time_running for all events in a group.
*/
@@ -4931,7 +4932,6 @@ __perf_event_exit_task(struct perf_event
{
struct perf_event *parent_event;
- update_event_times(child_event);
perf_event_remove_from_context(child_event);
parent_event = child_event->parent;
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 6/8] perf: optimize __perf_sw_event()
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (4 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 5/8] perf: fix __perf_event_exit_task update_event_times locking Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 10:37 ` [PATCH 7/8] perf: undo some recursion damage Peter Zijlstra
` (2 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-6.patch --]
[-- Type: text/plain, Size: 1058 bytes --]
From: Ingo Molnar <mingo@elte.hu>
Ingo noticed that the C99 initialisation of the structure resulted in
a memset() and an assignment. Use explicit assignments to avoid the
memset.
perf_prepare_sample() initializes all data entries, except:
addr, period and raw.
period is set in perf_swevent_overflow, addr is set here, leaving only
raw to be dealt with, so clear that explicitly as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -3945,9 +3945,10 @@ out:
void __perf_sw_event(u32 event_id, u64 nr, int nmi,
struct pt_regs *regs, u64 addr)
{
- struct perf_sample_data data = {
- .addr = addr,
- };
+ struct perf_sample_data data;
+
+ data.addr = addr;
+ data.raw = NULL;
do_perf_sw_event(PERF_TYPE_SOFTWARE, event_id, nr, nmi,
&data, regs);
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 7/8] perf: undo some recursion damage
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (5 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 6/8] perf: optimize __perf_sw_event() Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 11:55 ` [tip:perf/core] perf_events: Undo " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 8/8][RFC] perf: be paranoid about child times? Peter Zijlstra
2009-11-23 14:00 ` [PATCH 9/8] perf_events: Restore sanity to scaling land Peter Zijlstra
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker
[-- Attachment #1: perf-foo-7.patch --]
[-- Type: text/plain, Size: 10647 bytes --]
Make perf_swevent_get_recursion_context return a context number and
disable preemption.
This could be used to remove the IRQ disable from the trace bit and
index the per-cpu buffer with.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/perf_event.h | 8 ++--
include/trace/ftrace.h | 17 +++++-----
kernel/perf_event.c | 71 ++++++++++++++++++------------------------
kernel/trace/trace_kprobe.c | 14 ++++----
kernel/trace/trace_syscalls.c | 14 ++++----
5 files changed, 61 insertions(+), 63 deletions(-)
Index: linux-2.6/include/linux/perf_event.h
===================================================================
--- linux-2.6.orig/include/linux/perf_event.h
+++ linux-2.6/include/linux/perf_event.h
@@ -874,8 +874,8 @@ extern int perf_output_begin(struct perf
extern void perf_output_end(struct perf_output_handle *handle);
extern void perf_output_copy(struct perf_output_handle *handle,
const void *buf, unsigned int len);
-extern int perf_swevent_get_recursion_context(int **recursion);
-extern void perf_swevent_put_recursion_context(int *recursion);
+extern int perf_swevent_get_recursion_context(void);
+extern void perf_swevent_put_recursion_context(int rctx);
#else
static inline void
perf_event_task_sched_in(struct task_struct *task, int cpu) { }
@@ -904,8 +904,8 @@ static inline void perf_event_mmap(struc
static inline void perf_event_comm(struct task_struct *tsk) { }
static inline void perf_event_fork(struct task_struct *tsk) { }
static inline void perf_event_init(void) { }
-static int perf_swevent_get_recursion_context(int **recursion) { return -1; }
-static void perf_swevent_put_recursion_context(int *recursion) { }
+static inline int perf_swevent_get_recursion_context(void) { return -1; }
+static inline void perf_swevent_put_recursion_context(int rctx) { }
#endif
Index: linux-2.6/include/trace/ftrace.h
===================================================================
--- linux-2.6.orig/include/trace/ftrace.h
+++ linux-2.6/include/trace/ftrace.h
@@ -724,8 +724,8 @@ __attribute__((section("_ftrace_events")
static void ftrace_profile_##call(proto) \
{ \
struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
- extern int perf_swevent_get_recursion_context(int **recursion); \
- extern void perf_swevent_put_recursion_context(int *recursion); \
+ extern int perf_swevent_get_recursion_context(void); \
+ extern void perf_swevent_put_recursion_context(int rctx); \
struct ftrace_event_call *event_call = &event_##call; \
extern void perf_tp_event(int, u64, u64, void *, int); \
struct ftrace_raw_##call *entry; \
@@ -736,8 +736,8 @@ static void ftrace_profile_##call(proto)
int __data_size; \
char *trace_buf; \
char *raw_data; \
- int *recursion; \
int __cpu; \
+ int rctx; \
int pc; \
\
pc = preempt_count(); \
@@ -753,8 +753,9 @@ static void ftrace_profile_##call(proto)
\
local_irq_save(irq_flags); \
\
- if (perf_swevent_get_recursion_context(&recursion)) \
- goto end_recursion; \
+ rctx = perf_swevent_get_recursion_context(); \
+ if (rctx < 0) \
+ goto end_recursion; \
\
__cpu = smp_processor_id(); \
\
@@ -781,9 +782,9 @@ static void ftrace_profile_##call(proto)
perf_tp_event(event_call->id, __addr, __count, entry, \
__entry_size); \
\
-end: \
- perf_swevent_put_recursion_context(recursion); \
-end_recursion: \
+end: \
+ perf_swevent_put_recursion_context(rctx); \
+end_recursion: \
local_irq_restore(irq_flags); \
\
}
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -3869,45 +3869,50 @@ static void perf_swevent_ctx_event(struc
}
}
-/*
- * Must be called with preemption disabled
- */
-int perf_swevent_get_recursion_context(int **recursion)
+int perf_swevent_get_recursion_context(void)
{
- struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ struct perf_cpu_context *cpuctx = &get_cpu_var(perf_cpu_context);
+ int rctx;
if (in_nmi())
- *recursion = &cpuctx->recursion[3];
+ rctx = 3;
else if (in_irq())
- *recursion = &cpuctx->recursion[2];
+ rctx = 2;
else if (in_softirq())
- *recursion = &cpuctx->recursion[1];
+ rctx = 1;
else
- *recursion = &cpuctx->recursion[0];
+ rctx = 0;
- if (**recursion)
+ if (cpuctx->recursion[rctx]) {
+ put_cpu_var(perf_cpu_context);
return -1;
+ }
- (**recursion)++;
+ cpuctx->recursion[rctx]++;
+ barrier();
- return 0;
+ return rctx;
}
EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
-void perf_swevent_put_recursion_context(int *recursion)
+void perf_swevent_put_recursion_context(int rctx)
{
- (*recursion)--;
+ struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ barrier();
+ cpuctx->recursion[rctx]++;
+ put_cpu_var(perf_cpu_context);
}
EXPORT_SYMBOL_GPL(perf_swevent_put_recursion_context);
-static void __do_perf_sw_event(enum perf_type_id type, u32 event_id,
- u64 nr, int nmi,
- struct perf_sample_data *data,
- struct pt_regs *regs)
+static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
+ u64 nr, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
{
+ struct perf_cpu_context *cpuctx;
struct perf_event_context *ctx;
- struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ cpuctx = &__get_cpu_var(perf_cpu_context);
rcu_read_lock();
perf_swevent_ctx_event(&cpuctx->ctx, type, event_id,
nr, nmi, data, regs);
@@ -3921,35 +3926,23 @@ static void __do_perf_sw_event(enum perf
rcu_read_unlock();
}
-static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
- u64 nr, int nmi,
- struct perf_sample_data *data,
- struct pt_regs *regs)
-{
- int *recursion;
-
- preempt_disable();
-
- if (perf_swevent_get_recursion_context(&recursion))
- goto out;
-
- __do_perf_sw_event(type, event_id, nr, nmi, data, regs);
-
- perf_swevent_put_recursion_context(recursion);
-out:
- preempt_enable();
-}
-
void __perf_sw_event(u32 event_id, u64 nr, int nmi,
struct pt_regs *regs, u64 addr)
{
struct perf_sample_data data;
+ int rctx;
+
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
+ return;
data.addr = addr;
data.raw = NULL;
do_perf_sw_event(PERF_TYPE_SOFTWARE, event_id, nr, nmi,
&data, regs);
+
+ perf_swevent_put_recursion_context(rctx);
}
static void perf_swevent_read(struct perf_event *event)
@@ -4173,7 +4166,7 @@ void perf_tp_event(int event_id, u64 add
regs = task_pt_regs(current);
/* Trace events already protected against recursion */
- __do_perf_sw_event(PERF_TYPE_TRACEPOINT, event_id, count, 1,
+ do_perf_sw_event(PERF_TYPE_TRACEPOINT, event_id, count, 1,
&data, regs);
}
EXPORT_SYMBOL_GPL(perf_tp_event);
Index: linux-2.6/kernel/trace/trace_kprobe.c
===================================================================
--- linux-2.6.orig/kernel/trace/trace_kprobe.c
+++ linux-2.6/kernel/trace/trace_kprobe.c
@@ -1213,7 +1213,7 @@ static __kprobes int kprobe_profile_func
unsigned long irq_flags;
char *trace_buf;
char *raw_data;
- int *recursion;
+ int rctx;
pc = preempt_count();
__size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);
@@ -1229,7 +1229,8 @@ static __kprobes int kprobe_profile_func
*/
local_irq_save(irq_flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
__cpu = smp_processor_id();
@@ -1258,7 +1259,7 @@ static __kprobes int kprobe_profile_func
perf_tp_event(call->id, entry->ip, 1, entry, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(irq_flags);
@@ -1276,8 +1277,8 @@ static __kprobes int kretprobe_profile_f
int size, __size, i, pc, __cpu;
unsigned long irq_flags;
char *trace_buf;
- int *recursion;
char *raw_data;
+ int rctx;
pc = preempt_count();
__size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);
@@ -1293,7 +1294,8 @@ static __kprobes int kretprobe_profile_f
*/
local_irq_save(irq_flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
__cpu = smp_processor_id();
@@ -1323,7 +1325,7 @@ static __kprobes int kretprobe_profile_f
perf_tp_event(call->id, entry->ret_ip, 1, entry, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(irq_flags);
Index: linux-2.6/kernel/trace/trace_syscalls.c
===================================================================
--- linux-2.6.orig/kernel/trace/trace_syscalls.c
+++ linux-2.6/kernel/trace/trace_syscalls.c
@@ -481,8 +481,8 @@ static void prof_syscall_enter(struct pt
unsigned long flags;
char *trace_buf;
char *raw_data;
- int *recursion;
int syscall_nr;
+ int rctx;
int size;
int cpu;
@@ -506,7 +506,8 @@ static void prof_syscall_enter(struct pt
/* Protect the per cpu buffer, begin the rcu read side */
local_irq_save(flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
cpu = smp_processor_id();
@@ -530,7 +531,7 @@ static void prof_syscall_enter(struct pt
perf_tp_event(sys_data->enter_id, 0, 1, rec, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(flags);
}
@@ -582,7 +583,7 @@ static void prof_syscall_exit(struct pt_
int syscall_nr;
char *trace_buf;
char *raw_data;
- int *recursion;
+ int rctx;
int size;
int cpu;
@@ -609,7 +610,8 @@ static void prof_syscall_exit(struct pt_
/* Protect the per cpu buffer, begin the rcu read side */
local_irq_save(flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
cpu = smp_processor_id();
@@ -634,7 +636,7 @@ static void prof_syscall_exit(struct pt_
perf_tp_event(sys_data->exit_id, 0, 1, rec, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(flags);
}
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 8/8][RFC] perf: be paranoid about child times?
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (6 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 7/8] perf: undo some recursion damage Peter Zijlstra
@ 2009-11-23 10:37 ` Peter Zijlstra
2009-11-23 10:51 ` Ingo Molnar
2009-11-23 14:00 ` [PATCH 9/8] perf_events: Restore sanity to scaling land Peter Zijlstra
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 10:37 UTC (permalink / raw)
To: Ingo Molnar, Paul Mackerras; +Cc: linux-kernel, Peter Zijlstra
[-- Attachment #1: perf-foo-5.patch --]
[-- Type: text/plain, Size: 984 bytes --]
Normally we flatten the inherited hierarchy by attaching all childs to
the top parent, therefore a child's child_total_time_* should never
get incremented, add it anyway?
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1780,8 +1780,10 @@ u64 perf_event_read_value(struct perf_ev
list_for_each_entry(child, &event->child_list, child_list) {
total += perf_event_read(child);
- *enabled += child->total_time_enabled;
- *running += child->total_time_running;
+ *enabled += child->total_time_enabled +
+ atomic64_read(&child->child_total_time_enabled);
+ *running += child->total_time_running +
+ atomic64_read(&child->child_total_time_running);
}
mutex_unlock(&event->child_mutex);
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 8/8][RFC] perf: be paranoid about child times?
2009-11-23 10:37 ` [PATCH 8/8][RFC] perf: be paranoid about child times? Peter Zijlstra
@ 2009-11-23 10:51 ` Ingo Molnar
0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2009-11-23 10:51 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Paul Mackerras, linux-kernel
* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> Normally we flatten the inherited hierarchy by attaching all childs to
> the top parent, therefore a child's child_total_time_* should never
> get incremented, add it anyway?
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
> kernel/perf_event.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/kernel/perf_event.c
> ===================================================================
> --- linux-2.6.orig/kernel/perf_event.c
> +++ linux-2.6/kernel/perf_event.c
> @@ -1780,8 +1780,10 @@ u64 perf_event_read_value(struct perf_ev
>
> list_for_each_entry(child, &event->child_list, child_list) {
> total += perf_event_read(child);
> - *enabled += child->total_time_enabled;
> - *running += child->total_time_running;
> + *enabled += child->total_time_enabled +
> + atomic64_read(&child->child_total_time_enabled);
> + *running += child->total_time_running +
> + atomic64_read(&child->child_total_time_running);
Stick in a WARN_ON_ONCE() instead?
Ingo
^ permalink raw reply [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Undo copy/paste damage
2009-11-23 10:37 ` [PATCH 1/8] perf: undo copy/paste damage Peter Zijlstra
@ 2009-11-23 11:53 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:53 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, paulus, hpa, mingo, a.p.zijlstra, efault,
fweisbec, tglx, mingo
Commit-ID: a66a3052e2d4c5815d7ad26887b1d4193206e691
Gitweb: http://git.kernel.org/tip/a66a3052e2d4c5815d7ad26887b1d4193206e691
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:23 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:55 +0100
perf_events: Undo copy/paste damage
We had two almost identical functions, avoid the duplication.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091123103819.537537928@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 31 +++++++++----------------------
1 files changed, 9 insertions(+), 22 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 20df8ab..e2daa10 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1704,16 +1704,10 @@ static void free_event(struct perf_event *event)
call_rcu(&event->rcu_head, free_event_rcu);
}
-/*
- * Called when the last reference to the file is gone.
- */
-static int perf_release(struct inode *inode, struct file *file)
+int perf_event_release_kernel(struct perf_event *event)
{
- struct perf_event *event = file->private_data;
struct perf_event_context *ctx = event->ctx;
- file->private_data = NULL;
-
WARN_ON_ONCE(ctx->parent_ctx);
mutex_lock(&ctx->mutex);
perf_event_remove_from_context(event);
@@ -1728,26 +1722,19 @@ static int perf_release(struct inode *inode, struct file *file)
return 0;
}
+EXPORT_SYMBOL_GPL(perf_event_release_kernel);
-int perf_event_release_kernel(struct perf_event *event)
+/*
+ * Called when the last reference to the file is gone.
+ */
+static int perf_release(struct inode *inode, struct file *file)
{
- struct perf_event_context *ctx = event->ctx;
-
- WARN_ON_ONCE(ctx->parent_ctx);
- mutex_lock(&ctx->mutex);
- perf_event_remove_from_context(event);
- mutex_unlock(&ctx->mutex);
-
- mutex_lock(&event->owner->perf_event_mutex);
- list_del_init(&event->owner_entry);
- mutex_unlock(&event->owner->perf_event_mutex);
- put_task_struct(event->owner);
+ struct perf_event *event = file->private_data;
- free_event(event);
+ file->private_data = NULL;
- return 0;
+ return perf_event_release_kernel(event);
}
-EXPORT_SYMBOL_GPL(perf_event_release_kernel);
static int perf_event_read_size(struct perf_event *event)
{
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Fix style nits
2009-11-23 10:37 ` [PATCH 2/8] perf: style nits Peter Zijlstra
@ 2009-11-23 11:54 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulus, hpa, mingo, fweisbec, a.p.zijlstra, tglx,
mingo
Commit-ID: 6c2bfcbe58e0dd39554be88940149f5aa11e17d1
Gitweb: http://git.kernel.org/tip/6c2bfcbe58e0dd39554be88940149f5aa11e17d1
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:24 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:55 +0100
perf_events: Fix style nits
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.613427378@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index e2daa10..1f14481 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -447,9 +447,8 @@ retry:
* can remove the event safely, if the call above did not
* succeed.
*/
- if (!list_empty(&event->group_entry)) {
+ if (!list_empty(&event->group_entry))
list_del_event(event, ctx);
- }
spin_unlock_irq(&ctx->lock);
}
@@ -1033,10 +1032,10 @@ void __perf_event_sched_out(struct perf_event_context *ctx,
update_context_time(ctx);
perf_disable();
- if (ctx->nr_active)
+ if (ctx->nr_active) {
list_for_each_entry(event, &ctx->group_list, group_entry)
group_sched_out(event, cpuctx, ctx);
-
+ }
perf_enable();
out:
spin_unlock(&ctx->lock);
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Disable events when we detach them
2009-11-23 10:37 ` [PATCH 3/8] perf: disable events when we detach them Peter Zijlstra
@ 2009-11-23 11:54 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulus, hpa, mingo, fweisbec, a.p.zijlstra, tglx,
mingo
Commit-ID: 2e2af50b1fab3c40636839a7f439c167ae559533
Gitweb: http://git.kernel.org/tip/2e2af50b1fab3c40636839a7f439c167ae559533
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:25 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:56 +0100
perf_events: Disable events when we detach them
If we leave the event in STATE_INACTIVE, any read of the event
after the detach will increase the running count but not the
enabled count and cause funny scaling artefacts.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.689055515@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 1f14481..fb851ec 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -294,6 +294,8 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
if (event->group_leader != event)
event->group_leader->nr_siblings--;
+ event->state = PERF_EVENT_STATE_OFF;
+
/*
* If this was a group event with sibling events then
* upgrade the siblings to singleton events by adding them
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Update the context time on exit
2009-11-23 10:37 ` [PATCH 4/8] perf: update the context time on exit Peter Zijlstra
@ 2009-11-23 11:54 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulus, hpa, mingo, fweisbec, a.p.zijlstra, tglx,
mingo
Commit-ID: 5e942bb33371254a474653123cd9e13a4c89ee44
Gitweb: http://git.kernel.org/tip/5e942bb33371254a474653123cd9e13a4c89ee44
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:26 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:56 +0100
perf_events: Update the context time on exit
It appeared we did call update_event_times() on exit, but we
failed to update the context time, which renders the former
moot.
Locking is a bit iffy, we call update_event_times under
ctx->mutex instead of ctx->lock - the next patch fixes this.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.764207355@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index fb851ec..8be2574 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4983,6 +4983,7 @@ void perf_event_exit_task(struct task_struct *child)
* the events from it.
*/
unclone_ctx(child_ctx);
+ update_context_time(child_ctx);
spin_unlock_irqrestore(&child_ctx->lock, flags);
/*
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking
2009-11-23 10:37 ` [PATCH 5/8] perf: fix __perf_event_exit_task update_event_times locking Peter Zijlstra
@ 2009-11-23 11:54 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulus, hpa, mingo, fweisbec, a.p.zijlstra, tglx,
mingo
Commit-ID: f67218c3e93abaf0f480bb94b53d234853ffe4de
Gitweb: http://git.kernel.org/tip/f67218c3e93abaf0f480bb94b53d234853ffe4de
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:27 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:57 +0100
perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking
Move the update_event_times() call in __perf_event_exit_task()
into list_del_event() because that holds the proper lock
(ctx->lock) and seems a more natural place to do the last time
update.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.842455480@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 78 +++++++++++++++++++++++++-------------------------
1 files changed, 39 insertions(+), 39 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 8be2574..50f11b5 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -246,6 +246,44 @@ static void perf_unpin_context(struct perf_event_context *ctx)
put_ctx(ctx);
}
+static inline u64 perf_clock(void)
+{
+ return cpu_clock(smp_processor_id());
+}
+
+/*
+ * Update the record of the current time in a context.
+ */
+static void update_context_time(struct perf_event_context *ctx)
+{
+ u64 now = perf_clock();
+
+ ctx->time += now - ctx->timestamp;
+ ctx->timestamp = now;
+}
+
+/*
+ * Update the total_time_enabled and total_time_running fields for a event.
+ */
+static void update_event_times(struct perf_event *event)
+{
+ struct perf_event_context *ctx = event->ctx;
+ u64 run_end;
+
+ if (event->state < PERF_EVENT_STATE_INACTIVE ||
+ event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
+ return;
+
+ event->total_time_enabled = ctx->time - event->tstamp_enabled;
+
+ if (event->state == PERF_EVENT_STATE_INACTIVE)
+ run_end = event->tstamp_stopped;
+ else
+ run_end = ctx->time;
+
+ event->total_time_running = run_end - event->tstamp_running;
+}
+
/*
* Add a event from the lists for its context.
* Must be called with ctx->mutex and ctx->lock held.
@@ -294,6 +332,7 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
if (event->group_leader != event)
event->group_leader->nr_siblings--;
+ update_event_times(event);
event->state = PERF_EVENT_STATE_OFF;
/*
@@ -454,44 +493,6 @@ retry:
spin_unlock_irq(&ctx->lock);
}
-static inline u64 perf_clock(void)
-{
- return cpu_clock(smp_processor_id());
-}
-
-/*
- * Update the record of the current time in a context.
- */
-static void update_context_time(struct perf_event_context *ctx)
-{
- u64 now = perf_clock();
-
- ctx->time += now - ctx->timestamp;
- ctx->timestamp = now;
-}
-
-/*
- * Update the total_time_enabled and total_time_running fields for a event.
- */
-static void update_event_times(struct perf_event *event)
-{
- struct perf_event_context *ctx = event->ctx;
- u64 run_end;
-
- if (event->state < PERF_EVENT_STATE_INACTIVE ||
- event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
- return;
-
- event->total_time_enabled = ctx->time - event->tstamp_enabled;
-
- if (event->state == PERF_EVENT_STATE_INACTIVE)
- run_end = event->tstamp_stopped;
- else
- run_end = ctx->time;
-
- event->total_time_running = run_end - event->tstamp_running;
-}
-
/*
* Update total_time_enabled and total_time_running for all events in a group.
*/
@@ -4931,7 +4932,6 @@ __perf_event_exit_task(struct perf_event *child_event,
{
struct perf_event *parent_event;
- update_event_times(child_event);
perf_event_remove_from_context(child_event);
parent_event = child_event->parent;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Undo some recursion damage
2009-11-23 10:37 ` [PATCH 7/8] perf: undo some recursion damage Peter Zijlstra
@ 2009-11-23 11:55 ` tip-bot for Peter Zijlstra
2009-11-23 12:39 ` Frederic Weisbecker
0 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 11:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulus, hpa, mingo, fweisbec, a.p.zijlstra, tglx,
mingo
Commit-ID: 4ed7c92d68a5387ba5f7030dc76eab03558e27f5
Gitweb: http://git.kernel.org/tip/4ed7c92d68a5387ba5f7030dc76eab03558e27f5
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 23 Nov 2009 11:37:29 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 11:49:57 +0100
perf_events: Undo some recursion damage
Make perf_swevent_get_recursion_context return a context number
and disable preemption.
This could be used to remove the IRQ disable from the trace bit
and index the per-cpu buffer with.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091123103819.993226816@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/linux/perf_event.h | 8 ++--
include/trace/ftrace.h | 17 +++++-----
kernel/perf_event.c | 71 ++++++++++++++++++----------------------
kernel/trace/trace_kprobe.c | 14 +++++---
kernel/trace/trace_syscalls.c | 14 +++++---
5 files changed, 61 insertions(+), 63 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 74e98b1..43adbd7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -874,8 +874,8 @@ extern int perf_output_begin(struct perf_output_handle *handle,
extern void perf_output_end(struct perf_output_handle *handle);
extern void perf_output_copy(struct perf_output_handle *handle,
const void *buf, unsigned int len);
-extern int perf_swevent_get_recursion_context(int **recursion);
-extern void perf_swevent_put_recursion_context(int *recursion);
+extern int perf_swevent_get_recursion_context(void);
+extern void perf_swevent_put_recursion_context(int rctx);
#else
static inline void
perf_event_task_sched_in(struct task_struct *task, int cpu) { }
@@ -904,8 +904,8 @@ static inline void perf_event_mmap(struct vm_area_struct *vma) { }
static inline void perf_event_comm(struct task_struct *tsk) { }
static inline void perf_event_fork(struct task_struct *tsk) { }
static inline void perf_event_init(void) { }
-static int perf_swevent_get_recursion_context(int **recursion) { return -1; }
-static void perf_swevent_put_recursion_context(int *recursion) { }
+static inline int perf_swevent_get_recursion_context(void) { return -1; }
+static inline void perf_swevent_put_recursion_context(int rctx) { }
#endif
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index c222ef5..c3417c1 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -724,8 +724,8 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
static void ftrace_profile_##call(proto) \
{ \
struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
- extern int perf_swevent_get_recursion_context(int **recursion); \
- extern void perf_swevent_put_recursion_context(int *recursion); \
+ extern int perf_swevent_get_recursion_context(void); \
+ extern void perf_swevent_put_recursion_context(int rctx); \
struct ftrace_event_call *event_call = &event_##call; \
extern void perf_tp_event(int, u64, u64, void *, int); \
struct ftrace_raw_##call *entry; \
@@ -736,8 +736,8 @@ static void ftrace_profile_##call(proto) \
int __data_size; \
char *trace_buf; \
char *raw_data; \
- int *recursion; \
int __cpu; \
+ int rctx; \
int pc; \
\
pc = preempt_count(); \
@@ -753,8 +753,9 @@ static void ftrace_profile_##call(proto) \
\
local_irq_save(irq_flags); \
\
- if (perf_swevent_get_recursion_context(&recursion)) \
- goto end_recursion; \
+ rctx = perf_swevent_get_recursion_context(); \
+ if (rctx < 0) \
+ goto end_recursion; \
\
__cpu = smp_processor_id(); \
\
@@ -781,9 +782,9 @@ static void ftrace_profile_##call(proto) \
perf_tp_event(event_call->id, __addr, __count, entry, \
__entry_size); \
\
-end: \
- perf_swevent_put_recursion_context(recursion); \
-end_recursion: \
+end: \
+ perf_swevent_put_recursion_context(rctx); \
+end_recursion: \
local_irq_restore(irq_flags); \
\
}
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 50f11b5..0b0d5f7 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3869,45 +3869,50 @@ static void perf_swevent_ctx_event(struct perf_event_context *ctx,
}
}
-/*
- * Must be called with preemption disabled
- */
-int perf_swevent_get_recursion_context(int **recursion)
+int perf_swevent_get_recursion_context(void)
{
- struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ struct perf_cpu_context *cpuctx = &get_cpu_var(perf_cpu_context);
+ int rctx;
if (in_nmi())
- *recursion = &cpuctx->recursion[3];
+ rctx = 3;
else if (in_irq())
- *recursion = &cpuctx->recursion[2];
+ rctx = 2;
else if (in_softirq())
- *recursion = &cpuctx->recursion[1];
+ rctx = 1;
else
- *recursion = &cpuctx->recursion[0];
+ rctx = 0;
- if (**recursion)
+ if (cpuctx->recursion[rctx]) {
+ put_cpu_var(perf_cpu_context);
return -1;
+ }
- (**recursion)++;
+ cpuctx->recursion[rctx]++;
+ barrier();
- return 0;
+ return rctx;
}
EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
-void perf_swevent_put_recursion_context(int *recursion)
+void perf_swevent_put_recursion_context(int rctx)
{
- (*recursion)--;
+ struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ barrier();
+ cpuctx->recursion[rctx]++;
+ put_cpu_var(perf_cpu_context);
}
EXPORT_SYMBOL_GPL(perf_swevent_put_recursion_context);
-static void __do_perf_sw_event(enum perf_type_id type, u32 event_id,
- u64 nr, int nmi,
- struct perf_sample_data *data,
- struct pt_regs *regs)
+static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
+ u64 nr, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
{
+ struct perf_cpu_context *cpuctx;
struct perf_event_context *ctx;
- struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
+ cpuctx = &__get_cpu_var(perf_cpu_context);
rcu_read_lock();
perf_swevent_ctx_event(&cpuctx->ctx, type, event_id,
nr, nmi, data, regs);
@@ -3921,34 +3926,22 @@ static void __do_perf_sw_event(enum perf_type_id type, u32 event_id,
rcu_read_unlock();
}
-static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
- u64 nr, int nmi,
- struct perf_sample_data *data,
- struct pt_regs *regs)
-{
- int *recursion;
-
- preempt_disable();
-
- if (perf_swevent_get_recursion_context(&recursion))
- goto out;
-
- __do_perf_sw_event(type, event_id, nr, nmi, data, regs);
-
- perf_swevent_put_recursion_context(recursion);
-out:
- preempt_enable();
-}
-
void __perf_sw_event(u32 event_id, u64 nr, int nmi,
struct pt_regs *regs, u64 addr)
{
struct perf_sample_data data;
+ int rctx;
+
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
+ return;
data.addr = addr;
data.raw = NULL;
do_perf_sw_event(PERF_TYPE_SOFTWARE, event_id, nr, nmi, &data, regs);
+
+ perf_swevent_put_recursion_context(rctx);
}
static void perf_swevent_read(struct perf_event *event)
@@ -4172,7 +4165,7 @@ void perf_tp_event(int event_id, u64 addr, u64 count, void *record,
regs = task_pt_regs(current);
/* Trace events already protected against recursion */
- __do_perf_sw_event(PERF_TYPE_TRACEPOINT, event_id, count, 1,
+ do_perf_sw_event(PERF_TYPE_TRACEPOINT, event_id, count, 1,
&data, regs);
}
EXPORT_SYMBOL_GPL(perf_tp_event);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 22e6f68..79ce6a2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1213,7 +1213,7 @@ static __kprobes int kprobe_profile_func(struct kprobe *kp,
unsigned long irq_flags;
char *trace_buf;
char *raw_data;
- int *recursion;
+ int rctx;
pc = preempt_count();
__size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);
@@ -1229,7 +1229,8 @@ static __kprobes int kprobe_profile_func(struct kprobe *kp,
*/
local_irq_save(irq_flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
__cpu = smp_processor_id();
@@ -1258,7 +1259,7 @@ static __kprobes int kprobe_profile_func(struct kprobe *kp,
perf_tp_event(call->id, entry->ip, 1, entry, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(irq_flags);
@@ -1276,8 +1277,8 @@ static __kprobes int kretprobe_profile_func(struct kretprobe_instance *ri,
int size, __size, i, pc, __cpu;
unsigned long irq_flags;
char *trace_buf;
- int *recursion;
char *raw_data;
+ int rctx;
pc = preempt_count();
__size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);
@@ -1293,7 +1294,8 @@ static __kprobes int kretprobe_profile_func(struct kretprobe_instance *ri,
*/
local_irq_save(irq_flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
__cpu = smp_processor_id();
@@ -1323,7 +1325,7 @@ static __kprobes int kretprobe_profile_func(struct kretprobe_instance *ri,
perf_tp_event(call->id, entry->ret_ip, 1, entry, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(irq_flags);
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 41b6dd9..9189cbe 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -481,8 +481,8 @@ static void prof_syscall_enter(struct pt_regs *regs, long id)
unsigned long flags;
char *trace_buf;
char *raw_data;
- int *recursion;
int syscall_nr;
+ int rctx;
int size;
int cpu;
@@ -506,7 +506,8 @@ static void prof_syscall_enter(struct pt_regs *regs, long id)
/* Protect the per cpu buffer, begin the rcu read side */
local_irq_save(flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
cpu = smp_processor_id();
@@ -530,7 +531,7 @@ static void prof_syscall_enter(struct pt_regs *regs, long id)
perf_tp_event(sys_data->enter_id, 0, 1, rec, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(flags);
}
@@ -582,7 +583,7 @@ static void prof_syscall_exit(struct pt_regs *regs, long ret)
int syscall_nr;
char *trace_buf;
char *raw_data;
- int *recursion;
+ int rctx;
int size;
int cpu;
@@ -609,7 +610,8 @@ static void prof_syscall_exit(struct pt_regs *regs, long ret)
/* Protect the per cpu buffer, begin the rcu read side */
local_irq_save(flags);
- if (perf_swevent_get_recursion_context(&recursion))
+ rctx = perf_swevent_get_recursion_context();
+ if (rctx < 0)
goto end_recursion;
cpu = smp_processor_id();
@@ -634,7 +636,7 @@ static void prof_syscall_exit(struct pt_regs *regs, long ret)
perf_tp_event(sys_data->exit_id, 0, 1, rec, size);
end:
- perf_swevent_put_recursion_context(recursion);
+ perf_swevent_put_recursion_context(rctx);
end_recursion:
local_irq_restore(flags);
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [tip:perf/core] perf_events: Undo some recursion damage
2009-11-23 11:55 ` [tip:perf/core] perf_events: Undo " tip-bot for Peter Zijlstra
@ 2009-11-23 12:39 ` Frederic Weisbecker
2009-11-23 12:50 ` Peter Zijlstra
0 siblings, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2009-11-23 12:39 UTC (permalink / raw)
To: mingo, hpa, paulus, linux-kernel, a.p.zijlstra, tglx, mingo
Cc: linux-tip-commits
On Mon, Nov 23, 2009 at 11:55:08AM +0000, tip-bot for Peter Zijlstra wrote:
> Commit-ID: 4ed7c92d68a5387ba5f7030dc76eab03558e27f5
> Gitweb: http://git.kernel.org/tip/4ed7c92d68a5387ba5f7030dc76eab03558e27f5
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> AuthorDate: Mon, 23 Nov 2009 11:37:29 +0100
> Committer: Ingo Molnar <mingo@elte.hu>
> CommitDate: Mon, 23 Nov 2009 11:49:57 +0100
>
> perf_events: Undo some recursion damage
>
> Make perf_swevent_get_recursion_context return a context number
> and disable preemption.
>
> This could be used to remove the IRQ disable from the trace bit
> and index the per-cpu buffer with.
But if I do that, it means I will lose traces once irq nest.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [tip:perf/core] perf_events: Undo some recursion damage
2009-11-23 12:39 ` Frederic Weisbecker
@ 2009-11-23 12:50 ` Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 12:50 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: mingo, hpa, paulus, linux-kernel, tglx, mingo, linux-tip-commits
On Mon, 2009-11-23 at 13:39 +0100, Frederic Weisbecker wrote:
> On Mon, Nov 23, 2009 at 11:55:08AM +0000, tip-bot for Peter Zijlstra wrote:
> > Commit-ID: 4ed7c92d68a5387ba5f7030dc76eab03558e27f5
> > Gitweb: http://git.kernel.org/tip/4ed7c92d68a5387ba5f7030dc76eab03558e27f5
> > Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > AuthorDate: Mon, 23 Nov 2009 11:37:29 +0100
> > Committer: Ingo Molnar <mingo@elte.hu>
> > CommitDate: Mon, 23 Nov 2009 11:49:57 +0100
> >
> > perf_events: Undo some recursion damage
> >
> > Make perf_swevent_get_recursion_context return a context number
> > and disable preemption.
> >
> > This could be used to remove the IRQ disable from the trace bit
> > and index the per-cpu buffer with.
>
>
> But if I do that, it means I will lose traces once irq nest.
Ah yes, that crap :-(
Maybe its about time we extend the generic irq bits to know about nested
irqs, or firmly kill the whole notion.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 9/8] perf_events: Restore sanity to scaling land
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
` (7 preceding siblings ...)
2009-11-23 10:37 ` [PATCH 8/8][RFC] perf: be paranoid about child times? Peter Zijlstra
@ 2009-11-23 14:00 ` Peter Zijlstra
2009-11-23 17:42 ` [tip:perf/core] " tip-bot for Peter Zijlstra
8 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2009-11-23 14:00 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Paul Mackerras, linux-kernel
It is quite possible to call update_event_times() on a context that
isn't actually running and thereby confuse the thing.
perf stat was reporting !100% scale values for software counters
(2e2af50b perf_events: Disable events when we detach them, solved the
worst of that, but there was still some left).
The thing that happens is that because we are not self-reaping (we have
a caring parent) there is a time between the last schedule (out) and
having do_exit() called which will detach the events.
This period would be accounted as enabled,!running because the
event->state==INACTIVE, even though !event->ctx->is_active.
Similar issues could have been observed by calling read() on a event
while the attached task was not scheduled in.
Solve this by teaching update_event_times() about ctx->is_active.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/perf_event.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0b0d5f7..0aafe85 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -274,7 +274,12 @@ static void update_event_times(struct perf_event *event)
event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
return;
- event->total_time_enabled = ctx->time - event->tstamp_enabled;
+ if (ctx->is_active)
+ run_end = ctx->time;
+ else
+ run_end = event->tstamp_stopped;
+
+ event->total_time_enabled = run_end - event->tstamp_enabled;
if (event->state == PERF_EVENT_STATE_INACTIVE)
run_end = event->tstamp_stopped;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:perf/core] perf_events: Restore sanity to scaling land
2009-11-23 14:00 ` [PATCH 9/8] perf_events: Restore sanity to scaling land Peter Zijlstra
@ 2009-11-23 17:42 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-11-23 17:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, paulus, hpa, mingo, a.p.zijlstra, efault,
peterz, fweisbec, tglx, mingo
Commit-ID: acd1d7c1f8f3d848a3c5327dc09f8c1efb971678
Gitweb: http://git.kernel.org/tip/acd1d7c1f8f3d848a3c5327dc09f8c1efb971678
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 23 Nov 2009 15:00:36 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 23 Nov 2009 15:22:19 +0100
perf_events: Restore sanity to scaling land
It is quite possible to call update_event_times() on a context
that isn't actually running and thereby confuse the thing.
perf stat was reporting !100% scale values for software counters
(2e2af50b perf_events: Disable events when we detach them,
solved the worst of that, but there was still some left).
The thing that happens is that because we are not self-reaping
(we have a caring parent) there is a time between the last
schedule (out) and having do_exit() called which will detach the
events.
This period would be accounted as enabled,!running because the
event->state==INACTIVE, even though !event->ctx->is_active.
Similar issues could have been observed by calling read() on a
event while the attached task was not scheduled in.
Solve this by teaching update_event_times() about
ctx->is_active.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1258984836.4531.480.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/perf_event.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0b0d5f7..0aafe85 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -274,7 +274,12 @@ static void update_event_times(struct perf_event *event)
event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
return;
- event->total_time_enabled = ctx->time - event->tstamp_enabled;
+ if (ctx->is_active)
+ run_end = ctx->time;
+ else
+ run_end = event->tstamp_stopped;
+
+ event->total_time_enabled = run_end - event->tstamp_enabled;
if (event->state == PERF_EVENT_STATE_INACTIVE)
run_end = event->tstamp_stopped;
^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2009-11-23 17:43 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-23 10:37 [PATCH 0/8] More perf patches Peter Zijlstra
2009-11-23 10:37 ` [PATCH 1/8] perf: undo copy/paste damage Peter Zijlstra
2009-11-23 11:53 ` [tip:perf/core] perf_events: Undo " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 2/8] perf: style nits Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Fix " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 3/8] perf: disable events when we detach them Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Disable " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 4/8] perf: update the context time on exit Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Update " tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 5/8] perf: fix __perf_event_exit_task update_event_times locking Peter Zijlstra
2009-11-23 11:54 ` [tip:perf/core] perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking tip-bot for Peter Zijlstra
2009-11-23 10:37 ` [PATCH 6/8] perf: optimize __perf_sw_event() Peter Zijlstra
2009-11-23 10:37 ` [PATCH 7/8] perf: undo some recursion damage Peter Zijlstra
2009-11-23 11:55 ` [tip:perf/core] perf_events: Undo " tip-bot for Peter Zijlstra
2009-11-23 12:39 ` Frederic Weisbecker
2009-11-23 12:50 ` Peter Zijlstra
2009-11-23 10:37 ` [PATCH 8/8][RFC] perf: be paranoid about child times? Peter Zijlstra
2009-11-23 10:51 ` Ingo Molnar
2009-11-23 14:00 ` [PATCH 9/8] perf_events: Restore sanity to scaling land Peter Zijlstra
2009-11-23 17:42 ` [tip:perf/core] " tip-bot for Peter Zijlstra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox