* [PATCH 0/7] perf updates and fixes
@ 2010-03-26 1:52 Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, David Miller,
Steven Rostedt, Archs, H . Peter Anvin, Thomas Gleixner,
Stephane Eranian
Hi,
The series is not yet mergeable because it would break PowerPc
(hot regs snapshot API has been changed, and I don't know how
to update PowerPc for that).
But if you're fine with the ideas, I can integrate the necessary
changes to fix this, and also separate fixes and updates.
Thanks.
Frederic Weisbecker (7):
perf: Drop the frame reliablity check
perf: Fetch hot regs from the template caller
x86: Unify dumpstack.h and stacktrace.h
perf: Move perf_arch_fetch_caller_regs into a macro
perf: Make perf_fetch_caller_regs rewind to the first caller only
perf: Use hot regs with software sched/migrate events
perf: Correctly align perf event tracing buffer
arch/x86/include/asm/perf_event.h | 10 ++++++-
arch/x86/include/asm/stacktrace.h | 45 ++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event.c | 18 +------------
arch/x86/kernel/dumpstack.c | 1 -
arch/x86/kernel/dumpstack.h | 47 ----------------------------------
arch/x86/kernel/dumpstack_32.c | 2 -
arch/x86/kernel/dumpstack_64.c | 2 -
arch/x86/kernel/stacktrace.c | 7 +++--
include/linux/perf_event.h | 51 ++++++++++++++----------------------
include/trace/ftrace.h | 23 ++++++++--------
kernel/perf_event.c | 10 +------
kernel/trace/trace_event_perf.c | 13 ++++++---
12 files changed, 101 insertions(+), 128 deletions(-)
delete mode 100644 arch/x86/kernel/dumpstack.h
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 0/7] perf updates and fixes
2010-03-26 1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
@ 2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
` (2 subsequent siblings)
3 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, David Miller,
Steven Rostedt, Archs, H . Peter Anvin, Thomas Gleixner,
Stephane Eranian
Hi,
The series is not yet mergeable because it would break PowerPc
(hot regs snapshot API has been changed, and I don't know how
to update PowerPc for that).
But if you're fine with the ideas, I can integrate the necessary
changes to fix this, and also separate fixes and updates.
Thanks.
Frederic Weisbecker (7):
perf: Drop the frame reliablity check
perf: Fetch hot regs from the template caller
x86: Unify dumpstack.h and stacktrace.h
perf: Move perf_arch_fetch_caller_regs into a macro
perf: Make perf_fetch_caller_regs rewind to the first caller only
perf: Use hot regs with software sched/migrate events
perf: Correctly align perf event tracing buffer
arch/x86/include/asm/perf_event.h | 10 ++++++-
arch/x86/include/asm/stacktrace.h | 45 ++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event.c | 18 +------------
arch/x86/kernel/dumpstack.c | 1 -
arch/x86/kernel/dumpstack.h | 47 ----------------------------------
arch/x86/kernel/dumpstack_32.c | 2 -
arch/x86/kernel/dumpstack_64.c | 2 -
arch/x86/kernel/stacktrace.c | 7 +++--
include/linux/perf_event.h | 51 ++++++++++++++----------------------
include/trace/ftrace.h | 23 ++++++++--------
kernel/perf_event.c | 10 +------
kernel/trace/trace_event_perf.c | 13 ++++++---
12 files changed, 101 insertions(+), 128 deletions(-)
delete mode 100644 arch/x86/kernel/dumpstack.h
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro
2010-03-26 1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
@ 2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
2010-03-26 6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
3 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
David Miller, Archs
Having perf_arch_fetch_caller_regs() as a weak function overridable
by macros makes it hard to export its symbol for modules, as we
need to export the symbol in another file than the generic weak
definition to deal with compiler bugs.
Currently it's not a problem because we can define the symbol in
trace_event_perf.c as only trace events use it. But we are going
to make some generic software events to use this new facility,
so the symbol must be available even if trace events are not built.
This patch fixes the problem by making perf_arch_fetch_caller_regs
a macro that archs can define in their <asm/perf_event.h>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
arch/x86/include/asm/perf_event.h | 10 +++++++++-
arch/x86/kernel/cpu/perf_event.c | 13 -------------
include/linux/perf_event.h | 8 +++++---
kernel/perf_event.c | 7 -------
kernel/trace/trace_event_perf.c | 2 --
5 files changed, 14 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 124dddd..4bf3d37 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -155,7 +155,15 @@ extern void perf_events_lapic_init(void);
#define perf_instruction_pointer(regs) ((regs)->ip)
-#else
+#include <asm/stacktrace.h>
+
+#define perf_arch_fetch_caller_regs(regs, ip, skip) \
+ (regs)->ip = ip; \
+ (regs)->bp = rewind_frame_pointer(skip); \
+ (regs)->cs = __KERNEL_CS; \
+ local_save_flags((regs)->flags);
+
+#else /* !CONFIG_PERF_EVENTS */
static inline void init_hw_perf_events(void) { }
static inline void perf_events_lapic_init(void) { }
#endif
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c3f203e..43fae02 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1691,16 +1691,3 @@ struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
return entry;
}
-#ifdef CONFIG_EVENT_TRACING
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
- regs->ip = ip;
- /*
- * perf_arch_fetch_caller_regs adds another call, we need to increment
- * the skip level
- */
- regs->bp = rewind_frame_pointer(skip + 1);
- regs->cs = __KERNEL_CS;
- local_save_flags(regs->flags);
-}
-#endif
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2bccb7b..76b680f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -867,8 +867,10 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
__perf_sw_event(event_id, nr, nmi, regs, addr);
}
-extern void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip);
+#ifndef perf_arch_fetch_caller_regs
+static inline void
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+#endif
/*
* Take a snapshot of the regs. Skip ip and frame pointer to
@@ -902,7 +904,7 @@ static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
ip = 0;
}
- return perf_arch_fetch_caller_regs(regs, ip, skip);
+ perf_arch_fetch_caller_regs(regs, ip, skip);
}
extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 455393e..bbed6f0 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2790,13 +2790,6 @@ __weak struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
return NULL;
}
-#ifdef CONFIG_EVENT_TRACING
-__weak
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
-}
-#endif
-
/*
* Output
*/
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 81f691e..8e9edcd 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -12,8 +12,6 @@
DEFINE_PER_CPU(struct pt_regs, perf_trace_regs);
EXPORT_PER_CPU_SYMBOL_GPL(perf_trace_regs);
-EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs);
-
static char *perf_trace_buf;
static char *perf_trace_buf_nmi;
--
1.6.2.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro
2010-03-26 1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
@ 2010-03-26 1:52 ` Frederic Weisbecker
0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, David Miller, Archs
Having perf_arch_fetch_caller_regs() as a weak function overridable
by macros makes it hard to export its symbol for modules, as we
need to export the symbol in another file than the generic weak
definition to deal with compiler bugs.
Currently it's not a problem because we can define the symbol in
trace_event_perf.c as only trace events use it. But we are going
to make some generic software events to use this new facility,
so the symbol must be available even if trace events are not built.
This patch fixes the problem by making perf_arch_fetch_caller_regs
a macro that archs can define in their <asm/perf_event.h>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
arch/x86/include/asm/perf_event.h | 10 +++++++++-
arch/x86/kernel/cpu/perf_event.c | 13 -------------
include/linux/perf_event.h | 8 +++++---
kernel/perf_event.c | 7 -------
kernel/trace/trace_event_perf.c | 2 --
5 files changed, 14 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 124dddd..4bf3d37 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -155,7 +155,15 @@ extern void perf_events_lapic_init(void);
#define perf_instruction_pointer(regs) ((regs)->ip)
-#else
+#include <asm/stacktrace.h>
+
+#define perf_arch_fetch_caller_regs(regs, ip, skip) \
+ (regs)->ip = ip; \
+ (regs)->bp = rewind_frame_pointer(skip); \
+ (regs)->cs = __KERNEL_CS; \
+ local_save_flags((regs)->flags);
+
+#else /* !CONFIG_PERF_EVENTS */
static inline void init_hw_perf_events(void) { }
static inline void perf_events_lapic_init(void) { }
#endif
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c3f203e..43fae02 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1691,16 +1691,3 @@ struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
return entry;
}
-#ifdef CONFIG_EVENT_TRACING
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
- regs->ip = ip;
- /*
- * perf_arch_fetch_caller_regs adds another call, we need to increment
- * the skip level
- */
- regs->bp = rewind_frame_pointer(skip + 1);
- regs->cs = __KERNEL_CS;
- local_save_flags(regs->flags);
-}
-#endif
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2bccb7b..76b680f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -867,8 +867,10 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
__perf_sw_event(event_id, nr, nmi, regs, addr);
}
-extern void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip);
+#ifndef perf_arch_fetch_caller_regs
+static inline void
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+#endif
/*
* Take a snapshot of the regs. Skip ip and frame pointer to
@@ -902,7 +904,7 @@ static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
ip = 0;
}
- return perf_arch_fetch_caller_regs(regs, ip, skip);
+ perf_arch_fetch_caller_regs(regs, ip, skip);
}
extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 455393e..bbed6f0 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2790,13 +2790,6 @@ __weak struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
return NULL;
}
-#ifdef CONFIG_EVENT_TRACING
-__weak
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
-}
-#endif
-
/*
* Output
*/
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 81f691e..8e9edcd 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -12,8 +12,6 @@
DEFINE_PER_CPU(struct pt_regs, perf_trace_regs);
EXPORT_PER_CPU_SYMBOL_GPL(perf_trace_regs);
-EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs);
-
static char *perf_trace_buf;
static char *perf_trace_buf_nmi;
--
1.6.2.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only
2010-03-26 1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
@ 2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-04-08 9:57 ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
2010-03-26 6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
3 siblings, 2 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
David Miller, Archs
Trace events now only need to rewind the regs to the immediate
caller, so we don't need anymore to implement the support for
further stack walking.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
arch/x86/include/asm/perf_event.h | 6 +++---
arch/x86/include/asm/stacktrace.h | 5 ++---
include/linux/perf_event.h | 30 +++++-------------------------
include/trace/ftrace.h | 2 +-
4 files changed, 11 insertions(+), 32 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 4bf3d37..1df2317 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -157,9 +157,9 @@ extern void perf_events_lapic_init(void);
#include <asm/stacktrace.h>
-#define perf_arch_fetch_caller_regs(regs, ip, skip) \
- (regs)->ip = ip; \
- (regs)->bp = rewind_frame_pointer(skip); \
+#define perf_arch_fetch_caller_regs(regs, __ip) \
+ (regs)->ip = __ip; \
+ (regs)->bp = caller_frame_pointer(); \
(regs)->cs = __KERNEL_CS; \
local_save_flags((regs)->flags);
diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 8fb70b7..62c9897 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -74,15 +74,14 @@ struct stack_frame {
unsigned long return_address;
};
-static inline unsigned long rewind_frame_pointer(int n)
+static inline unsigned long caller_frame_pointer(void)
{
struct stack_frame *frame;
get_bp(frame);
#ifdef CONFIG_FRAME_POINTER
- while (n--)
- frame = frame->next_frame;
+ frame = frame->next_frame;
#endif
return (unsigned long)frame;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 76b680f..3b59cf7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -869,42 +869,22 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
#ifndef perf_arch_fetch_caller_regs
static inline void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { }
#endif
/*
- * Take a snapshot of the regs. Skip ip and frame pointer to
- * the nth caller. We only need a few of the regs:
+ * Take a snapshot of the regs. Rewind ip and frame pointer to
+ * the caller. We only need a few of the regs:
* - ip for PERF_SAMPLE_IP
* - cs for user_mode() tests
* - bp for callchains
* - eflags, for future purposes, just in case
*/
-static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
+static inline void perf_fetch_caller_regs(struct pt_regs *regs)
{
- unsigned long ip;
-
memset(regs, 0, sizeof(*regs));
- switch (skip) {
- case 1 :
- ip = CALLER_ADDR0;
- break;
- case 2 :
- ip = CALLER_ADDR1;
- break;
- case 3 :
- ip = CALLER_ADDR2;
- break;
- case 4:
- ip = CALLER_ADDR3;
- break;
- /* No need to support further for now */
- default:
- ip = 0;
- }
-
- perf_arch_fetch_caller_regs(regs, ip, skip);
+ perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
}
extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 882c648..82e9977 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -795,7 +795,7 @@ static notrace void perf_trace_##call(proto) \
struct ftrace_event_call *event_call = &event_##call; \
struct pt_regs *__regs = &get_cpu_var(perf_trace_regs); \
\
- perf_fetch_caller_regs(__regs, 1); \
+ perf_fetch_caller_regs(__regs); \
\
perf_trace_templ_##template(event_call, __regs, args); \
\
--
1.6.2.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only
2010-03-26 1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
@ 2010-03-26 1:52 ` Frederic Weisbecker
2010-04-08 9:57 ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
1 sibling, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 1:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, David Miller, Archs
Trace events now only need to rewind the regs to the immediate
caller, so we don't need anymore to implement the support for
further stack walking.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
arch/x86/include/asm/perf_event.h | 6 +++---
arch/x86/include/asm/stacktrace.h | 5 ++---
include/linux/perf_event.h | 30 +++++-------------------------
include/trace/ftrace.h | 2 +-
4 files changed, 11 insertions(+), 32 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 4bf3d37..1df2317 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -157,9 +157,9 @@ extern void perf_events_lapic_init(void);
#include <asm/stacktrace.h>
-#define perf_arch_fetch_caller_regs(regs, ip, skip) \
- (regs)->ip = ip; \
- (regs)->bp = rewind_frame_pointer(skip); \
+#define perf_arch_fetch_caller_regs(regs, __ip) \
+ (regs)->ip = __ip; \
+ (regs)->bp = caller_frame_pointer(); \
(regs)->cs = __KERNEL_CS; \
local_save_flags((regs)->flags);
diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 8fb70b7..62c9897 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -74,15 +74,14 @@ struct stack_frame {
unsigned long return_address;
};
-static inline unsigned long rewind_frame_pointer(int n)
+static inline unsigned long caller_frame_pointer(void)
{
struct stack_frame *frame;
get_bp(frame);
#ifdef CONFIG_FRAME_POINTER
- while (n--)
- frame = frame->next_frame;
+ frame = frame->next_frame;
#endif
return (unsigned long)frame;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 76b680f..3b59cf7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -869,42 +869,22 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
#ifndef perf_arch_fetch_caller_regs
static inline void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { }
#endif
/*
- * Take a snapshot of the regs. Skip ip and frame pointer to
- * the nth caller. We only need a few of the regs:
+ * Take a snapshot of the regs. Rewind ip and frame pointer to
+ * the caller. We only need a few of the regs:
* - ip for PERF_SAMPLE_IP
* - cs for user_mode() tests
* - bp for callchains
* - eflags, for future purposes, just in case
*/
-static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
+static inline void perf_fetch_caller_regs(struct pt_regs *regs)
{
- unsigned long ip;
-
memset(regs, 0, sizeof(*regs));
- switch (skip) {
- case 1 :
- ip = CALLER_ADDR0;
- break;
- case 2 :
- ip = CALLER_ADDR1;
- break;
- case 3 :
- ip = CALLER_ADDR2;
- break;
- case 4:
- ip = CALLER_ADDR3;
- break;
- /* No need to support further for now */
- default:
- ip = 0;
- }
-
- perf_arch_fetch_caller_regs(regs, ip, skip);
+ perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
}
extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 882c648..82e9977 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -795,7 +795,7 @@ static notrace void perf_trace_##call(proto) \
struct ftrace_event_call *event_call = &event_##call; \
struct pt_regs *__regs = &get_cpu_var(perf_trace_regs); \
\
- perf_fetch_caller_regs(__regs, 1); \
+ perf_fetch_caller_regs(__regs); \
\
perf_trace_templ_##template(event_call, __regs, args); \
\
--
1.6.2.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 0/7] perf updates and fixes
2010-03-26 1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
` (2 preceding siblings ...)
2010-03-26 1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
@ 2010-03-26 6:02 ` Paul Mackerras
2010-03-26 7:58 ` Ingo Molnar
2010-03-26 17:45 ` Frederic Weisbecker
3 siblings, 2 replies; 16+ messages in thread
From: Paul Mackerras @ 2010-03-26 6:02 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
David Miller, Steven Rostedt, Archs, H . Peter Anvin,
Thomas Gleixner, Stephane Eranian
On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
> The series is not yet mergeable because it would break PowerPc
> (hot regs snapshot API has been changed, and I don't know how
> to update PowerPc for that).
>
> But if you're fine with the ideas, I can integrate the necessary
> changes to fix this, and also separate fixes and updates.
The patch below adds the necessary stuff for powerpc. You could fold
it into your "perf: Move perf_arch_fetch_caller_regs into a macro"
patch, or keep it as a separate patch in the series (though that would
make preserving bisectability more difficult).
There is a little difficulty in that you first create a 3-argument
form of perf_arch_fetch_caller_regs and then remove one argument in a
later patch, whereas my patch below just creates the 2-argument form.
I'm sure you can resolve that one way or another.
The patch to add the old-style perf_arch_fetch_caller_regs for powerpc
is sitting in the tip/perf/urgent branch but hasn't gone anywhere.
I suppose we need to get Ingo to pull that branch into tip/perf/core
and then include a revert patch in the series that switches to the new
interface. Alternatively, if the urgent branch isn't append-only then
we could disappear it (the urgent branch would need to be rewound by
one commit).
Paul.
----------
[PATCH] perf/powerpc: Implement perf_arch_fetch_caller_regs for powerpc
This adds the ability to get a hot register snapshot for powerpc,
enabling us to get meaningful call chains for tracepoints and context
switch events.
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/include/asm/perf_event.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
index e6d4ce6..5c16b89 100644
--- a/arch/powerpc/include/asm/perf_event.h
+++ b/arch/powerpc/include/asm/perf_event.h
@@ -21,3 +21,15 @@
#ifdef CONFIG_FSL_EMB_PERF_EVENT
#include <asm/perf_event_fsl_emb.h>
#endif
+
+#ifdef CONFIG_PERF_EVENTS
+#include <asm/ptrace.h>
+#include <asm/reg.h>
+
+#define perf_arch_fetch_caller_regs(regs, __ip) \
+ do { \
+ (regs)->nip = __ip; \
+ (regs)->gpr[1] = *(unsigned long *)__get_SP(); \
+ asm volatile("mfmsr %0" : "=r" ((regs)->msr)); \
+ } while (0)
+#endif
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 0/7] perf updates and fixes
2010-03-26 6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
@ 2010-03-26 7:58 ` Ingo Molnar
2010-03-26 17:38 ` Frederic Weisbecker
2010-03-26 17:45 ` Frederic Weisbecker
1 sibling, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2010-03-26 7:58 UTC (permalink / raw)
To: Paul Mackerras
Cc: Frederic Weisbecker, LKML, Peter Zijlstra,
Arnaldo Carvalho de Melo, David Miller, Steven Rostedt, Archs,
H . Peter Anvin, Thomas Gleixner, Stephane Eranian
* Paul Mackerras <paulus@samba.org> wrote:
> On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
>
> > The series is not yet mergeable because it would break PowerPc (hot regs
> > snapshot API has been changed, and I don't know how to update PowerPc for
> > that).
> >
> > But if you're fine with the ideas, I can integrate the necessary changes
> > to fix this, and also separate fixes and updates.
>
> The patch below adds the necessary stuff for powerpc. You could fold it
> into your "perf: Move perf_arch_fetch_caller_regs into a macro" patch, or
> keep it as a separate patch in the series (though that would make preserving
> bisectability more difficult).
Since the series needs a resend anyway folding back ought to be fine i think.
I'm wondering whether this should get into tip:perf/urgent - or in
tip:perf/core for 2.6.35.
It fixes sw event call trace ugliness, but is that a 2.6.34 regression? Is
there any other aspect of the series that points towards accelerating this
into .34?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/7] perf updates and fixes
2010-03-26 7:58 ` Ingo Molnar
@ 2010-03-26 17:38 ` Frederic Weisbecker
0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 17:38 UTC (permalink / raw)
To: Ingo Molnar
Cc: Paul Mackerras, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
David Miller, Steven Rostedt, Archs, H . Peter Anvin,
Thomas Gleixner, Stephane Eranian
On Fri, Mar 26, 2010 at 08:58:17AM +0100, Ingo Molnar wrote:
>
> * Paul Mackerras <paulus@samba.org> wrote:
>
> > On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
> >
> > > The series is not yet mergeable because it would break PowerPc (hot regs
> > > snapshot API has been changed, and I don't know how to update PowerPc for
> > > that).
> > >
> > > But if you're fine with the ideas, I can integrate the necessary changes
> > > to fix this, and also separate fixes and updates.
> >
> > The patch below adds the necessary stuff for powerpc. You could fold it
> > into your "perf: Move perf_arch_fetch_caller_regs into a macro" patch, or
> > keep it as a separate patch in the series (though that would make preserving
> > bisectability more difficult).
>
> Since the series needs a resend anyway folding back ought to be fine i think.
>
> I'm wondering whether this should get into tip:perf/urgent - or in
> tip:perf/core for 2.6.35.
>
> It fixes sw event call trace ugliness, but is that a 2.6.34 regression? Is
> there any other aspect of the series that points towards accelerating this
> into .34?
Let's have a look:
perf: Correctly align perf event tracing buffer
Should probably go into urgent. The change is not invasive at
all. It doesn't fix a regression but it's still an important fix.
The rest
It depends. The whole bunch is rather invasive.
The callchains of context switches never worked correctly
I think. I couldn't tell if the cpu migration has ever worked.
If it ever did, then it's a regression fix but in the middle of
too much hot regs improvements. So I can cook a specific fix for
the cpu migration event to work, and keep the rest for perf/core.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/7] perf updates and fixes
2010-03-26 6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
2010-03-26 7:58 ` Ingo Molnar
@ 2010-03-26 17:45 ` Frederic Weisbecker
1 sibling, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 17:45 UTC (permalink / raw)
To: Paul Mackerras
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
David Miller, Steven Rostedt, Archs, H . Peter Anvin,
Thomas Gleixner, Stephane Eranian
On Fri, Mar 26, 2010 at 05:02:30PM +1100, Paul Mackerras wrote:
> On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
>
> > The series is not yet mergeable because it would break PowerPc
> > (hot regs snapshot API has been changed, and I don't know how
> > to update PowerPc for that).
> >
> > But if you're fine with the ideas, I can integrate the necessary
> > changes to fix this, and also separate fixes and updates.
>
> The patch below adds the necessary stuff for powerpc. You could fold
> it into your "perf: Move perf_arch_fetch_caller_regs into a macro"
> patch, or keep it as a separate patch in the series (though that would
> make preserving bisectability more difficult).
>
> There is a little difficulty in that you first create a 3-argument
> form of perf_arch_fetch_caller_regs and then remove one argument in a
> later patch, whereas my patch below just creates the 2-argument form.
> I'm sure you can resolve that one way or another.
>
> The patch to add the old-style perf_arch_fetch_caller_regs for powerpc
> is sitting in the tip/perf/urgent branch but hasn't gone anywhere.
> I suppose we need to get Ingo to pull that branch into tip/perf/core
> and then include a revert patch in the series that switches to the new
> interface. Alternatively, if the urgent branch isn't append-only then
> we could disappear it (the urgent branch would need to be rewound by
> one commit).
Thanks a lot, that's axactly what I needed.
I'll sort it out to not break things in the middle :)
>
> Paul.
> ----------
> [PATCH] perf/powerpc: Implement perf_arch_fetch_caller_regs for powerpc
>
> This adds the ability to get a hot register snapshot for powerpc,
> enabling us to get meaningful call chains for tracepoints and context
> switch events.
>
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> arch/powerpc/include/asm/perf_event.h | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
> index e6d4ce6..5c16b89 100644
> --- a/arch/powerpc/include/asm/perf_event.h
> +++ b/arch/powerpc/include/asm/perf_event.h
> @@ -21,3 +21,15 @@
> #ifdef CONFIG_FSL_EMB_PERF_EVENT
> #include <asm/perf_event_fsl_emb.h>
> #endif
> +
> +#ifdef CONFIG_PERF_EVENTS
> +#include <asm/ptrace.h>
> +#include <asm/reg.h>
> +
> +#define perf_arch_fetch_caller_regs(regs, __ip) \
> + do { \
> + (regs)->nip = __ip; \
> + (regs)->gpr[1] = *(unsigned long *)__get_SP(); \
> + asm volatile("mfmsr %0" : "=r" ((regs)->msr)); \
> + } while (0)
> +#endif
^ permalink raw reply [flat|nested] 16+ messages in thread
* [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic
2010-03-26 1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
@ 2010-04-08 9:57 ` Eric Dumazet
2010-04-08 10:59 ` Frederic Weisbecker
2010-04-08 12:32 ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
1 sibling, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2010-04-08 9:57 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
Paul Mackerras, David Miller, Archs
Hello
Current linux-2.6 tree panics on my dev machine
64 bit kernel, 32bit user land
CONFIG_FRAME_POINTER=y
perf timechart record &
Instant crash
Call Trace:
perf_trace_sched_switch+0xd5/0x120
schedule+0x6b5/0x860
retint_careful+0xd/0x21
RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
CR2: 00000000d21f1422
rewind_frame_pointer() is probably wrong.
No test performed to check frame is in current stack, or
that (!user_mode_vm(regs))
static inline unsigned long rewind_frame_pointer(int n)
{
struct stack_frame *frame;
get_bp(frame);
#ifdef CONFIG_FRAME_POINTER
while (n--)
frame = frame->next_frame;
#endif
return (unsigned long)frame;
}
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic
2010-04-08 9:57 ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
@ 2010-04-08 10:59 ` Frederic Weisbecker
2010-04-08 12:32 ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
1 sibling, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 10:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
Paul Mackerras, David Miller, Archs
On Thu, Apr 08, 2010 at 11:57:20AM +0200, Eric Dumazet wrote:
> Hello
>
> Current linux-2.6 tree panics on my dev machine
>
> 64 bit kernel, 32bit user land
> CONFIG_FRAME_POINTER=y
>
> perf timechart record &
>
> Instant crash
>
> Call Trace:
> perf_trace_sched_switch+0xd5/0x120
> schedule+0x6b5/0x860
> retint_careful+0xd/0x21
>
> RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
> CR2: 00000000d21f1422
>
>
> rewind_frame_pointer() is probably wrong.
>
> No test performed to check frame is in current stack, or
> that (!user_mode_vm(regs))
user_mode_vm() can not work here as we are actually filling
regs from scratch.
But we indeed need to have a safe dereference to avoid such
crashes. A simple probe_kernel_address() should do the trick.
This API is going to change for the next cycle as it won't need
to rewind further than the first caller. So I'm going to do a
rough probe_kernel_address() fix for the current version. The next
one won't have this problem.
>
>
> static inline unsigned long rewind_frame_pointer(int n)
> {
> struct stack_frame *frame;
>
> get_bp(frame);
>
> #ifdef CONFIG_FRAME_POINTER
> while (n--)
> frame = frame->next_frame;
> #endif
>
> return (unsigned long)frame;
> }
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
2010-04-08 9:57 ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
2010-04-08 10:59 ` Frederic Weisbecker
@ 2010-04-08 12:32 ` Frederic Weisbecker
2010-04-08 13:52 ` Eric Dumazet
1 sibling, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 12:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
Paul Mackerras, David Miller, Archs
On Thu, Apr 08, 2010 at 11:57:20AM +0200, Eric Dumazet wrote:
> Hello
>
> Current linux-2.6 tree panics on my dev machine
>
> 64 bit kernel, 32bit user land
> CONFIG_FRAME_POINTER=y
>
> perf timechart record &
>
> Instant crash
>
> Call Trace:
> perf_trace_sched_switch+0xd5/0x120
> schedule+0x6b5/0x860
> retint_careful+0xd/0x21
>
> RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
> CR2: 00000000d21f1422
>
>
> rewind_frame_pointer() is probably wrong.
>
> No test performed to check frame is in current stack, or
> that (!user_mode_vm(regs))
>
>
> static inline unsigned long rewind_frame_pointer(int n)
> {
> struct stack_frame *frame;
>
> get_bp(frame);
>
> #ifdef CONFIG_FRAME_POINTER
> while (n--)
> frame = frame->next_frame;
> #endif
>
> return (unsigned long)frame;
> }
>
>
>
Can you please test this fix?
Thanks.
---
From 60d5c4e8498efc4a01abceef54ad3bc91993bf41 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu, 8 Apr 2010 14:05:50 +0200
Subject: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
When we fetch the hot regs and rewind to the nth caller, it
might happen that we dereference a frame pointer outside the
kernel stack boundaries, like in this example:
perf_trace_sched_switch+0xd5/0x120
schedule+0x6b5/0x860
retint_careful+0xd/0x21
Since we directly dereference a userspace frame pointer here while
rewinding behind retint_careful, this may end up in a crash.
Fix this by simply using probe_kernel_address() when we rewind the
frame pointer.
This issue will have a much more proper fix in the next version of the
perf_arch_fetch_caller_regs() API that will only need to rewind to the
first caller.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
arch/x86/kernel/dumpstack.h | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
index e39e771..e1a93be 100644
--- a/arch/x86/kernel/dumpstack.h
+++ b/arch/x86/kernel/dumpstack.h
@@ -14,6 +14,8 @@
#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
#endif
+#include <linux/uaccess.h>
+
extern void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl);
@@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
get_bp(frame);
#ifdef CONFIG_FRAME_POINTER
- while (n--)
- frame = frame->next_frame;
+ while (n--) {
+ if (probe_kernel_address(&frame->next_frame, frame))
+ break;
+ }
#endif
return (unsigned long)frame;
--
1.6.2.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
2010-04-08 12:32 ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
@ 2010-04-08 13:52 ` Eric Dumazet
2010-04-08 17:31 ` [GIT PULL] perf fix Frederic Weisbecker
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2010-04-08 13:52 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
Paul Mackerras, David Miller, Archs
Le jeudi 08 avril 2010 à 14:32 +0200, Frederic Weisbecker a écrit :
>
> Can you please test this fix?
>
> Thanks.
>
> ---
> From 60d5c4e8498efc4a01abceef54ad3bc91993bf41 Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Thu, 8 Apr 2010 14:05:50 +0200
> Subject: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
>
> When we fetch the hot regs and rewind to the nth caller, it
> might happen that we dereference a frame pointer outside the
> kernel stack boundaries, like in this example:
>
> perf_trace_sched_switch+0xd5/0x120
> schedule+0x6b5/0x860
> retint_careful+0xd/0x21
>
> Since we directly dereference a userspace frame pointer here while
> rewinding behind retint_careful, this may end up in a crash.
>
> Fix this by simply using probe_kernel_address() when we rewind the
> frame pointer.
>
> This issue will have a much more proper fix in the next version of the
> perf_arch_fetch_caller_regs() API that will only need to rewind to the
> first caller.
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: Archs <linux-arch@vger.kernel.org>
> ---
> arch/x86/kernel/dumpstack.h | 8 ++++++--
> 1 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
> index e39e771..e1a93be 100644
> --- a/arch/x86/kernel/dumpstack.h
> +++ b/arch/x86/kernel/dumpstack.h
> @@ -14,6 +14,8 @@
> #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
> #endif
>
> +#include <linux/uaccess.h>
> +
> extern void
> show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
> unsigned long *stack, unsigned long bp, char *log_lvl);
> @@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
> get_bp(frame);
>
> #ifdef CONFIG_FRAME_POINTER
> - while (n--)
> - frame = frame->next_frame;
> + while (n--) {
> + if (probe_kernel_address(&frame->next_frame, frame))
> + break;
> + }
> #endif
>
> return (unsigned long)frame;
Thanks, no more crash :)
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [GIT PULL] perf fix
2010-04-08 13:52 ` Eric Dumazet
@ 2010-04-08 17:31 ` Frederic Weisbecker
2010-04-13 22:51 ` Ingo Molnar
0 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 17:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Eric Dumazet, Peter Zijlstra,
Arnaldo Carvalho de Melo, Paul Mackerras, David Miller, Archs
Ingo,
Please pull the perf/urgent branch that can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
perf/urgent
Thanks,
Frederic
---
Frederic Weisbecker (1):
perf: Fix unsafe frame rewinding with hot regs fetching
arch/x86/kernel/dumpstack.h | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
---
commit ab285f2b5290d92b7ec1a6f9aad54308dadf6157
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu Apr 8 14:05:50 2010 +0200
perf: Fix unsafe frame rewinding with hot regs fetching
When we fetch the hot regs and rewind to the nth caller, it
might happen that we dereference a frame pointer outside the
kernel stack boundaries, like in this example:
perf_trace_sched_switch+0xd5/0x120
schedule+0x6b5/0x860
retint_careful+0xd/0x21
Since we directly dereference a userspace frame pointer here while
rewinding behind retint_careful, this may end up in a crash.
Fix this by simply using probe_kernel_address() when we rewind the
frame pointer.
This issue will have a much more proper fix in the next version of the
perf_arch_fetch_caller_regs() API that will only need to rewind to the
first caller.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
index e39e771..e1a93be 100644
--- a/arch/x86/kernel/dumpstack.h
+++ b/arch/x86/kernel/dumpstack.h
@@ -14,6 +14,8 @@
#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
#endif
+#include <linux/uaccess.h>
+
extern void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl);
@@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
get_bp(frame);
#ifdef CONFIG_FRAME_POINTER
- while (n--)
- frame = frame->next_frame;
+ while (n--) {
+ if (probe_kernel_address(&frame->next_frame, frame))
+ break;
+ }
#endif
return (unsigned long)frame;
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [GIT PULL] perf fix
2010-04-08 17:31 ` [GIT PULL] perf fix Frederic Weisbecker
@ 2010-04-13 22:51 ` Ingo Molnar
0 siblings, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2010-04-13 22:51 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Eric Dumazet, Peter Zijlstra, Arnaldo Carvalho de Melo,
Paul Mackerras, David Miller, Archs
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Ingo,
>
> Please pull the perf/urgent branch that can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> perf/urgent
>
> Thanks,
> Frederic
> ---
>
> Frederic Weisbecker (1):
> perf: Fix unsafe frame rewinding with hot regs fetching
>
>
> arch/x86/kernel/dumpstack.h | 8 ++++++--
> 1 files changed, 6 insertions(+), 2 deletions(-)
Pulled, thanks a lot Frederic!
Ingo
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2010-04-13 22:51 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-26 1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-03-26 1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
2010-03-26 1:52 ` Frederic Weisbecker
2010-04-08 9:57 ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
2010-04-08 10:59 ` Frederic Weisbecker
2010-04-08 12:32 ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
2010-04-08 13:52 ` Eric Dumazet
2010-04-08 17:31 ` [GIT PULL] perf fix Frederic Weisbecker
2010-04-13 22:51 ` Ingo Molnar
2010-03-26 6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
2010-03-26 7:58 ` Ingo Molnar
2010-03-26 17:38 ` Frederic Weisbecker
2010-03-26 17:45 ` Frederic Weisbecker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).