* [PATCH 1/6] x86: Save stack pointer in perf live regs savings
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-02 16:29 ` [PATCH 2/6] x86: Fetch stack from regs when possible in dump_trace() Frederic Weisbecker
` (5 subsequent siblings)
6 siblings, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo
In order to prepare for fetching the stack pointer from the
regs when possible in dump_trace() instead of taking the
local one, save the current stack pointer in perf live regs saving.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/x86/include/asm/perf_event.h | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index d9d4dae..094fb30 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -152,6 +152,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->bp = caller_frame_pointer(); \
(regs)->cs = __KERNEL_CS; \
regs->flags = 0; \
+ asm volatile( \
+ _ASM_MOV "%%"_ASM_SP ", %0\n" \
+ : "=m" ((regs)->sp) \
+ :: "memory" \
+ ); \
}
#else
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* [PATCH 2/6] x86: Fetch stack from regs when possible in dump_trace()
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
2011-07-02 16:29 ` [PATCH 1/6] x86: Save stack pointer in perf live regs savings Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-02 16:29 ` [PATCH 3/6] x86,64: Simplify save_regs() Frederic Weisbecker
` (4 subsequent siblings)
6 siblings, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo
When regs are passed to dump_stack(), we fetch the frame
pointer from the regs but the stack pointer is taken from
the current frame.
Thus the frame and stack pointers may not come from the same
context. For example this can result in the unwinder to
think the context is in irq, due to the current value of
the stack, but the frame pointer coming from the regs points
to a frame from another place. It then tries to fix up
the irq link but ends up dereferencing a random frame
pointer that doesn't belong to the irq stack:
[ 9131.706906] ------------[ cut here ]------------
[ 9131.707003] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2aa/0x330()
[ 9131.707003] Hardware name: AMD690VM-FMH
[ 9131.707003] Perf: bad frame pointer = 0000000000000005 in callchain
[ 9131.707003] Modules linked in:
[ 9131.707003] Pid: 1050, comm: perf Not tainted 3.0.0-rc3+ #181
[ 9131.707003] Call Trace:
[ 9131.707003] <IRQ> [<ffffffff8104bd4a>] warn_slowpath_common+0x7a/0xb0
[ 9131.707003] [<ffffffff8104be21>] warn_slowpath_fmt+0x41/0x50
[ 9131.707003] [<ffffffff8178b873>] ? bad_to_user+0x6d/0x10be
[ 9131.707003] [<ffffffff8100c2da>] dump_trace+0x2aa/0x330
[ 9131.707003] [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50
[ 9131.707003] [<ffffffff8101b164>] perf_callchain_kernel+0x54/0x70
[ 9131.707003] [<ffffffff810d391f>] perf_prepare_sample+0x19f/0x2a0
[ 9131.707003] [<ffffffff810d546c>] __perf_event_overflow+0x16c/0x290
[ 9131.707003] [<ffffffff810d5430>] ? __perf_event_overflow+0x130/0x290
[ 9131.707003] [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50
[ 9131.707003] [<ffffffff8100fbb9>] ? sched_clock+0x9/0x10
[ 9131.707003] [<ffffffff810752e5>] ? T.375+0x15/0x90
[ 9131.707003] [<ffffffff81084da4>] ? trace_hardirqs_on_caller+0x64/0x180
[ 9131.707003] [<ffffffff810817bd>] ? trace_hardirqs_off+0xd/0x10
[ 9131.707003] [<ffffffff810d5764>] perf_event_overflow+0x14/0x20
[ 9131.707003] [<ffffffff810d588c>] perf_swevent_hrtimer+0x11c/0x130
[ 9131.707003] [<ffffffff817821a1>] ? error_exit+0x51/0xb0
[ 9131.707003] [<ffffffff81072e93>] __run_hrtimer+0x83/0x1e0
[ 9131.707003] [<ffffffff810d5770>] ? perf_event_overflow+0x20/0x20
[ 9131.707003] [<ffffffff81073256>] hrtimer_interrupt+0x106/0x250
[ 9131.707003] [<ffffffff812a3bfd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 9131.707003] [<ffffffff81024833>] smp_apic_timer_interrupt+0x53/0x90
[ 9131.707003] [<ffffffff81789053>] apic_timer_interrupt+0x13/0x20
[ 9131.707003] <EOI> [<ffffffff817821a1>] ? error_exit+0x51/0xb0
[ 9131.707003] [<ffffffff8178219c>] ? error_exit+0x4c/0xb0
[ 9131.707003] ---[ end trace b2560d4876709347 ]---
Fix this by simply taking the stack pointer from regs->sp
when regs are provided.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/x86/kernel/dumpstack_64.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index e71c98d..788295c 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -155,9 +155,12 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
task = current;
if (!stack) {
- stack = &dummy;
- if (task && task != current)
+ if (regs)
+ stack = (unsigned long *)regs->sp;
+ else if (task && task != current)
stack = (unsigned long *)task->thread.sp;
+ else
+ stack = &dummy;
}
if (!bp)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
2011-07-02 16:29 ` [PATCH 1/6] x86: Save stack pointer in perf live regs savings Frederic Weisbecker
2011-07-02 16:29 ` [PATCH 2/6] x86: Fetch stack from regs when possible in dump_trace() Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-04 7:20 ` Jan Beulich
2011-07-02 16:29 ` [PATCH 4/6] x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ Frederic Weisbecker
` (3 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo, Jan Beulich
The save_regs function that saves the regs on low level
irq entry is complicated because of the fact it changes
its stack in the middle and also because it manipulates
data allocated in the caller frame and accesses there
are directly calculated from callee rsp value with the
return address in the middle of the way.
This complicates the static stack offsets calculation and
require more dynamic ones. It also needs a save/restore
of the function's return address.
To simplify and optimize this, turn save_regs() into a
macro.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
---
arch/x86/kernel/entry_64.S | 44 +++++++++++++++++---------------------------
1 files changed, 17 insertions(+), 27 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 8a445a0..b6b2e85 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -297,27 +297,22 @@ ENDPROC(native_usergs_sysret64)
.endm
/* save partial stack frame */
- .pushsection .kprobes.text, "ax"
-ENTRY(save_args)
- XCPT_FRAME
+ .macro SAVE_ARGS_IRQ
cld
- /*
- * start from rbp in pt_regs and jump over
- * return address.
- */
- movq_cfi rdi, RDI+8-RBP
- movq_cfi rsi, RSI+8-RBP
- movq_cfi rdx, RDX+8-RBP
- movq_cfi rcx, RCX+8-RBP
- movq_cfi rax, RAX+8-RBP
- movq_cfi r8, R8+8-RBP
- movq_cfi r9, R9+8-RBP
- movq_cfi r10, R10+8-RBP
- movq_cfi r11, R11+8-RBP
-
- leaq -RBP+8(%rsp),%rdi /* arg1 for handler */
- movq_cfi rbp, 8 /* push %rbp */
- leaq 8(%rsp), %rbp /* mov %rsp, %ebp */
+ /* start from rbp in pt_regs and jump over */
+ movq_cfi rdi, RDI-RBP
+ movq_cfi rsi, RSI-RBP
+ movq_cfi rdx, RDX-RBP
+ movq_cfi rcx, RCX-RBP
+ movq_cfi rax, RAX-RBP
+ movq_cfi r8, R8-RBP
+ movq_cfi r9, R9-RBP
+ movq_cfi r10, R10-RBP
+ movq_cfi r11, R11-RBP
+
+ leaq -RBP(%rsp),%rdi /* arg1 for handler */
+ movq_cfi rbp, 0 /* push %rbp */
+ movq %rsp, %rbp
testl $3, CS(%rdi)
je 1f
SWAPGS
@@ -329,19 +324,14 @@ ENTRY(save_args)
*/
1: incl PER_CPU_VAR(irq_count)
jne 2f
- popq_cfi %rax /* move return address... */
mov PER_CPU_VAR(irq_stack_ptr),%rsp
EMPTY_FRAME 0
pushq_cfi %rbp /* backlink for unwinder */
- pushq_cfi %rax /* ... to the new stack */
/*
* We entered an interrupt context - irqs are off:
*/
2: TRACE_IRQS_OFF
- ret
- CFI_ENDPROC
-END(save_args)
- .popsection
+ .endm
ENTRY(save_rest)
PARTIAL_FRAME 1 REST_SKIP+8
@@ -791,7 +781,7 @@ END(interrupt)
/* reserve pt_regs for scratch regs and rbp */
subq $ORIG_RAX-RBP, %rsp
CFI_ADJUST_CFA_OFFSET ORIG_RAX-RBP
- call save_args
+ SAVE_ARGS_IRQ
PARTIAL_FRAME 0
call \func
.endm
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-02 16:29 ` [PATCH 3/6] x86,64: Simplify save_regs() Frederic Weisbecker
@ 2011-07-04 7:20 ` Jan Beulich
2011-07-04 12:57 ` Frederic Weisbecker
2011-07-04 12:59 ` Frederic Weisbecker
0 siblings, 2 replies; 20+ messages in thread
From: Jan Beulich @ 2011-07-04 7:20 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
>>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> The save_regs function that saves the regs on low level
> irq entry is complicated because of the fact it changes
> its stack in the middle and also because it manipulates
> data allocated in the caller frame and accesses there
> are directly calculated from callee rsp value with the
> return address in the middle of the way.
>
> This complicates the static stack offsets calculation and
> require more dynamic ones. It also needs a save/restore
> of the function's return address.
>
> To simplify and optimize this, turn save_regs() into a
> macro.
So this got pulled out into a function a couple of releases ago,
and now it's being converted back? Wasn't the original patch's
intention to reduce the amount of duplication of generated
code?
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jan Beulich <JBeulich@novell.com>
> ---
> arch/x86/kernel/entry_64.S | 44 +++++++++++++++++---------------------------
> 1 files changed, 17 insertions(+), 27 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 8a445a0..b6b2e85 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -297,27 +297,22 @@ ENDPROC(native_usergs_sysret64)
> .endm
>
> /* save partial stack frame */
> - .pushsection .kprobes.text, "ax"
> -ENTRY(save_args)
> - XCPT_FRAME
> + .macro SAVE_ARGS_IRQ
> cld
> - /*
> - * start from rbp in pt_regs and jump over
> - * return address.
> - */
> - movq_cfi rdi, RDI+8-RBP
> - movq_cfi rsi, RSI+8-RBP
> - movq_cfi rdx, RDX+8-RBP
> - movq_cfi rcx, RCX+8-RBP
> - movq_cfi rax, RAX+8-RBP
> - movq_cfi r8, R8+8-RBP
> - movq_cfi r9, R9+8-RBP
> - movq_cfi r10, R10+8-RBP
> - movq_cfi r11, R11+8-RBP
> -
> - leaq -RBP+8(%rsp),%rdi /* arg1 for handler */
> - movq_cfi rbp, 8 /* push %rbp */
> - leaq 8(%rsp), %rbp /* mov %rsp, %ebp */
> + /* start from rbp in pt_regs and jump over */
> + movq_cfi rdi, RDI-RBP
> + movq_cfi rsi, RSI-RBP
> + movq_cfi rdx, RDX-RBP
> + movq_cfi rcx, RCX-RBP
> + movq_cfi rax, RAX-RBP
> + movq_cfi r8, R8-RBP
> + movq_cfi r9, R9-RBP
> + movq_cfi r10, R10-RBP
> + movq_cfi r11, R11-RBP
> +
> + leaq -RBP(%rsp),%rdi /* arg1 for handler */
> + movq_cfi rbp, 0 /* push %rbp */
> + movq %rsp, %rbp
> testl $3, CS(%rdi)
> je 1f
> SWAPGS
> @@ -329,19 +324,14 @@ ENTRY(save_args)
> */
> 1: incl PER_CPU_VAR(irq_count)
> jne 2f
> - popq_cfi %rax /* move return address... */
> mov PER_CPU_VAR(irq_stack_ptr),%rsp
> EMPTY_FRAME 0
> pushq_cfi %rbp /* backlink for unwinder */
> - pushq_cfi %rax /* ... to the new stack */
> /*
> * We entered an interrupt context - irqs are off:
> */
> 2: TRACE_IRQS_OFF
> - ret
> - CFI_ENDPROC
> -END(save_args)
> - .popsection
> + .endm
>
> ENTRY(save_rest)
> PARTIAL_FRAME 1 REST_SKIP+8
> @@ -791,7 +781,7 @@ END(interrupt)
> /* reserve pt_regs for scratch regs and rbp */
> subq $ORIG_RAX-RBP, %rsp
> CFI_ADJUST_CFA_OFFSET ORIG_RAX-RBP
> - call save_args
> + SAVE_ARGS_IRQ
> PARTIAL_FRAME 0
With the above CFI adjustments all looking correct, this PARTIAL_FRAME
directive now is at best redundant, and hence should be removed.
Jan
> call \func
> .endm
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-04 7:20 ` Jan Beulich
@ 2011-07-04 12:57 ` Frederic Weisbecker
2011-07-06 17:34 ` Andi Kleen
2011-07-04 12:59 ` Frederic Weisbecker
1 sibling, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-04 12:57 UTC (permalink / raw)
To: Jan Beulich
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
On Mon, Jul 04, 2011 at 08:20:51AM +0100, Jan Beulich wrote:
> >>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > The save_regs function that saves the regs on low level
> > irq entry is complicated because of the fact it changes
> > its stack in the middle and also because it manipulates
> > data allocated in the caller frame and accesses there
> > are directly calculated from callee rsp value with the
> > return address in the middle of the way.
> >
> > This complicates the static stack offsets calculation and
> > require more dynamic ones. It also needs a save/restore
> > of the function's return address.
> >
> > To simplify and optimize this, turn save_regs() into a
> > macro.
>
> So this got pulled out into a function a couple of releases ago,
> and now it's being converted back? Wasn't the original patch's
> intention to reduce the amount of duplication of generated
> code?
Right. I didn't know there was a conversion a while ago for
this save_regs() from a macro to a func in order to remove
code duplication.
I really did not think about code duplication, considering
it's better to optimize the irq entry path.
What do you guys think? We can still revert the whole patchset.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-04 12:57 ` Frederic Weisbecker
@ 2011-07-06 17:34 ` Andi Kleen
2011-07-06 18:20 ` H. Peter Anvin
0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2011-07-06 17:34 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Jan Beulich, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
Frederic Weisbecker <fweisbec@gmail.com> writes:
>
> I really did not think about code duplication, considering
> it's better to optimize the irq entry path.
>
> What do you guys think? We can still revert the whole patchset.
FWIW I think it should be a macro, like it was in the original code.
Optimizing entry*.S for code size doesn't make a lot of sense.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-06 17:34 ` Andi Kleen
@ 2011-07-06 18:20 ` H. Peter Anvin
2011-07-06 20:42 ` Frederic Weisbecker
0 siblings, 1 reply; 20+ messages in thread
From: H. Peter Anvin @ 2011-07-06 18:20 UTC (permalink / raw)
To: Andi Kleen
Cc: Frederic Weisbecker, Jan Beulich, Peter Zijlstra, Ingo Molnar,
Thomas Gleixner, Arnaldo Carvalho de Melo, LKML
On 07/06/2011 10:34 AM, Andi Kleen wrote:
> Frederic Weisbecker <fweisbec@gmail.com> writes:
>>
>> I really did not think about code duplication, considering
>> it's better to optimize the irq entry path.
>>
>> What do you guys think? We can still revert the whole patchset.
>
> FWIW I think it should be a macro, like it was in the original code.
>
> Optimizing entry*.S for code size doesn't make a lot of sense.
>
Code size, no.
*Path* size and cache/prefetch friendliness is another matter.
The subroutine is bad on that account, too, so yes, this really seems
like a losing proposition.
I'm not too fond of the gajillion obtuse macros we have, but subroutines
doesn't make it really any better.
-hpa
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-06 18:20 ` H. Peter Anvin
@ 2011-07-06 20:42 ` Frederic Weisbecker
0 siblings, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-06 20:42 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andi Kleen, Jan Beulich, Peter Zijlstra, Ingo Molnar,
Thomas Gleixner, Arnaldo Carvalho de Melo, LKML
On Wed, Jul 06, 2011 at 11:20:49AM -0700, H. Peter Anvin wrote:
> On 07/06/2011 10:34 AM, Andi Kleen wrote:
> > Frederic Weisbecker <fweisbec@gmail.com> writes:
> >>
> >> I really did not think about code duplication, considering
> >> it's better to optimize the irq entry path.
> >>
> >> What do you guys think? We can still revert the whole patchset.
> >
> > FWIW I think it should be a macro, like it was in the original code.
> >
> > Optimizing entry*.S for code size doesn't make a lot of sense.
> >
>
> Code size, no.
>
> *Path* size and cache/prefetch friendliness is another matter.
> The subroutine is bad on that account, too, so yes, this really seems
> like a losing proposition.
>
> I'm not too fond of the gajillion obtuse macros we have, but subroutines
> doesn't make it really any better.
Fine, so I guess we can keep that macro conversion.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/6] x86,64: Simplify save_regs()
2011-07-04 7:20 ` Jan Beulich
2011-07-04 12:57 ` Frederic Weisbecker
@ 2011-07-04 12:59 ` Frederic Weisbecker
1 sibling, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-04 12:59 UTC (permalink / raw)
To: Jan Beulich
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
On Mon, Jul 04, 2011 at 08:20:51AM +0100, Jan Beulich wrote:
> > ENTRY(save_rest)
> > PARTIAL_FRAME 1 REST_SKIP+8
> > @@ -791,7 +781,7 @@ END(interrupt)
> > /* reserve pt_regs for scratch regs and rbp */
> > subq $ORIG_RAX-RBP, %rsp
> > CFI_ADJUST_CFA_OFFSET ORIG_RAX-RBP
> > - call save_args
> > + SAVE_ARGS_IRQ
> > PARTIAL_FRAME 0
>
> With the above CFI adjustments all looking correct, this PARTIAL_FRAME
> directive now is at best redundant, and hence should be removed.
Indeed, will fix.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/6] x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
` (2 preceding siblings ...)
2011-07-02 16:29 ` [PATCH 3/6] x86,64: Simplify save_regs() Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-04 7:34 ` Jan Beulich
2011-07-02 16:29 ` [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving Frederic Weisbecker
` (2 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo, Jan Beulich
Just for clarity in the code. Have a first block that handles
the frame pointer and a separate one that handles pt_regs
pointer and its use.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
---
arch/x86/kernel/entry_64.S | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index b6b2e85..20dc8e6 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -310,9 +310,10 @@ ENDPROC(native_usergs_sysret64)
movq_cfi r10, R10-RBP
movq_cfi r11, R11-RBP
- leaq -RBP(%rsp),%rdi /* arg1 for handler */
movq_cfi rbp, 0 /* push %rbp */
movq %rsp, %rbp
+
+ leaq -RBP(%rsp),%rdi /* arg1 for handler */
testl $3, CS(%rdi)
je 1f
SWAPGS
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 4/6] x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ
2011-07-02 16:29 ` [PATCH 4/6] x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ Frederic Weisbecker
@ 2011-07-04 7:34 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2011-07-04 7:34 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
>>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Just for clarity in the code. Have a first block that handles
> the frame pointer and a separate one that handles pt_regs
> pointer and its use.
While possible indeed making the code easier to read, it increases the
chance of a latency related stall (due to the write to %rdi immediately
preceding its use as a memory operand's address).
Jan
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jan Beulich <JBeulich@novell.com>
> ---
> arch/x86/kernel/entry_64.S | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index b6b2e85..20dc8e6 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -310,9 +310,10 @@ ENDPROC(native_usergs_sysret64)
> movq_cfi r10, R10-RBP
> movq_cfi r11, R11-RBP
>
> - leaq -RBP(%rsp),%rdi /* arg1 for handler */
> movq_cfi rbp, 0 /* push %rbp */
> movq %rsp, %rbp
> +
> + leaq -RBP(%rsp),%rdi /* arg1 for handler */
> testl $3, CS(%rdi)
> je 1f
> SWAPGS
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
` (3 preceding siblings ...)
2011-07-02 16:29 ` [PATCH 4/6] x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-04 7:29 ` Jan Beulich
2011-07-02 16:29 ` [PATCH 6/6] x86: Don't use frame pointer to save old stack on irq entry Frederic Weisbecker
2011-07-04 9:13 ` [RFC GIT PULL] x86 entry / perf stacktrace changes Ingo Molnar
6 siblings, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo, Jan Beulich
The unwinder backlink in interrupt entry is very useless.
It's actually not part of the stack frame chain and thus is
never used.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
---
arch/x86/kernel/entry_64.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 20dc8e6..6131432 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -327,7 +327,6 @@ ENDPROC(native_usergs_sysret64)
jne 2f
mov PER_CPU_VAR(irq_stack_ptr),%rsp
EMPTY_FRAME 0
- pushq_cfi %rbp /* backlink for unwinder */
/*
* We entered an interrupt context - irqs are off:
*/
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-02 16:29 ` [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving Frederic Weisbecker
@ 2011-07-04 7:29 ` Jan Beulich
2011-07-04 9:17 ` Ingo Molnar
2011-07-04 13:22 ` Frederic Weisbecker
0 siblings, 2 replies; 20+ messages in thread
From: Jan Beulich @ 2011-07-04 7:29 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
>>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> The unwinder backlink in interrupt entry is very useless.
> It's actually not part of the stack frame chain and thus is
> never used.
I very much doubt this - see dump_trace()'s comment in its IRQ-stack
related code portion (and the corresponding use of irq_stack_end[-1]).
Jan
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jan Beulich <JBeulich@novell.com>
> ---
> arch/x86/kernel/entry_64.S | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 20dc8e6..6131432 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -327,7 +327,6 @@ ENDPROC(native_usergs_sysret64)
> jne 2f
> mov PER_CPU_VAR(irq_stack_ptr),%rsp
> EMPTY_FRAME 0
> - pushq_cfi %rbp /* backlink for unwinder */
> /*
> * We entered an interrupt context - irqs are off:
> */
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-04 7:29 ` Jan Beulich
@ 2011-07-04 9:17 ` Ingo Molnar
2011-07-04 13:10 ` Frederic Weisbecker
2011-07-04 13:22 ` Frederic Weisbecker
1 sibling, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2011-07-04 9:17 UTC (permalink / raw)
To: Jan Beulich
Cc: Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
* Jan Beulich <JBeulich@novell.com> wrote:
> >>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > The unwinder backlink in interrupt entry is very useless.
> > It's actually not part of the stack frame chain and thus is
> > never used.
>
> I very much doubt this - see dump_trace()'s comment in its IRQ-stack
> related code portion (and the corresponding use of irq_stack_end[-1]).
> > +++ b/arch/x86/kernel/entry_64.S
> > @@ -327,7 +327,6 @@ ENDPROC(native_usergs_sysret64)
> > jne 2f
> > mov PER_CPU_VAR(irq_stack_ptr),%rsp
> > EMPTY_FRAME 0
> > - pushq_cfi %rbp /* backlink for unwinder */
> > /*
> > * We entered an interrupt context - irqs are off:
> > */
Frederic, please add it back with a much better comment in the .S
showing where it's used and how. Perhaps even try to trigger the
usage of this backlink and document the effect in the changelog.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-04 9:17 ` Ingo Molnar
@ 2011-07-04 13:10 ` Frederic Weisbecker
0 siblings, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-04 13:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jan Beulich, Peter Zijlstra, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
On Mon, Jul 04, 2011 at 11:17:42AM +0200, Ingo Molnar wrote:
>
> * Jan Beulich <JBeulich@novell.com> wrote:
>
> > >>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > The unwinder backlink in interrupt entry is very useless.
> > > It's actually not part of the stack frame chain and thus is
> > > never used.
> >
> > I very much doubt this - see dump_trace()'s comment in its IRQ-stack
> > related code portion (and the corresponding use of irq_stack_end[-1]).
>
> > > +++ b/arch/x86/kernel/entry_64.S
> > > @@ -327,7 +327,6 @@ ENDPROC(native_usergs_sysret64)
> > > jne 2f
> > > mov PER_CPU_VAR(irq_stack_ptr),%rsp
> > > EMPTY_FRAME 0
> > > - pushq_cfi %rbp /* backlink for unwinder */
> > > /*
> > > * We entered an interrupt context - irqs are off:
> > > */
>
> Frederic, please add it back with a much better comment in the .S
> showing where it's used and how. Perhaps even try to trigger the
> usage of this backlink and document the effect in the changelog.
Ok. Please also consider the point Jan made: save_regs() was created
a while ago to convert code from a macro to a function in order to
reduce the amount of duplicated code amongst interrupts.
I did not think about that. On the other hand, keeping that into a
function unoptimize a bit the interrupt entries I think (but then
at the cost of increasing a bit the i-cache footprint?)
Not sure what we should do. We can still revert/zap the whole
and focus on pure stacktrace fixes without underlying optimizations.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-04 7:29 ` Jan Beulich
2011-07-04 9:17 ` Ingo Molnar
@ 2011-07-04 13:22 ` Frederic Weisbecker
2011-07-05 22:21 ` Valdis.Kletnieks
1 sibling, 1 reply; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-04 13:22 UTC (permalink / raw)
To: Jan Beulich
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
On Mon, Jul 04, 2011 at 08:29:08AM +0100, Jan Beulich wrote:
> >>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > The unwinder backlink in interrupt entry is very useless.
> > It's actually not part of the stack frame chain and thus is
> > never used.
>
> I very much doubt this - see dump_trace()'s comment in its IRQ-stack
> related code portion (and the corresponding use of irq_stack_end[-1]).
>
> Jan
Good point. I misunderstood that.
But then I believe I accidentally fixed it back in
"x86: Don't use frame pointer to save old stack on irq entry" by
pushing %rsi instead in the new stack. It contains the backlink to
the old stack.
If we keep the macro as-is, I'll add a comment to explain further
what is involved there.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving
2011-07-04 13:22 ` Frederic Weisbecker
@ 2011-07-05 22:21 ` Valdis.Kletnieks
0 siblings, 0 replies; 20+ messages in thread
From: Valdis.Kletnieks @ 2011-07-05 22:21 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Jan Beulich, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
Arnaldo Carvalho de Melo, LKML, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 946 bytes --]
On Mon, 04 Jul 2011 15:22:23 +0200, Frederic Weisbecker said:
> On Mon, Jul 04, 2011 at 08:29:08AM +0100, Jan Beulich wrote:
> > >>> On 02.07.11 at 18:29, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > The unwinder backlink in interrupt entry is very useless.
> > > It's actually not part of the stack frame chain and thus is
> > > never used.
> >
> > I very much doubt this - see dump_trace()'s comment in its IRQ-stack
> > related code portion (and the corresponding use of irq_stack_end[-1]).
> >
> > Jan
>
> Good point. I misunderstood that.
>
> But then I believe I accidentally fixed it back in
> "x86: Don't use frame pointer to save old stack on irq entry" by
> pushing %rsi instead in the new stack. It contains the backlink to
> the old stack.
>
> If we keep the macro as-is, I'll add a comment to explain further
> what is involved there.
Ahh.. I was wondering why there wasn't suddenly a very lonely popq
out there... :)
[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 6/6] x86: Don't use frame pointer to save old stack on irq entry
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
` (4 preceding siblings ...)
2011-07-02 16:29 ` [PATCH 5/6] x86: Remove useless unwinder backlink from irq regs saving Frederic Weisbecker
@ 2011-07-02 16:29 ` Frederic Weisbecker
2011-07-04 9:13 ` [RFC GIT PULL] x86 entry / perf stacktrace changes Ingo Molnar
6 siblings, 0 replies; 20+ messages in thread
From: Frederic Weisbecker @ 2011-07-02 16:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Thomas Gleixner, H. Peter Anvin,
Peter Zijlstra, Arnaldo Carvalho de Melo, Jan Beulich
rbp is used in SAVE_ARGS_IRQ to save the old stack pointer
in order to restore it later in ret_from_intr.
It is convenient because we save its value in the irq regs
and it's easily restored using the leave instruction.
However this is a kind of abuse of the frame pointer which
role is to help unwinding the kernel by chaining frames
together, each node following the return address to the
previous frame.
But although we are breaking the frame by changing the stack
pointer, there is no preceding return address before the new
frame. Hence using the frame pointer to link the two stacks
breaks the stack unwinders that find a random value instead of
a return address here.
There is no workaround that can work in every case. We are using
the fixup_bp_irq_link() function to dereference that abused frame
pointer in the case of non nesting interrupt (which means stack
changed).
But that doesn't fix the case of interrupts that don't change the
stack (but we still have the unconditional frame link), which is
the case of hardirq interrupting softirq. We have no way to detect
this transition so the frame irq link is considered as a real frame
pointer and the return address is dereferenced but it is still a
spurious one.
There are two possible results of this: either the spurious return
address, a random stack value, luckily belongs to the kernel text
and then the unwinding can continue and we just have a weird entry
in the stack trace. Or it doesn't belong to the kernel text and
unwinding stops there.
This is the reason why stacktraces (including perf callchains) on
irqs that interrupted softirqs don't work very well.
To solve this, we don't save the old stack pointer on rbp anymore
but we save it to a scratch register that we push on the new
stack and that we pop back later on irq return.
This preserves the whole frame chain without spurious return addresses
in the middle and drops the need for the horrid fixup_bp_irq_link()
workaround.
And finally irqs that interrupt softirq are sanely unwinded.
Before:
99.81% perf [kernel.kallsyms] [k] perf_pending_event
|
--- perf_pending_event
irq_work_run
smp_irq_work_interrupt
irq_work_interrupt
|
|--41.60%-- __read
| |
| |--99.90%-- create_worker
| | bench_sched_messaging
| | cmd_bench
| | run_builtin
| | main
| | __libc_start_main
| --0.10%-- [...]
After:
1.64% swapper [kernel.kallsyms] [k] perf_pending_event
|
--- perf_pending_event
irq_work_run
smp_irq_work_interrupt
irq_work_interrupt
|
|--95.00%-- arch_irq_work_raise
| irq_work_queue
| __perf_event_overflow
| perf_swevent_overflow
| perf_swevent_event
| perf_tp_event
| perf_trace_softirq
| __do_softirq
| call_softirq
| do_softirq
| irq_exit
| |
| |--73.68%-- smp_apic_timer_interrupt
| | apic_timer_interrupt
| | |
| | |--96.43%-- amd_e400_idle
| | | cpu_idle
| | | start_secondary
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
---
arch/x86/kernel/dumpstack_64.c | 30 ------------------------------
arch/x86/kernel/entry_64.S | 27 +++++++++++++++------------
2 files changed, 15 insertions(+), 42 deletions(-)
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 788295c..19853ad 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -105,34 +105,6 @@ in_irq_stack(unsigned long *stack, unsigned long *irq_stack,
}
/*
- * We are returning from the irq stack and go to the previous one.
- * If the previous stack is also in the irq stack, then bp in the first
- * frame of the irq stack points to the previous, interrupted one.
- * Otherwise we have another level of indirection: We first save
- * the bp of the previous stack, then we switch the stack to the irq one
- * and save a new bp that links to the previous one.
- * (See save_args())
- */
-static inline unsigned long
-fixup_bp_irq_link(unsigned long bp, unsigned long *stack,
- unsigned long *irq_stack, unsigned long *irq_stack_end)
-{
-#ifdef CONFIG_FRAME_POINTER
- struct stack_frame *frame = (struct stack_frame *)bp;
- unsigned long next;
-
- if (!in_irq_stack(stack, irq_stack, irq_stack_end)) {
- if (!probe_kernel_address(&frame->next_frame, next))
- return next;
- else
- WARN_ONCE(1, "Perf: bad frame pointer = %p in "
- "callchain\n", &frame->next_frame);
- }
-#endif
- return bp;
-}
-
-/*
* x86-64 can have up to three kernel stacks:
* process stack
* interrupt stack
@@ -208,8 +180,6 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
* pointer (index -1 to end) in the IRQ stack:
*/
stack = (unsigned long *) (irq_stack_end[-1]);
- bp = fixup_bp_irq_link(bp, stack, irq_stack,
- irq_stack_end);
irq_stack_end = NULL;
ops->stack(data, "EOI");
continue;
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 6131432..d656f68 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -310,8 +310,11 @@ ENDPROC(native_usergs_sysret64)
movq_cfi r10, R10-RBP
movq_cfi r11, R11-RBP
- movq_cfi rbp, 0 /* push %rbp */
- movq %rsp, %rbp
+ /* Save rbp so that we can unwind from get_irq_regs() */
+ movq_cfi rbp, 0
+
+ /* Save previous stack value */
+ movq %rsp, %rsi
leaq -RBP(%rsp),%rdi /* arg1 for handler */
testl $3, CS(%rdi)
@@ -327,10 +330,11 @@ ENDPROC(native_usergs_sysret64)
jne 2f
mov PER_CPU_VAR(irq_stack_ptr),%rsp
EMPTY_FRAME 0
- /*
- * We entered an interrupt context - irqs are off:
- */
-2: TRACE_IRQS_OFF
+
+2: /* Store previous stack value */
+ pushq %rsi
+ /* We entered an interrupt context - irqs are off: */
+ TRACE_IRQS_OFF
.endm
ENTRY(save_rest)
@@ -804,15 +808,14 @@ ret_from_intr:
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
decl PER_CPU_VAR(irq_count)
- leaveq
- CFI_RESTORE rbp
+ /* Restore saved previous stack */
+ popq %rsi
+ leaq 16(%rsi), %rsp
+
CFI_DEF_CFA_REGISTER rsp
- CFI_ADJUST_CFA_OFFSET -8
+ CFI_ADJUST_CFA_OFFSET -16
- /* we did not save rbx, restore only from ARGOFFSET */
- addq $8, %rsp
- CFI_ADJUST_CFA_OFFSET -8
exit_intr:
GET_THREAD_INFO(%rcx)
testl $3,CS-ARGOFFSET(%rsp)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [RFC GIT PULL] x86 entry / perf stacktrace changes
2011-07-02 16:29 [RFC GIT PULL] x86 entry / perf stacktrace changes Frederic Weisbecker
` (5 preceding siblings ...)
2011-07-02 16:29 ` [PATCH 6/6] x86: Don't use frame pointer to save old stack on irq entry Frederic Weisbecker
@ 2011-07-04 9:13 ` Ingo Molnar
6 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2011-07-04 9:13 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Thomas Gleixner, H. Peter Anvin, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jan Beulich
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Ingo,
>
> Please pull the perf/stacktrace branch that can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> perf/stacktrace
>
> Or may be not yet. It's still in RFC because I would like
> to ensure I did not break cfi annotations. I hope Jan beulich
> or others can have a look.
>
> Sorry to mess up irq entry and perf stacktrace changes in the same
> topic. But the changes happen to be very interconnected.
>
> Several cleanups, fixes and optimizations for the x86-64
> stacktraces and irq entry.
>
> Last patch seems to add a bit of overhead in the irq entry
> (one more "push") but given the changes made in previous
> patches, the end result is a more optimized and more clear irq
> entry.
>
> Thanks,
> Frederic
> ---
>
> Frederic Weisbecker (6):
> x86: Save stack pointer in perf live regs savings
> x86: Fetch stack from regs when possible in dump_trace()
> x86,64: Simplify save_regs()
> x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ
> x86: Remove useless unwinder backlink from irq regs saving
> x86: Don't use frame pointer to save old stack on irq entry
>
>
> arch/x86/include/asm/perf_event.h | 5 +++
> arch/x86/kernel/dumpstack_64.c | 37 +++-----------------
> arch/x86/kernel/entry_64.S | 69 ++++++++++++++++--------------------
> 3 files changed, 41 insertions(+), 70 deletions(-)
Pulled, thanks a lot Frederic! The irq backtrace fixes look pretty
important - if there's any CFI annotation detail to be fixed we can
do it in a delta.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 20+ messages in thread