From: "Wangnan (F)" <wangnan0@huawei.com>
To: Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
David Ahern <dsahern@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Milian Wolff <milian.wolff@kdab.com>
Cc: <linux-kernel@vger.kernel.org>, pi3orama <pi3orama@163.com>,
"lizefan 00213767" <lizefan@huawei.com>
Subject: [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction
Date: Wed, 18 Nov 2015 15:20:36 +0800 [thread overview]
Message-ID: <564C26C4.2040603@huawei.com> (raw)
Hi all,
When analysising Jiri's patchset [1] I found a dwarf unwind problem.
On x86 platform, when sample is at a 'callq' instruction, dwarf based
stack unwind always fail.
I compile a small C source file with debug information, turn off
frame pointer and disable optimization:
$ gcc -g -O0 -fomit-frame-pointer ./test_dwarf_unwind.c -o
./test_dwarf_unwind
Here is part of asm code. Please have a look at 40057d. Also, have a
look at 400534 and 4005bc, from these two instructions we know this
"funcc" need 0x38 bytes on stack, so we know where is the return address.
$ objdump -S ./test_dwarf_unwind
...
0000000000400534 <funcc>:
#include <sys/time.h>
static volatile int x = 0;
int funcc(void)
{
400534: 48 83 ec 38 sub $0x38,%rsp
struct timeval tv1, tv2;
unsigned long us1, us2;
gettimeofday(&tv1, NULL);
400538: 48 8d 44 24 10 lea 0x10(%rsp),%rax
...
400572: 48 89 e0 mov %rsp,%rax
400575: be 00 00 00 00 mov $0x0,%esi
40057a: 48 89 c7 mov %rax,%rdi
40057d: e8 7e fe ff ff callq 400400
<gettimeofday@plt> <---!!!
us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
400582: 48 8b 04 24 mov (%rsp),%rax
400586: 48 69 d0 40 42 0f 00 imul $0xf4240,%rax,%rdx
...
return x;
4005b6: 8b 05 74 04 20 00 mov 0x200474(%rip),%eax
# 600a30 <x>
}
4005bc: 48 83 c4 38 add $0x38,%rsp
4005c0: c3 retq
Then use perf to trace it:
# ./pref record -g -F 9 --call-graph dwarf ./test_dwarf_unwind
# ./perf report --no-children --stdio
...
1.23% test_dwarf_unwi test_dwarf_unwind [.] funcc
|
---funcc
|
|--0.62%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.62%-- 0 <-- I applied Jiri's patch in [1] so see
this 0,
which indicates a broken call chain.
From 'perf script' result I found the two samples inside funcc:
# ./perf script
...
test_dwarf_unwi 25232 2201855.424651: 443365670 cycles:pp:
4005a4 funcc (/home/w00229757/test_dwarf_unwind)
4005ca funcb (/home/w00229757/test_dwarf_unwind)
4005d8 funca (/home/w00229757/test_dwarf_unwind)
4005e6 main (/home/w00229757/test_dwarf_unwind)
7facc5c2ebd5 __libc_start_main
(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
400459 _start (/home/w00229757/test_dwarf_unwind)
...
test_dwarf_unwi 25232 2201856.647088: 443268987 cycles:pp:
40057d funcc (/home/w00229757/test_dwarf_unwind)
0 [unknown] ([unknown])
With perf report -D I can manually check their stack information manually.
This is stack of sample at 4005a4:
# ./perf report -D
...
0x1d9c8 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
. 0000: 09 00 00 00 02 40 e8 20 a4 05 40 00 00 00 00 00 .....@. ..@.....
...
. 00d0: 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 ......... ......
. 00e0: 2b f6 4c 56 00 00 00 00 46 52 02 00 00 00 00 00 +.LV....FR......
. 00f0: 29 f6 4c 56 00 00 00 00 70 e4 0c 00 00 00 00 00 ).LV....p.......
. 0100: 06 f3 3b d7 d7 24 05 00 b0 00 28 d7 d7 24 05 00 ..;..$....(..$..
. 0110: 00 00 00 00 00 00 00 00 ca 05 40 00 00 00 00 00
...
. 20e0: b0 06 00 00 00 00 00 00 ........
.
2201855424651081 0x1d9c8 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002):
25232/25232: 0x4005a4 period: 443365670 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.......
.... IP 0x4005a4
.......
... ustack: size 1712, offset 0xd8
....
From the above information we can check its stack from 0xe0 (stack size
reside at 0xd8 as a u64 so stack data start at 0xe0). Based on information
we get from the above asm code, we know the return address should at the
0x38 of the stack, which is 0x4005ca.
This is stack of sample at 40057d:
0x343c0 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
. 0000: 09 00 00 00 02 40 e8 20 7d 05 40 00 00 00 00 00 .....@. }.@.....
...
. 00d0: 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 ......... ......
. 00e0: 82 05 40 00 00 00 00 00 2c f6 4c 56 00 00 00 00 ..@.....,.LV....
. 00f0: 84 b7 05 00 00 00 00 00 29 f6 4c 56 00 00 00 00 ........).LV....
. 0100: 70 e4 0c 00 00 00 00 00 84 9a 4e d7 d7 24 05 00 p.........N..$..
. 0110: b0 00 28 d7 d7 24 05 00 00 00 00 00 00 00 00 00 ..(..$..........
. 0120: ca 05 40 00 00 00 00 00 f0 05 40 00 00 00 00 00 ..@.......@.....
...
. 20e0: b8 06 00 00 00 00 00 00 ........
.
2201856647088808 0x343c0 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002):
25232/25232: 0x40057d period: 443268987 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
......
.... IP 0x40057d
......
... ustack: size 1720, offset 0xd8
.....
By manually checking user stack from 0xe0, we can see problem that:
although the IP
is still at 'callq', that instruction seems already take effect that, it has
already pushes its return address (0x400582) onto stack.
So there must be somewhere the pt_regs or the stack is adjusted in kernel.
Does anyone can give some hint on it?
Thank you.
[1]
http://lkml.kernel.org/r/1447772739-18471-1-git-send-email-jolsa@kernel.org
next reply other threads:[~2015-11-18 7:28 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-18 7:20 Wangnan (F) [this message]
2015-11-18 8:00 ` [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction Wangnan (F)
2015-11-18 8:20 ` Ingo Molnar
2015-11-18 8:42 ` Wangnan (F)
2015-11-18 8:49 ` Wangnan (F)
2015-11-19 6:37 ` Ingo Molnar
2015-11-19 6:45 ` Wangnan (F)
2015-11-19 10:23 ` Ingo Molnar
2015-11-19 10:43 ` Wangnan (F)
2015-11-19 11:28 ` Peter Zijlstra
2015-11-19 11:23 ` Peter Zijlstra
2015-11-27 8:38 ` Ingo Molnar
2015-11-30 9:28 ` Peter Zijlstra
2015-12-01 7:28 ` Ingo Molnar
2015-12-01 8:38 ` Peter Zijlstra
2015-12-01 16:11 ` Ingo Molnar
2015-12-01 17:21 ` Peter Zijlstra
2015-12-02 9:55 ` Ingo Molnar
2015-11-18 8:48 ` Jiri Olsa
2015-11-18 9:02 ` Wangnan (F)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=564C26C4.2040603@huawei.com \
--to=wangnan0@huawei.com \
--cc=acme@kernel.org \
--cc=dsahern@gmail.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=milian.wolff@kdab.com \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=pi3orama@163.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).