[BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Wangnan (F)" <wangnan0@huawei.com>
To: Jiri Olsa <jolsa@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	David Ahern <dsahern@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Milian Wolff <milian.wolff@kdab.com>
Cc: <linux-kernel@vger.kernel.org>, pi3orama <pi3orama@163.com>,
	"lizefan 00213767" <lizefan@huawei.com>
Subject: [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction
Date: Wed, 18 Nov 2015 15:20:36 +0800	[thread overview]
Message-ID: <564C26C4.2040603@huawei.com> (raw)

Hi all,

When analysising Jiri's patchset [1] I found a dwarf unwind problem.
On x86 platform, when sample is at a 'callq' instruction, dwarf based
stack unwind always fail.

I compile a small C source file with debug information, turn off
frame pointer and disable optimization:

$ gcc -g -O0 -fomit-frame-pointer ./test_dwarf_unwind.c -o 
./test_dwarf_unwind

Here is part of asm code. Please have a look at 40057d. Also, have a
look at 400534 and 4005bc, from these two instructions we know this
"funcc" need 0x38 bytes on stack, so we know where is the return address.

$ objdump -S ./test_dwarf_unwind
...
0000000000400534 <funcc>:
#include <sys/time.h>

static volatile int x = 0;

int funcc(void)
{
   400534:       48 83 ec 38             sub    $0x38,%rsp
     struct timeval tv1, tv2;
     unsigned long us1, us2;

     gettimeofday(&tv1, NULL);
   400538:       48 8d 44 24 10          lea    0x10(%rsp),%rax
   ...
   400572:       48 89 e0                mov    %rsp,%rax
   400575:       be 00 00 00 00          mov    $0x0,%esi
   40057a:       48 89 c7                mov    %rax,%rdi
   40057d:       e8 7e fe ff ff          callq  400400 
<gettimeofday@plt>    <---!!!
         us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
   400582:       48 8b 04 24             mov    (%rsp),%rax
   400586:       48 69 d0 40 42 0f 00    imul   $0xf4240,%rax,%rdx
   ...
     return x;
   4005b6:       8b 05 74 04 20 00       mov 0x200474(%rip),%eax        
# 600a30 <x>
}
   4005bc:       48 83 c4 38             add    $0x38,%rsp
   4005c0:       c3                      retq

Then use perf to trace it:

# ./pref record  -g -F 9 --call-graph dwarf ./test_dwarf_unwind
# ./perf report  --no-children --stdio
      ...
      1.23%  test_dwarf_unwi  test_dwarf_unwind  [.] funcc
             |
             ---funcc
                |
                |--0.62%-- funcb
                |          funca
                |          main
                |          __libc_start_main
                |          _start
                |
                 --0.62%-- 0   <-- I applied Jiri's patch in [1] so see 
this 0,
                                   which indicates a broken call chain.

 From 'perf script' result I found the two samples inside funcc:

# ./perf script
...
test_dwarf_unwi 25232 2201855.424651:  443365670 cycles:pp:
                   4005a4 funcc (/home/w00229757/test_dwarf_unwind)
                   4005ca funcb (/home/w00229757/test_dwarf_unwind)
                   4005d8 funca (/home/w00229757/test_dwarf_unwind)
                   4005e6 main (/home/w00229757/test_dwarf_unwind)
             7facc5c2ebd5 __libc_start_main 
(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
                   400459 _start (/home/w00229757/test_dwarf_unwind)
...
test_dwarf_unwi 25232 2201856.647088:  443268987 cycles:pp:
                   40057d funcc (/home/w00229757/test_dwarf_unwind)
                        0 [unknown] ([unknown])

With perf report -D I can manually check their stack information manually.


This is stack of sample at 4005a4:

# ./perf report -D
...
0x1d9c8 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
.  0000:  09 00 00 00 02 40 e8 20 a4 05 40 00 00 00 00 00  .....@. ..@.....
...
.  00d0:  00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00  ......... ......
.  00e0:  2b f6 4c 56 00 00 00 00 46 52 02 00 00 00 00 00 +.LV....FR......
.  00f0:  29 f6 4c 56 00 00 00 00 70 e4 0c 00 00 00 00 00 ).LV....p.......
.  0100:  06 f3 3b d7 d7 24 05 00 b0 00 28 d7 d7 24 05 00 ..;..$....(..$..
.  0110:  00 00 00 00 00 00 00 00 ca 05 40 00 00 00 00 00
...
.  20e0:  b0 06 00 00 00 00 00 00                          ........
.
2201855424651081 0x1d9c8 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 
25232/25232: 0x4005a4 period: 443365670 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.......
.... IP    0x4005a4
.......
... ustack: size 1712, offset 0xd8
....

 From the above information we can check its stack from 0xe0 (stack size
reside at 0xd8 as a u64 so stack data start at 0xe0). Based on information
we get from the above asm code, we know the return address should at the
0x38 of the stack, which is 0x4005ca.

This is stack of sample at 40057d:

0x343c0 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
.  0000:  09 00 00 00 02 40 e8 20 7d 05 40 00 00 00 00 00  .....@. }.@.....
...
.  00d0:  00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00  ......... ......
.  00e0:  82 05 40 00 00 00 00 00 2c f6 4c 56 00 00 00 00 ..@.....,.LV....
.  00f0:  84 b7 05 00 00 00 00 00 29 f6 4c 56 00 00 00 00 ........).LV....
.  0100:  70 e4 0c 00 00 00 00 00 84 9a 4e d7 d7 24 05 00 p.........N..$..
.  0110:  b0 00 28 d7 d7 24 05 00 00 00 00 00 00 00 00 00 ..(..$..........
.  0120:  ca 05 40 00 00 00 00 00 f0 05 40 00 00 00 00 00 ..@.......@.....
...
.  20e0:  b8 06 00 00 00 00 00 00                          ........
.
2201856647088808 0x343c0 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 
25232/25232: 0x40057d period: 443268987 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
......
.... IP    0x40057d
......
... ustack: size 1720, offset 0xd8
.....

By manually checking user stack from 0xe0, we can see problem that: 
although the IP
is still at 'callq', that instruction seems already take effect that, it has
already pushes its return address (0x400582) onto stack.

So there must be somewhere the pt_regs or the stack is adjusted in kernel.
Does anyone can give some hint on it?

Thank you.

[1] 
http://lkml.kernel.org/r/1447772739-18471-1-git-send-email-jolsa@kernel.org

next             reply	other threads:[~2015-11-18  7:28 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-18  7:20 Wangnan (F) [this message]
2015-11-18  8:00 ` [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction Wangnan (F)
2015-11-18  8:20   ` Ingo Molnar
2015-11-18  8:42     ` Wangnan (F)
2015-11-18  8:49       ` Wangnan (F)
2015-11-19  6:37         ` Ingo Molnar
2015-11-19  6:45           ` Wangnan (F)
2015-11-19 10:23             ` Ingo Molnar
2015-11-19 10:43               ` Wangnan (F)
2015-11-19 11:28                 ` Peter Zijlstra
2015-11-19 11:23               ` Peter Zijlstra
2015-11-27  8:38                 ` Ingo Molnar
2015-11-30  9:28                   ` Peter Zijlstra
2015-12-01  7:28                     ` Ingo Molnar
2015-12-01  8:38                       ` Peter Zijlstra
2015-12-01 16:11                         ` Ingo Molnar
2015-12-01 17:21                           ` Peter Zijlstra
2015-12-02  9:55                             ` Ingo Molnar
2015-11-18  8:48 ` Jiri Olsa
2015-11-18  9:02   ` Wangnan (F)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=564C26C4.2040603@huawei.com \
    --to=wangnan0@huawei.com \
    --cc=acme@kernel.org \
    --cc=dsahern@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=milian.wolff@kdab.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pi3orama@163.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).