All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/6] perf thread-stack: Improve thread_stack__no_call_return()
Date: Fri, 22 Feb 2019 11:39:00 -0300	[thread overview]
Message-ID: <20190222143900.GA13225@kernel.org> (raw)
In-Reply-To: <ef029200-4eb7-3fb5-fcb2-64f5b129a6bf@intel.com>

Em Fri, Feb 22, 2019 at 11:48:51AM +0200, Adrian Hunter escreveu:
> On 9/01/19 11:18 AM, Adrian Hunter wrote:
> > Improve thread_stack__no_call_return() to better handle 'returns' that do
> > not match the stack i.e. 'no call'. See code comments for details.
> 
> This patch and patch 6 of the series do not seem to have been applied.
> Can they be applied?

I think there was some python3 oddity and I ended up letting this fell
thru the cracks, now I applied and tested it, see my results below, ints
in my perf/core branch now, will go together with your new batch in the
next pull req to Ingo,

Thanks,

- Arnaldo

commit 4ad79244f4899815a904f6e6adb71d24216b9b1d
Author: Adrian Hunter <adrian.hunter@intel.com>
Date:   Wed Jan 9 11:18:34 2019 +0200

    perf thread-stack: Improve thread_stack__no_call_return()
    
    Improve thread_stack__no_call_return() to better handle 'returns' that
    do not match the stack i.e. 'no call'. See code comments for details.
    The example below shows how retpolines are affected:
    
    Example:
    
      $ cat simple-retpoline.c
      __attribute__((noinline)) int bar(void)
      {
              return -1;
      }
    
      int foo(void)
      {
              return bar() + 1;
      }
    
      __attribute__((indirect_branch("thunk"))) int main()
      {
              int (*volatile fn)(void) = foo;
    
              fn();
              return fn();
      }
      $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
      $ objdump -d simple-retpoline
      <SNIP>
      0000000000001040 <main>:
          1040:       48 83 ec 18             sub    $0x18,%rsp
          1044:       48 8d 05 25 01 00 00    lea    0x125(%rip),%rax        # 1170 <foo>
          104b:       48 89 44 24 08          mov    %rax,0x8(%rsp)
          1050:       48 8b 44 24 08          mov    0x8(%rsp),%rax
          1055:       e8 1f 01 00 00          callq  1179 <__x86_indirect_thunk_rax>
          105a:       48 8b 44 24 08          mov    0x8(%rsp),%rax
          105f:       48 83 c4 18             add    $0x18,%rsp
          1063:       e9 11 01 00 00          jmpq   1179 <__x86_indirect_thunk_rax>
      <SNIP>
      0000000000001160 <bar>:
          1160:       b8 ff ff ff ff          mov    $0xffffffff,%eax
          1165:       c3                      retq
      <SNIP>
      0000000000001170 <foo>:
          1170:       e8 eb ff ff ff          callq  1160 <bar>
          1175:       83 c0 01                add    $0x1,%eax
          1178:       c3                      retq
      0000000000001179 <__x86_indirect_thunk_rax>:
          1179:       e8 07 00 00 00          callq  1185 <__x86_indirect_thunk_rax+0xc>
          117e:       f3 90                   pause
          1180:       0f ae e8                lfence
          1183:       eb f9                   jmp    117e <__x86_indirect_thunk_rax+0x5>
          1185:       48 89 04 24             mov    %rax,(%rsp)
          1189:       c3                      retq
      <SNIP>
      $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
      $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
      2019-01-08 14:03:37.851655 Creating database...
      2019-01-08 14:03:37.863256 Writing records...
      2019-01-08 14:03:38.069750 Adding indexes
      2019-01-08 14:03:38.078799 Done
      $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db
    
    Before:
    
        main
            -> __x86_indirect_thunk_rax
                -> __x86_indirect_thunk_rax
                    -> __x86_indirect_thunk_rax
                        -> bar
    
    After:
    
        main
            -> __x86_indirect_thunk_rax
                -> __x86_indirect_thunk_rax
                    -> foo
                        -> bar
    
    Committer testing:
    
    Chose "Reports", Then "Context-Sensitive Call Graph" and then go on
    expanding:
    
    Before:
    
    simple-retpolin
       PID:PID
          _start
             _start
                __libc_start_main
                   main
                       __x86_indirect_thunk_rax
                          __x86_indirect_thunk_rax
                          bar
    
    After:
    
    Remove the "simple.retpoline.db" file, run again the 'perf script' line
    to regenerate the .db file and run the exported-sql-viewer.py again to
    get the same all the way to 'main', then, from there, including 'main':
    
                   main
                       __x86_indirect_thunk_rax
                           __x86_indirect_thunk_rax
                               foo
                                   bar
    
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Link: http://lkml.kernel.org/r/20190109091835.5570-6-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index f52c0f90915d..632c07a125ab 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -638,14 +638,57 @@ static int thread_stack__no_call_return(struct thread *thread,
 	else
 		parent = root;
 
-	/* This 'return' had no 'call', so push and pop top of stack */
-	cp = call_path__findnew(cpr, parent, fsym, ip, ks);
+	if (parent->sym == from_al->sym) {
+		/*
+		 * At the bottom of the stack, assume the missing 'call' was
+		 * before the trace started. So, pop the current symbol and push
+		 * the 'to' symbol.
+		 */
+		if (ts->cnt == 1) {
+			err = thread_stack__call_return(thread, ts, --ts->cnt,
+							tm, ref, false);
+			if (err)
+				return err;
+		}
+
+		if (!ts->cnt) {
+			cp = call_path__findnew(cpr, root, tsym, addr, ks);
+
+			return thread_stack__push_cp(ts, addr, tm, ref, cp,
+						     true, false);
+		}
+
+		/*
+		 * Otherwise assume the 'return' is being used as a jump (e.g.
+		 * retpoline) and just push the 'to' symbol.
+		 */
+		cp = call_path__findnew(cpr, parent, tsym, addr, ks);
+
+		err = thread_stack__push_cp(ts, 0, tm, ref, cp, true, false);
+		if (!err)
+			ts->stack[ts->cnt - 1].non_call = true;
+
+		return err;
+	}
+
+	/*
+	 * Assume 'parent' has not yet returned, so push 'to', and then push and
+	 * pop 'from'.
+	 */
+
+	cp = call_path__findnew(cpr, parent, tsym, addr, ks);
 
 	err = thread_stack__push_cp(ts, addr, tm, ref, cp, true, false);
 	if (err)
 		return err;
 
-	return thread_stack__pop_cp(thread, ts, addr, tm, ref, tsym);
+	cp = call_path__findnew(cpr, cp, fsym, ip, ks);
+
+	err = thread_stack__push_cp(ts, ip, tm, ref, cp, true, false);
+	if (err)
+		return err;
+
+	return thread_stack__call_return(thread, ts, --ts->cnt, tm, ref, false);
 }
 
 static int thread_stack__trace_begin(struct thread *thread,

  reply	other threads:[~2019-02-22 14:40 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-09  9:18 [PATCH 0/6] perf thread-stack: x86 retpolines Adrian Hunter
2019-01-09  9:18 ` [PATCH 1/6] perf tools: Fix split_kallsyms_for_kcore for trampoline symbols Adrian Hunter
2019-02-09 12:57   ` [tip:perf/core] perf tools: Fix split_kallsyms_for_kcore() " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 2/6] perf thread-stack: Tidy thread_stack__push_cp() usage Adrian Hunter
2019-02-09 12:58   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 3/6] perf thread-stack: Tidy thread_stack__no_call_return() by adding more local variables Adrian Hunter
2019-02-09 12:58   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 4/6] perf thread-stack: Represent jmps to the start of a different symbol Adrian Hunter
2019-02-06 12:39   ` Arnaldo Carvalho de Melo
2019-02-06 13:25     ` Adrian Hunter
2019-02-09 12:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 5/6] perf thread-stack: Improve thread_stack__no_call_return() Adrian Hunter
2019-02-22  9:48   ` Adrian Hunter
2019-02-22 14:39     ` Arnaldo Carvalho de Melo [this message]
2019-02-28  7:49   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-09  9:18 ` [PATCH 6/6] perf thread-stack: Hide x86 retpolines Adrian Hunter
2019-01-09 15:38   ` Jiri Olsa
2019-01-10  7:52     ` Adrian Hunter
2019-02-22 19:42   ` Arnaldo Carvalho de Melo
2019-02-22 20:53     ` Arnaldo Carvalho de Melo
2019-02-28  7:49   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-01-10  9:55 ` [PATCH 0/6] perf thread-stack: " Jiri Olsa
2019-02-06  9:10   ` Adrian Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190222143900.GA13225@kernel.org \
    --to=arnaldo.melo@gmail.com \
    --cc=adrian.hunter@intel.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.