public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	mingo@infradead.org, alexander.shishkin@linux.intel.com,
	linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v2 3/3] perf report: Implement visual marker for macro fusion in annotate
Date: Mon, 19 Jun 2017 14:35:29 -0300	[thread overview]
Message-ID: <20170619173529.GM3645@kernel.org> (raw)
In-Reply-To: <1497840958-4759-4-git-send-email-yao.jin@linux.intel.com>

Em Mon, Jun 19, 2017 at 10:55:58AM +0800, Jin Yao escreveu:
> For marking the fused instructions clearly, This patch adds a
> line before the first instruction of pair and joins it with the
> arrow of the jump.
> 
> For example, when je is selected in annotate view, the line
> before cmpl is displayed and joins the arrow of je.
> 
>        │   ┌──cmpl   $0x0,argp_program_version_hook
>  81.93 │   │──je     20
>        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
>        │   │↓ jne    29
>        │   │↓ jmp    43
>  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Ok, thanks for making this per-arch! Some comments:

I think we should have this marked permanently, i.e. not just when we go
to the jump line, something like this (testing here in a t450s
broadwell, function hc_find_func, /usr/lib64/liblzma.so.5.2.2):

It is like this now, when we are not on the jne jump line:

  0.71 │       mov    %r14d,%r10d                                                                                                                            ▒
       │       movzbl (%rdx,%r10,1),%ebp                                                                                                                     ▒
  1.06 │ 70:   mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74:   cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒
       │     ↑ jne    70                                                                                                                                     ▒
  0.85 │       movzbl (%rdx),%r10d                                                                                                                           ▒
  0.99 │       cmp    %r10b,(%rbx)                                                                                                                           ▒

I think it should be augmented to:

  0.71 │       mov    %r14d,%r10d                                                                                                                            ▒
       │       movzbl (%rdx,%r10,1),%ebp                                                                                                                     ▒
  1.06 │ 70: ┌─mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74: └─cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒
       │     ↑ jne    70                                                                                                                                     ▒
  0.85 │       movzbl (%rdx),%r10d                                                                                                                           ▒
  0.99 │       cmp    %r10b,(%rbx)                                                                                                                           ▒

I.e. no arrow, the two instructions that end up as one micro-op being
connected.

And then this:

        │   ┌──cmpl   $0x0,argp_program_version_hook
  81.93 │   │──je     20
        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
        │   │↓ jne    29
        │   │↓ jmp    43
  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Would look better as:

        │   ┌──cmpl   $0x0,argp_program_version_hook
  81.93 │   ├──je     20
        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
        │   │↓ jne    29
        │   │↓ jmp    43
  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Patch below, please test/ack :-)

This was the low hanging fruit, having the:

  1.06 │ 70: ┌─mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74: └─cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒

Marker always there, not just when we have the cursor on top of one of
those lines remains to be coded.

But you state:

 ------------
    Macro fusion merges two instructions to a single micro-op. Intel core
    platform performs this hardware optimization under limited
    circumstances.
 ------------

"Intel core", what about older arches, etc, don't you have to look at:

# cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
# cpuid : GenuineIntel,6,61,4

present in the perf.data header (or in the running system, for things
like 'perf top') to make sure that this is a machine where such "macro
fusion" takes place?

- Arnaldo

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index acba636bd165..9ef7677ae14f 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -756,8 +756,10 @@ void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column,
 		ui_browser__gotorc(browser, end_row, column);
 		SLsmg_draw_hline(2);
 		ui_browser__gotorc(browser, end_row + 1, column - 1);
-		SLsmg_draw_vline(1);
+		SLsmg_write_char(SLSMG_LTEE_CHAR);
 	} else {
+		ui_browser__gotorc(browser, end_row, column - 1);
+		SLsmg_write_char(SLSMG_LTEE_CHAR);
 		ui_browser__gotorc(browser, end_row, column);
 		SLsmg_draw_hline(2);
 	}

  reply	other threads:[~2017-06-19 17:35 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-19  2:55 [PATCH v2 0/3] perf report: Implement visual marker for macro fusion in annotate Jin Yao
2017-06-19  2:55 ` [PATCH v2 1/3] perf util: Return arch from symbol__disassemble and save it in browser Jin Yao
2017-06-20  9:02   ` [tip:perf/core] perf annotate: Return arch from symbol__disassemble() " tip-bot for Jin Yao
2017-06-19  2:55 ` [PATCH v2 2/3] perf util: Check for fused instruction Jin Yao
2017-06-19  2:55 ` [PATCH v2 3/3] perf report: Implement visual marker for macro fusion in annotate Jin Yao
2017-06-19 17:35   ` Arnaldo Carvalho de Melo [this message]
2017-06-19 19:13     ` Arnaldo Carvalho de Melo
2017-06-20  1:25     ` Jin, Yao
2017-06-20  1:37       ` Arnaldo Carvalho de Melo
2017-06-20  1:54         ` Jin, Yao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170619173529.GM3645@kernel.org \
    --to=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@infradead.org \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    --cc=yao.jin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox