All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	mingo@infradead.org, alexander.shishkin@linux.intel.com,
	linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v2 3/3] perf report: Implement visual marker for macro fusion in annotate
Date: Mon, 19 Jun 2017 14:35:29 -0300	[thread overview]
Message-ID: <20170619173529.GM3645@kernel.org> (raw)
In-Reply-To: <1497840958-4759-4-git-send-email-yao.jin@linux.intel.com>

Em Mon, Jun 19, 2017 at 10:55:58AM +0800, Jin Yao escreveu:
> For marking the fused instructions clearly, This patch adds a
> line before the first instruction of pair and joins it with the
> arrow of the jump.
> 
> For example, when je is selected in annotate view, the line
> before cmpl is displayed and joins the arrow of je.
> 
>        │   ┌──cmpl   $0x0,argp_program_version_hook
>  81.93 │   │──je     20
>        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
>        │   │↓ jne    29
>        │   │↓ jmp    43
>  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Ok, thanks for making this per-arch! Some comments:

I think we should have this marked permanently, i.e. not just when we go
to the jump line, something like this (testing here in a t450s
broadwell, function hc_find_func, /usr/lib64/liblzma.so.5.2.2):

It is like this now, when we are not on the jne jump line:

  0.71 │       mov    %r14d,%r10d                                                                                                                            ▒
       │       movzbl (%rdx,%r10,1),%ebp                                                                                                                     ▒
  1.06 │ 70:   mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74:   cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒
       │     ↑ jne    70                                                                                                                                     ▒
  0.85 │       movzbl (%rdx),%r10d                                                                                                                           ▒
  0.99 │       cmp    %r10b,(%rbx)                                                                                                                           ▒

I think it should be augmented to:

  0.71 │       mov    %r14d,%r10d                                                                                                                            ▒
       │       movzbl (%rdx,%r10,1),%ebp                                                                                                                     ▒
  1.06 │ 70: ┌─mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74: └─cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒
       │     ↑ jne    70                                                                                                                                     ▒
  0.85 │       movzbl (%rdx),%r10d                                                                                                                           ▒
  0.99 │       cmp    %r10b,(%rbx)                                                                                                                           ▒

I.e. no arrow, the two instructions that end up as one micro-op being
connected.

And then this:

        │   ┌──cmpl   $0x0,argp_program_version_hook
  81.93 │   │──je     20
        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
        │   │↓ jne    29
        │   │↓ jmp    43
  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Would look better as:

        │   ┌──cmpl   $0x0,argp_program_version_hook
  81.93 │   ├──je     20
        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
        │   │↓ jne    29
        │   │↓ jmp    43
  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Patch below, please test/ack :-)

This was the low hanging fruit, having the:

  1.06 │ 70: ┌─mov    (%r9,%rcx,4),%ecx                                                                                                                      ◆
 77.98 │ 74: └─cmp    %bpl,(%rbx,%r10,1)                                                                                                                     ▒

Marker always there, not just when we have the cursor on top of one of
those lines remains to be coded.

But you state:

 ------------
    Macro fusion merges two instructions to a single micro-op. Intel core
    platform performs this hardware optimization under limited
    circumstances.
 ------------

"Intel core", what about older arches, etc, don't you have to look at:

# cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
# cpuid : GenuineIntel,6,61,4

present in the perf.data header (or in the running system, for things
like 'perf top') to make sure that this is a machine where such "macro
fusion" takes place?

- Arnaldo

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index acba636bd165..9ef7677ae14f 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -756,8 +756,10 @@ void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column,
 		ui_browser__gotorc(browser, end_row, column);
 		SLsmg_draw_hline(2);
 		ui_browser__gotorc(browser, end_row + 1, column - 1);
-		SLsmg_draw_vline(1);
+		SLsmg_write_char(SLSMG_LTEE_CHAR);
 	} else {
+		ui_browser__gotorc(browser, end_row, column - 1);
+		SLsmg_write_char(SLSMG_LTEE_CHAR);
 		ui_browser__gotorc(browser, end_row, column);
 		SLsmg_draw_hline(2);
 	}

  reply	other threads:[~2017-06-19 17:35 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-19  2:55 [PATCH v2 0/3] perf report: Implement visual marker for macro fusion in annotate Jin Yao
2017-06-19  2:55 ` [PATCH v2 1/3] perf util: Return arch from symbol__disassemble and save it in browser Jin Yao
2017-06-20  9:02   ` [tip:perf/core] perf annotate: Return arch from symbol__disassemble() " tip-bot for Jin Yao
2017-06-19  2:55 ` [PATCH v2 2/3] perf util: Check for fused instruction Jin Yao
2017-06-19  2:55 ` [PATCH v2 3/3] perf report: Implement visual marker for macro fusion in annotate Jin Yao
2017-06-19 17:35   ` Arnaldo Carvalho de Melo [this message]
2017-06-19 19:13     ` Arnaldo Carvalho de Melo
2017-06-20  1:25     ` Jin, Yao
2017-06-20  1:37       ` Arnaldo Carvalho de Melo
2017-06-20  1:54         ` Jin, Yao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170619173529.GM3645@kernel.org \
    --to=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@infradead.org \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    --cc=yao.jin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.