Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@kernel.org>
To: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: linux-kernel@vger.kernel.org, David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Hagen Paul Pfeifer <hagen@jauu.net>,
	Mike Galbraith <efault@gmx.de>, Namhyung Kim <namhyung@gmail.com>,
	Paul Mackerras <paulus@samba.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Stephane Eranian <eranian@google.com>,
	arnaldo.melo@gmail.com,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
Date: Fri, 27 Apr 2012 10:06:13 +0200	[thread overview]
Message-ID: <20120427080613.GA7359@gmail.com> (raw)
In-Reply-To: <20120427072153.GB4766@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> [...]
> 
> This would require for us to draw all loops that are 
> interesting: probably all backwards jumping ones, with some 
> nesting limit of 5 or so. I think we really need this loop 
> graph for this UI to have low visual overhead :-/
> 
> Beyond improving visual stability of the output, this would 
> also obviously be (much) more informative as for reasonably 
> sized functions it would show all the current loop contexts - 
> which is very useful when one tries to make sense of a 
> function in assembly.

My (silent) hope would be that if we make assembly output more 
obvious, more structured visually then more people will read 
assembly code.

Note that once we have loop nesting there are countless other 
possibilities as well to augment the output.

( Note that I pile on a couple of visual suggestions below - so 
  the output becomes more and more complex - ideally we want to 
  pick the ones from below that we like and skip the ones that 
  are not so good. )

1) Loop grouping

We could slightly group the assembly instructions themselves 
as well:

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
    0.00 │┌─→ c0:  cmp    %rax,%rbx
    0.00 ││        mov    %rax,%rcx
    0.00 ││      ↓ je     5dd
    0.00 ││   cc:  test   %rcx,%rcx
    0.00 ││      ↓ je     da
    0.00 ││        mov    0x8(%rcx),%esi
    0.00 ││        shr    $0x4,%esi
    0.00 ││        sub    $0x2,%esi
    0.00 ││   da:  mov    %rcx,0x10(%rbx)
    0.00 ││        mov    %rcx,%rax
    0.00 ││        cmpl   $0x0,%fs:0x18
    0.00 ││      ↓ je     ed
    0.00 ││        lock   cmpxchg %rbx,(%rdx)
    0.00 ││        cmp    %rax,%rcx
    0.00 │└──────↑ jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ je     104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ jne    719

See that by using a newline how much clearer the loop stands 
out, visually?

2) Nesting

The classic C method to show nesting is to use curly braces and 
slight indentation:

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   je     5dd
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   je     da
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   je     ed
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ je     104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ jne    719

3) Separating out forward conditional jumps

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    5dd
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   .je    da
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    ed
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   719

So the forward conditional jumps get moved by one character to 
the right and get prefixed with '.', which has the following 
effects:

 - the flow of all the 'other', register modifying and loop 
   instructions can be seen more clearly

 - the 'exception' forward conditional jumps get moved into the 
   visual background, slightly.

 - blocks of instructions with no branches amongst them are more 
   clearly visible.

If '.' is too aggressive or too confusing then some other 
(possibly graphical) character could be used as well?

4) Marking labels

I'd argue we bring in <ab> as label markers:

    0.00 │       ↓ jmp    <cc>
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    <5dd>
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   .je    <da>
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    <ed>
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    <c0>
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    <104>
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   <719>

Note how much easier it becomes to read the body of the loop: 
the <> labels are clearly separated from constants/offsets 
within the other assembly instructions. In the current output 
the two easily 'mix', which is bad.

5)

Another trick we could use to separate labels from constants 
more clearly is capitalization:

    0.00 │       ↓ jmp    <CC>
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ C0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    <5DD>
    0.00 ││   CC:    test   %rcx,%rcx
    0.00 ││      ↓   .je    <DA>
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   DA:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    <ED>
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    <C0>
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    <104>
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   <719>

I'm not entirely convinced we want this one.

6) Making callq's to function symbols more C-alike

Before:

    0.00 ││  70:  mov    %rbx,%rdi
    0.00 ││       callq  _IO_switch_to_main_get_area
    0.00 ││       mov    0x8(%rbx),%rdx
    0.00 ││       mov    0x10(%rbx),%rsi
    0.00 ││       cmp    %rsi,%rdx

After:

    0.00 ││  70:  mov    %rbx,%rdi
    0.00 ││       callq  _IO_switch_to_main_get_area()
    0.00 ││       mov    0x8(%rbx),%rdx
    0.00 ││       mov    0x10(%rbx),%rsi
    0.00 ││       cmp    %rsi,%rdx

The () makes it easier to separate the 'function' purpose of the 
symbol.

This would be especially useful once we start looking into 
debuginfo data and start symbolising stack frame offsets and 
data references:

Before:

    0.00 │   1b3:  mov    -0x38(%rbp),%rdx
    0.00 │         mov    -0x34(%rbp),%rcx
    0.00 │         mov    0x335fb4(%rip),%eax        # 392a5b0b60 <perturb_byte>

After:

  #
  # PARAM:idx          : 0x38
  # PARAM:addr         : 0x34
  # DATA:perturb_byte  : 0x392a5b0b60
  #

  ...

    0.00 │   1b3:  mov    P:idx....(%rbp),%rdx
    0.00 │         mov    P:addr...(%rbp),%rcx
    0.00 │         mov    D:pertu..(%rip),%eax

Where 'P:' stands for parameter, 'D:' for data, and the symbol 
names are length-culled to a fixed width 6 chars, to make the 
registers align better and to not interfere with other textual 
output. There's a summary at the top, to still have all the raw 
binary data available - but 'o' can be pressed as well to see 
the raw values.

This way it still looks like assembly - but is *much* more 
informative.

Or so.

Thanks,

	Ingo

next prev parent reply	other threads:[~2012-04-27  8:06 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 3/4] perf annotate: Mark jump instructions with no offset Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps Arnaldo Carvalho de Melo
2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
2012-04-27  8:06   ` Ingo Molnar [this message]
2012-04-27 15:31     ` Arnaldo Carvalho de Melo
2012-04-27 15:48       ` Peter Zijlstra
2012-04-27 16:07         ` Arnaldo Carvalho de Melo
2012-04-27 16:20           ` Peter Zijlstra
2012-04-27 18:09             ` Arnaldo Carvalho de Melo
2012-04-27 15:16   ` Arnaldo Carvalho de Melo
2012-04-27 16:35 ` Linus Torvalds
2012-04-27 16:46   ` Arnaldo Carvalho de Melo
2012-04-27 17:12     ` Linus Torvalds
2012-04-27 18:03       ` Arnaldo Carvalho de Melo
2012-04-27 18:23         ` Linus Torvalds
2012-04-27 19:17           ` Arnaldo Carvalho de Melo
2012-04-28 13:38             ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120427080613.GA7359@gmail.com \
    --to=mingo@kernel.org \
    --cc=acme@infradead.org \
    --cc=acme@redhat.com \
    --cc=arnaldo.melo@gmail.com \
    --cc=dsahern@gmail.com \
    --cc=efault@gmx.de \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=hagen@jauu.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=namhyung@gmail.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox