Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Arnaldo Carvalho de Melo <acme@infradead.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Hagen Paul Pfeifer <hagen@jauu.net>,
	Mike Galbraith <efault@gmx.de>, Namhyung Kim <namhyung@gmail.com>,
	Paul Mackerras <paulus@samba.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Stephane Eranian <eranian@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
Date: Fri, 27 Apr 2012 12:31:19 -0300	[thread overview]
Message-ID: <20120427153119.GC27997@infradead.org> (raw)
In-Reply-To: <20120427080613.GA7359@gmail.com>

Em Fri, Apr 27, 2012 at 10:06:13AM +0200, Ingo Molnar escreveu:
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > [...]
> > 
> > This would require for us to draw all loops that are 
> > interesting: probably all backwards jumping ones, with some 
> > nesting limit of 5 or so. I think we really need this loop 
> > graph for this UI to have low visual overhead :-/
> > 
> > Beyond improving visual stability of the output, this would 
> > also obviously be (much) more informative as for reasonably 
> > sized functions it would show all the current loop contexts - 
> > which is very useful when one tries to make sense of a 
> > function in assembly.
> 
> My (silent) hope would be that if we make assembly output more 
> obvious, more structured visually then more people will read 
> assembly code.

Well, the idea is to improve the experience of people who _already_ look
at assembly output so that they can become slightly (or a lot, hey, one
can dream) more productive, as we get older we need better eye lenses
8-)

Doing that we would eventually have something that would make more
people join this existing assembly lovers club.
 
> Note that once we have loop nesting there are countless other 
> possibilities as well to augment the output.

Yeah, ponies!
 
> ( Note that I pile on a couple of visual suggestions below - so 
>   the output becomes more and more complex - ideally we want to 
>   pick the ones from below that we like and skip the ones that 
>   are not so good. )
> 
> 1) Loop grouping
> 
> We could slightly group the assembly instructions themselves 
> as well:
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>     0.00 │┌─→ c0:  cmp    %rax,%rbx
>     0.00 ││        mov    %rax,%rcx
>     0.00 ││      ↓ je     5dd
>     0.00 ││   cc:  test   %rcx,%rcx
>     0.00 ││      ↓ je     da
>     0.00 ││        mov    0x8(%rcx),%esi
>     0.00 ││        shr    $0x4,%esi
>     0.00 ││        sub    $0x2,%esi
>     0.00 ││   da:  mov    %rcx,0x10(%rbx)
>     0.00 ││        mov    %rcx,%rax
>     0.00 ││        cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓ je     ed
>     0.00 ││        lock   cmpxchg %rbx,(%rdx)
>     0.00 ││        cmp    %rax,%rcx
>     0.00 │└──────↑ jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ je     104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ jne    719
> 
> See that by using a newline how much clearer the loop stands 
> out, visually?

Linus also made that suggestion, agreed.
 
> 2) Nesting
> 
> The classic C method to show nesting is to use curly braces and 
> slight indentation:
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   je     5dd
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   je     da
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   je     ed
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ je     104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ jne    719

I guess we could do away with the {, the spaces before/after the loop
+ the indentation (that Arjan also suggested on G+) should be enough,
right?
 
> 3) Separating out forward conditional jumps
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    5dd
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    da
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    ed
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   719
> 
> So the forward conditional jumps get moved by one character to 
> the right and get prefixed with '.', which has the following 
> effects:
> 
>  - the flow of all the 'other', register modifying and loop 
>    instructions can be seen more clearly
> 
>  - the 'exception' forward conditional jumps get moved into the 
>    visual background, slightly.
> 
>  - blocks of instructions with no branches amongst them are more 
>    clearly visible.
> 
> If '.' is too aggressive or too confusing then some other 
> (possibly graphical) character could be used as well?

Yeah, there are some unused graphical chars. But perhaps we should just
use ↓ again?

     0.00 │┌─→ c0:    cmp    %rax,%rbx
     0.00 ││          mov    %rax,%rcx
     0.00 ││      ↓   ↓je    5dd
     0.00 ││   cc:    test   %rcx,%rcx
     0.00 ││      ↓   ↓je    da
     0.00 ││          mov    0x8(%rcx),%esi
     0.00 ││          shr    $0x4,%esi
     0.00 ││          sub    $0x2,%esi
     0.00 ││   da:    mov    %rcx,0x10(%rbx)
     0.00 ││          mov    %rcx,%rax
     0.00 ││          cmpl   $0x0,%fs:0x18
     0.00 ││      ↓   ↓je    ed
     0.00 ││          lock   cmpxchg %rbx,(%rdx)
     0.00 ││          cmp    %rax,%rcx
          ││
     0.00 │└──────↑ } jne    c0

Does it still stands out? I think so, we expect a letter there,
something very different is there, matching the other down arrowsome
columns to the left.

> 4) Marking labels
> 
> I'd argue we bring in <ab> as label markers:
> 
>     0.00 │       ↓ jmp    <cc>
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    <5dd>
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    <da>
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    <ed>
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    <c0>
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    <104>
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   <719>
> 
> Note how much easier it becomes to read the body of the loop: 
> the <> labels are clearly separated from constants/offsets 
> within the other assembly instructions. In the current output 
> the two easily 'mix', which is bad.

I think the <> is enough, i.e. no need for capitalization, as a bonus
point, it is already what is used in raw assembly mode, i.e. a subset of
users are already used to that visual cue.
 
> 5)
> 
> Another trick we could use to separate labels from constants 
> more clearly is capitalization:
> 
>     0.00 │       ↓ jmp    <CC>
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ C0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    <5DD>
>     0.00 ││   CC:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    <DA>
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   DA:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    <ED>
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    <C0>
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    <104>
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   <719>
> 
> I'm not entirely convinced we want this one.
> 
> 6) Making callq's to function symbols more C-alike
> 
> Before:
> 
>     0.00 ││  70:  mov    %rbx,%rdi
>     0.00 ││       callq  _IO_switch_to_main_get_area
>     0.00 ││       mov    0x8(%rbx),%rdx
>     0.00 ││       mov    0x10(%rbx),%rsi
>     0.00 ││       cmp    %rsi,%rdx
> 
> After:
> 
>     0.00 ││  70:  mov    %rbx,%rdi
>     0.00 ││       callq  _IO_switch_to_main_get_area()
>     0.00 ││       mov    0x8(%rbx),%rdx
>     0.00 ││       mov    0x10(%rbx),%rsi
>     0.00 ││       cmp    %rsi,%rdx
> 
> The () makes it easier to separate the 'function' purpose of the 
> symbol.

Yes, I agreed already with that one, just was waiting a bit more to see
if somebody else would disagree (hint, hint)
 
> This would be especially useful once we start looking into 
> debuginfo data and start symbolising stack frame offsets and 
> data references:
> 
> Before:
> 
>     0.00 │   1b3:  mov    -0x38(%rbp),%rdx
>     0.00 │         mov    -0x34(%rbp),%rcx
>     0.00 │         mov    0x335fb4(%rip),%eax        # 392a5b0b60 <perturb_byte>
> 
> After:
> 
>   #
>   # PARAM:idx          : 0x38
>   # PARAM:addr         : 0x34
>   # DATA:perturb_byte  : 0x392a5b0b60
>   #
> 
>   ...
> 
>     0.00 │   1b3:  mov    P:idx....(%rbp),%rdx
>     0.00 │         mov    P:addr...(%rbp),%rcx
>     0.00 │         mov    D:pertu..(%rip),%eax
> 
> Where 'P:' stands for parameter, 'D:' for data, and the symbol 
> names are length-culled to a fixed width 6 chars, to make the 
> registers align better and to not interfere with other textual 
> output. There's a summary at the top, to still have all the raw 
> binary data available - but 'o' can be pressed as well to see 
> the raw values.
> 
> This way it still looks like assembly - but is *much* more 
> informative.

Yeah, there were some ideas related to figuring out what values are in
what registers at each line, easy for things like constants, requires a
bit more thought for other cases.
 
- Arnaldo

next prev parent reply	other threads:[~2012-04-27 15:31 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 3/4] perf annotate: Mark jump instructions with no offset Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps Arnaldo Carvalho de Melo
2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
2012-04-27  8:06   ` Ingo Molnar
2012-04-27 15:31     ` Arnaldo Carvalho de Melo [this message]
2012-04-27 15:48       ` Peter Zijlstra
2012-04-27 16:07         ` Arnaldo Carvalho de Melo
2012-04-27 16:20           ` Peter Zijlstra
2012-04-27 18:09             ` Arnaldo Carvalho de Melo
2012-04-27 15:16   ` Arnaldo Carvalho de Melo
2012-04-27 16:35 ` Linus Torvalds
2012-04-27 16:46   ` Arnaldo Carvalho de Melo
2012-04-27 17:12     ` Linus Torvalds
2012-04-27 18:03       ` Arnaldo Carvalho de Melo
2012-04-27 18:23         ` Linus Torvalds
2012-04-27 19:17           ` Arnaldo Carvalho de Melo
2012-04-28 13:38             ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120427153119.GC27997@infradead.org \
    --to=acme@infradead.org \
    --cc=dsahern@gmail.com \
    --cc=efault@gmx.de \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=hagen@jauu.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@gmail.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox