All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Du, Changbin" <changbin.du@intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: "Du, Changbin" <changbin.du@intel.com>,
	peterz@infradead.org, mingo@redhat.com,
	alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Subject: Re: Does perf-annotate work correctly?
Date: Wed, 13 Sep 2017 09:54:25 +0800	[thread overview]
Message-ID: <20170913015425.GA607@intel.com> (raw)
In-Reply-To: <20170912143350.GA3452@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 9333 bytes --]

On Tue, Sep 12, 2017 at 11:33:50AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 12, 2017 at 06:10:35PM +0800, Du, Changbin escreveu:
> > When a annotate a symbol, I find the annotated C source code doesn't match assembly code.
> > So I cannot determine which line of C code has much overhead withou gdb's help.
> > 
> > Here is a example result of function apic_has_interrupt_for_ppr() in kvm module.
> 
> Ok, was this using the module .ko file or /proc/kcore? You forgot to
> cut'n'paste the first line on the screen.
> 
It is arch/x86/kvm/kvm.ko.

> Also, how did you use gdb?
> 
$ gdb arch/x86/kvm/kvm.ko
$ (gdb) disassemble /s apic_has_interrupt_for_ppr

> perf uses objdump to do the disassembly, and depending on how it is used
> (live system, post processing on a different machine, permissions) it
> may use different files to do the disassembly.
> 
But objdump has same out as gdb. (Always on same machine, and no binary changed.)

$ objdump -d -S arch/x86/kvm/kvm.o
...
static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
{
   3b4a0:	e8 00 00 00 00       	callq  3b4a5 <apic_has_interrupt_for_ppr+0x5>
   3b4a5:	55                   	push   %rbp
   3b4a6:	48 89 e5             	mov    %rsp,%rbp
   3b4a9:	48 83 ec 08          	sub    $0x8,%rsp
	int highest_irr;
	if (kvm_x86_ops->sync_pir_to_irr && apic->vcpu->arch.apicv_active)
   3b4ad:	48 8b 05 00 00 00 00 	mov    0x0(%rip),%rax        # 3b4b4 <apic_has_interrupt_for_ppr+0x14>
   3b4b4:	48 8b 80 38 02 00 00 	mov    0x238(%rax),%rax
   3b4bb:	48 85 c0             	test   %rax,%rax
   3b4be:	74 10                	je     3b4d0 <apic_has_interrupt_for_ppr+0x30>
   3b4c0:	48 8b 97 88 00 00 00 	mov    0x88(%rdi),%rdx
   3b4c7:	80 ba 28 03 00 00 00 	cmpb   $0x0,0x328(%rdx)
   3b4ce:	75 3a                	jne    3b50a <apic_has_interrupt_for_ppr+0x6a>

	/*
	 * Note that irr_pending is just a hint. It will be always
	 * true with virtual interrupt delivery enabled.
	 */
	if (!apic->irr_pending)
   3b4d0:	80 bf 91 00 00 00 00 	cmpb   $0x0,0x91(%rdi)
   3b4d7:	74 2a                	je     3b503 <apic_has_interrupt_for_ppr+0x63>
   3b4d9:	48 8b 8f a0 00 00 00 	mov    0xa0(%rdi),%rcx
static int find_highest_vector(void *bitmap)
{
	int vec;
	u32 *reg;

	for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
   3b4e0:	b8 e0 00 00 00       	mov    $0xe0,%eax
	     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
		reg = bitmap + REG_POS(vec);
		if (*reg)
   3b4e5:	89 c2                	mov    %eax,%edx
   3b4e7:	c1 fa 05             	sar    $0x5,%edx
   3b4ea:	c1 e2 04             	shl    $0x4,%edx
   3b4ed:	48 63 d2             	movslq %edx,%rdx
   3b4f0:	8b 94 11 00 02 00 00 	mov    0x200(%rcx,%rdx,1),%edx
   3b4f7:	85 d2                	test   %edx,%edx
   3b4f9:	75 2d                	jne    3b528 <apic_has_interrupt_for_ppr+0x88>



> Please provide more detailed information on the exact command line
> arguments and usage scenario.
>  
> - Arnaldo

> 
> >        │580         __clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention);    ▒
> >        │581 }                                                                            ▒
> >        │                                                                                 ▒
> >        │583 static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)       ▒
> >        │584 {                                                                            ▒
> >   0.88 │30:   cmpb   $0x0,0x91(%rdi)                                                     ▒
> >   2.54 │    ↓ je     63                                                                  ▒
> >   0.20 │      mov    0xa0(%rdi),%rcx                                                     ▒
> >        │581         int highest_irr;                                                     ▒
> >        │582         if (kvm_x86_ops->sync_pir_to_irr && apic->vcpu->arch.apicv_active)   ▒
> >   4.91 │      mov    $0xe0,%eax                       x                                   ▒
> >   1.46 │45:   mov    %eax,%edx                        x                                   ▒
> >   0.02 │      sar    $0x5,%edx                        x                                   ▒
> >   3.57 │      shl    $0x4,%edx                        x                                   ▒
> >   3.34 │      movslq %edx,%rdx                        x                                   ▒
> >   1.25 │      mov    0x200(%rcx,%rdx,1),%edx          x                                   ▒
> >  42.44 │      test   %edx,%edx                        x                                   ▒
> >   0.01 │   ┌──jne    88                               x                                   ▒
> >   3.48 │   │  sub    $0x20,%eax                       x                                   ▒
> >   2.24 │   │  cmp    $0xffffffe0,%eax                 x                                   ▒
> >        │586│apic_find_highest_irr():                                                     ▒
> >        │   │                                                                             ▒
> >        │407│        /*                                                                   ▒
> >        │408│         * Note that irr_pending is just a hint. It will be always           ▒
> >        │409│         * true with virtual interrupt delivery enabled.                     ▒
> >        │410│         */                                                                  ▒
> >        │411│        if (!apic->irr_pending)                                              ▒
> >        │   │↑ jne    45                                                                  ▒
> >   0.62 │63:│  mov    $0xffffffff,%eax                                                    ◆
> >   0.83 │   │  leaveq                                                                     ▒
> >  13.52 │   │← retq                                                                       ▒
> >        │6a:│  mov    %esi,-0x4(%rbp)                                                     ▒
> >        │   │  mov    %rdx,%rdi                                                           ▒
> >        │418│find_highest_vector():                                                       ▒
> >        │340│static int find_highest_vector(void *bitmap)                                 ▒
> >        │341│{                                                                            ▒
> >        │342│        int vec;                                                             ▒
> >        │343│        u32 *reg;                                                            ▒
> >        │   │                                                                             ▒
> >        │345│        for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;                   ▒
> >        │   │→ callq  *%rax                                                               ▒
> >        │   │  mov    -0x4(%rbp),%esi                                                     ▒
> >        │343│             vec >= 0; vec -= APIC_VECTORS_PER_REG) {                        ▒
> >        │344│                reg = bitmap + REG_POS(vec);                                 ▒
> >        │345│                if (*reg)                                                    ▒
> >   0.05 │75:│  cmp    $0xffffffff,%eax                                                    ▒
> >        │   │↑ je     63                                                                  ▒
> >   1.95 │   │  mov    %eax,%edx                                                           ▒
> >   1.45 │   │  and    $0xf0,%edx                                                          
> > 
> > 
> > Look at the assembly code block where I have put a 'x' on the right. Apparently the
> > assembly code doesn't match the C source code arrounded. Let's look the correct disassemble
> > result from gdb:
> > 
> > 340		for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> >    0x000000000003b4e0 <+64>:	mov    $0xe0,%eax
> > 
> > 342			reg = bitmap + REG_POS(vec);
> > 343			if (*reg)
> >    0x000000000003b4e5 <+69>:	mov    %eax,%edx
> >    0x000000000003b4e7 <+71>:	sar    $0x5,%edx
> >    0x000000000003b4ea <+74>:	shl    $0x4,%edx
> >    0x000000000003b4ed <+77>:	movslq %edx,%rdx
> >    0x000000000003b4f0 <+80>:	mov    0x200(%rcx,%rdx,1),%edx
> >    0x000000000003b4f7 <+87>:	test   %edx,%edx
> >    0x000000000003b4f9 <+89>:	jne    0x3b528 <apic_has_interrupt_for_ppr+136>
> > 
> > 341		     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
> >    0x000000000003b4fb <+91>:	sub    $0x20,%eax
> > 
> > 340		for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> >    0x000000000003b4fe <+94>:	cmp    $0xffffffe0,%eax
> >    0x000000000003b501 <+97>:	jne    0x3b4e5 <apic_has_interrupt_for_ppr+69>
> > 
> > 
> > Compared to gdb, perf-annoate has messed up. is it a bug or just perf is not as perfect as gdb?
> > 
> > -- 
> > Thanks,
> > Changbin Du
> 
> 

-- 
Thanks,
Changbin Du

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  reply	other threads:[~2017-09-13  2:00 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-12 10:10 Does perf-annotate work correctly? Du, Changbin
2017-09-12 14:33 ` Arnaldo Carvalho de Melo
2017-09-13  1:54   ` Du, Changbin [this message]
2017-09-26  6:06     ` Du, Changbin
2017-09-13  9:14   ` Du, Changbin
2017-10-13 10:15     ` Du, Changbin
2017-10-16  9:28       ` Jiri Olsa
2017-10-16  9:34         ` Du, Changbin
2017-10-16  9:30       ` Jiri Olsa
2017-10-16  9:35         ` Du, Changbin
2017-10-16 10:45           ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170913015425.GA607@intel.com \
    --to=changbin.du@intel.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.