From: "Du, Changbin" <changbin.du@intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: "Du, Changbin" <changbin.du@intel.com>,
peterz@infradead.org, mingo@redhat.com,
alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Subject: Re: Does perf-annotate work correctly?
Date: Wed, 13 Sep 2017 09:54:25 +0800 [thread overview]
Message-ID: <20170913015425.GA607@intel.com> (raw)
In-Reply-To: <20170912143350.GA3452@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 9333 bytes --]
On Tue, Sep 12, 2017 at 11:33:50AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 12, 2017 at 06:10:35PM +0800, Du, Changbin escreveu:
> > When a annotate a symbol, I find the annotated C source code doesn't match assembly code.
> > So I cannot determine which line of C code has much overhead withou gdb's help.
> >
> > Here is a example result of function apic_has_interrupt_for_ppr() in kvm module.
>
> Ok, was this using the module .ko file or /proc/kcore? You forgot to
> cut'n'paste the first line on the screen.
>
It is arch/x86/kvm/kvm.ko.
> Also, how did you use gdb?
>
$ gdb arch/x86/kvm/kvm.ko
$ (gdb) disassemble /s apic_has_interrupt_for_ppr
> perf uses objdump to do the disassembly, and depending on how it is used
> (live system, post processing on a different machine, permissions) it
> may use different files to do the disassembly.
>
But objdump has same out as gdb. (Always on same machine, and no binary changed.)
$ objdump -d -S arch/x86/kvm/kvm.o
...
static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
{
3b4a0: e8 00 00 00 00 callq 3b4a5 <apic_has_interrupt_for_ppr+0x5>
3b4a5: 55 push %rbp
3b4a6: 48 89 e5 mov %rsp,%rbp
3b4a9: 48 83 ec 08 sub $0x8,%rsp
int highest_irr;
if (kvm_x86_ops->sync_pir_to_irr && apic->vcpu->arch.apicv_active)
3b4ad: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 3b4b4 <apic_has_interrupt_for_ppr+0x14>
3b4b4: 48 8b 80 38 02 00 00 mov 0x238(%rax),%rax
3b4bb: 48 85 c0 test %rax,%rax
3b4be: 74 10 je 3b4d0 <apic_has_interrupt_for_ppr+0x30>
3b4c0: 48 8b 97 88 00 00 00 mov 0x88(%rdi),%rdx
3b4c7: 80 ba 28 03 00 00 00 cmpb $0x0,0x328(%rdx)
3b4ce: 75 3a jne 3b50a <apic_has_interrupt_for_ppr+0x6a>
/*
* Note that irr_pending is just a hint. It will be always
* true with virtual interrupt delivery enabled.
*/
if (!apic->irr_pending)
3b4d0: 80 bf 91 00 00 00 00 cmpb $0x0,0x91(%rdi)
3b4d7: 74 2a je 3b503 <apic_has_interrupt_for_ppr+0x63>
3b4d9: 48 8b 8f a0 00 00 00 mov 0xa0(%rdi),%rcx
static int find_highest_vector(void *bitmap)
{
int vec;
u32 *reg;
for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
3b4e0: b8 e0 00 00 00 mov $0xe0,%eax
vec >= 0; vec -= APIC_VECTORS_PER_REG) {
reg = bitmap + REG_POS(vec);
if (*reg)
3b4e5: 89 c2 mov %eax,%edx
3b4e7: c1 fa 05 sar $0x5,%edx
3b4ea: c1 e2 04 shl $0x4,%edx
3b4ed: 48 63 d2 movslq %edx,%rdx
3b4f0: 8b 94 11 00 02 00 00 mov 0x200(%rcx,%rdx,1),%edx
3b4f7: 85 d2 test %edx,%edx
3b4f9: 75 2d jne 3b528 <apic_has_interrupt_for_ppr+0x88>
> Please provide more detailed information on the exact command line
> arguments and usage scenario.
>
> - Arnaldo
>
> > │580 __clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention); ▒
> > │581 } ▒
> > │ ▒
> > │583 static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr) ▒
> > │584 { ▒
> > 0.88 │30: cmpb $0x0,0x91(%rdi) ▒
> > 2.54 │ ↓ je 63 ▒
> > 0.20 │ mov 0xa0(%rdi),%rcx ▒
> > │581 int highest_irr; ▒
> > │582 if (kvm_x86_ops->sync_pir_to_irr && apic->vcpu->arch.apicv_active) ▒
> > 4.91 │ mov $0xe0,%eax x ▒
> > 1.46 │45: mov %eax,%edx x ▒
> > 0.02 │ sar $0x5,%edx x ▒
> > 3.57 │ shl $0x4,%edx x ▒
> > 3.34 │ movslq %edx,%rdx x ▒
> > 1.25 │ mov 0x200(%rcx,%rdx,1),%edx x ▒
> > 42.44 │ test %edx,%edx x ▒
> > 0.01 │ ┌──jne 88 x ▒
> > 3.48 │ │ sub $0x20,%eax x ▒
> > 2.24 │ │ cmp $0xffffffe0,%eax x ▒
> > │586│apic_find_highest_irr(): ▒
> > │ │ ▒
> > │407│ /* ▒
> > │408│ * Note that irr_pending is just a hint. It will be always ▒
> > │409│ * true with virtual interrupt delivery enabled. ▒
> > │410│ */ ▒
> > │411│ if (!apic->irr_pending) ▒
> > │ │↑ jne 45 ▒
> > 0.62 │63:│ mov $0xffffffff,%eax ◆
> > 0.83 │ │ leaveq ▒
> > 13.52 │ │← retq ▒
> > │6a:│ mov %esi,-0x4(%rbp) ▒
> > │ │ mov %rdx,%rdi ▒
> > │418│find_highest_vector(): ▒
> > │340│static int find_highest_vector(void *bitmap) ▒
> > │341│{ ▒
> > │342│ int vec; ▒
> > │343│ u32 *reg; ▒
> > │ │ ▒
> > │345│ for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG; ▒
> > │ │→ callq *%rax ▒
> > │ │ mov -0x4(%rbp),%esi ▒
> > │343│ vec >= 0; vec -= APIC_VECTORS_PER_REG) { ▒
> > │344│ reg = bitmap + REG_POS(vec); ▒
> > │345│ if (*reg) ▒
> > 0.05 │75:│ cmp $0xffffffff,%eax ▒
> > │ │↑ je 63 ▒
> > 1.95 │ │ mov %eax,%edx ▒
> > 1.45 │ │ and $0xf0,%edx
> >
> >
> > Look at the assembly code block where I have put a 'x' on the right. Apparently the
> > assembly code doesn't match the C source code arrounded. Let's look the correct disassemble
> > result from gdb:
> >
> > 340 for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> > 0x000000000003b4e0 <+64>: mov $0xe0,%eax
> >
> > 342 reg = bitmap + REG_POS(vec);
> > 343 if (*reg)
> > 0x000000000003b4e5 <+69>: mov %eax,%edx
> > 0x000000000003b4e7 <+71>: sar $0x5,%edx
> > 0x000000000003b4ea <+74>: shl $0x4,%edx
> > 0x000000000003b4ed <+77>: movslq %edx,%rdx
> > 0x000000000003b4f0 <+80>: mov 0x200(%rcx,%rdx,1),%edx
> > 0x000000000003b4f7 <+87>: test %edx,%edx
> > 0x000000000003b4f9 <+89>: jne 0x3b528 <apic_has_interrupt_for_ppr+136>
> >
> > 341 vec >= 0; vec -= APIC_VECTORS_PER_REG) {
> > 0x000000000003b4fb <+91>: sub $0x20,%eax
> >
> > 340 for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> > 0x000000000003b4fe <+94>: cmp $0xffffffe0,%eax
> > 0x000000000003b501 <+97>: jne 0x3b4e5 <apic_has_interrupt_for_ppr+69>
> >
> >
> > Compared to gdb, perf-annoate has messed up. is it a bug or just perf is not as perfect as gdb?
> >
> > --
> > Thanks,
> > Changbin Du
>
>
--
Thanks,
Changbin Du
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
next prev parent reply other threads:[~2017-09-13 2:00 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-12 10:10 Does perf-annotate work correctly? Du, Changbin
2017-09-12 14:33 ` Arnaldo Carvalho de Melo
2017-09-13 1:54 ` Du, Changbin [this message]
2017-09-26 6:06 ` Du, Changbin
2017-09-13 9:14 ` Du, Changbin
2017-10-13 10:15 ` Du, Changbin
2017-10-16 9:28 ` Jiri Olsa
2017-10-16 9:34 ` Du, Changbin
2017-10-16 9:30 ` Jiri Olsa
2017-10-16 9:35 ` Du, Changbin
2017-10-16 10:45 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170913015425.GA607@intel.com \
--to=changbin.du@intel.com \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).