From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
To: kvm-ppc@vger.kernel.org
Subject: Re: exit timing analysis v1 - comments&discussions welcome
Date: Thu, 09 Oct 2008 09:35:05 +0000 [thread overview]
Message-ID: <48EDD049.1000709@linux.vnet.ibm.com> (raw)
In-Reply-To: <48DA0747.3020004@linux.vnet.ibm.com>
I modified the code according to your comments and my ideas, the new
values are shown in column impISF (irq delivery, Stat, FindFirstBit)
I changed some code of the statistic updating and the interrupt delivery
and got this:
base - impirq (d3) - impstat (d5) - impboth - impISF
a) 12.57% - 11.13% - 12.05% - 11.03% - 12.28% exit, saving
guest state (booke_interrupt.S)
b) 7.37% - 9.38% - 8.69% - 8.07% - 10.13% reaching
kvmppc_handle_exit
c) 7.38% - 7.20% - 7.49% - 9.78% - 7.85% syscall exit
is checked and a interrupt is queued using kvmppc_queue_exception
d1) 2.49% - 3.39% - 2.56% - 3.30% - 3.70% some checks
for all exits
d2) 8.84% - 8.56% - 9.28% - 8.31% - 6.07% finding
first bit in kvmppc_check_and_deliver_interrupts
d3) 6.53% - 5.25% - 6.63% - 5.10% - 4.27% can_deliver
in kvmppc_check_and_deliver_interrupts
d4) 13.66% - 15.37% - 14.12% - 14.92% - 13.96%
clear&deliver exception in kvmppc_check_and_deliver_interrupts
d5) 3.65% - 4.57% - 2.68% - 4.41% - 3.77% updating
kvm_stat statistics
e) 6.55% - 6.30% - 6.30% - 5.89% - 6.74% returning
from kvmppc_handle_exit to booke_interrupt.S
f1) 30.90% - 28.78% - 30.16% - 29.16% - 31.19% restoring
guest tlb
f2) 4.81% - 4.77% - 5.06% - 4.66% - 5.17% restoring
guest state ([s]regs)
We all see the measurement inaccuracy, but the last columns look good at
the improved sections d2, d3 and d4.
I'll remove these detailed tracing soon and make a larger test hoping
that this will not have the inaccuracy.
But for now I still wonder about the ~14% for clear&deliver - that
should just not be "that" much.
It should be worth to look into that section once again more in detail
first.
Christian Ehrhardt wrote:
> Hollis Blanchard wrote:
>> On Wed, 2008-10-08 at 15:49 +0200, Christian Ehrhardt wrote:
>>
>>> Wondering about that 30.5% for postprocessing and
>>> kvmppc_check_and_deliver_interrupts I quickly checked that in detail
>>> - part d is now divided in 4 subparts.
>>> I also looked at the return to guest path if the expected part
>>> (restoring tlb) is really the main time eater there. The result
>>> shows clearly that it is.
>>>
>>> more detailed breakdown:
>>> a) 10.94% - exit, saving guest state (booke_interrupt.S)
>>> b) 8.12% - reaching kvmppc_handle_exit
>>> c) 7.59% - syscall exit is checked and a interrupt is queued
>>> using kvmppc_queue_exception
>>> d1) 3.33% - some checks for all exits
>>> d2) 8.29% - finding first bit in kvmppc_check_and_deliver_interrupts
>>> d3) 17.20% - can_deliver/clear&deliver exception in
>>> kvmppc_check_and_deliver_interrupts
>>> d4) 4.47% - updating kvm_stat statistics
>>> e) 6.13% - returning from kvmppc_handle_exit to booke_interrupt.S
>>> f1) 29.18% - restoring guest tlb
>>> f2) 4.69% - restoring guest state ([s]regs)
>>>
>>> These fractions are % of our ~12µs syscall exit.
>>> => restoring tlb on each reenter = 4µs constant overhead
>>> => looking a bit into irq delivery and other constant things like
>>> kvm_stat updating
>>>
>>>
>> ...
>>
>>> Now I go for the TLB replacement in f1.
>>>
>>
>> Hang on... does d3 make sense to you? It doesn't to me, and if there's a
>> bug there it will be easier to fix than rewriting the TLB code. :)
>>
> I did not give up improving that part too :-)
>> I think your core runs at 667MHz, right? So that's 1.5 ns/cycle. 17.20%
>> of 12µs is 2064ns, or about 1300 cycles. (Check my math.)
>>
> I get the same results. 1% ~ 80 cycles.
>> Now when I look at kvmppc_core_deliver_interrupts(), I'm not sure where
>> that time is going. We're assuming the first_first_bit() loop usually
>> executes once, for syscall. Does it actually execute more than that? I
>> don't expect any of kvmppc_can_deliver_interrupt(),
>> kvmppc_booke_clear_exception(), or kvmppc_booke_deliver_interrupt() to
>> take lots of time.
>>
> You can see below that I already had a more detailed breakdown in my
> old mail:
> [...]
> d2) 8.84% - 8.56% - 9.28% - 8.31% finding first bit in
> kvmppc_check_and_deliver_interrupts
> d3) 6.53% - 5.25% - 6.63% - 5.10% can_deliver in
> kvmppc_check_and_deliver_interrupts
> d4) 13.66% - 15.37% - 14.12% - 14.92% clear&deliver
> exception in kvmppc_check_and_deliver_interrupts
> [...]
>> Could it be cache effects? exception_priority[] and priority_exception[]
>> are 16 bytes each, and our L1 cacheline is 32 bytes, so they should both
>> fit into one... except they're not aligned.
>>
> I would be so happy if I would have hardware performance counters like
> cache misses :-)
>> Also, it looks like we use the generic find_first_bit(). That may be
>> more expensive than we'd like. However, since
>> vcpu->arch.pending_exceptions is a single long (not an arbitrary sized
>> bitfield), we should be able to use ffs() instead, which has an
>> optimized PowerPC implementation. That might help a lot.
>>
> good idea.
> I'll check this and some other small improvements I have in mind.
>
>> We might even be able to replace find_next_bit() too, by shifting a mask
>> over each loop, but I don't think we'll have to, since I expect the
>> common case to be we can deliver the first pending exception. (Worth
>> checking? :)
>>
> I'm not sure. It's surely worth checking how often that second
> find_next_bit is called.
> If that number is far too small it's not worth.
>
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
next prev parent reply other threads:[~2008-10-09 9:35 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-24 9:24 exit timing analysis v1 - comments&discussions welcome Christian Ehrhardt
2008-09-24 15:14 ` Hollis Blanchard
2008-09-25 9:32 ` Liu Yu-B13201
2008-09-25 15:18 ` Hollis Blanchard
2008-10-02 12:02 ` Christian Ehrhardt
2008-10-07 14:36 ` Christian Ehrhardt
2008-10-08 13:49 ` Christian Ehrhardt
2008-10-08 15:41 ` Hollis Blanchard
2008-10-09 8:02 ` Christian Ehrhardt
2008-10-09 9:35 ` Christian Ehrhardt [this message]
2008-10-09 14:49 ` Christian Ehrhardt
2008-10-10 8:32 ` Christian Ehrhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48EDD049.1000709@linux.vnet.ibm.com \
--to=ehrhardt@linux.vnet.ibm.com \
--cc=kvm-ppc@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox