Re: exit timing analysis v1 - comments&discussions welcome

Kernel KVM-PPC virtualization development

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
To: kvm-ppc@vger.kernel.org
Subject: Re: exit timing analysis v1 - comments&discussions welcome
Date: Thu, 09 Oct 2008 08:02:46 +0000
Message-ID: <48EDBAA6.9050308@linux.vnet.ibm.com>
In-Reply-To: <48DA0747.3020004@linux.vnet.ibm.com>

Hollis Blanchard wrote:
> On Wed, 2008-10-08 at 15:49 +0200, Christian Ehrhardt wrote:
>   
>> Wondering about that 30.5% for postprocessing and 
>> kvmppc_check_and_deliver_interrupts I quickly checked that in detail - 
>> part d is now divided in 4 subparts.
>> I also looked at the return-to-guest path to see whether the expected 
>> part (restoring the TLB) is really the main time eater there. The 
>> result shows clearly that it is.
>>
>> more detailed breakdown:
>> a)  10.94%  - exit, saving guest state (booke_interrupt.S)
>> b)   8.12% - reaching kvmppc_handle_exit
>> c)   7.59%  - syscall exit is checked and an interrupt is queued using 
>> kvmppc_queue_exception
>> d1)  3.33%  - some checks for all exits
>> d2)  8.29% - finding first bit in kvmppc_check_and_deliver_interrupts
>> d3) 17.20% - can_deliver/clear&deliver exception in 
>> kvmppc_check_and_deliver_interrupts
>> d4)  4.47% - updating kvm_stat statistics
>> e)   6.13% - returning from kvmppc_handle_exit to booke_interrupt.S
>> f1) 29.18% - restoring guest tlb
>> f2)  4.69% - restoring guest state ([s]regs)
>>
>> These fractions are % of our ~12µs syscall exit.
>> => restoring tlb on each reenter = 4µs constant overhead
>> => looking a bit into irq delivery and other constant things like 
>> kvm_stat updating
>>
>>     
> ...
>   
>> Now I go for the TLB replacement in f1.
>>     
>
> Hang on... does d3 make sense to you? It doesn't to me, and if there's a
> bug there it will be easier to fix than rewriting the TLB code. :)
>   
I have not given up on improving that part either :-)
> I think your core runs at 667MHz, right? So that's 1.5 ns/cycle. 17.20%
> of 12µs is 2064ns, or about 1300 cycles. (Check my math.)
>   
I get the same results. 1% ~ 80 cycles.
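
The conversion above can be reproduced with a few lines of C. This is only a sketch: the 667 MHz clock and the ~12µs exit time are the figures quoted in this thread, and the helper names (share_to_ns, ns_to_cycles, print_share) are made up for illustration. With these inputs 1% comes out at about 80 cycles and 17.20% at about 1376 cycles, in the same range as the rounded figure quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed figures taken from the thread: 667 MHz core, ~12 us syscall exit. */
#define CORE_MHZ      667.0
#define EXIT_TIME_NS  12000.0

/* Convert a share of the exit time (in percent) to nanoseconds. */
double share_to_ns(double percent)
{
    assert(percent >= 0.0 && percent <= 100.0);
    return EXIT_TIME_NS * percent / 100.0;
}

/* Convert nanoseconds to core cycles: cycles = ns * MHz / 1000. */
double ns_to_cycles(double ns)
{
    return ns * CORE_MHZ / 1000.0;
}

/* Print one line of the breakdown in both nanoseconds and cycles. */
void print_share(const char *label, double percent)
{
    double ns = share_to_ns(percent);

    printf("%-4s %6.2f%% = %7.1f ns = %7.1f cycles\n",
           label, percent, ns, ns_to_cycles(ns));
}
```

Calling print_share("d3", 17.20) or print_share("f1", 29.18) gives the per-part cost directly, which makes it easier to compare runs.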
> Now when I look at kvmppc_core_deliver_interrupts(), I'm not sure where
> that time is going. We're assuming the find_first_bit() loop usually
> executes once, for syscall. Does it actually execute more than that? I
> don't expect any of kvmppc_can_deliver_interrupt(),
> kvmppc_booke_clear_exception(), or kvmppc_booke_deliver_interrupt() to
> take lots of time.
>   
You can see below that I already had a more detailed breakdown in my 
earlier mail:
[...]
d2)  8.84% -   8.56%     -   9.28%      -   8.31% finding first bit in 
kvmppc_check_and_deliver_interrupts
d3)  6.53% -   5.25%     -   6.63%      -   5.10% can_deliver in 
kvmppc_check_and_deliver_interrupts
d4) 13.66% -  15.37%     -  14.12%      -  14.92% clear&deliver 
exception in kvmppc_check_and_deliver_interrupts
[...]
> Could it be cache effects? exception_priority[] and priority_exception[]
> are 16 bytes each, and our L1 cacheline is 32 bytes, so they should both
> fit into one... except they're not aligned.
>   
I would be so happy if I had hardware performance counters for things 
like cache misses :-)
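
Without counters, one way to rule the alignment question out is to force both tables into a single line. The following is a user-space sketch, not the kernel code: the table names come from the mail above, while the 1-byte element type, the wrapper struct priority_tables, the helper shares_one_line() and the use of the GCC/Clang aligned attribute are assumptions for illustration. The table contents are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1_CACHE_LINE   32   /* line size quoted in the thread */
#define NUM_PRIORITIES  16   /* assumed: 16 one-byte entries = 16 bytes */

/* Both lookup tables packed back to back and aligned to one cache line. */
struct priority_tables {
    unsigned char exception_priority[NUM_PRIORITIES];
    unsigned char priority_exception[NUM_PRIORITIES];
} __attribute__((aligned(L1_CACHE_LINE)));

/* Zero-initialized here; the real contents are not part of this sketch. */
struct priority_tables tables;

/* Return nonzero if [start, start + len) lies within one cache line. */
int shares_one_line(const void *start, size_t len)
{
    uintptr_t first, last;

    assert(len > 0);
    first = (uintptr_t)start / L1_CACHE_LINE;
    last = ((uintptr_t)start + len - 1) / L1_CACHE_LINE;
    return first == last;
}
```

If the numbers for d2/d3 do not move after such a change, cache misses on these tables are probably not the main cost.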
> Also, it looks like we use the generic find_first_bit(). That may be
> more expensive than we'd like. However, since
> vcpu->arch.pending_exceptions is a single long (not an arbitrary sized
> bitfield), we should be able to use ffs() instead, which has an
> optimized PowerPC implementation. That might help a lot.
>   
Good idea.
I'll check this and some other small improvements I have in mind.
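
A rough user-space sketch of the difference, assuming the pending exceptions fit in one unsigned long. The function names are made up, first_bit_loop() is only a stand-in for a generic bit search (not the kernel's find_first_bit() implementation), and __builtin_ffsl() is the GCC/Clang builtin used here in place of the kernel's ffs(). Like ffs(), the builtin returns a 1-based position and 0 for an empty word, so the result has to be adjusted.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for a generic search: test the word one bit at a time. */
unsigned int first_bit_loop(unsigned long pending)
{
    unsigned int bit;

    for (bit = 0; bit < BITS_PER_WORD; bit++)
        if (pending & (1UL << bit))
            return bit;
    return BITS_PER_WORD;   /* nothing pending */
}

/* Single-word search: the builtin is 1-based and returns 0 for no bit. */
unsigned int first_bit_ffs(unsigned long pending)
{
    int pos = __builtin_ffsl((long)pending);

    return pos ? (unsigned int)(pos - 1) : BITS_PER_WORD;
}

/* Sanity helper: both searches must agree for a given word. */
void check_agree(unsigned long pending)
{
    assert(first_bit_loop(pending) == first_bit_ffs(pending));
}
```

Both functions return the 0-based index of the lowest set bit and BITS_PER_WORD for an empty word, so one could be swapped for the other without touching the caller.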

> We might even be able to replace find_next_bit() too, by shifting a mask
> over each loop, but I don't think we'll have to, since I expect the
> common case to be we can deliver the first pending exception. (Worth
> checking? :)
>   
I'm not sure. It's surely worth checking how often that second 
find_next_bit is called.
If that number is very small, it's not worth it.
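
To get that number, a counter in the loop would be enough. Below is a user-space sketch under stated assumptions: the callback type, the two example predicates, deliver_first_possible() and the extra_lookups counter are all hypothetical, and instead of shifting a mask it clears the lowest set bit on each pass (a variant of the same idea), using the GCC/Clang builtin __builtin_ctzl().

```c
#include <assert.h>

/* Hypothetical predicate: nonzero if the exception at `priority` can be delivered. */
typedef int (*can_deliver_fn)(unsigned int priority);

/* Example predicates, for illustration only. */
int always_deliverable(unsigned int priority) { (void)priority; return 1; }
int only_above_three(unsigned int priority) { return priority > 3; }

/* Counts how often the search had to go past the first pending bit. */
unsigned long extra_lookups;

/* Return the first deliverable pending priority, or -1 if there is none. */
int deliver_first_possible(unsigned long pending, can_deliver_fn can_deliver)
{
    unsigned long mask = pending;
    int first = 1;

    assert(can_deliver != 0);
    while (mask) {
        /* Index of the lowest set bit; mask is nonzero here. */
        unsigned int priority = (unsigned int)__builtin_ctzl(mask);

        if (!first)
            extra_lookups++;   /* this is the "second lookup" case */
        first = 0;
        if (can_deliver(priority))
            return (int)priority;
        mask &= mask - 1;      /* drop the lowest set bit and retry */
    }
    return -1;
}
```

If extra_lookups stays close to zero under a real workload, the common case is indeed the first pending exception and the second lookup is not worth optimizing.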

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
