All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
To: kvm-ppc@vger.kernel.org
Subject: Re: exit timing analysis v1 - comments&discussions welcome
Date: Tue, 07 Oct 2008 14:36:41 +0000	[thread overview]
Message-ID: <48EB73F9.6010006@linux.vnet.ibm.com> (raw)
In-Reply-To: <48DA0747.3020004@linux.vnet.ibm.com>

As intended I separated the time of one of our exits into it's segments 
to see where we loose the constant base time.
I looked at the syscall exit which is pretty fast and does not deviate 
very much.

A little tracing and debugging later I can now show the rough breakdown 
where the time is spent for that exit:

avg time spent in the segments I checked:
a) 11.5%  - exit, saving guest state (booke_interrupt.S)
b) 8.5% - reaching kvmppc_handle_exit
c) 10%  - syscall exit is checked and a interrupt is queued using 
kvmppc_queue_exception
d) 30.5%  - postprocessing for all exits = 
kvmppc_check_and_deliver_interrupts and updating some statistics
e) 5.5% - returning from kvmppc_handle_exit to booke_interrupt.S
f) 34%  - restoring guest state tlb/regs (booke_interrupt.S)

Ok, looking at these numbers I think we have two areas to check in detail.
- On one hand the post processing part (d) actually should not take that 
much time.
- On the other hand we might be able to improve the last segment by 
chanign our tlb management (Liu made a suggestion on the list two weeks ago)

When thinking about this breakdown while having the complete timing in 
mind (see below) shows us that the average time spent for one of our 
most frequent exit is near the minimum duration (EMUL). This means if we 
can reduce some of the constant overehad this might really help us in 
those frequent and hot paths (also birt tlb misses will benefit from that).

I continue on that and keep you all updated.
Christian

Christian Ehrhardt wrote:
> Today I refactored my code in a way that it might eventually get 
> upstream.
> That means I removed some debug blocks, made it configurable and print 
> output only on request per sysrq.
> Feel free to review that code now, otherwise wait for my separate rfc 
> post that will follow ~next week.
>
> Additionally I realized in a discussion that my host system was still 
> configured in demo mode (X11+xfce, dhcpd, avahi, ... running in host).
> I took some updated numbers without all those background activities - 
> testacase again boot, login, ls, kill.
>
> I intend to the irq path in detail as suggested and first of all I'll 
> try to get some timing of the "segments" of such a call.
> This should show us where we loose our "constant base time" and give 
> me some hints when looking at it in detail.
>
> FYI - new numbers without high background load (monospace recommended):
> NOPV:
>        MMIO count      10105 min         46 max       1534 
> sum              2952467 avg         292.1788 stddev      295.933 %  2.04
>         DCR count       5428 min         40 max        209 
> sum               246168 avg          45.3515 stddev        6.758 %  0.17
>      SIGNAL count        695 min         65 max       3755 
> sum                89767 avg         129.1612 stddev      314.421 %  0.06
>    ITLBREAL count      80045 min         13 max        108 
> sum              1063127 avg          13.2816 stddev        2.338 %  0.73
>    ITLBVIRT count    1000585 min         21 max     264827 
> sum             24300596 avg          24.2864 stddev      264.753 % 16.78
>    DTLBREAL count      91206 min         13 max         69 
> sum              1285214 avg          14.0913 stddev        2.225 %  0.89
>    DTLBVIRT count     977434 min         21 max       1446 
> sum             24007008 avg          24.5613 stddev        4.426 % 16.58
>     SYSCALL count      10460 min         11 max         55 
> sum               116447 avg          11.1326 stddev        1.929 %  0.08
>         ISI count      11724 min         11 max         61 
> sum               130007 avg          11.0890 stddev        1.929 %  0.09
>         DSI count      20737 min         11 max         57 
> sum               230009 avg          11.0917 stddev        1.914 %  0.16
>    EMULINST count    5683356 min         11 max       3778 
> sum             79339467 avg          13.9600 stddev       50.275 % 54.78
>         DEC count      13079 min         50 max        826 
> sum               732712 avg          56.0220 stddev       22.382 %  0.51
>      EXTINT count         55 min         30 max       1478 
> sum                10996 avg         199.9273 stddev      238.150 %  0.01
>  FP_UNAVAIL count        280 min         11 max         53 
> sum                 3163 avg          11.2964 stddev        3.495 %  0.00
> TIMEINGUEST count    7905189 min          0 max       3688 
> sum             10330742 avg           1.3068 stddev        8.970 %  7.13
>              csum   15810378                               
> sumsum         144837890 => ~2:24 runtime
>                                                            
> sumavg           7241894
>
> PV:
>        MMIO count      12505 min         46 max       3087 
> sum              3693782 avg         295.3844 stddev      260.788 %  4.01
>         DCR count       5595 min         40 max        706 
> sum               273578 avg          48.8969 stddev       31.305 %  0.30
>      SIGNAL count        654 min         65 max       4132 
> sum               300027 avg         458.7569 stddev      571.130 %  0.33
>    ITLBREAL count      71711 min         13 max        104 
> sum               943053 avg          13.1507 stddev        2.360 %  1.02
>    ITLBVIRT count     750649 min         21 max       1503 
> sum             18178245 avg          24.2167 stddev        7.335 % 19.71
>    DTLBREAL count      83356 min         13 max        102 
> sum              1146242 avg          13.7512 stddev        2.406 %  1.24
>      DTLBPV count      30086 min         20 max        237 
> sum               653556 avg          21.7229 stddev        4.639 %  0.71
>    DTLBVIRT count     772811 min         21 max        713 
> sum             19079477 avg          24.6884 stddev        6.593 % 20.69
>     SYSCALL count       7647 min         11 max         57 
> sum                84821 avg          11.0921 stddev        1.897 %  0.09
>       HCALL count          1 min         19 max         19 
> sum                   19 avg          19.0000 stddev        0.000 %  0.00
>         ISI count       9895 min         11 max         73 
> sum               109667 avg          11.0831 stddev        1.904 %  0.12
>         DSI count      17974 min         10 max         57 
> sum               199504 avg          11.0996 stddev        2.046 %  0.22
>    EMULINST count    2567245 min         11 max       4212 
> sum             40501314 avg          15.7762 stddev       65.673 % 43.92
>         DEC count       7488 min         51 max        641 
> sum               426813 avg          56.9996 stddev       23.893 %  0.46
>      EXTINT count       2215 min         31 max       1677 
> sum               297495 avg         134.3093 stddev      116.219 %  0.32
>  FP_UNAVAIL count        258 min         11 max         11 
> sum                 2838 avg          11.0000 stddev        0.000 %  0.00
> TIMEINGUEST count    4340090 min          0 max       3850 
> sum              6316079 avg           1.4553 stddev       12.599 %  6.85
>              csum    8680180                               
> sumsum          92206510 => ~1:32 runtime
>                                                            
> sumavg           4610325
>
> Hollis Blanchard wrote:
>> On Thu, 2008-09-25 at 17:32 +0800, Liu Yu-B13201 wrote:
>>>> On Wed, 2008-09-24 at 11:24 +0200, Christian Ehrhardt wrote:
>>>>>                      count      min         max             
>>>>> sum            avg        stddev     time%
>>>>>     EMULINST     6,423,699       12     247,582      
>>>> 91,732,705       
>>>>> 14.2804       241.200     45.36
>>>>>     ITLBVIRT     1,777,242       21     264,257      
>>>> 47,116,557       
>>>>> 26.5111       286.040     23.30
>>>>>     DTLBVIRT     1,544,241       22     263,947      
>>>> 41,765,447       
>>>>> 27.0459       218.997     20.65
>>>> (The max here is of course when the guest was de-scheduled in favor of
>>>> another host process.)
>>>>
>>>> I think it's interesting that the min instruction emulation time is 12
>>>> usecs. In fact, our "null" exits where we do almost no processing 
>>>> (ISI,
>>>> DSI, syscall) are 11 usecs, which suggests we have a problem with
>>>> interrupt handler overhead (for all exit types).
>>>>
>>> Will you consider about moving tlb manipulation out of entering path?
>>> Examining the modify array may cost some time.
>>
>> Yes, definitely.
>>
>> That's about the only thing I can see in the 440 interrupt path that
>> might take significant time. Unfortunately we can't profile that code
>> because we have no performance counters, so finding the problem will
>> require some experimentation.
>
>


-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization


  parent reply	other threads:[~2008-10-07 14:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-24  9:24 exit timing analysis v1 - comments&discussions welcome Christian Ehrhardt
2008-09-24 15:14 ` Hollis Blanchard
2008-09-25  9:32 ` Liu Yu-B13201
2008-09-25 15:18 ` Hollis Blanchard
2008-10-02 12:02 ` Christian Ehrhardt
2008-10-07 14:36 ` Christian Ehrhardt [this message]
2008-10-08 13:49 ` Christian Ehrhardt
2008-10-08 15:41 ` Hollis Blanchard
2008-10-09  8:02 ` Christian Ehrhardt
2008-10-09  9:35 ` Christian Ehrhardt
2008-10-09 14:49 ` Christian Ehrhardt
2008-10-10  8:32 ` Christian Ehrhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48EB73F9.6010006@linux.vnet.ibm.com \
    --to=ehrhardt@linux.vnet.ibm.com \
    --cc=kvm-ppc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.