From: Andrew Theurer <habanero@us.ibm.com>
To: "Dave Thompson (davetho)" <davetho@cisco.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: CPU Utilization
Date: Tue, 13 Dec 2005 09:35:35 -0600
Message-ID: <439EEA47.7020700@us.ibm.com>
In-Reply-To: <5440A5A36B8CED4B9F54524343CB6B68FD1535@xmb-rtp-215.amer.cisco.com>

Dave Thompson (davetho) wrote:

>>-----Original Message-----
>>From: Andrew Theurer [mailto:habanero@us.ibm.com] 
>>Sent: Monday, December 12, 2005 9:24 PM
>>To: Dave Thompson (davetho)
>>Cc: Anthony Liguori; xen-devel@lists.xensource.com
>>Subject: Re: [Xen-devel] CPU Utilization
>>
>>
>>>But what else is running?  In this case I only have dom0 configured,
>>>there is no domU.  The only other possibility would be the hypervisor
>>>and I hope the hypervisor is not accounting for the other 30%.
>>> 
>>>
>>If xend is started, you may have the software bridge running 
>>which can use as much as 10% cpu.
>>
>
>But I would think that the bridge activity should be showing up
>in the top CPU summary as well.  It is running on domain 0 after all.
>I know one person suggested that kernel activity is not represented
>in the top CPU util output.  But I don't see how that can be right.
>If so, where else is that time accounted for?  It seems to be all
>there (in the sy, hi, and si values).
>
>
>>Also, I don't see soft ints in that top output.  
>>That could also be another ~7% cpu.
>>
>
>Soft interrupt time is accounted for in the si field (15%) of the
>summary.  I believe that is where most (if not all) of the TCP
>processing is performed. Here is the top CPU summary display again:
>
>Cpu(s):  1.0% us,  7.3% sy,  0.0% ni, 73.3% id,  0.0% wa,  3.3% hi,
>15.0% si
>
>
Sorry, I overlooked the si.
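
As a quick check on the figures quoted above, the busy share that top itself reports can be summed from that summary line. This is only a sketch: the regex and the helper names (parse_top_cpu, busy_percent) are made up for illustration and assume the procps-style "Cpu(s):" layout shown in the quote. It treats everything except id and wa as busy time.

```python
import re

def parse_top_cpu(line):
    """Return a dict such as {'us': 1.0, 'sy': 7.3, ...} from a top Cpu(s) line."""
    # Each field looks like "7.3% sy"; the "Cpu(s):" prefix has no "%" and is skipped.
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%\s*(\w+)", line)}

def busy_percent(fields):
    """Sum every bucket except idle (id) and I/O wait (wa)."""
    return sum(v for k, v in fields.items() if k not in ("id", "wa"))

# The summary line quoted in this thread, joined onto one line.
summary = ("Cpu(s):  1.0% us,  7.3% sy,  0.0% ni, 73.3% id,  "
           "0.0% wa,  3.3% hi, 15.0% si")
fields = parse_top_cpu(summary)
print(round(busy_percent(fields), 1))
```

The sum comes to roughly 26.6%, which matches the 73.3% idle figure to within rounding, so top's own buckets are internally consistent and the si time is included.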

>>Also xen is doing some work, receiving the real interrupts
>>and generating virtual interrupts to dom0, so with all this,
>>it is possible that you are using another 30% unseen 
>>in top.
>>
>
>But aren't the hypervisor calls actually still being accounted for
>by the domain since clock ticks are not lost but made up for in the
>timer_interrupt() function of arch/xen/i386/kernel/time.c?  The
>only issue is really when a domain is preempted by another domain
>by the xen scheduler and this is actually a problem in the other
>direction.  The swapped out domain will still account for the
>time in whichever time bucket it was using when the domain was
>preempted (so the same time is accounted for by both domains).
>Basically the aggregated CPU time for all domains on a CPU could
>add up to more than 100% because of this.  If the domain is
>re-scheduled because of a SCHEDOP_block in the idle loop, the time
>will be properly accounted for as idle time.
>
I wonder if this works in all situations.  This problem seems 
familiar.  Before the kernel accounted for si and hi properly, we had a 
very similar situation with this type of workload: lots of cpu time 
unaccounted for because the interrupt processing happened mostly when the 
system was idle, and the timer tick did not account for this properly.  
I wonder if we have a similar problem in xen/linux.  If lost ticks are 
"queued up" but all charged to just one mode, then I think we 
could be way off in some situations like this.
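
The effect described above can be illustrated with a toy model. This is a hedged sketch, not how Linux or Xen actually do their accounting: it assumes the accounting code looks at the CPU only at the instant the tick fires and charges the whole tick period to whatever mode it sees. The function name and all parameters are hypothetical.

```python
def tick_sampled_busy(tick_us, burst_start_us, burst_len_us, n_ticks):
    """Return (true_busy_fraction, tick_reported_busy_fraction).

    Toy model: in every tick period the CPU is idle except for one burst of
    interrupt work covering [burst_start_us, burst_start_us + burst_len_us).
    The tick fires at offset 0 of each period and the whole period is
    charged to whatever the CPU is doing at that instant.
    """
    true_busy = 0
    reported_busy = 0
    for _ in range(n_ticks):
        true_busy += burst_len_us
        # The tick only sees the burst if the burst is running at offset 0.
        tick_hits_burst = burst_start_us == 0 and burst_len_us > 0
        if tick_hits_burst:
            reported_busy += tick_us
    total = tick_us * n_ticks
    return true_busy / total, reported_busy / total

# 10 ms tick, 3 ms of interrupt work starting 2 ms after each tick.
print(tick_sampled_busy(10_000, 2_000, 3_000, 100))
# Same work, but starting exactly on the tick.
print(tick_sampled_busy(10_000, 0, 3_000, 100))
```

In the first case the CPU is really 30% busy but the tick never observes it; in the second the same 30% is reported as 100%. Charging a batch of queued-up lost ticks to a single mode has the same flavor of error.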

>
>However, none of this really matters for my case since I am
>only running domain 0, there is no guest domain.  I just want
>a good explanation why 'xm top' is reporting 30% more CPU utilization
>than top in this case.
>
>
>>Best way to confirm this would be to use xenoprof.
>>
>
>Xenoprof is great for seeing which kernel functions are taking
>the majority of time but does it really help with CPU utilization?
>It counts (in the default case) unhalted clock cycles and in the
>xen idle loop the processor is halted (to save power) so the
>clock cycles are not accounted for.  Is this right or am I
>missing something?
>
I guess I was hoping to find a smoking gun in xen :).  The only other 
thing I think we could do is count the total number of samples we get 
over x seconds and compare this with the number of samples we would get 
in the same time period on a 100% busy system.  We should then be able 
to figure out what percentage of the time the cpu was halted.
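
The arithmetic behind that comparison is simple enough to sketch. The helper below is hypothetical and assumes both runs use the same event (the default unhalted clock cycles mentioned above), the same sampling period, and the same wall-clock duration; the sample counts in the example are invented for illustration, not measurements.

```python
def halted_percent(samples_observed, samples_fully_busy):
    """Estimate the percentage of time the CPU was halted.

    samples_observed:   samples collected during the workload run
    samples_fully_busy: samples collected over the same duration on a
                        100% busy system (the calibration run)
    """
    if samples_fully_busy <= 0:
        raise ValueError("calibration run must yield at least one sample")
    # Clamp in case measurement noise pushes the ratio above 1.
    unhalted_ratio = min(samples_observed / samples_fully_busy, 1.0)
    return 100.0 * (1.0 - unhalted_ratio)

# Invented numbers: 2500 samples under load vs 10000 on a saturated CPU.
print(halted_percent(2500, 10000))
```

With these made-up counts the CPU would have been unhalted a quarter of the time, so about 75% halted. The result could then be compared against the idle figures from top and 'xm top'.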

-Andrew

