From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Glauber Costa <glommer@redhat.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: time accounting problem in pvops kernel
Date: Tue, 17 Aug 2010 15:51:34 -0700
Message-ID: <4C6B1276.9060203@goop.org>
In-Reply-To: <4C6AC705.1080904@redhat.com>

 On 08/17/2010 10:29 AM, Paolo Bonzini wrote:
> Hi,
>
> while experimenting a bit with time.c we found a bug in time
> accounting.  Basically, /proc/stat counts idle time twice for PV guests
> running a pvops kernel

What version?  Upstream and stable kernels contain the changeset "xen:
drop xen_sched_clock in favour of using plain wallclock time" which
should fix a lot of timekeeping/scheduling problems.

Thanks,
    J

> .
>
> To reproduce, try this command in an unloaded guest:
>
> grep cpu0 /proc/stat; sleep 20 ; grep cpu0 /proc/stat
>
> and see the fourth number in /proc/stat (idle) increasing by approximately
> 4000 for a kernel with USER_HZ == 100. If you try these commands
> instead (you need an otherwise unloaded machine for these):
>
> grep cpu0 /proc/stat; timeout 20s yes > /dev/null ; grep cpu0 /proc/stat
> grep cpu0 /proc/stat; timeout 20s dd if=/dev/urandom > /dev/null ; grep cpu0 /proc/stat
>
> the first and third numbers in /proc/stat (user and system) increase by only 2000.
>
> The reason for this seems to be that in xen_timer_interrupt Linux's
> normal timer accounting is called (via evt->event_handler) and this
> calls account_idle_time. However, idle ticks are also added from
> do_stolen_accounting, so that overall they're counted twice.
>
> Related to this, it looks like stolen tick accounting is subtly
> wrong. Even if only part of a tick is stolen by the hypervisor, Linux's
> time accounting will add a whole tick to the user/system/idle time. In
> a dynticks kernel (or maybe even if the scheduling quanta have some
> kind of resonance with the guest's timer interrupt?) the sum of the
> four components user+sys+idle+steal will then be larger than the wall
> time. In fact, I found experimentally steal time to be usually 20%
> off from wall-user-sys-idle when the machine is under moderate load
> (e.g. 5 domains at 100% CPU usage, on a 4-CPU machine). Of course I used
> the correct, divided-by-2 idle time to do this computation.
>
> Paolo

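The check described in the quoted report can be sketched in a few lines of Python. This is only an illustration, not code from the thread: the function names (`parse_cpu_line`, `delta_seconds`, `accounted_total`) are hypothetical, the two sample lines are made-up values chosen to mirror the reported idle delta of about 4000 ticks, and USER_HZ is assumed to be 100 as in the report. The field order follows the standard /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal).

```python
# Assumption: USER_HZ == 100, the case discussed in the report.
USER_HZ = 100

# Standard order of the first eight per-CPU fields in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse one 'cpuN ...' line from /proc/stat into a dict of tick counts."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def delta_seconds(before, after, user_hz=USER_HZ):
    """Per-field difference between two samples, converted from ticks to seconds."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    return {name: (b[name] - a[name]) / user_hz for name in a}


def accounted_total(deltas):
    """Sum of all accounted components; should roughly equal wall time per CPU."""
    return sum(deltas.values())


if __name__ == "__main__":
    # Illustrative samples only (not measured data): idle advances by 4000 ticks.
    before = "cpu0 100 0 50 10000 0 0 0 0"
    after = "cpu0 100 0 50 14000 0 0 0 0"
    wall_seconds = 20  # the 'sleep 20' between the two reads

    deltas = delta_seconds(before, after)
    print("idle seconds accounted:", deltas["idle"])
    print("accounted / wall:", accounted_total(deltas) / wall_seconds)
```

On a real guest, the two lines would come from the `grep cpu0 /proc/stat` commands in the report. A ratio near 2 on an idle guest would match the double-counted idle time described above, while a ratio near 1 would suggest the accounting is consistent.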