public inbox for kvm@vger.kernel.org
From: Ryan Harper <ryanh@us.ibm.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: kvm-devel@lists.sourceforge.net, Ryan Harper <ryanh@us.ibm.com>
Subject: Re: pinning, tsc and apic
Date: Mon, 12 May 2008 16:23:19 -0500	[thread overview]
Message-ID: <20080512212319.GV17938@us.ibm.com> (raw)
In-Reply-To: <48289F36.8020702@codemonkey.ws>

* Anthony Liguori <anthony@codemonkey.ws> [2008-05-12 15:05]:
> Ryan Harper wrote:
> >I've been digging into some of the instability we see when running
> >larger numbers of guests at the same time.  The test I'm currently using
> >involves launching 64 1vcpu guests on an 8-way AMD box.
> 
> Note this is a Barcelona system and therefore should have a 
> fixed-frequency TSC.
> 
> >  With the latest
> >kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
> >64 of these 1 second apart,
> 
> BTW, what if you don't pace-out the startups?  Do we still have issues 
> with that?

Do you mean without the 1 second delay, or with a longer delay?  My
experience is that the delay helps (fewer hangs), but doesn't solve
things completely.

> 
> > and only a handful (1 to 3)  end up not
> >making it up.  In dmesg on the host, I get a couple messages:
> >
> >[321365.362534] vcpu not ready for apic_round_robin
> >
> >and 
> >
> >[321503.023788] Unsupported delivery mode 7
> >
> >Now, the interesting bit for me was when I used numactl to pin the guest
> >to a processor, all of the guests come up with no issues at all.  As I
> >looked into it, it means that we're not running any of the vcpu
> >migration code which on svm is comprised of tsc_offset recalibration and
> >apic migration, and on vmx, a little more per-vcpu work
> >  
> 
> Another data point is that -no-kvm-irqchip doesn't make the situation 
> better.

Right.  Let me clarify: with -no-kvm-irqchip I still see hung guests,
but I need to check whether the apic-related messages show up on the
host in that case; I don't recall for certain.

> >I've convinced myself that svm.c's tsc offset calculation works and
> >handles the migration from cpu to cpu quite well.  I added the following
> >snippet to trigger if we ever encountered the case where we migrated to
> >a tsc that was behind:
> >
> >    rdtscll(tsc_this);
> >    delta = vcpu->arch.host_tsc - tsc_this;
> >    old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
> >    new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
> >    if (new_time < old_time) {
> >        printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
> >               vcpu->cpu, cpu, old_time - new_time);
> >    }
> >    svm->vmcb->control.tsc_offset += delta;
> >  
> 
> Time will never go backwards, but what can happen is that the TSC 
> frequency will slow down.  This is because upon VCPU migration, we don't 
> account for the time between vcpu_put on the old processor and vcpu_load 
> on the new processor.  This time then disappears.

In svm.c, I do think we account for most of that time since the delta
calculation will shift the guest time forward to the tsc value read in
svm_vcpu_load().  We'll still miss the time between fixing the offset
and when the guest can actually read its tsc.

> 
> A possible way to fix this (that's only valid on a processor with a 
> fixed-frequency TSC), is to take a high-res timestamp on vcpu_put, and 
> then on vcpu_load, take the delta timestamp since the old TSC was saved, 
> and use the TSC frequency on the new pcpu to calculate the number of 
> elapsed cycles.
> 
> Assuming a fixed frequency TSC, and a calibrated TSC across all 
> processors, you could get the same effects by using the VT tsc delta 
> logic.  Basically, it always uses the new CPU's TSC unless that would 
> cause the guest to move backwards in time.  As long as you have a 
> stable, calibrated TSC, this would work out.
> 
> Can you try your old patch that did this and see if it fixes the problem?

Yeah, I'll give it a spin.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com


Thread overview: 11+ messages
2008-05-12 19:19 pinning, tsc and apic Ryan Harper
2008-05-12 19:49 ` Anthony Liguori
2008-05-12 21:23   ` Ryan Harper [this message]
2008-05-12 21:44     ` Anthony Liguori
2008-05-13 18:56       ` Ryan Harper
2008-05-14 23:25 ` Marcelo Tosatti
2008-05-14 23:45   ` Anthony Liguori
2008-05-15  6:59     ` Chris Wright
2008-05-15 14:10       ` Anthony Liguori
2008-05-15 16:26       ` Ryan Harper
2008-06-18 13:12       ` [kvm-devel] " Avi Kivity
