From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: The SMP RHEL 5.1 PAE guest can't boot up issue
Date: Tue, 26 Feb 2008 12:28:01 +0200
Message-ID: <47C3E9B1.7040406@qumranet.com>
References: <200802221657.34243.sheng.yang@intel.com> <47BEF550.8040803@qumranet.com> <10EA09EFD8728347A513008B6B0DA77A02CFD373@pdsmsx411.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel@lists.sourceforge.net
To: "Dong, Eddie"
In-Reply-To: <10EA09EFD8728347A513008B6B0DA77A02CFD373@pdsmsx411.ccr.corp.intel.com>
Sender: kvm-devel-bounces@lists.sourceforge.net
Errors-To: kvm-devel-bounces@lists.sourceforge.net
List-Id: kvm.vger.kernel.org

Dong, Eddie wrote:
>>> I don't know if the patch was still needed now, since it was posted
>>> long ago(I don't know which issue it solved). I'd like to post a
>>> revert patch if necessary.
>>>
>> I believe the patch is still necessary, since we still need to
>> guarantee that a vcpu's tsc is monotonous. I think there are three
>> issues to be addressed:
>>
>> 1. The majority of intel machines don't need the offset adjustment
>> since they already have a constant rate tsc that is synchronized on
>> all cpus. I think this is indicated by X86_FEATURE_CONSTANT_TSC
>> (though I'm not 100% certain if it means that the rate is the same
>> for all cpus, Thomas can you clarify?)
>>
>
> So why not make the TSC_OFFSET adjustment conditional?
>

Yes, that's what I meant. We just need to be sure that this is what
X86_FEATURE_CONSTANT_TSC means.

>
>> This will improve tsc quality for those machines, but we can't depend
>> on it, since some machines don't have constant tsc. Further, I don't
>> think really large machines can have constant tsc since clock
>> distribution becomes difficult or impossible.
>>
>
> For NUMA machines, this is an issue, but depend on how you support
> NUMA.
> One way is to bind VCPUs of a guest to same node if guest is not
> NUMA, if this is the model, then we don't have issue.
> I think Xen is planning in this way and it is same for KVM.
>

This is a user decision, many small VMs or a few larger ones.

To support the "many small VMs", we need to be able to detect the tsc
stability groups. I don't think that's the same as NUMA nodes for
processors with on-board memory controllers (where each processor is a
node).

>> 2. We should implement round robin and lowest priority like qemu does.
>> Xen does the same thing:
>>
>>> /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
>>> #define IRQ0_SPECIAL_ROUTING 1
>>>
>> in arch/x86/hvm/vioapic.c, at least for irq 0.
>>
>
> We did same thing in Xen long time ago to avoid this issue.
> It helps but not perfect.
>

An equivalent hack is now in kvm as well.

-- 
error compiling committee.c: too many arguments to function