From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: KVM: only abort guest entry if timer count goes from 0->1 Date: Thu, 12 Jun 2008 15:09:19 +0300 Message-ID: <485111EF.1080607@qumranet.com> References: <20080611225238.GB23771@dmt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm-devel To: Marcelo Tosatti Return-path: Received: from il.qumranet.com ([212.179.150.194]:37986 "EHLO il.qumranet.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755988AbYFLML1 (ORCPT ); Thu, 12 Jun 2008 08:11:27 -0400 In-Reply-To: <20080611225238.GB23771@dmt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: Marcelo Tosatti wrote: > Only abort guest entry if the timer count went from 0->1, since for 1->2 > or larger the bit will either be set already or a timer irq will have > been injected. > > Using atomic_inc_and_test() for it also introduces an SMP barrier > to the LAPIC version (thought it was unecessary because of timer > migration, but guest can be scheduled to a different pCPU between exit > and kvm_vcpu_block(), so there is the possibility for a race). > > Noticed by Avi. > > Applied, thanks. > diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c > index 9e3391e..c0f7872 100644 > --- a/arch/x86/kvm/i8254.c > +++ b/arch/x86/kvm/i8254.c > @@ -198,14 +198,11 @@ static int __pit_timer_fn(struct kvm_kpit_state *ps) > struct kvm_vcpu *vcpu0 = ps->pit->kvm->vcpus[0]; > struct kvm_kpit_timer *pt = &ps->pit_timer; > > - atomic_inc(&pt->pending); > - smp_mb__after_atomic_inc(); > - if (vcpu0) { > + if (!atomic_inc_and_test(&pt->pending)) > set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests); > - if (waitqueue_active(&vcpu0->wq)) { > - vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; > - wake_up_interruptible(&vcpu0->wq); > - } > + if (vcpu0 && waitqueue_active(&vcpu0->wq)) { > + vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; > + wake_up_interruptible(&vcpu0->wq); > } > Semi-related: what business does the timer have waking up vcpu0? The timer pin may be masked, or routed via the ioapic to vcpu13. The code here needs to touch irq pin 0, not wake up random vcpus. It can't do that immediately since we're in interrupt context and we can't take the appropriate locks. So we need to go through a workqueue. There's some code for this in the pci device assignment code, so we can just use that when it is merged. Longer term we need to give the pic and ioapic their own locking, likely with spin_lock_irq() so we can touch them from interrupt context. > index 180ba73..73f43de 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -945,8 +945,8 @@ static int __apic_timer_fn(struct kvm_lapic *apic) > int result = 0; > wait_queue_head_t *q = &apic->vcpu->wq; > > - atomic_inc(&apic->timer.pending); > - set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); > + if(!atomic_inc_and_test(&apic->timer.pending)) > + set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); > if (waitqueue_active(q)) { > apic->vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; > wake_up_interruptible(q); > The wakeup is okay since lapic is handled by its vcpu. But do we need to change the mpstate? The lapic code should do that, if it determines that an interrupt is actually generated. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.