* [patch 0/2] timer injection races @ 2008-06-06 19:37 Marcelo Tosatti 2008-06-06 19:37 ` [patch 1/2] KVM: consolidate check for pending vcpu requests Marcelo Tosatti 2008-06-06 19:37 ` [patch 2/2] KVM: close timer injection race window in __vcpu_run Marcelo Tosatti 0 siblings, 2 replies; 7+ messages in thread From: Marcelo Tosatti @ 2008-06-06 19:37 UTC (permalink / raw) To: Avi Kivity, Chris Wright; +Cc: kvm -- ^ permalink raw reply [flat|nested] 7+ messages in thread
* [patch 1/2] KVM: consolidate check for pending vcpu requests 2008-06-06 19:37 [patch 0/2] timer injection races Marcelo Tosatti @ 2008-06-06 19:37 ` Marcelo Tosatti 2008-06-08 7:10 ` Avi Kivity 2008-06-06 19:37 ` [patch 2/2] KVM: close timer injection race window in __vcpu_run Marcelo Tosatti 1 sibling, 1 reply; 7+ messages in thread From: Marcelo Tosatti @ 2008-06-06 19:37 UTC (permalink / raw) To: Avi Kivity, Chris Wright; +Cc: kvm, Marcelo Tosatti [-- Attachment #1: better-vcpu-run-request-handling --] [-- Type: text/plain, Size: 1659 bytes --] A guest vcpu instance can be scheduled to a different physical CPU between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable(). If that happens, the timer will only be migrated to the current pCPU on the next exit, meaning that guest LAPIC timer event can be delayed until a host interrupt is triggered. Fix it by cancelling guest entry if any vcpu request is pending. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Index: kvm/arch/x86/kvm/x86.c =================================================================== --- kvm.orig/arch/x86/kvm/x86.c +++ kvm/arch/x86/kvm/x86.c @@ -2798,6 +2798,8 @@ again: if (vcpu->requests) { if (test_and_clear_bit(KVM_REQ_MIGRATE_TIMER, &vcpu->requests)) __kvm_migrate_timers(vcpu); + if (test_and_clear_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests)) + kvm_x86_ops->tlb_flush(vcpu); if (test_and_clear_bit(KVM_REQ_REPORT_TPR_ACCESS, &vcpu->requests)) { kvm_run->exit_reason = KVM_EXIT_TPR_ACCESS; @@ -2820,21 +2822,13 @@ again: local_irq_disable(); - if (need_resched()) { + if (vcpu->requests || need_resched()) { local_irq_enable(); preempt_enable(); r = 1; goto out; } - if (vcpu->requests) - if (test_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) { - local_irq_enable(); - preempt_enable(); - r = 1; - goto out; - } - if (signal_pending(current)) { local_irq_enable(); preempt_enable(); @@ -2864,9 +2858,6 @@ again: kvm_guest_enter(); - if (vcpu->requests) - if (test_and_clear_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests)) - kvm_x86_ops->tlb_flush(vcpu); KVMTRACE_0D(VMENTRY, vcpu, entryexit); kvm_x86_ops->run(vcpu, kvm_run); -- ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 1/2] KVM: consolidate check for pending vcpu requests 2008-06-06 19:37 ` [patch 1/2] KVM: consolidate check for pending vcpu requests Marcelo Tosatti @ 2008-06-08 7:10 ` Avi Kivity 0 siblings, 0 replies; 7+ messages in thread From: Avi Kivity @ 2008-06-08 7:10 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Chris Wright, kvm Marcelo Tosatti wrote: > A guest vcpu instance can be scheduled to a different physical CPU > between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable(). > > If that happens, the timer will only be migrated to the current pCPU on > the next exit, meaning that guest LAPIC timer event can be delayed until > a host interrupt is triggered. > > Fix it by cancelling guest entry if any vcpu request is pending. > Applied, thanks. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 7+ messages in thread
* [patch 2/2] KVM: close timer injection race window in __vcpu_run 2008-06-06 19:37 [patch 0/2] timer injection races Marcelo Tosatti 2008-06-06 19:37 ` [patch 1/2] KVM: consolidate check for pending vcpu requests Marcelo Tosatti @ 2008-06-06 19:37 ` Marcelo Tosatti 2008-06-08 7:17 ` Avi Kivity 1 sibling, 1 reply; 7+ messages in thread From: Marcelo Tosatti @ 2008-06-06 19:37 UTC (permalink / raw) To: Avi Kivity, Chris Wright; +Cc: kvm, Marcelo Tosatti [-- Attachment #1: fired-but-not-injected-guest-timer --] [-- Type: text/plain, Size: 2350 bytes --] If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Index: kvm/arch/x86/kvm/i8254.c =================================================================== --- kvm.orig/arch/x86/kvm/i8254.c +++ kvm/arch/x86/kvm/i8254.c @@ -200,9 +200,12 @@ static int __pit_timer_fn(struct kvm_kpi atomic_inc(&pt->pending); smp_mb__after_atomic_inc(); - if (vcpu0 && waitqueue_active(&vcpu0->wq)) { - vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; - wake_up_interruptible(&vcpu0->wq); + if (vcpu0) { + set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests); + if (waitqueue_active(&vcpu0->wq)) { + vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; + wake_up_interruptible(&vcpu0->wq); + } } pt->timer.expires = ktime_add_ns(pt->timer.expires, pt->period); Index: kvm/arch/x86/kvm/lapic.c =================================================================== --- kvm.orig/arch/x86/kvm/lapic.c +++ kvm/arch/x86/kvm/lapic.c @@ -946,6 +946,7 @@ static int __apic_timer_fn(struct kvm_la wait_queue_head_t *q = &apic->vcpu->wq; atomic_inc(&apic->timer.pending); + set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); if (waitqueue_active(q)) { apic->vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; wake_up_interruptible(q); Index: kvm/arch/x86/kvm/x86.c =================================================================== --- kvm.orig/arch/x86/kvm/x86.c +++ kvm/arch/x86/kvm/x86.c @@ -2813,6 +2813,7 @@ again: } } + clear_bit(KVM_REQ_PENDING_TIMER, &vcpu->requests); kvm_inject_pending_timer_irqs(vcpu); preempt_disable(); Index: kvm/include/linux/kvm_host.h =================================================================== --- kvm.orig/include/linux/kvm_host.h +++ kvm/include/linux/kvm_host.h @@ -33,6 +33,7 @@ #define KVM_REQ_REPORT_TPR_ACCESS 2 #define KVM_REQ_MMU_RELOAD 3 #define KVM_REQ_TRIPLE_FAULT 4 +#define KVM_REQ_PENDING_TIMER 5 struct kvm_vcpu; extern struct kmem_cache *kvm_vcpu_cache; -- ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 2/2] KVM: close timer injection race window in __vcpu_run 2008-06-06 19:37 ` [patch 2/2] KVM: close timer injection race window in __vcpu_run Marcelo Tosatti @ 2008-06-08 7:17 ` Avi Kivity 2008-06-08 15:08 ` Marcelo Tosatti 0 siblings, 1 reply; 7+ messages in thread From: Avi Kivity @ 2008-06-08 7:17 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Chris Wright, kvm Marcelo Tosatti wrote: > If a timer fires after kvm_inject_pending_timer_irqs() but before > local_irq_disable() the code will enter guest mode and only inject such > timer interrupt the next time an unrelated event causes an exit. > > It would be simpler if the timer->pending irq conversion could be done > with IRQ's disabled, so that the above problem cannot happen. > > For now introduce a new vcpu requests bit to cancel guest entry. > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > Applied this too. > Index: kvm/arch/x86/kvm/i8254.c > =================================================================== > --- kvm.orig/arch/x86/kvm/i8254.c > +++ kvm/arch/x86/kvm/i8254.c > @@ -200,9 +200,12 @@ static int __pit_timer_fn(struct kvm_kpi > > atomic_inc(&pt->pending); > smp_mb__after_atomic_inc(); > - if (vcpu0 && waitqueue_active(&vcpu0->wq)) { > - vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; > - wake_up_interruptible(&vcpu0->wq); > + if (vcpu0) { > + set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests); > + if (waitqueue_active(&vcpu0->wq)) { > + vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; > + wake_up_interruptible(&vcpu0->wq); > + } > } > > We probably ought to wakeup only if pt->pending was zero, no? -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 2/2] KVM: close timer injection race window in __vcpu_run 2008-06-08 7:17 ` Avi Kivity @ 2008-06-08 15:08 ` Marcelo Tosatti 2008-06-09 3:29 ` Marcelo Tosatti 0 siblings, 1 reply; 7+ messages in thread From: Marcelo Tosatti @ 2008-06-08 15:08 UTC (permalink / raw) To: Avi Kivity; +Cc: Chris Wright, kvm On Sun, Jun 08, 2008 at 10:17:15AM +0300, Avi Kivity wrote: >> Index: kvm/arch/x86/kvm/i8254.c >> =================================================================== >> --- kvm.orig/arch/x86/kvm/i8254.c >> +++ kvm/arch/x86/kvm/i8254.c >> @@ -200,9 +200,12 @@ static int __pit_timer_fn(struct kvm_kpi >> atomic_inc(&pt->pending); >> smp_mb__after_atomic_inc(); >> - if (vcpu0 && waitqueue_active(&vcpu0->wq)) { >> - vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; >> - wake_up_interruptible(&vcpu0->wq); >> + if (vcpu0) { >> + set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests); >> + if (waitqueue_active(&vcpu0->wq)) { >> + vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; >> + wake_up_interruptible(&vcpu0->wq); >> + } >> } >> > > We probably ought to wakeup only if pt->pending was zero, no? Yep, same for LAPIC. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 2/2] KVM: close timer injection race window in __vcpu_run 2008-06-08 15:08 ` Marcelo Tosatti @ 2008-06-09 3:29 ` Marcelo Tosatti 0 siblings, 0 replies; 7+ messages in thread From: Marcelo Tosatti @ 2008-06-09 3:29 UTC (permalink / raw) To: Avi Kivity; +Cc: Chris Wright, kvm On Sun, Jun 08, 2008 at 12:08:34PM -0300, Marcelo Tosatti wrote: > On Sun, Jun 08, 2008 at 10:17:15AM +0300, Avi Kivity wrote: > > > > We probably ought to wakeup only if pt->pending was zero, no? > > Yep, same for LAPIC. Using atomic_inc_and_test() for it also introduces an SMP barrier to the LAPIC version (thought it was unecessary because of timer migration, but guest can be scheduled to a different pCPU between exit and kvm_vcpu_block(), so there is the possibility for a race). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c index 9e3391e..c0f7872 100644 --- a/arch/x86/kvm/i8254.c +++ b/arch/x86/kvm/i8254.c @@ -198,14 +198,11 @@ static int __pit_timer_fn(struct kvm_kpit_state *ps) struct kvm_vcpu *vcpu0 = ps->pit->kvm->vcpus[0]; struct kvm_kpit_timer *pt = &ps->pit_timer; - atomic_inc(&pt->pending); - smp_mb__after_atomic_inc(); - if (vcpu0) { + if (!atomic_inc_and_test(&pt->pending)) set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests); - if (waitqueue_active(&vcpu0->wq)) { - vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; - wake_up_interruptible(&vcpu0->wq); - } + if (vcpu0 && waitqueue_active(&vcpu0->wq)) { + vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE; + wake_up_interruptible(&vcpu0->wq); } pt->timer.expires = ktime_add_ns(pt->timer.expires, pt->period); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 180ba73..73f43de 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -945,8 +945,8 @@ static int __apic_timer_fn(struct kvm_lapic *apic) int result = 0; wait_queue_head_t *q = &apic->vcpu->wq; - atomic_inc(&apic->timer.pending); - set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); + if(!atomic_inc_and_test(&apic->timer.pending)) + set_bit(KVM_REQ_PENDING_TIMER, &apic->vcpu->requests); if (waitqueue_active(q)) { apic->vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; wake_up_interruptible(q); ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-06-09 3:29 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-06-06 19:37 [patch 0/2] timer injection races Marcelo Tosatti 2008-06-06 19:37 ` [patch 1/2] KVM: consolidate check for pending vcpu requests Marcelo Tosatti 2008-06-08 7:10 ` Avi Kivity 2008-06-06 19:37 ` [patch 2/2] KVM: close timer injection race window in __vcpu_run Marcelo Tosatti 2008-06-08 7:17 ` Avi Kivity 2008-06-08 15:08 ` Marcelo Tosatti 2008-06-09 3:29 ` Marcelo Tosatti
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox