From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [PATCH v2 1/3] Introduce a workqueue to deliver PIT timer interrupts.
Date: Tue, 15 Jun 2010 14:53:43 -0300
Message-ID: <20100615175343.GA5732@amt.cnet>
References: <1276535482-11965-1-git-send-email-clalance@redhat.com>
 <1276535482-11965-2-git-send-email-clalance@redhat.com>
 <20100614221949.GB8658@amt.cnet>
 <20100615121144.GA2659@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Chris Lalancette
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:42179 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752409Ab0FORzv
 (ORCPT ); Tue, 15 Jun 2010 13:55:51 -0400
Received: from int-mx02.intmail.prod.int.phx2.redhat.com
 (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com
 (8.13.8/8.13.8) with ESMTP id o5FHtoq4013875 (version=TLSv1/SSLv3
 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
 for ; Tue, 15 Jun 2010 13:55:50 -0400
Content-Disposition: inline
In-Reply-To: <20100615121144.GA2659@localhost.localdomain>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, Jun 15, 2010 at 08:11:45AM -0400, Chris Lalancette wrote:
> On 06/14/10 - 07:19:49PM, Marcelo Tosatti wrote:
> > On Mon, Jun 14, 2010 at 01:11:20PM -0400, Chris Lalancette wrote:
> > > We really want to "kvm_set_irq" during the hrtimer callback,
> > > but that is risky because that is during interrupt context.
> > > Instead, offload the work to a workqueue, which is a bit safer
> > > and should provide most of the same functionality.
> > >
> > > Signed-off-by: Chris Lalancette
> > > ---
> > >  arch/x86/kvm/i8254.c |  125 ++++++++++++++++++++++++++++---------------------
> > >  arch/x86/kvm/i8254.h |    4 +-
> > >  arch/x86/kvm/irq.c   |    1 -
> > >  3 files changed, 74 insertions(+), 56 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> > > index 188d827..3bed8ac 100644
> > > --- a/arch/x86/kvm/i8254.c
> > > +++ b/arch/x86/kvm/i8254.c
> > > @@ -34,6 +34,7 @@
> > >
> > >  #include <linux/kvm_host.h>
> > >  #include <linux/slab.h>
> > > +#include <linux/workqueue.h>
> > >
> > >  #include "irq.h"
> > >  #include "i8254.h"
> > > @@ -244,11 +245,11 @@ static void kvm_pit_ack_irq(struct kvm_irq_ack_notifier *kian)
> > >  {
> > >  	struct kvm_kpit_state *ps = container_of(kian, struct kvm_kpit_state,
> > >  						 irq_ack_notifier);
> > > -	raw_spin_lock(&ps->inject_lock);
> > > +	spin_lock(&ps->inject_lock);
> > >  	if (atomic_dec_return(&ps->pit_timer.pending) < 0)
> > >  		atomic_inc(&ps->pit_timer.pending);
> > >  	ps->irq_ack = 1;
> > > -	raw_spin_unlock(&ps->inject_lock);
> > > +	spin_unlock(&ps->inject_lock);
> > >  }
> > >
> > >  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu)
> > > @@ -267,7 +268,8 @@ void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu)
> > >  static void destroy_pit_timer(struct kvm_timer *pt)
> > >  {
> > >  	pr_debug("execute del timer!\n");
> > > -	hrtimer_cancel(&pt->timer);
> > > +	if (hrtimer_cancel(&pt->timer))
> > > +		cancel_work_sync(&pt->kvm->arch.vpit->expired);
> > >  }
> > >
> > >  static bool kpit_is_periodic(struct kvm_timer *ktimer)
> > > @@ -281,6 +283,58 @@ static struct kvm_timer_ops kpit_ops = {
> > >  	.is_periodic = kpit_is_periodic,
> > >  };
> > >
> > > +static void pit_do_work(struct work_struct *work)
> > > +{
> > > +	struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
> > > +	struct kvm *kvm = pit->kvm;
> > > +	struct kvm_vcpu *vcpu;
> > > +	int i;
> > > +	struct kvm_kpit_state *ps = &pit->pit_state;
> > > +	int inject = 0;
> > > +
> > > +	/* Try to inject pending interrupts when
> > > +	 * last one has been acked.
> > > +	 */
> > > +	spin_lock(&ps->inject_lock);
> > > +	if (ps->irq_ack) {
> > > +		ps->irq_ack = 0;
> > > +		inject = 1;
> > > +	}
> > > +	spin_unlock(&ps->inject_lock);
> > > +	if (inject) {
> > > +		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
> > > +		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
> > > +
> > > +		/*
> > > +		 * Provides NMI watchdog support via Virtual Wire mode.
> > > +		 * The route is: PIT -> PIC -> LVT0 in NMI mode.
> > > +		 *
> > > +		 * Note: Our Virtual Wire implementation is simplified, only
> > > +		 * propagating PIT interrupts to all VCPUs when they have set
> > > +		 * LVT0 to NMI delivery. Other PIC interrupts are just sent to
> > > +		 * VCPU0, and only if its LVT0 is in EXTINT mode.
> > > +		 */
> > > +		if (kvm->arch.vapics_in_nmi_mode > 0)
> > > +			kvm_for_each_vcpu(i, vcpu, kvm)
> > > +				kvm_apic_nmi_wd_deliver(vcpu);
> > > +	}
> > > +}
> > > +
> > > +static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
> > > +{
> > > +	struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
> > > +	struct kvm_pit *pt = ktimer->kvm->arch.vpit;
> > > +
> > > +	if (ktimer->reinject)
> > > +		queue_work(pt->wq, &pt->expired);
> >
> > The "problem" is queue_work only queues the work if it was not already
> > queued. So multiple queue_work() calls can collapse into one executed
> > job.
> >
> > You need to maintain a counter here in pit_timer_fn, and reinject at
> > some point (perhaps on ACK) if there are multiple interrupts pending.
>
> Ah, OK, so that's what the "pending" variable was all about. I didn't quite
> understand that before. I'll make this change.
>
> Is there any way in particular I can test this change? Just fire up a RHEL-3
> guest and see if time drifts? Something more targeted?

Firing up a RHEL-3 guest should be enough (one without lost interrupt
compensation).
> > > > > + > > > + if (ktimer->t_ops->is_periodic(ktimer)) { > > > + hrtimer_add_expires_ns(&ktimer->timer, ktimer->period); > > > + return HRTIMER_RESTART; > > > + } else > > > + return HRTIMER_NORESTART; > > > +} > > > + > > > static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period) > > > { > > > struct kvm_timer *pt = &ps->pit_timer; > > > @@ -291,14 +345,14 @@ static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period) > > > pr_debug("create pit timer, interval is %llu nsec\n", interval); > > > > > > /* TODO The new value only affected after the retriggered */ > > > - hrtimer_cancel(&pt->timer); > > > + if (hrtimer_cancel(&pt->timer)) > > > + cancel_work_sync(&pt->kvm->arch.vpit->expired); > > > > There can be a queued work instance even if the hrtimer is not active, > > so cancel_work_sync should be unconditional. > > Yeah, that's what I had initially. However, I ran into a problem with this; > if you call "cancel_work_sync" when there is *no* work queued up, then you get > a stack trace like: > > BUG: unable to handle kernel paging request at 0000000000002368 > IP: [] pit_load_count+0x95/0x190 [kvm] > > Call Trace: > [] kvm_pit_reset+0x62/0xa0 [kvm] > [] kvm_create_pit+0x162/0x290 [kvm] > [] kvm_arch_vm_ioctl+0x5d3/0xd80 [kvm] > > Since I have to use "pending" for the interrupt reinjection logic, I can > work around both of these issues by keying off of that for whether to do > a cancel_work_sync. Unless you have a better idea for how to work around > this issue? It should be fine to call cancel_work_sync on a workqueue with no work pending. It seems the error is somewhere else.