From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37013) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XOqFl-00025s-E4 for qemu-devel@nongnu.org; Tue, 02 Sep 2014 11:44:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XOqFf-0007yE-7J for qemu-devel@nongnu.org; Tue, 02 Sep 2014 11:44:17 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45055) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XOqFf-0007y8-0I for qemu-devel@nongnu.org; Tue, 02 Sep 2014 11:44:11 -0400 Date: Tue, 2 Sep 2014 18:44:06 +0300 From: "Michael S. Tsirkin" Message-ID: <20140902154406.GA23374@redhat.com> References: <201408231836387399956@sangfor.com> <53FAA874.70703@redhat.com> <201408251517235889695@sangfor.com> <53FAE5EB.8080809@redhat.com> <201408261728240882530@sangfor.com> <53FD681D.7070602@redhat.com> <201408271731497879013@sangfor.com> <201408282055150251894@sangfor.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201408282055150251894@sangfor.com> Subject: Re: [Qemu-devel] [question] e1000 interrupt storm happenedbecauseofits correspondingioapic->irr bit always set List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Zhang Haoyu Cc: Jason Wang , qemu-devel , kvm On Thu, Aug 28, 2014 at 08:55:18PM +0800, Zhang Haoyu wrote: > Hi Jason, > I tested below patch, it's okay, the e1000 interrupt storm disappeared. > But I am going to make a bit change on it, could you help review it? > > >Currently, we call ioapic_service() immediately when we find the irq is still > >active during eoi broadcast. But for real hardware, there's some dealy between > >the EOI writing and irq delivery (system bus latency?). So we need to emulate > >this behavior. Otherwise, for a guest who haven't register a proper irq handler > >, it would stay in the interrupt routine as this irq would be re-injected > >immediately after guest enables interrupt. This would lead guest can't move > >forward and may miss the possibility to get proper irq handler registered (one > >example is windows guest resuming from hibernation). > > > >As there's no way to differ the unhandled irq from new raised ones, this patch > >solve this problems by scheduling a delayed work when the count of irq injected > >during eoi broadcast exceeds a threshold value. After this patch, the guest can > >move a little forward when there's no suitable irq handler in case it may > >register one very soon and for guest who has a bad irq detection routine ( such > >as note_interrupt() in linux ), this bad irq would be recognized soon as in the > >past. > > > >Signed-off-by: Jason Wang redhat.com> > >--- > > virt/kvm/ioapic.c | 47 +++++++++++++++++++++++++++++++++++++++++++++-- > > virt/kvm/ioapic.h | 2 ++ > > 2 files changed, 47 insertions(+), 2 deletions(-) > > > >diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c > >index dcaf272..892253e 100644 > >--- a/virt/kvm/ioapic.c > >+++ b/virt/kvm/ioapic.c > > -221,6 +221,24 int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level) > > return ret; > > } > > > >+static void kvm_ioapic_eoi_inject_work(struct work_struct *work) > >+{ > >+ int i, ret; > >+ struct kvm_ioapic *ioapic = container_of(work, struct kvm_ioapic, > >+ eoi_inject.work); > >+ spin_lock(&ioapic->lock); > >+ for (i = 0; i < IOAPIC_NUM_PINS; i++) { > >+ union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i]; > >+ > >+ if (ent->fields.trig_mode != IOAPIC_LEVEL_TRIG) > >+ continue; > >+ > >+ if (ioapic->irr & (1 << i) && !ent->fields.remote_irr) > >+ ret = ioapic_service(ioapic, i); > >+ } > >+ spin_unlock(&ioapic->lock); > >+} > >+ > > static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector, > > int trigger_mode) > > { > > -249,8 +267,29 static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector, > > > > ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG); > > ent->fields.remote_irr = 0; > >- if (!ent->fields.mask && (ioapic->irr & (1 << i))) > >- ioapic_service(ioapic, i); > >+ if (!ent->fields.mask && (ioapic->irr & (1 << i))) { > >+ ++ioapic->irq_eoi; > -+ ++ioapic->irq_eoi; > ++ ++ioapic->irq_eoi[i]; > >+ if (ioapic->irq_eoi == 100) { > -+ if (ioapic->irq_eoi == 100) { > ++ if (ioapic->irq_eoi[i] == 100) { > >+ /* > >+ * Real hardware does not deliver the irq so > >+ * immediately during eoi broadcast, so we need > >+ * to emulate this behavior. Otherwise, for > >+ * guests who has not registered handler of a > >+ * level irq, this irq would be injected > >+ * immediately after guest enables interrupt > >+ * (which happens usually at the end of the > >+ * common interrupt routine). This would lead > >+ * guest can't move forward and may miss the > >+ * possibility to get proper irq handler > >+ * registered. So we need to give some breath to > >+ * guest. TODO: 1 is too long? > >+ */ > >+ schedule_delayed_work(&ioapic->eoi_inject, 1); > >+ ioapic->irq_eoi = 0; > -+ ioapic->irq_eoi = 0; > ++ ioapic->irq_eoi[i] = 0; > >+ } else { > >+ ioapic_service(ioapic, i); > >+ } > >+ } > ++ else { > ++ ioapic->irq_eoi[i] = 0; > ++ } > > } > > } > I think ioapic->irq_eoi is prone to reach to 100, because during a eoi broadcast, > it's possible that another interrupt's (not current eoi's corresponding interrupt) irr is set, so the ioapic->irq_eoi will grow continually, > and not too long, ioapic->irq_eoi will reach to 100. > I want to add "u32 irq_eoi[IOAPIC_NUM_PINS];" instead of "u32 irq_eoi;". > Any ideas? > > Zhang Haoyu I'm a bit concerned how this will affect realtime guests. Worth adding a flag to enable this, so that e.g. virtio is not affected? > > > > -375,12 +414,14 void kvm_ioapic_reset(struct kvm_ioapic *ioapic) > > { > > int i; > > > >+ cancel_delayed_work_sync(&ioapic->eoi_inject); > > for (i = 0; i < IOAPIC_NUM_PINS; i++) > > ioapic->redirtbl[i].fields.mask = 1; > > ioapic->base_address = IOAPIC_DEFAULT_BASE_ADDRESS; > > ioapic->ioregsel = 0; > > ioapic->irr = 0; > > ioapic->id = 0; > >+ ioapic->irq_eoi = 0; > -+ ioapic->irq_eoi = 0; > ++ memset(ioapic->irq_eoi, 0x00, IOAPIC_NUM_PINS); > > update_handled_vectors(ioapic); > > } > > > > -398,6 +439,7 int kvm_ioapic_init(struct kvm *kvm) > > if (!ioapic) > > return -ENOMEM; > > spin_lock_init(&ioapic->lock); > >+ INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work); > > kvm->arch.vioapic = ioapic; > > kvm_ioapic_reset(ioapic); > > kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops); > > -418,6 +460,7 void kvm_ioapic_destroy(struct kvm *kvm) > > { > > struct kvm_ioapic *ioapic = kvm->arch.vioapic; > > > >+ cancel_delayed_work_sync(&ioapic->eoi_inject); > > if (ioapic) { > > kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &ioapic->dev); > > kvm->arch.vioapic = NULL; > >diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h > >index 0b190c3..8938e66 100644 > >--- a/virt/kvm/ioapic.h > >+++ b/virt/kvm/ioapic.h > > -47,6 +47,8 struct kvm_ioapic { > > void (*ack_notifier)(void *opaque, int irq); > > spinlock_t lock; > > DECLARE_BITMAP(handled_vectors, 256); > >+ struct delayed_work eoi_inject; > >+ u32 irq_eoi; > -+ u32 irq_eoi; > ++ u32 irq_eoi[IOAPIC_NUM_PINS]; > > }; > > > > #ifdef DEBUG >