From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Bonzini
Subject: Re: [question] Why newer QEMU may lose irq when doing migration?
Date: Fri, 19 Dec 2014 10:50:50 +0100
Message-ID: <5493F4FA.5050605@redhat.com>
References: <54915F1B.30107@redhat.com>
Cc: yang.z.zhang@intel.com, Bandan Das , kvm@vger.kernel.org, Wanpeng Li
To: Wincy Van

On 19/12/2014 04:58, Wincy Van wrote:
> 2014-12-17 18:46 GMT+08:00 Paolo Bonzini :
>>
>>
>> On 17/12/2014 04:46, Wincy Van wrote:
>>> Hi, all:
>>>
>>> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
>>> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
>>> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
>>> introduced a bug (see
>>> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
>>>
>>> From the description "Unlike the old qemu-kvm, which really never did
>>> that, with new QEMU it is for some reason
>>> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
>>>
>>> Why could new QEMU do that? I can not find any codes about the "some reason"..
>>> As we know, once a irq is set in kvm's ioapic, the ioapic will send
>>> that irq to lapic, this is an "atomic" operation.
>>
>> It can happen if the IRQ is masked in the IOAPIC, for example.
>> Until commit 0bc830b, KVM could not distinguish two cases:
>>
>> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
>> masked
>>
>> 2) an edge-triggered interrupt that was raised and delivered, but for
>> which userspace left the level to 1.
>
> It seems that QEMU's rtc behavior is case 2. But before this patchset, a rtc
> interrupt may be lost when doing migration, and guest will not acknowledge it,
> then the newer rtc interrupts are ignored forever. I think this is
> none of the cases above, because the interrupt was lost. It must be
> something wrong here.

There is a third case actually.  If the source kernel is an old one
before commit 2c2bf0113697 (KVM: Use eoi to track RTC interrupt delivery
status, 2013-04-11), ioapic->irr can also be set if the RTC interrupt
was coalesced (for example because the PPR was too high to deliver it).

Instead, commit 2c2bf0113697 will not set ioapic->irr in this case.
Yang, was this intentional?

The question, however, is then why my patch set worked (fixed migration)
even without moving

	ioapic->irr |= mask;

above this:

	if (irq == RTC_GSI && line_status &&
		rtc_irq_check_coalesced(ioapic)) {
		ret = 0; /* coalesced */
		goto out;
	}

>> No, QEMU does not save the pending IRQ.  IRQs are stateless in QEMU.
>> The assumption is that after a qemu_set_irq the IRQ will be
>> delivered---possibly on the other side of the migration, but it will be
>> delivered.
>
> I find that in kvm_arch_vcpu_ioctl_get_sregs, KVM will save pending IRQs
> into sregs->interrupt_bitmap and QEMU will save it.
> Isn't it right?

This is a pending interrupt that has been queued by the kernel but not
delivered yet.  It can happen if the interrupt controller is implemented
in userspace.

Paolo
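
For illustration only, the ordering question above (whether
ioapic->irr |= mask comes before or after the coalesced check) can be
shown with a toy model.  This is a sketch, not the kernel code: the
toy_* names, the struct layout, the return values and the
rtc_pending_eoi flag (standing in for rtc_irq_check_coalesced) are all
assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPINS   24
#define TOY_RTC_GSI 8

/* Hypothetical, heavily simplified IOAPIC state (not struct kvm_ioapic). */
struct toy_ioapic {
	uint32_t irr;             /* one pending-request bit per pin */
	uint32_t masked;          /* pins masked in the redirection table */
	bool rtc_pending_eoi;     /* stand-in for rtc_irq_check_coalesced() */
	bool irr_before_coalesce; /* true: set IRR before the coalesced check */
};

/*
 * Raise an edge-triggered interrupt on pin irq.
 * Returns 1 if delivered, 0 if coalesced, -1 if left pending (masked).
 */
static int toy_raise_edge(struct toy_ioapic *s, unsigned int irq)
{
	uint32_t mask;

	assert(irq < TOY_NPINS);
	mask = 1u << irq;

	/* Ordering where IRR is recorded even for a coalesced interrupt. */
	if (s->irr_before_coalesce)
		s->irr |= mask;

	if (irq == TOY_RTC_GSI && s->rtc_pending_eoi)
		return 0; /* coalesced */

	s->irr |= mask;

	/* Case 1: raised while masked, so the bit stays set in IRR. */
	if (s->masked & mask)
		return -1;

	/* Edge-triggered: clear IRR at delivery (the behaviour of 0bc830b). */
	s->irr &= ~mask;
	if (irq == TOY_RTC_GSI)
		s->rtc_pending_eoi = true;
	return 1;
}
```

In this model, a second RTC interrupt raised before the EOI leaves
irr == 0 with irr_before_coalesce == false, and leaves bit 8 set with
irr_before_coalesce == true.  Only in the second case would a nonzero
IRR be visible in state saved at that point.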