From: Paolo Bonzini
Subject: Re: Infinite IRQ injection loop in QEMU
Date: Wed, 22 Oct 2014 20:01:31 +0200
Message-ID: <5447F0FB.30906@redhat.com>
In-Reply-To: <5447CE55.4020308@redhat.com>
References: <5447CE55.4020308@redhat.com>
To: John Snow, KVM list, Stefan Hajnoczi

On 10/22/2014 05:33 PM, John Snow wrote:
>
> I've been working on improving the AHCI device emulation for QEMU but
> have recently run into an issue where Windows 8 guests -- upon trying to
> resume from hibernation -- manage to trigger an infinite IRQ injection
> loop where it seems that the IRQ never quite properly gets cleared.
>
> I am still working on troubleshooting it further, but I wanted to see if
> anyone had advice or experience with this type of issue.
>
> In a nutshell:
> - Windows 8 boots up inside of QEMU/KVM
> - Windows 8 is suspended to disk either via "shut down" or explicit
> hibernate. QEMU exits.
> - Windows 8 is resumed
> - Windows 8 resets the AHCI device and begins re-initializing it
> - Once the active AHCI port is reset, it issues an interrupt to indicate
> it has a pending message (set of register values) ready for the host to
> synchronize state with the HBA. This interrupt appears to be legacy PCI
> and not MSI.
> - This triggers an infinite injection loop.

This usually means that the interrupt was not properly cleared in the AHCI controller.
Since legacy PCI interrupts are shared, it probably means that the guest
was not expecting the AHCI interrupt and is just not asking the driver
to handle it. Perhaps the BIOS is leaving the driver with INTX enabled,
or something like that?

Paolo

> Here are some characteristic traces from perf record, grabbing
> kvm-related entries with user space traces.
>
> Here's where the interrupt first appears to become stuck, showing when
> it is set: http://pastebin.com/KPevxCw2
>
> It looks like pin #16, vec=177. All activity in the guest and QEMU now
> apparently ceases, and then the perf script shows many, many loops which
> look like the following: http://pastebin.com/qYh9035y
>
> which repeats over-and-over. It does not appear that QEMU is re-setting
> the IRQ, and there are no further calls from the guest into ICH9 or AHCI
> related code to set/unset any device registers.
>
> In talking with Stefan, we think that the irr bit is possibly not
> getting cleared (or getting set again?) after the EOI (see the first
> paste) -- does anyone have experience with debugging this type of issue,
> or have some hints about what may be happening?
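
To make the reasoning in the reply concrete, here is a rough toy model of a shared, level-triggered interrupt line. It is not QEMU or KVM code. Every name in it (Device, SharedLine, run, demo, the "nic" device, the max_injections cap) is hypothetical and exists only for illustration. It assumes that a level-triggered line is delivered again after each EOI for as long as any device on the line keeps asserting it, and that a device is serviced only if the guest has a handler registered for it.

```python
# Toy model of a shared, level-triggered interrupt line.
# Hypothetical names throughout; this is not QEMU or KVM code.

class Device:
    def __init__(self, name):
        self.name = name
        self.asserting = False  # device-side interrupt status bit


class SharedLine:
    """Level-triggered line: high while any attached device asserts it."""

    def __init__(self, devices):
        self.devices = devices

    def level(self):
        return any(d.asserting for d in self.devices)


def clear_status(dev):
    # What a driver's interrupt handler is expected to do: acknowledge
    # the device so that it stops asserting the line.
    dev.asserting = False


def run(line, handlers, max_injections=10):
    """Inject while the line is high and return the injection count.

    handlers maps a device name to a callable that services it.
    Devices without a registered handler are never serviced.
    The cap stands in for what would otherwise be an endless loop.
    """
    injections = 0
    while line.level() and injections < max_injections:
        injections += 1  # interrupt delivered to the guest
        for dev in line.devices:
            handler = handlers.get(dev.name)
            if handler is not None:
                handler(dev)
        # EOI happens here; if the line is still high it is injected again
    return injections


def demo(ahci_handler_registered):
    ahci = Device("ahci")
    nic = Device("nic")  # another device sharing the same line
    ahci.asserting = True
    handlers = {"nic": clear_status}
    if ahci_handler_registered:
        handlers["ahci"] = clear_status
    return run(SharedLine([ahci, nic]), handlers)
```

In this model, demo(True) stops after a single injection because the AHCI handler clears the device status. demo(False) keeps injecting until it hits the artificial cap, since nothing ever services the asserting device. That matches the symptom described above only under the assumptions stated here; the real cause in the guest may differ.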