From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip
Date: Thu, 03 Feb 2011 10:31:12 +0100
Message-ID: <4D4A75E0.60204@siemens.com>
References: <4D496D77.2010405@siemens.com> <4D496FA6.8070301@siemens.com>
 <4D49738D.7080404@redhat.com> <4D4979BD.6080900@siemens.com>
 <20110202154611.GR14984@redhat.com> <4D497DAB.7010901@siemens.com>
 <20110202162948.GS14984@redhat.com> <4D498825.8090404@siemens.com>
 <20110202163922.GT14984@redhat.com> <4D498B94.8080001@siemens.com>
 <20110203074240.GU14984@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity, kvm, qemu-devel
To: Gleb Natapov
Return-path:
Received: from david.siemens.de ([192.35.17.14]:33103 "EHLO david.siemens.de"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756002Ab1BCJbb
 (ORCPT ); Thu, 3 Feb 2011 04:31:31 -0500
In-Reply-To: <20110203074240.GU14984@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 2011-02-03 08:42, Gleb Natapov wrote:
> On Wed, Feb 02, 2011 at 05:51:32PM +0100, Jan Kiszka wrote:
>>>>>>>> Just did so, and I can no longer reproduce the problem. Hmm...
>>>>>>>>
>>>>>>> If there is no problem in the logic of this commit (and I do not see
>>>>>>> one yet) then we somewhere miss kicking vcpu when interrupt, that should be
>>>>>>> handled, arrives?
>>>>>>
>>>>>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>>>>>> serializing. If the guest raises the tpr and then signals this with a
>>>>>> succeeding, non vm-exiting instruction to the other vcpus, one of those
>>>>>> could inject an interrupt with a higher priority than the previous tpr,
>>>>>> but a lower one than current tpr. QEMU user space would accept this
>>>>>> interrupt - and would likely surprise the guest. Do I miss something?
>>>>>>
>>>>> Injection happens by vcpu thread on cpu entry:
>>>>> run->request_interrupt_window = kvm_arch_try_push_interrupts(env);
>>>>> and tpr is synced on vcpu exit, so I do not yet see how what you describe
>>>>> above may happen since during injection vcpu should see correct tpr.
>>>>
>>>> Hmm, maybe this is the key: Once we call into apic_get_interrupt
>>>> (because CPU_INTERRUPT_HARD was set as described above) and we find a
>>>> pending irq below the tpr, we inject a spurious vector instead.
>>>>
>>> That should be easy to verify. I expect Windows to BSOD upon receiving
>>> spurious vector though.
>>
>> I hacked spurious irq injection away, but the issue remains. At the same
>> time, Windows is receiving tons of spurious interrupts without any
>> complaints, even without that tpr optimization in the kernel. So this is
>> obviously not yet the key.
>>
>> Let's try your idea that we miss a wakeup.
>>
> That is unlikely too. If vcpu missed wakeup, "info cpus" would solve the
> hang since it would kick vcpu out of the kernel and missed interrupt would be
> injected on re-entry.

Yeah, and it wouldn't explain the various BSODs I'm seeing (you get an
even broader spectrum when trying the Windows installation DVDs). We are
probably digging at the wrong site.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
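
For readers following the thread, the tpr check under discussion can be
sketched roughly as below. This is a simplified illustration only, not
QEMU's apic_get_interrupt: the names pick_vector, prio_class and
SPURIOUS_VECTOR are hypothetical, the spurious vector value is an
assumption (a real guest programs it through the APIC), and the sketch
compares against the tpr alone, ignoring in-service interrupts. It shows
why a stale tpr in user space would matter: a pending vector is accepted
only if its priority class (upper four bits) exceeds that of the tpr,
otherwise a spurious vector is delivered.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; real guests configure this via the APIC. */
#define SPURIOUS_VECTOR 0xff

/* Priority class of a vector or of the tpr: its upper four bits. */
static unsigned prio_class(uint8_t value)
{
    return value >> 4;
}

/*
 * Decide what to deliver, given the highest pending vector (or -1 if
 * nothing is pending) and the tpr value the injecting side believes in.
 * Simplification: only the tpr is considered, not in-service interrupts.
 */
static int pick_vector(int pending, uint8_t tpr)
{
    if (pending < 0) {
        return -1;                  /* nothing to inject */
    }
    if (prio_class((uint8_t)pending) <= prio_class(tpr)) {
        return SPURIOUS_VECTOR;     /* masked by the tpr */
    }
    return pending;                 /* priority above the tpr: deliver */
}
```

With an old tpr of 0x30 and a current tpr of 0x60, a pending vector 0x51
is accepted when judged against the stale value (pick_vector(0x51, 0x30)
yields 0x51) but turns into a spurious vector against the current one
(pick_vector(0x51, 0x60) yields SPURIOUS_VECTOR). That gap is the window
the concern about the kernel patch refers to.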