From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=42471 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Pkfab-0003b0-Cp
	for qemu-devel@nongnu.org; Wed, 02 Feb 2011 11:29:54 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1PkfaZ-0007ND-Ig
	for qemu-devel@nongnu.org; Wed, 02 Feb 2011 11:29:53 -0500
Received: from mx1.redhat.com ([209.132.183.28]:3907)
	by eggs.gnu.org with esmtp (Exim 4.71) id 1PkfaZ-0007N0-9y
	for qemu-devel@nongnu.org; Wed, 02 Feb 2011 11:29:51 -0500
Date: Wed, 2 Feb 2011 18:29:48 +0200
From: Gleb Natapov
Subject: Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip
Message-ID: <20110202162948.GS14984@redhat.com>
References: <4D4952FA.8020300@siemens.com> <4D49569F.6060207@redhat.com>
	<4D496A8D.90000@siemens.com> <4D496BC5.10807@redhat.com>
	<4D496D77.2010405@siemens.com> <4D496FA6.8070301@siemens.com>
	<4D49738D.7080404@redhat.com> <4D4979BD.6080900@siemens.com>
	<20110202154611.GR14984@redhat.com> <4D497DAB.7010901@siemens.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4D497DAB.7010901@siemens.com>
List-Id: qemu-devel.nongnu.org
To: Jan Kiszka
Cc: Avi Kivity, kvm, qemu-devel

On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
> On 2011-02-02 16:46, Gleb Natapov wrote:
> > On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
> >> On 2011-02-02 16:09, Avi Kivity wrote:
> >>> On 02/02/2011 04:52 PM, Jan Kiszka wrote:
> >>>> On 2011-02-02 15:43, Jan Kiszka wrote:
> >>>>> On 2011-02-02 15:35, Avi Kivity wrote:
> >>>>>> On 02/02/2011 04:30 PM, Jan Kiszka wrote:
> >>>>>>> On 2011-02-02 14:05, Avi Kivity wrote:
> >>>>>>>> On 02/02/2011 02:50 PM, Jan Kiszka wrote:
> >>>>>>>>>>>
> >>>>>>>>>> Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.
> >>>>>>>>>
> >>>>>>>>> Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
> >>>>>>>>> Screen (Stop 0x000000b8).
> >>>>>>>>
> >>>>>>>> Userspace APIC is broken since it may run with an outdated cr8, does
> >>>>>>>> reverting 27a4f7976d5 help?
> >>>>>>>
> >>>>>>> Can you elaborate on what is broken? The way hw/apic.c maintains the
> >>>>>>> tpr? Would it make sense to compare this against the in-kernel model? Or
> >>>>>>> do you mean something else?
> >>>>>>
> >>>>>> The problem, IIRC, was that we look up the TPR but it may already have
> >>>>>> been changed by the running vcpu. Not 100% sure.
> >>>>>>
> >>>>>> If that is indeed the problem then the fix would be to process the APIC
> >>>>>> in vcpu context (which is what the kernel does - we set a bit in the IRR
> >>>>>> and all further processing is synchronous).
> >>>>>
> >>>>> You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
> >>>>> then we return from the kernel and overwrite the tpr in the apic with
> >>>>> the vcpu's view, right?
> >>>>
> >>>> Hmm, probably rather that there is a discrepancy between tpr and irr.
> >>>> The latter is changed asynchronously /wrt to the vcpu, the former /wrt
> >>>> the user space device model.
> >>>
> >>> And yet, both are synchronized via qemu_mutex. So we're still missing
> >>> something in this picture.
> >>>
> >>>> Run apic_set_irq on the vcpu?
> >>>
> >>> static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
> >>> {
> >>>     apic_irq_delivered += !get_bit(s->irr, vector_num);
> >>>
> >>>     trace_apic_set_irq(apic_irq_delivered);
> >>>
> >>>     set_bit(s->irr, vector_num);
> >>>
> >>> This is even more async with kernel irqchip
> >>>
> >>>     if (trigger_mode)
> >>>         set_bit(s->tmr, vector_num);
> >>>     else
> >>>         reset_bit(s->tmr, vector_num);
> >>>
> >>> This is protected by qemu_mutex
> >>>
> >>>     apic_update_irq(s);
> >>>
> >>> This will be run the next time the vcpu exits, via apic_get_interrupt().
> >>
> >> The decision to pend an IRQ (and potentially kick the vcpu) takes place
> >> immediately in apic_update_irq. And it is based on current irr as well
> >> as tpr. But we update again when user space returns with a new value.
> >>
> >>>
> >>> }
> >>>
> >>> Did you check whether reverting that commit helps?
> >>>
> >>
> >> Just did so, and I can no longer reproduce the problem. Hmm...
> >>
> > If there is no problem in the logic of this commit (and I do not see
> > one yet) then we somewhere miss kicking vcpu when interrupt, that should be
> > handled, arrives?
>
> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
> serializing. If the guest raises the tpr and then signals this with a
> succeeding, non vm-exiting instruction to the other vcpus, one of those
> could inject an interrupt with a higher priority than the previous tpr,
> but a lower one than current tpr. QEMU user space would accept this
> interrupt - and would likely surprise the guest. Do I miss something?
>
Injection is done by the vcpu thread on cpu entry:
    run->request_interrupt_window = kvm_arch_try_push_interrupts(env);
and the tpr is synced on vcpu exit, so I do not yet see how what you
describe above can happen, since during injection the vcpu should see the
correct tpr.

--
			Gleb.
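
The priority argument in the quoted scenario can be sketched in a few lines
of C. This is an illustration only, not QEMU or KVM code: the names
tpr_from_cr8, prio_class and sketch_irq_acceptable are hypothetical, and the
check deliberately ignores the ISR (a real APIC compares against the PPR,
which also accounts for in-service interrupts). It assumes the architectural
rule that an interrupt is accepted only when the upper nibble of its vector
exceeds the upper nibble of the priority register, and that cr8 maps to TPR
bits 7:4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: cr8 bits 3:0 correspond to TPR bits 7:4. */
static inline uint8_t tpr_from_cr8(uint8_t cr8)
{
    return (uint8_t)((cr8 & 0x0f) << 4);
}

/* Hypothetical helper: the priority class is the upper nibble. */
static inline uint8_t prio_class(uint8_t value)
{
    return value >> 4;
}

/*
 * Simplified acceptance check (assumption: nothing is in service, so the
 * PPR equals the TPR). An interrupt passes only if its priority class is
 * strictly above the class of the TPR the checker currently sees.
 */
static bool sketch_irq_acceptable(uint8_t vector, uint8_t tpr)
{
    return prio_class(vector) > prio_class(tpr);
}
```

With a stale TPR of 0x20, vector 0x51 passes the check; with the raised TPR
of 0x80 it does not. That is the window described in the quoted text, and it
would only matter if the check ran against a TPR that had not yet been
synced from the vcpu.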