From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1MA601-0005ce-D7
    for qemu-devel@nongnu.org; Fri, 29 May 2009 13:36:09 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1MA5zw-0005b2-EI
    for qemu-devel@nongnu.org; Fri, 29 May 2009 13:36:08 -0400
Received: from [199.232.76.173] (port=37557 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1MA5zv-0005av-UC
    for qemu-devel@nongnu.org; Fri, 29 May 2009 13:36:03 -0400
Received: from mx20.gnu.org ([199.232.41.8]:35465)
    by monty-python.gnu.org with esmtps
    (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from )
    id 1MA5zv-0007dw-F6
    for qemu-devel@nongnu.org; Fri, 29 May 2009 13:36:03 -0400
Received: from lizzard.sbs.de ([194.138.37.39])
    by mx20.gnu.org with esmtp (Exim 4.60) (envelope-from )
    id 1MA5zu-0004C5-OH
    for qemu-devel@nongnu.org; Fri, 29 May 2009 13:36:03 -0400
Message-ID: <4A201CFF.2080401@siemens.com>
Date: Fri, 29 May 2009 19:35:59 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <4A1F9B7C.4020201@siemens.com>
    <20090529130806.GB28542@redhat.com>
    <4A1FF6B9.9050502@siemens.com>
    <20090529162015.GA29579@redhat.com>
    <4A201177.2090103@siemens.com>
    <20090529165418.GA917@redhat.com>
    <4A2014F8.3030405@siemens.com>
    <20090529171924.GB917@redhat.com>
    <4A201ADC.5030206@siemens.com>
    <20090529173149.GD917@redhat.com>
In-Reply-To: <20090529173149.GD917@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: Lost interrupts with upstream KVM
List-Id: qemu-devel.nongnu.org
To: Gleb Natapov
Cc: qemu-devel, kvm-devel

Gleb Natapov wrote:
> On Fri, May 29, 2009 at 07:26:52PM +0200, Jan Kiszka wrote:
>>>>>>>>>> with latest kernel modules, namely beginning with 6bc0a1a235 (Remove
>>>>>>>>>> irq_pending bitmap), I'm losing interrupts with upstream's KVM support.
>>>>>>>>>> After some bisecting, hair-pulling and a bit of meditation I added a
>>>>>>>>>> WARN_ON(kvm_cpu_has_interrupt(vcpu)) to kvm_vcpu_ioctl_interrupt, and it
>>>>>>>>>> actually triggered right before the guest got stuck.
>>>>>>>>>>
>>>>>>>>>> This didn't trigger with qemu-kvm (and -no-kvm-irqchip) yet but, on the
>>>>>>>>>> other hand, I currently do not see a potential bug in upstream's
>>>>>>>>>> kvm_arch_pre_run. Could you have a look if you can reproduce,
>>>>>>>>>> specifically if this isn't a KVM kernel issue in the end?
>>>>>>>>>>
>>>>>>>>> In kvm_cpu_exec() after calling kvm_arch_pre_run() env->exit_request is
>>>>>>>>> tested and the function can exit without calling kvm_vcpu_ioctl(KVM_RUN).
>>>>>>>>> Can you check if this is what happens in your case?
>>>>>>>> This path is executed quite frequently here. No obvious correlation with
>>>>>>>> the lost IRQ.
>>>>>>>>
>>>>>>> If kvm_arch_pre_run() injected an interrupt, kvm_vcpu_ioctl(KVM_RUN) has to
>>>>>>> be executed before injecting another interrupt. So if on the first call
>>>>>>> of kvm_cpu_exec() kvm_arch_pre_run() injected an interrupt, but
>>>>>>> kvm_vcpu_ioctl(KVM_RUN) was not executed because of env->exit_request,
>>>>>>> and on the next kvm_cpu_exec() another interrupt is injected, the previous
>>>>>>> one will be lost.
>>>>>> ...and kvm_run->ready_for_interrupt_injection is not updated either in
>>>>>> that case, right? That makes me wonder if KVM_INTERRUPT shouldn't better
>>>>>> return an error in case the queue is full already.
>>>>>>
>>>>> If kvm_vcpu_ioctl(KVM_RUN) is called, but an exit happens before the
>>>>> interrupt is injected, kvm_run->ready_for_interrupt_injection should be
>>>>> updated to reflect that fact.
>>>> Yes, but in this case it isn't called if IIUC. So that is the problem
>>>> upstream KVM faces?
>>>>
>>> This is my guess. It tries to inject two different interrupts
>>> simultaneously and this is not supported (and not correct).
>>> It can be easily checked if you have a reproducible case.
>>>
>>>> Then again: What do you think is the proper long-term fix? Only
>>>> adjusting upstream KVM (required anyway) or also making the kernel
>>>> support more robust against this pattern?
>>> If my guess is correct, no fix is needed for the KVM module (we can enhance
>>> the API to return an error as you suggested, but this will not fix buggy
>>> userspace). If you are asking what I think is the proper long-term
>>> fix, then my answer is: merging qemu-kvm into qemu, dropping whatever we
>>> have there currently ;)
>> As we won't merge libkvm's structure upstream, we won't see the same
>> code structure in qemu one day that currently works (correctly) in qemu-kvm.
>>
> I hope we will merge it as close as realistically possible. And if the
> result is "not good enough" it will be morphed into "good enough" bit
> by bit using bisectable commits.

For sure. And merging KVM features upstream requires that upstream is
capable of testing the result.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
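
The sequence suspected in the quoted discussion (inject in
kvm_arch_pre_run(), leave kvm_cpu_exec() early because of
env->exit_request, inject again on the next pass) can be sketched as a
small standalone C model. This is only an illustration under stated
assumptions, not QEMU or kernel code: every model_* name, the single
pending slot, and the MODEL_ERR_SLOT_FULL value are hypothetical. The
"checked" variant mirrors the idea of having the injection call return
an error when an interrupt is already queued.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_IRQ              (-1)
#define MAX_DELIVERED       8
#define MODEL_ERR_SLOT_FULL (-1)   /* hypothetical error value */

/* Hypothetical vCPU model with exactly one pending-interrupt slot. */
struct model_vcpu {
    int pending_irq;               /* NO_IRQ when the slot is empty */
    int delivered[MAX_DELIVERED];  /* interrupts the guest actually saw */
    int delivered_count;
};

static void model_vcpu_init(struct model_vcpu *v)
{
    v->pending_irq = NO_IRQ;
    v->delivered_count = 0;
}

/* Unchecked injection: silently overwrites whatever is pending. */
static void model_inject_unchecked(struct model_vcpu *v, int irq)
{
    v->pending_irq = irq;
}

/* Checked injection: refuses to overwrite a pending interrupt. */
static int model_inject_checked(struct model_vcpu *v, int irq)
{
    if (v->pending_irq != NO_IRQ)
        return MODEL_ERR_SLOT_FULL;
    v->pending_irq = irq;
    return 0;
}

/* Stands in for entering the guest: delivers the pending interrupt. */
static void model_run(struct model_vcpu *v)
{
    if (v->pending_irq != NO_IRQ) {
        assert(v->delivered_count < MAX_DELIVERED);
        v->delivered[v->delivered_count++] = v->pending_irq;
        v->pending_irq = NO_IRQ;
    }
}

/*
 * One pass of a userspace exec loop: inject first, then return early
 * without entering the guest if an exit was requested.
 */
static int model_exec(struct model_vcpu *v, int irq,
                      bool exit_request, bool checked)
{
    int ret = 0;

    if (irq != NO_IRQ) {
        if (checked)
            ret = model_inject_checked(v, irq);
        else
            model_inject_unchecked(v, irq);
    }
    if (exit_request)
        return ret;                /* guest is not entered on this pass */
    model_run(v);
    return ret;
}
```

With unchecked injection, a pass that injects IRQ 5 and exits early,
followed by a pass that injects IRQ 9 and runs, delivers only IRQ 9. With
checked injection the second pass reports MODEL_ERR_SLOT_FULL and IRQ 5 is
delivered. As noted in the thread, such an error return would only expose
the problem; userspace would still have to stop injecting while an
interrupt is pending.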