From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: Lost interrupts with upstream KVM
Date: Fri, 29 May 2009 19:26:52 +0200
Message-ID: <4A201ADC.5030206@siemens.com>
References: <4A1F9B7C.4020201@siemens.com> <20090529130806.GB28542@redhat.com> <4A1FF6B9.9050502@siemens.com> <20090529162015.GA29579@redhat.com> <4A201177.2090103@siemens.com> <20090529165418.GA917@redhat.com> <4A2014F8.3030405@siemens.com> <20090529171924.GB917@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm-devel, qemu-devel
To: Gleb Natapov
Return-path:
Received: from gecko.sbs.de ([194.138.37.40]:22942 "EHLO gecko.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752794AbZE2R1K (ORCPT ); Fri, 29 May 2009 13:27:10 -0400
In-Reply-To: <20090529171924.GB917@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Gleb Natapov wrote:
> On Fri, May 29, 2009 at 07:01:44PM +0200, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Fri, May 29, 2009 at 06:46:47PM +0200, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Fri, May 29, 2009 at 04:52:41PM +0200, Jan Kiszka wrote:
>>>>>> Gleb Natapov wrote:
>>>>>>> On Fri, May 29, 2009 at 10:23:24AM +0200, Jan Kiszka wrote:
>>>>>>>> Hi Gleb,
>>>>>>>>
>>>>>>>> with latest kernel modules, namely beginning with 6bc0a1a235 (Remove
>>>>>>>> irq_pending bitmap), I'm losing interrupts with upstream's KVM support.
>>>>>>>> After some bisecting, hair-pulling and a bit of meditation I added a
>>>>>>>> WARN_ON(kvm_cpu_has_interrupt(vcpu)) to kvm_vcpu_ioctl_interrupt, and it
>>>>>>>> actually triggered right before the guest got stuck.
>>>>>>>>
>>>>>>>> This didn't trigger with qemu-kvm (and -no-kvm-irqchip) yet but, on the
>>>>>>>> other hand, I currently do not see a potential bug in upstream's
>>>>>>>> kvm_arch_pre_run. Could you have a look if you can reproduce,
>>>>>>>> specifically if this isn't a KVM kernel issue in the end?
>>>>>>>>
>>>>>>> In kvm_cpu_exec() after calling kvm_arch_pre_run() env->exit_request is
>>>>>>> tested and the function can exit without calling kvm_vcpu_ioctl(KVM_RUN).
>>>>>>> Can you check if this is what happens in your case?
>>>>>> This path is executed quite frequently here. No obvious correlation with
>>>>>> the lost IRQ.
>>>>>>
>>>>> If kvm_arch_pre_run() injected an interrupt, kvm_vcpu_ioctl(KVM_RUN) has to
>>>>> be executed before injecting another interrupt, so if on the first call
>>>>> of kvm_cpu_exec() kvm_arch_pre_run() injected an interrupt, but
>>>>> kvm_vcpu_ioctl(KVM_RUN) was not executed because of env->exit_request
>>>>> and on the next kvm_cpu_exec() another interrupt is injected, the previous
>>>>> one will be lost.
>>>> ...and kvm_run->ready_for_interrupt_injection is not updated either in
>>>> that case, right? That makes me wonder if KVM_INTERRUPT shouldn't better
>>>> return an error in case the queue is full already.
>>>>
>>> If kvm_vcpu_ioctl(KVM_RUN) is called, but exit happens before the interrupt
>>> is injected, kvm_run->ready_for_interrupt_injection should be updated to
>>> reflect that fact.
>> Yes, but in this case it isn't called if IIUC. So that is the problem
>> upstream KVM faces?
>>
> This is my guess. It tries to inject two different interrupts
> simultaneously and this is not supported (and not correct).
> It can be easily checked if you have a reproducible case.
>
>> Then again: What do you think is the proper long-term fix? Only
>> adjusting upstream KVM (required anyway) or also making the kernel
>> support more robust against this pattern?
> If my guess is correct, no fix is needed for the KVM module (we can enhance
> the API to return an error as you suggested, but this will not fix buggy
> userspace).
> If you are asking what I think is the proper long-term
> fix then my answer is: merging qemu-kvm into qemu, dropping whatever we
> have there currently ;)

As we won't merge libkvm's structure upstream, we won't see the same
code structure in qemu one day that currently works (correctly) in
qemu-kvm. However, I will see if I can fix upstream based on this analysis.
Good to know that we do not have another tricky race in the kvm irq
handling part.

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
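
For readers of the archive, the sequence described in the quoted analysis
(inject in the pre-run hook, skip KVM_RUN because of exit_request, inject
again on the next pass) can be sketched as a small stand-alone C model.
This is only a toy under stated assumptions, not QEMU or KVM code: the
struct and every toy_* function are hypothetical names invented for
illustration, and the single "pending" slot is a simplification of the
behaviour discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical toy vCPU: one slot for an injected, not yet delivered IRQ. */
struct toy_vcpu {
    int pending_irq;   /* -1 when the slot is empty */
    int delivered;     /* interrupts the guest actually received */
    int lost;          /* interrupts overwritten before delivery */
};

static void toy_init(struct toy_vcpu *v)
{
    v->pending_irq = -1;
    v->delivered = 0;
    v->lost = 0;
}

/* Stand-in for an unchecked injection: silently overwrites the slot. */
static void toy_inject(struct toy_vcpu *v, int irq)
{
    if (v->pending_irq >= 0)
        v->lost++;             /* previous interrupt is gone */
    v->pending_irq = irq;
}

/* Variant along the lines suggested in the thread: refuse when full. */
static bool toy_inject_checked(struct toy_vcpu *v, int irq)
{
    if (v->pending_irq >= 0)
        return false;          /* caller must retry after a run */
    v->pending_irq = irq;
    return true;
}

/* Stand-in for entering the guest: delivers the pending interrupt. */
static void toy_run(struct toy_vcpu *v)
{
    if (v->pending_irq >= 0) {
        v->delivered++;
        v->pending_irq = -1;
    }
}

/* One pass of an exec loop: inject first, then maybe skip the run. */
static void toy_exec(struct toy_vcpu *v, int irq, bool exit_request)
{
    toy_inject(v, irq);        /* pre-run hook injects */
    if (exit_request)
        return;                /* early exit: guest is never entered */
    toy_run(v);
}

/* Two passes; returns how many interrupts were lost. */
static int toy_scenario(bool first_pass_exit_request)
{
    struct toy_vcpu v;

    toy_init(&v);
    toy_exec(&v, 5, first_pass_exit_request);
    toy_exec(&v, 7, false);
    return v.lost;
}
```

In this model toy_scenario(true) loses the first interrupt, while
toy_scenario(false) loses none. toy_inject_checked() shows why returning
an error would make the problem visible to userspace without fixing it:
the caller still has to run the guest before injecting again.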