From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: Lost interrupts with upstream KVM
Date: Fri, 29 May 2009 19:01:44 +0200
Message-ID: <4A2014F8.3030405@siemens.com>
References: <4A1F9B7C.4020201@siemens.com> <20090529130806.GB28542@redhat.com> <4A1FF6B9.9050502@siemens.com> <20090529162015.GA29579@redhat.com> <4A201177.2090103@siemens.com> <20090529165418.GA917@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm-devel <kvm@vger.kernel.org>, qemu-devel <qemu-devel@nongnu.org>
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from lizzard.sbs.de ([194.138.37.39]:19750 "EHLO lizzard.sbs.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751789AbZE2RCD (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 29 May 2009 13:02:03 -0400
In-Reply-To: <20090529165418.GA917@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Gleb Natapov wrote:
> On Fri, May 29, 2009 at 06:46:47PM +0200, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Fri, May 29, 2009 at 04:52:41PM +0200, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Fri, May 29, 2009 at 10:23:24AM +0200, Jan Kiszka wrote:
>>>>>> Hi Gleb,
>>>>>>
>>>>>> with latest kernel modules, namely beginning with 6bc0a1a235 (Remove
>>>>>> irq_pending bitmap), I'm losing interrupts with upstream's KVM support.
>>>>>> After some bisecting, hair-pulling and a bit of meditation I added a
>>>>>> WARN_ON(kvm_cpu_has_interrupt(vcpu)) to kvm_vcpu_ioctl_interrupt, and it
>>>>>> actually triggered right before the guest got stuck.
>>>>>>
>>>>>> This didn't trigger with qemu-kvm (and -no-kvm-irqchip) yet but, on the
>>>>>> other hand, I currently do not see a potential bug in upstream's
>>>>>> kvm_arch_pre_run. Could you have a look if you can reproduce,
>>>>>> specifically if this isn't a KVM kernel issue in the end?
>>>>>>
>>>>> In kvm_cpu_exec() after calling kvm_arch_pre_run() env->exit_request is
>>>>> tested and function can exit without calling kvm_vcpu_ioctl(KVM_RUN).
>>>>> Can you check if this is what happens in your case?
>>>> This path is executed quite frequently here. No obvious correlation with
>>>> the lost IRQ.
>>>>
>>> If kvm_arch_pre_run() injected interrupt kvm_vcpu_ioctl(KVM_RUN) has to
>>> be executed before injecting another interrupt, so if on the first call
>>> of kvm_cpu_exec() kvm_arch_pre_run() injected interrupt, but
>>> kvm_vcpu_ioctl(KVM_RUN) was not executed because of env->exit_request
>>> and on the next kvm_cpu_exec() other interrupt is injected the previous
>>> one will be lost.
>> ...and kvm_run->ready_for_interrupt_injection is not updated either in
>> that case, right? That makes me wonder if KVM_INTERRUPT shouldn't better
>> return an error in case the queue is full already.
>>
> If kvm_vcpu_ioctl(KVM_RUN) is called, but exit happens before interrupt
> is injected kvm_run->ready_for_interrupt_injection should be updated to
> reflect that fact.

Yes, but in this case it isn't called, IIUC. So that is the problem
upstream KVM faces?

Then again: What do you think is the proper long-term fix? Only
adjusting upstream KVM (required anyway) or also making the kernel
support more robust against this pattern?
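
To make the "more robust" variant concrete, here is a toy user-space
model of a single pending-interrupt slot. It is only a sketch and not
kernel code: every toy_* name is made up for illustration, and -EEXIST
is an arbitrary pick for the error an injection into a full slot could
return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: one pending-interrupt slot per vcpu. */
struct toy_vcpu {
    bool pending;    /* slot holds an interrupt not yet delivered */
    int  vector;     /* vector currently in the slot */
    int  delivered;  /* interrupts the guest actually received */
};

/* Overwriting variant: a second injection silently replaces the first. */
int toy_inject_overwrite(struct toy_vcpu *v, int vector)
{
    v->pending = true;
    v->vector = vector;
    return 0;
}

/* Checked variant: refuse the injection while the slot is occupied. */
int toy_inject_checked(struct toy_vcpu *v, int vector)
{
    if (v->pending)
        return -EEXIST;  /* assumed error code, see above */
    v->pending = true;
    v->vector = vector;
    return 0;
}

/* Stand-in for KVM_RUN: delivers whatever sits in the slot. */
void toy_run(struct toy_vcpu *v)
{
    if (v->pending) {
        v->pending = false;
        v->delivered++;
    }
}

/* Two injections without a run in between, once per variant. */
void toy_demo(void)
{
    struct toy_vcpu a = { false, 0, 0 };
    struct toy_vcpu b = { false, 0, 0 };

    toy_inject_overwrite(&a, 5);
    toy_inject_overwrite(&a, 7);      /* vector 5 is lost here */
    toy_run(&a);
    assert(a.delivered == 1);

    assert(toy_inject_checked(&b, 5) == 0);
    assert(toy_inject_checked(&b, 7) == -EEXIST);
    toy_run(&b);                      /* delivers vector 5 */
    assert(toy_inject_checked(&b, 7) == 0);
    toy_run(&b);                      /* delivers vector 7 */
    assert(b.delivered == 2);
}
```

With the checked variant, user space at least learns that the second
injection did not go through and can retry after the next run, instead
of the first interrupt vanishing without a trace.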

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux