From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
Date: Fri, 23 Apr 2010 14:05:49 +0300
Message-ID: <4BD17F0D.5070201@redhat.com>
References: <201004211548.12824.sheng.yang@intel.com>
 <20100421155840.GA22052@amt.cnet> <20100421171227.GB10744@redhat.com>
 <20100421173734.GA27425@amt.cnet> <20100421175848.GB2455@redhat.com>
 <20100421182911.GA28343@amt.cnet> <20100421183839.GC2455@redhat.com>
 <20100422164038.GA1117@amt.cnet> <20100422181130.GD2455@redhat.com>
 <20100422194030.GA4616@amt.cnet> <20100422195514.GE2455@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti , "Yang, Sheng" , kvm , "bonenkamp@gmx.de" , Chris Wright
To: Gleb Natapov
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:13556 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755264Ab0DWLFy
 (ORCPT ); Fri, 23 Apr 2010 07:05:54 -0400
In-Reply-To: <20100422195514.GE2455@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 04/22/2010 10:55 PM, Gleb Natapov wrote:
>
>
>>> What about converting PIC/IOAPIC mutexes into spinlocks?
>>>
>> Works for me, but on large guests the spinning will be noticeable.
>> I believe.
>>
> For interrupts going through IOPIC, but we know this is not scalable
> anyway.
>

Yes.  We also wanted to convert the ioapic/pic to spinlocks so we could
queue the interrupt from the PIT directly instead of using
KVM_REQ_PENDING_TIMER, which keeps confusing me.  Chris Lalancette posted
a patchset for this a while back but it was never completed.

I'm not really happy with adding lots of spin_lock_irqsave()s though,
especially on the ioapic, which may iterate over all vcpus (not worried
about scaling, but about a malicious guest hurting host latency).

An alternative is to make kvm_set_irq() irq safe: if msi, do things
immediately, otherwise post a work item.
So we can call kvm_set_irq() directly from the interrupt.

Alternative alternative (perhaps better for short term): switch
assigned_dev_lock to a mutex (we're already in a work handler, no need
for spinlock).  The race between the irq and removal of an assigned
device is closed by free_irq():

    lock
    mark assigned device as going away
    unlock
    free_irq()
    actually kill it

like invalid mmu pages.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
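
For anyone who wants to poke at the ordering in the removal sequence
above, here is a rough single-threaded userspace model of it.  It is only
a sketch: struct model_dev and every model_*() function are made-up
stand-ins, not KVM or kernel APIs, and the lock is a plain flag rather
than a real mutex.  The model also drains any leftover work item just
before the device is destroyed; that step is my assumption (in a kernel
it would be something like cancel_work_sync()), the sequence above does
not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an assigned device; not the real KVM struct. */
struct model_dev {
    bool lock_held;      /* models assigned_dev_lock used as a mutex */
    bool going_away;     /* set under the lock before teardown */
    bool irq_registered; /* true until model_free_irq() */
    bool work_pending;   /* host irq has queued a work item */
    bool alive;          /* false once the device is destroyed */
    int  injected;       /* interrupts delivered to the guest */
};

static void model_lock(struct model_dev *d)
{
    assert(!d->lock_held);
    d->lock_held = true;
}

static void model_unlock(struct model_dev *d)
{
    assert(d->lock_held);
    d->lock_held = false;
}

/* Host interrupt: only queues work, never touches the lock. */
static void model_host_irq(struct model_dev *d)
{
    if (d->irq_registered)
        d->work_pending = true;
}

/* Work handler: process context, so taking a mutex is fine here. */
static void model_work_handler(struct model_dev *d)
{
    if (!d->work_pending)
        return;
    d->work_pending = false;
    assert(d->alive);           /* must never run on a dead device */
    model_lock(d);
    if (!d->going_away)
        d->injected++;          /* stands in for kvm_set_irq() */
    model_unlock(d);
}

/* Models free_irq(): no host irq is delivered after it returns. */
static void model_free_irq(struct model_dev *d)
{
    d->irq_registered = false;
}

/* lock / mark going away / unlock / free_irq() / actually kill it */
static void model_teardown(struct model_dev *d)
{
    model_lock(d);
    d->going_away = true;
    model_unlock(d);

    model_free_irq(d);

    model_work_handler(d);      /* assumption: drain leftover work */
    d->alive = false;           /* "actually kill it" */
}

/* One irq delivered normally, one racing with removal. */
static int model_demo(void)
{
    struct model_dev d = { .irq_registered = true, .alive = true };

    model_host_irq(&d);
    model_work_handler(&d);     /* delivered to the guest */
    model_host_irq(&d);         /* queued, then removal starts */
    model_teardown(&d);         /* late work sees going_away */
    return d.injected;
}
```

In this model the interrupt that races with removal is dropped rather
than injected, because the work handler checks the going-away mark under
the lock, and nothing can queue new work once model_free_irq() has
returned.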