From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Tosatti Subject: Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs Date: Tue, 29 Dec 2009 14:59:37 -0200 Message-ID: <20091229165937.GA12062@amt.cnet> References: <1262002110-4240-1-git-send-email-avi@redhat.com> <20091228203757.GA20508@amt.cnet> <4B391958.8090602@redhat.com> <20091228213044.GC21422@amt.cnet> <4B39DB65.5090206@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Gleb Natapov , kvm@vger.kernel.org To: Avi Kivity Return-path: Received: from mx1.redhat.com ([209.132.183.28]:58908 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752156AbZL2WKS (ORCPT ); Tue, 29 Dec 2009 17:10:18 -0500 Received: from int-mx08.intmail.prod.int.phx2.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id nBTMAH8n029828 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 29 Dec 2009 17:10:17 -0500 Content-Disposition: inline In-Reply-To: <4B39DB65.5090206@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On Tue, Dec 29, 2009 at 12:35:17PM +0200, Avi Kivity wrote: > On 12/28/2009 11:30 PM, Marcelo Tosatti wrote: >> On Mon, Dec 28, 2009 at 10:47:20PM +0200, Avi Kivity wrote: >> >>> On 12/28/2009 10:37 PM, Marcelo Tosatti wrote: >>> >>>> On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote: >>>> >>>> >>>>> When the guest acknowledges an interrupt, it sends an EOI message to the local >>>>> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take >>>>> the ioapic mutex. >>>>> >>>>> On large guests, this causes a lot of contention on this mutex. Since large >>>>> guests usually don't route interrupts via the ioapic (they use msi instead), >>>>> this is completely unnecessary. >>>>> >>>>> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking >>>>> the mutex, check if the ioapic was actually responsible for the acked vector. >>>>> If not, we can return early. >>>>> >>>>> >>>> Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC >>>> code), instead? >>>> >>>> >>> That's a lot cleaner, yes. Indeed there's the TMR which holds this >>> info. Gleb suggested doing this in the local apic but we didn't think >>> of using the TMR. >>> >> Problem with storing in the LAPIC is you have to migrate the bitmap >> along (otherwise can't know if EOI is from MSI or IOAPIC). But it sounds >> much simpler. >> > > If we move the vectors_handled bitmap to the local apic, I don't see how > it simplified things. Its vcpu-local. >>> There's a small race there - the TMR is set after the IRR, so the >>> interrupt can be injected and acked before the TMR is updated, but that >>> can be fixed by switching the order. >>> >> Makes sense. >> > > Btw, that race is already exposed to the guest, if it cares to read TMR. > I'll send a patch. > >> >>> But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()? >>> >> Oops. >> >> The worrying thing about the handled_vectors bitmap in the IOAPIC is >> that the update is not atomic wrt to lapic EOI handler. >> >> Unless its certain that races there are the guests problem, which should >> have proper locking to never allow things like >> >> kvm_set_ioapic vec >> update handled bitmap, vec not IOAPIC >> handled anymore >> ack lapic irq vec >> >> to happen. >> >> (with bitmap in LAPIC you avoid those things). >> >> > > It seems real hardware will have the same issue (also look at comments > regarding irq migration in arch/x86/kernel/io_apic.c). So I think a > guest is required to ack before migrating an irq. Fair. Applied, thanks.