From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=36968 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1OY2ri-000460-Lg
    for qemu-devel@nongnu.org; Sun, 11 Jul 2010 16:11:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
    (envelope-from ) id 1OY2rg-0002wd-A5
    for qemu-devel@nongnu.org; Sun, 11 Jul 2010 16:11:06 -0400
Received: from mx1.redhat.com ([209.132.183.28]:37010)
    by eggs.gnu.org with esmtp (Exim 4.69)
    (envelope-from ) id 1OY2rg-0002wE-0L
    for qemu-devel@nongnu.org; Sun, 11 Jul 2010 16:11:04 -0400
Date: Sun, 11 Jul 2010 23:05:47 +0300
From: "Michael S. Tsirkin"
Message-ID: <20100711200547.GE12202@redhat.com>
References: <20100711180910.20121.93313.stgit@localhost6.localdomain6>
    <20100711180936.20121.35376.stgit@localhost6.localdomain6>
    <4C3A09F3.8010304@redhat.com>
    <1278872784.20397.18.camel@x201>
    <4C3A0DE3.8010806@redhat.com>
    <20100711185456.GA11048@redhat.com>
    <1278876078.20397.79.camel@x201>
    <20100711192330.GA11491@redhat.com>
    <1278878614.20397.128.camel@x201>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1278878614.20397.128.camel@x201>
Subject: [Qemu-devel] Re: [RFC PATCH 4/5] APIC/IOAPIC EOI callback
List-Id: qemu-devel.nongnu.org
To: Alex Williamson
Cc: chrisw@redhat.com, pugs@cisco.com, Avi Kivity, kvm@vger.kernel.org,
    qemu-devel@nongnu.org

On Sun, Jul 11, 2010 at 02:03:34PM -0600, Alex Williamson wrote:
> On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > > On Sun, 2010-07-11 at 21:54 +0300, Michael S.
> > > Tsirkin wrote:
> > > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > > >>>For device assignment, we need to know when the VM writes an end
> > > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > > >>>line and clear the DisINTx bit. Add a new wrapper for ioapic
> > > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > > >>>for drivers to be notified on eoi.
> > > > > >>>
> > > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > > >>new interface there.
> > > > > >Registering an eventfd for the eoi seems like a reasonable alternative.
> > > > >
> > > > > I'm worried about that racing (with what?)
> > > >
> > > > With device asserting the interrupt?
> > > > Need to make sure that all possible scenarios work well:
> > > >
> > > >     device asserts interrupt
> > > >     driver clears interrupt
> > > >     device asserts interrupt
> > > >     eoi
> > > >
> > > >     device asserts interrupt
> > > >     driver clears interrupt
> > > >     eoi
> > > >     device asserts interrupt
> > > >
> > > >     etc
> > > >
> > > > Not that I see issues, these are things we need to check.
> > >
> > > I think those are all protected by host and qemu vfio drivers managing
> > > DisINTx. The way I understand it to work now is:
> > >
> > >     device asserts interrupt
> > >     interrupt lands in host vfio driver
> > >     host vfio sets DisINTx on the device
> > >     host vfio sends eventfd
> > >     eventfd lands in qemu vfio, does a qemu_set_irq
> > >     ... guest processes
> > >     guest writes eoi to apic, lands back in qemu vfio driver
> > >     qemu vfio deasserts qemu interrupt
> > >     qemu vfio clears DisINTx
> > >
> > > So I don't think there's a race as long as ordering is sane for toggling
> > > DisINTx.
> > > Thanks,
> > >
> > > Alex
> > >
> >
> > What about threaded interrupts? I think (correct me if I am wrong)
> > that they work like this:
> >
> >     device asserts interrupt
> >     guest disables interrupt
> Is this the guest manipulating DisINTx itself? I suppose it could be a
> device dependent disable as well.

It can manipulate it, so we need to virtualize it,
but that's a separate issue.

> >     eoi
> >     guest enables interrupt
> >     driver clears interrupt
>
> These two are hopefully reversed or else the driver is expecting to
> clear and potentially reassert interrupts anyway.

Yes. Sorry.

> >     device asserts interrupt
> >
> > If so, your code will clear DisINTx immediately which
> > will always get us another host interrupt:
> > performance will be hurt. I am also not sure
> > we'll not lose interrupts.
>
> Level interrupts are lossy afaik, if it gets cleared but an interrupt
> condition still exists, it should be reasserted.

Yes but I mean we won't interrupt the guest. So it will stay disabled
forever.

> > It seems we need to track interrupt disable/enable as well, and only
> > clear DisINTx after eoi with interrupts enabled. Not sure what is the
> > interface for this.
>
> If a driver uses device dependent code to disable interrupts, there's no
> issue, we'll clear DisINTx, but the device still won't generate an
> interrupt until the dependent code is re-enabled by the guest (assuming
> there's no cross talk between DisINTx and device dependent components).
>
> For the case that a guest driver disables via DisINTx, it seems easy to
> trap and track that.
> So we get:
>
>     device asserts interrupt
>     guest disables interrupt
>       (trapped, qemu-vfio sets intx.guest_disabled = 1)
>     eoi
>       (qemu-vfio deasserts qemu interrupts, but because of above doesn't clear DisINTx)
>     guest enables interrupt
>       (allowed to pass through, intx.guest_disabled = 0)
>     driver clears interrupt
>     device asserts interrupt
>
> I've already got an intx.pending bit, so I think this just changes the eoi to:
>
>     vdev->intx.pending = 0;
>     qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
>     if (!vdev->intx.guest_disabled) {
>         vfio_unmask_intx(vdev);
>     }
>
> Writing the command register DisINTx bit then just gets some kind of:
>
>     if (cmd & PCI_COMMAND_INTX_DISABLE && intx.pending) {
>         intx.guest_disabled = 1;
>         cmd &= ~PCI_COMMAND_INTX_DISABLE;
>     } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
>         intx.guest_disabled = 0;
>     }
>     ... allow write
>
> That work? Thanks,
>
> Alex

No, I mean guest OS disables the specific interrupt with disable_irq.

--
MST
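
For reference, the DisINTx tracking that Alex proposes in the quoted text can be modelled as a small standalone state machine. This is only a sketch under assumptions, not code from the patch series: `struct intx_model`, `host_interrupt()`, `guest_eoi()` and `guest_command_write()` are hypothetical names, and `host_masked` stands in for the DisINTx bit on the physical device. Only the `pending` and `guest_disabled` flags and the two conditionals come from the thread. It covers only the case where the guest disables the interrupt through DisINTx, not the `disable_irq` case raised in the reply above.

```c
#include <stdbool.h>
#include <stdint.h>

/* DisINTx is bit 10 of the PCI command register. */
#define PCI_COMMAND_INTX_DISABLE 0x400

/* Hypothetical model of the per-device INTx state discussed in the thread. */
struct intx_model {
    bool pending;        /* interrupt injected into the guest, no EOI yet */
    bool guest_disabled; /* guest wrote DisINTx while an interrupt was pending */
    bool host_masked;    /* stand-in for DisINTx being set on the real device */
    bool irq_level;      /* level of the emulated interrupt line */
};

/* Host side: the device asserted INTx, host masks it and notifies qemu. */
static void host_interrupt(struct intx_model *s)
{
    s->host_masked = true;
    s->pending = true;
    s->irq_level = true;
}

/* Guest wrote EOI: deassert the line, unmask only if the guest allows it. */
static void guest_eoi(struct intx_model *s)
{
    s->pending = false;
    s->irq_level = false;
    if (!s->guest_disabled) {
        s->host_masked = false;
    }
}

/* Guest wrote the command register; returns the value that is passed on. */
static uint16_t guest_command_write(struct intx_model *s, uint16_t cmd)
{
    if ((cmd & PCI_COMMAND_INTX_DISABLE) && s->pending) {
        /* Trap the disable and remember it instead of writing it through. */
        s->guest_disabled = true;
        cmd = (uint16_t)(cmd & ~PCI_COMMAND_INTX_DISABLE);
    } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && s->guest_disabled) {
        s->guest_disabled = false;
        /* Assumption: the passed-through write clears DisINTx on the device. */
        s->host_masked = false;
    }
    return cmd;
}
```

Running the threaded-interrupt sequence through this model (`host_interrupt`, a command write with DisINTx set, `guest_eoi`, then a command write with DisINTx clear) leaves `host_masked` set across the EOI and clears it only on the re-enable, which is the behaviour the quoted proposal is aiming for.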