All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org, pugs@cisco.com,
	chrisw@redhat.com
Subject: Re: [RFC PATCH 4/5] APIC/IOAPIC EOI callback
Date: Sun, 11 Jul 2010 23:05:47 +0300	[thread overview]
Message-ID: <20100711200547.GE12202@redhat.com> (raw)
In-Reply-To: <1278878614.20397.128.camel@x201>

On Sun, Jul 11, 2010 at 02:03:34PM -0600, Alex Williamson wrote:
> On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > > On Sun, 2010-07-11 at 21:54 +0300, Michael S. Tsirkin wrote:
> > > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > > >>>For device assignment, we need to know when the VM writes an end
> > > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > > >>>line and clear the DisINTx bit.  Add a new wrapper for ioapic
> > > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > > >>>for drivers to be notified on eoi.
> > > > > >>>
> > > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > > >>new interface there.
> > > > > >Registering an eventfd for the eoi seems like a reasonable alternative.
> > > > > 
> > > > > I'm worried about that racing (with what?)
> > > > 
> > > > With device asserting the interrupt?
> > > > Need to make sure that all possible scenarious work well:
> > > > 
> > > > 	device asserts interrupt
> > > > 	driver clears interrupt
> > > > 	device asserts interrupt
> > > > 	eoi
> > > > 
> > > > 	device asserts interrupt
> > > > 	driver clears interrupt
> > > > 	eoi
> > > > 	device asserts interrupt
> > > > 
> > > > etc
> > > > 
> > > > Not that I see issues, these are things we need to check.
> > > 
> > > I think those are all protected by host and qemu vfio drivers managing
> > > DisINTx.  The way I understand it to work now is:
> > > 
> > > 	device asserts interrupt
> > > 	interrupt lands in host vfio driver
> > > 	host vfio sets DisINTx on the device
> > > 	host vfio sends eventfd
> > > 	eventfd lands in qemu vfio, does a qemu_set_irq
> > >         ... guest processes
> > > 	guest writes eoi to apic, lands back in qemu vfio driver
> > > 	qemu vfio deasserts qemu interrupt
> > > 	qemu vfio clears DisINTx
> > > 
> > > So I don't think there's a race as long as ordering is sane for toggling
> > > DisINTx.  Thanks,
> > > 
> > > Alex
> > > 
> > 
> > What about threaded interrupts? I think (correct me if I am wrong)
> > that they work like this:
> > 
> >  	device asserts interrupt
> > 	guest disables interrupt
> 
> Is this the guest manipulating DisINTx itself?  I suppose it could be a
> device dependent disable as well.

It can manipulate it, so we need to virtualize it, but that's a
separate issue.

> >  	eoi
> > 	guest enables interrupt
> >  	driver clears interrupt
> 
> These two are hopefully reversed or else the driver is expecting to
> clear and potentially reassert interrupts anyway.

Yes. Sorry.

> >  	device asserts interrupt
> > 
> > If so, your code will clear DisINTx immediately which
> > will always get us another host interrupt:
> > performance will be hurt. I am also not sure
> > we'll not lose interrupts.
> 
> Level interrupts are lossy afaik, if it gets cleared but an interrupt
> condition still exists, it should be reasserted.

Yes but I mean we won't interrupt the guest. So it wil lstay disabled
forever.

> > It seems we need to track interrupt disable/enable as well, and only
> > clear DisINTx after eoi with interrupts enabled.  Not sure what is the
> > interface for this.
> 
> If a driver uses device dependent code to disable interrupts,
> there's no
> issue, we'll clear DisINTx, but the device still won't generate an
> interrupt until the dependent code is re-enabled by the guest (assuming
> there's no cross talk between DisINTx and device dependent components).
> 
> For the case that a guest driver disables via DisINTx, it seems easy to
> trap and track that.  So we get:
> 
>         device asserts interrupt
>         guest disables interrupt
>         (trapped, qemu-vfio sets intx.guest_disabled = 1)
>         eoi
>         (qemu-vfio deasserts qemu interrupts, but because of above doesn't clear DisINTx)
>         guest enables interrupt
>         (allowed to pass through, intx.guest_disabled = 0)
>         driver clears interrupt
>         device asserts interrupt
> 
> I've already got an intx.pending bit, so I think this just changes the eoi to:
> 
>     vdev->intx.pending = 0;
>     qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
>     if (!vdev->intx.guest_disabled) {
>         vfio_unmask_intx(vdev);
>     }
> 
> Writing the command register DisINTx bit then just gets some kind of:
> 
>     if (cmd & PCI_COMMAND_INTX_DISABLE && intx.pending) {
>         intx.guest_disabled = 1;
>         cmd &= ~PCI_COMMAND_INTX_DISABLE;
>     } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
>         intx.guest_disabled = 0;
>     }
>     ... allow write
> 
> That work?  Thanks,
> 
> Alex

No, I mean guest OS disables the specific interrupt with
disable_irq.

-- 
MST

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: chrisw@redhat.com, pugs@cisco.com, Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [RFC PATCH 4/5] APIC/IOAPIC EOI callback
Date: Sun, 11 Jul 2010 23:05:47 +0300	[thread overview]
Message-ID: <20100711200547.GE12202@redhat.com> (raw)
In-Reply-To: <1278878614.20397.128.camel@x201>

On Sun, Jul 11, 2010 at 02:03:34PM -0600, Alex Williamson wrote:
> On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > > On Sun, 2010-07-11 at 21:54 +0300, Michael S. Tsirkin wrote:
> > > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > > >>>For device assignment, we need to know when the VM writes an end
> > > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > > >>>line and clear the DisINTx bit.  Add a new wrapper for ioapic
> > > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > > >>>for drivers to be notified on eoi.
> > > > > >>>
> > > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > > >>new interface there.
> > > > > >Registering an eventfd for the eoi seems like a reasonable alternative.
> > > > > 
> > > > > I'm worried about that racing (with what?)
> > > > 
> > > > With device asserting the interrupt?
> > > > Need to make sure that all possible scenarious work well:
> > > > 
> > > > 	device asserts interrupt
> > > > 	driver clears interrupt
> > > > 	device asserts interrupt
> > > > 	eoi
> > > > 
> > > > 	device asserts interrupt
> > > > 	driver clears interrupt
> > > > 	eoi
> > > > 	device asserts interrupt
> > > > 
> > > > etc
> > > > 
> > > > Not that I see issues, these are things we need to check.
> > > 
> > > I think those are all protected by host and qemu vfio drivers managing
> > > DisINTx.  The way I understand it to work now is:
> > > 
> > > 	device asserts interrupt
> > > 	interrupt lands in host vfio driver
> > > 	host vfio sets DisINTx on the device
> > > 	host vfio sends eventfd
> > > 	eventfd lands in qemu vfio, does a qemu_set_irq
> > >         ... guest processes
> > > 	guest writes eoi to apic, lands back in qemu vfio driver
> > > 	qemu vfio deasserts qemu interrupt
> > > 	qemu vfio clears DisINTx
> > > 
> > > So I don't think there's a race as long as ordering is sane for toggling
> > > DisINTx.  Thanks,
> > > 
> > > Alex
> > > 
> > 
> > What about threaded interrupts? I think (correct me if I am wrong)
> > that they work like this:
> > 
> >  	device asserts interrupt
> > 	guest disables interrupt
> 
> Is this the guest manipulating DisINTx itself?  I suppose it could be a
> device dependent disable as well.

It can manipulate it, so we need to virtualize it, but that's a
separate issue.

> >  	eoi
> > 	guest enables interrupt
> >  	driver clears interrupt
> 
> These two are hopefully reversed or else the driver is expecting to
> clear and potentially reassert interrupts anyway.

Yes. Sorry.

> >  	device asserts interrupt
> > 
> > If so, your code will clear DisINTx immediately which
> > will always get us another host interrupt:
> > performance will be hurt. I am also not sure
> > we'll not lose interrupts.
> 
> Level interrupts are lossy afaik, if it gets cleared but an interrupt
> condition still exists, it should be reasserted.

Yes but I mean we won't interrupt the guest. So it wil lstay disabled
forever.

> > It seems we need to track interrupt disable/enable as well, and only
> > clear DisINTx after eoi with interrupts enabled.  Not sure what is the
> > interface for this.
> 
> If a driver uses device dependent code to disable interrupts,
> there's no
> issue, we'll clear DisINTx, but the device still won't generate an
> interrupt until the dependent code is re-enabled by the guest (assuming
> there's no cross talk between DisINTx and device dependent components).
> 
> For the case that a guest driver disables via DisINTx, it seems easy to
> trap and track that.  So we get:
> 
>         device asserts interrupt
>         guest disables interrupt
>         (trapped, qemu-vfio sets intx.guest_disabled = 1)
>         eoi
>         (qemu-vfio deasserts qemu interrupts, but because of above doesn't clear DisINTx)
>         guest enables interrupt
>         (allowed to pass through, intx.guest_disabled = 0)
>         driver clears interrupt
>         device asserts interrupt
> 
> I've already got an intx.pending bit, so I think this just changes the eoi to:
> 
>     vdev->intx.pending = 0;
>     qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
>     if (!vdev->intx.guest_disabled) {
>         vfio_unmask_intx(vdev);
>     }
> 
> Writing the command register DisINTx bit then just gets some kind of:
> 
>     if (cmd & PCI_COMMAND_INTX_DISABLE && intx.pending) {
>         intx.guest_disabled = 1;
>         cmd &= ~PCI_COMMAND_INTX_DISABLE;
>     } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
>         intx.guest_disabled = 0;
>     }
>     ... allow write
> 
> That work?  Thanks,
> 
> Alex

No, I mean guest OS disables the specific interrupt with
disable_irq.

-- 
MST

  reply	other threads:[~2010-07-11 20:11 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-11 18:09 [RFC PATCH 0/5] QEMU VFIO device assignment Alex Williamson
2010-07-11 18:09 ` [Qemu-devel] " Alex Williamson
2010-07-11 18:09 ` [RFC PATCH 1/5] qemu_ram_map/unmap: Allow pre-allocated space to be mapped Alex Williamson
2010-07-11 18:09   ` [Qemu-devel] " Alex Williamson
2010-07-11 18:09 ` [RFC PATCH 2/5] Minimal RAM API support Alex Williamson
2010-07-11 18:09   ` [Qemu-devel] " Alex Williamson
2010-07-11 18:18   ` Alex Williamson
2010-07-11 18:18     ` [Qemu-devel] " Alex Williamson
2010-07-11 18:20   ` Avi Kivity
2010-07-11 18:20     ` [Qemu-devel] " Avi Kivity
2010-07-11 18:24     ` Alex Williamson
2010-07-11 18:24       ` [Qemu-devel] " Alex Williamson
2010-07-11 18:29       ` Avi Kivity
2010-07-11 18:29         ` [Qemu-devel] " Avi Kivity
2010-07-11 18:09 ` [RFC PATCH 3/5] RAM API: Make use of it for x86 PC Alex Williamson
2010-07-11 18:09   ` [Qemu-devel] " Alex Williamson
2010-07-11 18:09 ` [RFC PATCH 4/5] APIC/IOAPIC EOI callback Alex Williamson
2010-07-11 18:09   ` [Qemu-devel] " Alex Williamson
2010-07-11 18:14   ` Avi Kivity
2010-07-11 18:14     ` [Qemu-devel] " Avi Kivity
2010-07-11 18:26     ` Alex Williamson
2010-07-11 18:26       ` [Qemu-devel] " Alex Williamson
2010-07-11 18:30       ` Avi Kivity
2010-07-11 18:30         ` [Qemu-devel] " Avi Kivity
2010-07-11 18:54         ` Michael S. Tsirkin
2010-07-11 18:54           ` [Qemu-devel] " Michael S. Tsirkin
2010-07-11 19:21           ` Alex Williamson
2010-07-11 19:21             ` [Qemu-devel] " Alex Williamson
2010-07-11 19:23             ` Michael S. Tsirkin
2010-07-11 19:23               ` [Qemu-devel] " Michael S. Tsirkin
2010-07-11 20:03               ` Alex Williamson
2010-07-11 20:03                 ` [Qemu-devel] " Alex Williamson
2010-07-11 20:05                 ` Michael S. Tsirkin [this message]
2010-07-11 20:05                   ` Michael S. Tsirkin
2010-07-11 20:12                 ` Michael S. Tsirkin
2010-07-11 20:12                   ` [Qemu-devel] " Michael S. Tsirkin
2010-07-11 21:59                   ` Alex Williamson
2010-07-11 21:59                     ` [Qemu-devel] " Alex Williamson
2010-07-12  6:33         ` Avi Kivity
2010-07-12  6:33           ` [Qemu-devel] " Avi Kivity
2010-07-12  9:05           ` Gleb Natapov
2010-07-12  9:05             ` [Qemu-devel] " Gleb Natapov
2010-07-12  9:13             ` Avi Kivity
2010-07-12  9:13               ` [Qemu-devel] " Avi Kivity
2010-07-11 18:09 ` [RFC PATCH 5/5] VFIO based device assignment Alex Williamson
2010-07-11 18:09   ` [Qemu-devel] " Alex Williamson
2010-07-11 18:27   ` Avi Kivity
2010-07-11 18:27     ` [Qemu-devel] " Avi Kivity
2010-07-11 19:38     ` Alex Williamson
2010-07-11 19:38       ` [Qemu-devel] " Alex Williamson
2010-07-12  6:37       ` Avi Kivity
2010-07-12  6:37         ` [Qemu-devel] " Avi Kivity
2010-07-11 18:17 ` [RFC PATCH 0/5] QEMU VFIO " Avi Kivity
2010-07-11 18:17   ` [Qemu-devel] " Avi Kivity
2010-07-11 18:37   ` Alex Williamson
2010-07-11 18:37     ` [Qemu-devel] " Alex Williamson
2010-07-11 18:43     ` Avi Kivity
2010-07-11 18:43       ` [Qemu-devel] " Avi Kivity
2010-07-11 20:24       ` Alex Williamson
2010-07-11 20:24         ` [Qemu-devel] " Alex Williamson
2010-07-12  6:29         ` Avi Kivity
2010-07-12  6:29           ` [Qemu-devel] " Avi Kivity
2010-07-12 11:03           ` Michael S. Tsirkin
2010-07-12 11:03             ` [Qemu-devel] " Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100711200547.GE12202@redhat.com \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=avi@redhat.com \
    --cc=chrisw@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pugs@cisco.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.