public inbox for kvm@vger.kernel.org
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Etienne Martineau <etmartin101@gmail.com>,
	Chris Wright <chrisw@sous-sol.org>,
	kvm@vger.kernel.org
Subject: Re: KVM devices assignment; PCIe AER?
Date: Thu, 28 Oct 2010 07:39:59 +0200
Message-ID: <20101028053959.GF5599@redhat.com>
In-Reply-To: <1288243062.5129.231.camel@x201>

On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
> On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > > irqfds are currently only used by vhost afaik, since it's being
> > > > > interrupted externally, much like pass through devices are.
> > > > 
> > > > Fair enough. Thanks for the clarification.
> > > > 
> > > > > Sort of.  When the VFIO device triggers an interrupt, we get notified
> > > > > via the eventfd we've registered for that interrupt.  We can then call
> > > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > > That much works today.
> > > > 
> > > > > Understood, but performance-wise this is no good for KVM, right?
> > > 
> > > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > > to add latency.  On the interrupt path we already have irqfds, which
> > > will avoid the bounce through userspace, we just need to use them.
> > > Doing something similar with EOIs could avoid that path, giving us
> > > something comparable to current device assignment.
> > > 
> > > > > The irqfd mechanism is simply a way for KVM to
> > > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > > vector.  That's yet to be implemented for INTx on VFIO, but should
> > > > > mostly be a matter of connecting existing pieces together.  It's working
> > > > > for MSI-X.
> > > > 
> > > > > OK, I was under the impression you already had irqfd 'connected' to KVM
> > > > > from VFIO... This is why I was asking about the nature of the changes in
> > > > > VFIO.
> > > > 
> > > > > > When VFIO sends an interrupt, it disables the physical device from
> > > > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > > > compliant devices for the INTx disable bit in the command register).
> > > > > > When the guest services the interrupt, we can detect this by catching
> > > > > > the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> > > > > > the device.  Wash, rinse, repeat.
> > > > >
> > > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > > register for the interrupt they care about.  Since KVM moves the ioapic
> > > > > into the kernel, we need to extend this into KVM and have yet another
> > > > > eventfd mechanism.  It's possible that we could have the VFIO kernel
> > > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > > in much the same way as above.
> > > > 
> > > > > In the case of KVM, where are you going to catch the EOI? For some
> > > > > reason I'm under the impression that this is part of KVM. If so, then how
> > > > > are you going to 'signal' to VFIO? Cannot use eventfd here, right?
> > > 
> > > KVM already has an internal IRQ ACK notifier (which is what current
> > > device assignment uses to do the same thing), it's just a matter of
> > > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > > off the eventfd signal.  I've got this working and will probably send
> > > out the KVM patch this week.  For now the eventfd goes to userspace, but
> > > this is where I imagine we could steal some of the irqfd code to make
> > > VFIO consume the irqfd signal directly.  Thanks,
> > > 
> > > Alex
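
For readers unfamiliar with the primitive behind irqfds and the EOI notification discussed above, here is a minimal userspace sketch of eventfd counter semantics on Linux. It relies only on eventfd(2), read(2) and write(2). The helper names make_event, signal_event and drain_event are made up for illustration and are not part of KVM, VFIO or qemu.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking eventfd whose counter starts at 0. */
static int make_event(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Hypothetical helper: what a producer (for example an interrupt source)
 * does to notify the consumer. Each call adds 1 to the kernel counter. */
static int signal_event(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: read and reset the counter. Returns the number of
 * signals accumulated since the last drain, or 0 if nothing was pending. */
static uint64_t drain_event(int fd)
{
    uint64_t count = 0;

    assert(fd >= 0);
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0; /* EAGAIN (nothing pending) or another error */
    return count;
}
```

Note that several signals arriving before a read are coalesced into one readable event carrying the sum. That is harmless for a level-style "needs service" notification, but the consumer cannot tell individual signals apart.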
> > 
> > BTW, how do we handle sharing the interrupt in guest?
> 
> I'm currently using flags to track whether we've asserted the interrupt
> in qemu, and only act on the eoi when the flag is set.  In my current
> setup, the guest puts the pass through device and USB on the same
> interrupt and using this filtering seems to be sufficient.  I think this
> should act just like bare metal, the device will reassert the interrupt
> if it still needs service, but we can avoid obviously gratuitous eois
> being passed down to vfio.
> 
> This will complicate having vfio intercept the eoi eventfd directly
> since it will then need to track the state too.  Another thing I've got
> working is letting vfio support older non-PCI-2.3 compliant devices so
> long as they can claim an exclusive interrupt (just like current code).
> We need to track whether the irq is enabled or disabled for this anyway
> so that we don't get unbalanced enables/disables.
> 
> Alex
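
The state tracking described above could look roughly like the sketch below. It is an illustration only, not the actual qemu or VFIO code: struct intx_state and every function name are hypothetical, and the mask/unmask bodies merely count calls where a real driver would touch the device or the host irq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; none of these names come from qemu or VFIO. */
struct intx_state {
    bool asserted;    /* we raised the guest interrupt and await its EOI */
    bool masked;      /* the physical interrupt is currently disabled */
    int mask_calls;   /* number of real mask operations (illustration) */
    int unmask_calls; /* number of real unmask operations (illustration) */
};

/* Disable the physical interrupt at most once per cycle. */
static void intx_mask(struct intx_state *s)
{
    if (s->masked)
        return; /* already disabled: avoid an unbalanced disable */
    s->masked = true;
    s->mask_calls++;
}

/* Re-enable the physical interrupt only if we disabled it. */
static void intx_unmask(struct intx_state *s)
{
    if (!s->masked)
        return; /* already enabled: avoid an unbalanced enable */
    s->masked = false;
    s->unmask_calls++;
}

/* The device fired: silence it and assert the (possibly shared) guest line. */
static void intx_on_interrupt(struct intx_state *s)
{
    assert(s != NULL);
    intx_mask(s);
    s->asserted = true;
}

/* The guest issued an EOI on the shared line. Returns true if the EOI was
 * acted on, false if it was filtered out as belonging to another device. */
static bool intx_on_eoi(struct intx_state *s)
{
    assert(s != NULL);
    if (!s->asserted)
        return false; /* gratuitous EOI: we never asserted the line */
    s->asserted = false;
    intx_unmask(s);
    return true;
}
```

If the device still needs service after the unmask, it reasserts and the cycle repeats, which matches the bare-metal behaviour described above.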

Tracking state is also good for saving an extra config read
on each access.
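
One possible reading of this, sketched under assumptions: if the driver keeps a shadow copy of the PCI command register, toggling the INTx disable bit no longer needs a read-modify-write cycle against config space. The fake_dev structure and all function names below are hypothetical; only the register offset (0x04) and the bit position (bit 10) are taken from the PCI 2.3 layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_INTX_DISABLE 0x0400 /* command register (offset 0x04), bit 10 */

/* Hypothetical stand-in for a device: a fake command register plus
 * counters so the number of config accesses can be observed. */
struct fake_dev {
    uint16_t cmd_reg;    /* "hardware" command register */
    int cfg_reads;       /* config space reads performed */
    int cfg_writes;      /* config space writes performed */
    uint16_t cmd_shadow; /* driver's cached copy of the register */
    bool shadow_valid;   /* false until the first read fills the cache */
};

static uint16_t cfg_read_cmd(struct fake_dev *d)
{
    d->cfg_reads++;
    return d->cmd_reg;
}

static void cfg_write_cmd(struct fake_dev *d, uint16_t val)
{
    d->cfg_writes++;
    d->cmd_reg = val;
}

/* Set or clear INTx disable using the shadow copy. Only the very first
 * call reads config space; redundant requests cause no access at all. */
static void set_intx_disable(struct fake_dev *d, bool disable)
{
    uint16_t cmd;

    assert(d != 0);
    if (!d->shadow_valid) {
        d->cmd_shadow = cfg_read_cmd(d);
        d->shadow_valid = true;
    }
    cmd = disable ? (uint16_t)(d->cmd_shadow | CMD_INTX_DISABLE)
                  : (uint16_t)(d->cmd_shadow & ~CMD_INTX_DISABLE);
    if (cmd == d->cmd_shadow)
        return; /* already in the requested state */
    cfg_write_cmd(d, cmd);
    d->cmd_shadow = cmd;
}
```

The shadow is only trustworthy if every write to the command register goes through the same path, so guest-initiated writes would have to update it as well.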

-- 
MST
