From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Etienne Martineau <etmartin101@gmail.com>,
Chris Wright <chrisw@sous-sol.org>,
kvm@vger.kernel.org
Subject: Re: KVM devices assignment; PCIe AER?
Date: Thu, 28 Oct 2010 07:39:59 +0200
Message-ID: <20101028053959.GF5599@redhat.com>
In-Reply-To: <1288243062.5129.231.camel@x201>

On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
> On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > > irqfds are currently only used by vhost afaik, since it's being
> > > > > interrupted externally, much like pass through devices are.
> > > >
> > > > Fair enough. Thanks for the clarification.
> > > >
> > > > > Sort of. When the VFIO device triggers an interrupt, we get notified
> > > > > via the eventfd we've registered for that interrupt. We can then call
> > > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > > That much works today.
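
For readers following the eventfd part, here is a minimal userspace
sketch of the notification pattern described above. It is not VFIO or
qemu code: signal_irq() and consume_irq() are hypothetical helper names,
and only eventfd(2), read(2) and write(2) are real interfaces. The
producer adds 1 to the counter, the consumer reads the accumulated count
and resets it.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: the "device" side signals one interrupt by
 * adding 1 to the eventfd counter. Returns 0 on success, -1 on error. */
static int signal_irq(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: the consumer side reads the counter, which also
 * resets it to zero. Returns the number of pending signals, or -1 if
 * nothing was pending on a non-blocking eventfd (or on error). */
static int64_t consume_irq(int fd)
{
    uint64_t pending;

    assert(fd >= 0);
    if (read(fd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending))
        return -1;
    return (int64_t)pending;
}
```

With an fd from eventfd(0, EFD_NONBLOCK), two signal_irq() calls
followed by one consume_irq() yield a count of 2, and a second
consume_irq() finds nothing pending. A real consumer would then raise
the guest interrupt, for example with qemu_set_irq as described above.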
> > > >
> > > > Understood, but performance-wise this is no good for KVM, right?
> > >
> > > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > > to add latency. On the interrupt path we already have irqfds, which
> > > will avoid the bounce through userspace, we just need to use them.
> > > Doing something similar with EOIs could avoid that path, giving us
> > > something comparable to current device assignment.
> > >
> > > > > The irqfd mechanism is simply a way for KVM to
> > > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > > vector. That's yet to be implemented for INTx on VFIO, but should
> > > > > mostly be a matter of connecting existing pieces together. It's working
> > > > > for MSI-X.
> > > >
> > > > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > > > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > > >
> > > > > When VFIO sends an interrupt, it disables the physical device from
> > > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > > > compliant devices for the INTx disable bit in the command register).
> > > > > When the guest services the interrupt, we can detect this by catching
> > > > > > the EOI of the IOAPIC. At that point, we can re-enable interrupts on
> > > > > the device. Wash, rinse, repeat.
> > > > >
> > > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > > register for the interrupt they care about. Since KVM moves the ioapic
> > > > > into the kernel, we need to extend this into KVM and have yet another
> > > > > eventfd mechanism. It's possible that we could have the VFIO kernel
> > > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > > in much the same way as above.
> > > >
> > > > In the case of KVM, where are you going to catch the EOI? For some
> > > > reason I'm under the impression that this is part of KVM. If so, then how are
> > > > you going to 'signal' to VFIO? Cannot use eventfd here, right?
> > >
> > > KVM already has an internal IRQ ACK notifier (which is what current
> > > device assignment uses to do the same thing), it's just a matter of
> > > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > > off the eventfd signal. I've got this working and will probably send
> > > out the KVM patch this week. For now the eventfd goes to userspace, but
> > > this is where I imagine we could steal some of the irqfd code to make
> > > VFIO consume the irqfd signal directly. Thanks,
> > >
> > > Alex
> >
> > BTW, how do we handle sharing the interrupt in guest?
>
> I'm currently using flags to track whether we've asserted the interrupt
> in qemu, and only act on the eoi when the flag is set. In my current
> setup, the guest puts the pass through device and USB on the same
> interrupt and using this filtering seems to be sufficient. I think this
> should act just like bare metal, the device will reassert the interrupt
> if it still needs service, but we can avoid obviously gratuitous eois
> being passed down to vfio.
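
The flag-based filtering described above could look roughly like the
following. This is only a sketch under assumptions: struct shared_intx,
device_interrupt() and guest_eoi() are hypothetical names, not the
actual qemu or VFIO code, and the counters exist only to make the
behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state for an INTx line shared in the guest. */
struct shared_intx {
    bool     asserted;        /* we raised the line and await an EOI */
    unsigned raised;          /* interrupts injected into the guest */
    unsigned eois_forwarded;  /* EOIs passed down to the device side */
    unsigned eois_ignored;    /* EOIs caused by other devices on the line */
};

/* Called when the assigned device signals an interrupt. */
static void device_interrupt(struct shared_intx *s)
{
    assert(s);
    s->asserted = true;
    s->raised++;
}

/* Called on every EOI for the shared line. Returns true only when the
 * EOI belongs to an interrupt we asserted, i.e. when the device should
 * have its interrupt re-enabled. */
static bool guest_eoi(struct shared_intx *s)
{
    assert(s);
    if (!s->asserted) {
        s->eois_ignored++;    /* e.g. an EOI for USB on the same line */
        return false;
    }
    s->asserted = false;
    s->eois_forwarded++;
    return true;
}
```

An EOI that arrives while the flag is clear is dropped, so only EOIs
that follow one of our own assertions reach vfio. If the device still
needs service after the re-enable, it simply reasserts, as on bare
metal.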
>
> This will complicate having vfio intercept the eoi eventfd directly
> since it will then need to track the state too. Another thing I've got
> working is letting vfio support older non-PCI-2.3 compliant devices so
> long as they can claim an exclusive interrupt (just like current code).
> We need to track whether the irq is enabled or disabled for this anyway
> so that we don't get unbalanced enables/disables.
>
> Alex

Tracking state is also good for saving an extra config read
on each access.
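
To make the point concrete, here is a small simulation of cached INTx
mask state. It is a sketch, not the vfio implementation: struct
intx_dev and the intx_*() and cfg_*() functions are hypothetical, and
config space is faked with a plain variable. The only real detail is
the INTx disable bit (bit 10, 0x400) of the PCI command register. A
shadow copy of the register lets mask/unmask skip the read half of the
read-modify-write, and also keeps enables and disables balanced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x400  /* command register, bit 10 */

/* Hypothetical device: simulated command register plus access counters. */
struct intx_dev {
    uint16_t command;     /* stands in for the real config register */
    uint16_t shadow;      /* cached copy of the command register */
    unsigned cfg_reads;
    unsigned cfg_writes;
};

static uint16_t cfg_read_command(struct intx_dev *d)
{
    d->cfg_reads++;
    return d->command;
}

static void cfg_write_command(struct intx_dev *d, uint16_t val)
{
    d->cfg_writes++;
    d->command = val;
}

/* Read the register once at setup time to seed the cache. */
static void intx_init(struct intx_dev *d, uint16_t initial_command)
{
    assert(d);
    d->command = initial_command;
    d->cfg_reads = 0;
    d->cfg_writes = 0;
    d->shadow = cfg_read_command(d);
}

/* Mask INTx. Returns true if the hardware was touched, false if the
 * device was already masked (so repeated calls stay balanced). */
static bool intx_mask(struct intx_dev *d)
{
    if (d->shadow & PCI_COMMAND_INTX_DISABLE)
        return false;
    d->shadow |= PCI_COMMAND_INTX_DISABLE;
    cfg_write_command(d, d->shadow);
    return true;
}

/* Unmask INTx, again without reading config space. */
static bool intx_unmask(struct intx_dev *d)
{
    if (!(d->shadow & PCI_COMMAND_INTX_DISABLE))
        return false;
    d->shadow &= (uint16_t)~PCI_COMMAND_INTX_DISABLE;
    cfg_write_command(d, d->shadow);
    return true;
}
```

After intx_init(), any sequence of mask/unmask calls costs at most one
config write per actual state change and no further config reads. The
sketch assumes nothing else modifies the command register behind the
cache's back.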
--
MST