From: Alex Williamson <alex.williamson@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Alexey Kardashevskiy <aik@ozlabs.ru>,
Alexander Graf <agraf@suse.de>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"anthony@codemonkey.ws" <anthony@codemonkey.ws>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [RFC PATCH] qemu pci: pci_add_capability enhancement to prevent damaging config space
Date: Fri, 08 Jun 2012 08:56:24 -0600
Message-ID: <1339167384.26976.71.camel@ul30vt>
In-Reply-To: <4FD20F92.9080805@siemens.com>

On Fri, 2012-06-08 at 16:43 +0200, Jan Kiszka wrote:
> On 2012-06-08 16:00, Alexey Kardashevskiy wrote:
> > 08.06.2012 21:30, Jan Kiszka wrote:
> >> On 2012-06-08 13:16, Alexey Kardashevskiy wrote:
> >>> 08.06.2012 20:56, Jan Kiszka wrote:
> >>>> On 2012-06-08 10:47, Alexey Kardashevskiy wrote:
> >>>>> Yet another try :)
> >>>>>
> >>>>> Normally pci_add_capability is called on a device to add a new
> >>>>> capability. This is OK for emulated devices, whose capability list
> >>>>> is built by QEMU.
> >>>>>
> >>>>> In the case of VFIO the capability may already exist and adding new
> >>>>
> >>>> Why does it exist? VFIO should build the virtual capability list from
> >>>> scratch (just like classic device assignment does), recreating the
> >>>> layout of the physical device (except for masked out caps). In that
> >>>> case, this conflict should become impossible, no?
> >>>
> >>> Normally capabilities in emulated devices are created by calling
> >>> msi_init or msix_init - just when emulated device wants to advertise it
> >>> to the guest.
> >>>
> >>> In the case of VFIO, there are a lot of capabilities which QEMU does not
> >>> know and does not want to know about. They are read from the host kernel
> >>> as is. And we definitely want to pass these capabilities to the guest as
> >>> is, i.e. at the same positions and in the same number. Just for some
> >>> we call pci_add_capability (indirectly!) if we want QEMU to support them
> >>> somehow.
> >>>
> >>> If we invent some function which "re-adds" all the capabilities we got
> >>> from the host to keep internal QEMU's PCIDevice data in sync, then we'll
> >>> need to change every piece of code which adds capabilities.
> >>
> >> I can't follow. What is different in VFIO from device-assignment.c,
> >> assigned_device_pci_cap_init (except that it already uses msi[x]_init,
> >> something we need to fix in device-assignment.c)?
> >
> > What are device-assignment.c and assigned_device_pci_cap_init? Cannot
> > find them in QEMU tree.
>
> "Old-style" KVM device assignment is not yet upstream. You can find it
> in qemu-kvm, hopefully in upstream soon as well.
>
> >
> > Ah, anyway. The main difference is that QEMU does not emulate VFIO devices,
> > it is just a proxy to the host system. Or I do not understand the question.
> >
> >>> I noticed,
> >>> this is a very common approach here: changing a lot for a very small thing
> >>> or a rare case, but I'd like to avoid this :)
> >>>
> >>>> But if pci_*add*_capability should actually be used like this (I doubt
> >>>> this),
> >>>
> >>> MSI/MSIX use it. To enable MSI/MSIX on VFIO PCIDevice, we call
> >>> msi_init/msix_init and they call pci_add_capability.
> >>
> >> You can't blame msi_init/msix_init for the fact that VFIO creates a
> >> capability list with an existing MSI/MSI-X entry beforehand.
> >
> > VFIO does not create any capability. It gets them all from the host
> > kernel and passes them to the guest as is. VFIO only needs MSIX to be
> > enabled in VFIO.
>
> Just like any device in QEMU, VFIO also needs to set up a virtual config
> space when it registers with the PCI core layer. Even if the virtual one
> is modeled after the real one, it is still _created_ by the VFIO
> userspace part. And this creation process is obviously a bit messed up
> so far. Fix this, but not by adding workarounds in the MSI or PCI layer.
> Rather add all capabilities you want to expose to the guest via
> pci_add_capability or, indirectly, via msi[x]_init at the right
> position. Do not just copy the real config space over, that breaks the
> core layer as we see.

The difference between VFIO and kvm device assignment is that VFIO
emulates a lot of config space for us, so most things are passed
through. MSI and MSI-X are unique in that we actually do want qemu
support to help us manage them. So we're basically not telling
qemu about anything other than these, and for the most part, that works
since qemu never handles access to the other capabilities. However, I
think you're probably right, VFIO should just walk the capabilities
list, registering each with qemu. It's a little "unnecessary" overhead
from the VFIO perspective, but it makes the VFIO device less unique.
I'll work on adding this. Thanks,

Alex
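
A minimal sketch of the capability walk described above, assuming a plain
256-byte copy of the device's config space. It is not the VFIO or QEMU
code: walk_capabilities, cap_visitor, record_cap and build_sample_config
are hypothetical names made up for this illustration, and the visitor
callback merely stands in for whatever would register each capability
with qemu. Only the register offsets and capability IDs come from the
standard PCI config space layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCI_CONFIG_SIZE      256
#define PCI_STATUS           0x06   /* status register, low byte */
#define PCI_STATUS_CAP_LIST  0x10   /* bit 4: capability list present */
#define PCI_CAPABILITY_LIST  0x34   /* pointer to the first capability */
#define PCI_CAP_LIST_ID      0      /* capability ID byte */
#define PCI_CAP_LIST_NEXT    1      /* pointer to the next capability */
#define PCI_CAP_ID_PM        0x01
#define PCI_CAP_ID_MSI       0x05
#define PCI_CAP_ID_MSIX      0x11

/* Hypothetical callback: stands in for registering one capability. */
typedef void (*cap_visitor)(uint8_t id, uint8_t offset, void *opaque);

/*
 * Walk the capability list in a copy of config space, calling visit()
 * for each entry in list order. Returns the number of capabilities
 * visited, or -1 if the list is malformed (a pointer into the standard
 * header, or a loop).
 */
int walk_capabilities(const uint8_t *config, cap_visitor visit, void *opaque)
{
    uint8_t seen[PCI_CONFIG_SIZE] = {0};
    int count = 0;
    uint8_t pos;

    assert(config != NULL && visit != NULL);

    if (!(config[PCI_STATUS] & PCI_STATUS_CAP_LIST)) {
        return 0;                       /* device has no capability list */
    }

    /* The bottom two bits of each pointer are reserved; mask them off. */
    pos = config[PCI_CAPABILITY_LIST] & 0xFC;
    while (pos != 0) {
        if (pos < 0x40 || seen[pos]) {
            return -1;                  /* inside the header, or a loop */
        }
        seen[pos] = 1;
        visit(config[pos + PCI_CAP_LIST_ID], pos, opaque);
        count++;
        pos = config[pos + PCI_CAP_LIST_NEXT] & 0xFC;
    }
    return count;
}

/* Illustrative visitor that records what it was shown. */
struct cap_log {
    uint8_t ids[16];
    uint8_t offsets[16];
    int n;
};

void record_cap(uint8_t id, uint8_t offset, void *opaque)
{
    struct cap_log *rec = opaque;

    if (rec->n < 16) {
        rec->ids[rec->n] = id;
        rec->offsets[rec->n] = offset;
        rec->n++;
    }
}

/* Made-up config space with PM -> MSI -> MSI-X capabilities. */
void build_sample_config(uint8_t *config)
{
    memset(config, 0, PCI_CONFIG_SIZE);
    config[PCI_STATUS] = PCI_STATUS_CAP_LIST;
    config[PCI_CAPABILITY_LIST] = 0x40;
    config[0x40] = PCI_CAP_ID_PM;    config[0x41] = 0x50;
    config[0x50] = PCI_CAP_ID_MSI;   config[0x51] = 0x70;
    config[0x70] = PCI_CAP_ID_MSIX;  config[0x71] = 0x00;
}
```

Because every capability is visited at its original offset, a caller
could register each one at the position the physical device uses, which
is the layout-preserving behaviour discussed earlier in the thread.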