From: Scott Wood <scottwood@freescale.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Stuart Yoder <b08248@gmail.com>,
Benjamin Herrenschmidt <benh@au.ibm.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Alexander Graf <agraf@suse.de>, "avi@redhat.com" <avi@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Tue, 27 Sep 2011 16:28:27 -0500
Message-ID: <4E823FFB.1030508@freescale.com>
In-Reply-To: <1317084333.25092.138.camel@x201.home>
On 09/26/2011 07:45 PM, Alex Williamson wrote:
> On Mon, 2011-09-26 at 18:59 -0500, Scott Wood wrote:
>> On 09/26/2011 01:34 PM, Alex Williamson wrote:
>>> /* Reset the device */
>>> #define VFIO_DEVICE_RESET _IO(, ,)
>>
>> What generic way do we have to do this? We should probably have a way
>> to determine whether it's possible, without actually asking to do it.
>
> It's not generic, it could be a VFIO_DEVICE_PCI_RESET or we could add a
> bit to the device flags to indicate if it's available or we could add a
> "probe" arg to the ioctl to either check for existence or do it.
Even with PCI, isn't this only possible if function-level reset is
supported? I think we need a flag.
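
As a rough sketch of the flag idea (the flag name, bit position, and info structure below are all hypothetical; nothing in this thread fixes them), userspace could test a capability bit before ever issuing the reset ioctl:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit: "the kernel can reset this device". */
#define EXAMPLE_VFIO_DEVICE_FLAGS_RESET (1u << 0)

/* Hypothetical per-device info as it might be reported to userspace. */
struct example_vfio_device_info {
    uint32_t flags;
};

/* Returns true if the device advertises kernel-driven reset support. */
static bool example_device_can_reset(const struct example_vfio_device_info *info)
{
    return (info->flags & EXAMPLE_VFIO_DEVICE_FLAGS_RESET) != 0;
}
```

A caller would check example_device_can_reset() first and fall back to some other means of quiescing the device when the bit is clear.
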
For devices that can't be reset by the kernel, we'll want the ability to
stop/start DMA access through the IOMMU (or other bus-specific means),
separate from whether the fd is open. If a device is assigned to a
partition and that partition gets reset, we'll want to disable DMA
before we re-use the memory, and enable it after the partition has reset
or quiesced the device (which requires the fd to be open).
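
The ordering above could be modelled roughly as follows. This is only an illustrative state model, not a proposed kernel interface; the structure and function names are made up for the example:

```c
#include <stdbool.h>

/* Hypothetical per-device state; not an actual VFIO structure. */
struct example_assigned_dev {
    bool fd_open;      /* device fd is held by the partition */
    bool dma_enabled;  /* DMA gate at the IOMMU (or bus-specific means) */
    bool quiesced;     /* partition has reset or quiesced the device */
};

/* Step 1: the partition is reset; block DMA before its memory is re-used. */
static void example_partition_reset_begin(struct example_assigned_dev *dev)
{
    dev->dma_enabled = false;
    dev->quiesced = false;
}

/* Step 2: the partition quiesces the device, which needs the fd open. */
static bool example_quiesce(struct example_assigned_dev *dev)
{
    if (!dev->fd_open)
        return false;
    dev->quiesced = true;
    return true;
}

/* Step 3: DMA is re-enabled only once the device is known to be quiet. */
static bool example_dma_enable(struct example_assigned_dev *dev)
{
    if (!dev->quiesced)
        return false;
    dev->dma_enabled = true;
    return true;
}
```

The point of the sketch is that the DMA gate is tracked separately from whether the fd is open, so DMA can stay off while the fd is held.
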
>>> /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
>>> #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
>>> #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
>>>
>>> Hope that covers it.
>>
>> It could be done this way, but I predict that the code (both kernel and
>> user side) will be larger. Maybe not much more complex, but more
>> boilerplate.
>>
>> How will you manage extensions to the interface?
>
> I would assume we'd do something similar to the kvm capabilities checks.
This information is already built into the data-structure approach.
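
To illustrate what I mean, assume each table record starts with a type and a total length (the header layout and the type values here are assumptions for the example, not the layout from the RFC). A reader can then step over record types it does not understand, so new types can be added later without breaking old parsers:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header: type, then total length including header. */
struct example_rec_hdr {
    uint32_t type;
    uint32_t len;
};

/* Hypothetical record types. */
#define EXAMPLE_REC_REGION 1u
#define EXAMPLE_REC_IRQ    2u

/*
 * Count records of one type in a table buffer. Records of any other
 * type, including ones this code has never heard of, are skipped by
 * their length field. Parsing stops at the first malformed record.
 */
static size_t example_count_records(const uint8_t *buf, size_t size,
                                    uint32_t wanted_type)
{
    size_t off = 0, count = 0;

    while (size - off >= sizeof(struct example_rec_hdr)) {
        struct example_rec_hdr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len < sizeof(hdr) || hdr.len > size - off)
            break; /* malformed: too short or runs past the buffer */
        if (hdr.type == wanted_type)
            count++;
        off += hdr.len;
    }
    return count;
}
```

With ioctls the equivalent would be a separate capability query per extension.
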
>> The table should not be particularly large, and you'll need to keep the
>> information around in some form regardless. Maybe in the PCI case you
>> could produce it dynamically (though I probably wouldn't), but it really
>> wouldn't make sense in the device tree case.
>
> It would be entirely dynamic for PCI, there's no advantage to caching
> it. Even for device tree, if you can't fetch it dynamically, you'd have
> to duplicate it between an internal data structure and a buffer reading
> the table.
I don't think we'd need to keep the device tree path/index info around
for anything but the table -- but really, this is a minor consideration.
>> You also lose the ability to easily have a human look at the hexdump for
>> debugging; you'll need a special "lsvfio" tool. You might want one
>> anyway to pretty-print the info, but with ioctls it's mandatory.
>
> I don't think this alone justifies duplicating the data and making it
> difficult to parse on both ends. Chances are we won't need such a tool
> for the ioctl interface because it's easier to get it right the first
> time ;)
It's not just useful for getting the code right, but for e.g. sanity
checking that the devices were bound properly. I think such a tool
would be generally useful, no matter what the kernel interface ends up
being. I don't just use lspci to debug the PCI subsystem. :-)
> Note that I'm not stuck on this interface, I was just thinking about how
> to generate the table last week, it seemed like a pain so I thought I'd
> spend a few minutes outlining an ioctl interface... turns out it's not
> so bad. Thanks,
Yeah, it can work either way, as long as the information's there and
there's a way to add new bits of information, or new bus types, down the
road. Mainly a matter of aesthetics between the two.
-Scott