From: Alex Williamson <alex.williamson@redhat.com>
To: Scott Wood <scottwood@freescale.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Stuart Yoder <b08248@gmail.com>,
Benjamin Herrenschmidt <benh@au.ibm.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Alexander Graf <agraf@suse.de>, "avi@redhat.com" <avi@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Mon, 26 Sep 2011 18:45:31 -0600
Message-ID: <1317084333.25092.138.camel@x201.home>
In-Reply-To: <4E8111E5.4030209@freescale.com>
On Mon, 2011-09-26 at 18:59 -0500, Scott Wood wrote:
> On 09/26/2011 01:34 PM, Alex Williamson wrote:
> > The other obvious possibility is a pure ioctl interface. To match what
> > this proposal is trying to describe, plus the runtime interfaces, we'd
> > need something like:
> >
> > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_FLAGS _IOR(, , u64)
> >
> >
> > /* Return number of mmio/iop/config regions.
> > * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > #define VFIO_DEVICE_GET_NUM_REGIONS _IOR(, , int)
>
> How do you handle BARs that a particular device doesn't use? Zero-length?
Yep
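As a rough sketch of what zero-length regions mean for a caller: userspace walks all 8 PCI slots and skips any whose length is 0. The `lengths` array and `count_used_regions` are hypothetical; the array stands in for a per-region size query that this excerpt does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PCI_REGIONS 8 /* BAR0-5 + ROM + Config, per the quoted comment */

/* Count the regions a device actually implements. A zero length marks an
 * unused BAR. The lengths array is an assumption standing in for whatever
 * per-region query the final interface provides. */
static size_t count_used_regions(const uint64_t lengths[NUM_PCI_REGIONS])
{
    size_t used = 0;

    for (size_t i = 0; i < NUM_PCI_REGIONS; i++) {
        if (lengths[i] != 0)
            used++;
    }
    return used;
}
```

The index of each slot stays fixed regardless of which BARs exist, so a caller can always address "BAR2" as index 2 without first compacting the list.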
> > /* Return the device tree path for type/index into the user
> > * allocated buffer */
> > struct dtpath {
> > u32 type; /* 0 = region, 1 = IRQ */
> > u32 index;
> > u32 buf_len;
> > char *buf;
> > };
> > #define VFIO_DEVICE_GET_DTPATH _IOWR(, , struct dtpath)
>
> So now the user needs to guess a buffer length in advance... and what
> happens if it's too small?
It returns -ENOSPC. A call with buf_len = 0 could report the required size.
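A minimal sketch of that two-call pattern, under stated assumptions: `mock_get_dtpath` is a hypothetical stand-in for `ioctl(fd, VFIO_DEVICE_GET_DTPATH, &req)`, the path string is made up, and writing the required size back into `buf_len` is only one way the size could be indicated.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors struct dtpath from the quoted proposal. */
struct dtpath {
    uint32_t type;    /* 0 = region, 1 = IRQ */
    uint32_t index;
    uint32_t buf_len; /* in: buffer size; out (assumed): size required */
    char *buf;
};

/* Hypothetical stand-in for the kernel side of VFIO_DEVICE_GET_DTPATH. */
static int mock_get_dtpath(struct dtpath *req)
{
    static const char path[] = "/soc/example@0"; /* made-up path */
    uint32_t avail = req->buf_len;

    req->buf_len = sizeof(path); /* always report the size needed */
    if (avail == 0)
        return 0;                /* pure size probe, nothing copied */
    if (avail < sizeof(path))
        return -ENOSPC;          /* buffer too small */
    memcpy(req->buf, path, sizeof(path));
    return 0;
}

/* Probe for the size, allocate, then fetch. The caller frees the result.
 * Returns NULL on any failure. */
static char *fetch_dtpath(uint32_t type, uint32_t index)
{
    struct dtpath req = { .type = type, .index = index,
                          .buf_len = 0, .buf = NULL };

    if (mock_get_dtpath(&req) != 0 || req.buf_len == 0)
        return NULL;
    req.buf = malloc(req.buf_len);
    if (req.buf == NULL)
        return NULL;
    if (mock_get_dtpath(&req) != 0) {
        free(req.buf);
        return NULL;
    }
    return req.buf;
}
```

With this shape the user never has to guess a length: the first call costs one extra round trip, and a too-small buffer fails cleanly instead of truncating.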
> > /* Reset the device */
> > #define VFIO_DEVICE_RESET _IO(, ,)
>
> What generic way do we have to do this? We should probably have a way
> to determine whether it's possible, without actually asking to do it.
It's not generic; it could be a VFIO_DEVICE_PCI_RESET, or we could add a
bit to the device flags to indicate whether it's available, or we could
add a "probe" arg to the ioctl to either check for existence or do the
reset.
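To illustrate the "probe" option only, here is a small sketch. `struct mock_device` and `mock_device_reset` are hypothetical stand-ins for the kernel side, and the -ENOTTY error code is an assumption, not something settled in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state standing in for the kernel side. */
struct mock_device {
    bool can_reset;  /* whether any reset mechanism exists */
    int reset_count; /* how many real resets were performed */
};

/* Stand-in for a reset ioctl with a "probe" argument:
 * probe != 0 only reports availability, probe == 0 performs the reset. */
static int mock_device_reset(struct mock_device *dev, int probe)
{
    if (!dev->can_reset)
        return -ENOTTY; /* assumed error code for "not supported" */
    if (!probe)
        dev->reset_count++; /* the actual reset would happen here */
    return 0;
}
```

A caller would probe once at setup time and only offer reset to the guest if the probe succeeds, which answers the question of how to find out without actually asking for a reset.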
> > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
> > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
> >
> > Hope that covers it.
>
> It could be done this way, but I predict that the code (both kernel and
> user side) will be larger. Maybe not much more complex, but more
> boilerplate.
>
> How will you manage extensions to the interface?
I would assume we'd do something similar to the kvm capabilities checks.
I don't know if that's just bits of GET_FLAGS or a different ioctl.
> With the table it's
> simple, you see a new (sub)record type and you either understand it or
> you skip it. With ioctls you need to call every information-gathering
> ioctl you know and care about (or are told is present via some feature
> advertisement), and see if there's anything there.
I don't really see much difference between the interfaces here. Either
way you pick and choose: the table entries you care about, or the ioctls
you care about. In one case you see the entry in the table; in the other
there's a bit indicating the capability exists.
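A sketch of the flag-bit variant, assuming capabilities end up as bits in the GET_FLAGS value. Bits 0 and 1 follow the comment in the quoted proposal; `FLAG_HYPO_RESET` at bit 2 is invented here for illustration (the proposal reserves bits 63:2), as are the helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0 and 1 come from the quoted proposal. Bit 2 is hypothetical. */
#define FLAG_PCI        (UINT64_C(1) << 0)
#define FLAG_DTPATH     (UINT64_C(1) << 1)
#define FLAG_HYPO_RESET (UINT64_C(1) << 2)

#define KNOWN_FLAGS (FLAG_PCI | FLAG_DTPATH | FLAG_HYPO_RESET)

/* True if the capability bit is advertised. */
static bool has_flag(uint64_t flags, uint64_t bit)
{
    return (flags & bit) != 0;
}

/* Bits this client does not understand. It simply ignores them, much like
 * skipping an unknown record type in the table layout. */
static uint64_t unknown_flags(uint64_t flags)
{
    return flags & ~KNOWN_FLAGS;
}
```

A client would call only the ioctls whose bits are set and that it cares about; bits added by a newer kernel fall into `unknown_flags` and are left alone.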
> > Something I prefer about this interface is that
> > everything can easily be generated on the fly, whereas reading out a
> > table from the device means we really need to have that table somewhere
> > in kernel memory to easily support reading random offsets. Thoughts?
>
> The table should not be particularly large, and you'll need to keep the
> information around in some form regardless. Maybe in the PCI case you
> could produce it dynamically (though I probably wouldn't), but it really
> wouldn't make sense in the device tree case.
It would be entirely dynamic for PCI; there's no advantage to caching
it. Even for device tree, if you can't fetch it dynamically, you'd have
to duplicate it between an internal data structure and a buffer for
reading the table.
> You also lose the ability to easily have a human look at the hexdump for
> debugging; you'll need a special "lsvfio" tool. You might want one
> anyway to pretty-print the info, but with ioctls it's mandatory.
I don't think this alone justifies duplicating the data and making it
difficult to parse on both ends. Chances are we won't need such a tool
for the ioctl interface because it's easier to get it right the first
time ;)
Note that I'm not stuck on this interface. I was thinking last week about
how to generate the table, and it seemed like a pain, so I thought I'd
spend a few minutes outlining an ioctl interface. It turns out it's not
so bad. Thanks,
Alex