From: Alex Williamson <alex.williamson@redhat.com>
To: David Gibson <dwg@au1.ibm.com>
Cc: Benjamin Herrenschmidt <benh@au1.ibm.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Stuart Yoder <b08248@gmail.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Alexander Graf <agraf@suse.de>, "avi@redhat.com" <avi@redhat.com>,
Scott Wood <scottwood@freescale.com>
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Fri, 30 Sep 2011 15:59:29 -0600 [thread overview]
Message-ID: <1317419970.2552.53.camel@bling.home> (raw)
In-Reply-To: 20110930084647.GF4512@yookeroo.fritz.box
On Fri, 2011-09-30 at 10:37 -0600, Alex Williamson wrote:
> On Fri, 2011-09-30 at 18:46 +1000, David Gibson wrote:
> > On Mon, Sep 26, 2011 at 12:34:52PM -0600, Alex Williamson wrote:
> > > On Mon, 2011-09-26 at 12:04 +0200, Alexander Graf wrote:
> > > > Am 26.09.2011 um 09:51 schrieb David Gibson <david@gibson.dropbear.id.au>:
> > [snip]
> > > > Also, if you can come up with an interface that does not have variable
> > > > length descriptors but is still able to export all the required
> > > > generic information, please send a proposal to the list :)
> > > >
> > >
> > > Hi,
> > >
> > > The other obvious possibility is a pure ioctl interface. To match what
> > > this proposal is trying to describe, plus the runtime interfaces, we'd
> > > need something like:
> >
> > Right, this also seems a reasonable possibility to me, depending on
> > the details.
> >
> > > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_FLAGS _IOR(, , u64)
> > >
> > >
> > > /* Return number of mmio/iop/config regions.
> > > * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > > #define VFIO_DEVICE_GET_NUM_REGIONS _IOR(, , int)
> > >
> > > /* Return length for region index (may be zero) */
> > > #define VFIO_DEVICE_GET_REGION_LEN _IOWR(, , u64)
> > >
> > > /* Return flags for region index
> > > * :0 - mmap'able, :1 - read-only, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_REGION_FLAGS _IOR(, , u64)
> > >
> > > /* Return file offset for region index */
> > > #define VFIO_DEVICE_GET_REGION_OFFSET _IOWR(, , u64)
> >
> > The above 3 can be be folded into one "getregioninfo" call.
>
> Yep, and the phys addr one below. We can use a flags bit to indicate
> whether it's valid.
>
> > > /* Return physical address for region index - not implemented for PCI */
> > > #define VFIO_DEVICE_GET_REGION_PHYS_ADDR _IOWR(, , u64)
> > >
> > >
> > >
> > > /* Return number of IRQs (Not including MSI/MSI-X for PCI) */
> > > #define VFIO_DEVICE_GET_NUM_IRQ _IOR(, , int)
> > >
> > > /* Set IRQ eventfd for IRQ index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_IRQ_EVENTFD _IOW(, , int)
> > >
> > > /* Unmask IRQ index */
> > > #define VFIO_DEVICE_UNMASK_IRQ _IOW(, , int)
> > >
> > > /* Set unmask eventfd for index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFD _IOW(, , int)
> > >
> > >
> > > /* Return the device tree path for type/index into the user
> > > * allocated buffer */
> > > struct dtpath {
> > > u32 type; (0 = region, 1 = IRQ)
> > > u32 index;
> > > u32 buf_len;
> > > char *buf;
> > > };
> > > #define VFIO_DEVICE_GET_DTPATH _IOWR(, , struct dtpath)
> > >
> > > /* Return the device tree index for type/index */
> > > struct dtindex {
> > > u32 type; (0 = region, 1 = IRQ)
> > > u32 index;
> > > u32 prop_type;
> > > u32 prop_index;
> > > };
> > > #define VFIO_DEVICE_GET_DTINDEX _IOWR(, , struct dtindex)
> >
> > I think those need some work, but that doesn't impinge on the core
> > semantics.
> >
> > > /* Reset the device */
> > > #define VFIO_DEVICE_RESET _IO(, ,)
> > >
> > >
> > > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
> > > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
> >
> > Why does this need seperate controls, rather than just treating MSIs
> > as interrupts beyond the first for PCI devices?
>
> Well, we could say that PCI will always report 3 for
> VFIO_DEVICE_GET_NUM_IRQ where 0 = legacy, 1 = MSI, 2 = MSI-X. ioctls on
> unimplemented IRQs will fail, UNMASK* ioctls on non-level triggered
> interrupts will fail, and the parameter to SET_IRQ_EVENTFD becomes
> arg[0] = index, arg[1] = count, arg[2-n] = fd. Maybe we'd then have a
> GET_IRQ_INFO that takes something like:
>
> struct vfio_irq_info {
> int index;
> unsigned int count;
> u64 flags;
> #define VFIO_IRQ_INFO_FLAGS_LEVEL (1 << 0)
> };
>
> count would be 0 on PCI if the type of interrupt isn't supported.
> Better? Thanks,
FYI for all, I've pushed a branch out to github with the current state
of the re-write. You can find it here
https://awilliam@github.com/awilliam/linux-vfio.git
git://github.com/awilliam/linux-vfio.git
The vfio-ng branch is the latest. The framework is quite a bit more
solid now, so I figure it's time to move into the device and iommu
implementation. vfio-pci is now it's own module that depends on vfio, I
expect vfio-dt to be implemented the same. The PCI ioctl is in place
and supports the interface described above. I'll continue to port
pieces of the old vfio code into this new infrastructure. Comments and
patches welcome. Thanks,
Alex
next prev parent reply other threads:[~2011-09-30 21:59 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-09 13:11 [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files Stuart Yoder
2011-09-09 13:16 ` Stuart Yoder
2011-09-19 15:16 ` Alex Williamson
2011-09-19 19:37 ` Scott Wood
2011-09-19 21:07 ` Alex Williamson
2011-09-19 21:15 ` Scott Wood
2011-09-26 7:51 ` David Gibson
2011-09-26 10:04 ` Alexander Graf
2011-09-26 18:34 ` Alex Williamson
2011-09-26 20:03 ` Stuart Yoder
2011-09-26 20:42 ` Alex Williamson
2011-09-26 23:59 ` Scott Wood
2011-09-27 0:45 ` Alex Williamson
2011-09-27 21:28 ` Scott Wood
2011-09-28 2:40 ` Alex Williamson
2011-09-28 8:58 ` Alexander Graf
2011-09-30 8:55 ` David Gibson
2011-09-30 8:50 ` David Gibson
2011-09-30 8:46 ` David Gibson
2011-09-30 16:37 ` Alex Williamson
2011-09-30 21:59 ` Alex Williamson [this message]
2011-09-30 8:40 ` David Gibson
2011-09-26 19:57 ` Stuart Yoder
2011-09-27 0:25 ` Scott Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1317419970.2552.53.camel@bling.home \
--to=alex.williamson@redhat.com \
--cc=agraf@suse.de \
--cc=avi@redhat.com \
--cc=b08248@gmail.com \
--cc=benh@au1.ibm.com \
--cc=dwg@au1.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=qemu-devel@nongnu.org \
--cc=scottwood@freescale.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).