From: Alex Williamson
Date: Mon, 26 Sep 2011 14:42:43 -0600
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
To: Stuart Yoder
Cc: "kvm@vger.kernel.org", Benjamin Herrenschmidt, "qemu-devel@nongnu.org", Alexander Graf, "avi@redhat.com", Scott Wood, David Gibson
Message-ID: <1317069765.25092.67.camel@x201.home>
References: <20110926075144.GT12286@yookeroo.fritz.box> <3D54B89C-A0A3-4461-A7A1-3F1E4AB79296@suse.de> <1317062095.25515.75.camel@bling.home>

On Mon, 2011-09-26 at 15:03 -0500, Stuart Yoder wrote:
> >
> > The other obvious possibility is a pure ioctl interface.  To match what
> > this proposal is trying to describe, plus the runtime interfaces, we'd
> > need something like:
> >
> > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_FLAGS                   _IOR(, , u64)
> >
> >
> > /* Return number of mmio/iop/config regions.
> >  * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > #define VFIO_DEVICE_GET_NUM_REGIONS             _IOR(, , int)
> >
> > /* Return length for region index (may be zero) */
> > #define VFIO_DEVICE_GET_REGION_LEN              _IOWR(, , u64)
> >
> > /* Return flags for region index
> >  * :0 - mmap'able, :1 - read-only, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_REGION_FLAGS            _IOR(, , u64)
> >
> > /* Return file offset for region index */
> > #define VFIO_DEVICE_GET_REGION_OFFSET           _IOWR(, , u64)
> >
> > /* Return physical address for region index - not implemented for PCI */
> > #define VFIO_DEVICE_GET_REGION_PHYS_ADDR        _IOWR(, , u64)
> >
> >
> >
> > /* Return number of IRQs (Not including MSI/MSI-X for PCI) */
> > #define VFIO_DEVICE_GET_NUM_IRQ                 _IOR(, , int)
> >
> > /* Set IRQ eventfd for IRQ index, arg[0] = index, arg[1] = fd */
> > #define VFIO_DEVICE_SET_IRQ_EVENTFD             _IOW(, , int)
> >
> > /* Unmask IRQ index */
> > #define VFIO_DEVICE_UNMASK_IRQ                  _IOW(, , int)
> >
> > /* Set unmask eventfd for index, arg[0] = index, arg[1] = fd */
> > #define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFD      _IOW(, , int)
> >
> >
> > /* Return the device tree path for type/index into the user
> >  * allocated buffer */
> > struct dtpath {
> >         u32     type; (0 = region, 1 = IRQ)
> >         u32     index;
> >         u32     buf_len;
> >         char    *buf;
> > };
> > #define VFIO_DEVICE_GET_DTPATH                  _IOWR(, , struct dtpath)
> >
> > /* Return the device tree index for type/index */
> > struct dtindex {
> >         u32     type; (0 = region, 1 = IRQ)
> >         u32     index;
> >         u32     prop_type;
> >         u32     prop_index;
> > };
> > #define VFIO_DEVICE_GET_DTINDEX                 _IOWR(, , struct dtindex)
> >
> >
> > /* Reset the device */
> > #define VFIO_DEVICE_RESET                       _IO(, ,)
> >
> >
> > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS        _IOW(, , int)
> > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS       _IOW(, , int)
> >
> > Hope that covers it.
> > Something I prefer about this interface is that
> > everything can easily be generated on the fly, whereas reading out a
> > table from the device means we really need to have that table somewhere
> > in kernel memory to easily support reading random offsets.  Thoughts?
>
> I think this could work, but I'm not sure it makes the problem David
> had any better-- you substitute the complexity of parsing the
> variable length regions with invoking a set of APIs.

I read it as mostly the complexity problem, which I think this makes
fairly trivial.  It also eliminates a lot of complexity on the kernel
side of supporting the table driven interface.  Thanks,

Alex

if (!(GET_FLAGS & PCI))
        return error;

if (GET_NUM_REGIONS < 8)
        return error;

GET_REGION_LEN(7)
GET_REGION_OFFSET(7)
// setup config access

for (i = 0; i < 6; i++) {
        if (GET_REGION_LEN(i)) {
                GET_REGION_OFFSET(i)
                setup mmap/rw
        }
}

if (GET_REGION_LEN(6)) {
        GET_REGION_OFFSET(6)
        setup ROM access
}
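
To make the pseudocode above concrete, here is a minimal C sketch of the same probe sequence. It is only an illustration under stated assumptions: the proposal leaves the ioctl type and number fields blank, so the sketch does not issue real ioctls. Instead, the hypothetical helpers `get_flags()`, `get_num_regions()`, `get_region_len()` and `get_region_offset()` stand in for the proposed `VFIO_DEVICE_GET_*` calls and read from an in-memory `struct fake_device`. The names `FLAG_PCI`, `struct fake_region`, `struct pci_layout` and `probe_pci()` are likewise invented for the example and are not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: bit 0 of the flags word marks a PCI device, and
 * a PCI device exposes 8 regions (BAR0-5, ROM at index 6, config at 7). */
#define FLAG_PCI        (1ULL << 0)
#define PCI_NUM_REGIONS 8
#define PCI_NUM_BARS    6
#define PCI_ROM_INDEX   6
#define PCI_CFG_INDEX   7

struct fake_region {
    uint64_t len;     /* region length, may be zero */
    uint64_t offset;  /* file offset of the region within the device fd */
};

/* In-memory stand-in for a device fd; a real user would query the kernel. */
struct fake_device {
    uint64_t flags;
    int num_regions;
    struct fake_region regions[PCI_NUM_REGIONS];
};

/* Each helper models one of the proposed query ioctls (assumption). */
static uint64_t get_flags(const struct fake_device *d) { return d->flags; }
static int get_num_regions(const struct fake_device *d) { return d->num_regions; }
static uint64_t get_region_len(const struct fake_device *d, int i) { return d->regions[i].len; }
static uint64_t get_region_offset(const struct fake_device *d, int i) { return d->regions[i].offset; }

/* What the caller learns from the probe. */
struct pci_layout {
    struct fake_region config;
    struct fake_region bar[PCI_NUM_BARS];
    struct fake_region rom;
    int bars_present;  /* number of BARs with non-zero length */
};

/* Returns 0 on success, -1 if the device is not PCI or has too few regions. */
static int probe_pci(const struct fake_device *d, struct pci_layout *out)
{
    assert(d != NULL && out != NULL);

    if (!(get_flags(d) & FLAG_PCI))
        return -1;
    if (get_num_regions(d) < PCI_NUM_REGIONS)
        return -1;

    /* Config space is always queried. */
    out->config.len = get_region_len(d, PCI_CFG_INDEX);
    out->config.offset = get_region_offset(d, PCI_CFG_INDEX);

    /* BARs: only ask for an offset when the region has a length. */
    out->bars_present = 0;
    for (int i = 0; i < PCI_NUM_BARS; i++) {
        out->bar[i].len = get_region_len(d, i);
        out->bar[i].offset = 0;
        if (out->bar[i].len) {
            out->bar[i].offset = get_region_offset(d, i);
            out->bars_present++;
        }
    }

    /* The ROM is optional, same rule as the BARs. */
    out->rom.len = get_region_len(d, PCI_ROM_INDEX);
    out->rom.offset = out->rom.len ? get_region_offset(d, PCI_ROM_INDEX) : 0;
    return 0;
}
```

A caller would fill in a `struct fake_device`, call `probe_pci()`, and then set up mmap or read/write access for each region whose length is non-zero. With a real ioctl interface, each helper body would become one `ioctl()` on the device fd, which is the "generated on the fly" property the quoted text mentions: no table has to be parsed, only fixed-size answers to individual queries.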