From: Benjamin Herrenschmidt
Date: Tue, 23 Aug 2011 07:49:45 +1000
Message-ID: <1314049785.7662.44.camel@pasglop>
Subject: Re: [Qemu-devel] kvm PCI assignment & VFIO ramblings
To: aafabbri
Cc: Alexey Kardashevskiy, kvm@vger.kernel.org, Paul Mackerras, "linux-pci@vger.kernel.org", qemu-devel, iommu, chrisw, Alex Williamson, Avi Kivity, linuxppc-dev, benve@cisco.com

> > I wouldn't use uiommu for that.
>
> Any particular reason besides saving a file descriptor?
>
> We use it today, and it seems like a cleaner API than what you propose
> changing it to.

Well for one, we are back to square one vs. grouping constraints.

 .../...

> If we in singleton-group land were building our own "groups" which were sets
> of devices sharing the IOMMU domains we wanted, I suppose we could do away
> with uiommu fds, but it sounds like the current proposal would create 20
> singleton groups (x86 iommu w/o PCI bridges => all devices are partitionable
> endpoints). Asking me to ioctl(inherit) them together into a blob sounds
> worse than the current explicit uiommu API.

I'd rather have an API to create super-groups (groups of groups)
statically and then you can use such groups as normal groups using the
same interface. That create/management process could be done via a
simple command line utility or via sysfs banging, whatever...

Cheers,
Ben.

> Thanks,
> Aaron
>
> >
> > Another option is to make that static configuration APIs via special
> > ioctls (or even netlink if you really like it), to change the grouping
> > on architectures that allow it.
> >
> > Cheers.
> > Ben.
> >
> >>
> >> -Aaron
> >>
> >>> As necessary in the future, we can
> >>> define a more high performance dma mapping interface for streaming dma
> >>> via the group fd. I expect we'll also include architecture specific
> >>> group ioctls to describe features and capabilities of the iommu. The
> >>> group fd will need to prevent concurrent open()s to maintain a 1:1 group
> >>> to userspace process ownership model.
> >>>
> >>> Also on the table is supporting non-PCI devices with vfio. To do this,
> >>> we need to generalize the read/write/mmap and irq eventfd interfaces.
> >>> We could keep the same model of segmenting the device fd address space,
> >>> perhaps adding ioctls to define the segment offset bit position or we
> >>> could split each region into its own fd (VFIO_GET_PCI_BAR_FD(0),
> >>> VFIO_GET_PCI_CONFIG_FD(), VFIO_GET_MMIO_FD(3)), though we're already
> >>> suffering some degree of fd bloat (group fd, device fd(s), interrupt
> >>> event fd(s), per resource fd, etc). For interrupts we can overload
> >>> VFIO_SET_IRQ_EVENTFD to be either PCI INTx or non-PCI irq (do non-PCI
> >>> devices support MSI?).
> >>>
> >>> For qemu, these changes imply we'd only support a model where we have a
> >>> 1:1 group to iommu domain. The current vfio driver could probably
> >>> become vfio-pci as we might end up with more target specific vfio
> >>> drivers for non-pci.
> >>> PCI should be able to maintain a simple -device
> >>> vfio-pci,host=bb:dd.f to enable hotplug of individual devices. We'll
> >>> need to come up with extra options when we need to expose groups to
> >>> guest for pvdma.
> >>>
> >>> Hope that captures it, feel free to jump in with corrections and
> >>> suggestions. Thanks,
> >>>
> >>> Alex
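
For illustration only, the super-group suggestion in the reply (groups of groups, created statically and then used through the same interface as a normal group) could be modelled roughly as below. This is a sketch under assumptions, not any existing kernel, VFIO or uiommu API: the names `Group`, `SuperGroup`, `make_super_group` and `assign_to_domain` are hypothetical, and the device addresses are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Group:
    """Hypothetical minimal group: devices the platform cannot isolate further."""
    name: str
    members: Tuple[str, ...]

    def devices(self) -> List[str]:
        return list(self.members)


@dataclass(frozen=True)
class SuperGroup:
    """Hypothetical group of groups; exposes the same devices() interface."""
    name: str
    groups: Tuple[Group, ...]

    def devices(self) -> List[str]:
        # Flatten member groups in the order they were configured.
        out: List[str] = []
        for group in self.groups:
            out.extend(group.devices())
        return out


def make_super_group(name: str, groups: Sequence[Group]) -> SuperGroup:
    """Static creation step (assumed): refuse overlapping membership."""
    seen = set()
    for group in groups:
        for dev in group.devices():
            if dev in seen:
                raise ValueError(f"device {dev} appears in more than one group")
            seen.add(dev)
    return SuperGroup(name, tuple(groups))


def assign_to_domain(group) -> Dict[str, str]:
    """Stand-in for 'using the group': every device lands in one shared domain.

    Works unchanged for Group and SuperGroup, which is the point of the sketch.
    """
    return {dev: group.name for dev in group.devices()}


if __name__ == "__main__":
    nic = Group("g0", ("0000:01:00.0",))
    hba = Group("g1", ("0000:02:00.0", "0000:02:00.1"))
    both = make_super_group("sg0", [nic, hba])
    print(assign_to_domain(nic))
    print(assign_to_domain(both))
```

The consumer (`assign_to_domain` here) never needs to know whether it was handed a singleton group or a statically built super-group, so no separate merge step is required at use time.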