From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4C3A0D15.3070302@redhat.com>
Date: Sun, 11 Jul 2010 21:27:33 +0300
From: Avi Kivity
MIME-Version: 1.0
References: <20100711180910.20121.93313.stgit@localhost6.localdomain6> <20100711180942.20121.97368.stgit@localhost6.localdomain6>
In-Reply-To: <20100711180942.20121.97368.stgit@localhost6.localdomain6>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [RFC PATCH 5/5] VFIO based device assignment
List-Id: qemu-devel.nongnu.org
To: Alex Williamson
Cc: chrisw@redhat.com, mst@redhat.com, qemu-devel@nongnu.org, kvm@vger.kernel.org, pugs@cisco.com

On 07/11/2010 09:09 PM, Alex Williamson wrote:
> This patch adds qemu device assignment support using the proposed
> VFIO/UIOMMU kernel interfaces.  The existing KVM-only device assignment
> code makes use of various pci sysfs files for config space, MMIO BAR
> mapping, and misc other config items.  It then jumps over to KVM-specific
> ioctls for enabling interrupts and assigning devices to IOMMU domains.
> Finally, IO-port support uses in/out directly.  This is a messy model
> to support and causes numerous issues when we try to allow unprivileged
> users to access PCI devices.
>
> VFIO/UIOMMU reduces this to two interfaces, /dev/vfioX and /dev/uiommu.
> The VFIO device file provides all the necessary support for accessing
> PCI config space, read/write/mmap BARs (including IO-port space),
> configuring INTx/MSI/MSI-X interrupts and setting up DMA mapping.  The
> UIOMMU interface allows iommu domains to be created, and via vfio,
> devices can be bound to a domain.  This provides an easier model to
> support (IMHO) and removes the bindings that make current device
> assignment only useable for KVM enabled guests.
>
> Usage is similar to KVM device assignment.  Rather than binding the
> device to the pci-stub driver, vfio devices need to be bound to the
> vfio driver.  From there, it's a simple matter of specifying the
> device as:
>
>    -device vfio,host=01:00.0
>
> This example requires either root privileges or proper permissions on
> /dev/uiommu and /dev/vfioX.  To support unprivileged operation, the
> options vfiofd= and uiommufd= are available.  Depending on the usage
> of uiommufd, each guest device can be assigned to the same iommu
> domain, or to independent iommu domains.  In the example above, each
> device is assigned to a separate iommu domain.
>
> As VFIO has no KVM dependencies, this patch works with or without
> -enable-kvm.  I have successfully used a couple assigned devices in a
> guest without KVM support, however Michael Tsirkin warns that tcg
> may not provide atomic operations to memory visible to the passthrough
> device, which could result in failures for devices depending on such
> for synchronization.
>
> This patch is functional, but hasn't seen a lot of testing.  I've
> tested 82576 PFs and VFs, an Intel HDA audio device, and UHCI and EHCI
> USB devices (this actually includes INTx/MSI/MSI-X, 4k aligned MMIO
> BARs, non-4k aligned MMIO BARs, and IO-Port BARs).
>

Good stuff.

I presume the iommu interface is responsible for page pinning.  What
about page attributes?

There are two cases:

- snoop capable iommu - can use write-back RAM, but need to enable snoop.
  BARs still need to respect page attributes.
- older mmu - need to respect guest memory type; probably cannot be done
  without kvm.

If the guest maps a BAR or RAM using write-combine memory type, can we
reflect that?  This may provide a considerable performance benefit.

-- 
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
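
For illustration only, the two page-attribute cases raised in the reply could be written down as a small decision function. This is a rough sketch, not code from the patch or from any kernel interface: every name in it (mem_type, region_kind, mapping_policy, choose_policy) is hypothetical, and the behaviour for BARs behind a snoop-capable iommu is an assumption, since the message only says that BARs must still respect page attributes.

```c
#include <stdbool.h>

/* Hypothetical memory types; illustrative names, not QEMU or kernel API. */
enum mem_type { MEM_WB, MEM_WC, MEM_UC };

/* Whether the mapped region is guest RAM or a device BAR. */
enum region_kind { REGION_RAM, REGION_BAR };

struct mapping_policy {
    enum mem_type host_type;  /* memory type to use for the host mapping */
    bool enable_snoop;        /* ask the iommu to snoop DMA accesses */
    bool likely_needs_kvm;    /* "probably cannot be done without kvm" */
};

/*
 * Sketch of the two cases:
 *  - snoop-capable iommu: RAM can stay write-back with snoop enabled,
 *    while BARs keep the guest's page attributes;
 *  - older hardware: the guest memory type must be respected, which
 *    probably requires kvm.
 */
struct mapping_policy choose_policy(bool iommu_can_snoop,
                                    enum region_kind kind,
                                    enum mem_type guest_type)
{
    /* Default: follow whatever memory type the guest asked for. */
    struct mapping_policy p = { guest_type, false, false };

    if (iommu_can_snoop) {
        if (kind == REGION_RAM) {
            p.host_type = MEM_WB;
            p.enable_snoop = true;
        }
        /* BARs keep guest_type. Whether that needs kvm is not stated in
         * the message, so likely_needs_kvm stays false (an assumption). */
    } else {
        p.likely_needs_kvm = true;
    }
    return p;
}
```

With this model, a guest write-combine mapping of a BAR yields a write-combine host mapping in both cases, which is the situation the closing question about performance refers to.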