From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@redhat.com>
Subject: Re: [RFC PATCH 5/5] VFIO based device assignment
Date: Sun, 11 Jul 2010 21:27:33 +0300
Message-ID: <4C3A0D15.3070302@redhat.com>
References: <20100711180910.20121.93313.stgit@localhost6.localdomain6> <20100711180942.20121.97368.stgit@localhost6.localdomain6>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, pugs@cisco.com,
	chrisw@redhat.com, mst@redhat.com
To: Alex Williamson <alex.williamson@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:32028 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754107Ab0GKS1j (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sun, 11 Jul 2010 14:27:39 -0400
In-Reply-To: <20100711180942.20121.97368.stgit@localhost6.localdomain6>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 07/11/2010 09:09 PM, Alex Williamson wrote:
> This patch adds qemu device assignment support using the proposed
> VFIO/UIOMMU kernel interfaces.  The existing KVM-only device assignment
> code makes use of various pci sysfs files for config space, MMIO BAR
> mapping, and misc other config items.  It then jumps over to KVM-specific
> ioctls for enabling interrupts and assigning devices to IOMMU domains.
> Finally, IO-port support uses in/out directly.  This is a messy model
> to support and causes numerous issues when we try to allow unprivileged
> users to access PCI devices.
>
> VFIO/UIOMMU reduces this to two interfaces, /dev/vfioX and /dev/uiommu.
> The VFIO device file provides all the necessary support for accessing
> PCI config space, read/write/mmap BARs (including IO-port space),
> configuring INTx/MSI/MSI-X interrupts and setting up DMA mapping.  The
> UIOMMU interface allows iommu domains to be created, and via vfio,
> devices can be bound to a domain.  This provides an easier model to
> support (IMHO) and removes the bindings that make current device
> assignment only useable for KVM enabled guests.
>
> Usage is similar to KVM device assignment.  Rather than binding the
> device to the pci-stub driver, vfio devices need to be bound to the
> vfio driver.  From there, it's a simple matter of specifying the
> device as:
>
> -device vfio,host=01:00.0
>
> This example requires either root privileges or proper permissions on
> /dev/uiommu and /dev/vfioX.  To support unprivileged operation, the
> options vfiofd= and uiommufd= are available.  Depending on the usage
> of uiommufd, each guest device can be assigned to the same iommu
> domain, or to independent iommu domains.  In the example above, each
> device is assigned to a separate iommu domain.
>
> As VFIO has no KVM dependencies, this patch works with or without
> -enable-kvm.  I have successfully used a couple of assigned devices in a
> guest without KVM support; however, Michael Tsirkin warns that tcg
> may not provide atomic operations to memory visible to the passthrough
> device, which could result in failures for devices depending on such
> for synchronization.
>
> This patch is functional, but hasn't seen a lot of testing.  I've
> tested 82576 PFs and VFs, an Intel HDA audio device, and UHCI and EHCI
> USB devices (this actually includes INTx/MSI/MSI-X, 4k aligned MMIO
> BARs, non-4k aligned MMIO BARs, and IO-Port BARs).
>
>    

Good stuff.

I presume the iommu interface is responsible for page pinning.  What 
about page attributes?

There are two cases:

- snoop-capable iommu - can use write-back RAM, but need to enable 
snoop.  BARs still need to respect page attributes.
- older iommu - need to respect guest memory type; probably cannot be done 
without kvm.

If the guest maps a BAR or RAM using write-combine memory type, can we 
reflect that?  This may provide a considerable performance benefit.
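
To make the two cases concrete, here is a rough sketch in C of the
decision I have in mind.  All names (choose_policy, struct
mapping_policy, the enums) are made up for illustration and are not part
of VFIO, UIOMMU, or kvm.  It also treats "probably cannot be done
without kvm" as a hard failure, which is an assumption.

```c
#include <stdbool.h>

/* Hypothetical memory types, for illustration only. */
enum memtype { MT_UNCACHED, MT_WRITE_COMBINE, MT_WRITE_BACK };

/* What is being mapped: guest RAM or a device BAR. */
enum region { REGION_RAM, REGION_BAR };

/* Hypothetical result of the decision. */
struct mapping_policy {
    bool supported;         /* false: this setup cannot honour the guest type */
    bool enable_snoop;      /* ask the iommu to force snooping on DMA */
    enum memtype host_type; /* memory type to use for the mapping */
};

struct mapping_policy choose_policy(bool iommu_snoop, bool have_kvm,
                                    enum region region,
                                    enum memtype guest_type)
{
    /* Default: reflect whatever memory type the guest asked for. */
    struct mapping_policy p = {
        .supported = true,
        .enable_snoop = false,
        .host_type = guest_type,
    };

    if (iommu_snoop) {
        /* Case 1: RAM can stay write-back as long as snoop is enabled.
         * BARs keep the guest's page attributes (the default above). */
        if (region == REGION_RAM) {
            p.enable_snoop = true;
            p.host_type = MT_WRITE_BACK;
        }
        return p;
    }

    /* Case 2: no snoop control, so the guest memory type must be
     * respected; assumed here to be impossible without kvm. */
    if (!have_kvm)
        p.supported = false;
    return p;
}
```

In this sketch a write-combine BAR mapping is passed through unchanged
in both cases, which is where the performance benefit would come from.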

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.