From: Don Dutile
Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
Date: Tue, 30 Apr 2013 14:13:39 -0400
Message-ID: <518009D3.2050304@redhat.com>
References: <9F6FE96B71CF29479FF1CDC8046E15035BE0A3@039-SN1MPN1-002.039d.mgd.msft.net>
 <1366736189.2918.573.camel@bling.home>
 <9F6FE96B71CF29479FF1CDC8046E15035BE2BD@039-SN1MPN1-002.039d.mgd.msft.net>
 <1366746427.2918.650.camel@bling.home>
 <51783553.80202@redhat.com>
 <5179ACE8.2030506@redhat.com>
 <20130430172849.GB22752@phenom.dumpdata.com>
In-Reply-To: <20130430172849.GB22752@phenom.dumpdata.com>
To: Konrad Rzeszutek Wilk
Cc: Yoder Stuart-B08248, iommu@lists.linux-foundation.org
List-Id: iommu@lists.linux-foundation.org

On 04/30/2013 01:28 PM, Konrad Rzeszutek Wilk wrote:
> On Sat, Apr 27, 2013 at 12:22:28PM +0800, Andrew Cooks wrote:
>> On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile wrote:
>>> On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: iommu-bounces@lists.linux-foundation.org
>>>>> [mailto:iommu-bounces@lists.linux-foundation.org] On Behalf Of Don Dutile
>>>>> Sent: Thursday, April 25, 2013 1:11 AM
>>>>> To: Alex Williamson
>>>>> Cc: Yoder Stuart-B08248; iommu@lists.linux-foundation.org
>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>
>>>>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
>>>>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>>>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>>>>>>> To: Yoder Stuart-B08248
>>>>>>>> Cc: Joerg Roedel; iommu@lists.linux-foundation.org
>>>>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>>>>
>>>>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>>>
>>>>>>>>> Joerg/Alex,
>>>>>>>>>
>>>>>>>>> We have embedded systems where we use QEMU/KVM and have the
>>>>>>>>> requirement to do device assignment, but have no iommu.  So we
>>>>>>>>> would like to get vfio-pci working on systems like this.
>>>>>>>>>
>>>>>>>>> We're aware of the obvious limitations -- no protection, DMA-able
>>>>>>>>> memory must be physically contiguous, and there will be no
>>>>>>>>> iova->phys translation.  But there are use cases where all OSes
>>>>>>>>> involved are trusted and customers can live with those limitations.
>>>>>>>>> Virtualization is used here not to sandbox untrusted code, but to
>>>>>>>>> consolidate multiple OSes.
>>>>>>>>>
>>>>>>>>> We would like to get your feedback on the rough idea.  There are
>>>>>>>>> two parts -- the iommu driver and vfio-pci.
>>>>>>>>>
>>>>>>>>> 1. iommu driver
>>>>>>>>>
>>>>>>>>> First, we still need device groups created because vfio is based on
>>>>>>>>> that, so we envision a 'dummy' iommu driver that implements only
>>>>>>>>> the add/remove device ops.  Something like:
>>>>>>>>>
>>>>>>>>> static struct iommu_ops fsl_none_ops = {
>>>>>>>>>         .add_device    = fsl_none_add_device,
>>>>>>>>>         .remove_device = fsl_none_remove_device,
>>>>>>>>> };
>>>>>>>>>
>>>>>>>>> int fsl_iommu_none_init(void)
>>>>>>>>> {
>>>>>>>>>         int ret;
>>>>>>>>>
>>>>>>>>>         ret = iommu_init_mempool();
>>>>>>>>>         if (ret)
>>>>>>>>>                 return ret;
>>>>>>>>>
>>>>>>>>>         bus_set_iommu(&platform_bus_type, &fsl_none_ops);
>>>>>>>>>         bus_set_iommu(&pci_bus_type, &fsl_none_ops);
>>>>>>>>>
>>>>>>>>>         return ret;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> 2. vfio-pci
>>>>>>>>>
>>>>>>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>>>>>>> unchanged.  User space will have to follow the semantics of mapping
>>>>>>>>> only physically contiguous chunks...and iova will equal phys.
>>>>>>>>>
>>>>>>>>> So, we propose to implement a new vfio iommu type, called
>>>>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
>>>>>>>>> but there are no calls into the iommu layer...e.g. map_dma() is a
>>>>>>>>> no-op.
>>>>>>>>>
>>>>>>>>> We would like your feedback.
>>>>>>>>
>>>>>>>> My first thought is that this really detracts from vfio and iommu
>>>>>>>> groups being a secure interface, so somehow this needs to be clearly
>>>>>>>> an insecure mode that requires an opt-in and maybe taints the
>>>>>>>> kernel.  Any notion of unprivileged use needs to be blocked, and it
>>>>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
>>>>>>>> critical access points.  We might even have interfaces exported that
>>>>>>>> would allow this to be an out-of-tree driver (worth a check).
>>>>>>>>
>>>>>>>> I would guess that you would probably want to do all the iommu group
>>>>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
>>>>>>>> both creates the fake groups and provides the dummy iommu backend for
>>>>>>>> vfio.  That would be a nice way to compartmentalize this as a
>>>>>>>> vfio-noiommu-special.
>>>>>>>
>>>>>>> So you mean don't implement any of the iommu driver ops at all and
>>>>>>> keep everything in the vfio layer?
>>>>>>>
>>>>>>> Would you still have real iommu groups?  i.e.
>>>>>>>    $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>>>>>>>    ../../../../kernel/iommu_groups/26
>>>>>>>
>>>>>>> ...and is that created by vfio-noiommu-special?
>>>>>>
>>>>>> I'm suggesting (but haven't checked if it's possible) to implement
>>>>>> the iommu driver ops as part of the vfio iommu backend driver.  The
>>>>>> primary motivation for this would be to a) keep a fake iommu groups
>>>>>> interface out of the iommu proper (possibly containing it in an
>>>>>> external driver) and b) modularize it so we don't have fake iommu
>>>>>> groups being created by default.  It would have to populate the iommu
>>>>>> group sysfs interfaces to be compatible with vfio.
>>>>>>
>>>>>>> Right now when the PCI and platform buses are probed, the iommu
>>>>>>> driver add-device callback gets called, and that is where the
>>>>>>> per-device group gets created.  Are you envisioning registering a
>>>>>>> callback for the PCI bus to do this in vfio-noiommu-special?
>>>>>>
>>>>>> Yes.  It's just as easy to walk all the devices rather than doing
>>>>>> callbacks; IIRC the group code does this when you register.  In fact,
>>>>>> this noiommu interface may not want to add all devices; we may want to
>>>>>> be very selective and only add some.
>>>>>
>>>>> Right.
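
(For concreteness: the add/remove callbacks in Stuart's sketch above
would presumably be little more than per-device group creation and
removal via the standard iommu group API.  A minimal, hypothetical
sketch -- the fsl_none_* names are from Stuart's proposal, but the
bodies below are illustrative, not from any posted patch:

#include <linux/device.h>
#include <linux/err.h>
#include <linux/iommu.h>

static int fsl_none_add_device(struct device *dev)
{
        struct iommu_group *group;
        int ret;

        /* No iommu means no real isolation domains: one group per device. */
        group = iommu_group_alloc();
        if (IS_ERR(group))
                return PTR_ERR(group);

        ret = iommu_group_add_device(group, dev);
        iommu_group_put(group);         /* the group keeps its own reference */
        return ret;
}

static void fsl_none_remove_device(struct device *dev)
{
        iommu_group_remove_device(dev);
}

That keeps the group plumbing intact, which is all vfio actually
depends on here.)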
>>>>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
>>>>> still leverage/use vfio for qemu's device assignment.
>>>>> Just not sure how to 'taint' it as 'not secure' if a no-iommu driver is
>>>>> put in place.
>>>>>
>>>>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>>>>> so assigned devices are 'remapped' from the system's B:D.F to the virt
>>>>> machine's (virtualized) B:D.F of the assigned device.
>>>>> Are pci cfg cycles trapped in the freescale qemu model?
>>>>>
>>>> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
>>>> the virtual PCI bus (emulated by qemu).
>>>>
>>>> -Varun
>>>>
>>> Understood, but as Alex stated, the whole purpose of VFIO is to
>>> be able to do _secure_, user-level-driven I/O.  Since this would
>>> be 'unsecure', there should be a way to note that during configuration.
>>>
>> Does vfio work with swiotlb, and if not, can/should swiotlb be
>> extended?  Or does the time and space overhead make it a moot point?
>
> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
>
I think you have it reversed: vfio uses the iommu api, not the dma api.
If vfio used the dma api, it could work, since swiotlb is configured as
the default dma-ops interface (though it would need more interfaces...
domain-alloc, etc.).

> It could be extended to use it.  I was toying with this b/c for Xen to
> use VFIO I would have to implement a Xen IOMMU driver that would basically
> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> care of all the hard work of securing each guest).
>
> But your requirement would be the same, so it might as well be a generic
> driver called the SWIOTLB-IOMMU driver.
>
> If you are up for writing it, I am up for reviewing/Ack-ing/etc.
>
> The complexity would be to figure out the VFIO group thing and how to
> assign PCI B:D:F devices to the SWIOTLB-IOMMU driver.  Perhaps the same
> way as xen-pciback does (or pci-stub): by writing the BDF into the "bind"
> attribute in sysfs (or via a kernel parameter).
>
Did uio provide this un-secure support, and does it just need some
attention upstream?
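
FWIW, here is roughly how I could see Stuart's VFIO_TYPE_NONE_IOMMU
backend taking shape -- purely a hypothetical sketch against the
current vfio_iommu_driver_ops, with CAP_SYS_RAWIO standing in for
whatever capability check gets settled on; no such patch has been
posted:

#include <linux/capability.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/vfio.h>

static void *vfio_iommu_none_open(unsigned long arg)
{
        /* Insecure by design: demand a privileged opener up front. */
        if (!capable(CAP_SYS_RAWIO))
                return ERR_PTR(-EPERM);
        pr_warn("vfio-iommu-none: no iommu, assigned devices can DMA anywhere\n");
        return (void *)1;       /* no per-container state needed in a sketch */
}

static void vfio_iommu_none_release(void *iommu_data)
{
}

/* A real driver would also answer VFIO_CHECK_EXTENSION and
 * VFIO_IOMMU_GET_INFO; only the dma calls are shown here. */
static long vfio_iommu_none_ioctl(void *iommu_data,
                                  unsigned int cmd, unsigned long arg)
{
        switch (cmd) {
        case VFIO_IOMMU_MAP_DMA:        /* iova == phys: nothing to program */
        case VFIO_IOMMU_UNMAP_DMA:
                return 0;
        default:
                return -ENOTTY;
        }
}

static int vfio_iommu_none_attach_group(void *iommu_data,
                                        struct iommu_group *group)
{
        return 0;       /* no hardware domain to attach the group to */
}

static void vfio_iommu_none_detach_group(void *iommu_data,
                                         struct iommu_group *group)
{
}

static const struct vfio_iommu_driver_ops vfio_iommu_none_ops = {
        .name           = "vfio-iommu-none",
        .owner          = THIS_MODULE,
        .open           = vfio_iommu_none_open,
        .release        = vfio_iommu_none_release,
        .ioctl          = vfio_iommu_none_ioctl,
        .attach_group   = vfio_iommu_none_attach_group,
        .detach_group   = vfio_iommu_none_detach_group,
};

static int __init vfio_iommu_none_init(void)
{
        return vfio_register_iommu_driver(&vfio_iommu_none_ops);
}
module_init(vfio_iommu_none_init);

The group-creation side (walking the bus and adding only selected
devices, per Alex's note above) could live in this same module, so
nothing insecure gets registered with the iommu core by default.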