* [RFC] Independent use of IOMMU groups
@ 2015-11-05 17:54 Alex Williamson
From: Alex Williamson @ 2015-11-05 17:54 UTC
To: iommu; +Cc: Paolo Bonzini
Hi,
We have a couple things in-flight that are trying to make use of IOMMU
groups, independent of the rest of the IOMMU API. One is the proposed
VFIO No-IOMMU hack that will create an IOMMU group for a non-IOMMU
backed device in order to make it operate within vfio and exposed via
vfio-pci:
https://lkml.org/lkml/2015/11/4/437
This has all the caveats that DMA for the device is unsafe and we taint
the kernel, but at least it provides users that are already doing this
thing with a more featureful interface without duplicating tons of code
into UIO and it gives them a consistent device model to easily move to
when an IOMMU is supported.
When we do this, vfio creates the IOMMU group for the device when it
binds to the vfio bus driver (vfio-pci) and removes it when unbound.
For that period, we own the device and don't interact with the IOMMU API
for any sort of mapping.
Another idea that's floating around is that vfio could actually expose
virtual devices to a user, think for instance vGPUs in a non-SR-IOV
scenario. A struct device is created where portions of the device are
backed directly by some subset of a physical device while other parts
may be emulated by the vfio bus driver. The virtual device needs an
IOMMU group to participate in the vfio framework, but the platform
itself doesn't necessarily need an IOMMU. In this case isolation of the
virtual device might be provided by policing of the user programming of
the device and MMU control on the physical device itself. The vfio
IOMMU backend for such a device would use device specific programming
rather than making use of the IOMMU API.
The IOMMU group address space is global, so creating these groups will
necessarily cause them to appear in /sys/kernel/iommu_groups/, so I want
to make sure there are no objections to these sorts of uses. In one
scenario the device is real, but the IOMMU group is only present when
bound to a driver that knows the usage restrictions, in the other the
device is virtual, created by the driver itself, which is therefore
aware of its restrictions. Comments?
A question I'd expect to be asked is why not create a new bus_type and
register an IOMMU for it? In the no-iommu case, the bus for the device
already exists and does not have an IOMMU present. Registering an IOMMU
for that bus_type risks other devices on that bus_type attempting to do
real IOMMU API tasks. In the virtual case, this is more reasonable
since all the virtual devices could be children on a new bus_type, but
mating a general purpose API to a device that really only requires very
special purpose mappings seems like unnecessary overhead.
If there are any gotchas that I'm missing, please let me know. Thanks,
Alex
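A userspace sketch of the sysfs point in the message above: because the
IOMMU group namespace is global, any group created for a no-iommu or
virtual device would be visible under /sys/kernel/iommu_groups/ like any
other. The function name list_iommu_groups and its root parameter are
invented here for illustration; the only thing assumed about the kernel
is the sysfs layout (one numbered directory per group, each with a
devices/ subdirectory of links to member devices).

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group ID to the sorted names of its member devices.

    `root` is a parameter only so the function can be pointed at a test
    directory; on a real system the default sysfs path is used.
    """
    root_path = Path(root)
    groups = {}
    if not root_path.is_dir():
        # No groups registered, or sysfs is not available here.
        return groups
    for group_dir in root_path.iterdir():
        if not group_dir.name.isdigit():
            continue  # skip anything that is not a numbered group
        devices_dir = group_dir / "devices"
        if devices_dir.is_dir():
            members = sorted(entry.name for entry in devices_dir.iterdir())
        else:
            members = []
        groups[int(group_dir.name)] = members
    return groups


if __name__ == "__main__":
    for group_id, devices in sorted(list_iommu_groups().items()):
        print(f"group {group_id}: {', '.join(devices) or '(empty)'}")
```

Run before and after binding a device to vfio-pci in no-iommu mode, a
listing like this is where the extra group would be expected to appear
and disappear.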
* Re: [RFC] Independent use of IOMMU groups
@ 2015-11-06 12:29 Joerg Roedel
From: Joerg Roedel @ 2015-11-06 12:29 UTC
To: Alex Williamson; +Cc: Paolo Bonzini, iommu

Hi Alex,

On Thu, Nov 05, 2015 at 10:54:39AM -0700, Alex Williamson wrote:
> We have a couple things in-flight that are trying to make use of IOMMU
> groups, independent of the rest of the IOMMU API. One is the proposed
> VFIO No-IOMMU hack that will create an IOMMU group for a non-IOMMU
> backed device in order to make it operate within vfio and exposed via
> vfio-pci:
>
> https://lkml.org/lkml/2015/11/4/437

Do you really need iommu-groups for non-IOMMU vfio backend? VFIO has its
own representation of groups (iirc they map 1-1 to iommu-groups). Can
this concept in VFIO not be made more independent of iommu-groups?

I think having iommu-groups in sysfs without an iommu in the system is
pretty confusing for the user. Not to say that the usual iommu grouping
code makes no sense anymore, as there is no isolation at all :)

Joerg
* Re: [RFC] Independent use of IOMMU groups
@ 2015-11-06 15:35 Alex Williamson
From: Alex Williamson @ 2015-11-06 15:35 UTC
To: Joerg Roedel; +Cc: Paolo Bonzini, iommu

On Fri, 2015-11-06 at 13:29 +0100, Joerg Roedel wrote:
> Hi Alex,
>
> On Thu, Nov 05, 2015 at 10:54:39AM -0700, Alex Williamson wrote:
> > We have a couple things in-flight that are trying to make use of IOMMU
> > groups, independent of the rest of the IOMMU API. One is the proposed
> > VFIO No-IOMMU hack that will create an IOMMU group for a non-IOMMU
> > backed device in order to make it operate within vfio and exposed via
> > vfio-pci:
> >
> > https://lkml.org/lkml/2015/11/4/437
>
> Do you really need iommu-groups for non-IOMMU vfio backend? VFIO has its
> own representation of groups (iirc they map 1-1 to iommu-groups). Can
> this concept in VFIO not be made more independent of iommu-groups?
>
> I think having iommu-groups in sysfs without an iommu in the system is
> pretty confusing for the user. Not to say that the usual iommu grouping
> code makes no sense anymore, as there is no isolation at all :)

Hi Joerg,

VFIO is really built on iommu groups, so making a vfio group independent
of iommu groups is a difficult proposition. With introducing the
no-iommu vfio code, I accept that people are going to run userspace
drivers without iommu protection, regardless of whether it's
supportable. By using the vfio device interface, we're at least pushing
them towards code that does have a supported use case. So my goal there
is to enable no-iommu mode in a way that is compact (I'm only willing to
invest limited lines of code to enable this) and does not undermine the
foundation of vfio.

I also do everything I can to make it clear that this is unsafe, from
the naming of the opt-in module parameter to the tainting of the kernel
when a no-iommu group is created to the dev_warn with that group
creation and later when the device is opened, using a differently named
vfio device node for the group, and allowing only a no-iommu IOMMU
backend for the group. There is no chance that a user can accidentally
operate on a no-iommu vfio group and there are breadcrumbs left behind
even in the normal process of using them. Also, as I mentioned
previously, the lifetime of this no-iommu group is tied to the device
being bound to the vfio driver, so no other drivers would have access to
the iommu group and the user has already had to opt-in their system and
generated a dmesg log and kernel taint before they even get the chance
to be confused by that iommu group. Thanks,
Alex
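A rough userspace illustration of the "differently named vfio device
node" safeguard described in the message above. The thread does not
spell out the node naming; the sketch assumes the convention used by the
no-iommu code as later merged, where regular groups appear as
/dev/vfio/<N> and no-iommu groups as /dev/vfio/noiommu-<N>. The function
name classify_vfio_nodes and its dev_dir parameter are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming: "<N>" for IOMMU-backed groups, "noiommu-<N>" for
# no-iommu groups. This is an assumption, not something the thread states.
_GROUP_NODE = re.compile(r"^(noiommu-)?(\d+)$")


def classify_vfio_nodes(dev_dir="/dev/vfio"):
    """Return {group_id: "iommu" | "noiommu"} for group nodes in dev_dir."""
    kinds = {}
    path = Path(dev_dir)
    if not path.is_dir():
        return kinds
    for entry in path.iterdir():
        match = _GROUP_NODE.match(entry.name)
        if match is None:
            continue  # e.g. the container node /dev/vfio/vfio
        kinds[int(match.group(2))] = "noiommu" if match.group(1) else "iommu"
    return kinds
```

A program that only ever opens nodes classified as "iommu" cannot pick
up a no-iommu group by accident, which is the property the distinct
naming is meant to provide.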
* Re: [RFC] Independent use of IOMMU groups
@ 2015-11-27 15:39 Joerg Roedel
From: Joerg Roedel @ 2015-11-27 15:39 UTC
To: Alex Williamson; +Cc: Paolo Bonzini, iommu

Hi Alex,

On Fri, Nov 06, 2015 at 08:35:40AM -0700, Alex Williamson wrote:
> VFIO is really built on iommu groups, so making a vfio group independent
> of iommu groups is a difficult proposition.

I have been thinking about the relation between vfio device groups and
iommu-groups lately, because at least for PCI the iommu-grouping is too
coarse grained. I ran into this with the default-domain approach I am
working on.

Grouping devices together that have different request-ids (multifunction
and acs based grouping) only makes sense when the device is controlled
by an untrusted piece of software, in our case userspace or a KVM guest.
The device drivers in Linux are trusted, and this coarse grained
grouping becomes problematic, because it forces more devices into a
single domain, which can become a bottleneck for DMA-API allocations.

I have been thinking about moving the multi-function and acs grouping
into vfio code, meaning that a vfio-group contains more than one
iommu-group. The problem with this is that iommu-groups are exposed
in sysfs and thus became a userspace ABI.

So the vfio-group code might need changes anyway which could solve the
above problem too, no? I am just not sure yet what the best way is to
solve it.

Joerg
* Re: [RFC] Independent use of IOMMU groups
@ 2015-12-02 15:58 Alex Williamson
From: Alex Williamson @ 2015-12-02 15:58 UTC
To: Joerg Roedel; +Cc: Paolo Bonzini, iommu

On Fri, 2015-11-27 at 16:39 +0100, Joerg Roedel wrote:
> Hi Alex,
>
> On Fri, Nov 06, 2015 at 08:35:40AM -0700, Alex Williamson wrote:
> > VFIO is really built on iommu groups, so making a vfio group independent
> > of iommu groups is a difficult proposition.
>
> I have been thinking about the relation between vfio device groups and
> iommu-groups lately, because at least for PCI the iommu-grouping is too
> coarse grained. I ran into this with the default-domain approach I am
> working on.
>
> Grouping devices together that have different request-ids (multifunction
> and acs based grouping) only makes sense when the device is controlled
> by an untrusted piece of software, in our case userspace or a KVM guest.
> The device drivers in Linux are trusted, and this coarse grained
> grouping becomes problematic, because it forces more devices into a
> single domain, which can become a bottleneck for DMA-API allocations.
>
> I have been thinking about moving the multi-function and acs grouping
> into vfio code, meaning that a vfio-group contains more than one
> iommu-group. The problem with this is that iommu-groups are exposed
> in sysfs and thus became a userspace ABI.
>
> So the vfio-group code might need changes anyway which could solve the
> above problem too, no? I am just not sure yet what the best way is to
> solve it.

Hi Joerg,

That's a hard one. As you say, iommu groups are really a userspace ABI
and tightly integrated into the mapping of vfio groups, so I don't
really think we have much flexibility in re-defining an iommu group.

The original intent with putting the grouping logic in the iommu drivers
and core code was that vfio isn't smart enough to be able to determine
both the iommu visibility and topology based isolation for any given
architecture. I think that's still true. We don't really want to enable
vfio on platforms that haven't given the issue sufficient consideration
to enable the iommu API for devices.

On the other hand, it doesn't make a whole lot of sense for native
kernel drivers to care about topology based isolation. It would be
preferable to fully isolate a device, but we do manage the IOVA address
space for native drivers, so unintentional peer-to-peer shouldn't really
be a possibility.

It makes sense to me to think about an iommu group as a set of one or
more iommu granules, where each granule is the granularity of the iommu
visibility. A granule probably also has a relation to DMA aliases. So
perhaps the granule encompasses all DMA related visibility issues and
the group is an overlay which takes topology into account when the IOVA
space is defined by the user (and malicious DMA needs to be considered
as well).

So maybe the first step is to create that dividing line and figure out
what granules look like and how we can more explicitly expose groups on
top of them. Easier said than done, I'm sure. Thanks,
Alex

PS - sorry for the delay, was off on holiday.
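The granule/group split proposed in the message above can be modelled
as a toy data structure. This is only a sketch of the idea, not kernel
code: a granule is taken to be the set of devices the IOMMU cannot tell
apart (for example DMA aliases), and a group is the union of granules
that topology fails to isolate from each other (for example
multifunction or missing ACS). The name build_groups and the
unisolated_pairs input are invented for illustration.

```python
def build_groups(granules, unisolated_pairs):
    """Overlay topology on granules to form groups.

    granules: list of frozensets of device names; each frozenset is one
        unit of IOMMU visibility.
    unisolated_pairs: (i, j) index pairs of granules that topology does
        not isolate from one another.
    Returns a list of groups, each a list of granules.
    """
    parent = list(range(len(granules)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in unisolated_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for index, granule in enumerate(granules):
        groups.setdefault(find(index), []).append(granule)
    return list(groups.values())


if __name__ == "__main__":
    granules = [
        frozenset({"0000:01:00.0"}),
        frozenset({"0000:01:00.1"}),
        frozenset({"0000:02:00.0", "0000:02:00.1"}),  # DMA aliases
    ]
    # Functions 01:00.0 and 01:00.1 are assumed not isolated by topology.
    for group in build_groups(granules, [(0, 1)]):
        print([sorted(g) for g in group])
```

In this model a trusted native driver would only need the granule
(three separate domains in the example), while an untrusted user such
as vfio would be handed the coarser group (two groups in the example).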