iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* RFC:  vfio / iommu driver for hardware with no iommu
@ 2013-04-23 16:13 Yoder Stuart-B08248
       [not found] ` <9F6FE96B71CF29479FF1CDC8046E15035BE0A3-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-23 16:13 UTC (permalink / raw)
  To: Joerg Roedel, Alex Williamson
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

Joerg/Alex,

We have embedded systems where we use QEMU/KVM and have
the requirement to do device assignment, but have no
iommu.  So we would like to get vfio-pci working on
systems like this.

We're aware of the obvious limitations-- no protection,
DMA'able memory must be physically contiguous and will
have no iova->phy translation.  But there are use cases
where all OSes involved are trusted and customers can
live with those limitations.   Virtualization is used
here not to sandbox untrusted code, but to consolidate
multiple OSes.

We would like to get your feedback on the rough idea.  There
are two parts-- iommu driver and vfio-pci.

1.  iommu driver

First, we still need device groups created because vfio
is based on that, so we envision a 'dummy' iommu
driver that implements only  the add/remove device
ops.  Something like:

    static struct iommu_ops fsl_none_ops = {
            .add_device     = fsl_none_add_device,
            .remove_device  = fsl_none_remove_device,
    };
    
    int fsl_iommu_none_init()
    {
            int ret = 0;
    
            ret = iommu_init_mempool();
            if (ret)
                    return ret;
    
            bus_set_iommu(&platform_bus_type, &fsl_none_ops);
            bus_set_iommu(&pci_bus_type, &fsl_none_ops);
    
            return ret;
    }

2.  vfio-pci

For vfio-pci, we would ideally like to keep user space mostly
unchanged.  User space will have to follow the semantics
of mapping only physically contiguous chunks...and iova
will equal phys.

So, we propose to implement a new vfio iommu type,
called VFIO_TYPE_NONE_IOMMU.  This implements
any needed vfio interfaces, but there are no calls
to the iommu layer...e.g. map_dma() is a noop.

Would like your feedback.

Thanks,
Stuart Yoder

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC:  vfio / iommu driver for hardware with no iommu
       [not found] ` <9F6FE96B71CF29479FF1CDC8046E15035BE0A3-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
@ 2013-04-23 16:56   ` Alex Williamson
       [not found]     ` <1366736189.2918.573.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
  2013-04-24 10:57   ` Joerg Roedel
  1 sibling, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2013-04-23 16:56 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> Joerg/Alex,
> 
> We have embedded systems where we use QEMU/KVM and have
> the requirement to do device assignment, but have no
> iommu.  So we would like to get vfio-pci working on
> systems like this.
> 
> We're aware of the obvious limitations-- no protection,
> DMA'able memory must be physically contiguous and will
> have no iova->phy translation.  But there are use cases
> where all OSes involved are trusted and customers can
> live with those limitations.   Virtualization is used
> here not to sandbox untrusted code, but to consolidate
> multiple OSes.
> 
> We would like to get your feedback on the rough idea.  There
> are two parts-- iommu driver and vfio-pci.
> 
> 1.  iommu driver
> 
> First, we still need device groups created because vfio
> is based on that, so we envision a 'dummy' iommu
> driver that implements only  the add/remove device
> ops.  Something like:
> 
>     static struct iommu_ops fsl_none_ops = {
>             .add_device     = fsl_none_add_device,
>             .remove_device  = fsl_none_remove_device,
>     };
>     
>     int fsl_iommu_none_init()
>     {
>             int ret = 0;
>     
>             ret = iommu_init_mempool();
>             if (ret)
>                     return ret;
>     
>             bus_set_iommu(&platform_bus_type, &fsl_none_ops);
>             bus_set_iommu(&pci_bus_type, &fsl_none_ops);
>     
>             return ret;
>     }
> 
> 2.  vfio-pci
> 
> For vfio-pci, we would ideally like to keep user space mostly
> unchanged.  User space will have to follow the semantics
> of mapping only physically contiguous chunks...and iova
> will equal phys.
> 
> So, we propose to implement a new vfio iommu type,
> called VFIO_TYPE_NONE_IOMMU.  This implements
> any needed vfio interfaces, but there are no calls
> to the iommu layer...e.g. map_dma() is a noop.
> 
> Would like your feedback.

My first thought is that this really detracts from vfio and iommu groups
being a secure interface, so somehow this needs to be clearly an
insecure mode that requires an opt-in and maybe taints the kernel.  Any
notion of unprivileged use needs to be blocked and it should test
CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
points.  We might even have interfaces exported that would allow this to
be an out-of-tree driver (worth a check).

I would guess that you would probably want to do all the iommu group
setup from the vfio fake-iommu driver.  In other words, that driver both
creates the fake groups and provides the dummy iommu backend for vfio.
That would be a nice way to compartmentalize this as a
vfio-noiommu-special.

Would map/unmap really be no-ops?  Seems like you still want to do page
pinning.  Also, you're using fsl in the example above, but would such a
driver have any platform dependency?  Thanks,

Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]     ` <1366736189.2918.573.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
@ 2013-04-23 18:36       ` Sethi Varun-B16395
  2013-04-23 19:16       ` Yoder Stuart-B08248
  1 sibling, 0 replies; 21+ messages in thread
From: Sethi Varun-B16395 @ 2013-04-23 18:36 UTC (permalink / raw)
  To: Alex Williamson, Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Alex Williamson
> Sent: Tuesday, April 23, 2013 10:26 PM
> To: Yoder Stuart-B08248
> Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> 
> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> > Joerg/Alex,
> >
> > We have embedded systems where we use QEMU/KVM and have the
> > requirement to do device assignment, but have no iommu.  So we would
> > like to get vfio-pci working on systems like this.
> >
> > We're aware of the obvious limitations-- no protection, DMA'able
> > memory must be physically contiguous and will have no iova->phy
> > translation.  But there are use cases where all OSes involved are
> > trusted and customers can
> > live with those limitations.   Virtualization is used
> > here not to sandbox untrusted code, but to consolidate multiple OSes.
> >
> > We would like to get your feedback on the rough idea.  There are two
> > parts-- iommu driver and vfio-pci.
> >
> > 1.  iommu driver
> >
> > First, we still need device groups created because vfio is based on
> > that, so we envision a 'dummy' iommu driver that implements only  the
> > add/remove device ops.  Something like:
> >
> >     static struct iommu_ops fsl_none_ops = {
> >             .add_device     = fsl_none_add_device,
> >             .remove_device  = fsl_none_remove_device,
> >     };
> >
> >     int fsl_iommu_none_init()
> >     {
> >             int ret = 0;
> >
> >             ret = iommu_init_mempool();
> >             if (ret)
> >                     return ret;
> >
> >             bus_set_iommu(&platform_bus_type, &fsl_none_ops);
> >             bus_set_iommu(&pci_bus_type, &fsl_none_ops);
> >
> >             return ret;
> >     }
> >
> > 2.  vfio-pci
> >
> > For vfio-pci, we would ideally like to keep user space mostly
> > unchanged.  User space will have to follow the semantics of mapping
> > only physically contiguous chunks...and iova will equal phys.
> >
> > So, we propose to implement a new vfio iommu type, called
> > VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces, but
> > there are no calls to the iommu layer...e.g. map_dma() is a noop.
> >
> > Would like your feedback.
> 
> My first thought is that this really detracts from vfio and iommu groups
> being a secure interface, so somehow this needs to be clearly an insecure
> mode that requires an opt-in and maybe taints the kernel.  Any notion of
> unprivileged use needs to be blocked and it should test
> CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
> points.  We might even have interfaces exported that would allow this to
> be an out-of-tree driver (worth a check).
> 
> I would guess that you would probably want to do all the iommu group
> setup from the vfio fake-iommu driver.  In other words, that driver both
> creates the fake groups and provides the dummy iommu backend for vfio.
> That would be a nice way to compartmentalize this as a vfio-noiommu-
> special.
> 
[Sethi Varun-B16395] Yes, we would be doing device group creation in the dummy iommu driver

> Would map/unmap really be no-ops?  Seems like you still want to do page
> pinning.  Also, you're using fsl in the example above, but would such a
> driver have any platform dependency?  Thanks,

[Sethi Varun-B16395] map/unmap ioctls would be supported for pinning guest pages. Yes, there would be a platform dependency for the driver.

-Varun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]     ` <1366736189.2918.573.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
  2013-04-23 18:36       ` Sethi Varun-B16395
@ 2013-04-23 19:16       ` Yoder Stuart-B08248
       [not found]         ` <9F6FE96B71CF29479FF1CDC8046E15035BE2BD-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
  1 sibling, 1 reply; 21+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-23 19:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> Sent: Tuesday, April 23, 2013 11:56 AM
> To: Yoder Stuart-B08248
> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> 
> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> > Joerg/Alex,
> >
> > We have embedded systems where we use QEMU/KVM and have
> > the requirement to do device assignment, but have no
> > iommu.  So we would like to get vfio-pci working on
> > systems like this.
> >
> > We're aware of the obvious limitations-- no protection,
> > DMA'able memory must be physically contiguous and will
> > have no iova->phy translation.  But there are use cases
> > where all OSes involved are trusted and customers can
> > live with those limitations.   Virtualization is used
> > here not to sandbox untrusted code, but to consolidate
> > multiple OSes.
> >
> > We would like to get your feedback on the rough idea.  There
> > are two parts-- iommu driver and vfio-pci.
> >
> > 1.  iommu driver
> >
> > First, we still need device groups created because vfio
> > is based on that, so we envision a 'dummy' iommu
> > driver that implements only  the add/remove device
> > ops.  Something like:
> >
> >     static struct iommu_ops fsl_none_ops = {
> >             .add_device     = fsl_none_add_device,
> >             .remove_device  = fsl_none_remove_device,
> >     };
> >
> >     int fsl_iommu_none_init()
> >     {
> >             int ret = 0;
> >
> >             ret = iommu_init_mempool();
> >             if (ret)
> >                     return ret;
> >
> >             bus_set_iommu(&platform_bus_type, &fsl_none_ops);
> >             bus_set_iommu(&pci_bus_type, &fsl_none_ops);
> >
> >             return ret;
> >     }
> >
> > 2.  vfio-pci
> >
> > For vfio-pci, we would ideally like to keep user space mostly
> > unchanged.  User space will have to follow the semantics
> > of mapping only physically contiguous chunks...and iova
> > will equal phys.
> >
> > So, we propose to implement a new vfio iommu type,
> > called VFIO_TYPE_NONE_IOMMU.  This implements
> > any needed vfio interfaces, but there are no calls
> > to the iommu layer...e.g. map_dma() is a noop.
> >
> > Would like your feedback.
> 
> My first thought is that this really detracts from vfio and iommu groups
> being a secure interface, so somehow this needs to be clearly an
> insecure mode that requires an opt-in and maybe taints the kernel.  Any
> notion of unprivileged use needs to be blocked and it should test
> CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
> points.  We might even have interfaces exported that would allow this to
> be an out-of-tree driver (worth a check).
> 
> I would guess that you would probably want to do all the iommu group
> setup from the vfio fake-iommu driver.  In other words, that driver both
> creates the fake groups and provides the dummy iommu backend for vfio.
> That would be a nice way to compartmentalize this as a
> vfio-noiommu-special.

So you mean don't implement any of the iommu driver
ops at all and keep everything in the vfio layer?

Would you still have real iommu groups?...i.e. 
$ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
../../../../kernel/iommu_groups/26

...and that is created by vfio-noiommu-special?

Right now when the PCI and platform buses are probed,
the iommu driver add-device callback gets called and
that is where the per-device group gets created.  Are
you envisioning registering a callback for the PCI
bus to do this in vfio-noiommu-special?

> Would map/unmap really be no-ops?  Seems like you still want to do page
> pinning.

You're right, that was a bad example...most would be no ops though.

> Also, you're using fsl in the example above, but would such a
> driver have any platform dependency?

This wouldn't have to be fsl specific if we thought it was
potentially generally useful.

Stuart

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]         ` <9F6FE96B71CF29479FF1CDC8046E15035BE2BD-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
@ 2013-04-23 19:47           ` Alex Williamson
       [not found]             ` <1366746427.2918.650.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2013-04-23 19:47 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> > Sent: Tuesday, April 23, 2013 11:56 AM
> > To: Yoder Stuart-B08248
> > Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> > 
> > On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> > > Joerg/Alex,
> > >
> > > We have embedded systems where we use QEMU/KVM and have
> > > the requirement to do device assignment, but have no
> > > iommu.  So we would like to get vfio-pci working on
> > > systems like this.
> > >
> > > We're aware of the obvious limitations-- no protection,
> > > DMA'able memory must be physically contiguous and will
> > > have no iova->phy translation.  But there are use cases
> > > where all OSes involved are trusted and customers can
> > > live with those limitations.   Virtualization is used
> > > here not to sandbox untrusted code, but to consolidate
> > > multiple OSes.
> > >
> > > We would like to get your feedback on the rough idea.  There
> > > are two parts-- iommu driver and vfio-pci.
> > >
> > > 1.  iommu driver
> > >
> > > First, we still need device groups created because vfio
> > > is based on that, so we envision a 'dummy' iommu
> > > driver that implements only  the add/remove device
> > > ops.  Something like:
> > >
> > >     static struct iommu_ops fsl_none_ops = {
> > >             .add_device     = fsl_none_add_device,
> > >             .remove_device  = fsl_none_remove_device,
> > >     };
> > >
> > >     int fsl_iommu_none_init()
> > >     {
> > >             int ret = 0;
> > >
> > >             ret = iommu_init_mempool();
> > >             if (ret)
> > >                     return ret;
> > >
> > >             bus_set_iommu(&platform_bus_type, &fsl_none_ops);
> > >             bus_set_iommu(&pci_bus_type, &fsl_none_ops);
> > >
> > >             return ret;
> > >     }
> > >
> > > 2.  vfio-pci
> > >
> > > For vfio-pci, we would ideally like to keep user space mostly
> > > unchanged.  User space will have to follow the semantics
> > > of mapping only physically contiguous chunks...and iova
> > > will equal phys.
> > >
> > > So, we propose to implement a new vfio iommu type,
> > > called VFIO_TYPE_NONE_IOMMU.  This implements
> > > any needed vfio interfaces, but there are no calls
> > > to the iommu layer...e.g. map_dma() is a noop.
> > >
> > > Would like your feedback.
> > 
> > My first thought is that this really detracts from vfio and iommu groups
> > being a secure interface, so somehow this needs to be clearly an
> > insecure mode that requires an opt-in and maybe taints the kernel.  Any
> > notion of unprivileged use needs to be blocked and it should test
> > CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
> > points.  We might even have interfaces exported that would allow this to
> > be an out-of-tree driver (worth a check).
> > 
> > I would guess that you would probably want to do all the iommu group
> > setup from the vfio fake-iommu driver.  In other words, that driver both
> > creates the fake groups and provides the dummy iommu backend for vfio.
> > That would be a nice way to compartmentalize this as a
> > vfio-noiommu-special.
> 
> So you mean don't implement any of the iommu driver
> ops at all and keep everything in the vfio layer?
> 
> Would you still have real iommu groups?...i.e. 
> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> ../../../../kernel/iommu_groups/26
> 
> ...and that is created by vfio-noiommu-special?

I'm suggesting (but haven't checked if it's possible), to implement the
iommu driver ops as part of the vfio iommu backend driver.  The primary
motivation for this would be to a) keep a fake iommu groups interface
out of the iommu proper (possibly containing it in an external driver)
and b) modularizing it so we don't have fake iommu groups being created
by default.  It would have to populate the iommu groups sysfs interfaces
to be compatible with vfio.

> Right now when the PCI and platform buses are probed,
> the iommu driver add-device callback gets called and
> that is where the per-device group gets created.  Are
> you envisioning registering a callback for the PCI
> bus to do this in vfio-noiommu-special?

Yes.  It's just as easy to walk all the devices rather than doing
callbacks, iirc the group code does this when you register.  In fact,
this noiommu interface may not want to add all devices, we may want to
be very selective and only add some.

> > Would map/unmap really be no-ops?  Seems like you still want to do page
> > pinning.
> 
> You're right, that was a bad example...most would be no ops though.
> 
> > Also, you're using fsl in the example above, but would such a
> > driver have any platform dependency?
> 
> This wouldn't have to be fsl specific if we thought it was
> potentially generally useful.

Thanks,
Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC:  vfio / iommu driver for hardware with no iommu
       [not found] ` <9F6FE96B71CF29479FF1CDC8046E15035BE0A3-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
  2013-04-23 16:56   ` Alex Williamson
@ 2013-04-24 10:57   ` Joerg Roedel
       [not found]     ` <20130424105718.GJ17148-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  1 sibling, 1 reply; 21+ messages in thread
From: Joerg Roedel @ 2013-04-24 10:57 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Tue, Apr 23, 2013 at 04:13:00PM +0000, Yoder Stuart-B08248 wrote:
> We're aware of the obvious limitations-- no protection,
> DMA'able memory must be physically contiguous and will
> have no iova->phy translation.  But there are use cases
> where all OSes involved are trusted and customers can
> live with those limitations.   Virtualization is used
> here not to sandbox untrusted code, but to consolidate
> multiple OSes.

One of the major points of VFIO is to provide a userspace interface for
hardware IOMMUs. So if you have a platform without an IOMMU why do you
care about VFIO at all?


	Joerg

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]     ` <20130424105718.GJ17148-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2013-04-24 11:04       ` Bhushan Bharat-R65777
       [not found]         ` <6A3DF150A5B70D4F9B66A25E3F7C888D06FF5799-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
  2013-04-24 11:52       ` Sethi Varun-B16395
  1 sibling, 1 reply; 21+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-04-24 11:04 UTC (permalink / raw)
  To: Joerg Roedel, Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Joerg Roedel
> Sent: Wednesday, April 24, 2013 4:27 PM
> To: Yoder Stuart-B08248
> Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> 
> On Tue, Apr 23, 2013 at 04:13:00PM +0000, Yoder Stuart-B08248 wrote:
> > We're aware of the obvious limitations-- no protection, DMA'able
> > memory must be physically contiguous and will have no iova->phy
> > translation.  But there are use cases where all OSes involved are
> > trusted and customers can
> > live with those limitations.   Virtualization is used
> > here not to sandbox untrusted code, but to consolidate multiple OSes.
> 
> One of the major points of VFIO is to provide a userspace interface for hardware
> IOMMUs. So if you have a platform without an IOMMU why do you care about VFIO at
> all?

We want to do direct device assignment to user space.
So if the device is behind iommu then it will be a secure interface.
if device is not behind a iommu then it is insecure. But the user space can access the device. This way we can be consistent with one mechanism to do direct device assignment.

Do you suggest that we should use UIO or some other mechanism for non iommu devices ?

Thanks
-Bharat

> 
> 
> 	Joerg
> 
> 
> _______________________________________________
> iommu mailing list
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]     ` <20130424105718.GJ17148-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2013-04-24 11:04       ` Bhushan Bharat-R65777
@ 2013-04-24 11:52       ` Sethi Varun-B16395
  1 sibling, 0 replies; 21+ messages in thread
From: Sethi Varun-B16395 @ 2013-04-24 11:52 UTC (permalink / raw)
  To: Joerg Roedel, Yoder Stuart-B08248
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Joerg Roedel
> Sent: Wednesday, April 24, 2013 4:27 PM
> To: Yoder Stuart-B08248
> Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> 
> On Tue, Apr 23, 2013 at 04:13:00PM +0000, Yoder Stuart-B08248 wrote:
> > We're aware of the obvious limitations-- no protection, DMA'able
> > memory must be physically contiguous and will have no iova->phy
> > translation.  But there are use cases where all OSes involved are
> > trusted and customers can
> > live with those limitations.   Virtualization is used
> > here not to sandbox untrusted code, but to consolidate multiple OSes.
> 
> One of the major points of VFIO is to provide a userspace interface for
> hardware IOMMUs. So if you have a platform without an IOMMU why do you
> care about VFIO at all?
> 
Agreed, but vfio also provides a standardized interface for direct device assignment under KVM. 

-Varun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]         ` <6A3DF150A5B70D4F9B66A25E3F7C888D06FF5799-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
@ 2013-04-24 15:22           ` Yoder Stuart-B08248
  0 siblings, 0 replies; 21+ messages in thread
From: Yoder Stuart-B08248 @ 2013-04-24 15:22 UTC (permalink / raw)
  To: Bhushan Bharat-R65777, Joerg Roedel
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: Bhushan Bharat-R65777
> Sent: Wednesday, April 24, 2013 6:04 AM
> To: Joerg Roedel; Yoder Stuart-B08248
> Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: RE: RFC: vfio / iommu driver for hardware with no iommu
> 
> 
> 
> > -----Original Message-----
> > From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> > bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Joerg Roedel
> > Sent: Wednesday, April 24, 2013 4:27 PM
> > To: Yoder Stuart-B08248
> > Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> >
> > On Tue, Apr 23, 2013 at 04:13:00PM +0000, Yoder Stuart-B08248 wrote:
> > > We're aware of the obvious limitations-- no protection, DMA'able
> > > memory must be physically contiguous and will have no iova->phy
> > > translation.  But there are use cases where all OSes involved are
> > > trusted and customers can
> > > live with those limitations.   Virtualization is used
> > > here not to sandbox untrusted code, but to consolidate multiple OSes.
> >
> > One of the major points of VFIO is to provide a userspace interface for hardware
> > IOMMUs. So if you have a platform without an IOMMU why do you care about VFIO at
> > all?
> 
> We want to do direct device assignment to user space.
> So if the device is behind iommu then it will be a secure interface.
> if device is not behind a iommu then it is insecure. But the user space can access the device. This way
> we can be consistent with one mechanism to do direct device assignment.

And more specifically, there's the desire to assign things like
a PCI device to a KVM virtual machine on a platform without
an iommu.   QEMU uses vfio-pci and we don't want to implement
a completely separate user space approach to do this.  We want
QEMU to stay (mostly) unchanged.

Stuart

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]             ` <1366746427.2918.650.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
@ 2013-04-24 19:41               ` Don Dutile
       [not found]                 ` <51783553.80202-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Don Dutile @ 2013-04-24 19:41 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/23/2013 03:47 PM, Alex Williamson wrote:
> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>
>>> -----Original Message-----
>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>> To: Yoder Stuart-B08248
>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>
>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>> Joerg/Alex,
>>>>
>>>> We have embedded systems where we use QEMU/KVM and have
>>>> the requirement to do device assignment, but have no
>>>> iommu.  So we would like to get vfio-pci working on
>>>> systems like this.
>>>>
>>>> We're aware of the obvious limitations-- no protection,
>>>> DMA'able memory must be physically contiguous and will
>>>> have no iova->phy translation.  But there are use cases
>>>> where all OSes involved are trusted and customers can
>>>> live with those limitations.   Virtualization is used
>>>> here not to sandbox untrusted code, but to consolidate
>>>> multiple OSes.
>>>>
>>>> We would like to get your feedback on the rough idea.  There
>>>> are two parts-- iommu driver and vfio-pci.
>>>>
>>>> 1.  iommu driver
>>>>
>>>> First, we still need device groups created because vfio
>>>> is based on that, so we envision a 'dummy' iommu
>>>> driver that implements only  the add/remove device
>>>> ops.  Something like:
>>>>
>>>>      static struct iommu_ops fsl_none_ops = {
>>>>              .add_device     = fsl_none_add_device,
>>>>              .remove_device  = fsl_none_remove_device,
>>>>      };
>>>>
>>>>      int fsl_iommu_none_init()
>>>>      {
>>>>              int ret = 0;
>>>>
>>>>              ret = iommu_init_mempool();
>>>>              if (ret)
>>>>                      return ret;
>>>>
>>>>              bus_set_iommu(&platform_bus_type,&fsl_none_ops);
>>>>              bus_set_iommu(&pci_bus_type,&fsl_none_ops);
>>>>
>>>>              return ret;
>>>>      }
>>>>
>>>> 2.  vfio-pci
>>>>
>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>> unchanged.  User space will have to follow the semantics
>>>> of mapping only physically contiguous chunks...and iova
>>>> will equal phys.
>>>>
>>>> So, we propose to implement a new vfio iommu type,
>>>> called VFIO_TYPE_NONE_IOMMU.  This implements
>>>> any needed vfio interfaces, but there are no calls
>>>> to the iommu layer...e.g. map_dma() is a noop.
>>>>
>>>> Would like your feedback.
>>>
>>> My first thought is that this really detracts from vfio and iommu groups
>>> being a secure interface, so somehow this needs to be clearly an
>>> insecure mode that requires an opt-in and maybe taints the kernel.  Any
>>> notion of unprivileged use needs to be blocked and it should test
>>> CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
>>> points.  We might even have interfaces exported that would allow this to
>>> be an out-of-tree driver (worth a check).
>>>
>>> I would guess that you would probably want to do all the iommu group
>>> setup from the vfio fake-iommu driver.  In other words, that driver both
>>> creates the fake groups and provides the dummy iommu backend for vfio.
>>> That would be a nice way to compartmentalize this as a
>>> vfio-noiommu-special.
>>
>> So you mean don't implement any of the iommu driver
>> ops at all and keep everything in the vfio layer?
>>
>> Would you still have real iommu groups?...i.e.
>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>> ../../../../kernel/iommu_groups/26
>>
>> ...and that is created by vfio-noiommu-special?
>
> I'm suggesting (but haven't checked if it's possible), to implement the
> iommu driver ops as part of the vfio iommu backend driver.  The primary
> motivation for this would be to a) keep a fake iommu groups interface
> out of the iommu proper (possibly containing it in an external driver)
> and b) modularizing it so we don't have fake iommu groups being created
> by default.  It would have to populate the iommu groups sysfs interfaces
> to be compatible with vfio.
>
>> Right now when the PCI and platform buses are probed,
>> the iommu driver add-device callback gets called and
>> that is where the per-device group gets created.  Are
>> you envisioning registering a callback for the PCI
>> bus to do this in vfio-noiommu-special?
>
> Yes.  It's just as easy to walk all the devices rather than doing
> callbacks, iirc the group code does this when you register.  In fact,
> this noiommu interface may not want to add all devices, we may want to
> be very selective and only add some.
>
Right.
Sounds like a no-iommu driver is needed to leave vfio unaffected,
and still leverage/use vfio for qemu's device assignment.
Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in place.

btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
        so assigned devices are 'remapped' from system-B:D.F to virt-machine's
        (virtualized) B:D.F of the assigned device.
        Are pci-cfg cycles trapped in freescale qemu model ?

>>> Would map/unmap really be no-ops?  Seems like you still want to do page
>>> pinning.
>>
>> You're right, that was a bad example...most would be no ops though.
>>
>>> Also, you're using fsl in the example above, but would such a
>>> driver have any platform dependency?
>>
>> This wouldn't have to be fsl specific if we thought it was
>> potentially generally useful.
>
> Thanks,
> Alex
>
>
> _______________________________________________
> iommu mailing list
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]                 ` <51783553.80202-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-04-25  2:49                   ` Sethi Varun-B16395
       [not found]                     ` <C5ECD7A89D1DC44195F34B25E172658D4BA91B-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Sethi Varun-B16395 @ 2013-04-25  2:49 UTC (permalink / raw)
  To: Don Dutile, Alex Williamson
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
> Sent: Thursday, April 25, 2013 1:11 AM
> To: Alex Williamson
> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> 
> On 04/23/2013 03:47 PM, Alex Williamson wrote:
> > On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> >>
> >>> -----Original Message-----
> >>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> >>> Sent: Tuesday, April 23, 2013 11:56 AM
> >>> To: Yoder Stuart-B08248
> >>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> >>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> >>>
> >>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> >>>> Joerg/Alex,
> >>>>
> >>>> We have embedded systems where we use QEMU/KVM and have the
> >>>> requirement to do device assignment, but have no iommu.  So we
> >>>> would like to get vfio-pci working on systems like this.
> >>>>
> >>>> We're aware of the obvious limitations-- no protection, DMA'able
> >>>> memory must be physically contiguous and will have no iova->phy
> >>>> translation.  But there are use cases where all OSes involved are
> >>>> trusted and customers can
> >>>> live with those limitations.   Virtualization is used
> >>>> here not to sandbox untrusted code, but to consolidate multiple
> >>>> OSes.
> >>>>
> >>>> We would like to get your feedback on the rough idea.  There are
> >>>> two parts-- iommu driver and vfio-pci.
> >>>>
> >>>> 1.  iommu driver
> >>>>
> >>>> First, we still need device groups created because vfio is based on
> >>>> that, so we envision a 'dummy' iommu driver that implements only
> >>>> the add/remove device ops.  Something like:
> >>>>
> >>>>      static struct iommu_ops fsl_none_ops = {
> >>>>              .add_device     = fsl_none_add_device,
> >>>>              .remove_device  = fsl_none_remove_device,
> >>>>      };
> >>>>
> >>>>      int fsl_iommu_none_init()
> >>>>      {
> >>>>              int ret = 0;
> >>>>
> >>>>              ret = iommu_init_mempool();
> >>>>              if (ret)
> >>>>                      return ret;
> >>>>
> >>>>              bus_set_iommu(&platform_bus_type,&fsl_none_ops);
> >>>>              bus_set_iommu(&pci_bus_type,&fsl_none_ops);
> >>>>
> >>>>              return ret;
> >>>>      }
> >>>>
> >>>> 2.  vfio-pci
> >>>>
> >>>> For vfio-pci, we would ideally like to keep user space mostly
> >>>> unchanged.  User space will have to follow the semantics of mapping
> >>>> only physically contiguous chunks...and iova will equal phys.
> >>>>
> >>>> So, we propose to implement a new vfio iommu type, called
> >>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
> >>>> but there are no calls to the iommu layer...e.g. map_dma() is a
> >>>> noop.
> >>>>
> >>>> Would like your feedback.
> >>>
> >>> My first thought is that this really detracts from vfio and iommu
> >>> groups being a secure interface, so somehow this needs to be clearly
> >>> an insecure mode that requires an opt-in and maybe taints the
> >>> kernel.  Any notion of unprivileged use needs to be blocked and it
> >>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
> >>> critical access points.  We might even have interfaces exported that
> >>> would allow this to be an out-of-tree driver (worth a check).
> >>>
> >>> I would guess that you would probably want to do all the iommu group
> >>> setup from the vfio fake-iommu driver.  In other words, that driver
> >>> both creates the fake groups and provides the dummy iommu backend for
> vfio.
> >>> That would be a nice way to compartmentalize this as a
> >>> vfio-noiommu-special.
> >>
> >> So you mean don't implement any of the iommu driver ops at all and
> >> keep everything in the vfio layer?
> >>
> >> Would you still have real iommu groups?...i.e.
> >> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> >> ../../../../kernel/iommu_groups/26
> >>
> >> ...and that is created by vfio-noiommu-special?
> >
> > I'm suggesting (but haven't checked if it's possible), to implement
> > the iommu driver ops as part of the vfio iommu backend driver.  The
> > primary motivation for this would be to a) keep a fake iommu groups
> > interface out of the iommu proper (possibly containing it in an
> > external driver) and b) modularizing it so we don't have fake iommu
> > groups being created by default.  It would have to populate the iommu
> > groups sysfs interfaces to be compatible with vfio.
> >
> >> Right now when the PCI and platform buses are probed, the iommu
> >> driver add-device callback gets called and that is where the
> >> per-device group gets created.  Are you envisioning registering a
> >> callback for the PCI bus to do this in vfio-noiommu-special?
> >
> > Yes.  It's just as easy to walk all the devices rather than doing
> > callbacks, iirc the group code does this when you register.  In fact,
> > this noiommu interface may not want to add all devices, we may want to
> > be very selective and only add some.
> >
> Right.
> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
> still leverage/use vfio for qemu's device assignment.
> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
> place.
> 
> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>         so assigned devices are 'remapped' from system-B:D.F to virt-
> machine's
>         (virtualized) B:D.F of the assigned device.
>         Are pci-cfg cycles trapped in freescale qemu model ?
> 
The vfio-pci device would be visible (to a KVM guest) as a PCI device on the virtual PCI bus (emulated by qemu).

-Varun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC:  vfio / iommu driver for hardware with no iommu
       [not found]                     ` <C5ECD7A89D1DC44195F34B25E172658D4BA91B-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
@ 2013-04-25 22:23                       ` Don Dutile
       [not found]                         ` <5179ACE8.2030506-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Don Dutile @ 2013-04-25 22:23 UTC (permalink / raw)
  To: Sethi Varun-B16395
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
>
>
>> -----Original Message-----
>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
>> Sent: Thursday, April 25, 2013 1:11 AM
>> To: Alex Williamson
>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>
>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>>>> To: Yoder Stuart-B08248
>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>
>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>>>> Joerg/Alex,
>>>>>>
>>>>>> We have embedded systems where we use QEMU/KVM and have the
>>>>>> requirement to do device assignment, but have no iommu.  So we
>>>>>> would like to get vfio-pci working on systems like this.
>>>>>>
>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
>>>>>> memory must be physically contiguous and will have no iova->phy
>>>>>> translation.  But there are use cases where all OSes involved are
>>>>>> trusted and customers can
>>>>>> live with those limitations.   Virtualization is used
>>>>>> here not to sandbox untrusted code, but to consolidate multiple
>>>>>> OSes.
>>>>>>
>>>>>> We would like to get your feedback on the rough idea.  There are
>>>>>> two parts-- iommu driver and vfio-pci.
>>>>>>
>>>>>> 1.  iommu driver
>>>>>>
>>>>>> First, we still need device groups created because vfio is based on
>>>>>> that, so we envision a 'dummy' iommu driver that implements only
>>>>>> the add/remove device ops.  Something like:
>>>>>>
>>>>>>       static struct iommu_ops fsl_none_ops = {
>>>>>>               .add_device     = fsl_none_add_device,
>>>>>>               .remove_device  = fsl_none_remove_device,
>>>>>>       };
>>>>>>
>>>>>>       int fsl_iommu_none_init()
>>>>>>       {
>>>>>>               int ret = 0;
>>>>>>
>>>>>>               ret = iommu_init_mempool();
>>>>>>               if (ret)
>>>>>>                       return ret;
>>>>>>
>>>>>>               bus_set_iommu(&platform_bus_type,&fsl_none_ops);
>>>>>>               bus_set_iommu(&pci_bus_type,&fsl_none_ops);
>>>>>>
>>>>>>               return ret;
>>>>>>       }
>>>>>>
>>>>>> 2.  vfio-pci
>>>>>>
>>>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>>>> unchanged.  User space will have to follow the semantics of mapping
>>>>>> only physically contiguous chunks...and iova will equal phys.
>>>>>>
>>>>>> So, we propose to implement a new vfio iommu type, called
>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
>>>>>> noop.
>>>>>>
>>>>>> Would like your feedback.
>>>>>
>>>>> My first thought is that this really detracts from vfio and iommu
>>>>> groups being a secure interface, so somehow this needs to be clearly
>>>>> an insecure mode that requires an opt-in and maybe taints the
>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
>>>>> critical access points.  We might even have interfaces exported that
>>>>> would allow this to be an out-of-tree driver (worth a check).
>>>>>
>>>>> I would guess that you would probably want to do all the iommu group
>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
>>>>> both creates the fake groups and provides the dummy iommu backend for
>> vfio.
>>>>> That would be a nice way to compartmentalize this as a
>>>>> vfio-noiommu-special.
>>>>
>>>> So you mean don't implement any of the iommu driver ops at all and
>>>> keep everything in the vfio layer?
>>>>
>>>> Would you still have real iommu groups?...i.e.
>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>>>> ../../../../kernel/iommu_groups/26
>>>>
>>>> ...and that is created by vfio-noiommu-special?
>>>
>>> I'm suggesting (but haven't checked if it's possible), to implement
>>> the iommu driver ops as part of the vfio iommu backend driver.  The
>>> primary motivation for this would be to a) keep a fake iommu groups
>>> interface out of the iommu proper (possibly containing it in an
>>> external driver) and b) modularizing it so we don't have fake iommu
>>> groups being created by default.  It would have to populate the iommu
>>> groups sysfs interfaces to be compatible with vfio.
>>>
>>>> Right now when the PCI and platform buses are probed, the iommu
>>>> driver add-device callback gets called and that is where the
>>>> per-device group gets created.  Are you envisioning registering a
>>>> callback for the PCI bus to do this in vfio-noiommu-special?
>>>
>>> Yes.  It's just as easy to walk all the devices rather than doing
>>> callbacks, iirc the group code does this when you register.  In fact,
>>> this noiommu interface may not want to add all devices, we may want to
>>> be very selective and only add some.
>>>
>> Right.
>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
>> still leverage/use vfio for qemu's device assignment.
>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
>> place.
>>
>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>>          so assigned devices are 'remapped' from system-B:D.F to virt-
>> machine's
>>          (virtualized) B:D.F of the assigned device.
>>          Are pci-cfg cycles trapped in freescale qemu model ?
>>
> The vfio-pci device would be visible (to a KVM guest) as a PCI device on the virtual PCI bus (emulated by qemu).
>
> -Varun
>
Understood, but as Alex stated, the whole purpose of VFIO is to
be able to do _secure_, user-level-driven I/O.  Since this would
be 'unsecure', there should be a way to note that during configuration.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                         ` <5179ACE8.2030506-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-04-27  4:22                           ` Andrew Cooks
  2013-04-30 17:28                             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooks @ 2013-04-27  4:22 UTC (permalink / raw)
  To: Don Dutile
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Yoder Stuart-B08248

On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile <ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
>>
>>
>>
>>> -----Original Message-----
>>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
>>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
>>> Sent: Thursday, April 25, 2013 1:11 AM
>>> To: Alex Williamson
>>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>
>>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
>>>>
>>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>>>>> To: Yoder Stuart-B08248
>>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>>
>>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>
>>>>>>> Joerg/Alex,
>>>>>>>
>>>>>>> We have embedded systems where we use QEMU/KVM and have the
>>>>>>> requirement to do device assignment, but have no iommu.  So we
>>>>>>> would like to get vfio-pci working on systems like this.
>>>>>>>
>>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
>>>>>>> memory must be physically contiguous and will have no iova->phy
>>>>>>> translation.  But there are use cases where all OSes involved are
>>>>>>> trusted and customers can
>>>>>>> live with those limitations.   Virtualization is used
>>>>>>> here not to sandbox untrusted code, but to consolidate multiple
>>>>>>> OSes.
>>>>>>>
>>>>>>> We would like to get your feedback on the rough idea.  There are
>>>>>>> two parts-- iommu driver and vfio-pci.
>>>>>>>
>>>>>>> 1.  iommu driver
>>>>>>>
>>>>>>> First, we still need device groups created because vfio is based on
>>>>>>> that, so we envision a 'dummy' iommu driver that implements only
>>>>>>> the add/remove device ops.  Something like:
>>>>>>>
>>>>>>>       static struct iommu_ops fsl_none_ops = {
>>>>>>>               .add_device     = fsl_none_add_device,
>>>>>>>               .remove_device  = fsl_none_remove_device,
>>>>>>>       };
>>>>>>>
>>>>>>>       int fsl_iommu_none_init()
>>>>>>>       {
>>>>>>>               int ret = 0;
>>>>>>>
>>>>>>>               ret = iommu_init_mempool();
>>>>>>>               if (ret)
>>>>>>>                       return ret;
>>>>>>>
>>>>>>>               bus_set_iommu(&platform_bus_type,&fsl_none_ops);
>>>>>>>               bus_set_iommu(&pci_bus_type,&fsl_none_ops);
>>>>>>>
>>>>>>>               return ret;
>>>>>>>       }
>>>>>>>
>>>>>>> 2.  vfio-pci
>>>>>>>
>>>>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>>>>> unchanged.  User space will have to follow the semantics of mapping
>>>>>>> only physically contiguous chunks...and iova will equal phys.
>>>>>>>
>>>>>>> So, we propose to implement a new vfio iommu type, called
>>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
>>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
>>>>>>> noop.
>>>>>>>
>>>>>>> Would like your feedback.
>>>>>>
>>>>>>
>>>>>> My first thought is that this really detracts from vfio and iommu
>>>>>> groups being a secure interface, so somehow this needs to be clearly
>>>>>> an insecure mode that requires an opt-in and maybe taints the
>>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
>>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
>>>>>> critical access points.  We might even have interfaces exported that
>>>>>> would allow this to be an out-of-tree driver (worth a check).
>>>>>>
>>>>>> I would guess that you would probably want to do all the iommu group
>>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
>>>>>> both creates the fake groups and provides the dummy iommu backend for
>>>
>>> vfio.
>>>>>>
>>>>>> That would be a nice way to compartmentalize this as a
>>>>>> vfio-noiommu-special.
>>>>>
>>>>>
>>>>> So you mean don't implement any of the iommu driver ops at all and
>>>>> keep everything in the vfio layer?
>>>>>
>>>>> Would you still have real iommu groups?...i.e.
>>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>>>>> ../../../../kernel/iommu_groups/26
>>>>>
>>>>> ...and that is created by vfio-noiommu-special?
>>>>
>>>>
>>>> I'm suggesting (but haven't checked if it's possible), to implement
>>>> the iommu driver ops as part of the vfio iommu backend driver.  The
>>>> primary motivation for this would be to a) keep a fake iommu groups
>>>> interface out of the iommu proper (possibly containing it in an
>>>> external driver) and b) modularizing it so we don't have fake iommu
>>>> groups being created by default.  It would have to populate the iommu
>>>> groups sysfs interfaces to be compatible with vfio.
>>>>
>>>>> Right now when the PCI and platform buses are probed, the iommu
>>>>> driver add-device callback gets called and that is where the
>>>>> per-device group gets created.  Are you envisioning registering a
>>>>> callback for the PCI bus to do this in vfio-noiommu-special?
>>>>
>>>>
>>>> Yes.  It's just as easy to walk all the devices rather than doing
>>>> callbacks, iirc the group code does this when you register.  In fact,
>>>> this noiommu interface may not want to add all devices, we may want to
>>>> be very selective and only add some.
>>>>
>>> Right.
>>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
>>> still leverage/use vfio for qemu's device assignment.
>>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
>>> place.
>>>
>>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>>>          so assigned devices are 'remapped' from system-B:D.F to virt-
>>> machine's
>>>          (virtualized) B:D.F of the assigned device.
>>>          Are pci-cfg cycles trapped in freescale qemu model ?
>>>
>> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
>> the virtual PCI bus (emulated by qemu).
>>
>> -Varun
>>
> Understood, but as Alex stated, the whole purpose of VFIO is to
> be able to do _secure_, user-level-driven I/O.  Since this would
> be 'unsecure', there should be a way to note that during configuration.
>

Does vfio work with swiotlb and if not, can/should swiotlb be
extended? Or does the time and space overhead make it a moot point?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
  2013-04-27  4:22                           ` Andrew Cooks
@ 2013-04-30 17:28                             ` Konrad Rzeszutek Wilk
       [not found]                               ` <20130430172849.GB22752-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-04-30 17:28 UTC (permalink / raw)
  To: Andrew Cooks
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Sat, Apr 27, 2013 at 12:22:28PM +0800, Andrew Cooks wrote:
> On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile <ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> >>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
> >>> Sent: Thursday, April 25, 2013 1:11 AM
> >>> To: Alex Williamson
> >>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> >>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> >>>
> >>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
> >>>>
> >>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> >>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
> >>>>>> To: Yoder Stuart-B08248
> >>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> >>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> >>>>>>
> >>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> >>>>>>>
> >>>>>>> Joerg/Alex,
> >>>>>>>
> >>>>>>> We have embedded systems where we use QEMU/KVM and have the
> >>>>>>> requirement to do device assignment, but have no iommu.  So we
> >>>>>>> would like to get vfio-pci working on systems like this.
> >>>>>>>
> >>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
> >>>>>>> memory must be physically contiguous and will have no iova->phy
> >>>>>>> translation.  But there are use cases where all OSes involved are
> >>>>>>> trusted and customers can
> >>>>>>> live with those limitations.   Virtualization is used
> >>>>>>> here not to sandbox untrusted code, but to consolidate multiple
> >>>>>>> OSes.
> >>>>>>>
> >>>>>>> We would like to get your feedback on the rough idea.  There are
> >>>>>>> two parts-- iommu driver and vfio-pci.
> >>>>>>>
> >>>>>>> 1.  iommu driver
> >>>>>>>
> >>>>>>> First, we still need device groups created because vfio is based on
> >>>>>>> that, so we envision a 'dummy' iommu driver that implements only
> >>>>>>> the add/remove device ops.  Something like:
> >>>>>>>
> >>>>>>>       static struct iommu_ops fsl_none_ops = {
> >>>>>>>               .add_device     = fsl_none_add_device,
> >>>>>>>               .remove_device  = fsl_none_remove_device,
> >>>>>>>       };
> >>>>>>>
> >>>>>>>       int fsl_iommu_none_init()
> >>>>>>>       {
> >>>>>>>               int ret = 0;
> >>>>>>>
> >>>>>>>               ret = iommu_init_mempool();
> >>>>>>>               if (ret)
> >>>>>>>                       return ret;
> >>>>>>>
> >>>>>>>               bus_set_iommu(&platform_bus_type,&fsl_none_ops);
> >>>>>>>               bus_set_iommu(&pci_bus_type,&fsl_none_ops);
> >>>>>>>
> >>>>>>>               return ret;
> >>>>>>>       }
> >>>>>>>
> >>>>>>> 2.  vfio-pci
> >>>>>>>
> >>>>>>> For vfio-pci, we would ideally like to keep user space mostly
> >>>>>>> unchanged.  User space will have to follow the semantics of mapping
> >>>>>>> only physically contiguous chunks...and iova will equal phys.
> >>>>>>>
> >>>>>>> So, we propose to implement a new vfio iommu type, called
> >>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
> >>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
> >>>>>>> noop.
> >>>>>>>
> >>>>>>> Would like your feedback.
> >>>>>>
> >>>>>>
> >>>>>> My first thought is that this really detracts from vfio and iommu
> >>>>>> groups being a secure interface, so somehow this needs to be clearly
> >>>>>> an insecure mode that requires an opt-in and maybe taints the
> >>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
> >>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
> >>>>>> critical access points.  We might even have interfaces exported that
> >>>>>> would allow this to be an out-of-tree driver (worth a check).
> >>>>>>
> >>>>>> I would guess that you would probably want to do all the iommu group
> >>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
> >>>>>> both creates the fake groups and provides the dummy iommu backend for
> >>>
> >>> vfio.
> >>>>>>
> >>>>>> That would be a nice way to compartmentalize this as a
> >>>>>> vfio-noiommu-special.
> >>>>>
> >>>>>
> >>>>> So you mean don't implement any of the iommu driver ops at all and
> >>>>> keep everything in the vfio layer?
> >>>>>
> >>>>> Would you still have real iommu groups?...i.e.
> >>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> >>>>> ../../../../kernel/iommu_groups/26
> >>>>>
> >>>>> ...and that is created by vfio-noiommu-special?
> >>>>
> >>>>
> >>>> I'm suggesting (but haven't checked if it's possible), to implement
> >>>> the iommu driver ops as part of the vfio iommu backend driver.  The
> >>>> primary motivation for this would be to a) keep a fake iommu groups
> >>>> interface out of the iommu proper (possibly containing it in an
> >>>> external driver) and b) modularizing it so we don't have fake iommu
> >>>> groups being created by default.  It would have to populate the iommu
> >>>> groups sysfs interfaces to be compatible with vfio.
> >>>>
> >>>>> Right now when the PCI and platform buses are probed, the iommu
> >>>>> driver add-device callback gets called and that is where the
> >>>>> per-device group gets created.  Are you envisioning registering a
> >>>>> callback for the PCI bus to do this in vfio-noiommu-special?
> >>>>
> >>>>
> >>>> Yes.  It's just as easy to walk all the devices rather than doing
> >>>> callbacks, iirc the group code does this when you register.  In fact,
> >>>> this noiommu interface may not want to add all devices, we may want to
> >>>> be very selective and only add some.
> >>>>
> >>> Right.
> >>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
> >>> still leverage/use vfio for qemu's device assignment.
> >>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
> >>> place.
> >>>
> >>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
> >>>          so assigned devices are 'remapped' from system-B:D.F to virt-
> >>> machine's
> >>>          (virtualized) B:D.F of the assigned device.
> >>>          Are pci-cfg cycles trapped in freescale qemu model ?
> >>>
> >> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
> >> the virtual PCI bus (emulated by qemu).
> >>
> >> -Varun
> >>
> > Understood, but as Alex stated, the whole purpose of VFIO is to
> > be able to do _secure_, user-level-driven I/O.  Since this would
> > be 'unsecure', there should be a way to note that during configuration.
> >
> 
> Does vfio work with swiotlb and if not, can/should swiotlb be
> extended? Or does the time and space overhead make it a moot point?

It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.

It could be extended to use it. I was toying with this b/c for Xen to
use VFIO I would have to implement an Xen IOMMU driver that would basically
piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
care of all the hard work of securing each guest).

But your requirement would be the same, so it might as well be an generic
driver called SWIOTLB-IOMMU driver.

If you are up for writting I am up for reviewing/Ack-ing/etc.

The complexity would be to figure out the VFIO group thing and how to assign
PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
attribute in SysFS (or via a kernel parameter).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                               ` <20130430172849.GB22752-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
@ 2013-04-30 17:54                                 ` Alex Williamson
  2013-04-30 18:13                                 ` Don Dutile
  2013-04-30 18:25                                 ` Don Dutile
  2 siblings, 0 replies; 21+ messages in thread
From: Alex Williamson @ 2013-04-30 17:54 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Tue, 2013-04-30 at 13:28 -0400, Konrad Rzeszutek Wilk wrote:
> On Sat, Apr 27, 2013 at 12:22:28PM +0800, Andrew Cooks wrote:
> > On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile <ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
> > >>
> > >>
> > >>
> > >>> -----Original Message-----
> > >>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> > >>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
> > >>> Sent: Thursday, April 25, 2013 1:11 AM
> > >>> To: Alex Williamson
> > >>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > >>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> > >>>
> > >>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
> > >>>>
> > >>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> > >>>>>
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> > >>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
> > >>>>>> To: Yoder Stuart-B08248
> > >>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > >>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> > >>>>>>
> > >>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> > >>>>>>>
> > >>>>>>> Joerg/Alex,
> > >>>>>>>
> > >>>>>>> We have embedded systems where we use QEMU/KVM and have the
> > >>>>>>> requirement to do device assignment, but have no iommu.  So we
> > >>>>>>> would like to get vfio-pci working on systems like this.
> > >>>>>>>
> > >>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
> > >>>>>>> memory must be physically contiguous and will have no iova->phy
> > >>>>>>> translation.  But there are use cases where all OSes involved are
> > >>>>>>> trusted and customers can
> > >>>>>>> live with those limitations.   Virtualization is used
> > >>>>>>> here not to sandbox untrusted code, but to consolidate multiple
> > >>>>>>> OSes.
> > >>>>>>>
> > >>>>>>> We would like to get your feedback on the rough idea.  There are
> > >>>>>>> two parts-- iommu driver and vfio-pci.
> > >>>>>>>
> > >>>>>>> 1.  iommu driver
> > >>>>>>>
> > >>>>>>> First, we still need device groups created because vfio is based on
> > >>>>>>> that, so we envision a 'dummy' iommu driver that implements only
> > >>>>>>> the add/remove device ops.  Something like:
> > >>>>>>>
> > >>>>>>>       static struct iommu_ops fsl_none_ops = {
> > >>>>>>>               .add_device     = fsl_none_add_device,
> > >>>>>>>               .remove_device  = fsl_none_remove_device,
> > >>>>>>>       };
> > >>>>>>>
> > >>>>>>>       int fsl_iommu_none_init()
> > >>>>>>>       {
> > >>>>>>>               int ret = 0;
> > >>>>>>>
> > >>>>>>>               ret = iommu_init_mempool();
> > >>>>>>>               if (ret)
> > >>>>>>>                       return ret;
> > >>>>>>>
> > >>>>>>>               bus_set_iommu(&platform_bus_type,&fsl_none_ops);
> > >>>>>>>               bus_set_iommu(&pci_bus_type,&fsl_none_ops);
> > >>>>>>>
> > >>>>>>>               return ret;
> > >>>>>>>       }
> > >>>>>>>
> > >>>>>>> 2.  vfio-pci
> > >>>>>>>
> > >>>>>>> For vfio-pci, we would ideally like to keep user space mostly
> > >>>>>>> unchanged.  User space will have to follow the semantics of mapping
> > >>>>>>> only physically contiguous chunks...and iova will equal phys.
> > >>>>>>>
> > >>>>>>> So, we propose to implement a new vfio iommu type, called
> > >>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
> > >>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
> > >>>>>>> noop.
> > >>>>>>>
> > >>>>>>> Would like your feedback.
> > >>>>>>
> > >>>>>>
> > >>>>>> My first thought is that this really detracts from vfio and iommu
> > >>>>>> groups being a secure interface, so somehow this needs to be clearly
> > >>>>>> an insecure mode that requires an opt-in and maybe taints the
> > >>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
> > >>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
> > >>>>>> critical access points.  We might even have interfaces exported that
> > >>>>>> would allow this to be an out-of-tree driver (worth a check).
> > >>>>>>
> > >>>>>> I would guess that you would probably want to do all the iommu group
> > >>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
> > >>>>>> both creates the fake groups and provides the dummy iommu backend for
> > >>>
> > >>> vfio.
> > >>>>>>
> > >>>>>> That would be a nice way to compartmentalize this as a
> > >>>>>> vfio-noiommu-special.
> > >>>>>
> > >>>>>
> > >>>>> So you mean don't implement any of the iommu driver ops at all and
> > >>>>> keep everything in the vfio layer?
> > >>>>>
> > >>>>> Would you still have real iommu groups?...i.e.
> > >>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> > >>>>> ../../../../kernel/iommu_groups/26
> > >>>>>
> > >>>>> ...and that is created by vfio-noiommu-special?
> > >>>>
> > >>>>
> > >>>> I'm suggesting (but haven't checked if it's possible), to implement
> > >>>> the iommu driver ops as part of the vfio iommu backend driver.  The
> > >>>> primary motivation for this would be to a) keep a fake iommu groups
> > >>>> interface out of the iommu proper (possibly containing it in an
> > >>>> external driver) and b) modularizing it so we don't have fake iommu
> > >>>> groups being created by default.  It would have to populate the iommu
> > >>>> groups sysfs interfaces to be compatible with vfio.
> > >>>>
> > >>>>> Right now when the PCI and platform buses are probed, the iommu
> > >>>>> driver add-device callback gets called and that is where the
> > >>>>> per-device group gets created.  Are you envisioning registering a
> > >>>>> callback for the PCI bus to do this in vfio-noiommu-special?
> > >>>>
> > >>>>
> > >>>> Yes.  It's just as easy to walk all the devices rather than doing
> > >>>> callbacks, iirc the group code does this when you register.  In fact,
> > >>>> this noiommu interface may not want to add all devices, we may want to
> > >>>> be very selective and only add some.
> > >>>>
> > >>> Right.
> > >>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
> > >>> still leverage/use vfio for qemu's device assignment.
> > >>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
> > >>> place.
> > >>>
> > >>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
> > >>>          so assigned devices are 'remapped' from system-B:D.F to virt-
> > >>> machine's
> > >>>          (virtualized) B:D.F of the assigned device.
> > >>>          Are pci-cfg cycles trapped in freescale qemu model ?
> > >>>
> > >> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
> > >> the virtual PCI bus (emulated by qemu).
> > >>
> > >> -Varun
> > >>
> > > Understood, but as Alex stated, the whole purpose of VFIO is to
> > > be able to do _secure_, user-level-driven I/O.  Since this would
> > > be 'unsecure', there should be a way to note that during configuration.
> > >
> > 
> > Does vfio work with swiotlb and if not, can/should swiotlb be
> > extended? Or does the time and space overhead make it a moot point?
> 
> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
> 
> It could be extended to use it. I was toying with this b/c for Xen to
> use VFIO I would have to implement an Xen IOMMU driver that would basically
> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> care of all the hard work of securing each guest).
> 
> But your requirement would be the same, so it might as well be an generic
> driver called SWIOTLB-IOMMU driver.
> 
> If you are up for writting I am up for reviewing/Ack-ing/etc.
> 
> The complexity would be to figure out the VFIO group thing and how to assign
> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
> attribute in SysFS (or via a kernel parameter).

Just to reiterate, we need to be very, very careful about fake iommu
groups.  iommu groups are meant to express hardware isolation
capabilities.  swiotlb by definition has no hardware isolation
capabilities.  Except for very specific (likely embedded) use cases,
that makes the whole idea of vfio less interesting.  Devices would be
exposed to userspace with neither isolation nor translation.  You might
as well use the uio pci interface at that point.  The qemu use case of
vfio is very difficult to achieve in that model as you either need to
identity map the guest or expose an iommu to the guest.  You won't
achieve transparent device assignment on x86 with such a model if that's
the goal.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                               ` <20130430172849.GB22752-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
  2013-04-30 17:54                                 ` Alex Williamson
@ 2013-04-30 18:13                                 ` Don Dutile
       [not found]                                   ` <518009D3.2050304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2013-04-30 18:25                                 ` Don Dutile
  2 siblings, 1 reply; 21+ messages in thread
From: Don Dutile @ 2013-04-30 18:13 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/30/2013 01:28 PM, Konrad Rzeszutek Wilk wrote:
> On Sat, Apr 27, 2013 at 12:22:28PM +0800, Andrew Cooks wrote:
>> On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile<ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>  wrote:
>>> On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
>>>>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
>>>>> Sent: Thursday, April 25, 2013 1:11 AM
>>>>> To: Alex Williamson
>>>>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>
>>>>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
>>>>>>
>>>>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>>>>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>>>>>>> To: Yoder Stuart-B08248
>>>>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>>>>
>>>>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>>>
>>>>>>>>> Joerg/Alex,
>>>>>>>>>
>>>>>>>>> We have embedded systems where we use QEMU/KVM and have the
>>>>>>>>> requirement to do device assignment, but have no iommu.  So we
>>>>>>>>> would like to get vfio-pci working on systems like this.
>>>>>>>>>
>>>>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
>>>>>>>>> memory must be physically contiguous and will have no iova->phy
>>>>>>>>> translation.  But there are use cases where all OSes involved are
>>>>>>>>> trusted and customers can
>>>>>>>>> live with those limitations.   Virtualization is used
>>>>>>>>> here not to sandbox untrusted code, but to consolidate multiple
>>>>>>>>> OSes.
>>>>>>>>>
>>>>>>>>> We would like to get your feedback on the rough idea.  There are
>>>>>>>>> two parts-- iommu driver and vfio-pci.
>>>>>>>>>
>>>>>>>>> 1.  iommu driver
>>>>>>>>>
>>>>>>>>> First, we still need device groups created because vfio is based on
>>>>>>>>> that, so we envision a 'dummy' iommu driver that implements only
>>>>>>>>> the add/remove device ops.  Something like:
>>>>>>>>>
>>>>>>>>>        static struct iommu_ops fsl_none_ops = {
>>>>>>>>>                .add_device     = fsl_none_add_device,
>>>>>>>>>                .remove_device  = fsl_none_remove_device,
>>>>>>>>>        };
>>>>>>>>>
>>>>>>>>>        int fsl_iommu_none_init()
>>>>>>>>>        {
>>>>>>>>>                int ret = 0;
>>>>>>>>>
>>>>>>>>>                ret = iommu_init_mempool();
>>>>>>>>>                if (ret)
>>>>>>>>>                        return ret;
>>>>>>>>>
>>>>>>>>>                bus_set_iommu(&platform_bus_type,&fsl_none_ops);
>>>>>>>>>                bus_set_iommu(&pci_bus_type,&fsl_none_ops);
>>>>>>>>>
>>>>>>>>>                return ret;
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>> 2.  vfio-pci
>>>>>>>>>
>>>>>>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>>>>>>> unchanged.  User space will have to follow the semantics of mapping
>>>>>>>>> only physically contiguous chunks...and iova will equal phys.
>>>>>>>>>
>>>>>>>>> So, we propose to implement a new vfio iommu type, called
>>>>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
>>>>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
>>>>>>>>> noop.
>>>>>>>>>
>>>>>>>>> Would like your feedback.
>>>>>>>>
>>>>>>>>
>>>>>>>> My first thought is that this really detracts from vfio and iommu
>>>>>>>> groups being a secure interface, so somehow this needs to be clearly
>>>>>>>> an insecure mode that requires an opt-in and maybe taints the
>>>>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
>>>>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
>>>>>>>> critical access points.  We might even have interfaces exported that
>>>>>>>> would allow this to be an out-of-tree driver (worth a check).
>>>>>>>>
>>>>>>>> I would guess that you would probably want to do all the iommu group
>>>>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
>>>>>>>> both creates the fake groups and provides the dummy iommu backend for
>>>>>
>>>>> vfio.
>>>>>>>>
>>>>>>>> That would be a nice way to compartmentalize this as a
>>>>>>>> vfio-noiommu-special.
>>>>>>>
>>>>>>>
>>>>>>> So you mean don't implement any of the iommu driver ops at all and
>>>>>>> keep everything in the vfio layer?
>>>>>>>
>>>>>>> Would you still have real iommu groups?...i.e.
>>>>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>>>>>>> ../../../../kernel/iommu_groups/26
>>>>>>>
>>>>>>> ...and that is created by vfio-noiommu-special?
>>>>>>
>>>>>>
>>>>>> I'm suggesting (but haven't checked if it's possible), to implement
>>>>>> the iommu driver ops as part of the vfio iommu backend driver.  The
>>>>>> primary motivation for this would be to a) keep a fake iommu groups
>>>>>> interface out of the iommu proper (possibly containing it in an
>>>>>> external driver) and b) modularizing it so we don't have fake iommu
>>>>>> groups being created by default.  It would have to populate the iommu
>>>>>> groups sysfs interfaces to be compatible with vfio.
>>>>>>
>>>>>>> Right now when the PCI and platform buses are probed, the iommu
>>>>>>> driver add-device callback gets called and that is where the
>>>>>>> per-device group gets created.  Are you envisioning registering a
>>>>>>> callback for the PCI bus to do this in vfio-noiommu-special?
>>>>>>
>>>>>>
>>>>>> Yes.  It's just as easy to walk all the devices rather than doing
>>>>>> callbacks, iirc the group code does this when you register.  In fact,
>>>>>> this noiommu interface may not want to add all devices, we may want to
>>>>>> be very selective and only add some.
>>>>>>
>>>>> Right.
>>>>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
>>>>> still leverage/use vfio for qemu's device assignment.
>>>>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
>>>>> place.
>>>>>
>>>>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>>>>>           so assigned devices are 'remapped' from system-B:D.F to virt-
>>>>> machine's
>>>>>           (virtualized) B:D.F of the assigned device.
>>>>>           Are pci-cfg cycles trapped in freescale qemu model ?
>>>>>
>>>> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
>>>> the virtual PCI bus (emulated by qemu).
>>>>
>>>> -Varun
>>>>
>>> Understood, but as Alex stated, the whole purpose of VFIO is to
>>> be able to do _secure_, user-level-driven I/O.  Since this would
>>> be 'unsecure', there should be a way to note that during configuration.
>>>
>>
>> Does vfio work with swiotlb and if not, can/should swiotlb be
>> extended? Or does the time and space overhead make it a moot point?
>
> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
>
I think you got it reversed.  vfio uses iommu api, not dma api.
if vfio used dma api, swiotlb is configured as the default dma-ops interface
and it could work (with more interfaces... domain-alloc, etc.).

> It could be extended to use it. I was toying with this b/c for Xen to
> use VFIO I would have to implement an Xen IOMMU driver that would basically
> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> care of all the hard work of securing each guest).
>
> But your requirement would be the same, so it might as well be an generic
> driver called SWIOTLB-IOMMU driver.
>
> If you are up for writting I am up for reviewing/Ack-ing/etc.
>
> The complexity would be to figure out the VFIO group thing and how to assign
> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
> attribute in SysFS (or via a kernel parameter).
>

Did uio provide this un-secure support, and just needs some attention upstream?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                               ` <20130430172849.GB22752-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
  2013-04-30 17:54                                 ` Alex Williamson
  2013-04-30 18:13                                 ` Don Dutile
@ 2013-04-30 18:25                                 ` Don Dutile
  2 siblings, 0 replies; 21+ messages in thread
From: Don Dutile @ 2013-04-30 18:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/30/2013 01:28 PM, Konrad Rzeszutek Wilk wrote:
> On Sat, Apr 27, 2013 at 12:22:28PM +0800, Andrew Cooks wrote:
>> On Fri, Apr 26, 2013 at 6:23 AM, Don Dutile<ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>  wrote:
>>> On 04/24/2013 10:49 PM, Sethi Varun-B16395 wrote:
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
>>>>> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Don Dutile
>>>>> Sent: Thursday, April 25, 2013 1:11 AM
>>>>> To: Alex Williamson
>>>>> Cc: Yoder Stuart-B08248; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>
>>>>> On 04/23/2013 03:47 PM, Alex Williamson wrote:
>>>>>>
>>>>>> On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>>>>>>>> Sent: Tuesday, April 23, 2013 11:56 AM
>>>>>>>> To: Yoder Stuart-B08248
>>>>>>>> Cc: Joerg Roedel; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>>>>>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>>>>>>>>
>>>>>>>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
>>>>>>>>>
>>>>>>>>> Joerg/Alex,
>>>>>>>>>
>>>>>>>>> We have embedded systems where we use QEMU/KVM and have the
>>>>>>>>> requirement to do device assignment, but have no iommu.  So we
>>>>>>>>> would like to get vfio-pci working on systems like this.
>>>>>>>>>
>>>>>>>>> We're aware of the obvious limitations-- no protection, DMA'able
>>>>>>>>> memory must be physically contiguous and will have no iova->phy
>>>>>>>>> translation.  But there are use cases where all OSes involved are
>>>>>>>>> trusted and customers can
>>>>>>>>> live with those limitations.   Virtualization is used
>>>>>>>>> here not to sandbox untrusted code, but to consolidate multiple
>>>>>>>>> OSes.
>>>>>>>>>
>>>>>>>>> We would like to get your feedback on the rough idea.  There are
>>>>>>>>> two parts-- iommu driver and vfio-pci.
>>>>>>>>>
>>>>>>>>> 1.  iommu driver
>>>>>>>>>
>>>>>>>>> First, we still need device groups created because vfio is based on
>>>>>>>>> that, so we envision a 'dummy' iommu driver that implements only
>>>>>>>>> the add/remove device ops.  Something like:
>>>>>>>>>
>>>>>>>>>        static struct iommu_ops fsl_none_ops = {
>>>>>>>>>                .add_device     = fsl_none_add_device,
>>>>>>>>>                .remove_device  = fsl_none_remove_device,
>>>>>>>>>        };
>>>>>>>>>
>>>>>>>>>        int fsl_iommu_none_init()
>>>>>>>>>        {
>>>>>>>>>                int ret = 0;
>>>>>>>>>
>>>>>>>>>                ret = iommu_init_mempool();
>>>>>>>>>                if (ret)
>>>>>>>>>                        return ret;
>>>>>>>>>
>>>>>>>>>                bus_set_iommu(&platform_bus_type,&fsl_none_ops);
>>>>>>>>>                bus_set_iommu(&pci_bus_type,&fsl_none_ops);
>>>>>>>>>
>>>>>>>>>                return ret;
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>> 2.  vfio-pci
>>>>>>>>>
>>>>>>>>> For vfio-pci, we would ideally like to keep user space mostly
>>>>>>>>> unchanged.  User space will have to follow the semantics of mapping
>>>>>>>>> only physically contiguous chunks...and iova will equal phys.
>>>>>>>>>
>>>>>>>>> So, we propose to implement a new vfio iommu type, called
>>>>>>>>> VFIO_TYPE_NONE_IOMMU.  This implements any needed vfio interfaces,
>>>>>>>>> but there are no calls to the iommu layer...e.g. map_dma() is a
>>>>>>>>> noop.
>>>>>>>>>
>>>>>>>>> Would like your feedback.
>>>>>>>>
>>>>>>>>
>>>>>>>> My first thought is that this really detracts from vfio and iommu
>>>>>>>> groups being a secure interface, so somehow this needs to be clearly
>>>>>>>> an insecure mode that requires an opt-in and maybe taints the
>>>>>>>> kernel.  Any notion of unprivileged use needs to be blocked and it
>>>>>>>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
>>>>>>>> critical access points.  We might even have interfaces exported that
>>>>>>>> would allow this to be an out-of-tree driver (worth a check).
>>>>>>>>
>>>>>>>> I would guess that you would probably want to do all the iommu group
>>>>>>>> setup from the vfio fake-iommu driver.  In other words, that driver
>>>>>>>> both creates the fake groups and provides the dummy iommu backend for
>>>>>
>>>>> vfio.
>>>>>>>>
>>>>>>>> That would be a nice way to compartmentalize this as a
>>>>>>>> vfio-noiommu-special.
>>>>>>>
>>>>>>>
>>>>>>> So you mean don't implement any of the iommu driver ops at all and
>>>>>>> keep everything in the vfio layer?
>>>>>>>
>>>>>>> Would you still have real iommu groups?...i.e.
>>>>>>> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
>>>>>>> ../../../../kernel/iommu_groups/26
>>>>>>>
>>>>>>> ...and that is created by vfio-noiommu-special?
>>>>>>
>>>>>>
>>>>>> I'm suggesting (but haven't checked if it's possible), to implement
>>>>>> the iommu driver ops as part of the vfio iommu backend driver.  The
>>>>>> primary motivation for this would be to a) keep a fake iommu groups
>>>>>> interface out of the iommu proper (possibly containing it in an
>>>>>> external driver) and b) modularizing it so we don't have fake iommu
>>>>>> groups being created by default.  It would have to populate the iommu
>>>>>> groups sysfs interfaces to be compatible with vfio.
>>>>>>
>>>>>>> Right now when the PCI and platform buses are probed, the iommu
>>>>>>> driver add-device callback gets called and that is where the
>>>>>>> per-device group gets created.  Are you envisioning registering a
>>>>>>> callback for the PCI bus to do this in vfio-noiommu-special?
>>>>>>
>>>>>>
>>>>>> Yes.  It's just as easy to walk all the devices rather than doing
>>>>>> callbacks, iirc the group code does this when you register.  In fact,
>>>>>> this noiommu interface may not want to add all devices, we may want to
>>>>>> be very selective and only add some.
>>>>>>
>>>>> Right.
>>>>> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
>>>>> still leverage/use vfio for qemu's device assignment.
>>>>> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
>>>>> place.
>>>>>
>>>>> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
>>>>>           so assigned devices are 'remapped' from system-B:D.F to virt-
>>>>> machine's
>>>>>           (virtualized) B:D.F of the assigned device.
>>>>>           Are pci-cfg cycles trapped in freescale qemu model ?
>>>>>
>>>> The vfio-pci device would be visible (to a KVM guest) as a PCI device on
>>>> the virtual PCI bus (emulated by qemu).
>>>>
>>>> -Varun
>>>>
>>> Understood, but as Alex stated, the whole purpose of VFIO is to
>>> be able to do _secure_, user-level-driven I/O.  Since this would
>>> be 'unsecure', there should be a way to note that during configuration.
>>>
>>
>> Does vfio work with swiotlb and if not, can/should swiotlb be
>> extended? Or does the time and space overhead make it a moot point?
>
> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
>
> It could be extended to use it. I was toying with this b/c for Xen to
> use VFIO I would have to implement an Xen IOMMU driver that would basically
> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> care of all the hard work of securing each guest).
>
> But your requirement would be the same, so it might as well be an generic
> driver called SWIOTLB-IOMMU driver.
>
arch/x86/kernel/pci-nommu.c as a starting point?


> If you are up for writting I am up for reviewing/Ack-ing/etc.
>
> The complexity would be to figure out the VFIO group thing and how to assign
> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
> attribute in SysFS (or via a kernel parameter).
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                                   ` <518009D3.2050304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-04-30 19:11                                     ` Konrad Rzeszutek Wilk
       [not found]                                       ` <20130430191131.GC24298-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-04-30 19:11 UTC (permalink / raw)
  To: Don Dutile
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

> >>Does vfio work with swiotlb and if not, can/should swiotlb be
> >>extended? Or does the time and space overhead make it a moot point?
> >
> >It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
> >
> I think you got it reversed.  vfio uses iommu api, not dma api.

Right.  That is what I was saying :-) SWIOTLB uses the DMA API, not
the IOMMU API. Hence it won't work with VFIO. Unless SWIOTLB implements
the IOMMU API.


> if vfio used dma api, swiotlb is configured as the default dma-ops interface
> and it could work (with more interfaces... domain-alloc, etc.).

<nods>
> 
> >It could be extended to use it. I was toying with this b/c for Xen to
> >use VFIO I would have to implement an Xen IOMMU driver that would basically
> >piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> >care of all the hard work of securing each guest).
> >
> >But your requirement would be the same, so it might as well be an generic
> >driver called SWIOTLB-IOMMU driver.
> >
> >If you are up for writting I am up for reviewing/Ack-ing/etc.
> >
> >The complexity would be to figure out the VFIO group thing and how to assign
> >PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
> >xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
> >attribute in SysFS (or via a kernel parameter).
> >
> 
> Did uio provide this un-secure support, and just needs some attention upstream?

I don't recall how UIO did it. Not sure if it even had the group
support.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                                       ` <20130430191131.GC24298-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
@ 2013-04-30 20:48                                         ` Don Dutile
       [not found]                                           ` <51802E19.9050601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Don Dutile @ 2013-04-30 20:48 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/30/2013 03:11 PM, Konrad Rzeszutek Wilk wrote:
>>>> Does vfio work with swiotlb and if not, can/should swiotlb be
>>>> extended? Or does the time and space overhead make it a moot point?
>>>
>>> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
>>>
>> I think you got it reversed.  vfio uses iommu api, not dma api.
>
> Right.  That is what I was saying :-) SWIOTLB uses the DMA API, not
> the IOMMU API. Hence it won't work with VFIO. Unless SWIOTLB implements
> the IOMMU API.
>

>
>> if vfio used dma api, swiotlb is configured as the default dma-ops interface
>> and it could work (with more interfaces... domain-alloc, etc.).
>
> <nods>
>>
>>> It could be extended to use it. I was toying with this b/c for Xen to
>>> use VFIO I would have to implement an Xen IOMMU driver that would basically
>>> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
>>> care of all the hard work of securing each guest).
>>>
>>> But your requirement would be the same, so it might as well be an generic
>>> driver called SWIOTLB-IOMMU driver.
>>>
>>> If you are up for writting I am up for reviewing/Ack-ing/etc.
>>>
>>> The complexity would be to figure out the VFIO group thing and how to assign
>>> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
>>> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
>>> attribute in SysFS (or via a kernel parameter).
>>>
>>
>> Did uio provide this un-secure support, and just needs some attention upstream?
>
> I don't recall how UIO did it. Not sure if it even had the group
> support.
no group support. probably doesn't have an iommu-like api either...

sounds like a no-iommu iommu interface is needed! :-p
(Alex: that slipped out! sorry!)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                                           ` <51802E19.9050601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-04-30 21:15                                             ` Alex Williamson
       [not found]                                               ` <1367356521.22436.7.camel-85EaTFmN5p//9pzu0YdTqQ@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2013-04-30 21:15 UTC (permalink / raw)
  To: Don Dutile
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On Tue, 2013-04-30 at 16:48 -0400, Don Dutile wrote:
> On 04/30/2013 03:11 PM, Konrad Rzeszutek Wilk wrote:
> >>>> Does vfio work with swiotlb and if not, can/should swiotlb be
> >>>> extended? Or does the time and space overhead make it a moot point?
> >>>
> >>> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
> >>>
> >> I think you got it reversed.  vfio uses iommu api, not dma api.
> >
> > Right.  That is what I was saying :-) SWIOTLB uses the DMA API, not
> > the IOMMU API. Hence it won't work with VFIO. Unless SWIOTLB implements
> > the IOMMU API.
> >
> 
> >
> >> if vfio used dma api, swiotlb is configured as the default dma-ops interface
> >> and it could work (with more interfaces... domain-alloc, etc.).
> >
> > <nods>
> >>
> >>> It could be extended to use it. I was toying with this b/c for Xen to
> >>> use VFIO I would have to implement an Xen IOMMU driver that would basically
> >>> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
> >>> care of all the hard work of securing each guest).
> >>>
> >>> But your requirement would be the same, so it might as well be an generic
> >>> driver called SWIOTLB-IOMMU driver.
> >>>
> >>> If you are up for writting I am up for reviewing/Ack-ing/etc.
> >>>
> >>> The complexity would be to figure out the VFIO group thing and how to assign
> >>> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
> >>> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
> >>> attribute in SysFS (or via a kernel parameter).
> >>>
> >>
> >> Did uio provide this un-secure support, and just needs some attention upstream?
> >
> > I don't recall how UIO did it. Not sure if it even had the group
> > support.
> no group support. probably doesn't have an iommu-like api either...

It doesn't, in fact uio-pci doesn't even allow enabling bus master
because there's zero isolation.

> sounds like a no-iommu iommu interface is needed! :-p
> (Alex: that slipped out! sorry!)

I wouldn't say "needed", I'm really not sure how or why this is even
practical.  What would we do with a userspace driver interface that's
backed by a software IOMMU that provides neither translation nor
isolation?  This is exactly why I suggested to the freescale guys that
it should be some kind of vfio-fake-iommu backend with very, very strict
capability checking and no default loading.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: vfio / iommu driver for hardware with no iommu
       [not found]                                               ` <1367356521.22436.7.camel-85EaTFmN5p//9pzu0YdTqQ@public.gmane.org>
@ 2013-04-30 21:51                                                 ` Don Dutile
  0 siblings, 0 replies; 21+ messages in thread
From: Don Dutile @ 2013-04-30 21:51 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yoder Stuart-B08248,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

On 04/30/2013 05:15 PM, Alex Williamson wrote:
> On Tue, 2013-04-30 at 16:48 -0400, Don Dutile wrote:
>> On 04/30/2013 03:11 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> Does vfio work with swiotlb and if not, can/should swiotlb be
>>>>>> extended? Or does the time and space overhead make it a moot point?
>>>>>
>>>>> It does not work with SWIOTLB as it uses the DMA API, not the IOMMU API.
>>>>>
>>>> I think you got it reversed.  vfio uses iommu api, not dma api.
>>>
>>> Right.  That is what I was saying :-) SWIOTLB uses the DMA API, not
>>> the IOMMU API. Hence it won't work with VFIO. Unless SWIOTLB implements
>>> the IOMMU API.
>>>
>>
>>>
>>>> if vfio used dma api, swiotlb is configured as the default dma-ops interface
>>>> and it could work (with more interfaces... domain-alloc, etc.).
>>>
>>> <nods>
>>>>
>>>>> It could be extended to use it. I was toying with this b/c for Xen to
>>>>> use VFIO I would have to implement an Xen IOMMU driver that would basically
>>>>> piggyback on the SWIOTLB (as Xen itself does the IOMMU parts and takes
>>>>> care of all the hard work of securing each guest).
>>>>>
>>>>> But your requirement would be the same, so it might as well be an generic
>>>>> driver called SWIOTLB-IOMMU driver.
>>>>>
>>>>> If you are up for writting I am up for reviewing/Ack-ing/etc.
>>>>>
>>>>> The complexity would be to figure out the VFIO group thing and how to assign
>>>>> PCI B:D:F devices to the SWIOTLB-IOMMU driver. Perhaps the same way as
>>>>> xen-pciback does (or pcistub). That is by writting the BDF in the "bind"
>>>>> attribute in SysFS (or via a kernel parameter).
>>>>>
>>>>
>>>> Did uio provide this un-secure support, and just needs some attention upstream?
>>>
>>> I don't recall how UIO did it. Not sure if it even had the group
>>> support.
>> no group support. probably doesn't have an iommu-like api either...
>
> It doesn't, in fact uio-pci doesn't even allow enabling bus master
> because there's zero isolation.
>
>> sounds like a no-iommu iommu interface is needed! :-p
>> (Alex: that slipped out! sorry!)
>
> I wouldn't say "needed", I'm really not sure how or why this is even
> practical.  What would we do with a userspace driver interface that's
> backed by a software IOMMU that provides neither translation nor
> isolation?  This is exactly why I suggested to the freescale guys that
> it should be some kind of vfio-fake-iommu backend with very, very strict
> capability checking and no default loading.  Thanks,
>
> Alex
>
that's what I would expect as well.  but it's still a wonky fake-iommu...
writing code to do almost nothing.... sounds like pci-stub! :)

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2013-04-30 21:51 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-23 16:13 RFC: vfio / iommu driver for hardware with no iommu Yoder Stuart-B08248
     [not found] ` <9F6FE96B71CF29479FF1CDC8046E15035BE0A3-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
2013-04-23 16:56   ` Alex Williamson
     [not found]     ` <1366736189.2918.573.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
2013-04-23 18:36       ` Sethi Varun-B16395
2013-04-23 19:16       ` Yoder Stuart-B08248
     [not found]         ` <9F6FE96B71CF29479FF1CDC8046E15035BE2BD-TcFNo7jSaXPiTqIcKZ1S2K4g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
2013-04-23 19:47           ` Alex Williamson
     [not found]             ` <1366746427.2918.650.camel-xdHQ/5r00wBBDLzU/O5InQ@public.gmane.org>
2013-04-24 19:41               ` Don Dutile
     [not found]                 ` <51783553.80202-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-04-25  2:49                   ` Sethi Varun-B16395
     [not found]                     ` <C5ECD7A89D1DC44195F34B25E172658D4BA91B-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
2013-04-25 22:23                       ` Don Dutile
     [not found]                         ` <5179ACE8.2030506-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-04-27  4:22                           ` Andrew Cooks
2013-04-30 17:28                             ` Konrad Rzeszutek Wilk
     [not found]                               ` <20130430172849.GB22752-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
2013-04-30 17:54                                 ` Alex Williamson
2013-04-30 18:13                                 ` Don Dutile
     [not found]                                   ` <518009D3.2050304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-04-30 19:11                                     ` Konrad Rzeszutek Wilk
     [not found]                                       ` <20130430191131.GC24298-6K5HmflnPlqSPmnEAIUT9EEOCMrvLtNR@public.gmane.org>
2013-04-30 20:48                                         ` Don Dutile
     [not found]                                           ` <51802E19.9050601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-04-30 21:15                                             ` Alex Williamson
     [not found]                                               ` <1367356521.22436.7.camel-85EaTFmN5p//9pzu0YdTqQ@public.gmane.org>
2013-04-30 21:51                                                 ` Don Dutile
2013-04-30 18:25                                 ` Don Dutile
2013-04-24 10:57   ` Joerg Roedel
     [not found]     ` <20130424105718.GJ17148-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2013-04-24 11:04       ` Bhushan Bharat-R65777
     [not found]         ` <6A3DF150A5B70D4F9B66A25E3F7C888D06FF5799-RL0Hj/+nBVCMXPU/2EZmt64g8xLGJsHaLnY5E4hWTkheoWH0uzbU5w@public.gmane.org>
2013-04-24 15:22           ` Yoder Stuart-B08248
2013-04-24 11:52       ` Sethi Varun-B16395

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).