* Re: [Qemu-devel] vfio / iommu domain attributes
@ 2011-12-07 16:38 ` Joerg Roedel
0 siblings, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2011-12-07 16:38 UTC (permalink / raw)
To: Stuart Yoder
Cc: chrisw, Alexey Kardashevskiy, kvm, pmac, linux-pci, konrad.wilk,
qemu-devel, iommu, agraf, aafabbri, B08248, Alex Williamson, avi,
David Gibson, dwg, B07421, benve
On Wed, Dec 07, 2011 at 09:54:39AM -0600, Stuart Yoder wrote:
> Alex, Alexey I'm wondering if you've had any new thoughts on this over
> the last week.
>
> For Freescale, our iommu domain attributes would look something like:
> -domain iova base address
> -domain iova window size
I agree with that.
> -domain enable/disable
> -number of subwindows
> -operation mapping table index
> -stash destination CPU
> -stash target (cache– L1, L2, L3)
Why does the user of the IOMMU-API need to have control over these
things?
> These are all things that need to be set by the creator of the domain.
>
> Since the domain attributes are going to be so different for each platform does
> it make sense to define a new iommu_ops call back that just takes a void pointer
> that can be implemented in a platform specific way? For example:
>
> struct iommu_ops {
> [cut]
> int (*domain_set_attrs)(struct iommu_domain *domain,
> void *attrs);
> int (*domain_get_attrs)(struct iommu_domain *domain,
> void *attrs);
> }
A void pointer is certainly the worst choice for an interface. I think
it is better to have at least a set of common attributes. Something like
this:
iommu_domain_set_attr(struct iommu_domain *domain, enum attr_type, void *data)
iommu_domain_get_attr(struct iommu_domain *domain, enum attr_type, void *data)
The iova base/size options make sense for more IOMMUs than just
Freescale. For example, it would allow managing GART-like IOMMUs with
the IOMMU-API too.
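
A minimal sketch of how an enum-keyed attribute interface along these
lines might look. Only the two function names and the idea of an enum
selector come from the proposal above; the attribute names
(ATTR_IOVA_BASE, ATTR_IOVA_SIZE), the reduced struct iommu_domain, the
int return type and the error codes are assumptions made for
illustration, not an existing kernel API.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical attribute identifiers; the thread only proposes "an enum". */
enum attr_type {
    ATTR_IOVA_BASE,   /* data points to a uint64_t */
    ATTR_IOVA_SIZE,   /* data points to a uint64_t */
    ATTR_MAX
};

/* Stand-in for the kernel structure, reduced to what this sketch needs. */
struct iommu_domain {
    uint64_t iova_base;
    uint64_t iova_size;
};

/* Set one attribute; the enum tells the callee how to interpret data. */
int iommu_domain_set_attr(struct iommu_domain *domain, enum attr_type type,
                          void *data)
{
    if (!domain || !data)
        return -EINVAL;

    switch (type) {
    case ATTR_IOVA_BASE:
        domain->iova_base = *(uint64_t *)data;
        return 0;
    case ATTR_IOVA_SIZE:
        domain->iova_size = *(uint64_t *)data;
        return 0;
    default:
        return -ENOTSUP;   /* attribute unknown to this (assumed) backend */
    }
}

/* Read one attribute back into the caller's buffer. */
int iommu_domain_get_attr(struct iommu_domain *domain, enum attr_type type,
                          void *data)
{
    if (!domain || !data)
        return -EINVAL;

    switch (type) {
    case ATTR_IOVA_BASE:
        *(uint64_t *)data = domain->iova_base;
        return 0;
    case ATTR_IOVA_SIZE:
        *(uint64_t *)data = domain->iova_size;
        return 0;
    default:
        return -ENOTSUP;
    }
}
```

Compared with a single opaque void pointer, the enum lets common code
reject attributes it does not know and lets platform-specific
attributes be added as further enum values.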
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: vfio / iommu domain attributes
2011-12-07 16:38 ` [Qemu-devel] " Joerg Roedel
(?)
@ 2011-12-07 18:59 ` Scott Wood
-1 siblings, 0 replies; 8+ messages in thread
From: Scott Wood @ 2011-12-07 18:59 UTC (permalink / raw)
To: Joerg Roedel
Cc: Stuart Yoder, Alex Williamson, Alexey Kardashevskiy, aafabbri,
kvm, pmac, qemu-devel, konrad.wilk, agraf, dwg, chrisw, B08248,
iommu, avi, linux-pci, B07421, benve, benh, David Gibson
On 12/07/2011 10:38 AM, Joerg Roedel wrote:
> On Wed, Dec 07, 2011 at 09:54:39AM -0600, Stuart Yoder wrote:
>> Alex, Alexey I'm wondering if you've had any new thoughts on this over
>> the last week.
>>
>> For Freescale, our iommu domain attributes would look something like:
>> -domain iova base address
>> -domain iova window size
>
> I agree with that.
>
>> -domain enable/disable
>> -number of subwindows
>> -operation mapping table index
>> -stash destination CPU
>> -stash target (cache– L1, L2, L3)
>
> Why does the user of the IOMMU-API need to have control over these
> things?
Stash configuration needs to match what the user of the device is doing
(in particular, which CPU(s) it is accessing the device's ring buffer
from). Operation mapping table is related to stashing, and while
perhaps not as critical to be controlled by the driver (though I'm not
too familiar with the details here), it seems better than hardcoding
this knowledge elsewhere in the system -- it does relate to the kind of
things that the specific device is doing.
Domain enable/disable is something we'd use when we reset a KVM or
userspace device user (or reassign the device). We need the device to
not be able to DMA until it has been quiesced from previous activity,
and we don't have anything like PCIe function-level reset or the PCI
bus-master enable bit. The driver needs to let us know when it's safe
to enable DMA. This *could* be done via map/unmap, but in
configurations where maps are static, we'd like to not risk the map
failing post-init. Plus, it's simpler to just have a toggle rather than
need to tear down and rebuild the maps, and guest reset/failover is a
performance-critical path for some of our customers.
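
To illustrate the difference between a toggle and a teardown, here is a
small user-space model, not driver code. The struct layout, the table
size MAX_MAPS and the function names are hypothetical. It only shows
that an enable flag can gate DMA while the static maps stay in place.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPS 4   /* hypothetical size of the static mapping table */

struct map_entry {
    uint64_t iova;
    uint64_t size;
    bool valid;
};

struct domain {
    bool enabled;                      /* the enable/disable attribute */
    struct map_entry maps[MAX_MAPS];   /* static maps, set up at init */
};

/* Flip the gate; the mapping table is deliberately left untouched. */
void domain_set_enabled(struct domain *d, bool on)
{
    d->enabled = on;
}

/* DMA to iova is allowed only if the domain is enabled AND a map covers it. */
bool domain_dma_allowed(const struct domain *d, uint64_t iova)
{
    if (!d->enabled)
        return false;

    for (size_t i = 0; i < MAX_MAPS; i++) {
        const struct map_entry *m = &d->maps[i];

        if (m->valid && iova >= m->iova && iova - m->iova < m->size)
            return true;
    }
    return false;
}
```

In this model a reset is one flag write in each direction, and no map
call can fail after init because no map is ever rebuilt.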
Number of subwindows goes along with iova base/size. It affects which
mappings will be valid. If you're using large pages and a small iova
window, a smaller number of subwindows may suffice. We could just
always use the maximum number of subwindows, but that has a good chance
of thrashing the IOMMU's cache. On p4080, max subwindows per device is
256 and the cache can hold 128 entries globally. Lower-end chips may
have a smaller cache.
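
A rough sketch of the sizing trade-off, under an assumption that is not
stated above: each subwindow is backed by one physically contiguous
region no larger than page_size. The function names are hypothetical;
the limits 256 and 128 are the p4080 figures quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUBWINDOWS 256u   /* per-device maximum quoted for p4080 */
#define CACHE_ENTRIES  128u   /* global IOMMU cache entries quoted for p4080 */

/*
 * Smallest power-of-two subwindow count such that each subwindow is no
 * larger than page_size. Returns 0 if even MAX_SUBWINDOWS is not enough
 * or if an argument is zero.
 */
unsigned subwindows_needed(uint64_t window_size, uint64_t page_size)
{
    if (window_size == 0 || page_size == 0)
        return 0;

    for (unsigned n = 1; n <= MAX_SUBWINDOWS; n *= 2) {
        /* size of one subwindow, rounded up */
        uint64_t per = window_size / n + (window_size % n != 0);

        if (per <= page_size)
            return n;
    }
    return 0;
}

/* True if a single device's subwindows alone could overflow the cache. */
bool may_thrash_cache(unsigned subwindows)
{
    return subwindows > CACHE_ENTRIES;
}
```

For a 16 MiB window and 1 MiB pages this yields 16 subwindows, well
under the cache size, whereas always configuring 256 would exceed the
128-entry cache with one device.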
-Scott
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: vfio / iommu domain attributes
2011-12-07 16:38 ` [Qemu-devel] " Joerg Roedel
(?)
@ 2011-12-07 19:11 ` Stuart Yoder
-1 siblings, 0 replies; 8+ messages in thread
From: Stuart Yoder @ 2011-12-07 19:11 UTC (permalink / raw)
To: Joerg Roedel
Cc: Alex Williamson, Alexey Kardashevskiy, aafabbri, kvm, pmac,
qemu-devel, konrad.wilk, agraf, dwg, chrisw, B08248, iommu, avi,
linux-pci, B07421, benve, benh, David Gibson
On Wed, Dec 7, 2011 at 10:38 AM, Joerg Roedel <joerg.roedel@amd.com> wrote:
> On Wed, Dec 07, 2011 at 09:54:39AM -0600, Stuart Yoder wrote:
>> Alex, Alexey I'm wondering if you've had any new thoughts on this over
>> the last week.
>>
>> For Freescale, our iommu domain attributes would look something like:
>> -domain iova base address
>> -domain iova window size
>
> I agree with that.
>
>> -domain enable/disable
>> -number of subwindows
>> -operation mapping table index
>> -stash destination CPU
>> -stash target (cache– L1, L2, L3)
>
> Why does the user of the IOMMU-API need to have control over these
> things?
Our IOMMU complicates things in that it is used for more than just
memory protection and address translation. It has a concept of operation
translation as well. Some devices could do a 'write' transaction that,
when passing through the iommu, gets translated to a 'write-with-stash'.
Stashed transactions get pushed directly into some cache.
It's the entity creating and setting up the domain that will have the knowledge
of what cache is to be stashed to. Right now software that uses stashing
is pinned to a CPU, but in the future it's possible that we'll want to
work without pinning and may need to update stashing attributes on the
fly.
The overall iova window for the domain can be divided into a configurable
number of subwindows (a power of 2, up to 256), which means we can have
a contiguous iova region backed by up to 256 physically discontiguous
subwindows. The creator of the iommu domain is in the best position
to know how many subwindows are needed (the fewer the better for
performance reasons).
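
As a sketch of the subwindow arithmetic described above: the helper
names are hypothetical, and it is an assumption that the window is
split into equally sized subwindows and that a count of 1 is allowed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Valid counts are powers of two from 1 up to 256 (1 is an assumption). */
bool subwindow_count_valid(unsigned n)
{
    return n >= 1 && n <= 256 && (n & (n - 1)) == 0;
}

/*
 * Index of the subwindow that covers iova, assuming the window
 * [base, base + size) is split into n equal subwindows.
 * Returns -1 if the configuration is invalid or iova is outside the window.
 */
int subwindow_index(uint64_t base, uint64_t size, unsigned n, uint64_t iova)
{
    if (!subwindow_count_valid(n) || size == 0 || size % n != 0)
        return -1;
    if (iova < base || iova - base >= size)
        return -1;

    return (int)((iova - base) / (size / n));
}
```

Each index can then point at its own physical region, which is how a
contiguous iova range ends up backed by discontiguous memory.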
So, in short, the above list of attributes are the attributes of our
iommu hardware, and the knowledge of how they should be set is with the
domain creator.
>> These are all things that need to be set by the creator of the domain.
>>
>> Since the domain attributes are going to be so different for each platform does
>> it make sense to define a new iommu_ops call back that just takes a void pointer
>> that can be implemented in a platform specific way? For example:
>>
>> struct iommu_ops {
>> [cut]
>> int (*domain_set_attrs)(struct iommu_domain *domain,
>> void *attrs);
>> int (*domain_get_attrs)(struct iommu_domain *domain,
>> void *attrs);
>> }
>
> A void pointer is certainly the worst choice for an interface. I think
> it is better to have at least a set of common attributes. Something like
> this:
>
> iommu_domain_set_attr(struct iommu_domain *domain, enum attr_type, void *data)
> iommu_domain_get_attr(struct iommu_domain *domain, enum attr_type, void *data)
That would be fine.
Stuart
^ permalink raw reply [flat|nested] 8+ messages in thread