* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-20 12:41 ` Arnd Bergmann
0 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 12:41 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Thierry Reding, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Marc Zyngier, iommu,
Rob Herring, Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin
On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
> > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
> > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
> [...]
> > > > > > You should never need #size-cells > #address-cells
> > > > >
> > > > > That was always my impression as well. But how then do you represent the
> > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
> > > > > 4 GiB - 1, which makes it 4 GiB large. That's:
> > > > >
> > > > > <0 1 0>
> > > > >
> > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is:
> > > > >
> > > > > <0 0xffffffff>
> > > > >
> > > > > but that's not accurate.
> > > >
> > > > I think we've done both in the past, either extended #size-cells or
> > > > taken 0xffffffff as a special token. Note that in your example,
> > > > the iommu actually needs #address-cells = <2> anyway.
> > >
> > > But it needs #address-cells = <2> only to encode an ID in addition to
> > > the address. If this was a single-master IOMMU then there'd be no need
> > > for the ID.
> >
> > Right. But for a single-master IOMMU, there is no need to specify
> > any additional data, it could have #address-cells=<0> if we take the
> > optimization you suggested.
>
> Couldn't a single-master IOMMU be windowed?
Ah, yes. That would actually be like an IBM pSeries, which has a windowed
IOMMU but uses one window per virtual machine. In that case, the window could
be a property of the iommu node though, rather than part of the address
in the link.
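A minimal sketch of that alternative, with the window described once in the IOMMU node and nothing encoded in the link. The property name "dma-window", its <start length> layout and the 1 GiB value are assumptions made up for illustration; nothing in this thread defines them:

```dts
/ {
	#address-cells = <1>;
	#size-cells = <1>;

	iommu: iommu {
		/* no per-master data is carried in the specifier */
		#address-cells = <0>;
		#size-cells = <0>;
		/* hypothetical property: 1 GiB window starting at bus address 0 */
		dma-window = <0x0 0x40000000>;
	};

	master {
		/* the link carries only the phandle */
		iommus = <&iommu>;
	};
};
```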
> > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f
> > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2>
> > > > and have an empty dma-ranges property in the PCI host bridge node,
> > > > and interpret this as using the same encoding as the PCI BARs in
> > > > the ranges property.
> > >
> > > I'm somewhat confused here, since you said earlier:
> > >
> > > > After giving the ranges stuff some more thought, I have come to the
> > > > conclusion that using #iommu-cells should work fine for almost
> > > > all cases, including windowed iommus, because the window is not
> > > > actually needed in the device, but only in the iommu, which is of course
> > > > free to interpret the arguments as addresses.
> > >
> > > But now you seem to be saying that we should still be using the
> > > #address-cells and #size-cells properties in the IOMMU node to determine
> > > the length of the specifier.
> >
> > I probably wasn't clear. I think we can make it work either way, but
> > my feeling is that using #address-cells/#size-cells gives us a nicer
> > syntax for the more complex cases.
>
> Okay, so in summary we'd have something like this for simple cases:
>
> Required properties:
> --------------------
> - #address-cells: The number of cells in an IOMMU specifier needed to encode
> an address.
> - #size-cells: The number of cells in an IOMMU specifier needed to represent
> the length of an address range.
>
> Typical values for the above include:
> - #address-cells = <0>, #size-cells = <0>: Single master IOMMU devices are not
> configurable and therefore no additional information needs to be encoded in
> the specifier. This may also apply to multiple master IOMMU devices that do
> not allow the association of masters to be configured.
> - #address-cells = <1>, #size-cells = <0>: Multiple master IOMMU devices may
> need to be configured in order to enable translation for a given master. In
> such cases the single address cell corresponds to the master device's ID.
> - #address-cells = <2>, #size-cells = <2>: Some IOMMU devices allow the DMA
> window for masters to be configured. The first cell of the address in this
> case may contain the master device's ID, for example, while the second cell could
> contain the start of the DMA window for the given device. The length of the
> DMA window is specified by two additional cells.
>
> Examples:
> =========
>
> Single-master IOMMU:
> --------------------
>
> iommu {
> #address-cells = <0>;
> #size-cells = <0>;
> };
>
> master {
> iommus = <&/iommu>;
> };
>
> Multiple-master IOMMU with fixed associations:
> ----------------------------------------------
>
> /* multiple-master IOMMU */
> iommu {
> /*
> * Masters are statically associated with this IOMMU and
> * address translation is always enabled.
> */
> #iommu-cells = <0>;
> };
Copied wrong? I guess you mean #address-cells = <0> and #size-cells = <0> here.
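With that substitution the IOMMU node of this example would presumably read as follows. This is only a reading of the intended form, not a tested fragment:

```dts
iommu {
	/*
	 * Masters are statically associated with this IOMMU and
	 * address translation is always enabled, so the specifier
	 * carries neither address nor size cells.
	 */
	#address-cells = <0>;
	#size-cells = <0>;
};
```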
> /* static association with IOMMU */
> master@1 {
> reg = <1>;
> iommus = <&/iommu>;
> };
>
> /* static association with IOMMU */
> master@2 {
> reg = <2>;
> iommus = <&/iommu>;
> };
>
> Multiple-master IOMMU:
> ----------------------
>
> iommu {
> /* the specifier represents the ID of the master */
> #address-cells = <1>;
> #size-cells = <0>;
> };
>
> master {
> /* device has master ID 42 in the IOMMU */
> iommus = <&/iommu 42>;
> };
>
> Multiple-master device:
> -----------------------
>
> /* single-master IOMMU */
> iommu@1 {
> reg = <1>;
> #address-cells = <0>;
> #size-cells = <0>;
> };
>
> /* multiple-master IOMMU */
> iommu@2 {
> reg = <2>;
> #address-cells = <1>;
> #size-cells = <0>;
> };
>
> /* device with two master interfaces */
> master {
> iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> };
>
> Multiple-master IOMMU with configurable DMA window:
> ---------------------------------------------------
>
> / {
> #address-cells = <1>;
> #size-cells = <1>;
>
> iommu {
> /* master ID, address of DMA window */
> #address-cells = <2>;
> #size-cells = <2>;
> };
>
> master {
> /* master ID 42, 4 GiB DMA window starting at 0 */
> iommus = <&/iommu 42 0 0x1 0x0>;
> };
> };
>
> Does that sound about right?
Yes, sounds great. I would probably leave out the Multiple-master device
from the examples, since that seems to be a rather obscure case.
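For reference, a sketch of how the cells in the windowed example split up, assuming the layout that follows from #address-cells = <2> and #size-cells = <2>. The "iommu" label is added only so that the fragment is self-contained:

```dts
iommu: iommu {
	/* address: master ID, start of the DMA window */
	#address-cells = <2>;
	/* size: length of the DMA window as a 64-bit value */
	#size-cells = <2>;
};

master {
	/*
	 * address cells: 42      = master ID
	 *                0       = start of the DMA window
	 * size cells:    0x1 0x0 = (0x1 << 32) | 0x0
	 *                        = 0x100000000 = 4 GiB
	 */
	iommus = <&iommu 42 0 0x1 0x0>;
};
```

With #size-cells = <1> the largest length that fits is 0xffffffff, one byte short of 4 GiB, which is why this example uses two size cells.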
I would like to add an explanation about dma-ranges to the binding:
8<--------
The parent bus of the iommu must have a valid "dma-ranges" property
describing how the physical address space of the IOMMU maps into
memory.
A device with an "iommus" property will ignore the "dma-ranges" property
of the parent node and rely on the IOMMU for translation instead.
Using an "iommus" property in bus device nodes with "dma-ranges"
specifying how child devices relate to the IOMMU is a possible extension
but is not recommended until this binding gets extended.
----------->8
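A sketch of the arrangement this text describes, assuming one bus that contains both the IOMMU and a master. Node names, unit addresses and the ID 42 are made up for illustration:

```dts
/ {
	#address-cells = <1>;
	#size-cells = <1>;

	bus {
		compatible = "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;
		/* empty value: bus addresses map 1:1 onto parent addresses */
		dma-ranges;

		iommu: iommu@10000000 {
			reg = <0x10000000 0x1000>;
			/* specifier: one cell holding the master ID */
			#address-cells = <1>;
			#size-cells = <0>;
		};

		master@20000000 {
			reg = <0x20000000 0x1000>;
			/*
			 * Translated by the IOMMU; the parent's dma-ranges
			 * is not applied to this device.
			 */
			iommus = <&iommu 42>;
		};
	};
};
```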
Does that make sense to you? We can change what we say about
dma-ranges; I mainly want to be clear about what is and is not
allowed at this point.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 12:41 ` Arnd Bergmann
(?)
@ 2014-05-20 13:17 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-20 13:17 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
> > Couldn't a single-master IOMMU be windowed?
>
> Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> IOMMU but uses one window per virtual machine. In that case, the window could
> be a property of the iommu node though, rather than part of the address
> in the link.
Does that mean that the IOMMU has one statically configured window which
is the same for each virtual machine? That would require some other
mechanism to assign separate address spaces to each virtual machine,
wouldn't it? But I suspect that if the IOMMU allows that it could be
allocated dynamically at runtime.
> > Multiple-master IOMMU with fixed associations:
> > ----------------------------------------------
> >
> > /* multiple-master IOMMU */
> > iommu {
> > /*
> > * Masters are statically associated with this IOMMU and
> > * address translation is always enabled.
> > */
> > #iommu-cells = <0>;
> > };
>
> copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
Yes, I obviously wasn't careful when I copied this over.
> > Multiple-master device:
> > -----------------------
> >
> > /* single-master IOMMU */
> > iommu@1 {
> > reg = <1>;
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > /* multiple-master IOMMU */
> > iommu@2 {
> > reg = <2>;
> > #address-cells = <1>;
> > #size-cells = <0>;
> > };
> >
> > /* device with two master interfaces */
> > master {
> > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > };
[...]
> > Does that sound about right?
>
> Yes, sounds great. I would probably leave out the Multiple-master device
> from the examples, since that seems to be a rather obscure case.
Agreed. We can easily add such examples if/when such devices start to
appear.
> I would like to add an explanation about dma-ranges to the binding:
>
> 8<--------
> The parent bus of the iommu must have a valid "dma-ranges" property
> describing how the physical address space of the IOMMU maps into
> memory.
By "physical address space" you mean the addresses after translation,
not the I/O virtual addresses, right? But even so, how will this work
when there are multiple IOMMU devices? What determines which IOMMU is
mapped via which entry?
Perhaps having multiple IOMMUs implies that there will have to be some
partitioning of the parent address space to make sure two IOMMUs don't
translate to the same ranges?
> A device with an "iommus" property will ignore the "dma-ranges" property
> of the parent node and rely on the IOMMU for translation instead.
Do we need to consider the case where an IOMMU listed in iommus isn't
enabled (status = "disabled")? In that case presumably the device would
either not function or may optionally continue to master onto the parent
untranslated.
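The case in question could look like the sketch below. The label and the ID 42 are illustrative only, and what the master should do in this situation is exactly the open question:

```dts
iommu: iommu {
	#address-cells = <1>;
	#size-cells = <0>;
	/* present in the tree, but not enabled */
	status = "disabled";
};

master {
	/* still refers to the disabled IOMMU */
	iommus = <&iommu 42>;
};
```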
> Using an "iommus" property in bus device nodes with "dma-ranges"
> specifying how child devices relate to the IOMMU is a possible extension
> but is not recommended until this binding gets extended.
Just for my understanding, bus device nodes with iommus and dma-ranges
properties could be equivalently written by explicitly moving the iommus
properties into the child device nodes, right? In which case they should
be the same as the other examples. So that concept is a convenience
notation to reduce duplication, but doesn't fundamentally introduce any
new concept.
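A sketch of the two spellings being compared, under the assumption that all children of the bus share one master ID. The bus-level form (a) is the possible extension and is not defined by the binding; node names and IDs are made up:

```dts
/ {
	#address-cells = <1>;
	#size-cells = <1>;

	iommu: iommu {
		#address-cells = <1>;
		#size-cells = <0>;
	};

	/* (a) bus-level form: one iommus property for all children */
	bus-a {
		#address-cells = <1>;
		#size-cells = <1>;
		dma-ranges;
		iommus = <&iommu 42>;

		device@0 {
			reg = <0x0 0x100>;
		};
	};

	/* (b) explicit form: the property is repeated in each child */
	bus-b {
		#address-cells = <1>;
		#size-cells = <1>;

		device@0 {
			reg = <0x0 0x100>;
			iommus = <&iommu 42>;
		};
	};
};
```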
Thierry
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 13:17 ` Thierry Reding
(?)
@ 2014-05-20 13:34 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 13:34 UTC (permalink / raw)
To: Thierry Reding
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> [...]
> > > Couldn't a single-master IOMMU be windowed?
> >
> > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > IOMMU but uses one window per virtual machine. In that case, the window could
> > be a property of the iommu node though, rather than part of the address
> > in the link.
>
> Does that mean that the IOMMU has one statically configured window which
> is the same for each virtual machine? That would require some other
> mechanism to assign separate address spaces to each virtual machine,
> wouldn't it? But I suspect that if the IOMMU allows that it could be
> allocated dynamically at runtime.
The way it works on pSeries is that upon VM creation, the guest is assigned
one 256 MB window for use by the DMA-capable devices assigned to it. When the guest
creates a mapping, it uses a hypercall to associate a bus address in that
range with a guest physical address. The hypervisor checks that the bus
address is within the allowed range, and translates the guest physical
address into a host physical address, then enters both into the I/O page
table or I/O TLB.
> > I would like to add an explanation about dma-ranges to the binding:
> >
> > 8<--------
> > The parent bus of the iommu must have a valid "dma-ranges" property
> > describing how the physical address space of the IOMMU maps into
> > memory.
>
> With physical address space you mean the addresses after translation,
> not the I/O virtual addresses, right? But even so, how will this work
> when there are multiple IOMMU devices? What determines which IOMMU is
> mapped via which entry?
>
> Perhaps having multiple IOMMUs implies that there will have to be some
> partitioning of the parent address space to make sure two IOMMUs don't
> translate to the same ranges?
These dma-ranges properties would almost always be for the entire RAM,
and we can treat anything else as a bug.
The mapping between what goes into the IOMMU and what comes out of it
is not reflected in DT at all, since it only happens at runtime.
The dma-ranges property I mean above describes how what comes out of
the IOMMU maps into physical memory.
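As a sketch of the common case, assume a system with 2 GiB of RAM at 0x80000000 and one cell each for addresses and sizes. All addresses and node names here are made up:

```dts
/ {
	#address-cells = <1>;
	#size-cells = <1>;

	memory@80000000 {
		device_type = "memory";
		/* 2 GiB of RAM starting at 0x80000000 */
		reg = <0x80000000 0x80000000>;
	};

	bus {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;
		/*
		 * <child-bus-address parent-address length>:
		 * the output of the IOMMU reaches all of RAM, 1:1.
		 */
		dma-ranges = <0x80000000 0x80000000 0x80000000>;

		iommu {
			#address-cells = <1>;
			#size-cells = <0>;
		};
	};
};
```

The runtime mapping from I/O virtual addresses to these bus addresses does not appear anywhere in the tree.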
> > A device with an "iommus" property will ignore the "dma-ranges" property
> > of the parent node and rely on the IOMMU for translation instead.
>
> Do we need to consider the case where an IOMMU listed in iommus isn't
> enabled (status = "disabled")? In that case presumably the device would
> either not function or may optionally continue to master onto the parent
> untranslated.
My reasoning was that the DT should specify whether we use the IOMMU
or not. Being able to simply switch the IOMMU on or off sounds nice as
well, so we could change the text above to allow that.
Another option would be to handle this in the IOMMU code, basically
falling back to the dma-ranges property of the IOMMU's parent and using
linear dma_map_ops when the IOMMU is disabled.
> > Using an "iommus" property in bus device nodes with "dma-ranges"
> > specifying how child devices relate to the IOMMU is a possible extension
> > but is not recommended until this binding gets extended.
>
> Just for my understanding, bus device nodes with iommus and dma-ranges
> properties could be equivalently written by explicitly moving the iommus
> properties into the child device nodes, right? In which case they should
> be the same as the other examples. So that concept is a convenience
> notation to reduce duplication, but doesn't fundamentally introduce any
> new concept.
The one case where that doesn't work is PCI, because we don't list the
PCI devices in DT normally, and the iommus property would only exist
at the PCI host controller node.
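A simplified sketch of that situation, leaving out the properties a real host bridge node needs (ranges, interrupt mapping and so on). The addresses and the zero-cell specifier are assumptions, and how individual b/d/f numbers would map to IOMMU IDs is left open here:

```dts
/ {
	#address-cells = <1>;
	#size-cells = <1>;

	iommu: iommu {
		#address-cells = <0>;
		#size-cells = <0>;
	};

	pcie@40000000 {
		device_type = "pci";
		reg = <0x40000000 0x1000>;
		#address-cells = <3>;
		#size-cells = <2>;
		/*
		 * PCI devices are discovered at runtime and have no
		 * nodes of their own, so the host controller is the
		 * only place that can carry the property.
		 */
		iommus = <&iommu>;
	};
};
```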
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-20 13:34 ` Arnd Bergmann
0 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 13:34 UTC (permalink / raw)
To: Thierry Reding
Cc: linux-arm-kernel, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Marc Zyngier, iommu,
Rob Herring, Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin
On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> [...]
> > > Couldn't a single-master IOMMU be windowed?
> >
> > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > IOMMU but uses one window per virtual machine. In that case, the window could
> > be a property of the iommu node though, rather than part of the address
> > in the link.
>
> Does that mean that the IOMMU has one statically configured window which
> is the same for each virtual machine? That would require some other
> mechanism to assign separate address spaces to each virtual machine,
> wouldn't it? But I suspect that if the IOMMU allows that it could be
> allocated dynamically at runtime.
The way it works on pSeries is that upon VM creation, the guest is assigned
one 256MB window for use by assigned DMA capable devices. When the guest
creates a mapping, it uses a hypercall to associate a bus address in that
range with a guest physical address. The hypervisor checks that the bus
address is within the allowed range, and translates the guest physical
address into a host physical address, then enters both into the I/O page
table or I/O TLB.
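
The hypercall path described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the pSeries implementation: the structure `guest_iommu`, the functions `hcall_iommu_map()` and `iommu_translate()`, the 4 KiB I/O page size, the flat table layout and the fixed guest-to-host offset are all hypothetical. Only the 256MB window, the range check and the guest-physical to host-physical translation come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IO_PAGE_SHIFT 12u                            /* assumed 4 KiB I/O pages */
#define IO_PAGE_SIZE  (UINT64_C(1) << IO_PAGE_SHIFT)
#define WINDOW_SIZE   (UINT64_C(256) << 20)          /* one 256MB window per guest */
#define WINDOW_PAGES  (WINDOW_SIZE >> IO_PAGE_SHIFT)
#define ENTRY_VALID   UINT64_C(1)                    /* hypothetical valid bit */

/* Hypothetical per-guest state kept by the hypervisor. */
struct guest_iommu {
    uint64_t window_base;         /* first bus address the guest may use */
    uint64_t guest_ram_size;      /* size of the guest physical address space */
    uint64_t host_offset;         /* host physical address of guest physical 0 */
    uint64_t table[WINDOW_PAGES]; /* I/O page table: host page address | valid */
};

static bool in_window(const struct guest_iommu *g, uint64_t bus_addr)
{
    return bus_addr >= g->window_base &&
           bus_addr - g->window_base < WINDOW_SIZE;
}

/* Hypercall handler: associate one bus page with one guest physical page. */
static bool hcall_iommu_map(struct guest_iommu *g, uint64_t bus_addr,
                            uint64_t guest_phys)
{
    assert((g->host_offset & (IO_PAGE_SIZE - 1)) == 0);

    /* The hypervisor checks that the bus address is inside the window. */
    if (!in_window(g, bus_addr))
        return false;
    /* The guest may only expose its own memory, in whole pages. */
    if (guest_phys >= g->guest_ram_size)
        return false;
    if ((bus_addr | guest_phys) & (IO_PAGE_SIZE - 1))
        return false;

    /* Translate guest physical to host physical, then enter the mapping. */
    uint64_t host_phys = g->host_offset + guest_phys;
    g->table[(bus_addr - g->window_base) >> IO_PAGE_SHIFT] =
        host_phys | ENTRY_VALID;
    return true;
}

/* What the IOMMU does with a DMA access from an assigned device. */
static bool iommu_translate(const struct guest_iommu *g, uint64_t bus_addr,
                            uint64_t *host_phys)
{
    if (!in_window(g, bus_addr))
        return false;

    uint64_t entry = g->table[(bus_addr - g->window_base) >> IO_PAGE_SHIFT];
    if (!(entry & ENTRY_VALID))
        return false;

    *host_phys = (entry & ~(IO_PAGE_SIZE - 1)) |
                 (bus_addr & (IO_PAGE_SIZE - 1));
    return true;
}
```

The point of the sketch is that the window is per-guest state owned by the hypervisor, which is why it could be a property of the iommu node rather than part of the address in the link.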
> > I would like to add an explanation about dma-ranges to the binding:
> >
> > 8<--------
> > The parent bus of the iommu must have a valid "dma-ranges" property
> > describing how the physical address space of the IOMMU maps into
> > memory.
>
> With physical address space you mean the addresses after translation,
> not the I/O virtual addresses, right? But even so, how will this work
> when there are multiple IOMMU devices? What determines which IOMMU is
> mapped via which entry?
>
> Perhaps having multiple IOMMUs implies that there will have to be some
> partitioning of the parent address space to make sure two IOMMUs don't
> translate to the same ranges?
These dma-ranges properties would almost always be for the entire RAM,
and we can treat anything else as a bug.
The mapping between what goes into the IOMMU and what comes out of it
is not reflected in DT at all, since it only happens at runtime.
The dma-ranges property I mean above describes how what comes out of
the IOMMU maps into physical memory.
> > A device with an "iommus" property will ignore the "dma-ranges" property
> > of the parent node and rely on the IOMMU for translation instead.
>
> Do we need to consider the case where an IOMMU listed in iommus isn't
> enabled (status = "disabled")? In that case presumably the device would
> either not function or may optionally continue to master onto the parent
> untranslated.
My reasoning was that the DT should specify whether we use the IOMMU
or not. Being able to just switch on or off the IOMMU sounds nice as
well, so we could change the text above to do that.
Another option would be to do this in the IOMMU code, basically
falling back to the IOMMU parent's dma-ranges property and using
linear dma_map_ops when that is disabled.
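
The fallback described above could look roughly like the following C sketch. It is a simplified model and not kernel code: `bus_node`, `iommu_node`, `dev_node` and `choose_dma_setup()` are hypothetical names, and the real decision would be made while setting up a device's dma_map_ops. The sketch only shows which node's dma-ranges applies in each case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the device tree nodes involved. */
struct bus_node {
    const char *name;               /* bus node carrying a dma-ranges property */
};

struct iommu_node {
    bool enabled;                   /* false models status = "disabled" */
    const struct bus_node *parent;  /* parent bus of the IOMMU */
};

struct dev_node {
    const struct iommu_node *iommu; /* NULL if the device has no "iommus" */
    const struct bus_node *parent;  /* parent bus of the device */
};

enum dma_mode { DMA_MODE_IOMMU, DMA_MODE_LINEAR };

struct dma_setup {
    enum dma_mode mode;
    const struct bus_node *ranges_from; /* whose dma-ranges is consulted */
};

static struct dma_setup choose_dma_setup(const struct dev_node *dev)
{
    struct dma_setup s;

    assert(dev != NULL && dev->parent != NULL);

    if (dev->iommu == NULL) {
        /* No "iommus" property: ordinary linear mapping via the parent bus. */
        s.mode = DMA_MODE_LINEAR;
        s.ranges_from = dev->parent;
    } else if (dev->iommu->enabled) {
        /* Translation happens in the IOMMU; its parent's dma-ranges
         * describes how the IOMMU output maps into physical memory. */
        s.mode = DMA_MODE_IOMMU;
        s.ranges_from = dev->iommu->parent;
    } else {
        /* IOMMU disabled: fall back to linear ops, still using the
         * IOMMU parent's dma-ranges rather than the device parent's. */
        s.mode = DMA_MODE_LINEAR;
        s.ranges_from = dev->iommu->parent;
    }
    return s;
}
```

Under this model a device with an "iommus" property never consults its own parent's dma-ranges, whether or not the IOMMU is enabled.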
> > Using an "iommus" property in bus device nodes with "dma-ranges"
> > specifying how child devices relate to the IOMMU is a possible extension
> > but is not recommended until this binding gets extended.
>
> Just for my understanding, bus device nodes with iommus and dma-ranges
> properties could be equivalently written by explicitly moving the iommus
> properties into the child device nodes, right? In which case they should
> be the same as the other examples. So that concept is a convenience
> notation to reduce duplication, but doesn't fundamentally introduce any
> new concept.
The one case where that doesn't work is PCI, because we don't list the
PCI devices in DT normally, and the iommus property would only exist
at the PCI host controller node.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 13:34 ` Arnd Bergmann
(?)
@ 2014-05-20 14:00 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-20 14:00 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
[-- Attachment #1.1: Type: text/plain, Size: 5367 bytes --]
On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > [...]
> > > > Couldn't a single-master IOMMU be windowed?
> > >
> > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > > IOMMU but uses one window per virtual machine. In that case, the window could
> > > be a property of the iommu node though, rather than part of the address
> > > in the link.
> >
> > Does that mean that the IOMMU has one statically configured window which
> > is the same for each virtual machine? That would require some other
> > mechanism to assign separate address spaces to each virtual machine,
> > wouldn't it? But I suspect that if the IOMMU allows that it could be
> > allocated dynamically at runtime.
>
> The way it works on pSeries is that upon VM creation, the guest is assigned
> one 256MB window for use by assigned DMA capable devices. When the guest
> creates a mapping, it uses a hypercall to associate a bus address in that
> range with a guest physical address. The hypervisor checks that the bus
> address is within the allowed range, and translates the guest physical
> address into a host physical address, then enters both into the I/O page
> table or I/O TLB.
So when a VM is booted it is passed a device tree with that DMA window?
Given what you describe above this seems to be more of a configuration
option to restrict the IOMMU to a subset of the physical memory for
purposes of virtualization. So I agree that this wouldn't be a good fit
for what we're trying to achieve with iommus or dma-ranges in this
binding.
> > > I would like to add an explanation about dma-ranges to the binding:
> > >
> > > 8<--------
> > > The parent bus of the iommu must have a valid "dma-ranges" property
> > > describing how the physical address space of the IOMMU maps into
> > > memory.
> >
> > With physical address space you mean the addresses after translation,
> > not the I/O virtual addresses, right? But even so, how will this work
> > when there are multiple IOMMU devices? What determines which IOMMU is
> > mapped via which entry?
> >
> > Perhaps having multiple IOMMUs implies that there will have to be some
> > partitioning of the parent address space to make sure two IOMMUs don't
> > translate to the same ranges?
>
> These dma-ranges properties would almost always be for the entire RAM,
> and we can treat anything else as a bug.
Would it typically be a 1:1 mapping? In that case could we define an
empty dma-ranges property to mean exactly that? That would make it
consistent with the ranges property.
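
To make the question concrete, here is a minimal C sketch of what treating an empty dma-ranges as a 1:1 mapping would mean for a translation helper. The type `dma_range` and the function `dma_ranges_translate()` are hypothetical and only model the lookup; a missing property (as opposed to an empty one) is deliberately not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One (child address, parent address, size) entry of a dma-ranges property. */
struct dma_range {
    uint64_t child_base;
    uint64_t parent_base;
    uint64_t size;
};

/*
 * Translate a child bus address into the parent address space.
 * n == 0 models an empty dma-ranges property, taken here to mean that
 * both address spaces are identical.
 */
static bool dma_ranges_translate(const struct dma_range *ranges, size_t n,
                                 uint64_t child, uint64_t *parent)
{
    assert(parent != NULL);
    assert(n == 0 || ranges != NULL);

    if (n == 0) {
        *parent = child;            /* empty property: 1:1 mapping */
        return true;
    }

    for (size_t i = 0; i < n; i++) {
        if (child >= ranges[i].child_base &&
            child - ranges[i].child_base < ranges[i].size) {
            *parent = ranges[i].parent_base +
                      (child - ranges[i].child_base);
            return true;
        }
    }
    return false;                   /* address not covered by any entry */
}
```

With that convention the IOMMU's parent bus would only need an explicit entry when the IOMMU output is offset or restricted relative to physical memory.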
> The mapping between what goes into the IOMMU and what comes out of it
> is not reflected in DT at all, since it only happens at runtime.
> The dma-ranges property I mean above describes how what comes out of
> the IOMMU maps into physical memory.
Understood. That makes sense.
> > > A device with an "iommus" property will ignore the "dma-ranges" property
> > > of the parent node and rely on the IOMMU for translation instead.
> >
> > Do we need to consider the case where an IOMMU listed in iommus isn't
> > enabled (status = "disabled")? In that case presumably the device would
> > either not function or may optionally continue to master onto the parent
> > untranslated.
>
> My reasoning was that the DT should specify whether we use the IOMMU
> or not. Being able to just switch on or off the IOMMU sounds nice as
> well, so we could change the text above to do that.
>
> Another option would be to do this in the IOMMU code, basically
> falling back to the IOMMU parent's dma-ranges property and using
> linear dma_map_ops when that is disabled.
Yes, it should be trivial for the IOMMU core code to take care of this
special case. Still I think it's worth mentioning it in the binding so
that it's clearly specified.
> > > Using an "iommus" property in bus device nodes with "dma-ranges"
> > > specifying how child devices relate to the IOMMU is a possible extension
> > > but is not recommended until this binding gets extended.
> >
> > Just for my understanding, bus device nodes with iommus and dma-ranges
> > properties could be equivalently written by explicitly moving the iommus
> > properties into the child device nodes, right? In which case they should
> > be the same as the other examples. So that concept is a convenience
> > notation to reduce duplication, but doesn't fundamentally introduce any
> > new concept.
>
> The one case where that doesn't work is PCI, because we don't list the
> PCI devices in DT normally, and the iommus property would only exist
> at the PCI host controller node.
But it could work in classic OpenFirmware where the device tree can be
populated with the tree of PCI devices enumerated by OF. There are also
devices that have a fixed configuration and where technically the PCI
devices can be listed in the device tree. This is somewhat important if
for example one PCI device is a GPIO controller and needs to be
referenced by phandle from some other device.
I'll make a note in the binding document about this possible future
extension.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 14:00 ` Thierry Reding
@ 2014-05-20 20:31 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 20:31 UTC (permalink / raw)
To: Thierry Reding
Cc: linux-arm-kernel, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Marc Zyngier, iommu,
Rob Herring, Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin
On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
> On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > [...]
> > > > > Couldn't a single-master IOMMU be windowed?
> > > >
> > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > > > IOMMU but uses one window per virtual machine. In that case, the window could
> > > > be a property of the iommu node though, rather than part of the address
> > > > in the link.
> > >
> > > Does that mean that the IOMMU has one statically configured window which
> > > is the same for each virtual machine? That would require some other
> > > mechanism to assign separate address spaces to each virtual machine,
> > > wouldn't it? But I suspect that if the IOMMU allows that it could be
> > > allocated dynamically at runtime.
> >
> > The way it works on pSeries is that upon VM creation, the guest is assigned
> > one 256MB window for use by assigned DMA capable devices. When the guest
> > creates a mapping, it uses a hypercall to associate a bus address in that
> > range with a guest physical address. The hypervisor checks that the bus
> > address is within the allowed range, and translates the guest physical
> > address into a host physical address, then enters both into the I/O page
> > table or I/O TLB.
>
> So when a VM is booted it is passed a device tree with that DMA window?
Correct.
> Given what you describe above this seems to be more of a configuration
> option to restrict the IOMMU to a subset of the physical memory for
> purposes of virtualization. So I agree that this wouldn't be a good fit
> for what we're trying to achieve with iommus or dma-ranges in this
> binding.
Thinking about it again now, I wonder if there are any other use cases
for windowed IOMMUs. If this is the only one, there might be no use
in the #address-cells model I suggested instead of your original
#iommu-cells.
> > > > I would like to add an explanation about dma-ranges to the binding:
> > > >
> > > > 8<--------
> > > > The parent bus of the iommu must have a valid "dma-ranges" property
> > > > describing how the physical address space of the IOMMU maps into
> > > > memory.
> > >
> > > With physical address space you mean the addresses after translation,
> > > not the I/O virtual addresses, right? But even so, how will this work
> > > when there are multiple IOMMU devices? What determines which IOMMU is
> > > mapped via which entry?
> > >
> > > Perhaps having multiple IOMMUs implies that there will have to be some
> > > partitioning of the parent address space to make sure two IOMMUs don't
> > > translate to the same ranges?
> >
> > These dma-ranges properties would almost always be for the entire RAM,
> > and we can treat anything else as a bug.
>
> Would it typically be a 1:1 mapping? In that case could we define an
> empty dma-ranges property to mean exactly that? That would make it
> consistent with the ranges property.
Yes, I believe that is how it's already defined.
> > > > A device with an "iommus" property will ignore the "dma-ranges" property
> > > > of the parent node and rely on the IOMMU for translation instead.
> > >
> > > Do we need to consider the case where an IOMMU listed in iommus isn't
> > > enabled (status = "disabled")? In that case presumably the device would
> > > either not function or may optionally continue to master onto the parent
> > > untranslated.
> >
> > My reasoning was that the DT should specify whether we use the IOMMU
> > or not. Being able to just switch on or off the IOMMU sounds nice as
> > well, so we could change the text above to do that.
> >
> > Another option would be to do this in the IOMMU code, basically
> > falling back to the IOMMU parent's dma-ranges property and using
> > linear dma_map_ops when that is disabled.
>
> Yes, it should be trivial for the IOMMU core code to take care of this
> special case. Still I think it's worth mentioning it in the binding so
> that it's clearly specified.
Agreed.
> > > > Using an "iommus" property in bus device nodes with "dma-ranges"
> > > > specifying how child devices relate to the IOMMU is a possible extension
> > > > but is not recommended until this binding gets extended.
> > >
> > > Just for my understanding, bus device nodes with iommus and dma-ranges
> > > properties could be equivalently written by explicitly moving the iommus
> > > properties into the child device nodes, right? In which case they should
> > > be the same as the other examples. So that concept is a convenience
> > > notation to reduce duplication, but doesn't fundamentally introduce any
> > > new concept.
> >
> > The one case where that doesn't work is PCI, because we don't list the
> > PCI devices in DT normally, and the iommus property would only exist
> > at the PCI host controller node.
>
> But it could work in classic OpenFirmware where the device tree can be
> populated with the tree of PCI devices enumerated by OF. There are also
> devices that have a fixed configuration and where technically the PCI
> devices can be listed in the device tree. This is somewhat important if
> for example one PCI device is a GPIO controller and needs to be
> referenced by phandle from some other device.
Correct. The flaw of classic Open Firmware here was that it cannot
handle PCIe hotplug though, so we can never rely on the DT to
describe all devices.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 20:31 ` Arnd Bergmann
(?)
@ 2014-05-21 8:16 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-21 8:16 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > > [...]
> > > > > > Couldn't a single-master IOMMU be windowed?
> > > > >
> > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > > > > IOMMU but uses one window per virtual machine. In that case, the window could
> > > > > be a property of the iommu node though, rather than part of the address
> > > > > in the link.
> > > >
> > > > Does that mean that the IOMMU has one statically configured window which
> > > > is the same for each virtual machine? That would require some other
> > > > mechanism to assign separate address spaces to each virtual machine,
> > > > wouldn't it? But I suspect that if the IOMMU allows that it could be
> > > > allocated dynamically at runtime.
> > >
> > > The way it works on pSeries is that upon VM creation, the guest is assigned
> > > one 256MB window for use by assigned DMA capable devices. When the guest
> > > creates a mapping, it uses a hypercall to associate a bus address in that
> > > range with a guest physical address. The hypervisor checks that the bus
> > > address is within the allowed range, and translates the guest physical
> > > address into a host physical address, then enters both into the I/O page
> > > table or I/O TLB.
> >
> > So when a VM is booted it is passed a device tree with that DMA window?
>
> Correct.
>
> > Given what you describe above this seems to be more of a configuration
> > option to restrict the IOMMU to a subset of the physical memory for
> > purposes of virtualization. So I agree that this wouldn't be a good fit
> > for what we're trying to achieve with iommus or dma-ranges in this
> > binding.
>
> Thinking about it again now, I wonder if there are any other use cases
> for windowed IOMMUs. If this is the only one, there might be no use
> in the #address-cells model I suggested instead of your original
> #iommu-cells.
So in this case virtualization is the reason why we need the DMA window.
The reason for that is that the guest has no other way of knowing what
other guests might be using, so it's essentially a mechanism for the
host to manage the DMA region and allocate subregions for each guest. If
virtualization isn't an issue then it seems to me that the need for DMA
windows goes away because the operating system will track DMA regions
anyway.
The only reason I can think of why a windowed IOMMU would be useful is
to prevent two or more devices from stepping on each other's toes. But
that's a problem that the OS should already be handling during DMA
buffer allocation, isn't it?
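
As a rough illustration of the pSeries-style hypercall path quoted above,
here is a minimal sketch in C. All names (struct dma_window, struct guest,
hcall_map) are made up for this mail, and the linear guest-physical to
host-physical offset is an assumption standing in for whatever tables a
real hypervisor would walk:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-guest DMA window (pSeries assigns one 256MB window). */
struct dma_window {
	uint64_t base;	/* first bus address in the window */
	uint64_t size;	/* window length in bytes */
};

/*
 * Hypothetical guest: its window plus a linear guest-physical to
 * host-physical offset (an assumption made to keep the sketch short).
 */
struct guest {
	struct dma_window win;
	uint64_t gpa_to_hpa_offset;
};

/* One I/O page table entry as the hypervisor would record it. */
struct iopte {
	uint64_t bus_addr;
	uint64_t host_phys;
	bool valid;
};

/*
 * Model of the map hypercall: reject bus addresses outside the guest's
 * window, translate guest physical to host physical, fill in the entry.
 * Returns 0 on success, -1 if the range is not fully inside the window.
 */
int hcall_map(const struct guest *g, uint64_t bus_addr, uint64_t guest_phys,
	      uint64_t len, struct iopte *out)
{
	const struct dma_window *w = &g->win;

	assert(out != NULL);
	/* Overflow-safe check that [bus_addr, bus_addr + len) is in the window. */
	if (len == 0 || bus_addr < w->base || len > w->size ||
	    bus_addr - w->base > w->size - len)
		return -1;

	out->bus_addr = bus_addr;
	out->host_phys = guest_phys + g->gpa_to_hpa_offset;
	out->valid = true;
	return 0;
}
```

The point is only that the window is enforced by the host on every mapping
request, which is why it looks like a property of the IOMMU instance rather
than of each master.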
> > > > > I would like to add an explanation about dma-ranges to the binding:
> > > > >
> > > > > 8<--------
> > > > > The parent bus of the iommu must have a valid "dma-ranges" property
> > > > > describing how the physical address space of the IOMMU maps into
> > > > > memory.
> > > >
> > > > With physical address space you mean the addresses after translation,
> > > > not the I/O virtual addresses, right? But even so, how will this work
> > > > when there are multiple IOMMU devices? What determines which IOMMU is
> > > > mapped via which entry?
> > > >
> > > > Perhaps having multiple IOMMUs implies that there will have to be some
> > > > partitioning of the parent address space to make sure two IOMMUs don't
> > > > translate to the same ranges?
> > >
> > > These dma-ranges properties would almost always be for the entire RAM,
> > > and we can treat anything else as a bug.
> >
> > Would it typically be a 1:1 mapping? In that case could we define an
> > empty dma-ranges property to mean exactly that? That would make it
> > consistent with the ranges property.
>
> Yes, I believe that is how it's already defined.
I've gone through the proposal at [0] again, but couldn't find a mention
of an empty "dma-ranges" property. But regardless I think that a 1:1
mapping is the obvious meaning of an empty "dma-ranges" property.
[0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
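
To make the 1:1 reading concrete, here is a minimal sketch of how a
translation helper could treat an empty property. It assumes the
property has already been decoded into <child parent length> triples;
struct dma_range and dma_ranges_translate() are hypothetical names, not
existing kernel API, and only the "present but empty" case is modelled
(an absent property is a separate question):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded dma-ranges entry: <child-bus-address parent-address length>. */
struct dma_range {
	uint64_t child_addr;
	uint64_t parent_addr;
	uint64_t size;
};

/*
 * Translate a child bus address to a parent address.
 * n == 0 models an empty "dma-ranges;" property and is read as 1:1.
 * Returns 0 on success, -1 if no entry covers the address.
 */
int dma_ranges_translate(const struct dma_range *r, size_t n,
			 uint64_t child, uint64_t *parent)
{
	assert(parent != NULL);
	if (n == 0) {
		*parent = child;	/* empty property: identity mapping */
		return 0;
	}
	for (size_t i = 0; i < n; i++) {
		if (child >= r[i].child_addr &&
		    child - r[i].child_addr < r[i].size) {
			*parent = r[i].parent_addr + (child - r[i].child_addr);
			return 0;
		}
	}
	return -1;
}
```

This mirrors the way an empty "ranges" property is read, which is the
consistency argument made above.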
One thing I'm not sure about is whether dma-ranges should be documented
in this binding at all. Since there's an accepted standard proposal it
would seem that it doesn't need to be specifically mentioned. One other
option would be to link to the above proposal from the binding and then
complement that with what an empty "dma-ranges" property means.
Or we could possibly document this in a file along with other standard
properties. I don't think we currently do that for any properties, but
my concern is that there will always be a limited number of people
who know how such properties are supposed to work. If all of these
people were to disappear all of a sudden, everybody else would be left
with references to these properties but nowhere to look for their
meaning.
> > > > > A device with an "iommus" property will ignore the "dma-ranges" property
> > > > > of the parent node and rely on the IOMMU for translation instead.
> > > >
> > > > Do we need to consider the case where an IOMMU listed in iommus isn't
> > > > enabled (status = "disabled")? In that case presumably the device would
> > > > either not function or may optionally continue to master onto the parent
> > > > untranslated.
> > >
> > > My reasoning was that the DT should specify whether we use the IOMMU
> > > or not. Being able to just switch on or off the IOMMU sounds nice as
> > > well, so we could change the text above to do that.
> > >
> > > Another option would be to do this in the IOMMU code, basically
> > > falling back to the IOMMU parent's dma-ranges property and using
> > > linear dma_map_ops when that is disabled.
> >
> > Yes, it should be trivial for the IOMMU core code to take care of this
> > special case. Still I think it's worth mentioning it in the binding so
> > that it's clearly specified.
>
> Agreed.
Okay, I have a new version of the binding that I think incorporates all
the changes discussed so far. It uses #address-cells and #size-cells to
define the length of the specifier, but if we decide against that it can
easily be changed again.
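
For reference, this is roughly how a parser could walk an "iommus"
property under that scheme. It is a sketch only: it assumes each entry is
a phandle followed by a specifier of #address-cells + #size-cells cells
taken from the referenced IOMMU node, and struct iommu_node,
find_iommu() and iommus_count_entries() are hypothetical names rather
than existing kernel helpers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-resolved IOMMU node. */
struct iommu_node {
	uint32_t phandle;
	uint32_t address_cells;	/* #address-cells of the IOMMU node */
	uint32_t size_cells;	/* #size-cells of the IOMMU node */
};

/* Look up an IOMMU node by phandle; returns NULL if it is unknown. */
static const struct iommu_node *find_iommu(const struct iommu_node *nodes,
					   size_t n, uint32_t phandle)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].phandle == phandle)
			return &nodes[i];
	return NULL;
}

/*
 * Count the entries in an "iommus" property given as an array of cells.
 * Each entry is <phandle specifier...>, where the specifier length is
 * #address-cells + #size-cells of the referenced IOMMU.
 * Returns the entry count, or -1 on an unknown phandle or truncated entry.
 */
int iommus_count_entries(const uint32_t *cells, size_t ncells,
			 const struct iommu_node *nodes, size_t nnodes)
{
	size_t pos = 0;
	int count = 0;

	assert(cells != NULL || ncells == 0);
	while (pos < ncells) {
		const struct iommu_node *io = find_iommu(nodes, nnodes, cells[pos]);
		size_t spec;

		if (io == NULL)
			return -1;
		spec = (size_t)io->address_cells + io->size_cells;
		if (ncells - pos - 1 < spec)
			return -1;	/* specifier runs past the property */
		pos += 1 + spec;
		count++;
	}
	return count;
}
```

Switching back to #iommu-cells would only change how the specifier
length is computed; the walk itself stays the same.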
Thierry
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 8:16 ` Thierry Reding
(?)
@ 2014-05-21 8:54 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 8:54 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Rob Herring, Marc Zyngier, iommu, Thierry Reding,
Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin
On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote:
> On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > > > [...]
> > > > > > > Couldn't a single-master IOMMU be windowed?
> > > > > >
> > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > > > > > IOMMU but uses one window per virtual machine. In that case, the window could
> > > > > > be a property of the iommu node though, rather than part of the address
> > > > > > in the link.
> > > > >
> > > > > Does that mean that the IOMMU has one statically configured window which
> > > > > is the same for each virtual machine? That would require some other
> > > > > mechanism to assign separate address spaces to each virtual machine,
> > > > > wouldn't it? But I suspect that if the IOMMU allows that it could be
> > > > > allocated dynamically at runtime.
> > > >
> > > > The way it works on pSeries is that upon VM creation, the guest is assigned
> > > > one 256MB window for use by assigned DMA capable devices. When the guest
> > > > creates a mapping, it uses a hypercall to associate a bus address in that
> > > > range with a guest physical address. The hypervisor checks that the bus
> > > > address is within the allowed range, and translates the guest physical
> > > > address into a host physical address, then enters both into the I/O page
> > > > table or I/O TLB.
> > >
> > > So when a VM is booted it is passed a device tree with that DMA window?
> >
> > Correct.
> >
> > > Given what you describe above this seems to be more of a configuration
> > > option to restrict the IOMMU to a subset of the physical memory for
> > > purposes of virtualization. So I agree that this wouldn't be a good fit
> > > for what we're trying to achieve with iommus or dma-ranges in this
> > > binding.
> >
> > Thinking about it again now, I wonder if there are any other use cases
> > for windowed IOMMUs. If this is the only one, there might be no use
> > in the #address-cells model I suggested instead of your original
> > #iommu-cells.
>
> So in this case virtualization is the reason why we need the DMA window.
> The reason for that is that the guest has no other way of knowing what
> other guests might be using, so it's essentially a mechanism for the
> host to manage the DMA region and allocate subregions for each guest. If
> virtualization isn't an issue then it seems to me that the need for DMA
> windows goes away because the operating system will track DMA regions
> anyway.
>
> The only reason I can think of why a windowed IOMMU would be useful is
> to prevent two or more devices from stepping on each other's toes. But
> that's a problem that the OS should already be handling during DMA
> buffer allocation, isn't it?
Right. As long as we always unmap the buffers from the IOMMU after they
have stopped being in use, it's very unlikely that even a broken device
driver causes a DMA into some bus address that happens to be mapped for
another device.
> > > > > > I would like to add an explanation about dma-ranges to the binding:
> > > > > >
> > > > > > 8<--------
> > > > > > The parent bus of the iommu must have a valid "dma-ranges" property
> > > > > > describing how the physical address space of the IOMMU maps into
> > > > > > memory.
> > > > >
> > > > > With physical address space you mean the addresses after translation,
> > > > > not the I/O virtual addresses, right? But even so, how will this work
> > > > > when there are multiple IOMMU devices? What determines which IOMMU is
> > > > > mapped via which entry?
> > > > >
> > > > > Perhaps having multiple IOMMUs implies that there will have to be some
> > > > > partitioning of the parent address space to make sure two IOMMUs don't
> > > > > translate to the same ranges?
> > > >
> > > > These dma-ranges properties would almost always be for the entire RAM,
> > > > and we can treat anything else as a bug.
> > >
> > > Would it typically be a 1:1 mapping? In that case could we define an
> > > empty dma-ranges property to mean exactly that? That would make it
> > > consistent with the ranges property.
> >
> > Yes, I believe that is how it's already defined.
>
> I've gone through the proposal at [0] again, but couldn't find a mention
> of an empty "dma-ranges" property. But regardless I think that a 1:1
> mapping is the obvious meaning of an empty "dma-ranges" property.
>
> [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
>
> One thing I'm not sure about is whether dma-ranges should be documented
> in this binding at all. Since there's an accepted standard proposal it
> would seem that it doesn't need to be specifically mentioned. One other
> option would be to link to the above proposal from the binding and then
> complement that with what an empty "dma-ranges" property means.
>
> Or we could possibly document this in a file along with other standard
> properties. I don't think we currently do that for any properties, but
> my concern is that there will always be a limited number of people
> knowing about how such properties are supposed to work. If all of a
> sudden all these people would disappear, everybody else would be left
> with references to these properties but nowhere to look for their
> meaning.
I think it makes sense to document how the standard dma-ranges
interacts with the new iommu binding, because it's not obvious
what happens if you have both together, or iommu without a parent
dma-ranges.
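To make the cases concrete, here is a rough sketch of one way the rules
discussed in this thread could be resolved. It is only an illustration under
stated assumptions, not the binding text and not existing kernel code: the
types and functions (struct dma_range, struct dma_master, dma_select_path,
dma_linear_translate) are hypothetical names made up for this example. It
assumes that a device with "iommus" uses the IOMMU when that IOMMU is enabled,
that it falls back to a linear mapping otherwise, and that an empty
"dma-ranges" means a 1:1 mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a "dma-ranges" property (hypothetical, simplified layout). */
struct dma_range {
	uint64_t child_base;  /* bus address as seen by the DMA master */
	uint64_t parent_base; /* corresponding address on the parent bus */
	uint64_t size;        /* length of the window in bytes */
};

/* Hypothetical view of a DMA master; not a kernel structure. */
struct dma_master {
	bool has_iommus;    /* node carries an "iommus" property */
	bool iommu_enabled; /* referenced IOMMU is not status = "disabled" */
	/*
	 * dma-ranges of whichever parent bus applies: the device's parent,
	 * or the IOMMU's parent in the fallback case.
	 */
	const struct dma_range *ranges;
	size_t nranges;     /* 0 models an empty property: 1:1 mapping */
};

enum dma_path { DMA_PATH_IOMMU, DMA_PATH_LINEAR };

/* Pick the translation path for a device. */
static enum dma_path dma_select_path(const struct dma_master *m)
{
	/* "iommus" wins over the parent's dma-ranges while the IOMMU is enabled. */
	if (m->has_iommus && m->iommu_enabled)
		return DMA_PATH_IOMMU;
	/* No IOMMU, or IOMMU disabled: fall back to a linear mapping. */
	return DMA_PATH_LINEAR;
}

/* Linear translation; returns false if the address is in no window. */
static bool dma_linear_translate(const struct dma_master *m, uint64_t bus,
				 uint64_t *out)
{
	assert(out != NULL);

	if (m->nranges == 0) { /* empty dma-ranges: identity mapping */
		*out = bus;
		return true;
	}
	for (size_t i = 0; i < m->nranges; i++) {
		const struct dma_range *r = &m->ranges[i];

		if (bus >= r->child_base && bus - r->child_base < r->size) {
			*out = r->parent_base + (bus - r->child_base);
			return true;
		}
	}
	return false;
}
```

Keeping the path selection separate from the linear translation means the
"IOMMU disabled" case needs no special code beyond pointing ranges at the
IOMMU parent's dma-ranges.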
> > > > > > A device with an "iommus" property will ignore the "dma-ranges" property
> > > > > > of the parent node and rely on the IOMMU for translation instead.
> > > > >
> > > > > Do we need to consider the case where an IOMMU listed in iommus isn't
> > > > > enabled (status = "disabled")? In that case presumably the device would
> > > > > either not function or may optionally continue to master onto the parent
> > > > > untranslated.
> > > >
> > > > My reasoning was that the DT should specify whether we use the IOMMU
> > > > or not. Being able to just switch on or off the IOMMU sounds nice as
> > > > well, so we could change the text above to do that.
> > > >
> > > > Another option would be to do this in the IOMMU code, basically
> > > > falling back to the IOMMU parent's dma-ranges property and using
> > > > linear dma_map_ops when that is disabled.
> > >
> > > Yes, it should be trivial for the IOMMU core code to take care of this
> > > special case. Still I think it's worth mentioning it in the binding so
> > > that it's clearly specified.
> >
> > Agreed.
>
> Okay, I have a new version of the binding that I think incorporates all
> the changes discussed so far. It uses #address-cells and #size-cells to
> define the length of the specifier, but if we decide against that it can
> easily be changed again.
Ok.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 8:54 ` Arnd Bergmann
(?)
@ 2014-05-21 9:02 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-21 9:02 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring,
Kumar Gala, linux-tegra, Cho KyongHo,
Dave Martin, linux-arm-kernel
On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
> On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
> > > > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
> > > > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
> > > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > > > > [...]
> > > > > > > > Couldn't a single-master IOMMU be windowed?
> > > > > > >
> > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> > > > > > > IOMMU but uses one window per virtual machine. In that case, the window could
> > > > > > > be a property of the iommu node though, rather than part of the address
> > > > > > > in the link.
> > > > > >
> > > > > > Does that mean that the IOMMU has one statically configured window which
> > > > > > is the same for each virtual machine? That would require some other
> > > > > > mechanism to assign separate address spaces to each virtual machine,
> > > > > > wouldn't it? But I suspect that if the IOMMU allows that it could be
> > > > > > allocated dynamically at runtime.
> > > > >
> > > > > The way it works on pSeries is that upon VM creation, the guest is assigned
> > > > > one 256MB window for use by assigned DMA capable devices. When the guest
> > > > > creates a mapping, it uses a hypercall to associate a bus address in that
> > > > > range with a guest physical address. The hypervisor checks that the bus
> > > > > address is within the allowed range, and translates the guest physical
> > > > > address into a host physical address, then enters both into the I/O page
> > > > > table or I/O TLB.
> > > >
> > > > So when a VM is booted it is passed a device tree with that DMA window?
> > >
> > > Correct.
> > >
> > > > Given what you describe above this seems to be more of a configuration
> > > > option to restrict the IOMMU to a subset of the physical memory for
> > > > purposes of virtualization. So I agree that this wouldn't be a good fit
> > > > for what we're trying to achieve with iommus or dma-ranges in this
> > > > binding.
> > >
> > > Thinking about it again now, I wonder if there are any other use cases
> > > for windowed IOMMUs. If this is the only one, there might be no use
> > > in the #address-cells model I suggested instead of your original
> > > #iommu-cells.
> >
> > So in this case virtualization is the reason why we need the DMA window.
> > The reason for that is that the guest has no other way of knowing what
> > other guests might be using, so it's essentially a mechanism for the
> > host to manage the DMA region and allocate subregions for each guest. If
> > virtualization isn't an issue then it seems to me that the need for DMA
> > windows goes away because the operating system will track DMA regions
> > anyway.
> >
> > The only reason I can think of why a windowed IOMMU would be useful is
> > to prevent two or more devices from stepping on each other's toes. But
> > that's a problem that the OS should already be handling during DMA
> > buffer allocation, isn't it?
>
> Right. As long as we always unmap the buffers from the IOMMU after they
> have stopped being in use, it's very unlikely that even a broken device
> driver causes a DMA into some bus address that happens to be mapped for
> another device.
I think that if buffers remain mapped in the IOMMU when they have been
deallocated that should be considered a bug.
Thierry
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 9:02 ` Thierry Reding
@ 2014-05-21 9:32 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 9:32 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Thierry Reding, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Marc Zyngier, iommu,
Rob Herring, Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin
On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
> On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
>
> > Right. As long as we always unmap the buffers from the IOMMU after they
> > have stopped being in use, it's very unlikely that even a broken device
> > driver causes a DMA into some bus address that happens to be mapped for
> > another device.
>
> I think that if buffers remain mapped in the IOMMU when they have been
> deallocated that should be considered a bug.
You could see it as a performance optimization in some cases, e.g. when you
cannot flush individual IOTLBs or doing so is expensive, and you just keep
assigning new addresses until every one has been used, then you do a global
IOTLB flush once. Obviously you have to maintain the IO page tables correctly.
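To illustrate that scheme, here is a minimal sketch. It is not taken from any real driver: every name is hypothetical, and the hardware flush is modelled as a counter. Unmapped slots are remembered as stale and are only handed out again after a global flush, which happens when no clean slot is left.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 8 /* hypothetical number of bus-address slots */

enum slot_state { SLOT_FREE, SLOT_MAPPED, SLOT_STALE };

struct lazy_iommu {
    enum slot_state slot[NR_SLOTS];
    unsigned int flushes; /* global IOTLB flushes issued so far */
};

/* Stand-in for the hardware flush: afterwards no stale translation remains. */
void global_iotlb_flush(struct lazy_iommu *mmu)
{
    size_t i;

    mmu->flushes++;
    for (i = 0; i < NR_SLOTS; i++)
        if (mmu->slot[i] == SLOT_STALE)
            mmu->slot[i] = SLOT_FREE;
}

/* Lowest slot that is known to be clean in the IOTLB, or -1. */
int find_free(const struct lazy_iommu *mmu)
{
    size_t i;

    for (i = 0; i < NR_SLOTS; i++)
        if (mmu->slot[i] == SLOT_FREE)
            return (int)i;
    return -1;
}

/* Hand out a slot; flush globally only when every clean slot is used up. */
int lazy_map(struct lazy_iommu *mmu)
{
    int slot = find_free(mmu);

    if (slot < 0) {
        global_iotlb_flush(mmu);
        slot = find_free(mmu);
    }
    if (slot >= 0)
        mmu->slot[slot] = SLOT_MAPPED;
    return slot;
}

/* The page table entry is invalidated here; the IOTLB may still cache it. */
void lazy_unmap(struct lazy_iommu *mmu, int slot)
{
    assert(slot >= 0 && slot < NR_SLOTS);
    assert(mmu->slot[slot] == SLOT_MAPPED);
    mmu->slot[slot] = SLOT_STALE;
}
```

With a zero-initialised struct lazy_iommu, eight lazy_map() calls need no flush at all. The first lazy_map() after that, with a couple of slots unmapped in between, triggers exactly one global flush.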
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 9:32 ` Arnd Bergmann
(?)
@ 2014-05-21 15:44 ` Grant Grundler
-1 siblings, 0 replies; 112+ messages in thread
From: Grant Grundler @ 2014-05-21 15:44 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, Linux DeviceTree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon, LKML,
Rob Herring, Marc Zyngier, Linux IOMMU, Thierry Reding,
Kumar Gala, linux-tegra, Cho KyongHo, Dave Martin,
linux-arm-kernel@lists.infradead.org
On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
>> On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
>>
>> > Right. As long as we always unmap the buffers from the IOMMU after they
>> > have stopped being in use, it's very unlikely that even a broken device
>> > driver causes a DMA into some bus address that happens to be mapped for
>> > another device.
>>
>> I think that if buffers remain mapped in the IOMMU when they have been
>> deallocated that should be considered a bug.
There is currently no general requirement to tear down mappings immediately.
An option to enforce immediate tear down might be useful since I agree
Virtual Guests sharing host memory through shared physical devices
will want that, i.e. there should be no opportunity for a shared device
to be able to access another guest's memory through "stale" (but live)
DMA mappings. I don't think that's the case but I'd ask someone like
Alex Williamson for a more certain answer.
> You could see it as a performance optimization in some cases, e.g. when you
> cannot flush individual IOTLBs or doing so is expensive, and you just keep
> assigning new addresses until every one has been used, then you do a global
> IOTLB flush once.
This is in fact exactly what several IOMMUs do - including the Intel
IOMMU driver.
IIRC, the Intel IOMMU driver collects 256 or so entries to flush and
then flushes them all at once and updates the metadata. I don't know if it
does this for ranges assigned to different virtual guests though or
just within a given range.
> Obviously you have to maintain the IO page tables correctly.
To be clear, "Correctly" in this case just means until the IOTLB is
flushed, the given IOMMU Pdir entries are marked "in use" even though
the driver has handed "ownership" back to the IOMMU driver.
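A minimal sketch of that bookkeeping, under stated assumptions: the names and sizes below are hypothetical, this is not the Intel driver's code, and the batch size is shrunk from "256 or so" to 4 to keep it readable. Released entries stay marked in use until the batch is flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_ENTRIES  16 /* hypothetical size of the IO page directory */
#define FLUSH_BATCH 4  /* hypothetical batch size */

struct batch_iommu {
    bool in_use[NR_ENTRIES];     /* entry must not be handed out again */
    size_t pending[FLUSH_BATCH]; /* released by a driver, not yet flushed */
    size_t nr_pending;
    unsigned int flushes;        /* stand-in for real IOTLB flushes */
};

/* Flush the IOTLB once for the whole batch, then really free the entries. */
void batch_flush(struct batch_iommu *mmu)
{
    size_t i;

    if (mmu->nr_pending == 0)
        return;
    mmu->flushes++;
    for (i = 0; i < mmu->nr_pending; i++)
        mmu->in_use[mmu->pending[i]] = false;
    mmu->nr_pending = 0;
}

/* A driver hands an entry back; it stays "in use" until the batch is flushed. */
void batch_release(struct batch_iommu *mmu, size_t entry)
{
    assert(entry < NR_ENTRIES && mmu->in_use[entry]);
    mmu->pending[mmu->nr_pending++] = entry;
    if (mmu->nr_pending == FLUSH_BATCH)
        batch_flush(mmu);
}
```

The sketch does not guard against the same entry being released twice before a flush. A real implementation would also need locking.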
cheers,
grant
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 15:44 ` Grant Grundler
@ 2014-05-21 16:01 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 16:01 UTC (permalink / raw)
To: Grant Grundler
Cc: linux-arm-kernel@lists.infradead.org, Thierry Reding,
Mark Rutland, Linux DeviceTree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Joerg Roedel, Stephen Warren, Will Deacon, LKML,
Marc Zyngier, Linux IOMMU, Rob Herring, Kumar Gala, linux-tegra,
Cho KyongHo, Dave Martin
On Wednesday 21 May 2014 08:44:42 Grant Grundler wrote:
> On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
> >> On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
> >>
> >> > Right. As long as we always unmap the buffers from the IOMMU after they
> >> > have stopped being in use, it's very unlikely that even a broken device
> >> > driver causes a DMA into some bus address that happens to be mapped for
> >> > another device.
> >>
> >> I think that if buffers remain mapped in the IOMMU when they have been
> >> deallocated that should be considered a bug.
>
> There is currently no general requirement to tear down mappings immediately.
> An option to enforce immediate tear down might be useful since I agree
> Virtual Guests sharing host memory through shared physical devices
> will want that. ie there should be no opportunity for a shared device
> to be able to access another guests memory through "stale" (but live)
> DMA mappings. I don't think that's the case but I'd ask someone like
> Alex Williamson for a more certain answer.
I believe powerpc has a boot-time option to enforce the immediate
IOTLB flush, which is very useful for device driver debugging when
something goes wrong with stale DMAs.
> > Obviously you have to maintain the IO page tables correctly.
>
> To be clear, "Correctly" in this case just means until the IOTLB is
> flushed, the given IOMMU Pdir entries are marked "in use" even though
> the driver has handed "ownership" back to the IOMMU driver.
I don't know what a Pdir is, but I guess strictly speaking we have
to ensure that all IO page table entries that have been unmapped by
a driver are marked as invalid at least by the time the IOTLB is flushed,
plus we have to flush each entry before it gets reused for a different
page.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 12:41 ` Arnd Bergmann
(?)
@ 2014-05-20 15:24 ` Dave Martin
-1 siblings, 0 replies; 112+ messages in thread
From: Dave Martin @ 2014-05-20 15:24 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Thierry Reding,
Kumar Gala, linux-tegra, Cho KyongHo, linux-arm-kernel
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
> > > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
> > > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
> > > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
> > [...]
> > > > > > > You should never need #size-cells > #address-cells
> > > > > >
> > > > > > That was always my impression as well. But how then do you represent the
> > > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
> > > > > > 4 GiB - 1, which makes it 4 GiB large. That's:
> > > > > >
> > > > > > <0 1 0>
> > > > > >
> > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is:
> > > > > >
> > > > > > <0 0xffffffff>
> > > > > >
> > > > > > but that's not accurate.
> > > > >
> > > > > I think we've done both in the past, either extended #size-cells or
> > > > > taken 0xffffffff as a special token. Note that in your example,
> > > > > the iommu actually needs #address-cells = <2> anyway.
> > > >
> > > > But it needs #address-cells = <2> only to encode an ID in addition to
> > > > the address. If this was a single-master IOMMU then there'd be no need
> > > > for the ID.
> > >
> > > Right. But for a single-master IOMMU, there is no need to specify
> > > any additional data, it could have #address-cells=<0> if we take the
> > > optimization you suggested.
> >
> > Couldn't a single-master IOMMU be windowed?
>
> Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> IOMMU but uses one window per virtual machine. In that case, the window could
> be a property of the iommu node though, rather than part of the address
> in the link.
>
> > > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f
> > > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2>
> > > > > and have an empty dma-ranges property in the PCI host bridge node,
> > > > > and interpret this as using the same encoding as the PCI BARs in
> > > > > the ranges property.
> > > >
> > > > I'm somewhat confused here, since you said earlier:
> > > >
> > > > > After giving the ranges stuff some more thought, I have come to the
> > > > > conclusion that using #iommu-cells should work fine for almost
> > > > > all cases, including windowed iommus, because the window is not
> > > > > actually needed in the device, but only in the iommu, which is of course
> > > > > free to interpret the arguments as addresses.
> > > >
> > > > But now you seem to be saying that we should still be using the
> > > > #address-cells and #size-cells properties in the IOMMU node to determine
> > > > the length of the specifier.
> > >
> > > I probably wasn't clear. I think we can make it work either way, but
> > > my feeling is that using #address-cells/#size-cells gives us a nicer
> > > syntax for the more complex cases.
> >
> > Okay, so in summary we'd have something like this for simple cases:
> >
> > Required properties:
> > --------------------
> > - #address-cells: The number of cells in an IOMMU specifier needed to encode
> > an address.
> > - #size-cells: The number of cells in an IOMMU specifier needed to represent
> > the length of an address range.
> >
> > Typical values for the above include:
> > - #address-cells = <0>, #size-cells = <0>: Single master IOMMU devices are not
> > configurable and therefore no additional information needs to be encoded in
> > the specifier. This may also apply to multiple master IOMMU devices that do
> > not allow the association of masters to be configured.
> > - #address-cells = <1>, #size-cells = <0>: Multiple master IOMMU devices may
> > need to be configured in order to enable translation for a given master. In
> > such cases the single address cell corresponds to the master device's ID.
> > - #address-cells = <2>, #size-cells = <2>: Some IOMMU devices allow the DMA
> > window for masters to be configured. The first cell of the address in this
> > case may contain the master device's ID, for example, while the second cell
> > could contain the start of the DMA window for the given device. The length of
> > the DMA window is specified by two additional cells.
I was trying to figure out how to describe the different kinds of
transformation we could have on the address/ID input to the IOMMU.
Treating the whole thing as opaque gets us off the hook there.
IDs are probably not propagated, not remapped, or we simply don't care
about them; whereas the address transformation is software-controlled,
so we don't describe that anyway.
Delegating grokking the mapping to the iommu driver makes sense --
it's what it's there for, after all.
I'm not sure whether the windowed IOMMU case is special actually.
Since the address to program into the master is found by calling the
IOMMU driver to create some mappings, does anything except the IOMMU
driver need to understand that there is windowing?
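To make the point concrete for the windowed case in the summary above (#address-cells = <2>, #size-cells = <2>), here is a minimal sketch of how only the IOMMU driver would need to interpret the four specifier cells. The struct and function names are hypothetical, and the cells are assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct iommu_window {
    uint32_t master_id;
    uint64_t start;
    uint64_t size;
};

/* Fold n 32-bit cells (most significant first) into one 64-bit value. */
uint64_t fold_cells(const uint32_t *cells, size_t n)
{
    uint64_t v = 0;
    size_t i;

    assert(n <= 2);
    for (i = 0; i < n; i++)
        v = (v << 32) | cells[i];
    return v;
}

/*
 * Decode a specifier laid out as: master ID, window start (one cell each,
 * together the two "address" cells), then a two-cell window length.
 */
struct iommu_window decode_window(const uint32_t spec[4])
{
    struct iommu_window w;

    w.master_id = spec[0];
    w.start = fold_cells(&spec[1], 1);
    w.size = fold_cells(&spec[2], 2);
    return w;
}
```

Fed the cells 42, 0, 0x1, 0x0, this yields master ID 42 and a window starting at 0 with length 0x100000000. That is the full 4 GiB that a single size cell cannot express.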
> >
> > Examples:
> > =========
> >
> > Single-master IOMMU:
> > --------------------
> >
> > iommu {
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > master {
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU with fixed associations:
> > ----------------------------------------------
> >
> > /* multiple-master IOMMU */
> > iommu {
> > /*
> > * Masters are statically associated with this IOMMU and
> > * address translation is always enabled.
> > */
> > #iommu-cells = <0>;
> > };
>
> copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
>
> > /* static association with IOMMU */
> > master@1 {
> > reg = <1>;
Just for clarification, "reg" just has its standard meaning here, and
is nothing to do with the IOMMU?
> > iommus = <&/iommu>;
In effect, "iommus" is doing the same thing as my "slaves" property.
The way #address-cells and #size-cells determine the address and range
size for mastering into the IOMMU is also similar. The main difference
is that I didn't build the ID into the address.
> > };
> >
> > /* static association with IOMMU */
> > master@2 {
> > reg = <2>;
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU:
> > ----------------------
> >
> > iommu {
> > /* the specifier represents the ID of the master */
> > #address-cells = <1>;
> > #size-cells = <0>;
How do we know the size of the input address to the IOMMU? Do we
get cases for example where the IOMMU only accepts a 32-bit input
address, but some 64-bit capable masters are connected through it?
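If that mismatch does occur, one plausible rule is that a master can only usefully present addresses that both it and the IOMMU input can carry. This is an assumption for illustration, not something the binding specifies. A minimal sketch, with hypothetical names and widths given in bits:

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask covering the low `bits` address bits (1..64). */
uint64_t addr_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Addresses usable on the master-to-IOMMU link: the narrower side wins. */
uint64_t effective_dma_mask(unsigned int master_bits,
                            unsigned int iommu_input_bits)
{
    return addr_mask(master_bits) & addr_mask(iommu_input_bits);
}
```

For a 64-bit capable master behind a 32-bit IOMMU input this gives 0xffffffff.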
The size of the output address from the IOMMU will be determined
by its own mastering destination, which by default in ePAPR is the
IOMMU node's parent. I think that's what you intended, and what we
expect in this case.
For determining dma masks, it is the output address that is
important. Santosh's code can probably be taught to handle this,
if given an additional traversal rule for following "iommus"
properties. However, deploying an IOMMU whose output address size
is smaller than the
> > };
> >
> > master {
> > /* device has master ID 42 in the IOMMU */
> > iommus = <&/iommu 42>;
> > };
> >
> > Multiple-master device:
> > -----------------------
> >
> > /* single-master IOMMU */
> > iommu@1 {
> > reg = <1>;
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > /* multiple-master IOMMU */
> > iommu@2 {
> > reg = <2>;
> > #address-cells = <1>;
> > #size-cells = <0>;
> > };
> >
> > /* device with two master interfaces */
> > master {
> > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > };
> >
> > Multiple-master IOMMU with configurable DMA window:
> > ---------------------------------------------------
> >
> > / {
> >     #address-cells = <1>;
> >     #size-cells = <1>;
> >
> >     iommu {
> >         /* master ID, address of DMA window */
> >         #address-cells = <2>;
> >         #size-cells = <2>;
> >     };
> >
> >     master {
> >         /* master ID 42, 4 GiB DMA window starting at 0 */
> >         iommus = <&/iommu 42 0 0x1 0x0>;
> >     };
> > };
> >
> > Does that sound about right?
>
> Yes, sounds great. I would probably leave out the Multiple-master device
> from the examples, since that seems to be a rather obscure case.
I think multi-master is the common case.
>
> I would like to add an explanation about dma-ranges to the binding:
>
> 8<--------
> The parent bus of the iommu must have a valid "dma-ranges" property
> describing how the physical address space of the IOMMU maps into
> memory.
> A device with an "iommus" property will ignore the "dma-ranges" property
> of the parent node and rely on the IOMMU for translation instead.
> Using an "iommus" property in bus device nodes with "dma-ranges"
> specifying how child devices relate to the IOMMU is a possible extension
> but is not recommended until this binding gets extended.
This sounds just right. The required semantics is that the presence of
"iommus" on some bus mastering device overrides the ePAPR default
destination so that transactions are delivered to the IOMMU for
translation instead of the master's DT parent node.
Where transactions flow out of the IOMMU, the iommu takes on the role
of the master, so the default destination would be the iommu node's
parent.
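As a sketch of that routing rule (hypothetical types and names, not kernel code): a node's transactions go to its "iommus" target if it has one, otherwise to its DT parent, and the same rule is then applied to the IOMMU node itself.

```c
#include <assert.h>
#include <stddef.h>

struct dt_node {
    const char *name;
    const struct dt_node *parent; /* DT parent, the default destination */
    const struct dt_node *iommu;  /* target of "iommus", or NULL */
};

/* Where transactions mastered by n are delivered first. */
const struct dt_node *dma_next_hop(const struct dt_node *n)
{
    assert(n != NULL);
    return n->iommu ? n->iommu : n->parent;
}

/* Count the hops from a master up to the root (the node with no parent). */
size_t dma_path_length(const struct dt_node *n)
{
    size_t hops = 0;

    while ((n = dma_next_hop(n)) != NULL)
        hops++;
    return hops;
}
```

A master with "iommus" set reaches the root via the IOMMU and the IOMMU's parent. A master without it goes through its own parent bus, exactly as before.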
> ----------->8
>
> Does that make sense to you? We can change what we say about
> dma-ranges, I mainly want to be clear with what is or is not
> allowed at this point.
I think it would be inconsistent and unnecessary to disallow it in the
binding. The meaning you've proposed seems completely consistent with
ePAPR, so I suggest keeping it. The IOMMU is just another bus master
from the ePAPR point of view -- no need to make special rules for it
unless they are useful.
The binding does not need to be (and generally shouldn't be) a
description of precisely what the kernel does and does not support.
However, if we don't need to support non-identity dma-ranges in Linux
yet, we have the option to barf if we see such a dma-ranges memorywards
of an IOMMU, if it simplifies the Linux implementation. We could always
relax that later -- and it'll be obvious how to describe that situation
in DT.
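Such a check could look roughly like the sketch below. It is an illustration only: the function name is hypothetical, and it assumes one cell each for child address, parent address and size, with cells already in host byte order. An empty dma-ranges counts as identity, as in ePAPR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout per entry: child address, parent address, size. */
#define CELLS_PER_ENTRY 3

/*
 * Return true if every dma-ranges entry maps child addresses 1:1 onto
 * parent addresses. An empty property (ncells == 0) is identity too.
 * Note that a 1:1 entry still limits the reachable range; that is
 * ignored here.
 */
bool dma_ranges_is_identity(const uint32_t *prop, size_t ncells)
{
    size_t i;

    assert(ncells % CELLS_PER_ENTRY == 0);
    for (i = 0; i < ncells; i += CELLS_PER_ENTRY)
        if (prop[i] != prop[i + 1])
            return false;
    return true;
}
```

A caller could refuse to set up an IOMMU whose parent fails this test, and relax that later without any change to the DT description.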
What I would like to see is a recommendation, based on Thierry's binding
here, for how cross-mastering in general should be described. It's
not really a binding, but more of a template for bindings.
I'm happy to have a go at writing it, then we can decide whether it's
useful or not.
There are a few things from the discussion that are *not* solved by this
iommu binding, but they seem reasonable. The binding also doesn't block
solving those things later if/when needed:
1) Cross-mastering to things that are not IOMMUs
We might need to solve this later if we encounter SoCs with
problematic topologies, but we shouldn't worry about it for the time
being.
We'll need to revisit it for GICv3, but that's a separate topic.
2) Describing address and ID remappings for cross-mastering.
We can describe this in a way that is consistent with this IOMMU
binding. We will need to describe something for GICv3, but the
common case will be that IDs are just passed through without
remapping.
We don't need to clarify how IDs are propagated until we have
something in DT for IDs to propagate to.
Cheers
---Dave
^ permalink raw reply [flat|nested] 112+ messages in thread* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-20 15:24 ` Dave Martin
0 siblings, 0 replies; 112+ messages in thread
From: Dave Martin @ 2014-05-20 15:24 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-arm-kernel, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Rob Herring,
Marc Zyngier, iommu, Thierry Reding, Kumar Gala, linux-tegra,
Cho KyongHo
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
> > > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
> > > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
> > > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
> > [...]
> > > > > > > You should never need #size-cells > #address-cells
> > > > > >
> > > > > > That was always my impression as well. But how then do you represent the
> > > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
> > > > > > 4 GiB - 1, which makes it 4 GiB large. That's:
> > > > > >
> > > > > > <0 1 0>
> > > > > >
> > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is:
> > > > > >
> > > > > > <0 0xffffffff>
> > > > > >
> > > > > > but that's not accurate.
> > > > >
> > > > > I think we've done both in the past, either extended #size-cells or
> > > > > taken 0xffffffff as a special token. Note that in your example,
> > > > > the iommu actually needs #address-cells = <2> anyway.
> > > >
> > > > But it needs #address-cells = <2> only to encode an ID in addition to
> > > > the address. If this was a single-master IOMMU then there'd be no need
> > > > for the ID.
> > >
> > > Right. But for a single-master IOMMU, there is no need to specify
> > > any additional data, it could have #address-cells=<0> if we take the
> > > optimization you suggested.
> >
> > Couldn't a single-master IOMMU be windowed?
>
> Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> IOMMU but uses one window per virtual machine. In that case, the window could
> be a property of the iommu node though, rather than part of the address
> in the link.
>
> > > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f
> > > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2>
> > > > > and have an empty dma-ranges property in the PCI host bridge node,
> > > > > and interpret this as using the same encoding as the PCI BARs in
> > > > > the ranges property.
> > > >
> > > > I'm somewhat confused here, since you said earlier:
> > > >
> > > > > After giving the ranges stuff some more thought, I have come to the
> > > > > conclusion that using #iommu-cells should work fine for almost
> > > > > all cases, including windowed iommus, because the window is not
> > > > > actually needed in the device, but only in the iommu, wihch is of course
> > > > > free to interpret the arguments as addresses.
> > > >
> > > > But now you seem to be saying that we should still be using the
> > > > #address-cells and #size-cells properties in the IOMMU node to determine
> > > > the length of the specifier.
> > >
> > > I probably wasn't clear. I think we can make it work either way, but
> > > my feeling is that using #address-cells/#size-cells gives us a nicer
> > > syntax for the more complex cases.
> >
> > Okay, so in summary we'd have something like this for simple cases:
> >
> > Required properties:
> > --------------------
> > - #address-cells: The number of cells in an IOMMU specifier needed to encode
> > an address.
> > - #size-cells: The number of cells in an IOMMU specifier needed to represent
> > the length of an address range.
> >
> > Typical values for the above include:
> > - #address-cells = <0>, size-cells = <0>: Single master IOMMU devices are not
> > configurable and therefore no additional information needs to be encoded in
> > the specifier. This may also apply to multiple master IOMMU devices that do
> > not allow the association of masters to be configured.
> > - #address-cells = <1>, size-cells = <0>: Multiple master IOMMU devices may
> > need to be configured in order to enable translation for a given master. In
> > such cases the single address cell corresponds to the master device's ID.
> > - #address-cells = <2>, size-cells = <2>: Some IOMMU devices allow the DMA
> > window for masters to be configured. The first cell of the address in this
> > may contain the master device's ID for example, while the second cell could
> > contain the start of the DMA window for the given device. The length of the
> > DMA window is specified by two additional cells.
I was trying to figure out how to describe the different kinds of
transformation we could have on the address/ID input to the IOMMU.
Treating the whole thing as opaque gets us off the hook there.
IDs are probably not propagated, not remapped, or we simply don't care
about them; whereas the address transformation is software-controlled,
so we don't describe that anyway.
Delegating grokking the mapping to the iommu driver makes sense --
it's what it's there for, after all.
I'm not sure whether the windowed IOMMU case is special actually.
Since the address to program into the master is found by calling the
IOMMU driver to create some mappings, does anything except the IOMMU
driver need to understand that there is windowing?
> >
> > Examples:
> > =========
> >
> > Single-master IOMMU:
> > --------------------
> >
> > iommu {
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > master {
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU with fixed associations:
> > ----------------------------------------------
> >
> > /* multiple-master IOMMU */
> > iommu {
> > /*
> > * Masters are statically associated with this IOMMU and
> > * address translation is always enabled.
> > */
> > #iommu-cells = <0>;
> > };
>
> copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
>
> > /* static association with IOMMU */
> > master@1 {
> > reg = <1>;
Just for clarification, "reg" just has its standard meaning here, and
is nothing to do with the IOMMU?
> > iommus = <&/iommu>;
In effect, "iommus" is doing the same thing as my "slaves" property.
The way #address-cells and #size-cells determine the address and range
size for mastering into the IOMMU is also similar. The main difference
is that I didn't build the ID into the address.
> > };
> >
> > /* static association with IOMMU */
> > master@2 {
> > reg = <2>;
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU:
> > ----------------------
> >
> > iommu {
> > /* the specifier represents the ID of the master */
> > #address-cells = <1>;
> > #size-cells = <0>;
How do we know the size of the input address to the IOMMU? Do we
get cases for example where the IOMMU only accepts a 32-bit input
address, but some 64-bit capable masters are connected through it?
The size of the output address from the IOMMU will be determined
by its own mastering destination, which by default in ePAPR is the
IOMMU node's parent. I think that's what you intended, and what we
expect in this case.
For determining dma masks, it is the output address that is
important. Santosh's code can probably be taught to handle this,
if given an additional traversal rule for following "iommus"
properties. However, deploying an IOMMU whose output address size
is smaller than the
> > };
> >
> > master {
> > /* device has master ID 42 in the IOMMU */
> > iommus = <&/iommu 42>;
> > };
> >
> > Multiple-master device:
> > -----------------------
> >
> > /* single-master IOMMU */
> > iommu@1 {
> > reg = <1>;
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > /* multiple-master IOMMU */
> > iommu@2 {
> > reg = <2>;
> > #address-cells = <1>;
> > #size-cells = <0>;
> > };
> >
> > /* device with two master interfaces */
> > master {
> > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > };
> >
> > Multiple-master IOMMU with configurable DMA window:
> > ---------------------------------------------------
> >
> > / {
> > #address-cells = <1>;
> > #size-cells = <1>;
> >
> > iommu {
> > /* master ID, address of DMA window */
> > #address-cells = <2>;
> > #size-cells = <2>;
> > };
> >
> > master {
> > /* master ID 42, 4 GiB DMA window starting at 0 */
> > iommus = <&/iommu 42 0 0x1 0x0>;
> > };
> > };
> >
> > Does that sound about right?
>
> Yes, sounds great. I would probably leave out the Multiple-master device
> from the examples, since that seems to be a rather obscure case.
I think multi-master is the common case.
>
> I would like to add an explanation about dma-ranges to the binding:
>
> 8<--------
> The parent bus of the iommu must have a valid "dma-ranges" property
> describing how the physical address space of the IOMMU maps into
> memory.
> A device with an "iommus" property will ignore the "dma-ranges" property
> of the parent node and rely on the IOMMU for translation instead.
> Using an "iommus" property in bus device nodes with "dma-ranges"
> specifying how child devices relate to the IOMMU is a possible extension
> but is not recommended until this binding gets extended.
This sounds just right. The required semantics is that the presence of
"iommus" on some bus mastering device overrides the ePAPR default
destination so that transactions are delivered to the IOMMU for
translation instead of the master's DT parent node.
Where transactions flow out of the IOMMU, the iommu takes on the role
of the master, so the default destination would be the iommu node's
parent.
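A minimal sketch of that topology, assuming a "simple-bus" parent with an
identity (empty) dma-ranges; the node names and the "iommu" label are only
illustrative and not part of the proposed binding:

```dts
/dts-v1/;

/ {
	#address-cells = <1>;
	#size-cells = <1>;

	bus {
		compatible = "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;
		/* empty dma-ranges: IOMMU output addresses map 1:1 into memory */
		dma-ranges;

		iommu: iommu {
			/* the specifier represents the ID of the master */
			#address-cells = <1>;
			#size-cells = <0>;
		};

		master {
			/*
			 * Transactions are delivered to the IOMMU (as ID 42)
			 * instead of to "bus"; the parent's dma-ranges is
			 * ignored for this device.
			 */
			iommus = <&iommu 42>;
		};
	};
};
```

Here the IOMMU itself masters into "bus", so the dma-ranges of "bus" applies
to the IOMMU's output rather than to the master's transactions.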
> ----------->8
>
> Does that make sense to you? We can change what we say about
> dma-ranges, I mainly want to be clear with what is or is not
> allowed at this point.
I think it would be inconsistent and unnecessary to disallow it in the
binding. The meaning you've proposed seems completely consistent with
ePAPR, so I suggest keeping it. The IOMMU is just another bus master
from the ePAPR point of view -- no need to make special rules for it
unless they are useful.
The binding does not need to be (and generally shouldn't be) a
description of precisely what the kernel does and does not support.
However, if we don't need to support non-identity dma-ranges in Linux
yet, we have the option to barf if we see such a dma-ranges memorywards
of an IOMMU, if it simplifies the Linux implementation. We could always
relax that later -- and it'll be obvious how to describe that situation
in DT.
What I would like to see is a recommendation, based on Thierry's binding
here, for how cross-mastering in general should be described. It's
not really a binding, but more of a template for bindings.
I'm happy to have a go at writing it, then we can decide whether it's
useful or not.
There are a few things from the discussion that are *not* solved by this
iommu binding, but they seem reasonable. The binding also doesn't block
solving those things later if/when needed:
1) Cross-mastering to things that are not IOMMUs
We might need to solve this later if we encounter SoCs with
problematic topologies, we shouldn't worry about it for the time
being.
We'll need to revisit it for GICv3 but that's a separate topic.
2) Describing address and ID remappings for cross-mastering.
We can describe this in a way that is consistent with this IOMMU
binding. We will need to describe something for GICv3, but the
common case will be that IDs are just passed through without
remapping.
We don't need to clarify how IDs are propagated until we have
something in DT for IDs to propagate to.
Cheers
---Dave
^ permalink raw reply [flat|nested] 112+ messages in thread
* [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-20 15:24 ` Dave Martin
0 siblings, 0 replies; 112+ messages in thread
From: Dave Martin @ 2014-05-20 15:24 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
> > > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
> > > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
> > > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
> > [...]
> > > > > > > You should never need #size-cells > #address-cells
> > > > > >
> > > > > > That was always my impression as well. But how then do you represent the
> > > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
> > > > > > 4 GiB - 1, which makes it 4 GiB large. That's:
> > > > > >
> > > > > > <0 1 0>
> > > > > >
> > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is:
> > > > > >
> > > > > > <0 0xffffffff>
> > > > > >
> > > > > > but that's not accurate.
> > > > >
> > > > > I think we've done both in the past, either extended #size-cells or
> > > > > taken 0xffffffff as a special token. Note that in your example,
> > > > > the iommu actually needs #address-cells = <2> anyway.
> > > >
> > > > But it needs #address-cells = <2> only to encode an ID in addition to
> > > > the address. If this was a single-master IOMMU then there'd be no need
> > > > for the ID.
> > >
> > > Right. But for a single-master IOMMU, there is no need to specify
> > > any additional data, it could have #address-cells=<0> if we take the
> > > optimization you suggested.
> >
> > Couldn't a single-master IOMMU be windowed?
>
> Ah, yes. That would actually be like an IBM pSeries, which has a windowed
> IOMMU but uses one window per virtual machine. In that case, the window could
> be a property of the iommu node though, rather than part of the address
> in the link.
>
> > > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f
> > > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2>
> > > > > and have an empty dma-ranges property in the PCI host bridge node,
> > > > > and interpret this as using the same encoding as the PCI BARs in
> > > > > the ranges property.
> > > >
> > > > I'm somewhat confused here, since you said earlier:
> > > >
> > > > > After giving the ranges stuff some more thought, I have come to the
> > > > > conclusion that using #iommu-cells should work fine for almost
> > > > > all cases, including windowed iommus, because the window is not
> > > > > > actually needed in the device, but only in the iommu, which is of course
> > > > > free to interpret the arguments as addresses.
> > > >
> > > > But now you seem to be saying that we should still be using the
> > > > #address-cells and #size-cells properties in the IOMMU node to determine
> > > > the length of the specifier.
> > >
> > > I probably wasn't clear. I think we can make it work either way, but
> > > my feeling is that using #address-cells/#size-cells gives us a nicer
> > > syntax for the more complex cases.
> >
> > Okay, so in summary we'd have something like this for simple cases:
> >
> > Required properties:
> > --------------------
> > - #address-cells: The number of cells in an IOMMU specifier needed to encode
> > an address.
> > - #size-cells: The number of cells in an IOMMU specifier needed to represent
> > the length of an address range.
> >
> > Typical values for the above include:
> > - #address-cells = <0>, #size-cells = <0>: Single master IOMMU devices are not
> > configurable and therefore no additional information needs to be encoded in
> > the specifier. This may also apply to multiple master IOMMU devices that do
> > not allow the association of masters to be configured.
> > - #address-cells = <1>, #size-cells = <0>: Multiple master IOMMU devices may
> > need to be configured in order to enable translation for a given master. In
> > such cases the single address cell corresponds to the master device's ID.
> > - #address-cells = <2>, #size-cells = <2>: Some IOMMU devices allow the DMA
> > window for masters to be configured. The first cell of the address in this case
> > may contain the master device's ID for example, while the second cell could
> > contain the start of the DMA window for the given device. The length of the
> > DMA window is specified by two additional cells.
I was trying to figure out how to describe the different kinds of
transformation we could have on the address/ID input to the IOMMU.
Treating the whole thing as opaque gets us off the hook there.
IDs are probably not propagated, not remapped, or we simply don't care
about them; whereas the address transformation is software-controlled,
so we don't describe that anyway.
Delegating grokking the mapping to the iommu driver makes sense --
it's what it's there for, after all.
I'm not sure whether the windowed IOMMU case is special actually.
Since the address to program into the master is found by calling the
IOMMU driver to create some mappings, does anything except the IOMMU
driver need to understand that there is windowing?
> >
> > Examples:
> > =========
> >
> > Single-master IOMMU:
> > --------------------
> >
> > iommu {
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > master {
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU with fixed associations:
> > ----------------------------------------------
> >
> > /* multiple-master IOMMU */
> > iommu {
> > /*
> > * Masters are statically associated with this IOMMU and
> > * address translation is always enabled.
> > */
> > #iommu-cells = <0>;
> > };
>
> copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
>
> > /* static association with IOMMU */
> > master@1 {
> > reg = <1>;
Just for clarification, "reg" just has its standard meaning here, and
is nothing to do with the IOMMU?
> > iommus = <&/iommu>;
In effect, "iommus" is doing the same thing as my "slaves" property.
The way #address-cells and #size-cells determine the address and range
size for mastering into the IOMMU is also similar. The main difference
is that I didn't build the ID into the address.
> > };
> >
> > /* static association with IOMMU */
> > master@2 {
> > reg = <2>;
> > iommus = <&/iommu>;
> > };
> >
> > Multiple-master IOMMU:
> > ----------------------
> >
> > iommu {
> > /* the specifier represents the ID of the master */
> > #address-cells = <1>;
> > #size-cells = <0>;
How do we know the size of the input address to the IOMMU? Do we
get cases for example where the IOMMU only accepts a 32-bit input
address, but some 64-bit capable masters are connected through it?
The size of the output address from the IOMMU will be determined
by its own mastering destination, which by default in ePAPR is the
IOMMU node's parent. I think that's what you intended, and what we
expect in this case.
For determining dma masks, it is the output address that is
important. Santosh's code can probably be taught to handle this,
if given an additional traversal rule for following "iommus"
properties. However, deploying an IOMMU whose output address size
is smaller than the
> > };
> >
> > master {
> > /* device has master ID 42 in the IOMMU */
> > iommus = <&/iommu 42>;
> > };
> >
> > Multiple-master device:
> > -----------------------
> >
> > /* single-master IOMMU */
> > iommu@1 {
> > reg = <1>;
> > #address-cells = <0>;
> > #size-cells = <0>;
> > };
> >
> > /* multiple-master IOMMU */
> > iommu@2 {
> > reg = <2>;
> > #address-cells = <1>;
> > #size-cells = <0>;
> > };
> >
> > /* device with two master interfaces */
> > master {
> > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > };
> >
> > Multiple-master IOMMU with configurable DMA window:
> > ---------------------------------------------------
> >
> > / {
> > #address-cells = <1>;
> > #size-cells = <1>;
> >
> > iommu {
> > /* master ID, address of DMA window */
> > #address-cells = <2>;
> > #size-cells = <2>;
> > };
> >
> > master {
> > /* master ID 42, 4 GiB DMA window starting at 0 */
> > iommus = <&/iommu 42 0 0x1 0x0>;
> > };
> > };
> >
> > Does that sound about right?
>
> Yes, sounds great. I would probably leave out the Multiple-master device
> from the examples, since that seems to be a rather obscure case.
I think multi-master is the common case.
>
> I would like to add an explanation about dma-ranges to the binding:
>
> 8<--------
> The parent bus of the iommu must have a valid "dma-ranges" property
> describing how the physical address space of the IOMMU maps into
> memory.
> A device with an "iommus" property will ignore the "dma-ranges" property
> of the parent node and rely on the IOMMU for translation instead.
> Using an "iommus" property in bus device nodes with "dma-ranges"
> specifying how child devices relate to the IOMMU is a possible extension
> but is not recommended until this binding gets extended.
This sounds just right. The required semantics is that the presence of
"iommus" on some bus mastering device overrides the ePAPR default
destination so that transactions are delivered to the IOMMU for
translation instead of the master's DT parent node.
Where transactions flow out of the IOMMU, the iommu takes on the role
of the master, so the default destination would be the iommu node's
parent.
> ----------->8
>
> Does that make sense to you? We can change what we say about
> dma-ranges, I mainly want to be clear with what is or is not
> allowed at this point.
I think it would be inconsistent and unnecessary to disallow it in the
binding. The meaning you've proposed seems completely consistent with
ePAPR, so I suggest keeping it. The IOMMU is just another bus master
from the ePAPR point of view -- no need to make special rules for it
unless they are useful.
The binding does not need to be (and generally shouldn't be) a
description of precisely what the kernel does and does not support.
However, if we don't need to support non-identity dma-ranges in Linux
yet, we have the option to barf if we see such a dma-ranges memorywards
of an IOMMU, if it simplifies the Linux implementation. We could always
relax that later -- and it'll be obvious how to describe that situation
in DT.
What I would like to see is a recommendation, based on Thierry's binding
here, for how cross-mastering in general should be described. It's
not really a binding, but more of a template for bindings.
I'm happy to have a go at writing it, then we can decide whether it's
useful or not.
There are a few things from the discussion that are *not* solved by this
iommu binding, but they seem reasonable. The binding also doesn't block
solving those things later if/when needed:
1) Cross-mastering to things that are not IOMMUs
We might need to solve this later if we encounter SoCs with
problematic topologies, we shouldn't worry about it for the time
being.
We'll need to revisit it for GICv3 but that's a separate topic.
2) Describing address and ID remappings for cross-mastering.
We can describe this in a way that is consistent with this IOMMU
binding. We will need to describe something for GICv3, but the
common case will be that IDs are just passed through without
remapping.
We don't need to clarify how IDs are propagated until we have
something in DT for IDs to propagate to.
Cheers
---Dave
^ permalink raw reply [flat|nested] 112+ messages in thread
[parent not found: <20140520152458.GB5041-M5GwZQ6tE7x5pKCnmE3YQBJ8xKzm50AiAL8bYrjMMd8@public.gmane.org>]
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 15:24 ` Dave Martin
(?)
@ 2014-05-20 20:26 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 20:26 UTC (permalink / raw)
To: Dave Martin
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring,
Thierry Reding, Kumar Gala, linux-tegra,
Cho KyongHo, linux-arm-kernel
On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > > Typical values for the above include:
> > > - #address-cells = <0>, #size-cells = <0>: Single master IOMMU devices are not
> > > configurable and therefore no additional information needs to be encoded in
> > > the specifier. This may also apply to multiple master IOMMU devices that do
> > > not allow the association of masters to be configured.
> > > - #address-cells = <1>, #size-cells = <0>: Multiple master IOMMU devices may
> > > need to be configured in order to enable translation for a given master. In
> > > such cases the single address cell corresponds to the master device's ID.
> > > - #address-cells = <2>, #size-cells = <2>: Some IOMMU devices allow the DMA
> > > window for masters to be configured. The first cell of the address in this case
> > > may contain the master device's ID for example, while the second cell could
> > > contain the start of the DMA window for the given device. The length of the
> > > DMA window is specified by two additional cells.
>
> I was trying to figure out how to describe the different kinds of
> transformation we could have on the address/ID input to the IOMMU.
> Treating the whole thing as opaque gets us off the hook there.
>
> IDs are probably not propagated, not remapped, or we simply don't care
> about them; whereas the address transformation is software-controlled,
> so we don't describe that anyway.
>
> Delegating grokking the mapping to the iommu driver makes sense --
> it's what it's there for, after all.
>
>
> I'm not sure whether the windowed IOMMU case is special actually.
>
> Since the address to program into the master is found by calling the
> IOMMU driver to create some mappings, does anything except the IOMMU
> driver need to understand that there is windowing?
No. I tried to explain that earlier today, and in my earlier mails
I hadn't thought that part through. Only the IOMMU driver needs to care
about the window.
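As a rough sketch of what that could look like (the "example,dma-window"
property and the "iommu" label are made up for illustration and are not part
of the proposed binding), the window would sit in the IOMMU node and the link
would carry only the master ID:

```dts
/dts-v1/;

/ {
	#address-cells = <1>;
	#size-cells = <1>;

	/* windowed IOMMU: the window is described in the IOMMU node itself */
	iommu: iommu {
		/* the specifier carries only the master ID */
		#address-cells = <1>;
		#size-cells = <0>;
		/* hypothetical property: window start 0, length 2 GiB */
		example,dma-window = <0x0 0x80000000>;
	};

	master {
		/* master ID 42; no window information in the link */
		iommus = <&iommu 42>;
	};
};
```

Nothing outside the IOMMU driver would have to parse the window in that
layout.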
> > >
> > > Examples:
> > > =========
> > >
> > > Single-master IOMMU:
> > > --------------------
> > >
> > > iommu {
> > > #address-cells = <0>;
> > > #size-cells = <0>;
> > > };
> > >
> > > master {
> > > iommus = <&/iommu>;
> > > };
> > >
> > > Multiple-master IOMMU with fixed associations:
> > > ----------------------------------------------
> > >
> > > /* multiple-master IOMMU */
> > > iommu {
> > > /*
> > > * Masters are statically associated with this IOMMU and
> > > * address translation is always enabled.
> > > */
> > > #iommu-cells = <0>;
> > > };
> >
> > copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
> >
> > > /* static association with IOMMU */
> > > master@1 {
> > > reg = <1>;
>
> Just for clarification, "reg" just has its standard meaning here, and
> is nothing to do with the IOMMU?
correct
> > > iommus = <&/iommu>;
>
> In effect, "iommus" is doing the same thing as my "slaves" property.
>
> The way #address-cells and #size-cells determine the address and range
> size for mastering into the IOMMU is also similar. The main difference
> is that I didn't build the ID into the address.
Right. I think the difference is more about what we want to call
things: Calling it iommu means we want to specifically describe
the case of iommus that needs to get handled by all OSs in a particular
way, while the more generic slave connection doesn't correspond to
a specific concept in the OS.
> > > };
> > >
> > > /* static association with IOMMU */
> > > master@2 {
> > > reg = <2>;
> > > iommus = <&/iommu>;
> > > };
> > >
> > > Multiple-master IOMMU:
> > > ----------------------
> > >
> > > iommu {
> > > /* the specifier represents the ID of the master */
> > > #address-cells = <1>;
> > > #size-cells = <0>;
>
> How do we know the size of the input address to the IOMMU? Do we
> get cases for example where the IOMMU only accepts a 32-bit input
> address, but some 64-bit capable masters are connected through it?
I was stuck on this question for a while before, but then I realized
that it doesn't matter at all: It's the IOMMU driver itself that
manages the address space, and it doesn't matter if a slave can
address a larger range than the IOMMU can accept. If the IOMMU
needs to deal with the opposite case (64-bit input addresses
but a 32-bit master), that limitation can be put into the specifier.
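For example, reusing Thierry's windowed specifier layout (master ID plus
start address, then a two-cell length), a master that can only drive 32
address bits might be described roughly like this. The "iommu" label is
illustrative, and treating the window as a limit on the master is an
assumption rather than settled binding text:

```dts
/dts-v1/;

/ {
	#address-cells = <2>;
	#size-cells = <2>;

	iommu: iommu {
		/* master ID, start address of the usable input range */
		#address-cells = <2>;
		#size-cells = <2>;
	};

	master {
		/*
		 * master ID 42, limited to the low 4 GiB of the IOMMU's
		 * input space: start 0, length 0x1 0x0
		 */
		iommus = <&iommu 42 0 0x1 0x0>;
	};
};
```

The two size cells combine to (0x1 << 32) | 0x0 = 0x100000000, i.e. 4 GiB,
which a single 32-bit size cell cannot express.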
> The size of the output address from the IOMMU will be determined
> by its own mastering destination, which by default in ePAPR is the
> IOMMU node's parent. I think that's what you intended, and what we
> expect in this case.
Right.
> For determining dma masks, it is the output address that is
> important. Santosh's code can probably be taught to handle this,
> if given an additional traversal rule for following "iommus"
> properties. However, deploying an IOMMU whose output address size
> is smaller than the
Something seems to be missing here. I don't think we want to handle
the case where the IOMMU output cannot cover the entire memory address
space. If necessary, that would mean using both an IOMMU driver
and swiotlb, but I think it's a reasonable assumption that hardware
isn't /that/ crazy.
> > > Multiple-master device:
> > > -----------------------
> > >
> > > /* single-master IOMMU */
> > > iommu@1 {
> > > reg = <1>;
> > > #address-cells = <0>;
> > > #size-cells = <0>;
> > > };
> > >
> > > /* multiple-master IOMMU */
> > > iommu@2 {
> > > reg = <2>;
> > > #address-cells = <1>;
> > > #size-cells = <0>;
> > > };
> > >
> > > /* device with two master interfaces */
> > > master {
> > > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > > };
> > >
> > > Multiple-master IOMMU with configurable DMA window:
> > > ---------------------------------------------------
> > >
> > > / {
> > > #address-cells = <1>;
> > > #size-cells = <1>;
> > >
> > > iommu {
> > > /* master ID, address of DMA window */
> > > #address-cells = <2>;
> > > #size-cells = <2>;
> > > };
> > >
> > > master {
> > > /* master ID 42, 4 GiB DMA window starting at 0 */
> > > iommus = <&/iommu 42 0 0x1 0x0>;
> > > };
> > > };
> > >
> > > Does that sound about right?
> >
> > Yes, sounds great. I would probably leave out the Multiple-master device
> > from the examples, since that seems to be a rather obscure case.
>
> I think multi-master is the common case.
Which of the two cases above do you mean? I was referring to the first
as being obscure, not the second.
I still haven't seen an example of the first, while the second one
is very common.
> > ----------->8
> >
> > Does that make sense to you? We can change what we say about
> > dma-ranges, I mainly want to be clear with what is or is not
> > allowed at this point.
>
> I think it would be inconsistent and unnecessary to disallow it in the
> binding. The meaning you've proposed seems completely consistent with
> ePAPR, so I suggest keeping it. The IOMMU is just another bus master
> from the ePAPR point of view -- no need to make special rules for it
> unless they are useful.
>
> The binding does not need to be (and generally shouldn't be) a
> description of precisely what the kernel does and does not support.
>
> However, if we don't need to support non-identity dma-ranges in Linux
> yet, we have the option to barf if we see such a dma-ranges memorywards
> of an IOMMU, if it simplifies the Linux implementation. We could always
> relax that later -- and it'll be obvious how to describe that situation
> in DT.
Ok.
> What I would like to see is a recommendation, based on Thierry's binding
> here, for how cross-mastering in general should be described. It's
> not really a binding, but more of a template for bindings.
>
> I'm happy to have a go at writing it, then we can decide whether it's
> useful or not.
I don't mind if you take this on, but I'm not sure if that should be
part of this binding or not. Let's see what you come up with.
> There are a few things from the discussion that are *not* solved by this
> iommu binding, but they seem reasonable. The binding also doesn't block
> solving those things later if/when needed:
>
> 1) Cross-mastering to things that are not IOMMUs
>
> We might need to solve this later if we encounter SoCs with
> problematic topologies, we shouldn't worry about it for the time
> being.
>
> We'll need to revisit it for GICv3 but that's a separate topic.
> 2) Describing address and ID remappings for cross-mastering.
>
> We can describe this in a way that is consistent with this IOMMU
> binding. We will need to describe something for GICv3, but the
> common case will be that IDs are just passed through without
> remapping.
>
> We don't need to clarify how IDs are propagated until we have
> something in DT for IDs to propagate to.
Ok, thanks for pointing these out. I had forgotten about the MSI
case, but it seems ok to defer that part for now.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-20 20:26 ` Arnd Bergmann
0 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-20 20:26 UTC (permalink / raw)
To: Dave Martin
Cc: linux-arm-kernel, Mark Rutland, devicetree, linux-samsung-soc,
Pawel Moll, Ian Campbell, Grant Grundler, Joerg Roedel,
Stephen Warren, Will Deacon, linux-kernel, Rob Herring,
Marc Zyngier, iommu, Thierry Reding, Kumar Gala, linux-tegra,
Cho KyongHo
On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
> > > Typical values for the above include:
> > > - #address-cells = <0>, #size-cells = <0>: Single master IOMMU devices are not
> > > configurable and therefore no additional information needs to be encoded in
> > > the specifier. This may also apply to multiple master IOMMU devices that do
> > > not allow the association of masters to be configured.
> > > - #address-cells = <1>, #size-cells = <0>: Multiple master IOMMU devices may
> > > need to be configured in order to enable translation for a given master. In
> > > such cases the single address cell corresponds to the master device's ID.
> > > - #address-cells = <2>, #size-cells = <2>: Some IOMMU devices allow the DMA
> > > window for masters to be configured. The first cell of the address in this case
> > > may contain the master device's ID for example, while the second cell could
> > > contain the start of the DMA window for the given device. The length of the
> > > DMA window is specified by two additional cells.
>
> I was trying to figure out how to describe the different kinds of
> transformation we could have on the address/ID input to the IOMMU.
> Treating the whole thing as opaque gets us off the hook there.
>
> IDs are probably not propagated, not remapped, or we simply don't care
> about them; whereas the address transformation is software-controlled,
> so we don't describe that anyway.
>
> Delegating grokking the mapping to the iommu driver makes sense --
> it's what it's there for, after all.
>
>
> I'm not sure whether the windowed IOMMU case is special actually.
>
> Since the address to program into the master is found by calling the
> IOMMU driver to create some mappings, does anything except the IOMMU
> driver need to understand that there is windowing?
No. I tried to explain that earlier today, and in my earlier mails
I hadn't thought that part through. Only the IOMMU driver needs to care
about the window.
> > >
> > > Examples:
> > > =========
> > >
> > > Single-master IOMMU:
> > > --------------------
> > >
> > > iommu {
> > > #address-cells = <0>;
> > > #size-cells = <0>;
> > > };
> > >
> > > master {
> > > iommus = <&/iommu>;
> > > };
> > >
> > > Multiple-master IOMMU with fixed associations:
> > > ----------------------------------------------
> > >
> > > /* multiple-master IOMMU */
> > > iommu {
> > > /*
> > > * Masters are statically associated with this IOMMU and
> > > * address translation is always enabled.
> > > */
> > > #iommu-cells = <0>;
> > > };
> >
> > copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here.
> >
> > > /* static association with IOMMU */
> > > master@1 {
> > > reg = <1>;
>
> Just for clarification, "reg" just has its standard meaning here, and
> is nothing to do with the IOMMU?
correct
> > > iommus = <&/iommu>;
>
> In effect, "iommus" is doing the same thing as my "slaves" property.
>
> The way #address-cells and #size-cells determine the address and range
> size for mastering into the IOMMU is also similar. The main difference
> is that I didn't build the ID into the address.
Right. I think the difference is more about what we want to call
things: Calling it iommu means we want to specifically describe
the case of iommus that needs to get handled by all OSs in a particular
way, while the more generic slave connection doesn't correspond to
a specific concept in the OS.
> > > };
> > >
> > > /* static association with IOMMU */
> > > master@2 {
> > > reg = <2>;
> > > iommus = <&/iommu>;
> > > };
> > >
> > > Multiple-master IOMMU:
> > > ----------------------
> > >
> > > iommu {
> > > /* the specifier represents the ID of the master */
> > > #address-cells = <1>;
> > > #size-cells = <0>;
>
> How do we know the size of the input address to the IOMMU? Do we
> get cases for example where the IOMMU only accepts a 32-bit input
> address, but some 64-bit capable masters are connected through it?
I was stuck on this question for a while before, but then I realized
that it doesn't matter at all: It's the IOMMU driver itself that
manages the address space, and it doesn't matter if a slave can
address a larger range than the IOMMU can accept. If the IOMMU
needs to deal with the opposite case (64-bit input addresses
but a 32-bit master), that limitation can be put into the specifier.
> The size of the output address from the IOMMU will be determined
> by its own mastering destination, which by default in ePAPR is the
> IOMMU node's parent. I think that's what you intended, and what we
> expect in this case.
Right.
> For determining dma masks, it is the output address that is
> important. Santosh's code can probably be taught to handle this,
> if given an additional traversal rule for following "iommus"
> properties. However, deploying an IOMMU whose output address size
> is smaller than the
Something seems to be missing here. I don't think we want to handle
the case where the IOMMU output cannot cover the entire memory address
space. If necessary, that would mean using both an IOMMU driver
and swiotlb, but I think it's a reasonable assumption that hardware
isn't /that/ crazy.
> > > Multiple-master device:
> > > -----------------------
> > >
> > > /* single-master IOMMU */
> > > iommu@1 {
> > > reg = <1>;
> > > #address-cells = <0>;
> > > #size-cells = <0>;
> > > };
> > >
> > > /* multiple-master IOMMU */
> > > iommu@2 {
> > > reg = <2>;
> > > #address-cells = <1>;
> > > #size-cells = <0>;
> > > };
> > >
> > > /* device with two master interfaces */
> > > master {
> > > iommus = <&/iommu@1>, /* master of the single-master IOMMU */
> > > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */
> > > };
> > >
> > > Multiple-master IOMMU with configurable DMA window:
> > > ---------------------------------------------------
> > >
> > > / {
> > > #address-cells = <1>;
> > > #size-cells = <1>;
> > >
> > > iommu {
> > > /* master ID, address of DMA window */
> > > #address-cells = <2>;
> > > #size-cells = <2>;
> > > };
> > >
> > > master {
> > > /* master ID 42, 4 GiB DMA window starting at 0 */
> > > iommus = <&/iommu 42 0 0x1 0x0>;
> > > };
> > > };
> > >
> > > Does that sound about right?
> >
> > Yes, sounds great. I would probably leave out the Multiple-master device
> > from the examples, since that seems to be a rather obscure case.
>
> I think multi-master is the common case.
Which of the two cases above do you mean? I was referring to the first
as being obscure, not the second.
I still haven't seen an example of the first, while the second one
is very common.
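For reference, the windowed example quoted above packs four cells after
the phandle: master ID, window start, and a two-cell window size. Here is a
minimal sketch in C of how a driver might unpack them, assuming exactly
that layout (#address-cells = <2>, #size-cells = <2>). The struct and
function names are hypothetical and do not come from any posted patch.

```c
#include <stdint.h>

/* Hypothetical decoded form of <ID start size-hi size-lo>; not a kernel type. */
struct iommu_window {
	uint32_t master_id;
	uint64_t start;
	uint64_t size;
};

/*
 * Decode the cells that follow the phandle, assuming the layout from the
 * quoted example: two address cells (master ID, window start) followed by
 * two size cells (high word, low word).
 * Returns 0 on success, -1 if the cell count does not match.
 */
static int decode_window(const uint32_t *cells, int ncells,
			 struct iommu_window *win)
{
	if (ncells != 4)
		return -1;

	win->master_id = cells[0];
	win->start = cells[1];
	/* Two 32-bit size cells form one 64-bit length. */
	win->size = ((uint64_t)cells[2] << 32) | cells[3];
	return 0;
}
```

With the cells { 42, 0, 0x1, 0x0 } this gives ID 42, start 0 and a size of
0x100000000, i.e. the 4 GiB window that a single size cell cannot express.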
> > ----------->8
> >
> > Does that make sense to you? We can change what we say about
> > dma-ranges, I mainly want to be clear with what is or is not
> > allowed at this point.
>
> I think it would be inconsistent and unnecessary to disallow it in the
> binding. The meaning you've proposed seems completely consistent with
> ePAPR, so I suggest to keep it. The IOMMU is just another bus master
> from the ePAPR point of view -- no need to make special rules for it
> unless they are useful.
>
> The binding does not need to be (and generally shouldn't be) a
> description of precisely what the kernel does and does not support.
>
> However, if we don't need to support non-identity dma-ranges in Linux
> yet, we have the option to barf if we see such a dma-ranges memorywards
> of an IOMMU, if it simplifies the Linux implementation. We could always
> relax that later -- and it'll be obvious how to describe that situation
> in DT.
Ok.
> What I would like to see is a recommendation, based on Thierry's binding
> here, for describing how cross-mastering in general is described. It's
> not really a binding, but more of a template for bindings.
>
> I'm happy to have a go at writing it, then we can decide whether it's
> useful or not.
I don't mind if you take this on, but I'm not sure if that should be
part of this binding or not. Let's see what you come up with.
> There are a few things from the discussion that are *not* solved by this
> iommu binding, but they seem reasonable. The binding also doesn't block
> solving those things later if/when needed:
>
> 1) Cross-mastering to things that are not IOMMUs
>
> We might need to solve this later if we encounter SoCs with
> problematic topologies, we shouldn't worry about it for the time
> being.
>
> We'll need to revisit it for GICv3, but that's a separate topic.
> 2) Describing address and ID remappings for cross-mastering.
>
> We can describe this in a way that is consistent with this IOMMU
> binding. We will need to describe something for GICv3, but the
> common case will be that IDs are just passed through without
> remapping.
>
> We don't need to clarify how IDs are propagated until we have
> something in DT for IDs to propagate to.
Ok, thanks for pointing these out. I had forgotten about the MSI
case, but it seems ok to defer that part for now.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-20 20:26 ` Arnd Bergmann
(?)
@ 2014-05-21 8:26 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-21 8:26 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
> > > > Multiple-master IOMMU:
> > > > ----------------------
> > > >
> > > > iommu {
> > > > /* the specifier represents the ID of the master */
> > > > #address-cells = <1>;
> > > > #size-cells = <0>;
> >
> > How do we know the size of the input address to the IOMMU? Do we
> > get cases for example where the IOMMU only accepts a 32-bit input
> > address, but some 64-bit capable masters are connected through it?
>
> I was stuck on this question for a while before, but then I realized
> that it doesn't matter at all: It's the IOMMU driver itself that
> manages the address space, and it doesn't matter if a slave can
> address a larger range than the IOMMU can accept. If the IOMMU
> needs to deal with the opposite case (64-bit input addresses
> but a 32-bit master), that limitation can be put into the specifier.
Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
master device's DMA mask to do the right thing here?
As such I don't think this information needs to be in device tree at
all. The DMA mask should typically be set by the driver in the first
place because it has knowledge about the capabilities of the device.
A different way of saying that is that the DMA mask is implied by the
device's compatible string.
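To make the idea concrete, here is a rough sketch in C, not taken from any
posted patch: the IOMMU driver would bound the I/O virtual addresses it
hands out by the smaller of the master's DMA mask and its own input range.
dma_bit_mask() mirrors the idea of the kernel's DMA_BIT_MASK() macro;
iova_limit() is a hypothetical helper named only for this illustration.

```c
#include <stdint.h>

/* Same idea as the kernel's DMA_BIT_MASK(): the low 'bits' bits set. */
static uint64_t dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Highest I/O virtual address the IOMMU driver may hand to a master:
 * bounded both by what the master can emit and by what the IOMMU
 * accepts on its input side.
 */
static uint64_t iova_limit(uint64_t master_mask, uint64_t iommu_input_mask)
{
	return master_mask < iommu_input_mask ? master_mask : iommu_input_mask;
}
```

Both directions of the mismatch collapse into the same comparison, so
neither needs a cell in the specifier.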
> > For determining dma masks, it is the output address that is
> > important. Santosh's code can probably be taught to handle this,
> > if given an additional traversal rule for following "iommus"
> > properties. However, deploying an IOMMU whose output address size
> > is smaller than the
>
> Something seems to be missing here. I don't think we want to handle
> the case where the IOMMU output cannot cover the entire memory address
> space. If necessary, that would mean using both an IOMMU driver
> and swiotlb, but I think it's a reasonable assumption that hardware
> isn't /that/ crazy.
Similarly, should the IOMMU not be treated like any other device here?
Its DMA mask should determine what address range it can access.
Thierry
^ permalink raw reply [flat|nested] 112+ messages in thread* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 8:26 ` Thierry Reding
(?)
@ 2014-05-21 8:50 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 8:50 UTC (permalink / raw)
To: Thierry Reding
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> [...]
> > > > > Multiple-master IOMMU:
> > > > > ----------------------
> > > > >
> > > > > iommu {
> > > > > /* the specifier represents the ID of the master */
> > > > > #address-cells = <1>;
> > > > > #size-cells = <0>;
> > >
> > > How do we know the size of the input address to the IOMMU? Do we
> > > get cases for example where the IOMMU only accepts a 32-bit input
> > > address, but some 64-bit capable masters are connected through it?
> >
> > I was stuck on this question for a while before, but then I realized
> > that it doesn't matter at all: It's the IOMMU driver itself that
> > manages the address space, and it doesn't matter if a slave can
> > address a larger range than the IOMMU can accept. If the IOMMU
> > needs to deal with the opposite case (64-bit input addresses
> > but a 32-bit master), that limitation can be put into the specifier.
>
> Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
> master device's DMA mask to do the right thing here?
Ah, yes. I guess that's the right way to do it.
> > > For determining dma masks, it is the output address that is
> > > important. Santosh's code can probably be taught to handle this,
> > > if given an additional traversal rule for following "iommus"
> > > properties. However, deploying an IOMMU whose output address size
> > > is smaller than the
> >
> > Something seems to be missing here. I don't think we want to handle
> > the case where the IOMMU output cannot cover the entire memory address
> > space. If necessary, that would mean using both an IOMMU driver
> > and swiotlb, but I think it's a reasonable assumption that hardware
> > isn't /that/ crazy.
>
> Similarly, should the IOMMU not be treated like any other device here?
> Its DMA mask should determine what address range it can access.
Right. But for that we need a dma-ranges property in the parent of the
iommu, just so the mask can be set correctly and we don't have to
rely on the 32-bit fallback case.
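As an illustration only: given the base and size from such a dma-ranges
entry, the mask could be derived as the smallest value of the form 2^n - 1
that covers the whole range. This is a sketch in C under that assumption;
mask_from_dma_range() is a hypothetical helper, not an existing kernel
function.

```c
#include <stdint.h>

/*
 * Smallest mask of the form 2^n - 1 that covers [base, base + size).
 * Assumes size > 0 and that base + size - 1 does not overflow 64 bits.
 */
static uint64_t mask_from_dma_range(uint64_t base, uint64_t size)
{
	uint64_t last = base + size - 1;	/* highest reachable address */
	uint64_t mask = 0;

	/* Grow the mask one bit at a time until it covers 'last'. */
	while (mask < last)
		mask = (mask << 1) | 1;
	return mask;
}
```

A 4 GiB range starting at 0 yields a 32-bit mask, and a 1 TiB range
yields a 40-bit mask, so the fallback would only matter when no
dma-ranges is present.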
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-21 8:50 ` Arnd Bergmann
0 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 8:50 UTC (permalink / raw)
To: Thierry Reding
Cc: Dave Martin, linux-arm-kernel, Mark Rutland, devicetree,
linux-samsung-soc, Pawel Moll, Ian Campbell, Grant Grundler,
Joerg Roedel, Stephen Warren, Will Deacon, linux-kernel,
Rob Herring, Marc Zyngier, iommu, Kumar Gala, linux-tegra,
Cho KyongHo
On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> [...]
> > > > > Multiple-master IOMMU:
> > > > > ----------------------
> > > > >
> > > > > iommu {
> > > > > /* the specifier represents the ID of the master */
> > > > > #address-cells = <1>;
> > > > > #size-cells = <0>;
> > >
> > > How do we know the size of the input address to the IOMMU? Do we
> > > get cases for example where the IOMMU only accepts a 32-bit input
> > > address, but some 64-bit capable masters are connected through it?
> >
> > I was stuck on this question for a while before, but then I realized
> > that it doesn't matter at all: It's the IOMMU driver itself that
> > manages the address space, and it doesn't matter if a slave can
> > address a larger range than the IOMMU can accept. If the IOMMU
> > needs to deal with the opposite case (64-bit input addresses
> > but a 32-bit master), that limitation can be put into the specifier.
>
> Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
> master device's DMA mask to do the right thing here?
Ah, yes. I guess that's the right way to do it.
> > > For determining dma masks, it is the output address that it
> > > important. Santosh's code can probably be taught to handle this,
> > > if given an additional traversal rule for following "iommus"
> > > properties. However, deploying an IOMMU whose output address size
> > > is smaller than the
> >
> > Something seems to be missing here. I don't think we want to handle
> > the case where the IOMMU output cannot the entire memory address
> > space. If necessary, that would mean using both an IOMMU driver
> > and swiotlb, but I think it's a reasonable assumption that hardware
> > isn't /that/ crazy.
>
> Similarily, should the IOMMU not be treated like any other device here?
> Its DMA mask should determine what address range it can access.
Right. But for that we need a dma-ranges property in the parent of the
iommu, just so the mask can be set correctly and we don't have to
rely on the 32-bit fallback case.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread* [PATCH] devicetree: Add generic IOMMU device tree bindings
@ 2014-05-21 8:50 ` Arnd Bergmann
0 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 8:50 UTC (permalink / raw)
To: linux-arm-kernel
On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> [...]
> > > > > Multiple-master IOMMU:
> > > > > ----------------------
> > > > >
> > > > > iommu {
> > > > > /* the specifier represents the ID of the master */
> > > > > #address-cells = <1>;
> > > > > #size-cells = <0>;
> > >
> > > How do we know the size of the input address to the IOMMU? Do we
> > > get cases for example where the IOMMU only accepts a 32-bit input
> > > address, but some 64-bit capable masters are connected through it?
> >
> > I was stuck on this question for a while before, but then I realized
> > that it doesn't matter at all: It's the IOMMU driver itself that
> > manages the address space, and it doesn't matter if a slave can
> > address a larger range than the IOMMU can accept. If the IOMMU
> > needs to deal with the opposite case (64-bit input addresses
> > but a 32-bit master), that limitation can be put into the specifier.
>
> Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
> master device's DMA mask to do the right thing here?
Ah, yes. I guess that's the right way to do it.
> > > For determining dma masks, it is the output address that it
> > > important. Santosh's code can probably be taught to handle this,
> > > if given an additional traversal rule for following "iommus"
> > > properties. However, deploying an IOMMU whose output address size
> > > is smaller than the
> >
> > Something seems to be missing here. I don't think we want to handle
> > the case where the IOMMU output cannot the entire memory address
> > space. If necessary, that would mean using both an IOMMU driver
> > and swiotlb, but I think it's a reasonable assumption that hardware
> > isn't /that/ crazy.
>
> Similarily, should the IOMMU not be treated like any other device here?
> Its DMA mask should determine what address range it can access.
Right. But for that we need a dma-ranges property in the parent of the
iommu, just so the mask can be set correctly and we don't have to
rely on the 32-bit fallback case.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 8:50 ` Arnd Bergmann
(?)
@ 2014-05-21 9:00 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-21 9:00 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Marc Zyngier, iommu, Rob Herring, Kumar Gala,
linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel
On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> > On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > [...]
> > > > > > Multiple-master IOMMU:
> > > > > > ----------------------
> > > > > >
> > > > > > iommu {
> > > > > > /* the specifier represents the ID of the master */
> > > > > > #address-cells = <1>;
> > > > > > #size-cells = <0>;
> > > >
> > > > How do we know the size of the input address to the IOMMU? Do we
> > > > get cases for example where the IOMMU only accepts a 32-bit input
> > > > address, but some 64-bit capable masters are connected through it?
> > >
> > > I was stuck on this question for a while before, but then I realized
> > > that it doesn't matter at all: It's the IOMMU driver itself that
> > > manages the address space, and it doesn't matter if a slave can
> > > address a larger range than the IOMMU can accept. If the IOMMU
> > > needs to deal with the opposite case (64-bit input addresses
> > > but a 32-bit master), that limitation can be put into the specifier.
> >
> > Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
> > master device's DMA mask to do the right thing here?
>
> Ah, yes. I guess that's the right way to do it.
>
> > > > > For determining dma masks, it is the output address that is
> > > > > important. Santosh's code can probably be taught to handle this,
> > > > > if given an additional traversal rule for following "iommus"
> > > > > properties. However, deploying an IOMMU whose output address size
> > > > > is smaller than the
> > > >
> > > > Something seems to be missing here. I don't think we want to handle
> > > > the case where the IOMMU output cannot cover the entire memory address
> > > > space. If necessary, that would mean using both an IOMMU driver
> > > > and swiotlb, but I think it's a reasonable assumption that hardware
> > > > isn't /that/ crazy.
> > >
> > > Similarly, should the IOMMU not be treated like any other device here?
> > Its DMA mask should determine what address range it can access.
>
> Right. But for that we need a dma-ranges property in the parent of the
> iommu, just so the mask can be set correctly and we don't have to
> rely on the 32-bit fallback case.
Shouldn't the IOMMU driver be the one to set the DMA mask for the device
in exactly the same way that other drivers override the 32-bit default?
Thierry
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 9:00 ` Thierry Reding
(?)
@ 2014-05-21 9:36 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 9:36 UTC (permalink / raw)
To: Thierry Reding
Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Rob Herring,
Kumar Gala, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
Dave Martin, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
> On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> > > > > For determining dma masks, it is the output address that is
> > > > > important. Santosh's code can probably be taught to handle this,
> > > > > if given an additional traversal rule for following "iommus"
> > > > > properties. However, deploying an IOMMU whose output address size
> > > > > is smaller than the
> > > >
> > > > Something seems to be missing here. I don't think we want to handle
> > > > the case where the IOMMU output cannot cover the entire memory address
> > > > space. If necessary, that would mean using both an IOMMU driver
> > > > and swiotlb, but I think it's a reasonable assumption that hardware
> > > > isn't /that/ crazy.
> > >
> > > Similarly, should the IOMMU not be treated like any other device here?
> > > Its DMA mask should determine what address range it can access.
> >
> > Right. But for that we need a dma-ranges property in the parent of the
> > iommu, just so the mask can be set correctly and we don't have to
> > rely on the 32-bit fallback case.
>
> Shouldn't the IOMMU driver be the one to set the DMA mask for the device
> in exactly the same way that other drivers override the 32-bit default?
The IOMMU driver could /ask/ for an appropriate mask based on its internal
design, but if you have an IOMMU with a 64-bit output address connected
to a 32-bit bus, that should fail.
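As a rough userspace sketch of that check (not kernel code): the helper
names bit_mask() and request_dma_mask() are made up for illustration, and
bus_mask stands in for whatever would be derived from the parent's
dma-ranges. The request is rejected when the IOMMU asks for address bits
that the upstream bus cannot carry.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `bits` address bits (valid for 1..64). */
static uint64_t bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Hypothetical helper: accept the mask the IOMMU driver asks for only if
 * the upstream bus can carry every address bit in it.
 * Returns 0 on success, -1 if the request exceeds what the bus allows.
 */
static int request_dma_mask(uint64_t wanted, uint64_t bus_mask, uint64_t *out)
{
	if (wanted & ~bus_mask)
		return -1;	/* e.g. 64-bit output on a 32-bit bus */
	*out = wanted;
	return 0;
}
```

With these definitions, request_dma_mask(bit_mask(64), bit_mask(32), &m)
fails, while a 32-bit request on a 40-bit bus succeeds.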
Note that it's not obvious what the IOMMU's DMA mask actually means.
It clearly has to be the mask that is used for allocating the IO page
tables, but it wouldn't normally be used in the path that allocates
pages on behalf of a DMA master attached to the IOMMU, because that
allocation is performed by the code that looks at the other device's
dma mask.
Arnd
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 9:36 ` Arnd Bergmann
(?)
@ 2014-05-21 10:50 ` Thierry Reding
-1 siblings, 0 replies; 112+ messages in thread
From: Thierry Reding @ 2014-05-21 10:50 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Dave Martin, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA, Pawel Moll,
Ian Campbell, Grant Grundler, Joerg Roedel, Stephen Warren,
Will Deacon, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Rob Herring,
Marc Zyngier, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Kumar Gala, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo
On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote:
> On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
> > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
>
> > > > > > > For determining dma masks, it is the output address that is
> > > > > > > important. Santosh's code can probably be taught to handle this,
> > > > > > > if given an additional traversal rule for following "iommus"
> > > > > > > properties. However, deploying an IOMMU whose output address size
> > > > > > > is smaller than the
> > > > > >
> > > > > > Something seems to be missing here. I don't think we want to handle
> > > > > > the case where the IOMMU output cannot cover the entire memory address
> > > > > > space. If necessary, that would mean using both an IOMMU driver
> > > > > > and swiotlb, but I think it's a reasonable assumption that hardware
> > > > > > isn't /that/ crazy.
> > > > >
> > > > > Similarly, should the IOMMU not be treated like any other device here?
> > > > Its DMA mask should determine what address range it can access.
> > >
> > > Right. But for that we need a dma-ranges property in the parent of the
> > > iommu, just so the mask can be set correctly and we don't have to
> > > rely on the 32-bit fallback case.
> >
> > Shouldn't the IOMMU driver be the one to set the DMA mask for the device
> > in exactly the same way that other drivers override the 32-bit default?
>
> The IOMMU driver could /ask/ for an appropriate mask based on its internal
> design, but if you have an IOMMU with a 64-bit output address connected
> to a 32-bit bus, that should fail.
Are there real use-cases where that actually happens? I guess if we need
that, the correct thing would be to bitwise AND the DMA mask of the
IOMMU device (as set by the driver) with the one derived from the IOMMU's
parent bus' dma-ranges property.
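A minimal sketch of that combination, in plain userspace C rather than
kernel code. The function name effective_dma_mask() is hypothetical, and
both masks are assumed to be contiguous low-bit masks such as 0xffffffff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: limit the mask set by the IOMMU driver to what the
 * parent bus allows (the bus mask being derived from dma-ranges).
 */
static uint64_t effective_dma_mask(uint64_t driver_mask, uint64_t bus_mask)
{
	assert(driver_mask != 0 && bus_mask != 0);
	return driver_mask & bus_mask;
}
```

Unlike a strict check that rejects the wider request, this silently
narrows a 64-bit driver mask to the 32 bits the bus can carry.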
> Note that it's not obvious what the IOMMU's DMA mask actually means.
> It clearly has to be the mask that is used for allocating the IO page
> tables, but it wouldn't normally be used in the path that allocates
> pages on behalf of a DMA master attached to the IOMMU, because that
> allocation is performed by the code that looks at the other device's
> dma mask.
Interesting. If a DMA buffer is allocated using the master's DMA mask,
wouldn't that cause breakage if the IOMMU's and master's DMA masks don't
match? It seems to me like the right thing to do for buffer allocation
is to use the IOMMU's DMA mask if a device uses the IOMMU for
translation and use the device's DMA mask when determining to what I/O
virtual address to map that buffer.
Obviously if we always assume that IOMMU hardware is sane and can always
access at least the whole memory then this isn't an issue. But what if a
device can do DMA to a 64-bit address space, but the IOMMU can only
address 32 bits? If the device's DMA mask is used for allocations, then
buffers could reside beyond the 4 GiB boundary that the IOMMU can
address, so effectively the IOMMU wouldn't be able to write to those
buffers.
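To make the failure case concrete, here is a small userspace sketch. The
name buffer_reachable() is made up for illustration. It tests whether
every byte of a buffer lies within a given mask, so a buffer allocated
against a 64-bit device mask can be checked against a 32-bit IOMMU mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can a master limited by `mask` reach every byte of
 * the buffer [addr, addr + size)?
 */
static bool buffer_reachable(uint64_t addr, uint64_t size, uint64_t mask)
{
	assert(size > 0);
	if (addr > mask)
		return false;
	/* Compare against the remaining room to avoid overflowing addr + size. */
	return size - 1 <= mask - addr;
}
```

A page at 0x100000000 passes for a 64-bit mask but fails for 0xffffffff,
which is exactly the mismatch described above.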
Thierry
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 10:50 ` Thierry Reding
(?)
@ 2014-05-21 14:01 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 14:01 UTC (permalink / raw)
To: Thierry Reding
Cc: Dave Martin, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA, Pawel Moll,
Ian Campbell, Grant Grundler, Joerg Roedel, Stephen Warren,
Will Deacon, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Rob Herring,
Marc Zyngier, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Kumar Gala, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo
On Wednesday 21 May 2014 12:50:38 Thierry Reding wrote:
> On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote:
> > On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
> > > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> > > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> >
> > > > > > > > For determining dma masks, it is the output address that is
> > > > > > > > important. Santosh's code can probably be taught to handle this,
> > > > > > > > if given an additional traversal rule for following "iommus"
> > > > > > > > properties. However, deploying an IOMMU whose output address size
> > > > > > > > is smaller than the
> > > > > > >
> > > > > > > Something seems to be missing here. I don't think we want to handle
> > > > > > > the case where the IOMMU output cannot cover the entire memory address
> > > > > > > space. If necessary, that would mean using both an IOMMU driver
> > > > > > > and swiotlb, but I think it's a reasonable assumption that hardware
> > > > > > > isn't /that/ crazy.
> > > > > >
> > > > > > Similarly, should the IOMMU not be treated like any other device here?
> > > > > Its DMA mask should determine what address range it can access.
> > > >
> > > > Right. But for that we need a dma-ranges property in the parent of the
> > > > iommu, just so the mask can be set correctly and we don't have to
> > > > rely on the 32-bit fallback case.
> > >
> > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device
> > > in exactly the same way that other drivers override the 32-bit default?
> >
> > The IOMMU driver could /ask/ for an appropriate mask based on its internal
> > design, but if you have an IOMMU with a 64-bit output address connected
> > to a 32-bit bus, that should fail.
>
> Are there real use-cases where that actually happens? I guess if we need
> that, the correct thing would be to bitwise AND the DMA mask of the
> IOMMU device (as set by the driver) with the one derived from the IOMMU's
> parent bus' dma-ranges property.
It would be unusual for an IOMMU to need this, but it's how the DMA
mask is supposed to work for normal devices. As mentioned before, I
would probably just error out if we ever encounter such an IOMMU.
> > Note that it's not obvious what the IOMMU's DMA mask actually means.
> > It clearly has to be the mask that is used for allocating the IO page
> > tables, but it wouldn't normally be used in the path that allocates
> > pages on behalf of a DMA master attached to the IOMMU, because that
> > allocation is performed by the code that looks at the other device's
> > dma mask.
>
> Interesting. If a DMA buffer is allocated using the master's DMA mask,
> wouldn't that cause breakage if the IOMMU's and master's DMA masks don't
> match? It seems to me like the right thing to do for buffer allocation
> is to use the IOMMU's DMA mask if a device uses the IOMMU for
> translation and use the device's DMA mask when determining to what I/O
> virtual address to map that buffer.
Unfortunately, not all code agrees on how the dma mask is actually
interpreted. The most important use is within the dma_map_ops, which
are aware of the IOMMU. The dma_map_ops use the mask to decide what
IOVA (bus address) to generate that is usable for the device; normally
this would be a 32-bit range.
When driver code looks at the dma mask of the device itself to make
an allocation decision without taking the IOMMU or swiotlb into
account, things can indeed go wrong.
Russell has recently done a good cleanup of various issues around
dma masks, and I can't find any drivers that get this wrong.
However, there is an issue with the two or three subsystems using
"PCI_DMA_BUS_IS_PHYS" to decide how they should treat high buffers
coming from user space that get passed to hardware.
If the SCSI layer or the network layer find the horribly misnamed
PCI_DMA_BUS_IS_PHYS (which is hardcoded to "1" on ARM32), they
will create a copy in low memory for any data that is above
the device dma_mask (SCSI) or above max_low_pfn (network).
This is not normally a bug, and won't hurt for the swiotlb case,
but it will give us worse performance for the IOMMU case, and
we should probably change this code to calculate the boundary
per device by calling a function from dma_map_ops.
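One way such a per-device boundary could look, as a userspace sketch
only: struct dma_ops_sketch, its bounce_limit() callback and
needs_bounce() are all hypothetical and do not correspond to existing
kernel interfaces. The IOMMU variant assumes the IOMMU itself can reach
all of memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct device;

/* Hypothetical subset of a dma_map_ops-like table. */
struct dma_ops_sketch {
	/* Highest physical address usable without a bounce copy. */
	uint64_t (*bounce_limit)(const struct device *dev);
};

struct device {
	uint64_t dma_mask;
	const struct dma_ops_sketch *ops;
};

/* No IOMMU: anything above the device mask has to be copied. */
static uint64_t direct_bounce_limit(const struct device *dev)
{
	return dev->dma_mask;
}

/* IOMMU present: any page can be remapped, so nothing needs a copy. */
static uint64_t iommu_bounce_limit(const struct device *dev)
{
	(void)dev;
	return UINT64_MAX;
}

static const struct dma_ops_sketch direct_ops = {
	.bounce_limit = direct_bounce_limit,
};
static const struct dma_ops_sketch iommu_ops = {
	.bounce_limit = iommu_bounce_limit,
};

/* What a block or network layer could ask instead of testing a global flag. */
static int needs_bounce(const struct device *dev, uint64_t phys_addr)
{
	assert(dev != NULL && dev->ops != NULL);
	return phys_addr > dev->ops->bounce_limit(dev);
}
```

With this split, a 32-bit device behind an IOMMU no longer gets a
low-memory copy for a page above 4 GiB, while the same device without an
IOMMU still does.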
We also really need to implement swiotlb support on ARM32 to deal
with any other device (besides SCSI and network) that does not
have an IOMMU but wants to use the streaming DMA API on pages
outside of the dma_mask. We already have this case on shmobile.
> Obviously if we always assume that IOMMU hardware is sane and can always
> access at least the whole memory then this isn't an issue. But what if a
> device can do DMA to a 64-bit address space, but the IOMMU can only
> address 32 bits. If the device's DMA mask is used for allocations, then
> buffers could reside beyond the 4 GiB boundary that the IOMMU can
> address, so effectively the IOMMU wouldn't be able to write to those
> buffers.
The mask of the device is not even an issue here, it's more the general
case of passing a buffer outside of the IOMMU's upstream bus DMA mask
into a driver connected to the IOMMU.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 9:00 ` Thierry Reding
(?)
@ 2014-05-21 17:09 ` Dave Martin
-1 siblings, 0 replies; 112+ messages in thread
From: Dave Martin @ 2014-05-21 17:09 UTC (permalink / raw)
To: Thierry Reding
Cc: Mark Rutland, devicetree, linux-samsung-soc, Arnd Bergmann,
Pawel Moll, Ian Campbell, Grant Grundler, Stephen Warren,
Will Deacon, linux-kernel, Marc Zyngier, iommu, Rob Herring,
Kumar Gala, linux-tegra, Cho KyongHo, linux-arm-kernel
On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote:
> On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> > > On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
> > > > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
> > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
> > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
> > > [...]
> > > > > > > Multiple-master IOMMU:
> > > > > > > ----------------------
> > > > > > >
> > > > > > > > iommu {
> > > > > > > > 	/* the specifier represents the ID of the master */
> > > > > > > > 	#address-cells = <1>;
> > > > > > > > 	#size-cells = <0>;
> > > > >
> > > > > How do we know the size of the input address to the IOMMU? Do we
> > > > > get cases for example where the IOMMU only accepts a 32-bit input
> > > > > address, but some 64-bit capable masters are connected through it?
> > > >
> > > > I was stuck on this question for a while before, but then I realized
> > > > that it doesn't matter at all: It's the IOMMU driver itself that
> > > > manages the address space, and it doesn't matter if a slave can
> > > > address a larger range than the IOMMU can accept. If the IOMMU
> > > > needs to deal with the opposite case (64-bit input addresses
> > > > but a 32-bit master), that limitation can be put into the specifier.
> > >
> > > Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
> > > master device's DMA mask to do the right thing here?
> >
> > Ah, yes. I guess that's the right way to do it.
> >
> > > > > > For determining dma masks, it is the output address that is
> > > > > important. Santosh's code can probably be taught to handle this,
> > > > > if given an additional traversal rule for following "iommus"
> > > > > properties. However, deploying an IOMMU whose output address size
> > > > > is smaller than the
> > > >
> > > > Something seems to be missing here. I don't think we want to handle
> > > > the case where the IOMMU output cannot cover the entire memory address
> > > > space. If necessary, that would mean using both an IOMMU driver
> > > > and swiotlb, but I think it's a reasonable assumption that hardware
> > > > isn't /that/ crazy.
> > >
> > > Similarly, should the IOMMU not be treated like any other device here?
> > > Its DMA mask should determine what address range it can access.
> >
> > Right. But for that we need a dma-ranges property in the parent of the
> > iommu, just so the mask can be set correctly and we don't have to
> > rely on the 32-bit fallback case.
>
> Shouldn't the IOMMU driver be the one to set the DMA mask for the device
> in exactly the same way that other drivers override the 32-bit default?
>
Are we confusing the "next-hop DMA mask" with the "end-to-end DMA mask"
here? The device has a next-hop mask which may be non-trivial. The
IOMMU also has a next-hop mask and/or remapping, but I think we agree
that in sensible systems that will be trivial. There might be other
non-trivial remappings between the IOMMU and memory (but again, not
in the common case).
If we just use the same name for all these, we are liable to get
confused.
To answer the question "what memory can the kernel allocate for DMA
with this device", it is the end-to-end transformation that is
important.
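One way to picture the difference between the two kinds of mask is the small user-space sketch below. It is only a model under stated assumptions: every hop is described by the set of addresses it can emit towards memory, masks are of the usual "low n bits" form, non-translating hops pass addresses through unchanged, and a translating hop (an IOMMU) can remap any input, so it discards the constraints accumulated before it. The type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One hop on the path from the DMA master towards memory. */
struct sketch_hop {
	uint64_t out_mask;	/* next-hop mask: addresses this hop can emit */
	bool translates;	/* true for an IOMMU that can remap any input */
};

/*
 * Fold the next-hop masks along the path into one end-to-end mask that
 * describes which memory the kernel could allocate for this master.
 */
static uint64_t sketch_end_to_end_mask(const struct sketch_hop *path, size_t n)
{
	uint64_t mask = UINT64_MAX;
	size_t i;

	assert(path != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (path[i].translates)
			mask = path[i].out_mask;	/* earlier limits only bound the IOVA */
		else
			mask &= path[i].out_mask;	/* pass-through hop narrows the range */
	}

	return mask;
}
```

For a 32-bit master on a 40-bit bus without an IOMMU this gives a 32-bit end-to-end mask; with a translating IOMMU whose output is 40 bits wide, the same master ends up with a 40-bit end-to-end mask even though its next-hop mask is still 32 bits.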
Cheers
---Dave
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: [PATCH] devicetree: Add generic IOMMU device tree bindings
2014-05-21 17:09 ` Dave Martin
(?)
@ 2014-05-21 18:11 ` Arnd Bergmann
-1 siblings, 0 replies; 112+ messages in thread
From: Arnd Bergmann @ 2014-05-21 18:11 UTC (permalink / raw)
To: Dave Martin
Cc: Mark Rutland, devicetree, linux-samsung-soc, Pawel Moll,
Ian Campbell, Grant Grundler, Stephen Warren, Will Deacon,
linux-kernel, Rob Herring, Marc Zyngier, iommu, Thierry Reding,
Kumar Gala, linux-tegra, Cho KyongHo, linux-arm-kernel
On Wednesday 21 May 2014 18:09:54 Dave Martin wrote:
> On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote:
> > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
> > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
> > > >
> > > > Similarly, should the IOMMU not be treated like any other device here?
> > > > Its DMA mask should determine what address range it can access.
> > >
> > > Right. But for that we need a dma-ranges property in the parent of the
> > > iommu, just so the mask can be set correctly and we don't have to
> > > rely on the 32-bit fallback case.
> >
> > Shouldn't the IOMMU driver be the one to set the DMA mask for the device
> > in exactly the same way that other drivers override the 32-bit default?
> >
>
> Are we confusing the "next-hop DMA mask" with the "end-to-end DMA mask"
> here? The device has a next-hop mask which may be non-trivial. The
> IOMMU also has a next-hop mask and/or remapping, but I think we agree
> that in sensible systems that will be trivial. There might be other
> non-trivial remappings between the IOMMU and memory (but again, not
> in the common case).
>
> If we just use the same name for all these, we are liable to get
> confused.
>
> To answer the question "what memory can the kernel allocate for DMA
> with this device", it is the end-to-end transformation that is
> important.
Yes, but that is not the same as the dma mask of the device. The DMA
mask gets set by the device according to its capabilities, and may
get limited by what the bus to either memory or to the iommu can do,
if one is in use.
Without an IOMMU, the mask is used for allocations; with an IOMMU,
the iommu code decides what to allocate without taking
dev->dma_mask into account.
As mentioned, this is currently not handled well for SCSI and network,
where we use a smaller mask than necessary and can end up copying
data.
Arnd
^ permalink raw reply [flat|nested] 112+ messages in thread